We’re still excited about stackconf 2022, our Open Source Infrastructure Conference, which took place in person in Berlin for the very first time this year. We had many awesome speakers on stage, and in the following I will present one of their outstanding talks.
Spotify had one of its most disruptive outages in recent history on the evening of Tuesday, 8 March 2022, at 19:00 CET. It resulted in over an hour of downtime and in users getting logged out. Kat Liu, Senior Software Engineer at Spotify Berlin, explained how this incident unfolded.
Kat was enjoying her day off because of International Women’s Day when she received the first alert. Within a short time she received many more, and it became clear that there was a serious issue. Hundreds of people posted online that they had been logged out and could no longer log in.
As you can see in the screenshot above, there is a warning with the message: Failed to resolve name. The reason for this warning was that the internal system could not resolve the name of service2 because service2 was down, which caused the outage.
The solution to this problem was very simple: revert all services back to using the Nameless system. Service was mostly restored by 19:40 CET.
But why were users logged out?
The screenshot above shows how service1 calls service2. Since service2 was not available, an incorrect NOT_FOUND error was returned, causing users to be logged out and unable to log back in.
This error was later changed to UNAVAILABLE.
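To make the difference between the two status codes more tangible, here is a minimal sketch in Python. It is not Spotify's code: the function names, the resolver callables and the session set are all hypothetical and only illustrate the idea that a failed name lookup says nothing about whether the session exists, so the caller should keep the session and retry instead of logging the user out.

```python
from enum import Enum


class Status(Enum):
    OK = "OK"
    NOT_FOUND = "NOT_FOUND"
    UNAVAILABLE = "UNAVAILABLE"


class NameResolutionError(Exception):
    """Raised when the (hypothetical) resolver cannot find service2."""


def call_service2(session_id, resolve, sessions):
    """Hypothetical service1 -> service2 session lookup."""
    try:
        resolve("service2")
    except NameResolutionError:
        # service2 could not be reached, so nothing is known about the
        # session. Report UNAVAILABLE rather than NOT_FOUND.
        return Status.UNAVAILABLE
    return Status.OK if session_id in sessions else Status.NOT_FOUND


def handle_session_check(status):
    """Decide what the caller does with the user's session."""
    if status is Status.NOT_FOUND:
        return "logout"  # the session really does not exist
    if status is Status.UNAVAILABLE:
        return "keep_session_and_retry"  # dependency is down, do not log out
    return "continue"


def broken_resolver(name):
    # Simulates the "Failed to resolve name" warning from the incident.
    raise NameResolutionError(f"Failed to resolve name: {name}")


def working_resolver(name):
    return "10.0.0.2:443"  # placeholder address, purely illustrative
```

With `broken_resolver`, `handle_session_check(call_service2("abc", broken_resolver, {"abc"}))` yields `"keep_session_and_retry"`, whereas a lookup that wrongly reported NOT_FOUND would lead to `"logout"`.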
The outage, which lasted about 40 minutes, disrupted about 50 million login sessions.
Over the following days and weeks, 3 million new duplicate accounts were created, as many users did not log in to Spotify regularly and had forgotten their credentials.
That was just a short summary of Kat Liu’s talk at stackconf 2022. You can watch her full talk on our YouTube channel. Enjoy!
And don’t forget to register for the stackconf newsletter to stay up to date on the plans for next year’s stackconf! See you there!