
NETWAYS Blog

OSMC 2021 | pg_stat_monitor: A cool extension for better database (PostgreSQL) monitoring

This entry is part 2 of 22 in the series OSMC 2021

OSMC 2021 brought many insights into the latest monitoring trends, and we’re still amazed by that great on-site event last autumn. We were happy to have a large number of international attendees, 28 top-level speakers and, not to forget, our much valued sponsors on board – in short: we’re grateful to everyone who made OSMC 2021 a special and exciting event once again!

For all of you who would like to listen to last year’s expert sessions as a follow-up, I’ve created this blog series. Every two weeks it reminds you of one of our OSMC lectures and includes its video recording.

Today it’s all about Denys Kondratenko and his talk “pg_stat_monitor: A cool extension for better database (PostgreSQL) monitoring”. His talk covers how to use pg_stat_monitor and how it improves on pg_stat_statements.

Enjoy Denys’ lecture!

 


If you’re already curious about what awaits you at this year’s Open Source Monitoring Conference, mark your calendars for November 14 – 16, 2022.

Tickets for the conference itself as well as for our beloved workshops are already available. Grab yours now!

 

We’re looking forward to meeting you all again this autumn in Nuremberg.

Stay tuned!

Katja Kotschenreuther
Marketing Manager

Katja has been part of the marketing team since October 2020. As Online Marketing Manager, she optimizes our websites and social media campaigns and, above all, promotes our conferences and trainings. In her spare time she is always hunting for new geocaches, loves to travel the world, cuddles every baby animal that crosses her path and regularly visits her Lower Bavarian home town of Passau.

stackconf 2022 | Spotify’s outage of 8.3.2022, explained

We’re still excited about stackconf 2022, our Open Source Infrastructure Conference, which took place in person in Berlin for the very first time this year. We had many awesome speakers on stage, and below I present one of their outstanding lectures.

Spotify had one of its most disruptive outages in recent history on the evening of Tuesday, 8.3.2022, at 19:00 CET, which resulted in over an hour of downtime and users getting logged out. Kat Liu, Senior Software Engineer at Spotify Berlin, explained what happened during this incident.

Kat was enjoying her day off for International Women’s Day when she received the first alert. Within a short time she received many more such alerts, and it became clear that there was a serious issue. Hundreds of people posted online that they had been logged out and could no longer log in.

The Outage

As you can see in the screenshot above, there is a warning with the message: Failed to resolve name. The reason for this warning was that the internal system could not resolve the name of service2 because service2 was down, which caused the outage.

The Fix

The fix was simple: revert all services to using the Nameless system. Service was mostly restored by 19:40 CET.

But why were users logged out?

The screenshot above shows how service1 calls service2. Since service2 was not available, an incorrect NOT_FOUND error was returned, causing the user to be logged out and unable to log back in.

This error was later changed to UNAVAILABLE.
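The difference between the two error codes can be illustrated with a minimal Python sketch. It is not Spotify’s code: the `Status` enum, `NameResolutionError`, `call_service2` and `handle_session` are hypothetical names made up for illustration, and the exact client reaction to each status is an assumption. The point is only that an unreachable dependency should be reported as UNAVAILABLE, so the caller does not treat it as “this user does not exist”.

```python
from enum import Enum


class Status(Enum):
    OK = "OK"
    NOT_FOUND = "NOT_FOUND"
    UNAVAILABLE = "UNAVAILABLE"


class NameResolutionError(Exception):
    """Hypothetical error: the address of a downstream service cannot be resolved."""


def call_service2(resolve, user_id):
    # `resolve` is a hypothetical lookup function for service addresses.
    try:
        resolve("service2")
    except NameResolutionError:
        # The dependency is unreachable. This says nothing about the user,
        # so report UNAVAILABLE instead of NOT_FOUND.
        return Status.UNAVAILABLE
    return Status.OK


def handle_session(status):
    # Assumption: only a definitive NOT_FOUND invalidates the session.
    if status is Status.NOT_FOUND:
        return "logout"
    if status is Status.UNAVAILABLE:
        return "keep_session_and_retry"
    return "continue"


def broken_resolver(name):
    # Simulates the "Failed to resolve name" warning from the outage.
    raise NameResolutionError(f"Failed to resolve name: {name}")


print(handle_session(call_service2(broken_resolver, "user-1")))
```

With the simulated resolution failure, the session is kept and the call can be retried, whereas a NOT_FOUND result would lead to a logout.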

The Aftermath

The outage lasted about 40 minutes and disrupted about 50 million login sessions.

Over the next few days/weeks, 3 million new duplicate accounts were created as many users were not regularly logging into Spotify and had forgotten their credentials.

That was just a short summary of Kat Liu’s talk at stackconf 2022. You can watch her full talk on our YouTube channel. Enjoy!
And don’t forget to register for the stackconf newsletter to stay up to date on the plans for next year’s stackconf! See you there!

Sukhwinder Dhillon
Developer

Sukhwinder successfully completed his apprenticeship as an IT specialist for application development (Fachinformatiker für Anwendungsentwicklung) at NETWAYS in 2021. In his spare time he enjoys cycling, meeting friends, going jogging or sitting at his computer and learning something new.

OSMC 2021 | Gamification of Observability

This entry is part 7 of 22 in the series OSMC 2021

OSMC 2021 brought many insights into the latest monitoring trends, and we’re still amazed by that great on-site event last autumn. We were happy to have a large number of international attendees, 28 top-level speakers and, not to forget, our much valued sponsors on board – in short: we’re grateful to everyone who made OSMC 2021 a special and exciting event once again!

For all of you who would like to listen to last year’s expert sessions as a follow-up, I’ve created this blog series. Every two weeks it reminds you of one of our OSMC lectures and includes its video recording.

Today it’s all about Bram Vogelaar and his talk “Gamification of Observability”.

Enjoy Bram’s lecture!

 


If you’re already curious about what awaits you at this year’s Open Source Monitoring Conference, mark your calendars for November 14 – 16, 2022.

Tickets for the conference itself as well as for our beloved workshops are already available. Grab yours now!

 

We’re looking forward to meeting you all again this autumn in Nuremberg.

Stay tuned!

Katja Kotschenreuther
Marketing Manager


stackconf 2022 | DevOps or DevX – Lessons We Learned Shifting Left the Wrong Way

stackconf 2022 was a complete success! Our conference took place in Berlin on July 19 and 20, and we very much enjoyed the event, which was held on site for the very first time. stackconf was all about open source infrastructure solutions in the spectrum of continuous integration, container, hybrid and cloud technologies. We’re still excited about our expert speaker sessions. Below you get a deeper insight into one of our talks.

We kicked off the lecture program with a talk by Hannah Foxwell on “DevOps or DevX – Lessons We Learned Shifting Left The Wrong Way”. Here is what I’ve learned from Hannah:

Once Upon A Time

DevOps is such a common term now that it has almost lost its precise meaning. Once upon a time there were two teams, Devs and Ops, with different missions and goals: rapid development vs. stable user experience. Changes were simply handed over, and great effort went into getting even the smallest features into production and to the customer in a stable way. For sure: This needed to change!

Some people felt the problem was the Ops team, and NoOps became a thing. This misconception came from the belief that the Ops team didn’t care about users because it didn’t release new features fast enough. As a result, more and more typical Ops tasks like backup, monitoring or cost management were handed to developers. At a certain point, these additional tasks became too much for the dev team, and some developers were unhappy about it as well.

Focus on Team Health

According to a report by Haystack Analytics, 83% of all developers suffer from burnout, mostly triggered by the demands of having to learn and consider more and more technologies and areas.
This is where HumanOps deserves more attention again, putting the focus on the health of the team.

Just like the old ways of splitting everything into silos, the NoOps approach was the wrong way to go. Here, it’s important to use mixed teams with a product-owner mentality for the different layers. Each team is responsible for delivering the best possible experience for their users.

Hannah also touched on how important the right level of site reliability is and how it affects the team. With 99% reliability over 28 days, you have about 400 minutes of allowed downtime, which is enough time for manual intervention. The higher the reliability target, the less time and the more stress the team has, until only automated interventions can stay within the allowed time because no human can react fast enough.
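The 400-minute figure follows from simple arithmetic, which can be sketched in a few lines of Python. The function name `downtime_budget_minutes` and the additional targets of 99.9% and 99.99% are illustrative assumptions, not taken from the talk.

```python
def downtime_budget_minutes(availability_percent, window_days=28):
    """Return the allowed downtime in minutes for a reliability target."""
    total_minutes = window_days * 24 * 60  # 28 days = 40,320 minutes
    return total_minutes * (100 - availability_percent) / 100


# Compare a few targets over the same 28-day window.
for target in (99.0, 99.9, 99.99):
    budget = downtime_budget_minutes(target)
    print(f"{target}% over 28 days -> {budget:.1f} minutes of downtime")
```

At 99% the budget is roughly 403 minutes. At 99.9% it shrinks to about 40 minutes and at 99.99% to about 4 minutes, which is the range where manual intervention stops being realistic.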

On Site Reliability

But you also have to check whether the user actually needs this. Many users don’t even notice a short disruption, and some who do aren’t bothered by it. Contrast this with the cost and effort of taking measures. Depending on the level of site reliability needed, monitoring measures range from user input to active monitoring to automatic rollbacks.

You also have to decide how to allocate this downtime at each level: the closer you are to the physical hardware, the lower the downtime needs to be.
Site reliability should not be the responsibility of a single team; this is where all teams need to work together.

Finally, Hannah explained the security aspects that need to be considered with software. Bugs like Log4Shell can be avoided with the right security mindset. An open culture is important here, where you can also discuss and criticize your own concept.

When creating the security concept, you should also consider the people who implement the measures as well as how to automate them. Some security aspects should not be handled by individual teams alone, but across teams. This way you can avoid shifting too much to the left onto the dev team and still not work in isolated silos, as long as you keep a user-centric focus and pay attention to the people in the process.

That was just a short summary of Hannah Foxwell’s talk at stackconf 2022. You can watch her full talk on our YouTube Channel.
I’m already looking forward to the talks at the next stackconf and the opportunity to share thoughts and experiences with a wide variety of cool people there.

Take a look at our conference website to learn more about stackconf, check out the archives and register for our newsletter to stay tuned!

Michael Kübler
Systems Engineer

Michael worked in the restaurant industry for years before completing his retraining as an IT specialist (Fachinformatiker) at NETWAYS in 2022. Since then he has supported our customers in their projects as a MyEngineer and looks for smaller side projects to build. In his private life he enjoys camping and cycling. He also enjoys a relaxed evening at home with a book and a whisky.

OSMC 2021 | Secure Password Vaults with Naemon

This entry is part 7 of 22 in the series OSMC 2021

OSMC 2021 brought many insights into the latest monitoring trends, and we’re still amazed by that great on-site event last autumn. We were happy to have a large number of international attendees, 28 top-level speakers and, not to forget, our much valued sponsors on board – in short: we’re grateful to everyone who made OSMC 2021 a special and exciting event once again!

For all of you who would like to listen to last year’s expert sessions as a follow-up, I’ve created this blog series. Every two weeks it reminds you of one of our OSMC lectures and includes its video recording.

Today it’s all about Sven Nierlein and his talk “Secure Password Vaults with Naemon”. He presents a tool called Naemon. It introduces the Vault API for secure storage of passwords and other things you wouldn’t want to store in plain text, which makes things a bit harder for black hats.

Enjoy Sven’s lecture!

 


If you’re already curious about what awaits you at this year’s Open Source Monitoring Conference, mark your calendars for November 14 – 16, 2022.

Have you already checked out this year’s speaker line-up? It’s already online.

Tickets for the conference itself as well as for our beloved workshops are already available. Grab yours now!

 

We’re looking forward to meeting you all again this autumn in Nuremberg.

Stay tuned!

Katja Kotschenreuther
Marketing Manager
