stackconf online: stream, chat, network

stackconf is coming! We have crafted an inspiring agenda for you! Here are some highlights that you surely don’t wanna miss.

Hybrid cloud environments

Hybrid cloud environments were never easy and mostly just an idea. With Kubernetes though it is easy to deploy your infrastructure on premise and to a cloud provider.

This is where you learn more about hybrid cloud environments:


Increased portability makes containers so powerful. A big plus is that it can prevent a vendor lock-in. Applications running in containers can be deployed easily to multiple different operating systems and hardware platforms.

Learn more about containers in these talks:

Agile methods & continuous integration

Good IT is not possible with just the right tool. It is important that you have your communication and processes in shape. DevOps, agile, you name it always comes down to communication and transparency. Tear down silos to make your IT infrastructure and business as fast and efficient as possible.

Get to know agile methods and continuous integration practices in these talks:

And we have even more for you: Find the full schedule at

This is how it works: stream, chat, network

All the recorded talks will be presented in a single track over the course of three days from June 16 – 18, 2020. You can easily attend and watch!

Just watching, or what? Of course not! stackconf is a highly interactive online event – if you want it to be! Chat with the experts. Spice up open discussions with your thoughts in the hallway channel and in various topic-related channels. Meet friends and fellow infrastructure experts in private chats. Plus, there might be one or another challenge or gaming opportunity.

Short: There is plenty of opportunity to learn and have fun!

Fruitful discussions and hardest questions

While you watch a talk in the stream the speaker will be present in a parallel live chat to answer your questions. Feel free to comment the presentation, add your thoughts to it and ask your hardest questions! We are looking forward to a fruitful and inspiring exchange of ideas.

During lunch break a moderated discussion will take place. Join in and contribute to it. There will be additional room for networking and informal get-together during the community evening at the second conference day – Wednesday, June 17, 2020. We’ll continue the topic-related discussions, and open the gaming arena, where you can compare your gaming skills.

Stay up to date

Get posted on the latest expert knowledge on open source infrastructure solutions! We look forward to seeing you online at stackconf.

stackconf online takes place June 16 – 18, 2020. More:

Julia Hornung
Julia Hornung
Marketing Manager

Julia ist seit Juni 2018 Mitglied der NETWAYS Family. Vor ihrer Zeit in unserem Marketing Team hat sie als Journalistin und in der freien Theaterszene gearbeitet. Ihre Leidenschaft gilt gutem Storytelling, klarer Sprache und ausgefeilten Texten. Privat widmet sie sich dem Klettern und ihrer Ausbildung zur Yogalehrerin.

Evolution of a Microservice-Infrastructure by Jan Martens | OSDC 2019

This entry is part 1 of 6 in the series OSDC 2019 | Recap


At the Open Source Data Center Conference (OSDC) 2019 in Berlin, Jan Martens invited to audience to travel with him in his talk „Evolution of a Microservice-Infrastructure”. You have missed him speaking? We got something for you: See the video of Jan‘s presentation and read a summary (below).

The former OSDC will be held for the first time in 2020 under the new name stackconf. With the changes in modern IT in recent years, the focus of the conference has increasingly shifted from a mainly static infrastructure approach to a broader spectrum that includes agile methods, continuous integration, container, hybrid and cloud solutions. This development is taken into account by changing the name of the conference and opening the topic area for further innovations.

Due to concerns around the coronavirus (COVID-19), the decision was made to hold stackconf 2020 as an online conference. The online event will now take place from June 16 to 18, 2020. Join us, live online! Save your ticket now at:


Evolution of a Microservice-Infrastructure

Jan Martens signed up with a talk titled “Evolution of a Microservice Infrastructure” and why should I summarize his talk if he had done that himself perfectly: “This talk is about our journey from Ngnix & Docker Swarm to Traefik & Nomad.”

But before we start getting more in depth with this talk, there is one more thing to know about it. This is more or less a sequel to “From Monolith to Microservices” by Paul Puschmann a colleague of Jan Martens, but it’s not absolutely necessary to watch them in order or both.


So there will be a bunch of questions answered by Jan during the talk, regarding their environment, like: “How do we do deployments? How do we do request routing? What problems did we encounter, during our infrastructural growth and how did we address them?”

After giving some quick insight in the scale he has to deal with, that being 345.000 employees and 15.000 shops, he goes on with the history of their infrastructure.

Jan works at REWE Digital, which is responsible for the infrastructure around services, like delivery of groceries. They started off with the takeover of an existing monolithic infrastructure, not very attractive huh? They confronted themselves with the question: “How can we scale this delivery service?” and the solution they came up with was a micro service environment. Important to point out here, would be the use of Docker/Swarm for the deployment of micro services.

Let’s skip ahead a bit and take a look at the state of 2018 REWE Digital. Well there operating custom Docker-Environment consists of: Docker, Consul, Elastic Stack, ngnix, dnsmasq and debian

Jan goes into explaining his infrastructure more and more and how the different applications work with each other, but let’s just say: Everything was fine and peaceful until the size of the environment grew to a certain point. And at that point problems with nginx were starting to surface, like requests which never reached their destination or keepalive connections, which dropped after a short time. The reason? Consul-template would reload all ngnix instances at the same time. The solution? Well they looked for a different reverse proxy, which is able to reload configuration dynamically and best case that new reverse proxy is even able to be configured dynamically.

The three being deemed fitting for that job were envoy, Fabio and traefik, but I have already spoiled their decision, its treafik. The points Jan mentioned, which had them decide on traefik were that it is dynamically configurable and is able to reload configuration live. That’s obviously not all, lots of metrics, a web ui, which was deemed nice by Jan and a single go binary, might have made the difference.

Jan drops a few words on how migration is done and then invests some time in talking about the benefits of traefik, well the most important benefit for us to know is, that the issues that existed with ngnix are gone now.

Well now that the environment was changed, there were also changes coming for swarm, acting on its own. The problems Jan addresses are a poor container spread, no self-healing, and more. You should be able to see where this is going. Well the candidates besides Docker Swarm are Rancher, Kubernetes and Nomad. Well, this one was spoiled by me as well.

The reasons to use nomad in this infrastructure might be pretty obvious, but I will list them anyway. Firstly, seamless consul integration, well both are by HashiCorp, who would have guessed. Nomad is able to selfheal and comes in a single go binary, just like traefik. Jan also claims it has a nice web UI, we have to take his word on that one.

Jan goes into the benefits of using Nomad, just like he went into the benefits of ngnix and shows how their work processes have changed with the change of their environment.

This post doesn’t give enough credit to how much information Jan has shared during his talk. Maybe roughly twenty percent of his talk are covered here. You should definitely check it out the full video to catch all the deeper more insightful topics about the infrastructure and how the applications work with each other.

Alexander Stoll
Alexander Stoll
Junior Consultant

Alexander ist ein Organisationstalent und außerdem seit Kurzem Azubi im Professional Services. Wenn er nicht bei NETWAYS ist, sieht sein Tagesablauf so aus: Montag, Dienstag, Mittwoch Sport - Donnerstag Pen and Paper und ein Wochenende ohne Pläne. Den Sportteil lässt er gern auch mal ausfallen.

5 Steps to a DevOps Transformation by Dan Barker | OSDC 2019

This entry is part 2 of 6 in the series OSDC 2019 | Recap


“It’s not what we believe, it’s what we do that defines our culture”, was on his first slide. At the Open Source Data Center Conference (OSDC) 2019 Dan Barker presented “5 Steps to a DevOps Transformation”. Those who missed the talk back then now get the chance to see the video of Dan’s presentation and read a summary (below).

The former OSDC will be held for the first time in 2020 under the new name stackconf. With the changes in modern IT in recent years, the focus of the conference has increasingly shifted from a mainly static infrastructure approach to a broader spectrum that includes agile methods, continuous integration, container, hybrid and cloud solutions. This development is taken into account by changing the name of the conference and opening the topic area for further innovations. Transformation rules!

Due to concerns around the coronavirus (COVID-19), the decision was made to hold stackconf 2020 as an online conference. The online event will now take place from June 16 – 18, 2020. Join us, live online! Save your ticket now at

5 Steps to a DevOps Transformation

In order to be successful in the new digital economy, it is essential to continuously improve the quality, speed and efficiency of your own organization.

“In this session, we’ll walk through the five steps to transformational change that I’ve found to be important. These are really applicable to any continuously improving organization or any large amount of change in a system. Establish the vision. Create shared experiences. Educate, educate, educate. Find evangelists; Get feedback. I’ll elaborate on each item with methods I’ve used in real transformations at multiple companies. I’ll also describe how these all tie into the DevOps culture, which is really the transformation that’s occurring within the company.”

DevOps professionals primarily work in the tech and software world, creating new technology products, software, and other user services. You will play a key role in the development of new ideas for products and services and manage the process of turning these ideas into realities.

Establish the vision

“A strong team can take any crazy vision and turn it into reality” – John Carmack

The vision creates empowerment

  • But I‘m not a leader!!!
  • Bold
  • Inspiring
  • Actionable

Pathological – Power oriented

Bureaucratic – Rule oriented

Generative – Performance oriented

If your company values increased productivity, profitability, and market share then DevOps is essential. Even if your goals are non-financial, DevOps will enhance your ability to achieve those goals. The State of DevOps report soundly backs up these claims. More importantly, if your competition has already implemented DevOps and you haven’t, you are already behind. That’s how Walmart feels now that Amazon has built the world’s most efficient shopping platform.

Bad vision → bad outcomes

  • Biased for failure
  • No vision
  • IT-focused
  • Lack of clarity – JFK Moonrace
  • Not actionable

Find evangelists

“It is not about whether you call yourself a leader or not. It is about what you have to show to people as a leader. Leadership is contagious, you carry it and share it” – Israelmore Ayivor

The control mechanisms that are currently in place to manage your people and projects may not be suited for the DevOps world. You have to be willing to look at items that prevent agility, scalability, and responsiveness and change them. DevOps will provide agility, scalability, and responsiveness, so anything that hinders that process needs to be aligned with the new model.

You can‘t do it alone

  • Use anyone willing to help
  • Nurture this team
  • This team is a bellwether
  • Publicly praise team members

When your organization moves towards developing a DevOps culture, it’s signaling to everyone that participates in the production and release of software they have an equal stake in the success of the company. It’s an all for one, one for all mentality that will break down the communication barriers between teams and make everyone accountable. Once DevOps roles and responsibilities are implemented positive changes will occur, and everyone wins.

Create shared experiences

“Words are symbols for shared memories. If I use a word, then you should have some experience of what the word stands for. If not, the word means nothing to you.” – Jorge Luis BorgesIm

Bringing people together by sharing

  • Two levels
    • Leadership
    • Organization
  • Equally important

Leadership teams need landmarks

  • Shared information model
  • Reference point
  • Provides inspiration
  • Repeat

To start down your path to DevOps success you need to build a proper DevOps organization which includes all the proper team members. However, the size of your organization plays a big role on how granular you can be with your team. But size doesn’t really matter if you properly define the roles and responsibilities across the organization. The important thing is to make a commitment to the process and get started

The core responsibility that needs to exist is the person who owns the entire DevOps process. This person would usually be someone in a senior position. They are the keeper of the process and procedures and guarantor of the delivery of DevOps value. I like to think of this person as the DevOps evangelist. Aside from the leader, you would need to establish, at a minimum, the following roles: Code Release Manager, Automation Expert, Quality Assurance, Software Developer/Tester, and Security Engineer. The DevOps duties for each of these resources are described below.

Don‘t leave everyone else behind

  • Shared information model
  • Provides motivation
  • Leaders should be leading
  • How?


“An investment in knowledge pays the best interest” – Benjamin Franklin

Learn something new to build something new

  • Knowledge changes outcomes
  • Make it priority
  • Make it available
  • Monitor it

Measure what matters

  • Accelerate by Dr. Forsgren
  • Westrum Culture Survey
  • User Surveys
  • 1:1 Feedback
  • CultureAmp

Everyone in the company is sailing on the same ship. If the tide goes up so does the ship and everyone on it. But if the tide goes down so does the ship, but no one on the ship is to blame.

Everyone learns differently

  • Online training
  • In-person classes
  • Newsletters
  • Conferences
  • Hackathons

Get feedback

“True intuitive expertise is learned from prolonged experience with good feedback on mistakes” – Daniel Kahneman

Quellen und Nachschlagewerke

Aleksander Arsenovic
Aleksander Arsenovic
Junior Consultant

Aleksander macht eine Ausbildung zum Fachinformatiker für Systemintegration in unserem Professional Service. Wenn er nicht bei NETWAYS ist, schraubt er an seinem Desktop-PC rum und übertaktet seine Hardware. Er ist immer für eine gute Konversation zu haben.
Monthly Snap April 2020

Monthly Snap April 2020

NETWAYS trainings – now also online!

Julia kicked off the month with the announcement, that we now offer online-trainings: Weiterbildung im Homeoffice: Wir bieten jetzt Online Trainings. Sign up for one of our trainings and attend from wherever you prefer! If you want to sit on your balcony, on a mountain top or lie in bed while learning – who are we to keep you from it?



Stackconf 2020 goes online! Julia shared this good news with us! One of the highlights of the conference year will not be cancelled. And you don’t even have to convince your boss that the travel time and expenses for accommodation are necessary! Join us in June – online!
Last year the stackconf was called OSDC, and one of the talks was Achim‘s Storage Wars – Using Ceph since Firefly | OSDC 2019. If you missed it here is your chance to see – and read it! Alexander recapped Fast log management for your infrastructure by Nicolas Frankel | OSDC 2019.



The current situation with working from home put Dirk in a position where he had to seek new ways to teach our apprentices and find new solutions for online trainings. Read Einfacher als gedacht: Openstack + Vagrant. A while ago Johannes had the task of setting up a Redis cluster. He shared his experience with us in Aufsetzen eines Redis-clusters. One way or another most companies want Kubernetes. But does it have to be a hosted solution? Sebastian gave us the pros and cons in Managed Kubernetes vs. Kubernetes On-Premises.
Markus informed us that HPE SSD drives are vulnerable to uptime counter bug. Luckily, they have developed an Icinga plugin, that recognizes which drives are affected. Also read his follow up blog HP controller firmware issues to check.


More news from Icinga!

Florian proudly presents a new feature for Icinga DB: Bringing Bulk Editing to a new Level with Icinga DB. Our very own poet and songwriter Thomas wrote this great Icinga- and Corona-Poem: Das Runde ist das Fleckige. It was definitely one of the highlights in April! Lennart gave us tips on notifications with Icinga in Benachrichtigung mal einfach. Check out the cool new Icinga Web themes in Tobias’ Neue Icinga Web Themes verfügbar! – Bayerisch, Fränkisch, Österreichisch.



Starface offers several options for companies in situations in which working from home is the better option. As Nicole states, it doesn’t always have to be because of a virus: STARFACE: Relevante Themen für Telefonie im Home Office. Also read her blog HW group: Neue Features für das STE2 for information about the renewed STE2 R2. For more information on HW products, have a look at their YouTube channel! A link can be found in this post: HW group: Stark auf YouTube!



In our blog-series NETWAYS stellt sich vor new members of the NETWAYS family share a bit about themselves. This month read about Sukhwinder and Moumen!


Catharina Celikel
Catharina Celikel
Office Manager

Catharina unterstützt seit März 2016 unsere Abteilung Finance & Administration. Die gebürtige Norwegerin ist Fremdsprachenkorrespondentin für Englisch. Als Office Manager kümmert sie sich deshalb nicht nur um das Tagesgeschäft sondern übernimmt nebenbei zusätzlich einen Großteil der Übersetzungen. Privat ist der bekennende Bücherwurm am liebsten mit dem Fahrrad unterwegs.

Fast log management for your infrastructure by Nicolas Frankel | OSDC 2019

This entry is part 3 of 6 in the series OSDC 2019 | Recap


Nicolas Frankel is a Developer Advocate with 15+ years experience consulting for many different customers, in a wide range of contexts. “Fast log management for your infrastructure” was his topic at the Open Source Data Center Conference (OSDC) 2019 in Berlin. Those who missed the talk back then now have the opportunity to see the video of Nicolas’ presentation and read a summary (below).

The former OSDC will be held for the first time in 2020 under the new name stackconf. With the changes in modern IT in recent years, the focus of the conference has increasingly shifted from a mainly static infrastructure approach to a broader spectrum that includes agile methods, continuous integration, container, hybrid and cloud solutions. This development is taken into account by changing the name of the conference and opening the topic area for further innovations.

We are proud to announce that Nicolas Frankel is in our speaker lineup this year, too. We are looking forward to his talk: “Real Continuous Deployment of JVM applications”.

Due to concerns around the coronavirus (COVID-19), the decision was made to hold stackconf 2020 as an online conference. The online event will now take place June 16 – 18, 2020. Be there, live online! Save your ticket now at:

Fast log management for your infrastructure

Fast log management for your infrastructure”, well that is one way to get OSDC visitors excited. Nicolas Frankel signed up with that one and he did not disappoint. The issues, he was tackling, were issues produced by optimization, that being said do you think about the logs when it comes to migrating your application to reactive micro services?

Before we get to all that, Nicolas had to take a little detour through programming logic and how logging works, and he also points out some misconceptions of how things are done and how they work. Like for example, his so called “[…] root of all evil”.

"Cart price is now {}", cart.getPrice())

He states the question, who believes that in case of the log level being above debug the statements will be ignored? That’s what is to be expected, however it is not the case. In a small demo section he gives further insight on the topic from the perspective of a software developer.

From the developer point of view one should only do physical logging is the statement he ends his demo explanation run on. Directly afterwards he states that developers do not like to think that they are dealing with the physical world, then he goes further on about the respective storage possibilities like the write time regarding SSDs, HDDs or on an NFS, which should be taken into account.

Tackled some issues already, Nicolas keeps switching back and forth between the perspective of a software developer and an operator. He puts a lot of empathizes on these perspective changes to make sure that everyone involved starts to understand where the issue lies and if there is an issue at all.

For example the writing process and the opening and closing of streams for single log statements. It would be great if the stream could be continuously open and log statements can be written until the stream can be closed. But arguably and in most cases by default, logging is blocking. While most frameworks allow asynchronous logging, there is no right or wrong. And it also doesn’t have to be a software development mistake nor a bad infrastructure.

He dives deeper into asynchronous logging, because if you want to use it, you have to understand it: from queue size to discarding thresholds, the difference between blocking and dropping messages, everything. Nicolas also covers some logging basics, like metadata and what is especially important. Most essential metadata named timestamp, log level, line number and more. You may ask, why? Because some metadata is more expensive to get than others.

After some more detours through log aggregations and common pitfalls, with searching in logs or mandatory metadata, we get to a well-known application stack in the world of logging, the Elastic Stack.

He explains the basic architecture of the Elastic Stack and how the applications work with each other. Especially Filebeat and Logstash take the spotlight during this part. Step by step he works his way through an abstraction of the path a log takes from Filebeat to Logstash until you get a JSON you are familiar with. Then common misunderstandings like “Why do I need Logstash at all?” are being tackled by him, before he goes onto how he is doing logging at Exoscale.

They are using syslog-ng instead of Filebeat, basically just because when they started Filebeat was not ready for production. Then a regular Logstash and before we come to Elasticsearch there is a Kafka running. The reason why they are using Kafka is that Kafka being a decentralized data store, and using Logstash to get data out of it there is lower risk of dropping data instead of buffering towards elasticsearch, because there are not multiple nodes writing at once.

Nicolas summarizes his talk at the end with six short statements or maybe even lessons for log management. If you want, head over to the video above to learn about them from Nicolas himself or experience him live to learn from him.

Alexander Stoll
Alexander Stoll
Junior Consultant

Alexander ist ein Organisationstalent und außerdem seit Kurzem Azubi im Professional Services. Wenn er nicht bei NETWAYS ist, sieht sein Tagesablauf so aus: Montag, Dienstag, Mittwoch Sport - Donnerstag Pen and Paper und ein Wochenende ohne Pläne. Den Sportteil lässt er gern auch mal ausfallen.

Storage Wars – Using Ceph since Firefly | OSDC 2019

This entry is part 4 of 6 in the series OSDC 2019 | Recap


Achim Ledermüller ist Lead Senior Systems Engineer bei NETWAYS und kennt sich aus in Sachen Storage. 2019 hat er seinen Talk “Storage Wars – Using Ceph since Firefly” auf der Open Source Data Center Conference (OSDC) in Berlin gehalten. Wer den Talk verpasst hat, bekommt nun die Möglichkeit, Achims Vortrag noch einmal zu sehen und zu lesen (siehe weiter unten).

Die OSDC wird 2020 erstmalig unter neuem Namen stackconf veranstaltet. Mit den Veränderungen in der modernen IT hat sich in den vergangenen Jahren zunehmend auch der Fokus der Konferenz verlagert: Von einem hauptsächlich auf statische Infrastrukturen zielenden Ansatz zu einem breiteren Spektrum, das agile Methoden, Continuous Integration, Container-, Hybrid- und Cloud-Lösungen umfasst. Dieser Entwicklung soll mit der Namensänderung der Konferenz Rechnung getragen und das Themenfeld für weitere Innovationen geöffnet werden.

Aufgrund der Bedenken rund am das Coronavirus (COVID-19) wurde die Entscheidung getroffen, die stackconf 2020 als Online-Konferenz stattfinden zu lassen. Das Online-Event findet nun vom 16. bis 18. Juni 2020 statt. Live dabei sein lohnt sich! Jetzt Ticket sichern unter:

Storage Wars – Using Ceph since Firefly

Wenn es um Storage geht, ist die Erwartungshaltung klar definiert. Storage muss immer verfügbar, skalierbar, redundant, katastrophensicher, schnell und vor allem billig sein. Mit dem Unified Storage Ceph kann man zumindest die meisten Erwartungen ohne Abstriche erfüllen. Durch das Prinzip, wie Ceph Daten zerlegt, speichert und repliziert, ist die Verfügbarkeit, Skalierbarkeit und Redundanz gewährleistet und auch eine katastrophensichere Spiegelung eines Clusters ist ohne Probleme möglich.

Neben den angebotenen Features ist aber auch ein reibungsloser unterbrechungsfreier Betrieb wichtig. Während des fast siebenjährigen Betriebs unseres Clusters änderten sich fast alle Komponenten. Betriebsysteme, Kernel und Init-Systeme wurden ausgewechselt. Alte Netzwerkkarten wurden durch 10GE- und 40GE-Schnittstellen abgelöst und vervierzigfachten ihren Durchsatz. Layer 2 wird wegen Routing on the Host immer unwichtiger, Festplatten sind plötzlich dank moderner SSDs und NVMe nicht mehr das Bottleneck und natürlich gab es auch immer wieder neue Versionen von Ceph selbst. Zwischen all diesen Neuerungen in den letzten Jahren ist natürlich genügend Platz für kleine und große Katastrophen. Umso wichtiger ist es, dass man von Anfang an ein paar grundlegende Dinge richtig macht:

Limitiere IO-intensive Jobs

Im normalen Betrieb laufen verschiedene Aufgaben im Hintergrund, um einen gesunden Zustand des Clusters zu garantieren. Scrubbing Jobs prüfen die Integrität aller gespeicherten Daten einmal pro Woche, Platten- und andere Hardwarefehler veranlassen Ceph, die Anzahl der Replika automatisch wieder herzustellen und auch das Löschen von Snaphosts bringt die Festplatten zum Glühen. Für jedes dieser kleinen Probleme bietet Ceph Konfigurationsmöglichkeiten, um größere Auswirkungen auf Latenzen und Durchsatz der Clients zu verhindern.

Neben dem Datenmanagement durch Ceph selbst wird das Cluster natürlich auch von vielen Clients beansprucht. In unserem Fall wollen virtuelle Maschinen aus OpenStack und OpenNebula an ihre Daten, verschiedenste WebClients wie GitLab, Nextcloud, Glance und andere senden Swift- und S3-Anfragen und ein zentrales NFS-Storage will natürlich auch Tag und Nacht bedient werden. Auch hier kann eine Begrenzung der Requests durch libvirtd, rate-limiting oder andere Mechanismen sinnvoll sein.

Kenne deine Anforderungen

Die Anforderungen der Clients an die Latenz und den Durchsatz des Storage-Systems können sehr unterschiedlich sein. Features wie Replikation und Verfügbarkeit werden mit erhöhten Latenzen erkauft. Große Festplatten mit Spindeln drücken die Kosten, allerdings sind auch nur noch wenige Benutzer und Anwendungen mit wenigen IOPs, höheren Latenzen und dem geringen Durchsatz zufrieden. Was für Archivdaten kein Problem ist, sorgt bei Datenbanken oft für unglückliche Benutzer. Schnellere Festplatten und eine geringe Latenz im Netzwerk verbessern die Situation erheblich, erhöhen aber auch die Kosten. Zudem ändern sich die Bedürfnisse der Clients auch im Laufe der Zeit. Dank Crush kann Ceph den unterschiedlichen Ansprüchen gerecht werden. Ein schneller SSD-Pool kann ohne Probleme parallel zu einem Datengrab auf langsamen großen Spindeln betrieben werden und auch eine Umschichtung der Daten ist jederzeit flexibel möglich.

Plane im Voraus

Neben eines Datenverlusts ist ein volles Cluster wohl eines der schlimmeren Szenarios. Um eine Beschädigung der Daten zu verhindern, werden bei 95% Füllstand keine weiteren Daten mehr angenommen. In den meisten Fällen macht dies den Storage unbenutzbar. Zu diesem Zeitpunkt hat man eigentlich nur zwei Möglichkeiten: Man kann versuchen, nicht mehr benötigte Daten zu entfernen, z.B. in Form von alten, nicht mehr benötigten Snapshots, oder man vergrößert den Cluster rechtzeitig. Hierbei sollte man bedenken, dass die Beschaffung und der Einbau der Hardware schnell mal 7 bis 14 Tage in Anspruch nehmen kann. Genügend Zeit zum Handeln sichert man sich durch verschiedene Thresholds, so dass der Cluster z.B. ab einem Füllstand von 80% warnt.

Ceph kann die klaren Erwartungen an ein modernes Storage-System in den meisten Fällen erfüllen. Die gegebene Flexibilität und die ständige Weiterentwicklung sichert eine einfache Anpassung an neue Anforderungen und ein sich ständig änderndes Umfeld. Somit ist mit etwas Planung, Monitoring und Liebe ❤ ein reibungsloser und stressfreies Betreiben über viele Jahre möglich.

Achim Ledermüller
Achim Ledermüller
Lead Senior Systems Engineer

Der Exil Regensburger kam 2012 zu NETWAYS, nachdem er dort sein Wirtschaftsinformatik Studium beendet hatte. In der Managed Services Abteilung ist unter anderem für die Automatisierung des RZ-Betriebs und der Evaluierung und Einführung neuer Technologien zuständig.