Obstacles when setting up Mesos/Marathon

Sebastian already mentioned Mesos some time ago; now it’s time to take a more practical look at this framework.
We’re currently running our NWS Platform under Mesos/Marathon and are quite happy with it. Sebastian’s talk at last year’s OSDC can give you a deeper insight into our setup. We started migrating our internal coreos/etcd/fleetctl setup to Mesos with Docker and could also provide some of our customers with a new setup.
Before I describe the snares I ran into during the migration, let’s have a quick overview of how Mesos works. We will have a look at Zookeeper, Mesos, Marathon and Docker.
Zookeeper acts as a centralized key-value store for the Mesos cluster and as such has to be installed on both the Mesos-Masters and -Slaves.
Mesos is a distributed system kernel and runs on the Mesos-Masters and -Slaves. The Masters distribute jobs and workload to the Slaves and therefore need to know about their available resources, e.g. RAM and CPU.
Marathon is used for orchestration of Docker containers and can access the information provided by Mesos.
Docker is one way to run containerized applications and is the one used in our setup.
As we can see, there are several programs running simultaneously, which creates the need for seamless integration.
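To make the "Masters need to know about available resources" point a bit more tangible, here is a deliberately simplified toy sketch. It is not how Mesos is implemented (Mesos hands resource offers to frameworks such as Marathon, which then decide); the `Slave` class, the `place` function and the host names are made up for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Slave:
    """Hypothetical record of what one slave currently has free."""
    name: str
    free_cpus: float
    free_mem_mb: int


def place(task_cpus, task_mem_mb, slaves):
    """Pick the slave with the most free CPU that can still fit the task.

    Returns the chosen slave's name, or None if no slave has enough
    free CPU and memory. The chosen slave's free resources are reduced.
    """
    candidates = [
        s for s in slaves
        if s.free_cpus >= task_cpus and s.free_mem_mb >= task_mem_mb
    ]
    if not candidates:
        return None
    chosen = max(candidates, key=lambda s: s.free_cpus)
    chosen.free_cpus -= task_cpus
    chosen.free_mem_mb -= task_mem_mb
    return chosen.name


# Hypothetical cluster state as reported by two slaves.
slaves = [
    Slave("mesos-slave1", free_cpus=2.0, free_mem_mb=4096),
    Slave("mesos-slave2", free_cpus=4.0, free_mem_mb=2048),
]

first = place(1.0, 1024, slaves)    # both fit, slave2 has more free CPU
second = place(1.0, 2048, slaves)   # slave2 has only 1024 MB left now
third = place(8.0, 512, slaves)     # nobody has 8 CPUs free
print(first, second, third)
```

The takeaway is only that placement decisions depend on up-to-date resource reports, which is why the Masters and Slaves have to be able to talk to each other reliably.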
What are obstacles you might run into when setting up your own cluster?
1. Connectivity:
When you set up, for example, several VMs to run your cluster, please make sure they can reach each other. What looks simple can become frustrating when the Zookeeper nodes can’t find each other due to “wrong” /etc/hosts settings, such as localhost
This should be altered to $hostname, e.g. mesos-slave1
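A quick way to spot this kind of problem is to look for host names that the hosts file maps to a loopback address. The following is a minimal sketch under my own assumptions: that the "wrong" entry is the node's name pointing at 127.x.x.x, and that the sample addresses and names below (including `mesos-master1`) are purely illustrative.

```python
import ipaddress


def loopback_hostnames(hosts_text):
    """Return names that a hosts file maps to a loopback address.

    The usual localhost aliases (localhost*, ip6-*) are not reported.
    """
    offenders = []
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        fields = line.split()
        address, names = fields[0], fields[1:]
        try:
            is_loopback = ipaddress.ip_address(address).is_loopback
        except ValueError:
            continue  # not a plain IP address, skip the line
        if is_loopback:
            offenders.extend(
                n for n in names
                if not n.startswith(("localhost", "ip6-"))
            )
    return offenders


# Illustrative hosts file content (assumed, not taken from a real node).
SAMPLE = """\
127.0.0.1    localhost
127.0.1.1    mesos-slave1
::1          ip6-localhost ip6-loopback
192.168.0.11 mesos-master1
"""

print(loopback_hostnames(SAMPLE))  # ['mesos-slave1']
```

On a real node you could feed it the contents of /etc/hosts instead of `SAMPLE`; any name it reports is worth a second look before blaming Zookeeper.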
2. Configuration
Whenever you make changes to your configuration, they have to be propagated through your complete cluster. Sometimes it doesn’t even need a service restart. Sometimes you may need to reboot. In desperate times you might want to purge packages and reinstall them. In the end it will work and you will happily run into


3. Bugs
While Marathon provides you with an easy-to-use web UI to interact with your containers, it has one great flaw in the current version. As the behaviour is so random, you might be tempted to search for issues in your own setup. You might or might not be able to make live changes to your configured containers. Worry not, the “solution” may be simply using an older version of Marathon.
Version 1.4.8 may help.


Have fun setting up your own cluster and avoiding annoying obstacles!
Edit 20180131 TA: fixed minor typo

Tim Albert
System Engineer

Tim comes from a small town between Nürnberg and Ansbach, located on the picturesque B14. He studied teaching in Erlangen and information management in Koblenz; his job as a working student at IDS Scheer considerably influenced his shift from teaching to IT. Alongside his studies, Tim also worked in user support at a factory customer service company. Blerim and Sebastian brought him into our Managed Services team at the beginning of 2016, where he now particularly...

Like meeting the family – OSDC 2017: Day 1

I was happy to join our conference crew for OSDC 2017 again because it is like meeting the family, as one of our attendees said. The conference already started for me yesterday because I could join Gabriel's workshop on Mesos Marathon. It was quite an interesting introduction to this topic with examples and know-how from building our Software-As-A-Service platform “Netways Web Services“. But it was also very nice to meet many customers and long-time attendees again, as I already knew more than half of the people joining the workshops. So day zero ended with some nice conversation at the hotel’s restaurant.
As always, the conference started with a warm welcome from Bernd before the actual talks (and the hard decision which talk to join) started. For the first session I joined Daniel Korn from Red Hat’s Container Management Team on “Automating your data-center with Ansible and ManageIQ“. He gave us a good look behind “one management solution to rule them all” like ManageIQ (the upstream version of Red Hat CloudForms), which is designed as an open source management platform for hybrid IT. So it integrates many different solutions like Openshift, Foreman or Ansible Tower in one interface. And as no one wants to configure such things manually today, there are some Ansible modules to help with automating the setup. Another topic covered was Hawkular, a time series database including triggers and alarming, which could be used to get alerts from Openshift to ManageIQ.
The second talk was Seth Vargo with “Taming the Modern Data Center” on how to handle the complexity of today's data centers. He also covered life cycles shrinking from timeframes measured in days, weeks and months to seconds and minutes, and budgets moving from CapEx to OpEx through the use of cloud or service platforms. With Terraform he introduced one of HashiCorp’s solutions to help with solving these challenges by providing one abstraction layer to manage multiple solutions. Packer was another tool introduced to help with image creation for immutable infrastructure. The third tool shown was Consul, providing Service Discovery (utilizing DNS or an HTTP API), Health Checking (and automatic removal from discovered services), Key/Value Store (as configuration backend for these services) and Multi-Datacenter (for delegating service requests to the nearest available system). In addition, Seth gave some good insight into workflows and concepts inside HashiCorp: they use their own software and test betas in production before releasing, and they trust the developers of the integrated software to maintain the providers required for this integration.
Next was Mandi Walls on “Building Security Into Your Workflow with InSpec”. The problem she described, which InSpec tries to solve, is that security reviews can slow down development, but moving security reviews to scanning a production environment is too late. So InSpec gives the administrator a spec dialect to write human-readable compliance tests for Linux and Windows. This makes the tests understandable for non-technical compliance officers, and profiles give them a catalog to satisfy all their needs at once. If you want an example, have a look at the chef cookbook os-hardening and the InSpec profile /dev-sec/linux-baseline, which work nicely together by checking compliance and running remediation.
James Shubin giving a big live demo of mgmt was entertaining and informative as always. I have already seen some of the demos at other events, but it is still exciting to see configuration management with parallelization (no unnecessary waiting for resources), event driven (instant recreation of resources), distributed topology (no single point of failure), automatic grouping of resources (no more running the package manager for every package), virtual machines as resources (including managing them from cockpit and hot plug cpus), remote execution (allowing to spread configuration management through SSH from one laptop over your data center). mgmt is not production ready for now, but it's very promising. Future work includes a descriptive language, more resource types and more improvements. I can recommend watching the recording when it goes online in the next days.
“Do you trust your containers?” was the question asked by Erez Freiberger in his talk before he gave the audience some tools to increase the trust. After a short introduction to SCAP and OpenSCAP, Erez spoke about Image inspector, which is built on top of them and is utilized by OpenShift and ManageIQ to inspect container images. It is very good to see security getting nicely integrated into such tools, and with the mentioned future work it will be even nicer to use.
For the last talk of today I joined Colin Charles from Percona, who let us take part in “Lessons learned from database failures”. On his agenda were backups, replication and security. Without blaming and shaming, Colin took many examples which failed and explained how it could be done better with current software and architecture. This reminds me to catch up on MySQL and MariaDB features before they hit enterprise distributions.
So this is it for today. After so many interesting talks I will have some food, drinks and conversation at the evening event taking place at Umspannwerk Ost. Tomorrow I will hand over the blog to Michael because I will give a talk about Foreman myself.

Dirk Götz
Senior Consultant

Dirk is a Red Hat specialist and works at NETWAYS in consulting for Icinga, Puppet, Ansible, Foreman and other systems management solutions. Previously he was employed as a senior administrator at a statutory pension insurance institution, where he was also responsible for training the apprentices, as he now is at NETWAYS.

Ready, Steady, Go! — The faster the better!

This is your LAST CHANCE to be part of the best open source conference this May in Berlin!
From May 16 to 18, it’s all about open source data center solutions for complex IT infrastructures once again. Three days of hands-on workshops, presentations and social networking in a super relaxed atmosphere with a bunch of really great people is what you can expect. The 2017 main conference topics are
Containers and Microservices
Configuration Management
Testing, Metrics and Analyses
Tools & Infrastructure
Join the open source community, learn from well-known data center experts, get the latest know-how for your daily business and meet international open source professionals.
So hurry up if you want to grab one of the last remaining tickets for OSDC 2017 and register now at www.osdc.de!

Pamela Drescher
Head of Marketing

Pamela took over marketing at NETWAYS in December 2015. She is responsible for the corporate identity of our events as well as of NETWAYS as a whole. The close cooperation with Events stems from the fact that a few years ago she led the events department together with Markus, and this truly excellent cooperation now ties the Events and Marketing areas even closer together. In her private life she is the leader of a four-member horde of cats, which gives her the absolutely...

MesosCon Europe

This year MesosCon Europe takes place in Amsterdam. Besides nice weather and a great city, a broad program around the Apache Mesos framework awaits us.
Mesos itself is a cluster framework for managing the resources available in the data center (e.g. CPU, RAM, storage). A scheduler in Mesos offers these resources to a wide variety of applications and starts them. Especially in the container world, Mesos in combination with Marathon (a Mesos framework) offers many ways to manage your applications.
The recently released version 1.0 is of course a big topic. Besides a new HTTP API, Mesos now also offers a unified containerizer for starting different container formats. The new version also brings new networking features; above all, the ability to assign one IP per container is among the highlights. Last but not least, the release is rounded off by a new authorization model.
Over the next two days, the broad program offers talks on presumably every topic around Mesos, Marathon, microservices, service discovery, storage, DC/OS and more, and of course well-known companies such as Twitter and Netflix give insight into their setups, in which Mesos naturally manages the microservices.
A hackathon on Friday rounds off the conference nicely. Unfortunately, we will already be heading back to Nürnberg before then.

Achim Ledermüller
Lead Senior Systems Engineer

The Regensburg exile joined NETWAYS in 2012 after finishing his business informatics degree there. In the Managed Services department he is responsible, among other things, for automating data center operations and for evaluating and introducing new technologies.

OSDC 2016: More for your datacenter stack

We need more coffee – the first talk of day 2 directly kicked off with automation and challenges. We are using Foreman at NETWAYS and it helps me on a daily basis to deploy development boxes with OpenNebula. So it was interesting to find out about insights and challenges by Julien Pivotto with Automating a R&D lab with Foreman: What can be hard?.

Sadly the second talk about Interesting things you can do with ZFS by Allan Jude & Benedict Reuschling was at the same time – again one for the conference archive.

Decisions decisions. Mesos and the Architecture of the New Datacenter by Jörg Schad or Hybrid Cloud – A Cloud Migration Strategy by Schlomo Shapiro. I guess I’m one of those devops hipsters going for Mesos. Although I hear that its implementation can be tricky (hi Sebastian) we’re using it at NETWAYS and I wanted to learn more about it. Especially since Mesosphere recently announced DC/OS.

Ingesting Logs with Style sounds like a swiss army knife presentation style. Logs always remind me of the days when there was no Logstash or Graylog around, just some unacceptably expensive Splunk license and your own central syslog server plus some custom handmade scripts. Pere gave an awesome outlook on what’s coming with Elastic Stack 5.0 including his sports activities as live demos.

Everyone knows about Docker. Everyone uses it in production already? Or at least tried to until the company’s security team stepped in? Inspecting Security of Docker formatted Container Images to find Peace of Mind provided insights on the most frequently asked questions when it comes to production deployments.
Right after lunch David Schmitt gave an interesting Introduction to Testing Puppet Modules. Being a developer and writing tests? Meh. The least exciting part right after the documentation bits. Though tests will make your life easier especially with Puppet modules ensuring they won’t break. The talk’s topic also reminds me of Tom de Vylder maintaining the Icinga puppet modules and insisting on rspec tests on every single PR – chapeau 🙂

Coming from MariaDB, Colin Charles provided tips and tricks on Tuning Linux for your Database.

An Introduction to Software Defined Networking (SDN) by Martin Loschwitz or Bareos Backup Integration with Standard Open Source Tools with Maik Aussendorf?

Finally, the last talks of this year's OSDC. We already learned about Puppet and Salt, now it is time for Kaiten Zushi – Chef at Goodgame Studios by Jan Ulferts. Florian Lautenschlager with Chronix – A fast and efficient time series storage based on Apache Solr was again my favourite for the conference archive.

Thank you and see you in 2017

The conference archive will be made available in the next couple of days. Save the date – 16.-18.5.2017 🙂
PS: Everything is a freaking DNS problem!

Michael Friedrich
Senior Developer

Michael has been an Icinga developer for many years and ventured into the NETWAYS adventure at the end of 2012. A move from Vienna to Nürnberg, with a fondness for importing Austrian delicacies: many a colleague despairs over the addictive Dragee-Keksi and the Linzer Torte. Or simply over the Austrian dialect, which he likes to intensify with Thomas in the office ("Jo eh."). When Michael is not helping out in the community, he is working on his next LEGO project or enjoying...

Weekly Snap: Metacircular Evaluator, Puppet Camp Program & Mesos

8 – 12 September offered tips for cluster admins, programmers and new RPM packagers, plus news on the events front.
Eva counted 78 days to the OSMC with Páll Sigurdsson’s talk on Adagios, an Icinga / Nagios addon for web configuration.
She went on to announce the program for Puppet Camp Dusseldorf, and our participation and sponsorship of DevOps Camp Nuremberg.
Sebastian then shared his thoughts on the Apache Mesos cluster manager while Jean-Marcel explained the Metacircular Evaluator programming process.
Finally, Dirk continued his series on RPM packaging with a post on useful tools.


With Mesos, the Apache project has created a cluster manager that takes care of running distributed applications.
The added value is that the framework aims to establish a standard that does away with the complex “reinventing the wheel” for such purposes, so that existing, working software can be relied on. One example is managing a quorum in a cluster. Every cluster stack such as Pacemaker/Corosync etc. has its own logic. For this, Mesos relies on Zookeeper, which also comes from Apache. Zookeeper in turn can then be used modularly for Mesos, SolR, your own developments and so on.
So what does Mesos do? Mesos consists of at least one master and one slave. The master distributes jobs or applications and the slaves execute them. The slaves also regularly report their current load, so that the Mesos master knows where to distribute new tasks. A further framework, which can be used for all kinds of purposes, can register with this construct. There are the wildest frameworks, named Marathon, Aurora, Chronos and many more. Each uses Mesos as its basis, but for its own purposes.
Chronos, for example, was developed for running scheduled jobs. So it is a crond, except that the cron jobs can run not just on one host but on many. Marathon keeps commands or entire applications running. Once started, it makes sure that at least x instances are always running. A job can be, for example, “/usr/sbin/apachectl -d /etc/apache2 -f apache2.conf -e info -DFOREGROUND”. This of course requires the software, i.e. Apache2, to be installed on the slave.
It gets even more abstract when you configure Mesos to use an external containerizer: Docker. The example above would then not be started directly on the slave; instead, a Docker container would be started and the command executed inside it. The whole thing can be run on physical servers, in virtual servers, or both.
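To give an idea of what such a job looks like in practice, here is a small sketch that builds a Marathon app definition as JSON, once as a plain command and once wrapped in a Docker container. The helper `marathon_app`, the app IDs, the resource values and the image name `httpd:2.4` are assumptions for illustration; only the field names follow the app definition format as I know it from Marathon's REST API, where such a document is sent to the `/v2/apps` endpoint. Nothing is sent anywhere here.

```python
import json


def marathon_app(app_id, cmd, instances=2, cpus=0.5, mem=256,
                 docker_image=None):
    """Build a Marathon app definition as a plain dict.

    All default values are illustrative assumptions, not recommendations.
    """
    app = {
        "id": app_id,
        "cmd": cmd,
        "instances": instances,  # Marathon keeps this many tasks running
        "cpus": cpus,
        "mem": mem,              # in MB
    }
    if docker_image is not None:
        # Run the command inside a Docker container instead of
        # directly on the slave.
        app["container"] = {
            "type": "DOCKER",
            "docker": {"image": docker_image},
        }
    return app


# The command is the one from the text above.
APACHE_CMD = ("/usr/sbin/apachectl -d /etc/apache2 -f apache2.conf "
              "-e info -DFOREGROUND")

plain = marathon_app("/apache", APACHE_CMD)
dockerized = marathon_app("/apache-docker", APACHE_CMD,
                          docker_image="httpd:2.4")  # hypothetical image

print(json.dumps(dockerized, indent=2))
```

The plain variant needs Apache2 installed on every slave that might run it, while the Docker variant only needs Docker on the slaves and an image that actually contains the command.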
What you can use something like this for is ultimately up to you. The solution follows the SaaS/PaaS approach; essentially it provides hardware resources from a pool (IaaS) and offers libraries, frameworks and APIs to control them. The buzzword for this is, quite generally, cloud computing 🙂
By the way, we will be offering Docker training courses later this year.

Sebastian Saemann
Head of Managed Services

Sepp came to NETWAYS from a large German hosting provider because he was bored there. With us he can now better fulfill himself, as he leads the Managed Services team together with Martin. When he is not busy integrating servers into MCollective, he tries to set a new speed record on his motorcycle.