Sebastian mentioned Mesos some time ago; now it’s time to take a more practical look at this framework.
We’re currently running our NWS Platform under Mesos/Marathon and are quite happy with it. Sebastian’s talk at last year’s OSDC gives a deeper insight into our setup. We started migrating our internal CoreOS/etcd/fleetctl setup to Mesos with Docker and have also been able to provide some of our customers with a new setup.
Before I describe the snares I ran into during the migration, let’s take a quick look at how Mesos works. We will look at Zookeeper, Mesos, Marathon and Docker.
Zookeeper acts as a centralized key-value store for the Mesos cluster and as such has to be installed on both the Mesos masters and slaves.
Mesos is a distributed system kernel and runs on the Mesos masters and slaves. The masters distribute jobs and workload to the slaves and therefore need to know about their available resources, e.g. RAM and CPU.
Marathon is used for the orchestration of Docker containers and can access the information provided by Mesos.
Docker is one way to run containerized applications and is the one used in our setup.
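To make the division of labour a bit more concrete: Marathon accepts an app definition as JSON (via its /v2/apps REST endpoint) and then relies on Mesos to find a slave with enough CPU and RAM for the container. The following is only a minimal sketch of such a definition, assuming the Docker container format used by Marathon versions before 1.5; the app id, image and resource values are made-up examples, not taken from our setup.

```python
import json


def docker_app(app_id, image, cpus=0.5, mem=128, instances=1):
    """Build a minimal Marathon app definition for one Docker image.

    All values passed in here are illustrative assumptions.
    """
    return {
        "id": app_id,            # unique path-like name of the app
        "cpus": cpus,            # CPU share requested from Mesos
        "mem": mem,              # memory in MB requested from Mesos
        "instances": instances,  # number of containers to keep running
        "container": {
            "type": "DOCKER",
            # "network" inside "docker" is the pre-1.5 way to select networking
            "docker": {"image": image, "network": "BRIDGE"},
        },
    }


# Hypothetical example app; the resulting JSON is what would be sent to Marathon.
app = docker_app("/demo/web", "nginx:stable")
print(json.dumps(app, indent=2))
```

The resource fields are the link to Mesos: Marathon can only start the container if a slave offers at least that much CPU and memory.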
As we can see, several programs run simultaneously, which creates the need for seamless integration.
What obstacles might you run into when setting up your own cluster?
When you set up, for example, several VMs to run your cluster, make sure they can reach each other. What looks simple can become frustrating when the Zookeeper nodes can’t find each other due to “wrong” /etc/hosts settings, such as
This should be altered to
127.0.1.1 $hostname, e.g. mesos-slave1
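Before digging through configuration files, it can save time to check from each node whether the other nodes are reachable at all. The sketch below simply tries to open a TCP connection to every host name given on the command line. It assumes Zookeeper listens on its default client port 2181; the script name and the host names in the usage line are hypothetical.

```python
import socket
import sys

ZOOKEEPER_PORT = 2181  # Zookeeper's default client port; adjust if yours differs


def can_connect(host, port=ZOOKEEPER_PORT, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers name resolution failures, refused connections and timeouts.
        return False


if __name__ == "__main__":
    # Hosts are read from the command line; nothing is contacted without arguments.
    for host in sys.argv[1:]:
        state = "reachable" if can_connect(host) else "NOT reachable"
        print(f"{host}:{ZOOKEEPER_PORT} is {state}")
```

It could be called as `python3 check_nodes.py mesos-master1 mesos-slave1`. A host that shows up as not reachable is a good hint to look at name resolution or the network first, before suspecting Zookeeper itself.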
Whenever you make changes to your configuration, they have to be propagated through your complete cluster. Sometimes this doesn’t even need a service restart. Sometimes you may need to reboot. In desperate times you might want to purge packages and reinstall them. In the end it will work and you will happily run into the next obstacle.
While Marathon provides you with an easy-to-use web UI to interact with your containers, it has one major flaw in the current version: you might or might not be able to make live changes to your configured containers. Because the behaviour is so random, you may be tempted to look for the problem in your own setup. Worry not, the “solution” may simply be to use an older version of Marathon.
Version 1.4.8 may help.
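If you are not sure which version your cluster is actually running, you can ask Marathon itself. The sketch below assumes that the /v2/info endpoint of Marathon’s REST API is reachable and returns a JSON document with a "version" field; the URL in the usage comment is a hypothetical example.

```python
import json
import re
from urllib.request import urlopen


def parse_version(text):
    """Turn a version string such as '1.4.8' into a comparable tuple (1, 4, 8)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unexpected version string: {text!r}")
    return tuple(int(group) for group in match.groups())


def marathon_version(base_url):
    """Fetch the version string reported by a Marathon instance."""
    with urlopen(base_url.rstrip("/") + "/v2/info", timeout=5) as response:
        info = json.load(response)
    return info["version"]


# Usage (hypothetical URL, needs a running Marathon):
#   running = parse_version(marathon_version("http://marathon.example:8080"))
#   if running > parse_version("1.4.8"):
#       print("newer than 1.4.8, a downgrade might be worth a try")
```

Comparing tuples instead of raw strings avoids surprises such as "1.10.0" sorting before "1.4.8".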
Have fun setting up your own cluster and avoiding annoying obstacles!
Edit 20180131 TA: fixed minor typo