Lessons learned – how not to make beginner’s mistakes: Migrating server

At Netways we try to learn all the time. Often you can simply read man pages, change logs, or even tabtab through your shell command at hand.

Trainings and conferences are a bit more time consuming, but can offer you one priceless advantage: the direct communication with someone, who has seen several sides of the topic at hand. I think, you learn best from other peoples experiences – even more from the situations where failure ensued. Here it is critical to not only look at the failure itself, but why it occured and how it was finally resolved.

Using other persons failure as a mean to learn  might seem a bit cynically at first. In an ideal world, however, the same failure would only ocurr once.

So let’s begin with my contribution to a better world, maybe even as a start of a series. This entry is not meant as full post-morten per se, it only describes mistakes you can make and should avoid.

As many might know, our office moved. Having been 10ish years in our “old” office, you might figure that quite a lot “historical infrastructure” can grow in this time. Especially in an IT environment where everybody wants to try something new, better, and undocumented.

Being in the distinguished situation of having direct access to our DC via 1GBit-Fiber, it was possible to use some of our external IPs in our office. VLAN-Tagging, firewall policies, iptables rules – all well known and understood best practices. The services got installed, used and worked flawlessly for all the time and have lived happily ever after.

With the announcement of moving to the new premises, the blue sky became a bit hazy. We had to move all needed hardware devices (NASA could tell you how small and well hidden some of them can be) and also separate them from the unneeded devices. Todays story is about two specific systems, each consisting of two 1U server. Those were some of the systems which used external IPs to provide services to the public. They have been running flawlessly for quite some time and given their age, had started to develop some adorable quirks.

It was my task to move these “dear old ladies” out of the cozy office into the cold, professional DC downstairs.

What harm can 4 old and lovable server possibly do, you might ask? The answer is: None, if you treat them the way they were used to.

First things first: How can you gain access to the DC? Is there a registration process, which has to be followed? How long does this take? Can you access the DC after business hours easily? (Hints: yes, long, no)

Also don’t try to rush things when it comes to shutdown the machines for moving  them. The machines owners like to what is going to happen.

Grab all the tools you will need to remove the server from your previous rack and install them in their new home (cordless screwdriver, all bits you can find)

Are all installation material available? This is not only referring to rack rails, but also cage nuts, screws (size matters) and front covers for feng-shui and air flow. (depends, mostly: no)

Cables! Just collect all the cables you need and then some. Usually, they will be too short. Too long is not an issue you can’t fix with zip ties (you will forget these)

Do you have network access in the DC for debugging, communication etc.? (Hint: depends)

Do you want to move more than one server? Be cautios and do them step by step. You’re absolutely allowed to install them all at the same time. Be aware you might experience crooked rails, incorrect cabling and other time consuming things.

When you have installed the machines, definitely take your time to check these with a KVM device. Whichever is in reach. (you guessed it: there won’t be any when you need them the most). Don’t rely on the machine and its fancy blinkenlights: Some may flash when everything is ok, some flash to indicate errors, some don’t flash at all.

Check all your cabling at least twice, give them a gentle pull – if they come lose, you have to start over again.

Take breaks between different machines. Either try to find a cool spot in the DC (haha) or get outside, have something to drink and return refreshed. The noise (ear plugs, ANC Headphones), temperature, confinement while working in the rack will wear you out eventually.

If you route the machines traffic through several VLANs, make sure all needed switch ports are tagged (or untagged? You decide!) and firewall policies applied for the new location.

Always have a piece of paper and a (working!) pen with you – it’s faster to scribble something on paper than to crawl through yor rats nest of cabling, climb over all the machines you’re up to install and then find your trusty notebook with dead battery.

Before you finally leave, make sure to give everything one last check and, if possible, communicate with the owners of the respective machines. Collect all the tools you brought with you. If you didn’t bring them but “found” them somewhere and used them: make sure to return them.

If you run into any issues, make sure all colleagues you could ask for assistance are currently at the party you’re rushing to attend.

Also make sure to communicate only via phone, so you don’t leave any paper trail when it comes to DC access, network config or time accounting.

When you don’t experience some of these mistakes because of this post, this post was a success. Of course some points are missing (feel free to comment), but I hope the overall pattern is visible:

be prepared, double check and take your time

 

Oh, and don’t forget the key to your racks. There is just one key, right?

Tim Albert
Tim Albert
System Engineer

Tim kommt aus einem kleinen Ort zwischen Nürnberg und Ansbach, an der malerischen B14 gelegen. Er hat in Erlangen Lehramt und in Koblenz Informationsmanagement studiert, wobei seine Tätigkeit als Werkstudent bei IDS Scheer seinen Schwenk von Lehramt zur IT erheblich beeinflusst hat. Neben dem Studium hat Tim sich außerdem noch bei einer Werkskundendienstfirma im User-Support verdingt. Blerim und Sebastian haben ihn Anfang 2016 zu uns ins Managed Services Team geholt, wo er sich nun insbesondere...

Weekly Snap: VM migration & PuppetCamp Dusseldorf Deal

weekly snap29 September – 3 October started a new month with VM migration and much ado about Puppet – from courses to Camps, plus a promo to boot.
Eva started by counting 57 days to the OSMC with Thomas Lehmann’s talk: “Monitoring as the Source of Truth in Deploying a Dynamic Website by Waves.
She went on to share a 25% off promo code for the upcoming Puppet Camp in Dusseldorf, and hurry on interested attendees to get their tickets while they last.
Thilo then compared three methods of migrating virtual machines, as Lennart reported back from his Puppet Architect course in London.

Drei Wege um virtuelle Maschinen zu migrieren

OpenNebulaConf Grafiken 09Neue Storage-Lösungen sprießen wie Tulpen im Frühling aus dem Boden. Jede einzelne ist flexibler, schlanker und hochverfügbarer.
Da kommt meine Cloud ins Spiel, die eigentlich gut läuft aber so ein schnelleres Storage ist eine willkommene Abwechslung.
So ein neues Storage ist schnell aufgesetzt, was uns dann aber vor eine neue Aufgabe stellt,
denn unsere VMs laufen… nur nicht auf unserem hippen Storage.
Nun gibt es diverse Methoden um eine virtuelle Maschine in ein neues Image bzw. neues Storage zu transferieren.
Da haben wir zum einen die altbewährte Methode, mit dem Urgestein aller blockorientierten Kopiervorgänge dd.
Dazu muss die virtuelle Maschine komplett ausgeschaltet sein. Da sich der Zustand der VMs nicht mehr ändert, kann man beruhigt die VM kopieren.
dd if=/path/to/input/file of=/path/to/output/file bs=4096
Zum anderen die Methode ein qcow2 Image in ein Blockdevice zu schreiben.
In Worten gesagt: das Image wird mit “qemu-img convert” in Raw umgewandelt und danach mit dd auf das neue Blockdevice kopiert. (Auch hier sollte die VM nicht mehr laufen!)
qemu-img convert -p -f qcow2 -O raw /path/to/input/file /path/to/outputfile.raw && dd if=/path/to/outputfile.raw of=/path/of/device bs=4M
Da die beiden genannten Arten eine lange Downtime benötigen, sind sie nur für VMs geeignet die nicht zeitkritisch sind.
Ein UNIX System kann mit guten Kenntnissen, mit relativ kurzer Ausfallszeit migriert werden. Ein hilfreiches Werkzeug dabei ist Rsync.
Leider kann ich hierzu kein fixes Beispiel vorzeigen, da die einzelnen Schritte von System zu System unterschiedlich ausfallen.
Die essentiellen Schritte sind:
1. Neues Device in der VM mounten und das gewünschte Filesystem erstellen.
2. Systemverzeichnisse auf dem neuen Device erstellen.
3. Das komplette System mit Rsync auf das neue Device kopieren. Hier muss man natürlich etwas aufpassen und Verzeichnisse wie /proc oder ggf. /mnt exkludieren. Auch auf bind Mounts sollte man achten, damit man Daten nicht ausversehen doppelt kopiert.
4. Die grub.cfg natürlich im neuen /boot Pfad aktualisieren. (grub-install und update-grub sind hierfür hilfreich)
5. Das “alte Device” als read-only einbinden und die neue fstab anpassen.
6. Und last but not least, einen weiteren Rsync in dem die restlichen Files auf das neue Image übertragen werden. (auch hier bitte das Exkludieren von wichtigen Pfaden nicht vergessen. z.B.: /etc/fstab oder auch /boot !!)
Der Vorteil hierbei ist: die Downtime erstreckt sich dabei nur über den zweiten Rsync, bei dem die Festplatte im “read-only” Modus ist.
Habt Ihr weitere coole Möglichkeiten einen VM zu migrieren?
Dann dürft ihr euch in den Kommentaren dazu äußern.
Oder seid Ihr interessiert an dem Thema Cloud und alles was damit zu tun hat? Dann besucht uns einfach auf der OpenNebula Conf 2014

Thilo Wening
Thilo Wening
Consultant

Thilo hat bei NETWAYS mit der Ausbildung zum Fachinformatiker, Schwerpunkt Systemadministration begonnen und unterstützt nun nach erfolgreich bestandener Prüfung tatkräftig die Kollegen im Consulting. In seiner Freizeit ist er athletisch in der Senkrechten unterwegs und stählt seine Muskeln beim Bouldern. Als richtiger Profi macht er das natürlich am liebsten in der Natur und geht nur noch in Ausnahmefällen in die Kletterhalle.

Weekly Snap: Syslog & Graphite, Dashing & Brainstorming

weekly snap27 – 31 January finished off the month with events aplenty, monitoring in different forms and best practices in brainstorming and virtual machines.
In usual style, Eva started the week counting 71 days to the OSDC by sharing Philipp Reisner’s talk on the latest in DRBD9. She went on to announce our Call for Papers for the next Puppet Camp in Berlin on 11 April.
Seeking out best practices, Marius H considered the best way to brainstorm as Dirk compared Cloud Provisioner to Compute Resource to find the best method to deploy virtual machines.
Thomas then added a second instalment to his Logstash series with a look at the Logstash predecessor Syslog and Bernd shared a video on the reality of conference calling.
The week ended on monitoring, as Marius G took a look at Dashing for good-looking, custom dashboards and Markus W introduced our advanced course on Real-time Graphing with Graphite.