At Netways we try to learn all the time. Often you can simply read man pages, change logs, or even tabtab through your shell command at hand.
Trainings and conferences are a bit more time consuming, but can offer you one priceless advantage: direct communication with someone who has seen several sides of the topic at hand. I think you learn best from other people’s experiences – even more so from the situations where failure ensued. Here it is critical to look not only at the failure itself, but also at why it occurred and how it was finally resolved.
Using other people’s failures as a means to learn might seem a bit cynical at first. In an ideal world, however, the same failure would only occur once.
So let’s begin with my contribution to a better world, maybe even as the start of a series. This entry is not meant as a full post-mortem per se; it only describes mistakes you can make and should avoid.
As many might know, our office moved. After 10ish years in our „old“ office, you can imagine that quite a lot of „historical infrastructure“ had grown in that time. Especially in an IT environment where everybody wants to try something new, better, and undocumented.
Being in the distinguished situation of having direct access to our DC via 1 Gbit fiber, it was possible to use some of our external IPs in our office. VLAN tagging, firewall policies, iptables rules – all well known and understood best practices. The services got installed, were used, worked flawlessly the whole time and lived happily ever after.
With the announcement of the move to the new premises, the blue sky became a bit hazy. We had to move all needed hardware devices (NASA could tell you how small and well hidden some of them can be) and also separate them from the unneeded devices. Today’s story is about two specific systems, each consisting of two 1U servers. Those were some of the systems which used external IPs to provide services to the public. They had been running flawlessly for quite some time and, given their age, had started to develop some adorable quirks.
It was my task to move these „dear old ladies“ out of the cozy office into the cold, professional DC downstairs.
What harm can four old and lovable servers possibly do, you might ask? The answer is: None, if you treat them the way they were used to.
First things first: How can you gain access to the DC? Is there a registration process that has to be followed? How long does this take? Can you access the DC after business hours easily? (Hints: yes, long, no)
Also don’t try to rush things when it comes to shutting down the machines for the move. The machines’ owners like to know what is going to happen.
Grab all the tools you will need to remove the servers from your previous rack and install them in their new home (cordless screwdriver, all bits you can find).
Is all installation material available? This refers not only to rack rails, but also to cage nuts, screws (size matters) and front covers for feng shui and air flow. (Hint: depends, mostly: no)
Cables! Just collect all the cables you need and then some. Usually, they will be too short. Too long is not an issue you can’t fix with zip ties (you will forget these).
Do you have network access in the DC for debugging, communication etc.? (Hint: depends)
Do you want to move more than one server? Be cautious and do them step by step. You’re absolutely allowed to install them all at the same time. Just be aware that you might experience crooked rails, incorrect cabling and other time-consuming things.
When you have installed the machines, definitely take your time to check them with a KVM device, whichever is in reach (you guessed it: there won’t be any when you need them the most). Don’t rely on the machine and its fancy blinkenlights: Some may flash when everything is ok, some flash to indicate errors, some don’t flash at all.
Check all your cabling at least twice and give the cables a gentle pull – if they come loose, you have to start over again.
Take breaks between different machines. Either try to find a cool spot in the DC (haha) or get outside, have something to drink and return refreshed. The noise (ear plugs, ANC headphones), temperature and confinement while working in the rack will wear you out eventually.
If you route the machines’ traffic through several VLANs, make sure all needed switch ports are tagged (or untagged? You decide!) and firewall policies are applied for the new location.
Always have a piece of paper and a (working!) pen with you – it’s faster to scribble something on paper than to crawl through your rat’s nest of cabling, climb over all the machines you’re about to install and then find your trusty notebook with a dead battery.
Before you finally leave, make sure to give everything one last check and, if possible, communicate with the owners of the respective machines. Collect all the tools you brought with you. If you didn’t bring them but „found“ them somewhere and used them: make sure to return them.
If you run into any issues, make sure all colleagues you could ask for assistance are currently at the party you’re rushing to attend.
Also make sure to communicate only via phone, so you don’t leave any paper trail when it comes to DC access, network config or time accounting.
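For the VLAN and firewall item above, a quick scripted check can save you a second trip downstairs. The following is only a minimal sketch in Python: the service labels, IP addresses (taken from the documentation range 192.0.2.0/24) and ports are made up for illustration and have nothing to do with our actual setup. It simply tries to open a TCP connection to each service and reports whether that worked.

```python
import socket

# Hypothetical inventory: (label, host, port). Replace with your own services.
SERVICES = [
    ("old-lady-1 ssh", "192.0.2.10", 22),
    ("old-lady-2 https", "192.0.2.11", 443),
]


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False


def check_all(services, timeout=3.0):
    """Map each service label to True (reachable) or False (not reachable)."""
    return {
        label: is_reachable(host, port, timeout)
        for label, host, port in services
    }


if __name__ == "__main__":
    for label, ok in check_all(SERVICES).items():
        print(f"{'OK  ' if ok else 'FAIL'} {label}")
```

Run it once before the move and once after, ideally from outside your own network. A successful TCP connect only tells you that the path through VLANs and firewall is open, not that the service itself is healthy, so the KVM check still applies.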
If you manage to avoid some of these mistakes because of this post, it was a success. Of course some points are missing (feel free to comment), but I hope the overall pattern is visible:
be prepared, double check and take your time
Oh, and don’t forget the key to your racks. There is just one key, right?