Friday 27 January 2012

The Server Upgrade Uncertainty

I am often asked what it is like to work in IT Support. Most people assume that it is a geeky and uncool job. Others believe it to be never-ending fun: playing with gadgets all day long and never doing any "real work". The rest are usually amazed at the flashing lights and the unidentifiable yet intriguing components that lie about the office, and believe it to be a glamorous and trouble-free profession. Most never see what we do in the background, and so cannot appreciate the hard work we put in: the early mornings, the late nights, the endless progress bars. Failures and downtime are fixed in their memories like the 1966 World Cup, but our successes are usually forgotten almost instantaneously.

Occasionally, but not very often, the frustrations of the IT world come not from the users, but from the systems themselves.

We were recently running upgrades on some of our major systems, getting them up to the latest builds, introducing new features, and generally improving our resources' speeds and productivity. It's at times like these that you realise that technology is no different from humans; it can be as stubborn as hell, and tends to have an eccentric and sometimes illogical mind of its own.

It all started a year or so ago, just after the new software was released...

"The new email system is out" one of my colleagues announced to the office, expecting the reply before it even came.

Upgrading the email system is no small feat. An audible groan filled the room, followed by a chorus of "F*** THAT!" We knew full well that one day soon we would have to face the inevitable, bite the bullet, and take the plunge.

Over the next twelve months, we were plagued with jokes and anecdotes about the software's wonderful new features. Fortunately, it took that same twelve months for other software vendors to make their own software compatible with it, giving us a little breathing room and enabling us to push it back.

Finally we had to give in to the pressure and jump in with both feet, as our old server was slowly but surely grinding to a halt while more and more users sent and received more and more emails. Isn't communication a wonderful thing..?

First port of call: we had to spec the new servers out. It's a good job we didn't follow the recommended specifications provided by the vendor. If we had, we would have needed about 8 servers, each with quad quad core processors, a terabyte of memory, and more disk space than was physically possible to squeeze into our server room. How the vendor gets away with recommending these specs is a mystery to us; their testing labs must look like something from the 25th century. It was only after Googling what other professionals in the real world had done that we found that no one follows them anyway.

[Image caption: In the future all server rooms will look this pretty!]

With the servers built, it was time to start installing the software. We soon realised, in the usual IT specialist's special way, that following the instructions was about as useful as a user at a log on screen.

The install went well; a little too well, actually. We had three servers up and running within a few hours (each a considerably lower spec than was recommended, and each firing on all cylinders), with the software installed and configured with the basics. We were a little worried by this time that everything had gone off without a hitch. By the end of the first day we had a new environment integrated with the current one. It was then that reality set in. Now there was no turning back, even if we wanted to.

It was also at this point that we suddenly found all of the things that needed to be done in order for the environment to function: the extra plugins that needed installing, the preliminary scripts that had to be run, the prayers that had to be said to the gods of computing. Things that were not in the instructions, even if we had bothered to read them. Things you only find out about when you get support calls from your users asking why certain things are not working.

It was with smug, geeky pride that we used mile-long, complicated command lines to configure the simplest of functions, and ran impressive-looking scripts instead of using the graphical interface, just because we could!

Not everything went to plan, as we found when we lost nine users' mailboxes due to a typo and spent an entire day trying to get them back. This was no small feat, as we had to eliminate every trace of them from the system before it would let us restore them. It took a long time for our fingernails to grow back after that one.

This wasn't the only hiccup. My colleagues and I spent numerous hours on the Internet searching for answers to questions we hadn't even gotten around to asking. It is never good when the software gives you an error and a link offering further information and a supposed fix, and the link contains nothing but this:

"We have no further information on this error. If you find a solution please contact us."

Very helpful. NOT!

Then there were the times when there were no errors at all, and the software just blatantly refused to function correctly without telling us why. Many a curse was uttered from our lips as we battled to get the websites to redirect, or to get certificates that the system would recognise without throwing its toys out of its pram.

It took about 4 to 5 weeks to get everything migrated over. During that time, every problem that came through the support line was blamed on our upgrade, whether it was to blame or not. The Internet was down: the upgrade was to blame. A switch went down: the upgrade was the cause. The best ones were the people complaining that we were preventing them from getting to their email because of the migrations, only these were usually the people we hadn't moved yet.

We got complaints if we didn't notify the users what we were doing; we got complaints if we did. Most of them didn't read the notifications properly anyway, and jumped to conclusions, causing a mass panic. If web mail was down for more than a minute, the world was coming to an end. If email was down at all, well, we were very close at one point to having a suicide pact on our hands. Thankfully this was avoided, although there would have been fewer mailboxes to move.

Then there were the new features. Firstly the web mail interface: Nice, new, fresh! And yellow. Yes, the log on screen had changed from grey to yellow, and before we knew it the phone was ringing off the hook, with users calling to complain that the log on box was now a different colour. Apart from that it was identical to the old version, but for the fuss some people made, we may as well have just deleted their mailbox, formatted their PC, cleared out their bank accounts, and stolen their cars.

The archiving was not much better. For years we had tried to pry our users away from using local archives that easily got deleted or corrupted, or lost when the computer was formatted. The new feature allowed us to use the same principle, but store the archives centrally and, even better, enable access from anywhere that you can retrieve your mail. Anyone would have thought we had moved their mail into a small wooden hut in the centre of the Amazon rain forest. The users can cope with a separate archive section that is stored on their local drive, but give them a separate archive section that is stored centrally (it looks exactly the same) and they are ready to call their local MPs, gather their pitchforks and go on a lynching rampage.
Unbelievable!

With the email system fully functional, we turned our attention to our next project: upgrading our communication services. We always like this one: quick, simple, and we get to play with video conferencing.

No such luck!

We started off cautiously, by reading a professional walk-through from the Internet. Yep, we did our research this time. It showed us exactly how to go about the install, how to configure the system, and most importantly how to migrate from the old version to the new one seamlessly.

Our plan was to run the new version on a new server beside the old one. This way we could switch over gradually. What actually happened was that, as we installed the new version, it overwrote the configuration of the old one, effectively leaving us without a working system at all. We didn't see this in the small print.

It took us a good day or so to get the system into a workable state, during which time we had several members of staff suffering from withdrawal, and very nearly needing counselling. It did take an impossible amount of time to get the simplest of functions working though, and to this day some of it remains a complex conundrum that can only be solved by standing on our heads and looking at it cross-eyed.

Finally we had to remove the old server. This could only be done by removing the software first to ensure it was no longer registered in the network. Leaving traces of it could give us further problems in the future.

We should have learnt by then NEVER to follow instructions. After the initial installation problems, it was no surprise that running the uninstallation caused us a few more, as the software kindly removed the databases too. The same databases that the new version was now quite happily running on. It was a bit of a shock as they disappeared before our very eyes. Our clients disconnected, and we sat in stunned silence until more obligatory swearing and cursing took hold. It only took us the rest of the morning to get the databases restored from backups, and the software reinstalled to a state where everything sprang back to life.

Testing the video conferencing was an experience on its own: six of us, three in the same office, each armed with a webcam and a microphone. Yes, three microphones in the same room; the feedback was incredible. We spent the first five minutes covering our ears from the noise and scrambling for the mute button. We then passed an exciting half an hour having a long-distance conversation with the guys in the next room. Gotta love communication technology!

All in all the upgrades went well, with only a few sleepless nights, panic attacks, and the occasional brain twister. However, it's not something we are going to do again anytime soon.

"The new version of SharePoint Portal is out," we are told a week later.
"F*** THAT!" we all answer in unison.
