Server Maintenance – March 9th & 10th from 11PM – 4AM Arizona time (UTC-7)


Summary:

The TechSurgeons web & email server will be moved to our new facility March 9th, starting at 11PM Arizona time. Client sites will not be impacted by this move – just *.techsurgeons.com sites and services.

Estimated downtime is 4 hours. We’ll post status updates as we’re able at www2.techsurgeons.com and on Facebook.

If the email server move goes well, we may update the web servers and reboot them. Estimated downtime is 10 minutes per server, and those reboots would affect client sites.


All the details:

We’ve been dissatisfied for a while with the Phoenix facility where we colocate our servers. The original company we chose was acquired, and the new company has not been performing, which is why we’d been putting almost everyone on servers at our Quebec facility. (We still like Quebec for sites where most traffic goes to the UK/Europe.)

On October 7th, we started moving into our new Phoenix facility with all new servers. Since then, we’ve migrated over 350 sites from the old facility to the new Phoenix one, and an additional 100ish sites from Quebec.

All of these migrations have been done one at a time, by hand, completely behind the scenes to minimize any impact on you or your site’s visitors. While incredibly time-consuming, the process was more successful than we expected, with only a few sites having issues. (Early on, we kinda overloaded the network connection on a couple of servers, and had to bump them from a gigabit connection to a ten gigabit connection.)

One of the advantages of the new facility is that we can add additional ISPs for network redundancy. Almost all sites now have multiple “phone numbers” split between two providers – “multi-homed” in geek speak. If someone tries to connect to a site on one, and it’s not working right, browsers are smart enough to try the second.
This has worked amazingly well, but confuses some monitoring tools (sorry Pingdom) which don’t know how to deal with “multi-homed” sites.
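For the curious, here is a rough sketch of the "try the second phone number" behavior described above. It is an illustration only, not how any particular browser is implemented: the function name `connect_with_fallback`, the stand-in `fake_connect`, and the example addresses (taken from the reserved documentation ranges) are all made up for this post.

```python
from typing import Callable, Sequence, Tuple, TypeVar

Conn = TypeVar("Conn")


def connect_with_fallback(
    addresses: Sequence[str],
    connect: Callable[[str], Conn],
) -> Tuple[str, Conn]:
    """Try each address in order and return the first one that connects.

    `connect` is any callable that opens a connection to one address and
    raises OSError when that address is unreachable.
    """
    if not addresses:
        raise ConnectionError("no addresses to try")
    last_error = None
    for address in addresses:
        try:
            return address, connect(address)
        except OSError as exc:
            # This path is down; remember why and move on to the next one.
            last_error = exc
    raise ConnectionError(f"all addresses failed: {last_error}")


def fake_connect(address: str) -> str:
    """Hypothetical connector: pretend the first provider's path is down."""
    if address == "192.0.2.10":
        raise OSError("network unreachable")
    return f"connected to {address}"


if __name__ == "__main__":
    # Two made-up addresses for one site, one per provider.
    print(connect_with_fallback(["192.0.2.10", "198.51.100.10"], fake_connect))
```

A monitoring tool that only ever checks the first address would report the site as down in this scenario, even though a visitor would get through on the second.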

Now that all the production web sites have been moved, we’re down to the stuff that can’t be moved without downtime.

The primary goal for Thursday’s maintenance window is the move of our original infrastructure server. It’s been 711 days since that server was last rebooted. That’s too long, but we had a bad experience last time, and are wary of rebooting it unless we’re physically there. The poor thing has been a little flaky the last couple of weeks, but we’ve been keeping it going until Thursday.

This server is responsible for:

  • mail.techsurgeons.com
  • webmail.techsurgeons.com
  • portal.techsurgeons.com
  • www.techsurgeons.com
  • support.techsurgeons.com

We will start the shutdown process at 11PM. It will take about 30 minutes to shut the server down and disassemble it. Then 30 minutes to drive it to the new facility. Once we get it there, figure 45 minutes to reassemble and mount it. And then something between 15 minutes and 2 hours for it to come back up.
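Adding up those steps gives a quick sanity check on the 4 hour estimate. The sketch below just does the arithmetic with the numbers from the paragraph above; the names `FIXED_STEPS` and `downtime_range` are ours for illustration.

```python
from datetime import timedelta

# Step durations in minutes, as listed in the paragraph above.
FIXED_STEPS = [
    ("shut down and disassemble", 30),
    ("drive to the new facility", 30),
    ("reassemble and mount", 45),
]
BOOT_MIN_MINUTES = 15   # fastest expected boot
BOOT_MAX_MINUTES = 120  # slowest expected boot


def downtime_range():
    """Return (best_case, worst_case) total downtime as timedeltas."""
    fixed = sum(minutes for _, minutes in FIXED_STEPS)
    best = timedelta(minutes=fixed + BOOT_MIN_MINUTES)
    worst = timedelta(minutes=fixed + BOOT_MAX_MINUTES)
    return best, worst


if __name__ == "__main__":
    best, worst = downtime_range()
    print("best case: ", best)   # 2:00:00
    print("worst case:", worst)  # 3:45:00
```

So if everything goes to plan, the server is back somewhere between 1AM and 2:45AM, which leaves a little slack inside both the 4 hour estimate and the 11PM to 4AM window.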

Yes, we’ll drive carefully, have a fresh backup of the server, and have extra hardware on hand in case the server doesn’t come back up correctly. Those should mitigate the worst case scenarios. (In a future maintenance window, we’ll upgrade the server itself.)

During the downtime, mail destined for us should just queue up on the sending side. The SMTP standard (RFC 5321) says that sending mail servers should keep retrying for at least 4 to 5 days before giving up.
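To show why a few hours of downtime shouldn't lose mail, here is a toy model of a sending server's retry loop. The fixed 30 minute retry interval and the 3 day give-up window are assumptions chosen to be conservative; real mail servers use their own schedules, often with increasing delays. The function name `first_successful_retry` is made up for this sketch.

```python
from datetime import timedelta
from typing import Optional


def first_successful_retry(
    sent_after: timedelta,
    outage: timedelta,
    retry_interval: timedelta = timedelta(minutes=30),  # assumed interval
    give_up: timedelta = timedelta(days=3),             # assumed, conservative
) -> Optional[timedelta]:
    """Return how long a message waits in the sender's queue.

    sent_after: how long after the outage began the first attempt was made.
    outage:     total length of the outage.
    Returns None if the sender gives up before our server is back.
    """
    waited = timedelta(0)
    while waited <= give_up:
        if sent_after + waited >= outage:
            return waited  # our server is back, so this attempt succeeds
        waited += retry_interval
    return None


if __name__ == "__main__":
    # A message first attempted one hour into a four hour outage.
    print(first_successful_retry(timedelta(hours=1), timedelta(hours=4)))
```

Under these assumptions, a message first sent an hour into a four hour outage is delivered on a retry three hours later, far inside the give-up window.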

If the mail server move goes well, there’s some other maintenance we’d like to do, but nothing is as urgent.

Please understand that we won’t be very reachable during this time. We’ll put up a status page at the soon-to-be-created www2.techsurgeons.com, and try to post updates on Facebook to keep folks informed.

Wish us luck!

-TS
