This page is updated manually with the status of current and recent events.
(Times are US/Arizona UTC-7)
Current status is: Green: I am completely operational, and all my circuits are functioning perfectly normally.
20161002 @ 6:00 PM – The sites hosted on WebhostQC5 have grown, and at peak times they exceed our comfortable load threshold. We’ll be bringing up new servers in the next week and dividing this server’s sites across the two new ones. In the meantime we’ll be shunting what traffic we can to the CDN, and possibly moving a couple of sites to existing servers.
20160929 @ 8:20 AM – We’ve discovered a bug in how W3-Total-Cache updates itself. Don’t run the update yourself; we’ll do it for you once the latest version has that bug and a couple of others fixed. Ignore the security warning: we have an unofficial interim version with that flaw fixed that we’ll be moving to first.
20160926 @ 10:45 AM – WebhostQC5’s network interfaces hung. We reset the server & it came back up fine. Downtime was approximately 5 minutes. We’re investigating the cause.
20160910 @ 7:22 PM – There was a network incident at the Quebec facility: first slowness, then a network break. When that ended, WebhostQC5 was overloaded with more than 10,000 queued connections. Rather than waiting for them to time out, we rebooted the server. Everything looks to be back up. If we find out what the network issue was, we’ll report it in a subsequent update.
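For anyone curious how a backlog like this shows up on a Linux box, here’s a generic sketch (standard /proc interface, not our actual monitoring) that tallies TCP sockets by kernel state code. Thousands of entries in a state like 01 (ESTABLISHED) or 08 (CLOSE_WAIT) point at a connection queue rather than a crashed service:

```shell
# Tally TCP sockets by kernel state code by reading /proc/net/tcp.
# Common codes: 01=ESTABLISHED, 03=SYN_RECV, 06=TIME_WAIT, 08=CLOSE_WAIT, 0A=LISTEN.
# Generic diagnostic sketch only -- not our exact tooling.
awk 'NR > 1 { states[$4]++ }
     END { for (s in states) printf "%s %d\n", s, states[s] }' /proc/net/tcp
```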
20160910 @ 7:05 PM – One, possibly two, servers are down in Quebec; we’re digging into it now.
20160905 @ 2:55 PM – All tests passed. First, plugins that try to create an incompatible table type now get an error instead of succeeding. Second, database operations which could cause the cluster to hang (large optimizations) no longer do so. And finally, backups no longer impact database writes while running.
20160905 @ 1:15 PM – Testing is going great. Afternoon backups took 20% less time, and the database cluster no longer allows the plugin that caused yesterday’s failure to succeed in its mischief.
20160905 @ 10:02 AM – We’re about to start testing the DB cluster fixes. Fingers crossed.
20160904 @ 7:45 PM – We’ve found the cause of the database cluster instability: a WordPress plugin was trying to create MyISAM tables, which are not cluster-compatible.
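For other operators running WordPress on a replicating cluster: Galera-style clusters replicate only InnoDB tables, so it can help to block MyISAM at the server level. A my.cnf sketch (the `disabled_storage_engines` option assumes MySQL 5.7.8 or later; MariaDB and Percona builds have their own equivalents):

```ini
# my.cnf sketch -- assumes MySQL 5.7.8+; option names differ on other builds.
[mysqld]
# Make CREATE TABLE ... ENGINE=MyISAM fail with an error instead of
# silently creating a table the cluster cannot replicate.
disabled_storage_engines = "MyISAM"
```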
20160904 @ 4:20 PM – The Phoenix database cluster is unstable; it has crashed 4 times. We’re unsure whether this is due to an error in MySQL or a problem with certain lookups. We’re disabling two of the servers to try to rule out the clustering software itself.
20160830 @ 9:57 PM – All good. Server is back online & all sites are functioning.
20160830 @ 9:55 PM – Maintenance is complete. Server is rebooting. Let’s hope for another 165 days before we have to do it again.
20160830 @ 9:32 PM – Maintenance is starting.
20160830 @ 7:24PM – WebhostQC2 has a failing CPU fan. We’re coaxing it along until around midnight Eastern / 9PM Pacific/Arizona. Estimated downtime is 20-30 minutes, we’ll do our best to speed that along. (Time correction)
20160827 @ 11:58 PM – Server back up. All is well.
20160827 @ 11:51 PM – Hardware updates done. Server coming back up. Sorry for the delay.
20160827 @ 11:47PM – Hardware updates just about complete. We ran into a small snag.
20160827 @ 11:24PM – Hardware updates about to commence. Shutting server down now.
20160827 @ 11:06PM – Software updates starting for webhostqc5.
20160827 @ 3:00 PM – We’re going to be upgrading the CPU in WebhostQC5 and doing software updates tonight (11 PM AZ time). Estimated downtime is 15 minutes, but it could be done in 10.
20160826 @ 8:03 AM – The internet break was at the big peering point in Chicago (what is it about Chicago?) and some of our vendors seem to have routed traffic away from it – just like you do when you hear of a collision on the freeway and a big backup. Some ISPs (Cox for sure) are still routing traffic through there.
20160826 @ 7:55 AM – We’re seeing a huge network issue that’s preventing some large parts of the Internet from seeing our Quebec-hosted sites. We’re not sure if it’s the US Pacific and Mountain regions being cut off, or if the issue is closer to Quebec (or with the data center / network there).
20160817 @ 11:45 AM – Phoenix database cluster crashed due to an occasionally recurring bug. It’s back up. Estimated outage was 5 minutes.
20160815 @ 9:00 AM – Changing status to amber since the Level3 issue in Chicago is ongoing & seems a little worse. It’s still not something we, or our vendors, can route around. We’ll update when we know more.
20160814 @ 11:49 PM – There’s a network issue in Chicago that’s causing a bit of latency for traffic flowing through that major traffic hub. It’s a third-party provider (Level 3, we think) with the troublesome equipment, and we’ve asked our bandwidth vendor to see if they can route traffic around them. Leaving status at green, since it’s out of our control & slow, not down.
Green: I am completely operational, and all my circuits are functioning perfectly normally.
AMBER: External network issues.