Status

This page is updated manually with the status of current and recent (roughly the last 30 days) events.

(Times are US/Arizona UTC-7)

Current status is: AMBER: Possible network issues in Quebec tonight – the facility is doing upgrades

20170621 @10:22AM – The Quebec facility is upgrading their network gear. There may be momentary glitches of a few seconds as they prep for the transition, which is scheduled for midnight tonight.

20170614 @12:08PM: One client reported that the Social Warfare plugin alone – NOT the Pro version – seems to be working fine.

20170614 @9:58AM: We can confirm that the “mashshare” plugins are also breaking sites in a similar way. We’d not heard of that one before.

20170614 @6:48AM – There is a known bug in Social Warfare that causes 50x errors – for details go to: http://bit.ly/2soiNL9

We don’t yet know if this is the cause of the issues; it’s been a rough month, with Genesis having a failed update (2.5.1) and WordPress 4.8 dropping. For now, we recommend disabling Social Warfare until the next update.

20170614 @5:15AM – Updates (not of PHP) and reboots didn’t change anything. We’ve crawled through every aspect of our servers, which reasonably justifies looking outside our environment for the source of the timeouts. Logs have not been helpful.

20170613 @ 10PM – We’re doing some quick updates and reboots to see if that clears the weird errors we’re seeing. We don’t think it’s likely to help, but it only takes a minute per server, and won’t hurt anything.

20170613 @2:00PM – We have reverted PHP to 7.0.19 and 7.1.5 so that we can re-enable our anti-hacker protections. This puts us back to where we were when we first noticed the timeout errors. There are no resource bottlenecks on our servers that could be causing the 503 timeout errors – all have sufficient CPU, memory, and disk I/O capacity. These errors occur when our servers call a remote service like Akismet and it takes more than 45 seconds to respond.
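For the curious, here is a rough Python sketch of how a slow remote service can turn into a 503 on an otherwise healthy server. It is an illustration only, not our actual server code: the 45-second limit comes from the note above, while the function names and the stand-in "services" are made up for the example.

```python
import concurrent.futures
import time

# 45 seconds is the limit mentioned in the status note; the rest is hypothetical.
REMOTE_TIMEOUT_SECONDS = 45


def call_remote_with_deadline(remote_call, timeout_seconds=REMOTE_TIMEOUT_SECONDS):
    """Run remote_call() and return a (status, body) pair.

    Returns (200, result) if the call finishes within the deadline,
    or (503, None) if it does not.
    """
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(remote_call)
    try:
        return 200, future.result(timeout=timeout_seconds)
    except concurrent.futures.TimeoutError:
        # The server itself is fine; it simply gave up waiting on the remote side.
        return 503, None
    finally:
        # Do not block on a stuck worker; let it finish in the background.
        executor.shutdown(wait=False)


def fast_service():
    """Stand-in for a remote service that answers right away."""
    return "spam check done"


def slow_service():
    """Stand-in for a remote service stuck behind a network problem."""
    time.sleep(0.5)
    return "spam check done"
```

Calling `call_remote_with_deadline(slow_service, timeout_seconds=0.1)` gives up and reports a 503 even though the local machine has CPU, memory, and disk to spare, which is the pattern described above.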

20170613 @11:50AM – We can confirm a bug in PHP versions 7.0.20 & 7.1.6 that affects our use of the “open_basedir” directive to help protect client sites. We’re looking for a workaround; if we can’t find one, we’ll revert to versions 7.0.19 & 7.1.5 until a fix exists.

20170613 @11:00AM – Some test/sandbox sites may be flighty for the next while. We’re going to be testing a workaround to the PHP bug we encountered earlier today. The fix we have in place is sub-optimal, as it forced us to deactivate one of our anti-hacking tools.

20170613 @ 10:46AM: The PHP issue occurred when we updated to the newest point release during troubleshooting of the network issue – which manifested to us as server load. (Simplifying the issue, our servers were getting bogged down because of a traffic jam elsewhere.)

20170613 @ 10:38AM: We’ve detected a network issue in Utah which was affecting connections to Automattic (the folk behind Jetpack, Akismet, and now parts of WooCommerce). As part of our troubleshooting, we reloaded the latest version of PHP, and it has a bug which was causing “Input file not found” errors. That has been fixed.

20170525 @11:11PM – Everything is back up. The hardware change went perfectly. Took <15 minutes.

20170525 @6PM – We’ll be upgrading webhostqc and replacing its motherboard right after its nightly backups finish at around 10:30PM Arizona/Pacific time. Estimated downtime of an hour.


GREEN: I am completely operational, and all my circuits are functioning perfectly normally.

AMBER: External network issues.

RED: Zombie Apocalypse.

MAGENTA: A service is down, but it is not really an emergency.