Status

This page is updated manually with the status of current and recent (roughly the last 30 days) events.

(Times are US/Arizona, UTC-7.)

Current status is: RED: Drive failure affecting one webhost server in Phoenix. Drive array rebuilding.

20171210: One Phoenix webserver (whphx5) is on a server that just suffered a disk failure. We’re doing an emergency out-of-band backup and will then replace the disk. (Doing it in this order lessens the risk of losing changes made since the last backup if additional disks fail.)

20171201 @11:15AM – We need to reboot servers to patch a security flaw. Most can wait until off-hours, but a few had to be done right away.

20171102 @8:31PM – Final server’s database software is fixed, and we’re migrating site databases back – taking backups before and after.

20171102 @ 3:35PM – Final server’s database software isn’t starting. We restored the backup of the database to a different server so that sites are back up. Everything should be back up, and we’ll be figuring out this weekend how to prevent runaway MySQL processes from filling the disks – SSDs are way faster, so they fill WAY faster than normal hard drives.

20171102 @1:25PM – No such luck with the quick fix on the remaining server. Going to restore from the 9:45AM backups.

20171102 @ 11:30AM to 1:10PM – Database software on four servers filled the disks with journal changes. This caused the databases to crash, which ultimately caused those servers to crash. One came up quickly with a reboot. Two needed a bit of help to boot cleanly. The fourth is still down, but will be up soon. (Somehow this server is running an older version of MySQL, and got confused by the new configuration files. It’s updating now.)


Green: I am completely operational, and all my circuits are functioning perfectly normally.

AMBER: External network issues.

RED: Zombie Apocalypse

MAGENTA: A service is down, but it’s not really an emergency.