Our own Tim Sabat keeps a vigilant eye on the stockpile of servers that run all things CodePen.
One of the things he had his eye on was the disk space usage on our main database server. At our run rate at the time, we had about 3 months of growth left on that server before things got unsafe. But he was also noticing short spikes in usage, up to 98%, from various activities the disk has to perform. 98% is too damn close to maxing out and makes any server person sweat.
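For the curious, the "months of growth left" math can be sketched in a few lines of Python. This is only an illustration, not Tim's actual monitoring: the function names, the 90% "safe" line, the 95% spike threshold, and the growth figure are all made up for the example.

```python
import shutil


def months_of_runway(total_bytes, used_bytes, growth_bytes_per_month, safe_fraction=0.9):
    """Months until usage crosses safe_fraction of the disk at a steady growth rate.

    safe_fraction is an assumed comfort line, not a number from our servers.
    """
    if growth_bytes_per_month <= 0:
        raise ValueError("growth rate must be positive")
    headroom = total_bytes * safe_fraction - used_bytes
    return max(headroom, 0) / growth_bytes_per_month


def is_spiking(total_bytes, used_bytes, threshold=0.95):
    """True when current usage is at or above the (assumed) alert threshold."""
    return used_bytes / total_bytes >= threshold


if __name__ == "__main__":
    # shutil.disk_usage returns (total, used, free) in bytes for the given path.
    usage = shutil.disk_usage("/")
    assumed_growth = 10 * 1024**3  # hypothetical: 10 GiB of new data per month
    print(months_of_runway(usage.total, usage.used, assumed_growth))
    print(is_spiking(usage.total, usage.used))
```

A steady-growth estimate like this says nothing about short bursts, which is why the spike check is separate.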
So Tim set about building us some new database servers. I don't personally know all the hardcore super database-nerdy details. But essentially, we built a new Master for MySQL. Once the new Master was in sync with the current Master, we built a new Slave. Then we did a cutover from the old Master to the new one.
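I can't show you exactly what Tim checked, but the "in sync" part can be sketched roughly like this. It assumes you have already fetched one row of MySQL's `SHOW SLAVE STATUS` output as a dictionary. The field names are the ones MySQL reports; the `ready_for_cutover` function and its `max_lag_seconds` parameter are hypothetical.

```python
def ready_for_cutover(status, max_lag_seconds=0):
    """Decide whether a replica looks caught up enough to switch over.

    status: dict holding one row of SHOW SLAVE STATUS output.
    max_lag_seconds: assumed tolerance for replication lag.
    """
    lag = status.get("Seconds_Behind_Master")
    return (
        status.get("Slave_IO_Running") == "Yes"
        and status.get("Slave_SQL_Running") == "Yes"
        and lag is not None  # MySQL reports NULL when replication is not running
        and int(lag) <= max_lag_seconds
    )
```

For example, `ready_for_cutover({"Slave_IO_Running": "Yes", "Slave_SQL_Running": "Yes", "Seconds_Behind_Master": 0})` returns `True`, while a replica that is still 42 seconds behind would return `False`.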
All in all, it was only 12 minutes of downtime. We let everyone know 12 hours in advance it was coming.
Tim tells me the new server is on better hardware. It has faster processors and more memory, with "SSD-backed root and data volumes". Hey, sounds sweet to me.
We did have a little hiccup though =(. We forgot to move one particular table. It was the table that stored all the slugs that we use for new Pens and Collections and stuff. You know, that string of random characters we use as identifiers in URLs. We didn't notice right away because a few thousand of them did get moved, so it took an hour or so to deplete them. When they ran out, new ones couldn't be generated and thus nobody could save. Sadly, that resulted in a few hours of unplanned downtime in the wee hours of the morning (Pacific time). Very sorry about that! Hopefully you'll agree the move was worth it.
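If you're wondering what a pool of pre-generated slugs looks like, here is a minimal sketch. It is not our actual code: an in-memory queue stands in for the database table, and the `SlugPool` class, the six-letter slugs, and the `low_water` and `refill_size` numbers are all assumptions for illustration. The idea is that the pool tops itself up when it runs low instead of quietly draining to zero.

```python
import secrets
import string
from collections import deque

ALPHABET = string.ascii_letters  # assumed slug alphabet


def make_slug(length=6):
    """Return a random string of letters to use as a URL identifier."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


class SlugPool:
    """In-memory stand-in for a table of pre-generated, unused slugs."""

    def __init__(self, low_water=1000, refill_size=5000):
        self.low_water = low_water      # refill once the pool drops to this size
        self.refill_size = refill_size  # top the pool back up to this size
        self._slugs = deque()
        self._issued = set()            # every slug ever generated, to keep them unique

    def refill(self):
        """Generate new unique slugs until the pool is back at refill_size."""
        while len(self._slugs) < self.refill_size:
            slug = make_slug()
            if slug not in self._issued:
                self._issued.add(slug)
                self._slugs.append(slug)

    def take(self):
        """Hand out one unused slug, refilling first if the pool is low."""
        if len(self._slugs) <= self.low_water:
            self.refill()
        return self._slugs.popleft()

    def __len__(self):
        return len(self._slugs)
```

A real version would also want an alert when the pool gets low, so a missing or empty table gets noticed long before saves start failing.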