As you read this, CodePen is 100% on PostgreSQL for our main relational database. It was a transition that took a couple weeks of pretty intense effort, and lots of planning and prep before that. I’ve got Alex on the show to talk about it, as he was the main instigator and led the effort, but everyone contributed in some way.

Wondering why? Well…

There is a lot to talk about here, so we get into all the nitty-gritty detail, including how we pulled it off with near-zero downtime and the tools involved.

Time Jumps

  • 00:16 CodePen switched databases
  • 01:08 Team effort
  • 03:16 Is it a big paradigm shift?
  • 06:53 Using ElasticSearch
  • 13:34 Recapping the why
  • 16:14 We have some in house experience with PostGres
  • 17:17 Sponsor: WooCommerce
  • 18:50 How did we do the transition?
  • 27:23 How long did it take to move the data?
  • 31:14 How the migration went
  • 36:53 What enabled this to be possible
  • 41:06 Any negatives or issues?

Sponsor: WooCommerce

WooCommerce is eCommerce for WordPress. It’s powerful software in that it can do anything you could ever want to do in selling online. You can sell physical products. You can sell digital and downloadable products. You can sell memberships or tickets or appointments. The sky is the limit. I’ve used it personally for all those things and have had a great experience every time.

Transcript

[Radio channel adjustment]

Announcer: Today, on CodePen Radio.

Chris Coyier: What's up, everybody?! CodePen Radio #351. This is going to be a deep technical issue about databases. This was huge for us. Not going to bury the lead. We switched.

As a matter of fact, if you go to CodePen today, this is live. This is done. This is a complete project. CodePen moved from MySQL to PostgreSQL.

Yes, I am saying that correctly. We watched a French gentleman say it on YouTube just moments ago. [Laughter] "Post-gres", even though it's always spelled postgre with the SQL at the end of it, which makes you want to say Postgres-Q-L or something, but anyway--

I have with me Alex. What's up, Alex?

Alex Vazquez: Hey, there.

Chris: You did a lot of this work. Of course, it was a team effort because everything we do is, but this was -- you know you're the CTO, so this was square in your territory. We did it.

Alex: Yeah, we did. I'm an "everyman" CTO, so I did a lot of this work myself.

[Laughter]

Alex: Don't let the title fool you. I was writing a lot of code to make this happen.

Chris: Uh-huh.

Alex: Yeah. It took the whole team. Anyone who touched back-end code, from Rachel doing the review, Dee did the move with me, did a lot of the validation, did a lot of the checks and writing Go code to make sure that the move was correct. It took a lot of us, and the whole team to test out the site when we moved to it.

Chris: Yeah. Yeah.

Alex: You and Marie were helping put up messages, warnings, advertise that we're going to be in read-only mode for an hour - that kind of stuff.

Chris: Yep.

Alex: This is not a small move. It's considered the biggest tech decision you'll make for your stack, other than your programming language, which you can choose many. But in database land, you like to choose as few as possible.

Chris: [Laughter]

Alex: Yeah, the good news is CodePen is still up, [laughter] as you've noticed lately.

Chris: [Laughter]

Alex: We're still online, and this was pretty scary for me because there is a small chance that we could lose data, and I would consider that the cardinal sin of running a site for money is that you lose anyone's data and content, so we take that really seriously. We're able to avoid that step and make this a successful move.

00:02:32

Chris: Yeah. Yeah. Yeah. Yeah, hopefully, everybody feels good about that. This isn't super-duper new. It's not like it happened yesterday. We're recording this podcast well after this move has settled in and has easily been successful.

The biggest thing that people are going to be curious about and that we have to talk about -- because we had to talk about this internally -- is the why. Why would this even come up at all? Why would you choose to move away from MySQL, the bedrock of the Internet -- tons of sites on it?

I always associate it with WordPress, which powers half the freakin' Web - or whatever - and that's what it runs on. To choosing something else. We have lots of angles to talk about, but we have to talk about the why. Why would we ever choose this kind of thing?

I'm kind of curious. Is it true that it's not even really that different? It's not like a total paradigm shift in databases, is it?

Alex: It considers itself an object-oriented database management system. That is a lot of words to say that it provides a lot of fancy, unique features on top of the standard SQL feature set that you expect.

Chris: Okay.

Alex: Whenever you have a database, relational database specifically, you expect it to provide certain features. It's called asset compliant, which is a TMNC. It provides transactions. It gives you integrity. I'm forgetting what the D is for.

Chris: [Laughter]

00:04:09

Alex: There are these certain expectations that you have that it's going to provide this certain value. That's why we use a database because it gives you the certainty that my data is going to be correct if I follow all the standard practices of how to manage a relational database, and I can back up that data. I can query that data very easily.

PostgreSQL adds a lot of features on top of that. For us, we were looking at solving some really complex problems. We're taking on the biggest project we've ever done at CodePen, something that kind of wraps up everything we've learned the last eight years. We're super excited about it, and we're looking at the problems that we're going to be solving in the very near future.

Chris: Mm-hmm.

Alex: Considering how to solve those problems, what's the best solution?

One of our driving principles (whenever we're facing these large problems) is something I call tech minimalism. There's a great blog post on this called "Protect Production Minimalism," and this is something that it drives a lot of how we choose tools, how we add tech to our stack. It's not something we take lightly.

We're a team of nine. We're a small team.

Chris: Mm-hmm.

Alex: We don't have dedicated database managers. You're actually talking to the database admin right now.

[Laughter]

Chris: Yeah.

Alex: And so, we choose, when we add things to our stack that we have to manage feed and look at securing all the time and make sure we're backing things up. There's a lot of complexity and there's a lot of overhead that comes with these data stores.

I made a list of some of the things that we're looking to have, that we're looking to solve with PostgreSQL in the near future, that we're currently using an alternative solution for that I thought maybe we could consolidate that solution.

Chris: Yeah. This is an interesting aspect of this, right? We were on MySQL before, and we moved to PostgreSQL. That's a very narrow slice of the story. Right? There's a reason we picked PostgreSQL is because it solves other needs, too. Is that where you're about to go with it?

00:06:23

Alex: Right. The idea that we could consolidate on this one piece of technology that would provide a very good (or good enough) solution. Maybe not as good as if you just specialized with that technology.

For example, Elastic Search is best in class search. There are other solutions out there. Algolia comes to mind, which is an amazing solution.

Chris: Mm-hmm.

Alex: But for the purposes of what we've been doing, we've always used Elastic Search. We have a little bit of knowledge in it.

Chris: Elastic Search is a company, but what you're talking about, it's an actual technology. It's a database choice.

Alex: Right, the tool, the Elastic Search tool.

Chris: Yeah.

Alex: That's a data store, and so every time we save something in MySQL, we send it over to Elastic Search. Elastic Search indexes it. Then we can search it. Right?

Chris: Right. Now we're using MySQL and Elastic Search, two different tools.

Alex: Two completely different tools that we pay completely different servers. We manage that data. We secure that data. We back up that data. All the things associated with managing a data store.

Chris: Right.

00:07:27

Alex: PostgreSQL has text search. It is not quite what Elastic Search is. There are some features there. There are some things that you have to do to make it performant, but we were looking at should we get a much larger database and run everything through our PostgreSQL database or should we just keep two data stores.

The answer we came up with was, if we could consolidate our search with PostgreSQL and keep our relational database in the same thing, that's just a whole slew of tools and custom code that we don't have to maintain.

Chris: It just goes away. Wow!

Alex: It just magically goes away. It's kept in sync by PostgreSQL and when we back up the data, when we secure the data, when we give people permissions or have to give someone access to that data, it's all in the same place.

That was too good for us to pass up considering that our needs for search, we weren't taking advantage of the depth that Elastic Search gives us already. It's not kind of one of the things that we want to focus on. We want a search that works really well, that can be improved over time, but at the same time, that isn't the management - it doesn't require the management that Elastic Search has required for us.

Chris: Yeah, it doesn't add a warehouse of complexity for a team of nine, which we can't do. That's what you're talking about with tech minimalism, right? This is kind of a good enough tool. Maybe it's not the absolute best in class but, for us, removing this wheelbarrow of complexity is just great for us.

Alex: Yeah, it's huge. Similarly, we've been looking at storing analytics inside of CodePen, so it solves a whole slew of problems. If you think about the idea of things like GDPR where you have to disclose and get rid of data when users ask you to get rid of data, we can do that. We know that it's not being shared anywhere. CodePen doesn't sell data to any other third parties, but we just have to kind of agree that if we send it to another data store that those things are being locked down, that those data stores aren't being exploited.

If we wanted to manage things like analytics data, we could do that. Store it in another data store or we could just use PostgreSQL as a data store.

Chris: Just use PostgreSQL. Yep.

Alex: And for analytics. At the time, I was very excited about ClickHouse, which is this awesome, super-fast, column store purpose-built for analytics database. I'm a bit of a magpie. I love tech, and I love tech solutions and all the things open-source. That was something I was really excited about, but the reasonable part of me says, "Let's store everything in PostgreSQL."

Our analytics needs are very simple. They're very straightforward. When we build our analytics tooling, it will be inside of PostgreSQL.

00:10:35

Chris: Yep. There are more things on this list. There's this queueing thing we called SQS that we use that, actually, PostgreSQL can handle as well. There's the fact that somehow -- this blows my mind -- PostgreSQL can be an asset store as well, like for media files and such. [Laughter] I have no idea how that works, but apparently, that's a thing.

Alex: Yeah. That's a feature we're exploring, and it's a feature that they have internally where we use S3.

Chris: Mm-hmm.

Alex: If you upload images and things like (as a Pro user), we store those in S3. But PostgreSQL has features for that where it won't bog down the database. It'll just store it on disk and refer to it intelligently.

PostgreSQL has a ton of momentum behind it. When we chose MySQL in 2012, they were a lot closer, PostgreSQL and MySQL. We knew MySQL from working at Wufoo. We had experience with it, and we just kind of went with what we knew.

Sadly, similar to the Wufoo days (when we chose Prototype over jQuery), it turns out PostgreSQL was the jQuery of the time of databases. And so, we didn't have the momentum behind MySQL that there is behind PostgreSQL.

The interesting thing is that to understand our issue and why we chose what we did, we were also on MySQL 5.6, which came out in 2013, like over 8 years ago. At that point, if we were going to upgrade to the latest version of MySQL (to take advantage of that), it would have been the same level of effort that we had to put into moving to PostgreSQL. That really was an easy choice to say, "Should we move to MySQL 8.0?"

Chris: Okay. Even if you did it, you wouldn't get all this other stuff, right?

Alex: Yeah. We wouldn't have all the momentum. We wouldn't have all the nice little feature sets that we really enjoyed.

Chris: Okay.

Alex: And we also would not have the release cycle and all of the effort that goes into PostgreSQL (going forward), which I'm really excited about because the PostgreSQL community is constantly innovating. They're always extending. PostgreSQL is a very extensible database. And that's really exciting to be able to take advantage of the cool new tools.

Did I mention I was a magpie for tech? [Laughter]

Chris: [Laughter]

Alex: That's one of the fun things about moving to exciting technology is there tends to be a big community around it.

00:13:01

Chris: From a front-end perspective, I know you used the prototype jQuery angle there. I don't know if the listeners will even remember what prototype was, but it was funny to work on. Even while you're working on it, you know that it's dead - kind of - [laughter] and that all the excitement in the industry is based on other languages.

Today, it would be kind of like - I don't know - having some old-school rendered app and then, all of a sudden, picking up React - or whatever - and just seeing how much excitement and speed and energy there is behind the React ecosystem. Maybe that's changing to some degree. React is starting to be an old dog - in a way.

Alex: Yeah. Maybe it's akin to, like, using Backbone.

Chris: Yeah.

Alex: I don't know if you remember Backbone, versus React or Vue today. It might be something akin to that where you're like, "Ooh, I think I kind of--"

Chris: Yeah, perhaps.

Alex: "I think we invested in the wrong tech." [Laughter]

Chris: Yeah. It's clear the industry is going the right way, so picking up PostgreSQL is like getting on the right track of where the energy and excitement and change is happening in the industry.

If we're recapping the why, we're saying we were on MySQL 5.6. It's old. We wanted to get upgraded to something more modern for the future because that's our responsibility as tech people, right? We could have picked new MySQL, but then we're looking at PostgreSQL. That's where the enthusiasm is. Great. And we look at our tech stack and say, "Look at all this stuff we can consolidate, which embraces this concept of tech minimalism, which we need to have as a team of nine."

We don't have a team of 13 just working on search who can pick their own tools for their own expertise. We don't have that. So, we're going to consolidate these tools in the spirit of tech minimalism because it makes us more experts on the tools that we do use.

When you have 15 tools, you're going to be a little jack-of-all-trades on all 15 of those tools. Even if you're really smart -- and, Alex, you are very good at technology. You know lots of stuff. Still, you're going to be fragmented. You're not going to be able to dig as deep on 15 technologies, as you could one. You're just going to be better.

00:15:04

Alex: Yeah, 100%. 100%. At the end of the day, there is a core technology at CodePen that we love to work on. It's the Pen editor. Any time we're working on anything else, whether that's just maintaining some search tool or job queues or analytics or anything like that, those are all nice. It's nice to know what people are doing, what people are into via analytics, but we want to work on the CodePen experience, the core CodePen experience - most of the time here.

Chris: Mm-hmm.

Alex: A lot of times, technology has a way of pulling you away from the things that you actually want to because there's some exploit, there's something that happened that you had no idea existed. You know I spent a small amount of time looking at do our containers have Log4j exploits available to them.

Chris: Right.

Alex: That's a fun part of my life. Every time you add these tools, you have no idea if there's an exploit in there. There's so much software packed into what we call these big services. It was really important to us to use the least amount of software so that we could specialize in the number one thing we like, which is the CodePen experience.

Chris: Heck yeah. We're hard at work, working on that. And we didn't pick it out of a hat. Not only was there some industry momentum towards this whole PostgreSQL thing, but you've used it, other people on the team have used it. There's some existing experience that we could leverage. That gets a lot of points, too, around here, certainly.

Alex: Yeah. We had some in-house expertise. It wasn't our first rodeo with PostgreSQL. The beauty of using a simpler system like MySQL 5.6 is that, honestly, we weren't doing much that was too fancy. A lot of the queries were very easy to migrate because we're hiding all or most of our queries behind an ORM, whether it's Rails or GORM for our Go side. That provided a lot of value for us being able to move and make it easy.

It was still a very fraught with error, fraught with issues move that we were worried about but, in the end, it did work out. It didn't turn out anywhere near what I feared it could be, so that was really great.

Chris: Yeah. [Laughter]

00:17:19

[Guitar music starts]

Chris: This episode of CodePen Radio is brought to you in part by WooCommerce. WooCommerce is e-commerce for your WordPress site. That could be a self-hosted WordPress site, like many of mine.

For example, CSS-Tricks uses WooCommerce, and it uses it in a number of ways. It sells physical products through the store, so obviously you can go to /store and browse around for products. You could buy a poster. It'll arrive at your doorstep anywhere in the world. WooCommerce powers all that, for me. Really cool.

it also powers digital products, too. If you want to download a digital copy of that poster or a digital book, it helps me deliver those securely. It also sells memberships to the site, locking down areas of the site only to members. Very powerful software in that way. Believe me, that's the tip of the iceberg for WooCommerce. It can do all kinds of stuff.

It doesn't hurt if you use the product to subscribe to the blog of the product. I'm a big believer in that. They might as well let the company teach you what to do. Just browsing around the WooCommerce blog, there's stuff like recommended extensions from them and all kinds of advice about your store. There's good stuff in here; information about the fact that 84% of shoppers say that it's the video that convinced them to buy a product or service.

Got to have a video in there. Why don't you use something like VideoPress to do that, or there's a Vimeo plugin to do that. Why not add reviews to your site? Think about the SEO of your site.

There are all kinds of things you can do to improve your site. It's a new year! Maybe it's time to take stock of what's happening at your store or add a store to your site. Make a few bucks. Thanks for the support.

[Guitar music ends]

00:19:08

Chris: Yeah. We need to get into the how, so let's do the how because it'd be interesting. Certainly, one thing that was cool is we had no downtime at all. Whatever the how was, what a miracle! How the heck does that work? The Apple store goes down just because they need to put a new laptop in the store. You know?

Alex: [Laughter] Yeah.

Chris: It's like we just changed ten years' worth of data.

Alex: I'm going to go out on a limb and say that's a bit of a rouse.

Chris: [Laughter]

Alex: I think they can update a website.

Chris: Yeah.

Alex: I don't know if you guys are onto this, but it might be a little bit of a rouse on Apple's part.

Chris: Yeah. They want people pressing command-R. It's funny. They don't want to tell you when. They always tell you when it's not their first rodeo, but nobody tells you when it is their first rodeo. You know?

Alex: [Laughter]

Chris: They're like, "This is my first rodeo." You know?

Alex: We all gloss over that first rodeo. I think we were there.

Chris: [Laughter] Let's do the how, and then we'll tell you what was good and bad, post, now that we have a little bit of perspective on that.

I have no idea how. I wasn't like you. I wasn't as in-depth with this project. I just wanted to be sure, as also a co-founder of CodePen, that it was worth it. And that part was so obvious (by the time it was started).

There were times in the past, when we talked about PostgreSQL, where this was much less obvious to me, and I probably would have voted against it. I would have been like, "Why are we doing all this work, and for what exactly again?"

Alex: Yeah.

Chris: Wouldn't it be cool to working on something that anybody cares about?

00:20:35

Alex: Yeah. Yeah, CodePen, it's interesting because it was scary to say, "Let's invest time in this." Every time we've ever -- this is not the first time we've considered moving from MySQL to PostgreSQL.

Ironically, we always gave up on that idea because we were like, "Oh, that's just too much work. It's too much effort. We're going to get bogged down in problems."

The number one thing that convinced me that we could do this very easily was our ability to test this idea out in an environment that wasn't production but was very close to production. We invest a ton of effort into creating a high fidelity development environment, locally, just on our laptops so that 95% of the features just work locally and are a pretty good recreation of what the production software is. Right?

Using Docker, Docker containers, and the services that we create locally, we're able to target the exact version of PostgreSQL that we were hoping to move to, what it would be like after a move, moving all our local data from a local MySQL database to a local PostgreSQL database that are exactly the same. In our development environment, we moved from MySQL 5.6 to PostgreSQL 13.3.

Chris: Mm-hmm.

Alex: It was the exact version we were on. Doing that, it's very easy because you're just moving a bit of data. But then you're like, "How does the app work? Does the app work almost exactly the same? How many more queries will I have to port?"

We were able to do that in just a couple of days. Right? Once we used PG Loader, which is this kind of dev tool that we were able to use. People use it sometimes to move production data, but we didn't do that.

We moved the data locally. We tested the containers. I'm sorry. We tested our environment and said, "Does this work? Do we have bugs galore?" That would have scared us away.

It's almost like a smoke test for is it even possible to start moving. We did that. That worked exceptionally well. I was convinced this was going to be a week-long task. It turned out a bit longer, but it was close enough.

Then we moved all of this content. If the dev environment recreates 90% to 95%, the staging environment we keep creates like 99% of the environment.

Chris: Right.

Alex: It's an AWS account that's isolated that matches one for one with everything (including the instance types that we use) with what we have in production. We keep those things in sync.

We run an infrastructure as code tool called Pulumi that uses code to create our servers and provision them and install software, deploy software, all those things. That's been a big undertaking over the last two years, but it's been well worth it. I can't even imagine not having that these days. It would feel like going into the stone age of development.

We're able to recreate our production, true production, environment with production data. I actually provision the exact same size server that we want to move to for the MySQL database and the PostgreSQL database, and I move a snapshot of the real database to see what it's like.

Chris: Right.

Alex: Right?

Chris: You're already 80% sure we're going to do this based on your tests locally. Now you're 94% sure we're going to do this because it works on staging, too.

Alex: Right.

Chris: You're still in the, like, convince yourself stage at this point.

00:24:21

Alex: Yeah, at that point you're saying, "Okay, how hard is this truly?"

Chris: Yeah.

Alex: What we had to do, the number one thing that's going to convince you, "Is this going to be okay?" is can you do it and script it, most of it, before you ever do anything in a real production environment?

What we did was, before we ever attempted to do this move, I think I moved the data 10 to 15 times just in a production -- the real data set, not like a fake version of the data.

Chris: Mm-hmm.

Alex: Because data, when it gets very large, has different qualities. Then you have to get into timing things and how long things get to move and all that.

I needed to know how long. We wanted to make sure, if at all possible, that we would have no downtime. I personally felt like the most downtime I was comfortable with was 30 minutes. I just didn't want to be offline. That's just a huge priority for us these days.

Chris: Right because, again, you're using the app. You're hitting the save button. That data is going into a MySQL 5.6 database. At some point, you need a strategy where that data is going into a PostgreSQL 13.3 database.

Alex: Yeah.

Chris: That's a weird cutover moment. I imagine there are strategies, right? Is there a strategy in which you're writing to both of those databases at the same time? I think we did not do that, though, right?

Alex: Yeah. Our code, so one of the key components here is, because we're on AWS, they have a service. I suspect there's a service like this.

PG Loader is a service like this, if you ever want to move from MySQL (or from anything) to PostgreSQL. AWS itself has a service called Data Migration service. It's one of these generic-sounding things. It had it for ages. What it does is--

Chris: Is it a person or is it technology?

Alex: No, it's a tech, technology that you pay for.

Chris: Yeah. Okay.

Alex: You set up and click all by yourself on the AWS console, and so what it does is it reads your MySQL database. It will give you the option of creating a schema or moving data from one database to the other. This is kind of the key component here. It will also keep the data in sync. While you continue writing to MySQL, it will continue writing and moving that new data to PostgreSQL.

Chris: Oh, so it is writing to both - kind of.

Alex: It is writing to both - kind of.

00:26:49

Chris: It's writing to one and then moving to the other. It's not writing both.

Alex: Yeah. The key for us was we did not have to write that code because I was not interested in writing tons of custom code that was untested.

Chris: Yeah.

Alex: To say, like, "I'll just write to two databases," because this is a solved problem as far as tech is concerned. There are tons of open-source software that has tried to do it. Tons of tech articles on how to do this.

For us, what made sense was to say, "We're going to use this data migration service. We'll run it," and we knew exactly the amount of time it would take - just about - because we've already done it 10 to 15 times.

Chris: Yeah.

Alex: We've already run the tool in our staging environment - tons.

Chris: What was it like, hours, days? No, it wasn't days, surely.

Alex: I think I started it in the morning. It took, realistically, about 12 hours.

Chris: Whoof.

Alex: To move all the data.

Chris: But then it stays in sync after that.

Alex: After that, it stays in sync.

Chris: Yeah.

Alex: Yeah, so I'm pretty sure I woke up at 10:00 a.m., turned on the tool, run it, and then it stays in sync. Then by the time that it -- I may have done it maybe the next day just to see that everything was staying in sync, and I could see that the delay between the large table that was always lagging, which is our Pens table -- surprise.

Chris: Mm-hmm.

Alex: --was about 30 seconds, 30 to 45 seconds, so this thing is really close to real-time. It's not real-time, but it's pretty close.

Chris: If you turn it off, so you go down for one minute, then the PostgreSQL one is up to date. Then you've got to flip, which is like read-only mode, right? Just for a little bit.

Alex: Right, that's the beauty of it. We put ourselves into read-only mode, which meant that we could serve content, read content. CodePen wouldn't just 404 or 500 when you reach a site that you couldn't write anything, couldn't save anything, you couldn't sign up to the site for that one minute of time.

Chris: One... One minute! [Laughter] But you do have to prepare for that. That was code that we had to write, right? A little bit.

Alex: Yeah. We did write that code. We wrote a little checker at a chokepoint where we save everything that says, "If this flag is on, do not save. Reject the save." And we warned users.

Chris: I see, so you don't have to visit 50 areas of the site that's like, "When you leave a comment, write code. When you leave a heart, write it."

Alex: Yes.

Chris: "When you do a fork," you know--

Alex: Yeah, which would be a nightmare, right?

Chris: Right.

Alex: The amount of touchpoints where you write code ... to the database....

Chris: The chokepoint is a cool way to put that.

00:29:29

Alex: Yep. And so, what we did was to say, "Okay, we're going to do it at this chokepoint. We're going to make it configurable, dynamically. We're not deploying new code to get it out there because that would be a 10-, 15-minute affair. I was literally on a server, and I updated ... values. As you know, read-only, true-false.

Chris: Nice.

Alex: At that point, imagine that we have the two databases in sync. What we did was we stood up a parallel set of Web servers. Those Web servers--

Chris: Were already reading from PostgreSQL, right?

Alex: Yeah, they were already reading from PostgreSQL. I can actually go visit that site and see if we were up and running (according to the Web servers) but they were hidden at the time. And so, once I knew that those servers were serving the site correctly, I could put CodePen into read-only mode, which meant no one is writing anything. Then I'd go to Cloudflare, which is our -- when you enter CodePen, the CodePen infrastructure world--

Chris: Mm-hmm.

Alex: --you're hitting Cloudflare. Cloudflare is hitting our AWS account. At that point, they have a 60-second, 30-second delay in switching internally. And so, they switch. I now see that our new servers are getting traffic. I'm watching analytics and stuff inside of AWS.

Once that's all honky-dory, it seems well, we're not getting any big errors -- because at that point, we still have the option of switching back to MySQL.

Chris: Mm-hmm.

Alex: This is the key. There's kind of a point of no return. The moment you remove read-only mode and you're on PostgreSQL, you've hit the point of no return because we're writing from MySQL to PostgreSQL. We're not doing the opposite. We're not going the other way.

Chris: [Laughter] We're not doing the opposite. Right. That is no return then. Once people have saved some Pens that are uniquely on PostgreSQL, woo, that would suck to go back. It'd be really hard.

Alex: That's over.

Chris: Yeah.

Alex: Yes. And so, that was the big concern for me. I was like, I don't want to -- I could have written. I could have had another database migration service that could write the other way, but I was like, we'll just roll forward. [Laughter]

Chris: Yeah.

Alex: If I flubbed it at this point where I've had a staging environment, I've run this thing 15 times, the hope is, okay, it's going to be a small error. I'm going to be able to fix that error, if anything, and it'll be okay, but we won't lose any data. Of course, we're taking snapshots at all the right times.

At that point, we're moving over to PostgreSQL. We finally remove read-only mode, once Cloudflare has switched over to our new Web servers, and that's it. We're done.

Chris: Woo! Pop the champagne then.

Alex: It was a lot more detailed than that. Yeah.

Chris: Yeah.

Alex: We didn't have a pop the champagne moment because, at that moment, you're still reading logs. You're looking at sentry errors. You're looking at dashboards to make sure--

Chris: Yeah.

Alex: --that this is real, that you truly are on the new database. That was kind of surreal to realize, okay, we're literally on a new database. We've talked--

The number of times we've actually had the real, like, "I think we might be moving to PostgreSQL," discussion over eight years is three or four times, at least, that I can remember, and we always pulled out. To know that it worked, we didn't have to go down, that was huge for us.

A lot of planning went into that. I think, for four days straight, the only thing I thought about was moving this data. I had a very detailed design doc. We have a lot of services. That was the highlights of how it went, but there were a lot of other services that maybe needed to move to PostgreSQL prior to that that could deal with the delay in real-time. So, we moved a bunch of other small services that read from our database but don't actually write to it as often. And so, it was great. We did celebrate a tiny, tiny bit.

Chris: Yeah. You got to move from the office to the couch.

Alex: There were a couple of tiny things that--

Chris: You know it was great.

Alex: That's right.

Chris: [Laughter]

Alex: That's right. With my laptop.

Chris: With your laptop.

Alex: But it was great.

Chris: [Laughter]

00:33:34

Alex: It was great. In the end, we moved. We moved prior to the deadline that I had given myself. Dee and I were both about to go on maternity/paternity leave, and I knew that I did not want to come back to a CodePen where this was going to be a focus. I knew I wanted to have this ready to go, and we could just start developing all our new features using PostgreSQL.

It's been amazing. We now have the Pens table back where we can add columns and remove columns. A lot of our content is kind of a badly designed table, so it's massive, and we could never touch it if we ever wanted to add a column or remove a column, and we can do that now.

Chris: Yeah, this is big. This is big.

Alex: We can start moving forward.

Chris: Right. Let's say you've metaphorically popped the champagne. This is a moment that we didn't talk about earlier in the show, right? It's so huge! I know this is huge for you because you've told me over and over and over.

Back when we were on MySQL 5.6, if we wanted to add just something as simple as add a column to the Pens table, it was basically a no. I mean I'm sure we could have found some technological way to do it, but it was extremely non-trivial, which now is - what - extremely trivial. Is it that--?

Alex: It's just a migration like any other. Adding a column is, "What column do you want to add? What column do you want to drop?" It's very simple.

Chris: Is that just the miracle of PostgreSQL? Is it just miles ahead of SQL technology?

Alex: Yeah, I suspect it's the miracle of modern databases. I suspect that if we were on MySQL 8.0 or MariaDB, the latest version of that--

Chris: Yeah. It wouldn't be so bad either. Yeah.

Alex: Other Microsoft technology, whatever it might be, it wouldn't be that bad. But for us, you have to understand. We were on this very old version of MySQL that would lock up the entire table just to add that column. It was a real problem for us. We couldn't move forward.

We developed weird things like having a parallel table that was easily searchable [laughter] and gave us quick access to certain columns that we'd use all the time. That was a maintenance headache. And so, we were able to get rid of all of these things.

I got rid of the -- I don't remember if you remember the item metadata table, which was crazy. We were able to just go away with things.

Chris: It's just gone. You killed that table because you just pulled it back into where it should have been, to begin with?

Alex: Exactly.

Chris: Yeah, that's awesome.

Alex: We no longer have to worry about things like that. It's already where you can see us.

My favorite idea in software is something -- I have no idea if this is to the world or something -- what I call it. But I call it compassionate. It's like when you take the ideas that you fleshed out and you start to make them smaller and tighter. They're just sturdier. They're better. It's like adding an arch to a passthrough. It supports a lot more because you've figured out a solution that really works for you.

For us, we know what we want. We know the columns we want to query. We can start removing a lot of this cruft that builds up over time. It makes it a lot easier to work with the CodePen code base and maintain it and everything else.

00:36:53

Chris: Nice. Let's do the bullet points real quick on what really made this. I think we covered the why and how pretty well. What really enabled this as a possibility? You mentioned that amazing tool, the AWS migration service thing. Probably couldn't have done it without that, right?

Alex: Definitely.

Chris: They're both relational databases, so you still write queries, but we don't -- most of our queries are not handwritten, right? They're behind an abstraction that was able to make this more possible.

Alex: Yeah. Like a lot of developers these days, we write a fair amount of SQL queries nowadays because I think I rather enjoy it in this experience. We've been cleaning up our database for a couple of years now and cleaning things up and using more powerful features that are just baked into SQL itself.

But for the most part, when you're looking at typical save, update, read, delete, those kinds of things, we're using an object-relational manager like Rails or GORM in the Golang world. That hides most of the queries, and those things are able to -- under the covers -- adapt to the database that you're using ... SQL.

Chris: Yeah, just be like, "I use PostgreSQL now," and it's like, "Okay. I'll just change my internal syntax to correct that." Yeah, that's amazing.

You mentioned Pulumi, so that helps because you're coding out. You're not clicking buttons and remembering what you clicked, right? You're coding it out and scripting it, so that's cool.

Alex: You don't have to remember that you set -- that you provision 8 servers that scale up to 15 at this CPU threshold. It's all just kind of provisioned for me with a few code changes. We will branch our codebase, submit it to our CI, and it'll automatically take care of creating the infrastructure because that's just something that we've invested a lot of effort into.

It's absolutely amazing when you finally move to automating and managing the infrastructure with something like that. You don't even know how you would do it otherwise because it gives you the certainty that you know you have a one-for-one match of the other infrastructure that you're replacing. That was very key in this solution.

Chris: Yeah. That's really cool. You mentioned the chokepoint, which just meant that we weren't writing sprawling code all over our codebase to handle this. You kind of handled it at one unique level. That would have sucked to handle--

It's interesting how many things are on this checklist that really had to come together to make it possible. Then not only was it possible with no downtime, but the whole project took two weeks. That's impressive. I'm sure there are companies that are a year deep into their theoretical database change. This was pretty much lightspeed.

00:39:40

Alex: Yeah, and that was critical for us because one of the things that we had is we have 100% control of where the data is stored. We didn't have to go through red tape. We didn't have to get approvals for certain services.

If you have a typical kind of IT department where you have different groups, different teams that are in control, it's going to be really hard to get everyone on the same page and say, "Oh, yeah. We all want to move to PostgreSQL. Let's cut a branch right now and do everything."

Chris: [Laughter]

Alex: Whereas we're able to look at the whole thing, see the entire landscape, hold it in our heads, and say, "Okay. I know that there are six services that will need to be modified, but they don't look as scary." I'm also just eternally optimistic and just assume we'll get it done. [Laughter] Which is being naive almost helps in this situation. This was my first rodeo. It's the first time I'd ever moved an extremely large database.

Chris: You heard it here first. First rodeo.

Alex: [Laughter] Was that the first-first rodeo? I think it was.

Chris: [Laughter] First-first rodeo. Yeah, I think so.

Alex: It's amazing. [Laughter] Our first here, so in doing that, you almost have to be naive because it was more work than I expected but it wasn't as bad - looking back on it. I'm over the moon happy that we're done with it. We just get to write some CodePen because, as much as I love data stores, it's not really the point of why we work here.

Chris: Right. You know what they say about podcasts, too? You should always end it on a bummer here, so let's try to go real quick and--

Alex: [Laughter]

Chris: [Laughter] What do we--?

Alex: We should have stopped at first-first rodeo. I'm sorry, guys.

Chris: [Laughter] Is there anything that was a bummer? I mean there was other stuff that wasn't mentioned here. Obviously, there was some degree of testing and such for data integrity and stuff. We don't want to find out, oh, turns out all Pens from March 2017 were mangled, or something. [Laughter] There's some integrity happening here. But I think you mentioned there even was one kind of -- like it's tricky. Even with the migration, there was some kind of problem.

00:41:44

Alex: Yeah. Yeah, there was one data snafu where I misunderstood some of the documentation around data migration service where it was asking, saying if you have a text type column, do you want to have a limit on the size of the column? The way they described it was a language that I didn't understand and I assumed that it wasn't referring to a text size. It said something about binary column, which I misunderstood.

Chris: Hmm.

Alex: And so, in the end, we had some snafus. We weren't experts at using the data migration service, and we weren't experts at PostgreSQL. We also opted not to bring in consultants because of the timeframe. At the end of the day, that helped us, but that's also the kind of thing that may just be impossible in your world (if you have a certain amount of data).

But not having some of those things and not having that deep expertise of saying, "Oh, well, we do this every day," like the way you can get with a data consultant, someone who does these moves all the time, kind of hurt us a little bit in those small details. But I felt confident enough that, like, "Okay. Everyone on the team that's really deeply involved in this move has read these things. We seem like we know what we're doing. There's only so much handholding we're going to get out of this. Let's go for it."

Chris: Let's do it. Yeah.

Alex: That's kind of the CodePen way, honestly.

Chris: Yeah, kind of.

Alex: We have a habit of doing those things that way.

Chris: And then fix it. There was a little data snafu. You get the old data, fixed it. Not a problem anymore. Anywhere there's some kind of custom SQL, that was an issue too, right? Well, go fix it.

I think those were probably a little bit more identified, but even Marie writes custom SQL for dashboards and stuff and then had to go back and basically learn what the differences are with PostgreSQL syntax and rewrite them. But, hey, big deal. You learn.

Alex: Yeah, and I knew there would be some snafus after the fact. There were some queries that I'd forgotten to port over.

We recently had to run a query to get the top 100 Pens of the year, and that needed to be modified.

Chris: Oh, yeah, because you wrote that SQL query, what, years and years ago?

Alex: Yeah, now it's custom SQL. There are still some odds and ends that we're taking care of here and there, but it was good enough to the point where we felt like we got 99% of the changes in.

Chris: Yeah.

Alex: If we failed to -- if we missed something, it was because it was very small, and we knew we weren't going to lose anyone's data. That was critical to us here. We really make that a big point. We have backups across the way, and we push forward. Now we get to just revel in our data store and enjoy it. It's very lavish in there. we have a lot of space.

Chris: I like that word! I think it is a morale improvement. I think it's nice to see you kind of happy with where our database--

Wouldn't that be a bummer if you were coming back from paternity leave and stuff and just being in this bummer place where you know there's this big project that you either need to do or that we're resigned to never making that move because now we're too deep into--

You had a kind of now or never spirit to it, and never would have been a morale hit, I think.

Alex: Certainly. Yeah, and we would have still been living with the quirks. Even though you don't have to pay this big, upfront cost of doing this move and all those things, you always have to weigh the pros and cons.

At the end of the day, we're over the moon happy with the move. We no longer pay the weird costs of not even being able to move data and not being able to read modern documentation on the way your database works. Having to Google for things that happened in 2014 was a very annoying thing, like when you're searching for how to solve a problem and you're trying to figure out whether or not your database has that feature. Most of the time it doesn't. You know?

There are all these little things that happen in your day, and it's huge for us because now we get to just look at what the industry is doing, the way people are solving problems today.

Chris: Yeah.

Alex: It's been pretty magical from that perspective, so we're pretty stoked and pretty stoked to not think about databases, not have to worry about them as much because we're happy with our solution and we're just going to move forward from there.

Chris: Yeah, man. Rock-n-roll. Very positive change. I applaud it. I remember the jubilance when this thing went well, and it was a cool way to kind of send you off on paternity leave and send off -- you know it was kind of before the holidays and all that. Yay! It's done!

Alex: Yeah, it was huge.

Chris: Let's do something more fun.

Alex: Enjoy, everyone.

Chris: Let's move on to the next really super hard technical problem we're trying to solve. [Laughter]

Alex: [Laughter] Yeah, that seems to be the way. But the next one is super fun.

Chris: Yeah.

Alex: And very public-facing, so that's the best part.

Chris: Yeah.

Alex: Actually, I would love to share our software with people.

Chris: Uh-huh.

Alex: We can't really share. You can't really bring your own SQL.

Chris: Two years of this podcast where we're like, "This is an internal thing. You're never going to notice."

Alex: [Laughter]

Chris: It's internal only. Someday--

Alex: Yeah. It should just warm your heart that we're on PostgreSQL.

Chris: Yeah.

Alex: We'd like to stop doing those kinds of podcasts.

Chris: All right, man. Thanks, everybody, for listening. We'll talk to you all later. Bye-bye.

Alex: Bye, y'all.

[Radio channel adjustments]