Deploying PostgreSQL DB Changes Through AWS Developer Tools Using Flyway
The fundamentals of deploying database changes through Flyway are pretty simple. At the root of it all, there are two words you need to know:
```
flyway migrate
```
This of course comes with a whole set of assumptions, not least that you’ve set up your configuration to connect to the correct database. And that’s talking about only one connection to one database. Let’s toss in just one additional database. We’ll have a development database running locally. Then, I’m going to put a PostgreSQL database up in AWS RDS to act as a Continuous Integration test platform. Now, how do I get all this to hook together?
Here’s what I’m going to combine:
- Flyway Community (just to show how it works at the core level)
- GitHub (since AWS CodeCommit is deprecated, I need to have source control somewhere)
- PostgreSQL (two, one local, one in AWS RDS)
- AWS CodeBuild (to act as flow control for my Continuous Integration deployment)
Basic Setup
In order to keep things as simple as possible, I’m going to define all my local work through the default flyway.conf file at the root of the project. Here’s a view of what that might look like in Visual Studio Code (VSCode):
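The layout itself is minimal. Roughly speaking, it’s something like this; the migration file names here are placeholders rather than the actual project contents, but they follow Flyway’s V<version>__<description>.sql naming convention:

```
<project root>
├── flyway.conf                  # default config file Flyway reads from the working directory
└── sql/
    ├── V1__initial_schema.sql
    └── V2__add_operator_table.sql
```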
While I can do a lot from within the flyway.conf file to control how my deployments get done, I’m taking a very bare-bones approach to keep this process as simple as possible. The only thing I really have to do is add a connection string and password for my local cluster:
```
flyway.url=jdbc:postgresql://localhost:5432/hsr?user=postgres&password=$cthulhu1988
```
With that in place, local deployments really are just a question of running flyway migrate. That’s local development taken care of. As I stated earlier, I’ve got this up on GitHub, so I can manage the local repository and the one in the cloud, no issues.
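As an aside, if you’d rather not embed credentials in the JDBC URL, Flyway will also take them as separate settings, which reads a little more cleanly; the following is equivalent to the single line above:

```
flyway.url=jdbc:postgresql://localhost:5432/hsr
flyway.user=postgres
flyway.password=$cthulhu1988
```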
Core setup for AWS RDS is pretty much the defaults to get a PostgreSQL cluster running there and ready for my database. I’ll go ahead and create the database as well. Now, we’re ready for the real work.
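By the way, if you prefer scripting to clicking through the RDS console, the same basic setup can be done from the AWS CLI. This is only a sketch; the instance class, storage and password are placeholder values you’d adjust to your own environment:

```bash
# Create a small PostgreSQL instance in RDS (values here are illustrative, not what I used)
aws rds create-db-instance \
    --db-instance-identifier hsr \
    --engine postgres \
    --db-instance-class db.t3.micro \
    --allocated-storage 20 \
    --master-username postgres \
    --master-user-password '<your-password>' \
    --publicly-accessible

# Once the instance endpoint is available, create the database itself
psql "host=<rds-endpoint> port=5432 user=postgres dbname=postgres" \
    -c "CREATE DATABASE hamshackradio;"
```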
Setting Up AWS CodeBuild
My focus throughout this will be on making things as simple as possible. Yes, a true pipeline for database development, testing and deployment is going to have all sorts of complexity. However, we truly don’t need to seek that complexity out when we’re learning. It’ll happen all on its own. To get started, we need to keep all this as simple as possible, especially because, even though I’m going to keep this silly simple, it’s still going to end up being pretty complex.
We’ll start with the CodeBuild project. We’ll be all over AWS, but this is a good place to start. The first step is simple enough. Give the project a name:
Whoo! Glad the hard part is over.
Now we need to connect to GitHub. There are multiple options of course, but for this demonstration, we’re just using GitHub. Setting up the others is not radically different:
But wait, before we do this, we need to create a connection between our AWS account and our GitHub account. This is done on the Developer Tools Connections page, located here:
While doing this work, I regularly have 3-5 tabs open in my browser so I can bounce around between these things as I put them together. We’ll create a new connection:
You’re then prompted to install an app on AWS, or just connect as a GitHub user:
You’ll get prompted to make the connections appropriate to GitHub. It’s relatively straightforward. Switching back to the CodeBuild project now that we have a good connection, we have to finish setting up source control:
This is basically the defaults. I have the source credential from the connection I’ve created. I’m providing the URL to get to a repository in my GitHub account (which, by the way, is public, so all this code is available for you to copy). As you can see, I could set up other mechanisms, but in this case, they aren’t necessary.
Next, because I want to automate this as a Continuous Integration build, I’m going to run the build off of every PUSH command to GitHub. Again, this is a very simplistic approach. Probably, we’d need to specify specific branches, or merge operations, or filter it down to particular operators making the PUSH. We can make this as complex as necessary. In order to arrive at initial functionality though, I’m going to keep it simple. So, I’m going to add a manual webhook:
I had to expand “Additional configuration” and then select the check box next to “Manually create a webhook for this repository in GitHub Console.” Happily, that’s all we have to do initially. We’ll come back around to the webhook in a few more steps.
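For what it’s worth, if you later decide to narrow which pushes actually trigger a build, the webhook’s filter groups can be adjusted after the fact. Here’s a rough sketch using the AWS CLI, assuming a project name of hsr-ci (an illustrative name, not the real one), that limits builds to pushes against main:

```bash
# Restrict the CodeBuild webhook to PUSH events on the main branch (sketch only)
aws codebuild update-webhook \
    --project-name hsr-ci \
    --filter-groups '[[{"type": "EVENT", "pattern": "PUSH"},
                       {"type": "HEAD_REF", "pattern": "^refs/heads/main$"}]]'
```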
Next, we have to define the environment that’s going to act as our build and deployment tool. We’re basically running a VM and sending it commands. I’m accepting all the defaults here, except for one thing, the Service Role:
For the Service Role, I’m choosing a role that I’ve already created. Can you do this when initially setting up CodeBuild? No. You’re going to want to create a “New service role.” I’m cheating and using an existing role for one reason: I’ve already set up a bunch of the security on this role. On another tab (I told you, I keep a bunch open while doing this kind of work), I can look at Identity and Access Management (IAM) roles. One in particular has its permissions shown here:
By default, the CodeBuild permissions would get added to the role, so you don’t need to worry about that. However, I also want this role to have access to RDS in order to take control of it. Can I limit these permissions? Absolutely, but once more, I’m going with simple expediency to arrive at a functional automated deployment. Without the RDS access, we’ll get errors when we attempt to connect to the RDS server:
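For reference, granting that RDS access from the command line is a one-liner. I’m assuming a role name of codebuild-hsr-service-role purely for illustration, and I’m using the broad AWS managed policy here; you could, and probably should, scope this down for a production pipeline:

```bash
# Attach the broad RDS managed policy to the CodeBuild service role (role name is illustrative)
aws iam attach-role-policy \
    --role-name codebuild-hsr-service-role \
    --policy-arn arn:aws:iam::aws:policy/AmazonRDSFullAccess
```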
With the security set, we’re going to define how the build commands are run. You can hard code them into the CodeBuild project. Alternatively, and the route I’m taking, you can use a build specification file, buildspec.yml. This is a YAML file (and YAML means Yet Another Markup Language, I don’t care what the cool kids tell you) with the commands built in. I do this because it allows me to modify what’s happening on any given build without editing my CodeBuild configuration. I change the file and PUSH it to GitHub, and the new commands I’ve provided get used. Nice and easy.
There are a bunch more settings you can take direct control of. However, I’m happy enough at the moment with the defaults. I don’t need a lot out of logging here. I’m not trying to chain multiple builds and multiple batches. I’m not generating artifacts (although I could easily generate the SQL command that I use for the deployments to my PostgreSQL database). As such, I’m leaving all these defaults in place:
With all that completed, we can click on the button at the bottom of the page. However, we’re not done yet. We still have to set up the webhook over on GitHub. Happily, that’s silly simple. After clicking on the create button, we’ll be presented with a couple of pieces of information:
We’ll be using these over in GitHub. Don’t close them before you capture them. You’ll be unhappy. And those won’t do you any good if you’re looking to hack my stuff. This isn’t my actual project.
So, over on GitHub, you have to navigate to Settings, then down to Webhooks. Adding a new one looks like this:
The Payload URL is coming from the screen on AWS. Same thing for the Secret. You do need to change the Content type to ‘application/json’. After that, you’re good to go. You’ve built your CodeBuild project and it will now respond to changes in GitHub.
Creating the buildspec.yml File
While technically we’re done, and the CodeBuild pipeline is built and functional, the fact is that, at the moment, it’s not going to do anything. We have to do one more thing: set up the YAML file with the commands that run Flyway and do our database deployments. Further, we have a few choices on the Flyway configuration. While there is more than one way to set up Flyway, I’m going to use a container. It’s an easy way to run it from the Linux VM we create in the CodeBuild project.
Our first choice on the configuration is to hard code the values into the Flyway commands, skipping the configuration file completely. I’m not a fan of this, if only because a configuration file means that any changes we make to it, maybe even adding settings we’re not currently using, can get picked up by our cloud-based deployment. So, instead, we need to look at adding an additional configuration file somewhere. You can do this in two ways, and either works. The first way is to give a new configuration file a new name and then pass that name to Flyway when you run it. Or, the choice I made, add a second configuration file in a second location and reference that when doing a build. I have a ‘conf’ folder in my project, and I’ve put a separate configuration file there.
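For completeness, the first approach, a differently named file passed explicitly, would look something like this when running Flyway (the file name flyway_rds.conf is hypothetical):

```bash
# Point Flyway at an explicitly named configuration file instead of the defaults
flyway -configFiles=conf/flyway_rds.conf migrate
```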
In that second configuration file, I’m going to change my connection string:
```
flyway.url=jdbc:postgresql://hsr.c2ek9ilzyxhz.us-east-1.rds.amazonaws.com:5432/hamshackradio
```
That ensures that I’ll connect up to the AWS RDS PostgreSQL cluster.
Then, I need to configure the commands to run Flyway. Here’s my buildspec.yml file:
```yaml
version: 0.2
phases:
  #install:
  #pre_build:
  build:
    commands:
      #- docker run --rm -v $(pwd)/sql:/flyway/sql -v $(pwd)/conf:/flyway/conf flyway/flyway repair
      - docker run --rm -v $(pwd)/sql:/flyway/sql -v $(pwd)/conf:/flyway/conf flyway/flyway migrate
      #- docker run --rm -v $(pwd)/sql:/flyway/sql -v $(pwd)/conf:/flyway/conf flyway/flyway clean
  #post_build:
#artifacts:
#  files:
```
What you see there are three different commands, one each for repair, migrate and clean. This is so I can easily swap between the three as I’m experimenting and testing until I’m happy with the behavior. For those who don’t know, repair fixes things when you’ve done something like deleting a migration file. Clean is for cleaning out the schema entirely; I use that to reset the database for some testing. Otherwise, I’m dependent on simply flyway migrate. All other settings and whatnot are handled in the configuration file.
However, there are a couple of things to note about the docker run command I use to call Flyway. First, I have to map two volumes into the container. One example is -v $(pwd)/conf:/flyway/conf. This maps my local location, $(pwd), plus the /conf folder described earlier where my configuration file lives, to the internal /flyway/conf location that Flyway expects inside the container. Same thing with the sql file locations.
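One more note on credentials: the RDS configuration file above only contains the URL. Rather than committing a password to the repository, one option (and it’s just one option, not what the buildspec above is doing) is to hand the user and password to the container at runtime through Flyway’s environment variables. In CodeBuild, the value could come from the project’s environment variables or from Secrets Manager:

```bash
# Pass credentials into the Flyway container at runtime instead of storing them in conf (sketch)
docker run --rm \
    -v $(pwd)/sql:/flyway/sql \
    -v $(pwd)/conf:/flyway/conf \
    -e FLYWAY_USER=postgres \
    -e FLYWAY_PASSWORD="$DB_PASSWORD" \
    flyway/flyway migrate
```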
Now then, with all that done, we’re ready. I can introduce changes to my ./sql folder, do a PUSH up to GitHub, and then AWS CodeBuild will pick them up. It’ll run Flyway, which will deploy the code:
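To make that concrete, a typical change cycle from the local machine looks something like this; the migration file name is just an example, what matters is Flyway’s V<version>__<description>.sql naming:

```bash
# Add a new versioned migration and push it; CodeBuild and Flyway take it from there
git add sql/V3__add_operators_table.sql
git commit -m "Add operators table"
git push origin main
```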
Next steps would be to add some automated testing to ensure the deployments worked as intended.
Conclusion
As you can see, while the fundamental concept, flyway migrate, is simple and straightforward, getting it implemented in an automation tool like AWS CodeBuild is not. Inevitably, there’s a degree of complexity in putting together a Continuous Integration pipeline. So many different things interact. It’s not just a question of running Flyway. We have to connect GitHub to our flow control tool in AWS CodeBuild. That has to be configured with the correct permissions to affect our database. Docker is introduced in order to run Flyway without having to do some kind of install. All of it has to be correctly configured with appropriate security. In short, it’s a little complicated. At the end though, it becomes something you can just let run in the background, automatically deploying your databases for you.