Data Container Revisions, Resets and Graduations

Redgate Test Data Manager allows developers to save each new version of a database as a data container revision. After making local development changes to the container, or running tests, they can instantly reset it to its starting revision. They can also load any previous revision and can even 'graduate' a revision to an image, providing a new baseline for ongoing team development. These techniques are especially effective when used in conjunction with Flyway, which automatically tracks the version of every copy of the database.

Guest post

This is a guest post from Phil Factor. Phil Factor (real name withheld to protect the guilty), aka Database Mole, has 30 years of experience with database-intensive applications.

Despite having once been shouted at by a furious Bill Gates at an exhibition in the early 1980s, he has remained resolutely anonymous throughout his career.

He is a regular contributor to Simple Talk and SQLServerCentral.

One of the great advantages of using data containers for database development is the ease and speed with which you can create or reset them. In Redgate Test Data Manager, only the development changes made to each container require extra storage space on disk (in a diff file, per container). This minimizes the hardware resources needed for storing database files and makes it possible to have a separate container for every current revision of a database under development. It also makes it very easy to undo database changes, just by resetting the container to the latest saved revision.

rgclone save and reset commands

This minimizes the need to do development work within transactions or, if you are using an RDBMS like MYSQL that can’t rollback DDL in a transaction, it removes the reliance on undo migration files, or the need to restore backups.

Overview of the save, load, reset and graduate commands

Using Redgate Test Data Manager’s cloning CLI (rgclone), you can save revisions of a data container, using the save command:

rgclone save  data-container <Name-Of-The-Container>

This allows you to access several previous versions via the load command, specifying the revision you need, in this case rev.0 which means “as originally created from the data image“:

rgclone load data-container <Name-Of-The-Container> --revision=rev.0

If you’re working on the latest version of the database, say, you might make some additional changes to try out an idea for a feature, or check a possible performance enhancement. You can then discard the changes instantly, by resetting the data container to the previous saved revision:

rgclone reset data-container <Name-Of-The-Container>

This is the operation that allows us to perform a teardown after running any tests that require modifications to the database, either schema or data, without needing to create a new data

Finally, you can use the graduate command to create a new image from the current revision of a data container, or from a revision that you specify:

rgclone graduate data-container < Name-Of-The-Container Or_ContainerID> --revision <RevisionID> --name <GraduatedImageName>

Working with images, containers and revisions

The way you use a data container depends on your database methodology, but they are always best considered as volatile copies of the database that are used for a single process, such as a test-run, or a branch development, and then destroyed.

We can use one container per branch and several developers could connect to it and make changes, such as by running Flyway migrations. The owner of the data container, as authorized by the authentication token, can save revisions. In practice this would mean that the token would be ‘owned’ by the process that automated the process of saving each Flyway version as a revision. However, we’ll save the topic of automation to the next article. Here, we’ll just be executing PowerShell code from the command line.

The test drive

The demo PowerShell code that follows uses rgclone to create a data image, and from it a data container hosting an empty Pubs database. After connecting to the data container from Flyway and migrating the database to a new version, it will save the data container as a revision and then demonstrate the use of the load, reset and graduate commands, discussed above.

I’ll show the full code listing later in the article, but first let’s take it for a test drive. The code starts by creating an image definition (YAML) file that uses a prescript to create an empty Pubs database.  Thergclone CLI uses this file to create a data image, and from this image a data container, which is a working database instance (a SQL Server instance, in this example) running in a container. You can see previous articles such as Getting Started with Cloning in Redgate Test Data Manager for details.

We then connect to the data container from Flyway and execute the flyway info command. It confirms that it’s an empty database and that there are three pending migration files in the Flyway project:

PathToPubs> Get-content $Credentials  -Raw | flyway info -reportEnabled=true -configFiles=-
Loaded configuration from standard input
Schema history table [master].[dbo].[flyway_schema_history] does not exist yet
Schema version: << Empty Schema >>
+-----------+---------+----------------------------+------+---------+----------+
| Category  | Version | Description                | Type | State   | Undoable |
+-----------+---------+----------------------------+------+---------+----------+
| Versioned | 1.1     | FirstRelease               | SQL  | Pending | Yes      |
| Versioned | 1.2     | SecondRelease1-1-3to1-1-11 | SQL  | Pending | No       |
| Versioned | 1.3     | ThirdRelease1-1-11to1-1-15 | SQL  | Pending | No       |
+-----------+---------+----------------------------+------+---------+----------+

So, we then use flyway migrate to migrate the Pubs database to the latest version, which is v1.3:

PathToPubs> Get-content $ConfigFile  -Raw | flyway migrate -configFiles=-
(deleted verbose content)
Successfully applied 3 migrations to schema [dbo], now at version v1.3 (execution time 00:36.657

We now save this new version as a data container revision:

PathToPubs> rgclone save data-container $ContainerName
Successfully saved data container 'Pubs-TestRevisioning-mssql-develop-container' (55)
New revision is 'rev.1'

Whenever we need a fresh copy of v1.3 for testing, we simply load it as a rev.1 data container. Since rgclone gives the initial data container a revision of rev.0, we can also the load command to revert to an empty database, rather than use the flyway clean command (though in practice it’s sometime faster to do a clean):

PS S:\work\Github\FlywayTeamwork\Pubs> rgclone load data-container $ContainerName --revision=rev.0
Successfully loaded data container 'Pubs-TestRevisioning-mssql-develop-container' (55)
New revision is 'rev.0-eivl.0'

We can prove that we once more have the empty database by running flyway info

PathToPubs> Get-content $Credentials  -Raw | flyway info -reportEnabled=true -configFiles=-
Loaded configuration from standard input
Schema history table [master].[dbo].[flyway_schema_history] does not exist yet
Schema version: << Empty Schema >>
+-----------+---------+----------------------------+------+---------+----------+
| Category  | Version | Description                | Type | State   | Undoable |
+-----------+---------+----------------------------+------+---------+----------+
| Versioned | 1.1     | FirstRelease               | SQL  | Pending | Yes      |
| Versioned | 1.2     | SecondRelease1-1-3to1-1-11 | SQL  | Pending | No       |
| Versioned | 1.3     | ThirdRelease1-1-11to1-1-15 | SQL  | Pending | No       |
+-----------+---------+----------------------------+------+---------+----------+

…but then instantly recreate a working version v1.3 by loading rev.1. No need to run flyway migrate:

PathToPubs> rgclone load data-container $ContainerName --revision=rev.1
Successfully loaded data container 'Pubs-TestRevisioning-mssql-develop-container' (55)
New revision is 'rev.1-mgxy.0'

The database is now back at version 1.3. The Flyway Schema History table is preserved with the database, so there is no need for any action to keep Flyway in sync with the container.

Imagine that you’re working on a feature branch, in version control, and have previously saved your progress as a rev.1 data container. You’ve subsequently made some rash changes and need to undo them quickly. In this case, I’ll just ‘accidentally’ clean the database, removing all objects:

PathToPubs> Get-content $Credentials  -Raw | flyway clean -configFiles=- -reportEnabled=true
Loaded configuration from standard input
Successfully dropped pre-schema database level objects (execution time 00:00.083s)
Successfully cleaned schema [dbo] (execution time 00:04.233s)
Successfully cleaned schema [classic] (execution time 00:00.862s)
Successfully cleaned schema [accounting] (execution time 00:00.591s)
Successfully dropped post-schema database level objects (execution time 00:00.357s)

A flyway info command will confirm that it’s all gone. Sweating slightly, I run the reset command:

PathToPubs>rgclone reset data-container $ContainerName
Successfully reset data container 'Pubs-TestRevisioning-mssql-develop-container' (55)

Running flyway info again confirms that I’ve successfully reverted the database to starting revision point:

PathToPubs> Get-content $Credentials  -Raw | flyway info -reportEnabled=true -configFiles=-
Schema version: 1.3
+-----------+---------+------------------------------+--------+---------+----------+
| Category  | Version | Description                  | Type   | State   | Undoable |
+-----------+---------+------------------------------+--------+---------+----------+
|           |         | << Flyway Schema Creation >> | SCHEMA | Success |          |
| Versioned | 1.1     | FirstRelease                 | SQL    | Success | Yes      |
| Versioned | 1.2     | SecondRelease1-1-3to1-1-11   | SQL    | Success | No       |
| Versioned | 1.3     | ThirdRelease1-1-11to1-1-15   | SQL    | Success | No       |
+-----------+---------+------------------------------+--------+---------+----------+

There are times when a developer might decide that the latest data container revision should be ‘graduated’ to a data image so the team can then use it as the ‘baseline’ for ongoing work. This is especially useful if it takes a while to migrate a database from ‘clean’ to the current version.

PathToPubs> rgclone graduate data-container $ContainerName --revision=rev.1 --name "$($ContainerName)-V1.3"
Successfully graduated data container 'Pubs-TestRevisioning-mssql-develop-container' at revision 'rev.1' to a n
ew data-image
New data image 'Pubs-TestRevisioning-mssql-develop-image-V1.3' (62) available!

Now, it would make a lot of sense when working in a feature branch to have a system that creates a revision for every successful migration within the branch. You wouldn’t want to store historic migrations, only the ones you’d want to revert to in testing. We’ll be showing, in a subsequent article, how to do this within a callback so that you can subsequently ‘undo’ to them.

The PowerShell script

In the meantime, here is the PowerShell script that I used to create the necessary environment and resources for this article so you can ‘play along’:

Summary

Data containers provide some obvious advantages over using a conventional database for database development, especially when testing, or doing branch development. They are geared to the lone process or developer rather than a shared team development.

The save command of the rgclone CLI allows the database developer to save a particular version of the database, running in a container, as a revision. The reset command reverts a data container quickly to the starting revision. With the load command, a database can be loaded repeatedly at any revision. A revision can be saved as an image, via the graduate command, to create copies for testing, or for later use as ‘references’ to establish what exactly is in each version of a database.

With data container, a branch can be isolated, developed, tested and then discarded with a delightful sense of irresponsibility, and without fear of affecting other lines of development until the merge. The obvious initial use is for testing, making the stages of setup and tear-down far quicker and easier. However, the opportunities that are presented for isolated branch development are hard to ignore. Containerized databases make the whole process inherently tidier and easier to manage, and Redgate Test Data Manager’s rgclone CLI makes it all practical without straining network resource.

Tools in this post

Flyway

DevOps for the Database

Find out more

Redgate Test Data Manager

Reliable and secure test data provisioning

Find out more