One Flyway Migration Script for Diverse Database Systems
How to create a single set of SQL migration scripts for Flyway that we can use across multiple database systems, or for all regional variants of a database.
JDBC and ODBC drivers, to be valid, must work to a restricted dialect of SQL, across databases. While this includes both DML and DDL, it is not really sufficient for building a database. These dialects are all defined and specified by the ODBC standard for drivers, and a driver can conform to minimum, core or extended SQL-92. This means that you can even run SQL Queries on text or Excel spreadsheets, or Create Excel databases. Outside this grammar, when using triggers, procedures or functions, you are in rather a wild country, but one that can be navigated cautiously. There is a lot that wasn’t specified. The trick is to keep things simple.
We’ll be using Flyway placeholders to accommodate differences in dialect between database systems, to allow us to run cross database Flyway migrations. I’ll demonstrate installing the classic sample database, Pubs
, on four different database systems, by running a single shared RDBMS-agnostic set of Flyway migrations. We’ll do it for SQL Server, PostgreSQL, MariaDB and, just to show off a bit, SQLite. The code and scripts are here.
Why? Just Why?
I’ll admit that I could save time by phoning every person who needs to run the same database on four different database systems at once. However, I have a data-driven application that needs to be on three, due to the vagaries of hosting. More important are the techniques I illustrate, such as the commenting out of blocks of code. A lot of us deal with ‘variants’, where a version of a database has several different variants, such as an accounting system that must work appropriately for the financial regulations in force in the region. We use the same techniques to deal with this.
What could go wrong?
This shared SQL-92 grammar is useful for the developing polyglot databases, but you’ll soon find discrepancies. The text of the SQL within defaults, for example, can contain code that is for a particular SQL Dialect. The standard doesn’t deal with incrementing IDENTITY
fields. There is no shared concept of a clustered index. SQLite just doesn’t do ALTER
DATABASE
, so we need to shoehorn all the constraints into the CREATE
TABLE
statements. SQLite is a stranger to procedures too, and there is no realistic way around that.
There are many other problems that you’ll come across. You need to deal with all these issues in Flyway by using ‘macros’ or ‘placeholders’. Every database system has the same placeholder keys or names, but the value of a placeholder will be unique to the database system to which you are connecting.
Supporting multiple SQL dialects in a single set of migration scripts
To achieve a single source of migration scripts to build the same database for four different relational database systems we will need to use a special migration source code that has placeholders that are set differently for each database system.
Using Flyway placeholders
Flyway uses placeholders to allow us to specify values that can be used by any executable file. They perform the same role as ‘macros’ in traditional tools. Before Flyway executes a SQL file, it looks for placeholders and will swap them out for the run-time values before the code is executed. See: Customizing Database Deployments using Flyway Callbacks and Placeholders.
Flyway supports placeholder replacement with configurable prefixes and suffixes. These look like ${myplaceholder}
. You can set the values of placeholders from the command line using parameters, from environment variables or from config files. They can be environment variables or be used via the Java API, though these latter methods of using them aren’t much use to us in PowerShell scripts. (See node-flywaydb and node-db-migrate for an interesting way of interacting with Flyway’s API).
There are some useful built-in placeholders that we can use to pass information to any SQL callback or script, such as:
${flyway:defaultSchema}
– the default schema for Flyway${flyway:user}
– the user Flyway will use to connect to the database${flyway:database}
– the name of the database from the connection url${flyway:timestamp}
– the time that Flyway parsed the script, formatted as'yyyy-MM-dd HH:mm:ss'
${flyway:filename}
– the filename of the current script${flyway:workingDirectory}
– the user working directory as defined by the ‘user.dir’ System Property${flyway:table}
– the name of the Flyway schema history table
Building dialect dictionaries
To proceed, we need to build up a dialect dictionary for each RDBMS with values for all the placeholders we need to account for the vagaries of supporting different SQL dialects.
We’ll create our own custom placeholders for each database system and place them in a dialect array. This can then be used by asking Flyway to read in the values for that particular database system.
I’ve used the good old Pubs
database from days of old. It has procedures, views and triggers as well as tables. It also has user-defined types, but I haven’t attempted to deal with these as only PostgreSQL and SQL Server recognize them. I’ve added the constraints but unfortunately, had to reduce the power of the CHECK
constraint, because only SQL Server recognizes the regex syntax that allows you to specify a character range or character collection using []
brackets. I had to give up on triggers for this project because they were all so different in character.
SQLite doesn’t have an ALTER
TABLE
statement which causes problems. You need to save the old data in a temporary table, create a new table under a new name, fill it from the temporary table, delete the old table and the temporary table and finally rename the new table to the old! Fortunately, this exercise doesn’t require ALTER
TABLE
but beware! This is too major a change to be fixed with a placeholder. We avoided the need for ALTER
TABLE
by packing all the definition of constraints and defaults into the CREATE
TABLE
statement and by importing the data in the correct dependency order. I should have chosen a beefier relational database system than SQLite, but then it is rather cute. The other problem is schemas. Flyway depends on schemas, but MariaDB and MySQL don’t have them.
We use the following placeholders:
- ‘currentDateTime‘ – the function used to give the current date
- ‘schemaPrefix’ – the schema prefix to use. (e.g.,
dbo
) MariaDB is a stranger to schemas. - ‘CLOB‘ – the data type you use for text of indeterminate large size
- ‘BLOB’ – the data type you use for binary data of large size
- ‘autoIncrement’ – how you specify an identity column
- ‘DateDatatype’ – the datatype for storing both date and time
- ‘hexValueStart’ – how you declare hex-based binary data
- ‘hexValueEnd’ – how you terminate hex-based binary data
- ‘arg’ – the prefix for the variables that you use as parameters for procedures (SQL Server uses the
@
prefix) - ‘viewStart’ – the code to start the creation of a view
- ‘viewFinish’ – the code to finish the creation of a view
- ‘procStart’ – the initial string for a procedure
- ’emptyProcArgs’ – how you denote that you have no args/parameters for creating a procedure
- ‘procBegin’ – the preliminary string for a procedure
- ‘procEnd’ – the termination string for a procedure
- ‘procFinish’ – the final string for a procedure
Each database system seems to have a different way of doing these things, but with these placeholders set appropriately, one can get a long way in developing a simple database.
Here are some example placeholders for MariaDB:
-placeholders.hexValueStart="decode('" -placeholders.dateDatatype="DateTime" -placeholders.procFinish="// DELIMITER ;" -placeholders.arg="" -placeholders.viewFinish="" -placeholders.autoIncrement="int NOT NULL auto_increment primary key" -placeholders.procBegin="Begin" -placeholders.BLOB="longblob" -placeholders.emptyProcArgs="()" -placeholders.procStart="DELIMITER //" -placeholders.schemaPrefix="" -placeholders.hexValueEnd="','hex')" -placeholders.currentDateTime="CURDATE()" -placeholders.viewStart="" -placeholders.procEnd="end;" -placeholders.CLOB="longtext"
Automating Flyway migrations across multiples relational database systems
It is very satisfying seeing four copies of the Pubs
database being built on four entirely different relational database systems in under a minute, while one sips a cocktail or sherry. Flyway keeps individual record in the flyway_schema_history
table of each database of what migrations have been applied and when. As a result, it doesn’t matter if things get out of sync with successfully applied migrations, and therefore version numbers, when it hits an error on just one of the four databases. If Flyway must do something for a particular database, it learns about it from the schema history table and, if necessary, does it. If it doesn’t, it just shrugs and passes on to the next.
If you want to play along, you can grab a copy of my PubsAgnostic GitHub project. To make this even easier, you can use my PowerShell cmdlet called Get-FilesFromRepo
, which you can get from here via cut-n-paste. Having loaded the cmdlet, just run the following code to create a local copy of the project. I’ve put it in a temporary directory (see the $ProjectFolder
variable below) but you can change this to wherever you want to store it. For example, you might want to create your own GitHub project and store it there.
1 2 3 4 5 6 7 8 |
$ProjectFolder="$env:Temp\PubsAndFlyway\PubsAgnostic"; $Params = @{ 'Owner' = 'Phil-Factor'; 'Repository' = 'PubsAndFlyway'; 'RepoPath' = 'PubsAgnostic'; 'DestinationPath' = $ProjectFolder } Get-FilesFromRepo @Params |
The Scripts subfolder in this project contains two Flyway migration scripts that we’ll execute, in PowerShell, to build and then fill a copy of the classic Pubs
database on four different database systems: SQL Server, PostgreSQL, MariaDB and SQLite (if you need to get up speed on the basics of running Flyway migrations on each of these systems, the links will take you to the right articles).
The work of automating these migrations is done by my Create-FlyWayParametersets
cmdlet (PolyglotFlyway.ps1). It accepts an array of placeholders ($FlywayArray
), one placeholder for each database system. From this, it creates parameter sets that it can then use to connect to each system in turn and run the migration scripts. It will also produce config files if you prefer that.
The placeholder array
We pass as a parameter to the Create-FlyWayParametersets
cmdlet, a data structure ($FlywayArray
) that holds all our placeholders. We have four different database systems with a development database on them. We therefore need four objects in our array.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 |
$FlywayArray = @( @{ 'RDBMS' = 'PostgreSQL'; 'driver' = 'postgresql'; 'Server' = 'MyServer'; #the name or instance of SQL Server 'Database' = 'pubspolyglot'; #The name of the development database that you are using 'Password' = ''; #your password 'UserID' = 'MyUserID'; # your userid 'port' = '5432' 'Schemas' = 'dbo,people' 'PlaceHolders' = @{ 'currentDateTime' = 'CURRENT_DATE'; 'schemaPrefix' = 'dbo.'; 'CLOB' = 'TEXT'; 'BLOB' = 'BYTEA'; 'autoIncrement' = 'INT NOT NULL GENERATED ALWAYS AS IDENTITY primary key'; 'DateDatatype' = 'DATE'; 'hexValueStart' = "decode('"; 'hexValueEnd' = "','hex')"; 'arg' = ''; 'viewStart' = '' 'viewFinish' = ''; 'procStart' = ''; 'emptyProcArgs' = '()'; 'procBegin' = 'Language SQL as $$'; 'procEnd' = '$$;'; 'procFinish' = ''; } }, @{ 'RDBMS' = 'SQLite'; 'driver' = 'sqlite'; 'Server' = ''; #the name or instance of SQL Server 'Database' = 'C:\Users\phil\sqlite\PubsPolyglot.sqlite3'; #The name of the development database that you are using 'Password' = ''; #your password 'UserID' = ''; # your userid 'port' = '' 'Schemas' = '' 'PlaceHolders' = @{ 'currentDateTime' = "DATE('now')"; 'schemaPrefix' = ''; 'CLOB' = 'clob'; 'BLOB' = 'blob'; 'autoIncrement' = 'INTEGER PRIMARY KEY autoincrement'; 'DateDatatype' = 'varchar(50)'; 'hexValueStart' = "x'"; 'hexValueEnd' = "'" 'arg' = ''; 'viewStart' = '' 'viewFinish' = ''; 'procStart' = '/*'; #sqlite can't do procs 'emptyProcArgs' = ''; 'procBegin' = ''; 'procEnd' = ''; 'procFinish' = '*/'; #sqlite can't do procs } }, @{ 'RDBMS' = 'SQLServer'; 'driver' = 'sqlserver'; 'Server' = 'MyServer'; #the name or instance of SQL Server 'Database' = 'pubsPolyglot'; #The name of the development database that you are using 'Password' = ''; #your password 'UserID' = 'MyUserID'; # your userid 'port' = ''; 'Schemas' = 'dbo,people' 'PlaceHolders' = @{ 'currentDateTime' = 'GetDate()'; 'schemaPrefix' = 'dbo.'; 'CLOB' = 'NVARCHAR(MAX)'; 'BLOB' = 'varbinary(MAX)'; 'dateDatatype' = 'DateTime2'; 'autoIncrement' = 'INT NOT NULL IDENTITY primary key'; 'hexValueStart' = "0x"; 'hexValueEnd' = "" 'arg' = '@'; 'viewStart' = 'GO'; 'viewFinish' = 'GO'; 'procStart' = '--'; 'emptyProcArgs' = ''; 'procBegin' = 'as'; 'procEnd' = '--'; 'procFinish' = 'GO'; } }, @{ 'RDBMS' = 'MariaDB'; 'driver' = 'mariadb'; 'Server' = 'MyServer'; #the name or instance of SQL Server 'Database' = 'pubsPolyglot'; #The name of the development database that you are using 'Password' = ''; #your password 'UserID' = 'MyUserID'; # your userid 'port' = '3307' 'Schemas' = 'pubsPolyglot' 'PlaceHolders' = @{ 'currentDateTime' = 'CURDATE()'; 'schemaPrefix' = ''; 'CLOB' = 'longtext'; 'BLOB' = 'longblob'; 'dateDatatype' = 'DateTime'; 'autoIncrement' = 'int NOT NULL auto_increment primary key'; 'hexValueStart' = "decode('"; 'hexValueEnd' = "','hex')"; 'arg' = ''; 'viewStart' = ''; 'viewFinish' = ''; 'procStart' = 'DELIMITER //'; 'emptyProcArgs' = '()'; 'procBegin' = 'Begin'; 'procEnd' = 'end;'; 'procFinish' = '// DELIMITER ;'; } }) |
Once you’ve done all this, it is just a matter of steering to the right directory and keying in the basic Flyway command and action.
Passing in this array as a parameter, in PowerShell, is only one way to do it. Another way is to specify the placeholder values for each database system in its own in Flyway config file (Flyway has an infinite appetite for these files), and then specify the path to the config file for that specific database type as a parameter to Flyway. You can also save your database details and project data as config files.
I use PowerShell to speed up development and because it helps with bookkeeping (such as ensuring that all placeholders are defined if you need to introduce some more), which I’m not good at doing.
You can, of course, keep this array as a JSON file. If you want to save or edit these as JSON, each JSON document will look like this:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 |
{ "PlaceHolders": { "hexValueStart": "decode(\u0027", "DateDatatype": "DATE", "procFinish": "", "arg": "", "viewFinish": "", "autoIncrement": "INT NOT NULL GENERATED ALWAYS AS IDENTITY primary key", "procBegin": "Language SQL as $$", "BLOB": "BYTEA", "emptyProcArgs": "()", "procStart": "", "schemaPrefix": "dbo.", "hexValueEnd": "\u0027,\u0027hex\u0027)", "currentDateTime": "CURRENT_DATE", "viewStart": "", "procEnd": "$$;", "CLOB": "TEXT" }, "Server": "MyDatabaseServer", "Password": "", "driver": "postgresql", "port": "5432", "RDBMS": "PostgreSQL", "UserID": "MyUserID", "Schemas": "dbo,people", "Database": "pubspolyglot" } |
I’ve put the complete file here in FlyWayJsonPlaceholders.json
Creating parameter sets from placeholders
The Create-FlyWayParametersets
PowerShell cmdlet does the processing for you and produces an array of parameters for Flyway. If you just want to develop one database, you specify the name of the RDBMS in the -WhichToDo
optional parameter.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 |
<# .SYNOPSIS Creates Flyway Parameter sets from a list of placeholders and parameters .DESCRIPTION A way of generating a whole lot of parameter sets that can be applied to Flyway. Each object of the Flyway Array creates a set of Flyway parameters. The routine automatically stores and inserts passwords, and runs various checks as well as creating the Flyway parameters. .PARAMETER ProjectFolder The path to the project folder containing the migrtation scripts .PARAMETER ProjectDescription A description of the project .PARAMETER ProjectName The name you give to the project .PARAMETER FlywayArray The collection of data objects that describe each database .PARAMETER WhichToDo A wildcard to specify which of the whole collection you want to process .PARAMETER $ConfigFile Is this to create a config file rather than a list of parameters?. .EXAMPLE PS C:\> Create-FlyWayParametersets -ProjectFolder $value1 -ProjectDescription $value2 .NOTES Additional information about the function. #> function Create-FlyWayParametersets { [CmdletBinding()] param ( [Parameter(Mandatory = $true)] $ProjectFolder, [Parameter(Mandatory = $true)] $ProjectDescription, [Parameter(Mandatory = $true)] $ProjectName, [Parameter(Mandatory = $true)] $FlywayArray, [Parameter(Mandatory = $False)] $WhichToDo = '*', #meaning do all the database. You can do just one or a few [Parameter(Mandatory = $False)] $ConfigFile = $False #Change to $true if you are writing config files ) if ($ConfigFile) { $Dlmtr = ''; $Prefix = 'Flyway.' } else { $Dlmtr = '"'; $Prefix = '-' } $TheDatabases = $FlywayArray | Where { $_.RDBMS -like $WhichToDo } | foreach { if (!([string]::IsNullOrEmpty($_.UserID))) #then it is using integrated Credentials { # we see if we've got these stored already #if you change your password, you just delete the file $SqlEncryptedPasswordFile = "$env:USERPROFILE\$($_.UserID)-$($_.Server)-$($_.RDBMS).xml" # test to see if we know about the password in a secure string stored in the user area if (Test-Path -path $SqlEncryptedPasswordFile -PathType leaf) { #has already got this set for this login so fetch it $SqlCredentials = Import-CliXml $SqlEncryptedPasswordFile } else #then we have to ask the user for it (once only) { # hasn't got this set for this login $aborted = $false #in case the user doesn't want to enter the password $SqlCredentials = get-credential -Credential $_.UserID # Save in the user area if ($SqlCredentials -ne $null) #in case the user aborted { $SqlCredentials | Export-CliXml -Path $SqlEncryptedPasswordFile <# Export-Clixml only exports encrypted credentials on Windows. otherwise it just offers some obfuscation but does not provide encryption. #> } else { $aborted = $True } } #so we can now provide the password $_.Password = "$($SqlCredentials.GetNetworkCredential().password)"; $_.UserID = $($SqlCredentials.UserName) } $URL = switch ($_.driver) # we need to get the right format of URL { 'sqlserver' { 'jdbc:<driver>://<host>:<port>;databaseName=<database>' } #sqlServer 'mariadb' { 'jdbc:<driver>://<host>:<port>/<database>' } # mariadb 'postgresql' { 'jdbc:<driver>://<host>:<port>/<database>' } #postgresql 'sqlite' { 'jdbc:sqlite:<database>' } #the database is a file address default { 'unrecognised driver' } } @(#now we substitute the real values for the placeholders @{ '<driver>' = $_.Driver }, @{ '<host>' = $_.Server }, @{ ':<port>' = "$(if ($_.Port -eq '') { '' } else { ':' + $_.port })" }, @{ '<database>' = "$($_.Database)$(if ($_.driver -ne 'sqlite' -and ($_.UserID -eq '' -or $_.Password -eq '')) { ';integratedSecurity=true' } else { '' })" # integratedSecurity not for SQLITE! } ) | foreach{ # replace each macro/placeholder with its value $URL = $URL.Replace($_.Keys[0], $_.Values[0]) } # Now we need to deal with the rest of the values $FlArgs = @(); $FlArgs = @("$($Prefix)url=$URL", "$($Prefix)user=$($_.UserID)", "$($Prefix)password=$($_.Password)" ) # now add the global project-variables for all databases $FlArgs += @("$($Prefix)locations=filesystem:$ProjectFolder\Scripts", "$($Prefix)schemas=$($_.schemas)"); $FlArgs += <# the project variables that we reference with placeholders #> @("$($Prefix)placeholders.projectDescription=$Dlmtr$ProjectDescription$Dlmtr", "$($Prefix)placeholders.projectName=$Dlmtr$ProjectName$Dlmtr") <# the project variables #> $placeholders = $_.PlaceHolders $PlaceHolders.Keys | foreach{ $FlArgs += "$($Prefix)placeholders.$_=$Dlmtr$($placeholders.$_)$Dlmtr" } @{ 'url' = $URL; 'args' = $FLArgs; } } $TheDatabases } |
Running the polyglot migrations
With all this in place, we simply call the Create-FlyWayParametersets
supplying the path our project folder, and a project name sand description. It runs the same set of migrations on each of the databases and tames the output of Flyway so that one can do rapid database development work.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 |
Create-FlyWayParametersets ` -ProjectFolder 'PathToMyGitFolder\PubsAgnostic' ` -ProjectDescription 'Experiment to use a single migration folder for several RDBMSs' ` -ProjectName 'PubsAgnostic' -FlywayArray $FlywayArray |foreach { "... processing $($_.url)" $Params=$_.args Flyway @Params clean $Result = Flyway @Params -outputType=json migrate | convertFrom-JSON if ($Result.error -ne $null) { write-warning $Result.error.message } else # we report any migration { $Before=$result.initialSchemaVersion; $After=$result.targetSchemaVersion; if ([string]::IsNullOrEmpty($before)){$Before='empty'} if ([string]::IsNullOrEmpty($After)){$After=$Before} if ($Before -ne $after) { "$($result.database) $($Result.operation)d $($Result.migrationsExecuted) version(s) from $before to $After" } else {'No migration was made'} } } |
Summary
In this article, I’ve demonstrated the power of placeholders in Flyway. To illustrate this, I’ve created a project that builds the old Pubs database to four different Relational database systems, despite several differences in syntax. This use of placeholders extends further, of course. There are plenty of occasions when a version of a database must deal with variants: you may need slightly different databases of the same version to be compatible with different applications, different legislative frameworks or for doing performance tests.
Hopefully, this demonstration will have given you idea of how to get around the complications that often face database projects that operate strict versioning.