19 May 2022

Guest post

This is a guest post from Phil Factor. Phil Factor (real name withheld to protect the guilty), aka Database Mole, has 30 years of experience with database-intensive applications.

Despite having once been shouted at by a furious Bill Gates at an exhibition in the early 1980s, he has remained resolutely anonymous throughout his career.

He is a regular contributor to Simple Talk and SQLServerCentral.

Phil Factor

19 May 2022

Automating Flyway Development Chores using Database Diagrams

If you can produce a quick Entity-Relationship diagram for any version of a database, you'll have a simple way to 'sanity check' it for unreferenced tables, missing keys and other design flaws. This article shows how to auto-generate these diagrams when running a migration, using Flyway Teams and PowerShell.

Guest post

You might think that the only purpose of Flyway is to apply migration scripts to a database, in the right order, to create whatever version of the database you wish. In fact, Flyway lends itself to being the centre of a development and deployment system and can help you automate many other development chores.

Developing databases is different to developing procedural code. Developers need more information and need to automate far more of the routine ‘chore’ work. Luckily Flyway, with its architecture of placeholders, callbacks and configuration files, is designed to make automation easy.

Catering for ad-hoc reports

Whenever Flyway does a migration run, it inspects the ‘locations’ that you’ve specified in the locations configuration parameter for any callback script that it must execute. If you’re using the community version of Flyway, you can use either SQL or Java callbacks. With Flyway Teams, you can also use PowerShell.

The sort of task that you’d generally want to script out at this point would be creating a build script, doing a backup of the new version or creating a database model. These are all-or-nothing tasks that don’t require input from you. You would never want a light backup, a well-grilled backup or a single-source backup: just a backup.

There are of course many other routine tasks to be done during a Flyway development, such as bug fixing, database code reviews, performance tuning, investigating potential data issues (e.g., causing problems or inconsistencies in business reports) or doing a security check. I like to call these the ‘canteen’ tasks because they feed on the results of the all-or-nothing tasks.

For each canteen task, you need a particular report to answer specific questions. For example, when performing a database code review, you’ll want to answer questions such as:

What tables aren’t referenced by any other table?
Are there any duplicate or near-duplicate indexes?
Have you a table without any candidate keys?
Are there any tables with columns incorrectly allowing NULLs?

You can extract the answers from the database model, for the version of database you’re investigating. It is all about making it easier to maintain database quality standards and, with this approach, you don’t even need access to the database itself. That is neat, because you then don’t need to learn the metadata calls for that particular RDBMS.

The FlywayTeamwork PowerShell framework that I use generates these models if you request them, for any of the more common RDBMSs (SQL Server, PostgreSQL, MariaDB, MySQL and SQLite). My article, Simple Reporting with Flyway and Database Models, shows how to generate the model, produced by a script called by Flyway, and then use it for simple reporting. In this article we’ll take it to a more advanced level.

Producing the code for a very simple Entity-Relationship diagram

Mapping and documenting the Foreign Key relationships between tables is one of those many rather tedious routine tasks that tend to be neglected when a team gets under pressure, which is a shame because it is so easy to spot mistakes in a diagram, whereas errors tend to get lost in lists or code. Far more time is wasted tracking down foreign key problems than are saved by abandoning diagramming. It is part of the greater task of working out the dependencies of a table and the consequences of making a change. So why not automate the process?

Databases are easier to develop if you have Entity-Relationship diagrams covering the tables within the domain of your work. I usually create the UML diagrams for designing databases and processes from PlantUML. You can design a database change, and communicate it to the team, by generating a diagram. These diagrams are scripts, so they can easily be placed in source control, and a change can automatically update the graphic.

Here is a small sample of the sort of thing you can produce:

A simple Entity-Relationship diagram

PlantUML will create Entity-Relationship diagrams as well as classes, states, deployment, or Gantt charts. I’ve already demonstrated how it can produce a Gantt chart from the contents of the Flyway Schema History table, to track the progress of a Flyway project.

Although it will produce Entity Relationship diagrams, you generally wouldn’t want a diagram for an entire database: no paper-size is big enough for a typical corporate database. You’d be more interested in groups of inter-related tables but even then, you’d probably want to exclude certain relationships from the diagram.

Generate the PUML code for the E-R diagram from the database model

Your first task is to locate the model for the version of the database for which you want to generate the E-R diagram.

I’ll refer you to the Simple Reporting with Flyway and Database Models article for details of how to generate the model using Flyway and my PowerShell framework. If you are already using the project directory as your working directory, and have set up all your working parameters, you can read it in like this…

<# We can easily get hold of the JSON model of the database and convert it into a PowerShell object.#>

$Model = ConvertFrom-json (

[IO.File]::ReadAllText("$($dbDetails.ReportLocation)\current\Reports\DatabaseModel.JSON")

)

Alternatively, I’ve provided some sample models for various RDBMSs, so you’ve something to play with. You can read in any of these models, like this:

$Model = ConvertFrom-json (

[IO.File]::ReadAllText("<pathToModel>\<MyModelName>.JSON")

)

Having done that, we can create the code for the diagram. Here is the script that will generate the PUML code from the model that we just read in:

$PumlCode =

@startuml

' hide the spot

hide circle

' avoid problems with angled crows feet

skinparam linetype ortho

$PumlCode += $model | Display-object -depth 3 |

where { $_.Path -match '\$\..+?\.Table' } |

foreach{

$bits = $_.Path.split('.');

[pscustomobject]@{ 'TableName' = "$($Bits[1]).$($bits[3])" }

} | foreach {

"`r`nentity $($_.TableName) "

}

$PumlCode += $model | Display-object -depth 10 |

where{ $_.path -like '$.*.*.*.foreign key.*.Foreign Table' } |

foreach{

$bits = $_.Path.split('.');

[pscustomobject]@{

'TableSchema' = "$($Bits[1])";

'TableName' = "$($bits[3])";

'Key' = "$($bits[5])";

'ReferenceSchema' = "$($_.Value)".Split('.')[0];

'ReferenceTable' = "$($_.Value)".Split('.')[1]

}

} | foreach {

"`r`n$($_.TableSchema).$($_.TableName) }|..|| $($_.ReferenceSchema).$($_.ReferenceTable) "

}

$pumlCode +=@"

@enduml

$PumlCode

This will produce the following PUML:

@startuml

' hide the spot

hide circle

' avoid problems with angled crows feet

skinparam linetype ortho

entity dbo.authors

entity dbo.discounts

entity dbo.editions

entity dbo.employee

entity dbo.jobs

entity dbo.prices

entity dbo.publications

entity dbo.publishers

entity dbo.pub_info

entity dbo.roysched

entity dbo.sales

entity dbo.stores

entity dbo.tagname

entity dbo.tagtitle

entity dbo.titleauthor

dbo.discounts }|..|| dbo.stores

dbo.employee }|..|| dbo.jobs

dbo.employee }|..|| dbo.publishers

dbo.employee }|..|| dbo.jobs

dbo.pub_info }|..|| dbo.publishers

dbo.roysched }|..|| dbo.publications

dbo.sales }|..|| dbo.stores

dbo.sales }|..|| dbo.publications

dbo.tagtitle }|..|| dbo.tagname

dbo.tagtitle }|..|| dbo.publications

dbo.titleauthor }|..|| dbo.authors

dbo.titleauthor }|..|| dbo.publications

@enduml

Viewing and editing the E-R diagram

We can start very simply with PUML. All you need is the web-based PlantUML editor to render the above PUML code into an E-R diagram. Simply paste in the code and hit “Submit“, which gives our initial diagram (we have to shrink even this database to view it):

Basic E-R diagram for pubs database

We can do a bit better than that, though. Firstly, we can make it a lot prettier by editing the PlantUML code:

@startuml

' hide the spot

hide circle

' avoid problems with angled crows feet

skinparam wrapWidth 150

skinparam handwritten true

skinparam monochrome true

skinparam packageStyle rect

skinparam defaultFontName Buxton Sketch

skinparam shadowing true

skinparam MessageAlign left

skinparam header{

FontColor black

FontSize 14

}

left to right direction

Title Pubs Database Version 1.1.7

header Date: 22/04/2022

footer what does this do

entity dbo.authors

entity dbo.discounts

entity dbo.editions

entity dbo.employee

entity dbo.jobs

entity dbo.prices

entity dbo.publications

entity dbo.publishers

entity dbo.pub_info

entity dbo.roysched

entity dbo.sales

entity dbo.stores

entity dbo.tagname

entity dbo.tagtitle

entity dbo.titleauthor

dbo.discounts }|.up.|| dbo.stores

dbo.employee }|..|| dbo.jobs

dbo.employee }|..|| dbo.publishers

dbo.employee }|..|| dbo.jobs

dbo.pub_info }|..|| dbo.publishers

dbo.roysched }|.up.|| dbo.publications

dbo.sales }|..|| dbo.stores

dbo.sales }|..|| dbo.publications

dbo.tagtitle }|.right.|| dbo.tagname

dbo.tagtitle }|..|| dbo.publications

dbo.titleauthor }|..|| dbo.authors

dbo.titleauthor }|..|| dbo.publications

@enduml

My additions give the code hints on displaying the diagram. This has to be done by hand. The use of a special ‘handwriting’ font and ‘hand-drawn’ lines is handy for people who are struggling with process and don’t realize that this is a ‘working’ diagram rather than a splendid design proposition. It must be done on a PC because it uses a font that isn’t available to online diagram generators.

Here is the resulting “prettified” E-R diagram:

E-R diagram with styling

This will immediately flag up missing relationships between tables, of course, but we could really do more, and we will…

Which keys are being used?

We’d like to know which keys are being used to reference other tables, and which keys are being referenced. In the model, we already pick up these details for the comments or documentation that are applied to a table, so we can add them to our diagram. We could do with a title and date too, so let’s add those as well.

<# we find every table reference from the model simply by examining the

Foreign keys #>

$title="Pubs Database Version 1.1.7"

$date="$((Get-date).Date.ToString().Replace('00:00:00', ''))"

$footer='Produced by PlantUML'

$Reference = $model | Display-object -depth 10 |

where{ $_.path -like '$.*.*.*.foreign key.*.Foreign Table' } | foreach{

$bits = $_.Path.split('.');

[pscustomobject]@{

'TableSchema' = "$($Bits[1])";

'TableName' = "$($bits[3])";

'Key' = "$($bits[5])";

'ReferenceSchema' = "$($_.Value)".Split('.')[0];

'ReferenceTable' = "$($_.Value)".Split('.')[1]

}

# now we can <a id="post-6917917-_Hlk103852679"></a>get all the tables that are either referencing or referenced

# in any real database, the resulting diagram would be too big but we will

# deal with that problem

$Tables = $Reference | Select TableSchema, Tablename

# We add in the referenced tables

$Tables += $Reference | foreach{

[pscustomobject]@{

'TableSchema' = $_.ReferenceSchema;

'Tablename' = $_.ReferenceTable

}

<# we boil this down to a list of all the participating tables and for each

of them we produce a simple PUML code that defines each entity. We can read

any additional information we need from the table, in this case the comment#>

$EntitiyCode = $Tables | sort -Unique tableSchema, Tablename | Foreach{

$TableSchema = $_.tableSchema; #remember the table schema

$TableName = $_.tablename; #and the table name

#now we can get the value of the comment for that table.

#Lucky you did comments eh?

$comment = $model.$TableSchema.table.$TableName.comment

#we reference the foreign keys. one after another to get a list of keys

$keys = $model.$TableSchema.table.$TableName.'foreign key' |

foreach{ $_.psobject.Properties.Value } | select -ExpandProperty Cols

#We can now create the data within the table entity of the foreign

#keys accesses. If you don't have table documentation, you might need

#some columns too

$TheLisOfKeys = ($keys | foreach{ "* $($_) : number <<FK>>" }) -join "`r`n "

entity "$TableSchema.$TableName" {

$comment

$TheLisOfKeys

}

$EntityRelations = $reference | foreach{@"

$($_.TableSchema).$($_.TableName) ||..|| $($_.ReferenceSchema).$($_.ReferenceTable)

}

$pumlCode = @"

@startuml

' hide the spot

hide circle

' avoid problems with angled crows feet

skinparam linetype ortho

skinparam wrapWidth 150

skinparam MessageAlign left

skinparam header{

FontColor black

FontSize 14

}

Title $title

header Date: $date

footer $footer

$($EntitiyCode -join "`r`n")

$($EntityRelations -join "`r`n ")

@enduml

$PumlCode

This script will produce the following PUML code. You can prettify it quite a lot more, but I am trying to keep this relatively simple.

100

101

102

103

104

105

106

107

108

@startuml

' hide the spot

hide circle

' avoid problems with angled crows feet

skinparam linetype ortho

skinparam wrapWidth 150

skinparam MessageAlign left

skinparam header{

FontColor black

FontSize 14

}

Title Pubs Database Version 1.1.7

header Date: 22/04/2022

footer Produced by PlantUML

entity "dbo.authors" {

The authors of the publications. a publication can have one or more author

}

entity "dbo.discounts" {

These are the discounts offered by the sales people for bulk orders

* stor_id : number <<FK>>

}

entity "dbo.employee" {

An employee of any of the publishers

* job_id : number <<FK>>

* pub_id : number <<FK>>

}

entity "dbo.jobs" {

These are the job descriptions and min/max salary level

}

entity "dbo.pub_info" {

this holds the special information about every publisher

* pub_id : number <<FK>>

}

entity "dbo.publications" {

This lists every publication marketed by the distributor

}

entity "dbo.publishers" {

this is a table of publishers who we distribute books for

}

entity "dbo.roysched" {

this is a table of the authors royalty scheduleable

* title_id : number <<FK>>

}

entity "dbo.sales" {

these are the sales of every edition of every publication

* title_id : number <<FK>>

* stor_id : number <<FK>>

}

entity "dbo.stores" {

these are all the stores who are our customers

}

entity "dbo.tagname" {

All the categories of publications

}

entity "dbo.tagtitle" {

This relates tags to publications so that publications can have more than one

* title_id : number <<FK>>

* TagName_ID : number <<FK>>

}

entity "dbo.titleauthor" {

this is a table that relates authors to publications, and gives their order of listing and royalty%

* title_id : number <<FK>>

* au_id : number <<FK>>

}

dbo.discounts }|..|| dbo.stores

dbo.employee }|..|| dbo.jobs

dbo.employee }|..|| dbo.publishers

dbo.employee }|..|| dbo.jobs

dbo.pub_info }|..|| dbo.publishers

dbo.roysched }|..|| dbo.publications

dbo.sales }|..|| dbo.stores

dbo.sales }|..|| dbo.publications

dbo.tagtitle }|..|| dbo.tagname

dbo.tagtitle }|..|| dbo.publications

dbo.titleauthor }|..|| dbo.authors

dbo.titleauthor }|..|| dbo.publications

@enduml

In my case, for the pubs the sample database I’ve provided, you get this diagram:

E-R diagram showing Primary and Foreign Keys

Investigating relationships for a subset of tables

Well, this is fine, but it still doesn’t go quite far enough. Even with our small, sample database the diagram starts to get cramped. We can do help a bit by altering the direction of the relations but, actually, we probably don’t even want the entire database plotted out, just those tables that are directly involved in our investigation.

This leads us to a central theme of this article, and more generally of what’s required to support a DevOps approach to team-based development: you cannot predict all the requirements that the team will have for information. You’ll never provide everything necessary in advance. It is much better to provide the ingredients (the model and the code to extract information from it) than the eventual meal. If you just served up an ER diagram of the entire database on every revision, it would make anyone’s eyeballs swivel even if they had the necessary eyesight. It is much better just to provide the basic information you need, by automation, and at that point allow you to find out just what you need.

In this case, we’ll improve the code so you can specify a group of tables for which you need to investigate the relationships. We’ll do it just by passing the name of a member of the group of tables that you need to examine. In this case, we just specify a member of the group such as ‘publications’ to get just a collection of linked tables.

Diagram Description automatically generated

I can’t show in the article all the code to allow you to do this, so I’ve prepared it as a Cmdlet on Github called Create-PUMLEntityDiagram.ps1. However, it really is just a slightly more complex version of the code I’ve shown here.

Either you can generate the model for the whole database or, to get the list of tables, you choose one the tables from the group that you wish to investigate:

Create-PUMLEntityDiagram '*' '*' $model # do them all. It is a small database.

Create-PUMLEntityDiagram 'dbo' 'TagTitle' $model

Create-PUMLEntityDiagram 'dbo' ' TagTitle ' $model 'Publications from PubsMySQL 1.1.7' 'Phil Factor Enterprises'

In this case, I chose TagTitle, but it could be any of the main group.

Summary

Once you have a model of a database, you have the opportunity for gaining a lot more information about the database you are developing. A model is merely a handy way of taking a snapshot of the main features of a database. It in turn facilitates a lot of reports, quality checks, diagrams and programming aids.

It isn’t always possible to extract all the information you need just from the model. With performance work, for example, you need the live database. However, for a lot of the time, information derived from a model can give you many insights into things that aren’t quite right with a database.

Fortunately, Flyway evolved with the advice and input of teams who wanted to automate as many of the processes around database development as possible. It makes all this much more convenient: you can generate a model from a callback and then kick off whatever further actions and reporting from that model that you need.

Tools in this post

Redgate Flyway

Bring stability and speed to database deployments

Find out more

Redgate Flyway

Bring stability and speed to database deployments

Find out more

Redgate Test Data Manager

Redgate Flyway

Redgate Monitor

Overview

Protect

Automate

Monitor

Product articles
Redgate Flyway
Database Documentation
Automating Flyway Development Chores…

Guest post

Automating Flyway Development Chores using Database Diagrams

Guest post

Catering for ad-hoc reports

Producing the code for a very simple Entity-Relationship diagram

Generate the PUML code for the E-R diagram from the database model

Viewing and editing the E-R diagram

Which keys are being used?

Investigating relationships for a subset of tables

Summary

Tools in this post

Redgate Flyway

Redgate Flyway

Product Articles

University

Events

Forums

Community

Simple Talk

Redgate Test Data Manager

Redgate Flyway

Redgate Monitor

Overview

Protect

Automate

Monitor

Product articles Redgate Flyway Database Documentation Automating Flyway Development Chores…

Guest post

Guest post

Catering for ad-hoc reports

Producing the code for a very simple Entity-Relationship diagram

Generate the PUML code for the E-R diagram from the database model

Viewing and editing the E-R diagram

Which keys are being used?

Investigating relationships for a subset of tables

Summary

Tools in this post

Redgate Flyway

Redgate Flyway

You may also like

Redgate Hub

Product Articles

University

Events

Forums

Community

Simple Talk

Product articles
Redgate Flyway
Database Documentation
Automating Flyway Development Chores…