Liquibase Tutorial: Learn to Manage Your Database Schema

Phil Vuollet | December 4, 2018

Liquibase is one of the most versatile tools for database migration. It works on nearly all SQL database platforms. So whether you use MySQL, SQL Server, Oracle, Firebird, or a combination of these or any other common database platform, it’s got you covered. It also runs in any CI/CD pipeline so long as you can run Java, install the connectors, and connect to your databases. This means you can use Liquibase to deploy database changes along with your new features that are released behind feature toggles. It takes a bit of planning—especially when you’re releasing conflicting or destructive changes.

Luckily, Liquibase has some tools to make database changes less painful. It performs all the DDL operations and several very useful refactorings. Sometimes you have to be careful about when you run those database changes on your production instances since they can lock your tables for quite a while. In this post, you’ll learn how to use Liquibase to make all things database migration easier.

How Does Liquibase Work?

Changes are defined in platform-agnostic language and translated to whatever database platform you use. Basically, you keep a running list of changes, and Liquibase applies those changes for you through its execution engine. Since it runs on Java, it needs the correct JDBC driver to perform the database update. And, of course, you’ll need the latest JRE installed. Since Liquibase runs from a shell or command line, you can run it manually or from whatever deployment pipeline you use.

Liquibase tracks changes using its own tables in your schema (DATABASECHANGELOG and DATABASECHANGELOGLOCK) to ensure consistency and to avoid corruption due to incorrectly altered changelogs. It records a hash of each changeset. While it’s running updates, it puts a “lock” on your database so you can’t accidentally run two changelogs concurrently.

Meet the Changelogs

Changelogs are written using domain-specific languages. You can write them in JSON, YAML, XML, or one of a few other supported formats. Changelogs consist of a series of changesets. A changeset represents a single change to your database. Some examples include creating a table, adding a column to a table, and adding an index. You can also include SQL statements in a changelog.

The changelog should be kept under source control. When you make changes, you add a new changeset to the end of the changelog. You would never alter an existing changeset.

Let’s take a look at some common schema changes using various syntaxes for changelogs to give you a better idea of how it works.

Apply Schema Changes

Adding a table is a common change you’ll see in any given Liquibase changelog. The following example is how you might add two tables using a changelog with an XML format:

<?xml version="1.0" encoding="UTF-8"?> 
<databaseChangeLog 
  xmlns="http://www.liquibase.org/xml/ns/dbchangelog/1.9" 
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
  xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog/1.9
                      http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-1.9.xsd"> 
  <changeSet author="philn5d" id="users_db_1.0.0">
    <createTable tableName="users">
      <column name="id" type="INT" autoIncrement="true">
        <constraints nullable="false" primaryKey="true" />
      </column>
      <column name="username" type="VARCHAR(100)">
        <constraints nullable="false" unique="true" />
      </column>
    </createTable>
  </changeSet>
  <changeSet author="philn5d" id="users_db_1.0.1">
    <createTable tableName="accounts">
      <column name="id" type="INT" autoIncrement="true">
        <constraints nullable="false" 
                     primaryKey="true" />
      </column>
      <column name="owner_id" type="INT">
        <constraints nullable="false" 
                     references="users(id)"
                     foreignKeyName="FK_accounts_users" />
      </column>
    </createTable>
  </changeSet>
</databaseChangeLog>

Here we have a changelog with two changesets. Each changeset creates a new table. The second table references the first one with a foreign key on the “owner_id” column. If we want to update one of these two tables, for example by adding a column to the users table, we would add a new changeset.

Let’s look at a changeset to add an account name to the accounts table using JSON format. Since you’d be adding this to the existing changelog, let’s imagine the above changelog was written in JSON and we’re adding to it with this:

{
  "databaseChangeLog": [
    { ... },
    { ... },
    {
      "changeSet": {
        "id": "users_db_1.0.2",
        "author": "philn5d",
        "changes": [
          {
            "addColumn": {
              "columns": [
                {
                  "column": {
                    "name": "name",
                    "type": "varchar(255)"
                  }
                }
              ],
              "tableName": "accounts"
            }
          }
        ]
      }
    }
  ]
}

In this example, the changeset is added to the end of the JSON array. This changeset adds a column to the accounts table.

Let’s look at just one more example, this time in YAML. For this one, we’ll take a look at a rollback action. Liquibase recommends including rollbacks in your changesets, so here’s how you would include a rollback in the previous changeset if it were in YAML:

databaseChangeLog:
  - changeSet:
      ...
  - changeSet:
      ...
  - changeSet:
      id: users_db_1.0.2
      author: philn5d
      changes:
        - addColumn:
            tableName: accounts
            columns:
              - column:
                  name: name
                  type: varchar(255)
      rollback:
        - dropColumn:
            tableName: accounts
            columnName: name

Again, this is a changeset that adds the name column, and it comes after the first two changesets. Here we can see the addition of a “rollback” to the changeset. Many changes have auto-rollback support, so you don’t always need to include an explicit rollback; the Liquibase documentation notes which changes support it on each database. “addColumn,” for example, rolls back automatically, so the rollback shown in the example above isn’t strictly necessary.
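Destructive changes are the opposite case: Liquibase can’t auto-generate a rollback for something like “dropColumn,” because the data is gone. Here’s a sketch of what an explicit rollback might look like for such a changeset (the “legacy_code” column and the changeset id are made up for illustration):

```yaml
databaseChangeLog:
  - changeSet:
      id: accounts_cleanup_1       # hypothetical id for illustration
      author: philn5d
      changes:
        # dropColumn has no auto-rollback -- the column's data is lost
        - dropColumn:
            tableName: accounts
            columnName: legacy_code
      rollback:
        # restores the column's structure; the data itself is not recoverable
        - addColumn:
            tableName: accounts
            columns:
              - column:
                  name: legacy_code
                  type: varchar(50)
```

Note that the rollback restores only the schema, not the contents, which is exactly why it has to be spelled out by hand.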

You can do just about anything with the schema: manipulate views, stored procedures, functions, and triggers, among other objects. Stored procedures and functions need special treatment, however. We’ll take a look at those later, but first, a word of caution about adding an index with these changelogs.

Index With Caution

Sometimes an index will take a long time to create. During index creation, the table may be locked, thereby rendering an application useless while the index is being populated. There are some things you can do to mitigate the risks of an application impact while using Liquibase.

One thing you can do is not add indexes using Liquibase. Sometimes we really only want to index tables in production to address performance issues. This probably isn’t the best solution because you should be tracking all changes in your database migration tool. As always, do what’s best in your situation.

Another thing you can do is run the updates within a specific maintenance window. Now, there’s no telling how long an indexing operation will take, so you’ll want to be careful about that when you actually add one. There are a few strategies you could use when adding an index.

You could use a hybrid approach. Apply the index in production when it’s safe, then add it to the changelog with an “exists” precondition. Liquibase allows for preconditions, which have to pass in order for the changeset to run. One such precondition is “indexExists.” You’d add the following changeset to add an index to the “users” table:

databaseChangeLog:
  - changeSet:
      id: users_db_1.0.3
      author: philn5d
      preConditions:
        - onFail: CONTINUE
        - not:
            - indexExists:
                tableName: users
                indexName: IX_users_name
      changes:
        - createIndex:
            tableName: users
            indexName: IX_users_name
            columns:
              - column:
                  name: username

The precondition above skips this changeset when the index already exists. Without it, the entire changelog run would fail as soon as Liquibase reached this changeset and tried to create an index that was already there.

Another way to control changes is to run the script when there is a change to the definition of the underlying object. This is how you manage stored procedures and functions with Liquibase.

Manage Stored Procedures

It’s best to keep stored procedure and function definitions in SQL files under source control. This allows the changesets to be smaller. It also makes changes easier to manage.

You could put the entire definition in the changeset, but when you need to alter any stored procedure, you’d have to open the entire changelog file. Your editor would not work well with a changelog file that has a mix of SQL and XML, YAML, or JSON. It’s much easier to edit the SQL directly within its own file. You’ll also have an easier time comparing changes in commits as well as individual objects.

There are at least two ways to write the changeset: the “createProcedure” command or the “sqlFile” command. Depending on your database stack, you’ll need different techniques to “CREATE OR REPLACE” (or the platform’s equivalent, such as “CREATE OR ALTER”) the procedure or function. You might need to include a separate command in the changeset to drop the object before attempting to create the stored proc or function.
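One way to wire this up, sketched in YAML, is a “sqlFile” changeset marked with “runOnChange,” so Liquibase re-executes the script whenever the file’s checksum changes (the file path and changeset id here are hypothetical):

```yaml
databaseChangeLog:
  - changeSet:
      id: proc_get_user_accounts     # hypothetical id
      author: philn5d
      runOnChange: true              # re-run whenever the SQL file's checksum changes
      changes:
        - sqlFile:
            path: procedures/get_user_accounts.sql
            relativeToChangelogFile: true
            splitStatements: false   # keep the procedure body as a single statement
```

With this in place, editing the SQL file and running an update is all it takes to redeploy the procedure; the changelog itself never needs to change.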

Despite these database-specific anomalies, it’s a much easier approach to maintaining SQL objects of this sort compared to something like using a comparison tool. It’s also a breeze to maintain “list data” using Liquibase. There are a few ways you might make data changes. We’ll check that out next…

Make Data Changes

In addition to managing the objects, Liquibase is good for maintaining data. Some data represents lists of domain-specific data rather than user data. An example of this might be automobile body types. The tables that contain this data are also known as “lookup tables.” Liquibase allows you to load data into tables, spawn off lookup tables from existing tables, and reload tables. Of course, you can always create your own SQL files to load up your lookup tables and have Liquibase run those.

The “loadUpdateData” command is useful for loading test data in addition to lookup data. This command will either insert or replace the data specified in a CSV file. We won’t go into too many details, but it allows you to map the data into columns as needed.
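As a sketch of what that looks like (the table, file, and changeset names below are hypothetical), a “loadUpdateData” changeset points at a CSV file and names the primary key used to decide between insert and update:

```yaml
databaseChangeLog:
  - changeSet:
      id: load_body_types            # hypothetical id
      author: philn5d
      changes:
        - loadUpdateData:
            tableName: body_types
            file: data/body_types.csv
            primaryKey: id           # used to decide insert vs. update
            columns:
              - column:
                  name: id
                  type: NUMERIC
              - column:
                  name: name
                  type: STRING
```

Re-running the update after editing the CSV brings the lookup table back in line with the file.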

The custom script approach can be accomplished by using a “sqlFile” command.

When you’re making data changes, you can use “contexts” to apply changesets depending on the context. For example, you might want to replace data with baseline test data in a test environment. In this case, you would only run the “loadUpdateData” for certain tables in the “test” context. Context is passed to Liquibase as an argument when you run it (either with a script or a deployment tool).
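A context is just an attribute on the changeset. Here’s a sketch (hypothetical names again) of a test-only data load that runs when “test” is among the contexts supplied at runtime, e.g. with the `--contexts=test` argument:

```yaml
databaseChangeLog:
  - changeSet:
      id: seed_test_users            # hypothetical id
      author: philn5d
      context: test                  # only applied when the "test" context is supplied
      changes:
        - loadUpdateData:
            tableName: users
            file: data/test_users.csv
            primaryKey: id
```

One caveat worth knowing: if you pass no contexts at all, Liquibase runs every changeset, context-tagged or not, so it’s safest to always supply one once you start using them.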

All this is great when you do the changes and run your own deployments. But, what if you have to go through a DBA to update the database? Liquibase has a way. Here’s how…

Generate DBA Scripts

When you’re ready to run the updates, you can either have Liquibase execute them for you using a local JDBC driver or you can have it produce a SQL script for the changes. You can pass that script over to your DBAs so they can review and run it in accordance with your release process.

Generating the SQL is also handy when you want to verify your changeset without executing against the database. You can output the SQL to stdout or redirect it to a file.

# SQL to stdout
java -jar liquibase.jar ... updateSQL

# or redirect SQL to file
java -jar liquibase.jar ... updateSQL > changes.sql

Barring the other arguments to the Liquibase jar, these examples show you how to generate the SQL rather than updating the database.

What Now?

I hope you’ve enjoyed this introduction to managing your database migrations with Liquibase. You should read the docs to get the full story on how to perform the updates. This is certainly one thing that will help you along with your continuous improvement. This in conjunction with other techniques, such as feature flags, can seriously improve your cycle time, which could mean the difference between success with your products and falling behind the competition.

About Rollout.io

Rollout is an advanced feature management solution that gives engineering and product teams feature control, post-deployment. It’s the most effective way to roll out new features to the right audience while protecting customers from failure and improving KPIs.