April 11, 2024

A Guide to Changeset Checksums in Liquibase

Updated April 2024

Changesets and checksums are uniquely applied concepts in Liquibase’s database change management automation offering that help teams understand how a proposed change fared throughout the pipeline. Did it deploy completely, without problem – or did it encounter evolutions throughout review, testing, validation, and deployment through lower stages?

Find out what Changesets, checksums, and other relevant terms mean in the Liquibase context, plus how to troubleshoot checksum issues. 

What are Changesets, Changelogs, and checksums?

In Liquibase, the leading database DevOps automation, governance, and observability solution, a Changeset is a collection of granular, individual database changes that work together and add up to execute the database schema adjustment requested by the database or development team. These changes can range from simple modifications like creating a new table or adding a column to more complex alterations involving data transformation. Each Changeset is uniquely identified by a combination of the author's name and an ID, which is typically a filename or a tag. 

A Changelog is a file that lists and organizes a series of Changesets as a comprehensive ledger that dictates the sequence and specifics of database schema changes as they move through the pipeline. The Changelog is like a Changeset itinerary, in which each entry points to a specific set of instructions (Changesets) on how to modify the database’s structure or content.

In this context, a checksum is like a digital fingerprint of a Changeset at a specific point in time with the goal of ensuring the integrity and consistency of Changesets over time. Liquibase calculates this checksum based on the content of the Changeset when it is first executed against a database. This calculation is done using a hash function, which turns the Changeset content into a short, fixed-size string of characters that uniquely represents the content of that Changeset. This calculation compares the original Changeset to the actually applied changes to see if they match, indicating a properly deployed change that made it through the pipeline without alteration. 

Liquibase’s technical documentation sums this up well: 

“When Liquibase reaches a Changeset in your Changelog during execution, it computes a checksum and stores it in the MD5SUM column of the DATABASECHANGELOG table. This tells Liquibase if the Changeset has been modified since it was run.

Uses and benefits of Changeset checksums in database DevOps

Once a Changeset's checksum is calculated and stored, Liquibase uses it as a reference point for future operations. Each time a Changeset is run or checked, Liquibase recalculates its checksum and compares it against the stored value. If the checksums match, it indicates that the Changeset has not been altered since its last execution, ensuring the changes applied to the database are strictly as intended.

The checksum’s key purpose is to detect if a Changeset is modified after it has been run so that Liquibase can alert the team. This can prevent unintended modifications from being applied to the database, which could potentially lead to inconsistencies or errors. Checksums also support database version control for schema changes, since they track which Changesets have been applied to a database and ensure that only the necessary Changesets are executed in the correct order.

The ability to measure for inconsistencies in Changesets improves collaboration and database DevOps culture by adding a layer of security and integrity protection to environments where multiple developers are working on database changes. For instance, if problematic change code makes it to production and needs to be reverted, checksums help by confirming the rollback script matches the Changeset it is intended to revert, maintaining the database's integrity throughout this contentious element of the change management process. 

Changeset checksums are a vital element in Liquibase’s database DevOps platform that ensure your deployments are consistent, reliable, and traceable across all environments.

Now, we’ll cover how to use Changesets checksums in Liquibase.

Changeset checksums in action

In order to detect changes between what is currently in the Changelog versus what was actually run against the database, Liquibase stores a checksum with each Changeset entry in the DATABASECHANGELOG tracking table. Liquibase uses a checksum to detect if your target database was updated. The checksum is computed after the parameters are applied. 

Checksums are one of the various tools in the Liquibase DevOps platform that help drive higher standards of integrity, consistency, and reliability in the database change management process. By strategically organizing changes into Changeset and Changelogs, then executing checksums automatically or as needed, teams increase quality and reduce the frequency of errors – as well as the time it takes to track down issues and debug. 

Bringing DevOps to the database isn’t just about automating change requests, test, and deployments. It’s also about integrating streamlined ways to support rapid deployments with governance, auditability, and cross-functional visibility that enhances collaboration. 

Here’s an example of Changeset checksums in action. Let's say your target database already ran the following Changeset:


   
       
   

This Changeset adds “my_column” as an “int”. The Changelog contains the same “id=1, author=example” Changeset as the following Changeset:


   
       
   

This Changeset adds “my_column” as a “bigint” instead.

Since Liquibase tracks which Changesets have run against your target database by the id/author/filepath combination, it sees the updated Changeset as “ran” even though your target database has “my_column” as an “int” column rather than the expected “bigint” type.

If you run the "int" Changeset against your database and then change it to "bigint" and re-run update, Liquibase will exit with the following error:

Validation Failed:
 1 change sets check sum
com/example/changelog.xml::1::example was: 8:63f82d4ff1b9dfa113739b7f362bd37d but is now: 8:b4fd16a20425fe377b00d81df722d604

This error shows how Liquibase notices that your target database is out of sync with the Changelog.

However, not every change to a Changeset impacts the checksum. Here are some examples:

  • Reformatting whitespace and line breaks (except within SQL statements)
  • Changing preconditions
  • Changing contexts
  • Changing labels
  • Adding validCheckSum settings
  • Changing comments (via native xml/yaml comments or using the comment tag)

Troubleshooting checksums

Checksums help you add metadata to Changesets that have already been run without problems. Checksums are also helpful when they can catch errors early in your pipeline and revert the Changeset back to the original version. They also help you roll forward with a new Changeset.

There are  several options to address checksum issues.

Revert and roll forward

Using the previous example, you can set the type back to “int” and add a new modifyDataType Changeset to change the column from “int” to “bigint.” Deployment databases that don’t catch the checksum error means they are likely ephemeral. In this case, rebuilding is the easier option.

Valid checksum tag

If the revert + roll-forward approach isn’t the best option, you can use the <validCheckSum> tag to specify which checksum you want to accept as valid, even if it’s different than what was stored in the DATABASECHANGELOG table. Use the checksum values from the error message.

Using our example, your final Changeset would look like the following:


   8:b4fd16a20425fe377b00d81df722d604
   
       
   

The type is still "bigint," but it now has a validCheckSum tag. 

Running this Changeset against your database that originally ran the “int” version will no longer detect the checksum error, but  the database will still consider the Changeset as “ran.” That means the column will remain an “int.”

You have the option to leave it that way, or update it. To update, you would add an additional modifyDataType Changeset from “int” to “bigint” with a “changeSetRan” onFail=MARK_RAN precondition right before the addColumn changset. Doing this will migrate only databases that have run the changset and leave the rest.


   
       
   
   

 

   8:b4fd16a20425fe377b00d81df722d604
   
       
   

Manual modification

If you only have a handful of databases with issues, you can also manually modify their datatype.

Change history visibility

In release 4.27.0, Liquibase introduced DATABASECHANGELOGHISTORY (DBCLH) table to record a history of all changes it makes to the database. This more powerful version of the DATABASECHANGELOG table leverages Changeset checksums in a similar way, but adds a dimension of time to understand the history of how a database reached its current state. 

While a standard Liquibase DATABASECHANGELOG table is immensely useful in showing the applied and active changes that make the database’s state what it is, the historical table also includes changes that are no longer present, such as those that have been rolled back or otherwise updated via updates, rollbacks, drop-alls, Changelog syncs, tags, and runOnChange commands. 

Harness the power of Changeset checksums

Changeset checksums are a critical element of optimal database DevOps workflows, since they help automatically protect the integrity, security, and reliability of databases undergoing changes. The tracking, collaboration, and troubleshooting benefits brought about by Changeset checksums ensures that any changes made to the database schema are tracked, validated, and protected against unauthorized alterations. It’s one of the many ways Liquibase streamlines database change management while reducing the risk of deployment errors and data inconsistencies, leading to a faster, more controlled pipeline.

Learn how Changeset checksums and other advanced database DevOps features work in this on-demand and interactive Liquibase demo. You’ll learn how Liquibase works to equip database, application, and data teams with the database change management automation, governance, and observability they need to innovate the foundation of the data-driven business. 

For an in-depth technical walk-through on using Changeset checksums in Liquibase, head to our official documentation.

Nathan Voxland
Nathan Voxland
Share on: