October 3, 2008

Dealing with Changing Changesets

Updated: July 15, 2021

The goal of Liquibase is to consistently apply the exact same changes across all your databases. This consistency gives you the confidence to know that what has been tested from development through staging will not break production.

To be consistent, you should always run the exact same statements, in the same order, across all your databases. That means you shouldn’t adjust or change the behavior of changesets that have been deployed to some environments. To help you catch unexpected or accidental differences in what you’ve already run before in a changeset vs. what the script wants now, Liquibase tracks a checksum of each changeset in the databasechangelog table and exits with a validation error if it detects differences.

Always roll forward for consistency & safety

Because we want consistency and safety, the best way to manage changes is to roll forward. For example, if you have a changeset that creates a table named “addrs” and it should be “address”, don’t go back and change the original changeset. Instead, add a new renameTable changeset. It can feel unnecessary and non-performant, but your database is fast and inconsistencies in deployment scripts can be fatal.

Things rarely go according to plan, so Liquibase has several options for dealing with changes to existing changesets depending on your needs.

The exceptions to the rule & how to deal with them

There are times when changing existing changesets is the best option. Here are some examples of when it’s okay and how to deal with the changing changesets.

If the developer changes their mind prior to committing the changelog

Here’s the scenario: A changeset starts its life with a developer or DBA adding it to a changelog and running it against their sandbox database. As they test their changes, they decide they should have named a table differently (or used a different datatype, or created an index differently, etc.). As long as they have not committed or shared that changeset, they can roll it back in their environment, modify it, and re-apply it. They’ll know that everyone who has ever run that changeset is in the same spot.

If someone fat-fingered a changeset like this:

Don’t bother with a separate renameTable changeset that will be there forever, just do the following:

Run liquibase rollback-count 1
Fix the misspelling in the changelog file
Run liquibase update

However, if you have already committed that problem changeset, you are no longer in control of where it has run. Once you have committed the changeset with the misspelling, you must “roll forward” with an additional renameTable changeset. So, take a second before you commit your changes to double-check your changelog just like you do with all the rest of your code.

If there are performance issues

While database changes are usually fast, there are some that are not. Creating indexes on a large table is a common example, but others can be slow as well. If you have a changelog that creates an index and then drops it again later, there may be no reason to create it in the first place.

Because liquibase update simply compares what is currently in the changelog file to what has been executed against the target database, if you remove the createIndex changeset from your changelog file it will no longer be applied to future databases.

For databases that already have had that index created, you will want to take it out. The easiest way to remove that index from databases that have previously created it is to add a dropIndex changeset with a changeSetExecuted precondition.

It's important to remember, even with the index example, that it can be dangerous to add inconsistency to your scripts. The changelog was running successfully when the index was there and perhaps there are steps along the way that depend on it or run more quickly because the index is there. Even with something like unused indexes that are likely safe, it is the safest to just leave them in there and follow the roll-forward steps unless you are seeing unacceptable liquibase update times.

If the original changeset was unrecoverably buggy

There are times when an incorrectly created change is not caught until later in the process. Ideally, you would roll forward with a fixing changeset, but sometimes that is not possible. For example, maybe a changeset joined column data together but didn�t include the needed space between the values and there is no way to know what the data originally was. You found the bug and don�t want to deploy it to production, but are okay with leaving it messed up in your test environments for now.

In this case, modify the existing changeset to include the correct logic. This keeps you from breaking more database environments. When you try to apply that changed changeset to environments that already ran the old version, they will fail with an error that looks something like this:

Validation Failed:
     1 change sets check sum
          com/example/changelog.xml::1::author_name was: 8:1b99d310bf88ee47d9d7daf1f33f2275 but is now: 8:af6710859b7acfa6538fed7079fe9525

To tell Liquibase that the new checksum is one you are okay with, add a <validCheckSum> tag to the changeset. You can get the checksum from the error message above:


       8:af6710859b7acfa6538fed7079fe9525
       (new logic here)

Liquibase will not attempt to re-run that changeset against databases that already ran the old version. It just knows you are okay with the fact that what was originally run is different from what is currently being requested.
If you plan on manually cleaning up the databases impacted by the buggy version of the changeset, you are done. If you’d like to add logic to the changelog to automatically apply a fix to only databases which had the original problem, you can do that. Add another changeset that uses a precondition to check for the existence of the bad state and applies the fix when it detects that or marks the “fixup” changeset as “ran” if there is no need to run it.


 
     ...
 
 LOGIC TO FIX ISSUE

If you have complex views or stored procedures

For people managing large views or stored procedures in Liquibase, a commonly used pattern is to make an exception to the “roll forward by adding new changesets to the end” rule in favor of modifying the original changeset and adding a “runOnChange=true” attribute to the changeset. (Learn more about the runOnChange attribute.)


    
           CREATE PROCEDURE xyz AS [original code]

Then when a change needs to be made, they edit it to be:


    
           CREATE OR REPLACE PROCEDURE xyz AS [new code]

The “create or replace” syntax allows the new code to run just as well against a fresh database as one that has the old version of the procedure in it. The runOnChange attribute tells Liquibase to run the SQL whenever it is different from the last time it was run against a given database.

The advantage to this style is that it keeps the source for that procedure in one spot, so Git’s branching, merging, and history work well with it.

The danger of making anything runOnChange is that you are inherently introducing variation in the order it is run against different databases.

Imagine you have a changelog with the following changesets:

Changeset 1 creates table A
Changeset 2 creates procedure X which queries from A
Changeset 3 creates table B

You run them against your “dev” and “test” environments and it’s successful. But then you change changeset 2 to have procedure X query tables A and B. That change will run successfully against “dev” and “test” because the actual order was:

Changeset 1
Changeset 2 (old)
Changeset 3
Changeset 2 (new)

But when you run the changelog against a new “staging” environment, it fails on changeset 2 with a “table B does not exist” error because it ran:

Changeset 1
Changeset 2 (new)
Changeset 3

There are ways to mitigate the danger. Keeping all your runOnChange changesets at the end of your changelog can often help, but is not foolproof. In the above example, if the changeset 2 was at the end of the file, the “dev” and “test” order would have been:

Changeset 1
Changeset 3
Changeset 2 (old)
Changeset 2 (new)

and the “staging” order would have been

Changeset 1
Changeset 3
Changeset 2 (new)

Which is still different, but is valid.

Adding runOrder=”last” to your runOnChange changesets will ensure they are running at the end without having to physically move them.

Even with moving them to the end, there is still the chance of errors in different environments if the stored procedures reference other stored procedures, which may or may not already be made depending on the actual order things run in, so be on the lookout for that. Remember, you can always reorder changesets simply by moving them around in the file.

Even more options

Beyond these options, there are several other tools available to use including the changeLogSync command, markNextChangeSetRan, preconditions based on the database state (<tableExists>, <sqlCheck>, etc.), or even manually editing the databasechangelog table to add or remove tables.

Summing it up

While it’s absolutely a best practice to roll forward, Liquibase provides many flexible options for dealing with complex problems that always come up sooner or later when working with databases. Our community of users is available on the Liquibase forum and live chat if you’re working through issues dealing with changing changesets. If you’re looking for ongoing or more in-depth help, we recommend checking out Liquibase Pro to receive expert help from the team at Liquibase.