April 30, 2024

Correlate database changes to system availability with Liquibase’s tracking and observability

Any change to your database can impact any number of components throughout your infrastructure and lead to downtime.

With DevOps at the database – including advanced control and process visibility – you can elevate your team’s confidence in every deployment while equipping them with the tools and insights needed to resolve any problem that slips through.

Liquibase enables you to revise and release database updates from development to production faster and more safely. By embracing continuous, incremental database changes and enabling granular control with features like Quality Checks, Liquibase puts every deployment through rigorous automated testing and validation, so you can release more quickly with fewer errors.

But even with the most thoroughly reviewed request, an incremental database change could still affect performance and, as a result, system availability.

Luckily, Liquibase also provides database observability, including pipeline analytics and change operation monitoring, so users can share information about database changes delivered by Liquibase across teams. Integrating Liquibase into your automation, observability, analysis, and productivity systems allows you to create a comprehensive dashboard for modern software delivery.

This kind of deployment pipeline visibility – powered by Liquibase’s Structured Logging – gives teams deeper insight into the relationship between database changes and service availability, so they can correlate downtime with specific updates. Additionally, Site Reliability Engineers (SREs) and others in operational roles who are responsible for Root Cause Analysis (RCA) of failures can use this same information to quickly pinpoint what changed and when.

Learn how database updates can impact availability, plus how advanced database DevOps features reduce errors and simplify remediation for those that do slip through.

Database availability is critical to application uptime

Either directly or through dependencies, a lack of database availability results in an inability to serve content or respond to user requests properly.

Your applications and IT systems need the database to function correctly. When the database goes down, applications can’t store and retrieve critical data, bringing down user experiences, revenue sources, and other essential business technologies. Examples of impacted systems include:

Content: Dynamic content enriched with data stored in the database may fail to function correctly, leading to a degraded user experience or outright errors.
Session management: If user session-related data stored in the database is unavailable, users may experience issues with logging in, maintaining session state, or accessing personalized content.
E-commerce transactions: Database availability is critical for processing e-commerce transactions, managing inventory, and updating order statuses. Downtime or sluggish database performance can result in lost sales and dissatisfied customers.
Search functionality: Search functionality powered by databases may cease to function, impairing usability and navigation.
Business intelligence: BI relies on performant access to data warehouses to provide mission-critical analysis for business operations. Degraded database access can delay or block timely insights into user behavior.

With database availability impacting so many critical elements, your teams need a better way to reduce, predict, and resolve downtime events.

Database changes put system availability at risk

While Liquibase automates your database change management pipeline and runs every change through a finely tuned series of checks, database deployments can still impact the availability of your application or website. It might not be that an error slipped through, but that error-free code leads to unexpected challenges downstream or across dependencies.

There are several ways that even the most streamlined and optimized database deployments can degrade availability, including:

Schema changes: Significant changes to the database schema, such as adding or removing columns, tables, or indexes, could lead to slow performance or even downtime during the deployment process.
Data migration: When migrating data between different database versions or structures, there's a risk of degraded user experience, especially if the dataset is large, due to the strain this puts on the database.
Performance optimization: Optimization processes such as query tuning or index rebuilding could increase the load on the database server, leading to slower response times.
Concurrency issues: Introducing new features or modifying existing ones might inadvertently introduce concurrency issues in the database. These issues can lead to deadlocks or long-running transactions, impacting the overall availability and creating wait times for users.

To mitigate these risks, it's essential to follow best practices such as:

Thorough testing in a staging environment before deploying changes to production
Implementing proper rollback mechanisms
Scheduling deployments during off-peak hours whenever possible

Additionally, you should have monitoring and alerting systems in place to help minimize downtime by detecting and addressing issues quickly.

Liquibase’s database observability capabilities, rooted in Structured Logging, support this kind of monitoring and alerting setup by providing detailed information about each database change as soon as it is deployed. When an availability problem is detected, Liquibase provides the pipeline analytics and change operation reports to help you answer the question, "What changed with the database?"

This rich, contextualized metadata around changes and workflow metrics will shine a light on how to resolve any errors. It might also highlight a database change as the root cause of a broader issue.

Database change can be a root cause of availability degradation

Even thoroughly scrutinized changes to the database can be a hidden root cause of availability degradation. When performing Root Cause Analysis to begin recovery, a view of recent database changes is critical to understanding outages related to the database.

Correlating availability events with database changes logged by Liquibase’s granular tracking can be accomplished using various techniques. Here's a general approach you can follow:

Data ingestion: Ensure that your observability platform indexes both your availability data (e.g., web server logs, uptime monitoring logs) and Liquibase’s Structured Logging.
Define common fields: Identify common fields or attributes between your availability events and logged database changes that can be used for correlation. This could include database hostnames, application endpoint URLs, or any other relevant metadata.
Search queries: Write observability search queries to extract availability events and logged database changes separately. For availability events, you might search for HTTP status codes indicating service availability (e.g., 200 for success, 5xx for errors). Many monitoring systems also provide out-of-the-box performance metrics, such as latency and resource utilization, which you can include as availability signals.

For database changes, you might search for specific deployment commands, schemas, labels, contexts, or JDBC connection strings recorded by Liquibase Structured Logging during database change deployments. From there, the approach continues with:

Correlation logic: Develop correlation logic to link availability events with corresponding logged database changes. This involves joining the search results based on common fields, such as JDBC connection string, while filtering events within a time window.
Alerting: Configure alerting mechanisms to notify stakeholders whenever a correlated pair of availability and logged database changes is detected. You can set up alert conditions based on predefined thresholds or conditions that indicate potential issues or anomalies.
Dashboard visualization: Leverage dashboards in the observability platform to visualize correlated availability and logged database changes over time. This can provide insights into the impact of changes on system availability and help identify trends or patterns.
Refinement and tuning: Continuously refine and tune your correlation logic based on feedback and observations. Adjust search queries, correlation rules, and alerting thresholds as needed to improve accuracy and relevance.
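
The correlation step above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular observability platform implements it: the AvailabilityEvent type, its database_url field, the 30-minute window, and all sample values are hypothetical, and it assumes the Liquibase log records have already been parsed into dictionaries with ISO 8601 timestamps. Only the liquibaseTargetUrl, operationStop, and changesetId key names come from Structured Logging.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class AvailabilityEvent:
    """Hypothetical availability record, e.g. parsed from a web server log."""
    timestamp: datetime
    database_url: str  # assumed common field shared with the Liquibase logs
    http_status: int


def correlate(events, changes, window=timedelta(minutes=30)):
    """Pair each 5xx event with changes to the same database that
    finished no more than `window` before the event occurred."""
    pairs = []
    for event in events:
        if event.http_status < 500:
            continue  # only server errors count as availability problems here
        for change in changes:
            if change["liquibaseTargetUrl"] != event.database_url:
                continue
            stopped = datetime.fromisoformat(change["operationStop"])
            if timedelta(0) <= event.timestamp - stopped <= window:
                pairs.append((event, change))
    return pairs


# Illustrative data only; none of these values come from a real deployment.
SHOP_URL = "jdbc:postgresql://db.example.internal:5432/shop"
CRM_URL = "jdbc:postgresql://db.example.internal:5432/crm"

CHANGES = [
    {"changesetId": "add-orders-index", "liquibaseTargetUrl": SHOP_URL,
     "operationStop": "2024-04-30T10:00:00+00:00"},
    {"changesetId": "rename-column", "liquibaseTargetUrl": CRM_URL,
     "operationStop": "2024-04-30T10:05:00+00:00"},
]

EVENTS = [
    AvailabilityEvent(datetime(2024, 4, 30, 10, 12, tzinfo=timezone.utc), SHOP_URL, 503),
    AvailabilityEvent(datetime(2024, 4, 30, 10, 13, tzinfo=timezone.utc), SHOP_URL, 200),
    AvailabilityEvent(datetime(2024, 4, 30, 12, 0, tzinfo=timezone.utc), CRM_URL, 500),
]

if __name__ == "__main__":
    for event, change in correlate(EVENTS, CHANGES):
        print(event.timestamp.isoformat(), event.http_status, change["changesetId"])
```

In practice the join would usually run inside the observability platform's own query language; the window size is the main tuning knob, since a wider window catches slow-burning regressions at the cost of more false pairings.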

Underlying all of these opportunities for observability-driven insights and resolutions is Liquibase’s Structured Logging. A look at what Structured Logging tracks shows how much Liquibase’s monitoring and analytics capabilities can contribute to improving uptime.

Unlocking database observability with Structured Logging

Liquibase’s tracking, pipeline analytics, and change operation monitoring are driven by Structured Logging, which uses logged key-value pairs to describe each database change and the associated Liquibase actions. These keys include:

changesetAuthor - Author of the changeset being logged
changesetId - Unique ID of the changeset being logged
deploymentOutcome - Outcome of the database deployment
liquibaseCommandName - Name of the command entered in the CLI
liquibaseSchemaName - Name of the database schema
liquibaseTargetUrl - Unique identifier (URL) of the target database associated with the log
Message - Short descriptor of log event
operationStart - UTC timestamp of operation start
operationStop - UTC timestamp of operation end
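
As a rough illustration of how these keys might be consumed, the sketch below pulls the correlation-relevant fields out of one log entry. The key names are taken from the list above, but the flat JSON layout, every sample value, and the summarize helper are assumptions made for this example; check your own log output for the actual structure.

```python
import json

# Hypothetical log entry. The key names come from the list above; the values
# and the flat JSON layout are assumptions for illustration only.
SAMPLE_LOG_LINE = json.dumps({
    "changesetAuthor": "jane.doe",
    "changesetId": "add-orders-index",
    "deploymentOutcome": "success",
    "liquibaseCommandName": "update",
    "liquibaseSchemaName": "public",
    "liquibaseTargetUrl": "jdbc:postgresql://db.example.internal:5432/shop",
    "operationStart": "2024-04-30T09:59:40+00:00",
    "operationStop": "2024-04-30T10:00:00+00:00",
})

# Keys that answer "who changed what, where, and when".
CORRELATION_KEYS = (
    "changesetAuthor",
    "changesetId",
    "deploymentOutcome",
    "liquibaseCommandName",
    "liquibaseSchemaName",
    "liquibaseTargetUrl",
    "operationStart",
    "operationStop",
)


def summarize(log_line):
    """Parse one JSON log line and keep only the correlation keys.

    Keys missing from the entry are returned as None rather than raising,
    since not every log event carries every key.
    """
    record = json.loads(log_line)
    return {key: record.get(key) for key in CORRELATION_KEYS}


if __name__ == "__main__":
    for key, value in summarize(SAMPLE_LOG_LINE).items():
        print(f"{key}: {value}")
```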

Additionally, when using update, update-count, update-one-changeset, update-testing-rollback, and update-to-tag, the exact SQL executed during the change is logged in the changesetSql key. For more granular, per-changeset timestamps, the changesetOperationStart and changesetOperationStop keys can be used.
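
One way to use the per-changeset keys during root cause analysis is to rank changesets by how long they ran and read their SQL first. The sketch below is a hedged example of that idea: it assumes records have already been parsed into dictionaries and that the timestamps are ISO 8601 strings, and the helper names and sample records are hypothetical.

```python
from datetime import datetime


def changeset_duration_seconds(record):
    """Elapsed seconds between the per-changeset start and stop timestamps."""
    start = datetime.fromisoformat(record["changesetOperationStart"])
    stop = datetime.fromisoformat(record["changesetOperationStop"])
    return (stop - start).total_seconds()


def slowest_changesets(records, limit=3):
    """Return up to `limit` records with timing keys, longest-running first."""
    timed = [
        r for r in records
        if "changesetOperationStart" in r and "changesetOperationStop" in r
    ]
    return sorted(timed, key=changeset_duration_seconds, reverse=True)[:limit]


# Illustrative records only; the SQL and timings are invented for this sketch.
RECORDS = [
    {"changesetId": "create-audit-table",
     "changesetSql": "CREATE TABLE audit_log (id INT PRIMARY KEY)",
     "changesetOperationStart": "2024-04-30T10:00:00+00:00",
     "changesetOperationStop": "2024-04-30T10:00:02+00:00"},
    {"changesetId": "add-orders-index",
     "changesetSql": "CREATE INDEX idx_orders_customer ON orders (customer_id)",
     "changesetOperationStart": "2024-04-30T10:00:02+00:00",
     "changesetOperationStop": "2024-04-30T10:01:37+00:00"},
    {"liquibaseCommandName": "update"},  # entry without per-changeset timing
]

if __name__ == "__main__":
    for record in slowest_changesets(RECORDS):
        seconds = changeset_duration_seconds(record)
        print(f"{record['changesetId']} ({seconds:.0f}s): {record['changesetSql']}")
```

A long-running statement is not proof of a root cause, but it is often a sensible place to start when an availability dip lines up with a deployment.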

By leveraging Liquibase Structured Logging alongside your availability monitoring solution, you can effectively correlate availability events with database changes, enabling you to gain deeper insights into the relationship between database changes and service availability.

Your teams can then act on these insights and embrace continuous optimization of the change management workflow to elevate quality and efficiency on an ongoing basis. This database-level enhancement and visibility helps you deliver better and more reliable user experiences, avoid costly failures, and improve overall uptime.

To further explore the capabilities afforded by Structured Logging, check out our deep dive into database observability and the top benefits of database observability.