ACID Properties in Database Management Systems

ACID Properties in Database Management Systems

·

8 min read

ACID is a set of specifications that ensure that a database will be in a consistent state after running logically grouped operations, also known as transactions. The acronym stands for Atomicity, Consistency, Isolation, and Durability. These properties are crucial for any database that aims to maintain data integrity despite concurrent access and potential system failures.

To better understand the ACID properties, let's shift our attention to what a database transaction actually is.

What is a Database Transaction?

A transaction is a single unit of work that is used to group multiple operations that need to be executed together to ensure that the database remains in a consistent state after the execution is completed, regardless of the success or failure of the execution. You can think of a transaction as a series of logically grouped operations that are executed in a specific order. These operations can include reading data from the database, inserting new records, or modifying existing ones. A transaction only succeeds if ALL the operations within it succeed.

A transaction starts with a particular operation, such as creating an order on an e-commerce website. It then continues with a series of read/write operations that are performed sequentially. Once the operations are completed successfully, the transaction is committed and the changes are now permanently applied to the database.

Example of a Database Transaction

To thoroughly understand what a transaction is, let's go over a simple example. Let's say you have two bank accounts namely account_1 and account_2. Both accounts represent records in the bank_accounts table. The account numbers for those two accounts are 1234567890 and 0987654321 respectively. You want to transfer £100 from the 1st account to the 2nd one. A reliable approach to the described operation may look like the following:

  1. Read the current balance of the source account, account_1 in this example.

  2. Check if its balance is sufficient to carry on with the rest of the operations (i.e. balance >= 100).

    • In case of insufficient balance, raise an exception, and exit the transaction.
  3. Read the balance of the destination account, account_2 in our example.

  4. Update the balance of account_1 by deducting £100 from it.

  5. Update the balance of account_2 by adding £100 to it.

  6. Commit the transaction to signify its end and to persist its changes.

Makes sense? Now let's create a transaction that applies this operation to a PostgreSQL database.

BEGIN;

-- Lock the bank accounts to prevent concurrent access
LOCK bank_accounts IN EXCLUSIVE MODE;

-- Declare variables to hold the current values of both balances
DECLARE @src_balance INT;
DECLARE @dst_balance INT;

-- Retrieve the current balance of account_1
SELECT @src_balance = balance FROM bank_accounts WHERE account_number = '1234567890';

-- Check if the source account has enough funds
IF @src_balance < 100 THEN
    RAISE EXCEPTION 'Insufficient funds in account 1234567890';
END IF;

-- Retrieve the current balance of account_2
SELECT @dst_balance = balance FROM bank_accounts WHERE account_number = '0987654321';

-- Perform the transfer
UPDATE bank_accounts SET balance = @src_balance - 100 WHERE account_number = '1234567890';
UPDATE bank_accounts SET balance = @dst_balance + 100 WHERE account_number = '0987654321';

-- Commit the changes.
COMMIT;

As already mentioned, a transaction never commits its changes unless all of the operations within it are performed successfully. If for any reason any of the statements in the above transaction fails, the transaction will be rolled back and none of its changes will be persisted.

Now that we cleared the concept of a transaction, let's switch gears back to the A in ACID, Atomicity.

Atomicity

Atomicity is the property that ensures the unity of all operations within a transaction. It dictates that a transaction is treated as a single, inseparable unit of work and that either all of the statements within a transaction are executed successfully, or none of them are.

During the lifespan of a transaction, several operations can be performed on a database. These changes can be insertions, deletions, or updates. Atomicity makes sure that if any of these alterations fail, the entire transaction is rolled back and the database returns to its state before the transaction began execution. This warrants the consistency of the database at all times, even at times of unexpected failures.

Following up on the example transaction above, if something goes wrong while the transaction is running, such as system or network failure, Atomicity guarantees that changes made to account_1 and account_2 are rolled back.

To achieve this, Atomicity is usually implemented using a technique called write-ahead logging (WAL). Before persisting the changes to the database, the WAL technique entails that these changes are written to a separate log file first. If the transaction is committed, the log file is then used to flush and persist the changes to the database. If the transaction fails at any point, the log file is also used to undo the changes made by the transaction.

Consistency

Consistency guarantees that a transaction transforms the database from one valid state to another. In other words, the transaction can only be committed if it abides by the integrity constraints and business rules defined for the database.

Integrity constraints are rules that are built into a database to ensure the integrity of the data. For instance, a constraint on a primary key ensures its uniqueness, or a foreign key ensures that it references a valid primary key on another table. Business rules are rules set by the developers of the application that are used to warrant the consistency of the data with respect to its real-world representation.

If any of the changes performed by a transaction violate any of these rules or constraints, the transaction is rolled back, and the database returns to its state before running the transaction. This ensures that data within the database is consistent at all times.

Considering the transaction example above, the consistency property guarantees that account_1 has enough funds to complete the transfer account_2 before the actual transfer is made. If the source account does not have enough money, the transaction is rolled back, and the database returns to its state before the transaction started.

Consistency is achieved using a collection of built-in constraints, such as primary_keys and foreign_keys, and business rules, such as stored procedures, functions, and triggers.).

Isolation

Isolation is an essential database property in multi-user environments. If multiple transactions are running concurrently, isolation ensures that changes performed by one transaction are not visible to other transactions until the first transaction is committed. In other words, each transaction is run as if it were the only transaction being executed at a point in time. Without isolation, multiple transactions can access the same data simultaneously, which can result in conflicts and inconsistency of data.

There are different isolation levels, each providing a different degree of isolation between transactions.

Isolation Levels

Isolation levels dictate which phenomena must not happen when performing concurrent transactions. The most common isolation levels are:

  • Read Uncommitted: The lowest level of isolation, which allows a transaction to read data that is being modified by another running transaction even if this data isn't committed yet. This results in dirty reads, where a transaction reads data that can be later rolled back. Dirty reads are not allowed in some DBMSs like PostgreSQL.

  • Read Committed: This level offers a higher degree of isolation than read uncommitted. It only permits a transaction to read data that has been committed by another transaction, hence, it eliminates the possibility of dirty reads. Using this level, however, can still lead to non-repeatable reads, where a transaction reads the same records twice and might get different results.

  • Repeatable Read: A higher level of isolation, where a transaction can only read data that has been committed by another transaction, and that data cannot be modified until the first transaction is committed. This eliminates the occurrence of non-repeatable reads. One way to achieve this is by placing shared locks on all data that is read by each statement in a transaction and all the locks are held until the transaction completes. As a result, other transactions are unable to modify the data that has been read by the current transaction. A drawback of this isolation level is that it can still lead to phantom reads, where the transaction reads the same data twice but gets different results due to new rows being added by another transaction that matches the query of the first transaction. Hence, the number of rows returned each time is different.

  • Serializable: The highest level of isolation. It ensures that transactions are executed in a serialized order, as if they were executed one after the other, not concurrently. This eliminates all of the issues that can happen using previous isolation levels, however, it's not without its drawbacks. Serializable leads to decreased concurrency, as one transaction now can hold the data by a lock and other transactions cannot access this data until the first transaction is committed, which leads to longer execution times. This isolation level is useful for applications where consistency is more important than concurrency.

Durability

Durability ensures that once a transaction is committed, its changes will survive any subsequent failures. In other words, this property guarantees that changes committed to the database by a transaction are permanent and can't be undone.

Similar to atomicity, durability can be achieved by using the WAL technique, which ensures that the changes proposed by a transaction can be recovered in case of a system failure.

Another technique is data replication, which involves creating a replica of the database on a separate server in a different location. This guarantees that even if the primary database fails, the replicated database can be used as a backup.

Conclusion

In summary, the ACID properties are essential for ensuring the reliability of a database, and the integrity and consistency of the data it houses. They also ensure that a transaction can only bring a database from one valid state to another through Atomicity, Consistency, Isolation, and Durability.

Adhering to these criteria guarantees that the data is always correct, even in the face of concurrent access or unprecedented errors.