How is Concurrency Control Maintained to Ensure Isolation?

Published at: 11/24/2024
Categories: distributedsystems, softwareengineering, systemdesign
Author: Ujjwal Raj
In a distributed system, many applications require concurrency control (we have discussed some examples in previous posts as well). When multiple transactions run simultaneously, locks are generally placed on the objects in the data store that the transactions read and write, in order to maintain serializability. A lock manager runs as part of the process and tracks which locks have been granted, which transactions hold them, and which transactions are waiting for a locked object to be released.
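The bookkeeping a lock manager does can be sketched in a few lines of Python. This is a minimal single-process illustration, not how any particular data store implements it: the class name `LockManager`, the mode labels `"S"` (shared) and `"X"` (exclusive), and the FIFO wait queue are all assumptions made for the example.

```python
from collections import defaultdict, deque


class LockManager:
    """Toy lock manager (hypothetical): tracks holders and waiters per object."""

    def __init__(self):
        self.holders = defaultdict(dict)   # object -> {transaction: mode}
        self.waiters = defaultdict(deque)  # object -> queue of (transaction, mode)

    def _compatible(self, obj, txn, mode):
        # Look only at locks held by *other* transactions.
        others = [m for t, m in self.holders[obj].items() if t != txn]
        if not others:
            return True
        # Shared locks coexist; any combination involving X conflicts.
        return mode == "S" and all(m == "S" for m in others)

    def acquire(self, txn, obj, mode):
        """Return True if the lock is granted, False if txn has to wait."""
        if self._compatible(obj, txn, mode):
            held = self.holders[obj].get(txn)
            # Keep the stronger mode if txn already holds a lock here.
            self.holders[obj][txn] = "X" if "X" in (held, mode) else "S"
            return True
        self.waiters[obj].append((txn, mode))
        return False

    def release(self, txn, obj):
        """Release txn's lock and grant queued requests that now fit."""
        self.holders[obj].pop(txn, None)
        granted = []
        while self.waiters[obj]:
            next_txn, mode = self.waiters[obj][0]
            if not self._compatible(obj, next_txn, mode):
                break
            self.waiters[obj].popleft()
            self.holders[obj][next_txn] = mode
            granted.append(next_txn)
        return granted
```

With this sketch, two readers can hold a shared lock on the same object at once, while a writer asking for an exclusive lock is queued until both readers have released.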

2PL is a Pessimistic Concurrency Control Mechanism

Two-phase locking (2PL) operates on the assumption that conflicts between transactions are likely to occur. To prevent such conflicts, it locks resources proactively. A transaction runs in two phases: during the growing phase it acquires the locks it needs (shared locks for reads, exclusive locks for writes) and releases none; during the shrinking phase it releases locks and acquires no new ones. An exclusive lock blocks other transactions from accessing the same resource, even if no conflict would actually have occurred. If a transaction cannot acquire a required lock because another transaction holds a conflicting one, it waits until that lock is released.
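The two-phase rule itself (no new locks once any lock has been released) can be shown with a small sketch. The names `Transaction` and `TwoPhaseViolation` are hypothetical, and the example only tracks which locks a transaction holds; it does not coordinate with other transactions.

```python
class TwoPhaseViolation(Exception):
    """Raised when a transaction tries to lock after it has started unlocking."""


class Transaction:
    def __init__(self, name):
        self.name = name
        self.held = set()       # objects this transaction currently has locked
        self.shrinking = False  # flips to True at the first release

    def lock(self, obj):
        # Growing phase only: acquiring is illegal once shrinking has begun.
        if self.shrinking:
            raise TwoPhaseViolation(f"{self.name} cannot lock {obj!r} after releasing")
        self.held.add(obj)

    def unlock(self, obj):
        # The first release moves the transaction into the shrinking phase.
        self.shrinking = True
        self.held.discard(obj)

    def commit(self):
        # Releasing everything at commit is the "strict" flavour of 2PL.
        self.shrinking = True
        released = sorted(self.held)
        self.held.clear()
        return released
```

A transaction that locks A and B, unlocks A, and then tries to lock C gets a `TwoPhaseViolation`, because the attempt would interleave acquiring and releasing.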

2PL Can Also Cause Deadlock

Imagine a scenario where transaction T1 holds a lock on resource A while transaction T2 holds a lock on resource B. T1 then waits for B to be released while T2 waits for A to be released. Neither transaction can make progress, which is a deadlock.

To resolve this scenario, the system runs deadlock detection, picks a victim transaction among those involved, aborts it (rolling back its changes if needed), and restarts it.
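One common way to detect a deadlock is to build a waits-for graph (an edge from T1 to T2 means T1 is waiting for a lock that T2 holds) and look for a cycle. The sketch below assumes the graph is given as a plain dictionary; the function names and the "abort the youngest transaction" policy are illustrative choices, not something the protocol prescribes.

```python
def find_cycle(waits_for):
    """Return a list of transactions forming a cycle, or None if there is none.

    waits_for maps a transaction to the transactions it is waiting on.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {}
    path = []

    def visit(txn):
        color[txn] = GREY
        path.append(txn)
        for other in waits_for.get(txn, ()):
            state = color.get(other, WHITE)
            if state == GREY:
                # Reached a transaction already on the path: that is a cycle.
                return path[path.index(other):]
            if state == WHITE:
                cycle = visit(other)
                if cycle:
                    return cycle
        path.pop()
        color[txn] = BLACK
        return None

    for txn in list(waits_for):
        if color.get(txn, WHITE) == WHITE:
            cycle = visit(txn)
            if cycle:
                return cycle
    return None


def pick_victim(cycle, start_times):
    # Hypothetical policy: abort the youngest transaction (largest start time).
    return max(cycle, key=lambda txn: start_times[txn])


# The scenario from the text: T1 waits for T2 and T2 waits for T1.
cycle = find_cycle({"T1": ["T2"], "T2": ["T1"]})
victim = pick_victim(cycle, {"T1": 1, "T2": 2})
```

Here `cycle` contains both T1 and T2, and the youngest-first policy selects T2 as the victim.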

Problem in a Pessimistic Protocol

In 2PL, a read-only transaction might wait a long time to acquire a shared lock while a writer holds an exclusive lock on the same object. Optimistic Concurrency Control (OCC) avoids this overhead.

Optimistic Concurrency Control (OCC)

OCC avoids the locking overhead of pessimistic protocols such as 2PL by assuming that conflicts are rare, which makes it suitable for read-heavy workloads with infrequent writes.

In OCC, transactions write to a local workspace without modifying the actual data store. When a transaction wants to commit, the data store validates it by comparing its workspace with the workspaces of other running transactions to check for conflicts.

If validated, the content of the local workspace is written to the data store. If not, the transaction is aborted and restarted.
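The write-locally, validate, then apply flow can be sketched as follows. Note that this example uses one common variant of validation, where each transaction remembers the version of every item it read and checks at commit time that none of them changed; that is an assumption for illustration and not necessarily the exact comparison described above. The names `Store` and `OccTransaction` are hypothetical.

```python
class Store:
    """Toy data store: key -> (value, version)."""

    def __init__(self):
        self.data = {}

    def begin(self):
        return OccTransaction(self)


class OccTransaction:
    def __init__(self, store):
        self.store = store
        self.read_versions = {}  # key -> version observed at first read
        self.writes = {}         # local workspace, invisible to others

    def read(self, key):
        if key in self.writes:   # read your own pending write
            return self.writes[key]
        value, version = self.store.data.get(key, (None, 0))
        self.read_versions.setdefault(key, version)
        return value

    def write(self, key, value):
        self.writes[key] = value  # buffered, the store is not touched yet

    def commit(self):
        # A real store would hold a latch here so that validation and the
        # write phase happen atomically; this sketch is single-threaded.
        for key, seen in self.read_versions.items():
            _, current = self.store.data.get(key, (None, 0))
            if current != seen:
                return False      # validation failed: caller should restart
        for key, value in self.writes.items():
            _, current = self.store.data.get(key, (None, 0))
            self.store.data[key] = (value, current + 1)
        return True
```

If two transactions both read `x` and then both write it, the first to commit succeeds and the second fails validation, because the version of `x` it read is no longer current.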

Unlike 2PL, which uses logical locks, OCC uses physical locks, also called latches. For example, during validation, the data store latches the local workspaces of running transactions so that they cannot change while they are being compared, which avoids race conditions.

The problem with OCC is that a read-only transaction may be aborted because the values it read have been overwritten (as observed in a validation failure).

Multi-Version Concurrency Control (MVCC)

In this mechanism, a write does not overwrite data in place; instead, a new version of the data is created each time a transaction writes to it. A read-only transaction reads the newest version that was committed before it started. Because committed versions are immutable, the reader sees a consistent snapshot, and write transactions cannot block or delay it.

MVCC can be combined with either OCC or 2PL to handle write transactions. MVCC is widely used because, in day-to-day use, read transactions far outnumber write transactions. For instance, you scroll Instagram far more often than you upload a post.

MVCC Using 2PL for Write Operations

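Write transactions take locks using 2PL. When a write transaction i commits, the versions it created are tagged with its commit timestamp (TC_i). When a read-only transaction j starts, it is assigned a start timestamp (TS_j), which is compared to TC_i. Transaction j sees the changes made by transaction i only if TS_j ≥ TC_i, that is, only if i committed before j started.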
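The timestamp comparison can be made concrete with a small versioned store. This is a sketch under assumptions: `MvccStore`, its method names, and the single logical counter used for both commit and start timestamps are hypothetical, and the locking that 2PL would perform for writers is left out so that only the visibility rule is shown.

```python
import itertools


class MvccStore:
    def __init__(self):
        self.versions = {}                # key -> [(commit_ts, value), ...] ascending
        self.clock = itertools.count(1)   # hypothetical logical clock

    def commit_write(self, writes):
        """Commit a write transaction: stamp every new version with one commit timestamp."""
        commit_ts = next(self.clock)
        for key, value in writes.items():
            self.versions.setdefault(key, []).append((commit_ts, value))
        return commit_ts

    def begin_read(self):
        """Start a read-only transaction and return its start timestamp."""
        return next(self.clock)

    def read(self, key, start_ts):
        """Return the newest value whose commit timestamp is <= start_ts."""
        visible = None
        for commit_ts, value in self.versions.get(key, []):
            if commit_ts > start_ts:
                break
            visible = value
        return visible


store = MvccStore()
store.commit_write({"x": "v1"})    # committed before the reader starts
reader_ts = store.begin_read()
store.commit_write({"x": "v2"})    # committed after the reader started
snapshot_value = store.read("x", reader_ts)
```

The reader keeps seeing `"v1"` even though `"v2"` has been committed in the meantime, and it never had to wait for the writer. A reader that starts later would see `"v2"`.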

Conclusion

In summary, concurrency control mechanisms such as 2PL, OCC, and MVCC make different trade-offs between consistency, performance, and resource contention in distributed systems. MVCC, with its ability to handle read-heavy workloads efficiently, has become a widely used choice in modern applications.

