The CAP theorem is a concept that computer scientist Eric A. Brewer introduced during a talk on distributed computing in 2000, which is why it is also known as Brewer’s theorem. It states that a distributed system can provide at most two of the following three guarantees at the same time.

  • Consistency
  • High availability
  • Partition tolerance

A formal proof was published two years later, in 2002, by Seth Gilbert and Nancy Lynch of MIT.

The term distributed system refers to a network that stores data on more than one physical or virtual machine (node) at the same time. Because cloud applications generally run on distributed systems, developers need to understand how such systems behave. A developer who understands the CAP theorem well is better placed to choose a data management system that fits the application’s needs.

It is an essential tool that prepares system designers for the trade-offs they face when designing data systems on shared networks. CAP has played a major role in the design of most distributed systems, because it tells software architects which trade-offs they need to weigh as they build their data systems.

Let us dig into these conditions one by one.

Consistency

This condition states that all the nodes of a distributed system see the same data at the same time. The guarantee means that a read returns the most recently written value no matter which node serves it. In other words, once a write has been acknowledged, every later read reflects it, so the system moves from one consistent state to another.
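As a rough illustration of this guarantee, here is a minimal Python sketch. The `Replica` and `ConsistentStore` classes are hypothetical names invented for this example, not part of any real database. The sketch assumes a healthy network and acknowledges a write only after every replica has applied it, which is one simple way to make every node return the latest value.

```python
class Replica:
    """One node holding a copy of the data (hypothetical toy class)."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class ConsistentStore:
    """Toy store that acknowledges a write only after all replicas apply it."""

    def __init__(self, replicas):
        self.replicas = replicas

    def write(self, key, value):
        # Synchronous replication: update every node before acknowledging.
        for replica in self.replicas:
            replica.data[key] = value
        return "ok"

    def read(self, key, replica_index):
        # A client may be routed to any node; each one holds the latest value.
        return self.replicas[replica_index].data.get(key)


store = ConsistentStore([Replica("n1"), Replica("n2"), Replica("n3")])
store.write("balance", 100)
store.write("balance", 80)

# Read the same key from each of the three nodes.
reads = [store.read("balance", i) for i in range(3)]
print(reads)  # [80, 80, 80]
```

Real systems achieve this with more elaborate protocols, but the observable behavior is the same: no node answers with a value older than the last acknowledged write.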

High availability

This condition states that every request receives a response, whether the operation succeeded or failed. To count as highly available, the system must stay operational at all times: each client must get a response regardless of the state of any individual node. In practice, availability measures whether you can submit read and write commands at any given moment. The nodes need to be reachable all the time, because requests can arrive at any moment.


Partition tolerance

This condition states that the system continues to run as intended even when the network drops or delays messages between nodes. A partition-tolerant system can withstand a network failure between nodes without the whole system failing, because the data records are sufficiently replicated across multiple nodes. Partition tolerance keeps the system running through intermittent outages. Modern distributed systems must be partition tolerant, which leaves only two properties (availability and consistency) to trade off against each other.

Diagram visualizing the CAP theorem

Types of CAP theorem NoSQL databases

CP database: This delivers consistency and partition tolerance, hence the initials CP, at the expense of availability. When a partition occurs, the system has to shut down the inconsistent node (make it unavailable) until the partition is resolved.

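AP database: It delivers availability and partition tolerance by sacrificing consistency. When a partition occurs, all nodes stay available, but some of them may return an older version of the data than others.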

CA database: All nodes get consistency and availability at the expense of partition tolerance. A CA database is largely theoretical in a distributed setting, because partitions cannot be ruled out in a physically distributed system. However, a relational database running on a single node can give you consistency and availability if you need them. A good example is Microsoft SQL Server.
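To make the difference between the CP and AP choices concrete, the following Python sketch simulates two replicas and a network partition. Everything in it (`Node`, `Unavailable`, `run`, the `mode` flag) is a hypothetical toy written for this article, not the behavior of any specific database. It assumes a CP node refuses requests while it cannot reach its peer, and an AP node keeps answering from its local copy.

```python
class Unavailable(Exception):
    """Raised when a node refuses a request rather than risk inconsistency."""


class Node:
    def __init__(self, name, mode):
        self.name = name
        self.mode = mode          # "CP" or "AP" (assumed policy flag)
        self.value = None
        self.peer = None
        self.partitioned = False  # True while the link to the peer is down

    def write(self, value):
        if self.partitioned:
            if self.mode == "CP":
                # CP: give up availability to stay consistent.
                raise Unavailable(f"{self.name}: cannot reach peer, write refused")
            self.value = value    # AP: accept locally, replicas now diverge
            return
        self.value = value
        self.peer.value = value   # healthy link: replicate to the peer

    def read(self):
        if self.partitioned and self.mode == "CP":
            raise Unavailable(f"{self.name}: cannot reach peer, read refused")
        return self.value         # AP: may be stale during a partition


def run(mode):
    a, b = Node("a", mode), Node("b", mode)
    a.peer, b.peer = b, a
    a.write("v1")                          # replicated while the network is healthy
    a.partitioned = b.partitioned = True   # the partition starts
    try:
        a.write("v2")                      # client writes to node a
        return b.read()                    # another client reads from node b
    except Unavailable as exc:
        return f"error: {exc}"


print(run("AP"))  # v1
print(run("CP"))  # error: a: cannot reach peer, write refused
```

In AP mode the read succeeds but returns the stale value `v1`, while in CP mode the client gets an error instead of possibly inconsistent data. Neither outcome is wrong; which one is acceptable depends on the application.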

Final thoughts on CAP theorem

If you are thinking about developing an application that will run in the cloud, you need to understand the CAP theorem well. It helps developers design efficient systems. Every system has to compromise on one of the three CAP conditions. If you are running on a distributed database or network, partition tolerance is a given, so the real choice is between consistency and availability. Once you have chosen your two conditions, you can design the system to mitigate the missing one.
