The CAP theorem or Brewers Theorem states that in a distributed system it is impossible to simultaneously guarantee the three properties – Consistency, Availability and Partition tolerance (failure tolerance). According to the CAP theorem, a distributed system can meet two of the following properties at the same time, but not all three.
The consistency of the stored data. In distributed systems with replicated data, it is important to ensure that all replicas of the compromised record are updated after a transaction is completed. This consistency should not be confused with the consistency from the ACID transactions, which only affects the internal consistency of a database. Availability in terms of acceptable response times. All requests to the system are always answered. The failure tolerance of the computer/server networks. The system also continues to work when partitioned, i.e. when nodes can no longer communicate with each other (to keep the data consistent with each other). This can occur due to the loss of messages, the failure of individual network nodes, or the cancellation of connections on the network (the partition of the network).
Since only two of these three requirements can be fully met in distributed systems at the same time, the CAP theorem is often visualized as a triangle in which a concrete application can be categorized to one of the edges. However, this is inaccurate. Because the CAP theorem refers to distributed systems, a selection without partition tolerance is not part of the consideration. The interpretation is therefore rather that in the case of network partitioning (i.e. in a case of failure that does not describe the desired system execution), one has to choose between the consistency of the data and the availability of the system.
Real Life Examples of CAP Theorem
Domain Name System (DNS)
The Domain Name System is the Internet service that resolves symbolic hostnames such as numeric IP addresses to TCP/IP communication. DNS falls into the AP category. Availability is extremely high, as is tolerance to the failure of individual DNS servers. However, consistency is not always immediate: it can sometimes take days for a modified DNS record to be propagated to the entire DNS hierarchy and thus seen by all clients.
Cloud platforms rely on horizontal scaling, i.e. the load is distributed across many nodes consisting of cheap, not necessarily fail-safe hardware . Therefore, a cloud application must be able to show tolerance to the failure of individual nodes. High availability is also required. It follows that a cloud application (at least in large parts) also falls into the AP category. Examples of web applications that do not require strict consistency would be social media sites such as Twitter or Facebook; if individual messages do not arrive at all users at the same time, the basic function of the service is not affected.
Since a strict consistency can no longer be guaranteed here due to the CAP theorem, but a completely inconsistent data storage is also not desired, one has to accept weaker consistency conditions. As a counterpart to the ACID principle of relational databases, many NoSQL databases rely on the BASE principle. Even consistency can be translated well with “final consistency”, i.e. the system is back in a consistent state after a certain (as short) period of inconsistency as possible.
Relational Database Management System (RDBMS)
The relational databases, such as DB2 or Oracle, primarily strive for consistency. An RDBMS cluster usually falls into the CA category. Above all, they strive for the availability and consistency of all nodes. Because they are mostly operated with highly available networks and servers, they do not necessarily have to be able to handle partitioning well.
For distributed financial applications such as ATMs, the consistency of the data is paramount: a withdrawn amount of money must be reliably debited to the account page, deposited money must appear on the account. This requirement must also be ensured in the event of traffic disruptions (partition tolerance). Compared to consistency and partition tolerance, availability is secondary(CP): In the event of network disruptions, an ATM or other service should be unavailable rather than handle incorrect transactions.