CAP Theorem

Consistency, Availability, Partition Tolerance: you can't guarantee all three at once. Choose wisely.

Tiny Summary

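The CAP Theorem states that a distributed system can guarantee at most 2 of 3 properties at the same time: Consistency (every read sees the most recent write, so all nodes agree on the data), Availability (every request to a working node gets a non-error response), and Partition Tolerance (the system keeps working when the network drops messages between nodes). Pick your tradeoffs.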


The Impossible Triangle

In distributed systems, you face a fundamental tradeoff. You MUST handle network partitions (they will happen), so you're really choosing between Consistency and Availability.

        Consistency
             /\
            /  \
           /    \
          /  ??  \
         /________\
  Availability    Partition
                Tolerance

You can't have all three. Since partition tolerance is not optional, the real choice is between CP and AP.


The Choice

CP (Consistency + Partition Tolerance):

When the network splits, reject any request that can't be served consistently until the partition heals.

Daily examples:

  • Your bank ATM: "Service temporarily unavailable" (better than showing wrong balance)
  • Stock trading: Halt trading rather than show stale prices
  • Inventory systems: "Out of stock" rather than overselling
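
The CP behavior can be sketched in a few lines of Python. This is a minimal illustration, not a real replication protocol: the names CPRegister, Unavailable, and reachable are hypothetical, and the reachable counter stands in for whatever failure detection a real system would use. The node refuses both reads and writes whenever it cannot reach a majority of the cluster.

```python
class Unavailable(Exception):
    """Raised when a CP node refuses a request during a partition."""


class CPRegister:
    """One node of a hypothetical CP store holding a single value."""

    def __init__(self, cluster_size):
        self.cluster_size = cluster_size
        # Number of nodes this node can currently reach, including itself.
        self.reachable = cluster_size
        self.value = None

    def _require_majority(self):
        # Without a strict majority we might be on the losing side of a split.
        if self.reachable <= self.cluster_size // 2:
            raise Unavailable("no majority reachable; refusing request")

    def write(self, value):
        self._require_majority()
        self.value = value

    def read(self):
        self._require_majority()
        return self.value
```

With a 3-node cluster, setting reachable to 1 simulates a node cut off from its peers: read() and write() raise Unavailable instead of returning possibly wrong data, which is the "Service temporarily unavailable" experience from the ATM example.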

AP (Availability + Partition Tolerance):

When the network splits, serve requests with potentially stale data.

Daily examples:

  • Facebook: You see a post, your friend doesn't yet (eventual consistency)
  • Twitter: Follower counts slightly off across devices
  • DNS: Serves cached records even if authoritative server is down
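
The AP behavior can be sketched the same way. This is a simplified illustration under stated assumptions: APReplica and its version field are hypothetical, and conflicts are resolved with a basic last-write-wins rule, which is only one of several possible merge strategies. Each replica always answers from its local copy and reconciles with its peer once the partition heals.

```python
class APReplica:
    """One replica of a hypothetical AP store holding a single value."""

    def __init__(self):
        self.value = None
        self.version = 0  # hypothetical logical timestamp of the last write

    def write(self, value, version):
        # Always accepted, even if other replicas are unreachable.
        self.value = value
        self.version = version

    def read(self):
        # Always answered from the local copy, which may be stale.
        return self.value

    def merge(self, other):
        # Last-write-wins: adopt the peer's value if it is newer.
        if other.version > self.version:
            self.value = other.value
            self.version = other.version


east, west = APReplica(), APReplica()
east.write("new post", version=1)   # partition: west does not hear about this
stale = west.read()                 # None, the friend doesn't see the post yet
west.merge(east)                    # partition heals, replicas sync
fresh = west.read()                 # "new post"
```

Between the write and the merge, the two replicas disagree. That window is what "eventual consistency" refers to.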

Real Business Scenarios

Banking App (CP):

Network partition happens. Two data centers can't talk.

  • CP choice: Disable withdrawals until partition heals → frustrated users, but no overdrafts
  • AP choice: Allow withdrawals from stale data → happy users, potential overdrafts, money lost

Banks choose CP. Correctness over availability.
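
To see where the overdraft comes from, here is a small worked example in Python. The function names and the numbers (a 100 balance, 80 withdrawals) are made up for illustration. During the partition each data center approves withdrawals against its own copy of the balance, which no longer reflects what the other side has paid out.

```python
def approve_from_stale_copy(local_balance, amount):
    """AP choice: approve using only this data center's local copy."""
    return amount <= local_balance


def simulate_partition(true_balance, amount, data_centers=2):
    """Return the overdraft when each data center sees one withdrawal."""
    # Every data center still sees the pre-partition balance.
    approvals = sum(
        approve_from_stale_copy(true_balance, amount)
        for _ in range(data_centers)
    )
    paid_out = approvals * amount
    return max(0, paid_out - true_balance)


print(simulate_partition(true_balance=100, amount=80))  # prints 60
```

Both data centers approve 80 against a balance of 100, so 160 is paid out and the account ends up 60 overdrawn. Under the CP choice neither withdrawal would be approved until the data centers could talk again.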

Social Media (AP):

Network partition happens. East and West coast data centers split.

  • CP choice: Disable posting until healed → users leave for competitors
  • AP choice: Accept posts, sync later → users stay, minor inconsistencies (fine for likes/follows)

Social apps choose AP. Availability over perfect consistency.


How to Choose

Choose CP (Consistency) when:

  • Money is involved (payments, billing, balances)
  • Inventory management (can't oversell physical goods)
  • Critical data (medical records, legal documents)
  • Correctness matters more than uptime

Choose AP (Availability) when:

  • Social features (likes, follows, comments)
  • Analytics and metrics (slight staleness acceptable)
  • User-generated content (posts, photos)
  • User experience matters more than perfect accuracy

The PACELC Extension

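CAP only describes what happens during a network partition. But what about normal operation?

PACELC: if there is a Partition (P), choose between Availability (A) and Consistency (C); Else (E), choose between Latency (L) and Consistency (C).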

Even without partitions, you trade consistency for speed:

  • Waiting for all replicas = consistent but slow
  • Reading from nearest replica = fast but potentially stale
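
The latency side of this tradeoff can be made concrete with quorum arithmetic. The sketch below assumes a quorum-replicated store with N replicas, where a read waits for R replies and a write waits for W acknowledgements; the function names and the latency figures are illustrative. When R + W > N, every read quorum overlaps every write quorum, so a read always contacts at least one replica that has the latest acknowledged write.

```python
def quorums_overlap(n, r, w):
    """True if any read quorum must share a replica with any write quorum."""
    return r + w > n


def read_latency_ms(replica_latencies_ms, r):
    """Latency of a read that waits for the r fastest replicas to reply."""
    return sorted(replica_latencies_ms)[r - 1]


latencies = [5, 40, 120]  # made-up round trips: nearest, same region, remote

fast_read = read_latency_ms(latencies, r=1)  # 5 ms, nearest replica only
slow_read = read_latency_ms(latencies, r=3)  # 120 ms, waits for all replicas
```

Reading from the nearest replica (R = 1) takes 5 ms here but only overlaps with writes if W = N. Waiting for all three replicas costs 120 ms, set by the slowest one.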

Common Mistakes

"We'll build a CA system"

No. Network partitions WILL happen. Choosing CA means you don't handle network failures. That's not production-ready.

"We'll be consistent AND available during partitions"

Also no. That's mathematically impossible. The CAP theorem is a proven result (Gilbert and Lynch, 2002), not a suggestion.

"We need strong consistency everywhere"

Expensive and slow. Most features don't need it. Use strong consistency only where you need it (payments), eventual consistency everywhere else (user profiles).
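
One way to apply this is to make the consistency level an explicit per-feature decision. The sketch below is hypothetical: the Consistency enum, the POLICY table, and the feature names are illustrative and not taken from any particular database or framework.

```python
from enum import Enum


class Consistency(Enum):
    STRONG = "strong"
    EVENTUAL = "eventual"


# Hypothetical per-feature policy: strong only where correctness demands it.
POLICY = {
    "payments": Consistency.STRONG,
    "user_profiles": Consistency.EVENTUAL,
}


def consistency_for(feature):
    """Look up a feature's level, defaulting to eventual consistency."""
    return POLICY.get(feature, Consistency.EVENTUAL)
```

Defaulting to eventual consistency means a feature has to be opted in to the slower, stronger guarantee.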


Key Insights

Network partitions aren't theoretical. They happen regularly in production: AWS regions go down, fiber gets cut, routers fail. Your system must handle them.

Many systems are effectively "eventually consistent AP": they prioritize availability and sync data when the network heals. CP is typically reserved for financial and other critical data.

Don't build CP for features that don't need it. Eventual consistency is fine for most SaaS features.
