Distributed systems face a fundamental trade-off of consistency vs. availability and performance. Strong consistency is easy to understand but is slow, expensive, and is unavailable when the system partitions. Eventual consistency (EC) can be cheaper, faster, and more scalable, but is hard to understand and get correct. This tutorial explores the multiple gradations between strong and eventual consistency. It focuses on understanding EC, from perspectives of the algorithm designer, of the system builder, and the application programmer. It will include formal definitions of correctness, study of lower bounds, and implementation recipes and tricks.
- Marc Shapiro, Inria & Université Pierre et Marie Curie-LIP6 (Paris, France), lip6.fr/Marc.Shapiro/.
- Nuno Preguiça, Departamento de Informática, FCT, Universidade Nova de Lisboa (Portugal), asc.di.fct.unl.pt/~nmp/.
- The consistency problem on the Internet: why replication is needed, and why it's hard.
- Strong vs. Eventual Consistency: CAP, correctness vs. performance.
- Basic mechanisms of EC: replica, local query, distributed update, state-based vs. operation-based, detect and resolve concurrent updates, accumulation of meta-data.
- Distributed resolution: highest timestamp wins, 2P-Set, Operational Transformation.
- Formal definitions: system model, EC, causal consistency, partial-order model.
- CRDTs: sufficient conditions, examples.
- Designing a correct set: basic OR-Set, garbage collection.
- More guarantees: transactions, invariants.
- Available CRDT implementations.