Replicating shared data is a fundamental mechanism in large-scale
distributed systems, but suffers from a fundamental tension between
scalability and data consistency. Eventual consistency sidesteps the
(foreground) synchronisation bottleneck, but remains ad-hoc,
error-prone, and difficult to prove correct.
We present a promising new approach that is simple, scales almost
indefinitely, and provably ensures eventual consistency: A CRDT is a
data type that demonstrates some simple properties, viz. that its
concurrent operations commute, or that its states form a semi-lattice.
Any CRDT provably converges, provided all replicas eventually receive
all operations. A CRDT requires no synchronisation: an update can
execute immediately, irrespective of network latency, faults, or
disconnection; it is highly scalable and fault-tolerant.
The approach is necessarily limited since any task requiring consensus
is out of reach. Nonetheless, many interesting and useful data types
can be designed as a CRDT. We previously published the Treedoc CRDT, a
sequence data type suited to concurrent editing tasks (as in a p2p
wiki). This talk presents a portfolio of generally-useful, non-trivial,
composable CRDTs, including variations on counters, registers, sets,
maps (key-value stores), graphs and sequences. This research is part of
a systematic and principled study of CRDTs, to discover their power and
limitations, and to better understand the underlying mechanisms and
requirements. The challenges ahead include scaling garbage collection
and integrating occasional non-commuting operations.
This is joint work with Marek Zawirski of LIP6, Nuno Preguiça of
Universidade Nova de Lisboa, and Carlos Baquero, of Universidade do
Minho. The research is supported by ANR grant ConcoRDanT
(ANR-10-BLAN-0208), a Google Research Award 2009, and a Google European
Doctoral Fellowship 2010.