SlideShare a Scribd company logo
1 of 42
Download to read offline
CRDTs and Redis
From sequential to concurrent executions
Carlos Baquero
HASLab, INESC TEC & Univ. Minho, Portugal
April 26th 2018
The speed of communication in the 19th century
W. H. Harrison’s death
“At 12:30 am on April 4th, 1841 President
William Henry Harrison died of pneumonia
just a month after taking office. The Rich-
mond Enquirer published the news of his
death two days later on April 6th. The North-
Carolina standard newspaper published it on
April 14th. His death wasn’t known of in Los
Angeles until July 23rd, 110 days after it had
occurred.”
Text by Zack Bloom, A Quick History of Digital Communication Before the
Internet. https://eager.io/blog/communication-pre-internet/
Picture by By Albert Sands Southworth and Josiah Johnson Hawes
The speed of communication in the 19th century
Francis Galton Isochronic Map
The speed of communication in the 21st century
RTT data gathered via http://www.azurespeed.com
The speed of communication in the 21st century
If you really like long latencies . . .
Time delay between Mars and Earth
blogs.esa.int/mex/2012/08/05/time-delay-between-mars-and-earth/
Delay/Disruption Tolerant Networking
www.nasa.gov/content/dtn
Latency magnitudes
Geo-replication
λ, up to 50ms (local region DC)
Λ, between 100ms and 300ms (inter-continental)
No inter-DC replication
Client writes observe λ latency
Planet-wide geo-replication
Replication techniques versus client side write latency ranges
Consensus/Paxos [Λ, 2Λ] (with no divergence)
Primary-Backup [λ, Λ] (asynchronous version)
Multi-Master λ (allowing divergence)
EC and CAP for Geo-Replication
Eventually Consistent. CACM 2009, Werner Vogels
In an ideal world there would be only one consistency model:
when an update is made all observers would see that update.
Building reliable distributed systems at a worldwide scale
demands trade-offs between consistency and availability.
CAP theorem. PODC 2000, Eric Brewer
Of three properties of shared-data systems – data consistency,
system availability, and tolerance to network partition – only two
can be achieved at any given time.
CRDTs provide support for partition-tolerant high availability
From sequential to concurrent executions
Consensus and Primary-Backup provide illusion of a single replica
This also preserves some illusion of sequential behaviour
Sequential execution
Ops O o // p // q
Time //
We have an ordered set (O, <). O = {o, p, q} and o < p < q
From sequential to concurrent executions
Consensus and Primary-Backup provide illusion of a single replica
This also preserves some illusion of sequential behaviour
Sequential execution
Ops O o // p // q
Time //
We have an ordered set (O, <). O = {o, p, q} and o < p < q
From sequential to concurrent executions
EC Multi-master (or active-active) can expose concurrency
Concurrent execution
p // q

Ops O o
??
r
s
77
Time //
Ordered set (O, ). o  p  q  r and o  s  r
Some ops in O are concurrent: p s and q s
Design of Conflict-Free Replicated Data Types
A full partially ordered log of operations can implement any CRDT
Replicas keep increasing local views of an evolving distributed log
Any query can be expressed from that log
Example: Counter at i is |{inc | inc ∈ Oi }| − |{dec | dec ∈ Oi }|
CRDTs are efficient representations that follow some general rules
Design of Conflict-Free Replicated Data Types
A full partially ordered log of operations can implement any CRDT
Replicas keep increasing local views of an evolving distributed log
Any query can be expressed from that log
Example: Counter at i is |{inc | inc ∈ Oi }| − |{dec | dec ∈ Oi }|
CRDTs are efficient representations that follow some general rules
Design of Conflict-Free Replicated Data Types
A full partially ordered log of operations can implement any CRDT
Replicas keep increasing local views of an evolving distributed log
Any query can be expressed from that log
Example: Counter at i is |{inc | inc ∈ Oi }| − |{dec | dec ∈ Oi }|
CRDTs are efficient representations that follow some general rules
Principle of permutation equivalence
If operations in sequence can commute, preserving a given result,
then under concurrency they should preserve the same result
Sequential
inc(10) // inc(35) // dec(5) // inc(2)
dec(5) // inc(2) // inc(10) // inc(35)
Concurrent
inc(35)

inc(10)
88

inc(2)
dec(5)
88
You guessed: Result is 42
Implementing Counters
Example: CRDT PNCounters
A inc(35)

B inc(10)
88

inc(2)
C dec(5)
88
Lets track total number of incs and decs done at each replica
{A(incs, decs), . . . , C(. . . , . . .)}
Implementing Counters
Example: CRDT PNCounters
Separate positive and negative counts are kept per replica
A {A(35, 0), B(10, 0)}
++
B {B(10, 0)}
66
((
{A(35, 0), B(12, 0), C(0, 5)}
C {B(10, 0), C(0, 5)}
33
Joining does point-wise maximums among entries (semilattice)
At any time, counter value is sum of incs minus the sum of decs
Implementing Counters
Example: CRDT PNCounters
Separate positive and negative counts are kept per replica
A {A(35, 0), B(10, 0)}
++
B {B(10, 0)}
66
((
{A(35, 0), B(12, 0), C(0, 5)}
C {B(10, 0), C(0, 5)}
33
Joining does point-wise maximums among entries (semilattice)
At any time, counter value is sum of incs minus the sum of decs
Implementing Counters
Redis CRDT Counters
There are multiple ways to implement CRDT counters
Redis has a distinct implementation that favours garbage collection
Redis CRDT counters are 59 bits (not 64) to avoid overflows
Registers
Registers are an ordered set of write operations
Sequential execution
A wr(x) // wr(j) // wr(k) // wr(x)
Sequential execution under distribution
A wr(x)
%%
wr(x)
B wr(j) // wr(k)
99
Register value is x, the last written value
Implementing Registers
Naif Last-Writer-Wins
CRDT register implemented by attaching local wall-clock times
Sequential execution under distribution
A (11 : 00)x
((
(11 : 30)?
B (12 : 02)j // (12 : 05)k
66
Problem: Wall-clock on B is one hour ahead of A
Value x might not be writeable again at A since 12:05  11:30
Registers
Sequential Semantics
Register shows value v at replica i iff
wr(v) ∈ Oi
and
wr( ) ∈ Oi · wr(v)  wr( )
Preservation of sequential semantics
Concurrent semantics should preserve the sequential semantics
This also ensures correct sequential execution under distribution
Multi-value Registers
Concurrency semantics shows all concurrent values
{v | wr(v) ∈ Oi ∧ wr( ) ∈ Oi · wr(v)  wr( )}
Concurrent execution
A wr(x)
%%
// wr(y) // {y, k} // wr(m) // {m}
B wr(j) // wr(k)
99
Dynamo shopping carts are multi-value registers with payload sets
The m value could be an application level merge of values y and k
Implementing Multi-value Registers
Concurrency can be preciselly tracked with version vectors
Concurrent execution (version vectors)
A [1, 0]x
%%
// [2, 0]y // [2, 0]y, [1, 2]k // [3, 2]m
B [1, 1]j // [1, 2]k
66
Metadata can be compressed with a common causal context and a
single scalar per value (dotted version vectors)
Registers in Redis
LWW arbitration
Multi-value registers allows executions leading to concurrent values
Presenting concurrent values is at odds with the sequential API
Redis both tracks causality and registers wall-clock times
Querying uses Last-Writer-Wins selection among concurrent values
This preserves correctness of sequential semantics
A value with clock 12:05 can still be causally overwritten at 11:30
Sets
Sequential Semantics
Consider add and rmv operations
X = {. . .}, add(a) −→ add(c) we observe that a, c ∈ X
X = {. . .}, add(c) −→ rmv(c) we observe that c ∈ X
In general, given Oi , the set has elements
{e | add(e) ∈ Oi ∧ rmv(e) ∈ Oi · add(e)  rmv(e)}
Sets
Sequential Semantics
Consider add and rmv operations
X = {. . .}, add(a) −→ add(c) we observe that a, c ∈ X
X = {. . .}, add(c) −→ rmv(c) we observe that c ∈ X
In general, given Oi , the set has elements
{e | add(e) ∈ Oi ∧ rmv(e) ∈ Oi · add(e)  rmv(e)}
Sets
Sequential Semantics
Consider add and rmv operations
X = {. . .}, add(a) −→ add(c) we observe that a, c ∈ X
X = {. . .}, add(c) −→ rmv(c) we observe that c ∈ X
In general, given Oi , the set has elements
{e | add(e) ∈ Oi ∧ rmv(e) ∈ Oi · add(e)  rmv(e)}
Sets
Sequential Semantics
Consider add and rmv operations
X = {. . .}, add(a) −→ add(c) we observe that a, c ∈ X
X = {. . .}, add(c) −→ rmv(c) we observe that c ∈ X
In general, given Oi , the set has elements
{e | add(e) ∈ Oi ∧ rmv(e) ∈ Oi · add(e)  rmv(e)}
Sets
Sequential Semantics
Consider add and rmv operations
X = {. . .}, add(a) −→ add(c) we observe that a, c ∈ X
X = {. . .}, add(c) −→ rmv(c) we observe that c ∈ X
In general, given Oi , the set has elements
{e | add(e) ∈ Oi ∧ rmv(e) ∈ Oi · add(e)  rmv(e)}
Sets
Concurrency Semantics
Problem: Concurrently adding and removing the same element
Concurrent execution
A add(x)
%%
// rmv(x) // {?} // add(x) // {x}
B rmv(x) // add(x)
::
Concurrency Semantics
Add-Wins Sets
Lets choose Add-Wins
Consider a set of known operations Oi , at node i, that is ordered
by an happens-before partial order . Set has elements
{e | add(e) ∈ Oi ∧ rmv(e) ∈ Oi · add(e)  rmv(e)}
Is this familiar?
The sequential semantics applies identical rules on a total order
Redis CRDT sets are Add-Wins Sets
Concurrency Semantics
Add-Wins Sets
Lets choose Add-Wins
Consider a set of known operations Oi , at node i, that is ordered
by an happens-before partial order . Set has elements
{e | add(e) ∈ Oi ∧ rmv(e) ∈ Oi · add(e)  rmv(e)}
Is this familiar?
The sequential semantics applies identical rules on a total order
Redis CRDT sets are Add-Wins Sets
Concurrency Semantics
Add-Wins Sets
Lets choose Add-Wins
Consider a set of known operations Oi , at node i, that is ordered
by an happens-before partial order . Set has elements
{e | add(e) ∈ Oi ∧ rmv(e) ∈ Oi · add(e)  rmv(e)}
Is this familiar?
The sequential semantics applies identical rules on a total order
Redis CRDT sets are Add-Wins Sets
Concurrency Semantics
Add-Wins Sets
Lets choose Add-Wins
Consider a set of known operations Oi , at node i, that is ordered
by an happens-before partial order . Set has elements
{e | add(e) ∈ Oi ∧ rmv(e) ∈ Oi · add(e)  rmv(e)}
Is this familiar?
The sequential semantics applies identical rules on a total order
Redis CRDT sets are Add-Wins Sets
Equivalence to a sequential execution?
Add-Wins Sets
Can we always explain a concurrent execution by a sequential one?
Concurrent execution
A {x, y} // add(y) // rmv(x) // {y} //
##
{x, y}
B {x, y} // add(x) // rmv(y) // {x} //
;;
{x, y}
Two (failed) sequential explanations
H1 {x, y} // . . . // rmv(x) // { x, y}
H2 {x, y} // . . . // rmv(y) // {x, y}
Concurrent executions can have richer outcomes
Concurrency Semantics
Remove-Wins Sets
Alternative: Lets choose Remove-Wins
Xi
.
= {e | add(e) ∈ Oi ∧ ∀ rmv(e) ∈ Oi · rmv(e)  add(e)}
Remove-Wins requires more metadata than Add-Wins
Both Add and Remove-Wins have same semantics in a total order
They are different but both preserve sequential semantics
Concurrency Semantics
Remove-Wins Sets
Alternative: Lets choose Remove-Wins
Xi
.
= {e | add(e) ∈ Oi ∧ ∀ rmv(e) ∈ Oi · rmv(e)  add(e)}
Remove-Wins requires more metadata than Add-Wins
Both Add and Remove-Wins have same semantics in a total order
They are different but both preserve sequential semantics
Choice of semantics
Design freedom is limited by preservation of sequential semantics
Delaying choice of semantics to query time
A CRDT Set data type could store enough information to allow a
parametrized query that shows either Add-Wins or Remove-Wins
Open question: Reason for an extended concurrency aware API?
Sequence/List
Weak/Strong Specification [Attiya et al, PODC 16]
Element x is kept
rpush(b) // xb
$$
// lpush(x) // x
99
%%
axb
lpush(a) // ax
::
Element x is removed (Redis enforces Strong Specification)
rpush(b) // xb
##
// lpush(x) // x
::
$$
// rem(x) // // ab ¬ ba
lpush(a) // ax
;;
Causal Consistency
Redis CRDTs provide per-key causal consistency
Source FIFO
A apple

%%
B apple pie
##
hot
##
C pie hot peach? apple
Causal consistency
A apple
%%
B apple
%%
pie
$$
hot

C apple pie hot tasty
Strongest highly available consistency model
Thanks and Questions
Thanks to Yiftach, Yuval, Yossi, Meir, Cihan for their inputs
Glad to try to answer any questions
Carlos Baquero, cbm@di.uminho.pt, @xmal
#Redis #RedisConf

More Related Content

What's hot

A Commutative Alternative to Fractional Calculus on k-Differentiable Functions
A Commutative Alternative to Fractional Calculus on k-Differentiable FunctionsA Commutative Alternative to Fractional Calculus on k-Differentiable Functions
A Commutative Alternative to Fractional Calculus on k-Differentiable Functions
Matt Parker
 
Time and space complexity
Time and space complexityTime and space complexity
Time and space complexity
Ankit Katiyar
 
Algorithm Design and Complexity - Course 7
Algorithm Design and Complexity - Course 7Algorithm Design and Complexity - Course 7
Algorithm Design and Complexity - Course 7
Traian Rebedea
 

What's hot (20)

Task Constrained Motion Planning for Snake Robot
Task Constrained Motion Planning for Snake RobotTask Constrained Motion Planning for Snake Robot
Task Constrained Motion Planning for Snake Robot
 
Gremlin's Graph Traversal Machinery
Gremlin's Graph Traversal MachineryGremlin's Graph Traversal Machinery
Gremlin's Graph Traversal Machinery
 
Fsa
FsaFsa
Fsa
 
O2
O2O2
O2
 
A Commutative Alternative to Fractional Calculus on k-Differentiable Functions
A Commutative Alternative to Fractional Calculus on k-Differentiable FunctionsA Commutative Alternative to Fractional Calculus on k-Differentiable Functions
A Commutative Alternative to Fractional Calculus on k-Differentiable Functions
 
Kumegawa russia
Kumegawa russiaKumegawa russia
Kumegawa russia
 
Bellman Ford's Algorithm
Bellman Ford's AlgorithmBellman Ford's Algorithm
Bellman Ford's Algorithm
 
Forward kinematics
Forward kinematicsForward kinematics
Forward kinematics
 
Price of anarchy is independent of network topology
Price of anarchy is independent of network topologyPrice of anarchy is independent of network topology
Price of anarchy is independent of network topology
 
The Ring programming language version 1.5.4 book - Part 35 of 185
The Ring programming language version 1.5.4 book - Part 35 of 185The Ring programming language version 1.5.4 book - Part 35 of 185
The Ring programming language version 1.5.4 book - Part 35 of 185
 
Time and space complexity
Time and space complexityTime and space complexity
Time and space complexity
 
QR Factorizations and SVDs for Tall-and-skinny Matrices in MapReduce Architec...
QR Factorizations and SVDs for Tall-and-skinny Matrices in MapReduce Architec...QR Factorizations and SVDs for Tall-and-skinny Matrices in MapReduce Architec...
QR Factorizations and SVDs for Tall-and-skinny Matrices in MapReduce Architec...
 
Binary Session Types for Psi-Calculi (APLAS 2016)
Binary Session Types for Psi-Calculi (APLAS 2016)Binary Session Types for Psi-Calculi (APLAS 2016)
Binary Session Types for Psi-Calculi (APLAS 2016)
 
discrete-hmm
discrete-hmmdiscrete-hmm
discrete-hmm
 
Wg qcolorable
Wg qcolorableWg qcolorable
Wg qcolorable
 
Direct QR factorizations for tall-and-skinny matrices in MapReduce architectu...
Direct QR factorizations for tall-and-skinny matrices in MapReduce architectu...Direct QR factorizations for tall-and-skinny matrices in MapReduce architectu...
Direct QR factorizations for tall-and-skinny matrices in MapReduce architectu...
 
Algorithm Design and Complexity - Course 7
Algorithm Design and Complexity - Course 7Algorithm Design and Complexity - Course 7
Algorithm Design and Complexity - Course 7
 
Yampa AFRP Introduction
Yampa AFRP IntroductionYampa AFRP Introduction
Yampa AFRP Introduction
 
Aaex4 group2(中英夾雜)
Aaex4 group2(中英夾雜)Aaex4 group2(中英夾雜)
Aaex4 group2(中英夾雜)
 
Lossy Kernelization
Lossy KernelizationLossy Kernelization
Lossy Kernelization
 

Similar to RedisConf18 - CRDTs and Redis - From sequential to concurrent executions

Detecting Bugs in Binaries Using Decompilation and Data Flow Analysis
Detecting Bugs in Binaries Using Decompilation and Data Flow AnalysisDetecting Bugs in Binaries Using Decompilation and Data Flow Analysis
Detecting Bugs in Binaries Using Decompilation and Data Flow Analysis
Silvio Cesare
 

Similar to RedisConf18 - CRDTs and Redis - From sequential to concurrent executions (20)

R Language Introduction
R Language IntroductionR Language Introduction
R Language Introduction
 
Introduction to R programming
Introduction to R programmingIntroduction to R programming
Introduction to R programming
 
Parallel Computing 2007: Bring your own parallel application
Parallel Computing 2007: Bring your own parallel applicationParallel Computing 2007: Bring your own parallel application
Parallel Computing 2007: Bring your own parallel application
 
Constraint Programming in Haskell
Constraint Programming in HaskellConstraint Programming in Haskell
Constraint Programming in Haskell
 
R language introduction
R language introductionR language introduction
R language introduction
 
Scala as a Declarative Language
Scala as a Declarative LanguageScala as a Declarative Language
Scala as a Declarative Language
 
Scribed lec8
Scribed lec8Scribed lec8
Scribed lec8
 
Machine Learning meets DevOps
Machine Learning meets DevOpsMachine Learning meets DevOps
Machine Learning meets DevOps
 
Reactive x
Reactive xReactive x
Reactive x
 
slides.07.pptx
slides.07.pptxslides.07.pptx
slides.07.pptx
 
Parametrized Model Checking of Fault Tolerant Distributed Algorithms by Abstr...
Parametrized Model Checking of Fault Tolerant Distributed Algorithms by Abstr...Parametrized Model Checking of Fault Tolerant Distributed Algorithms by Abstr...
Parametrized Model Checking of Fault Tolerant Distributed Algorithms by Abstr...
 
Parallel Computing with R
Parallel Computing with RParallel Computing with R
Parallel Computing with R
 
Introduction to r
Introduction to rIntroduction to r
Introduction to r
 
Ch7
Ch7Ch7
Ch7
 
Bayesian phylogenetic inference_big4_ws_2016-10-10
Bayesian phylogenetic inference_big4_ws_2016-10-10Bayesian phylogenetic inference_big4_ws_2016-10-10
Bayesian phylogenetic inference_big4_ws_2016-10-10
 
Detecting Bugs in Binaries Using Decompilation and Data Flow Analysis
Detecting Bugs in Binaries Using Decompilation and Data Flow AnalysisDetecting Bugs in Binaries Using Decompilation and Data Flow Analysis
Detecting Bugs in Binaries Using Decompilation and Data Flow Analysis
 
Iterative methods with special structures
Iterative methods with special structuresIterative methods with special structures
Iterative methods with special structures
 
ITS World Congress :: Vienna, Oct 2012
ITS World Congress :: Vienna, Oct 2012ITS World Congress :: Vienna, Oct 2012
ITS World Congress :: Vienna, Oct 2012
 
NVIDIA HPC ソフトウエア斜め読み
NVIDIA HPC ソフトウエア斜め読みNVIDIA HPC ソフトウエア斜め読み
NVIDIA HPC ソフトウエア斜め読み
 
Itroroduction to R language
Itroroduction to R languageItroroduction to R language
Itroroduction to R language
 

More from Redis Labs

SQL, Redis and Kubernetes by Paul Stanton of Windocks - Redis Day Seattle 2020
SQL, Redis and Kubernetes by Paul Stanton of Windocks - Redis Day Seattle 2020SQL, Redis and Kubernetes by Paul Stanton of Windocks - Redis Day Seattle 2020
SQL, Redis and Kubernetes by Paul Stanton of Windocks - Redis Day Seattle 2020
Redis Labs
 
Anatomy of a Redis Command by Madelyn Olson of Amazon Web Services - Redis Da...
Anatomy of a Redis Command by Madelyn Olson of Amazon Web Services - Redis Da...Anatomy of a Redis Command by Madelyn Olson of Amazon Web Services - Redis Da...
Anatomy of a Redis Command by Madelyn Olson of Amazon Web Services - Redis Da...
Redis Labs
 
RediSearch 1.6 by Pieter Cailliau - Redis Day Bangalore 2020
RediSearch 1.6 by Pieter Cailliau - Redis Day Bangalore 2020RediSearch 1.6 by Pieter Cailliau - Redis Day Bangalore 2020
RediSearch 1.6 by Pieter Cailliau - Redis Day Bangalore 2020
Redis Labs
 
RedisGraph 2.0 by Pieter Cailliau - Redis Day Bangalore 2020
RedisGraph 2.0 by Pieter Cailliau - Redis Day Bangalore 2020RedisGraph 2.0 by Pieter Cailliau - Redis Day Bangalore 2020
RedisGraph 2.0 by Pieter Cailliau - Redis Day Bangalore 2020
Redis Labs
 

More from Redis Labs (20)

Redis Day Bangalore 2020 - Session state caching with redis
Redis Day Bangalore 2020 - Session state caching with redisRedis Day Bangalore 2020 - Session state caching with redis
Redis Day Bangalore 2020 - Session state caching with redis
 
Protecting Your API with Redis by Jane Paek - Redis Day Seattle 2020
Protecting Your API with Redis by Jane Paek - Redis Day Seattle 2020Protecting Your API with Redis by Jane Paek - Redis Day Seattle 2020
Protecting Your API with Redis by Jane Paek - Redis Day Seattle 2020
 
The Happy Marriage of Redis and Protobuf by Scott Haines of Twilio - Redis Da...
The Happy Marriage of Redis and Protobuf by Scott Haines of Twilio - Redis Da...The Happy Marriage of Redis and Protobuf by Scott Haines of Twilio - Redis Da...
The Happy Marriage of Redis and Protobuf by Scott Haines of Twilio - Redis Da...
 
SQL, Redis and Kubernetes by Paul Stanton of Windocks - Redis Day Seattle 2020
SQL, Redis and Kubernetes by Paul Stanton of Windocks - Redis Day Seattle 2020SQL, Redis and Kubernetes by Paul Stanton of Windocks - Redis Day Seattle 2020
SQL, Redis and Kubernetes by Paul Stanton of Windocks - Redis Day Seattle 2020
 
Rust and Redis - Solving Problems for Kubernetes by Ravi Jagannathan of VMwar...
Rust and Redis - Solving Problems for Kubernetes by Ravi Jagannathan of VMwar...Rust and Redis - Solving Problems for Kubernetes by Ravi Jagannathan of VMwar...
Rust and Redis - Solving Problems for Kubernetes by Ravi Jagannathan of VMwar...
 
Redis for Data Science and Engineering by Dmitry Polyakovsky of Oracle
Redis for Data Science and Engineering by Dmitry Polyakovsky of OracleRedis for Data Science and Engineering by Dmitry Polyakovsky of Oracle
Redis for Data Science and Engineering by Dmitry Polyakovsky of Oracle
 
Practical Use Cases for ACLs in Redis 6 by Jamie Scott - Redis Day Seattle 2020
Practical Use Cases for ACLs in Redis 6 by Jamie Scott - Redis Day Seattle 2020Practical Use Cases for ACLs in Redis 6 by Jamie Scott - Redis Day Seattle 2020
Practical Use Cases for ACLs in Redis 6 by Jamie Scott - Redis Day Seattle 2020
 
Moving Beyond Cache by Yiftach Shoolman Redis Labs - Redis Day Seattle 2020
Moving Beyond Cache by Yiftach Shoolman Redis Labs - Redis Day Seattle 2020Moving Beyond Cache by Yiftach Shoolman Redis Labs - Redis Day Seattle 2020
Moving Beyond Cache by Yiftach Shoolman Redis Labs - Redis Day Seattle 2020
 
Leveraging Redis for System Monitoring by Adam McCormick of SBG - Redis Day S...
Leveraging Redis for System Monitoring by Adam McCormick of SBG - Redis Day S...Leveraging Redis for System Monitoring by Adam McCormick of SBG - Redis Day S...
Leveraging Redis for System Monitoring by Adam McCormick of SBG - Redis Day S...
 
JSON in Redis - When to use RedisJSON by Jay Won of Coupang - Redis Day Seatt...
JSON in Redis - When to use RedisJSON by Jay Won of Coupang - Redis Day Seatt...JSON in Redis - When to use RedisJSON by Jay Won of Coupang - Redis Day Seatt...
JSON in Redis - When to use RedisJSON by Jay Won of Coupang - Redis Day Seatt...
 
Highly Available Persistent Session Management Service by Mohamed Elmergawi o...
Highly Available Persistent Session Management Service by Mohamed Elmergawi o...Highly Available Persistent Session Management Service by Mohamed Elmergawi o...
Highly Available Persistent Session Management Service by Mohamed Elmergawi o...
 
Anatomy of a Redis Command by Madelyn Olson of Amazon Web Services - Redis Da...
Anatomy of a Redis Command by Madelyn Olson of Amazon Web Services - Redis Da...Anatomy of a Redis Command by Madelyn Olson of Amazon Web Services - Redis Da...
Anatomy of a Redis Command by Madelyn Olson of Amazon Web Services - Redis Da...
 
Building a Multi-dimensional Analytics Engine with RedisGraph by Matthew Goos...
Building a Multi-dimensional Analytics Engine with RedisGraph by Matthew Goos...Building a Multi-dimensional Analytics Engine with RedisGraph by Matthew Goos...
Building a Multi-dimensional Analytics Engine with RedisGraph by Matthew Goos...
 
RediSearch 1.6 by Pieter Cailliau - Redis Day Bangalore 2020
RediSearch 1.6 by Pieter Cailliau - Redis Day Bangalore 2020RediSearch 1.6 by Pieter Cailliau - Redis Day Bangalore 2020
RediSearch 1.6 by Pieter Cailliau - Redis Day Bangalore 2020
 
RedisGraph 2.0 by Pieter Cailliau - Redis Day Bangalore 2020
RedisGraph 2.0 by Pieter Cailliau - Redis Day Bangalore 2020RedisGraph 2.0 by Pieter Cailliau - Redis Day Bangalore 2020
RedisGraph 2.0 by Pieter Cailliau - Redis Day Bangalore 2020
 
RedisTimeSeries 1.2 by Pieter Cailliau - Redis Day Bangalore 2020
RedisTimeSeries 1.2 by Pieter Cailliau - Redis Day Bangalore 2020RedisTimeSeries 1.2 by Pieter Cailliau - Redis Day Bangalore 2020
RedisTimeSeries 1.2 by Pieter Cailliau - Redis Day Bangalore 2020
 
RedisAI 0.9 by Sherin Thomas of Tensorwerk - Redis Day Bangalore 2020
RedisAI 0.9 by Sherin Thomas of Tensorwerk - Redis Day Bangalore 2020RedisAI 0.9 by Sherin Thomas of Tensorwerk - Redis Day Bangalore 2020
RedisAI 0.9 by Sherin Thomas of Tensorwerk - Redis Day Bangalore 2020
 
Rate-Limiting 30 Million requests by Vijay Lakshminarayanan and Girish Koundi...
Rate-Limiting 30 Million requests by Vijay Lakshminarayanan and Girish Koundi...Rate-Limiting 30 Million requests by Vijay Lakshminarayanan and Girish Koundi...
Rate-Limiting 30 Million requests by Vijay Lakshminarayanan and Girish Koundi...
 
Three Pillars of Observability by Rajalakshmi Raji Srinivasan of Site24x7 Zoh...
Three Pillars of Observability by Rajalakshmi Raji Srinivasan of Site24x7 Zoh...Three Pillars of Observability by Rajalakshmi Raji Srinivasan of Site24x7 Zoh...
Three Pillars of Observability by Rajalakshmi Raji Srinivasan of Site24x7 Zoh...
 
Solving Complex Scaling Problems by Prashant Kumar and Abhishek Jain of Myntr...
Solving Complex Scaling Problems by Prashant Kumar and Abhishek Jain of Myntr...Solving Complex Scaling Problems by Prashant Kumar and Abhishek Jain of Myntr...
Solving Complex Scaling Problems by Prashant Kumar and Abhishek Jain of Myntr...
 

Recently uploaded

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 

Recently uploaded (20)

04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 

RedisConf18 - CRDTs and Redis - From sequential to concurrent executions

  • 1. CRDTs and Redis From sequential to concurrent executions Carlos Baquero HASLab, INESC TEC & Univ. Minho, Portugal April 26th 2018
  • 2. The speed of communication in the 19th century W. H. Harrison’s death “At 12:30 am on April 4th, 1841 President William Henry Harrison died of pneumonia just a month after taking office. The Rich- mond Enquirer published the news of his death two days later on April 6th. The North- Carolina standard newspaper published it on April 14th. His death wasn’t known of in Los Angeles until July 23rd, 110 days after it had occurred.” Text by Zack Bloom, A Quick History of Digital Communication Before the Internet. https://eager.io/blog/communication-pre-internet/ Picture by By Albert Sands Southworth and Josiah Johnson Hawes
  • 3. The speed of communication in the 19th century Francis Galton Isochronic Map
  • 4. The speed of communication in the 21st century RTT data gathered via http://www.azurespeed.com
  • 5. The speed of communication in the 21st century If you really like long latencies . . . Time delay between Mars and Earth blogs.esa.int/mex/2012/08/05/time-delay-between-mars-and-earth/ Delay/Disruption Tolerant Networking www.nasa.gov/content/dtn
  • 6. Latency magnitudes Geo-replication λ, up to 50ms (local region DC) Λ, between 100ms and 300ms (inter-continental) No inter-DC replication Client writes observe λ latency Planet-wide geo-replication Replication techniques versus client side write latency ranges Consensus/Paxos [Λ, 2Λ] (with no divergence) Primary-Backup [λ, Λ] (asynchronous version) Multi-Master λ (allowing divergence)
  • 7. EC and CAP for Geo-Replication Eventually Consistent. CACM 2009, Werner Vogels In an ideal world there would be only one consistency model: when an update is made all observers would see that update. Building reliable distributed systems at a worldwide scale demands trade-offs between consistency and availability. CAP theorem. PODC 2000, Eric Brewer Of three properties of shared-data systems – data consistency, system availability, and tolerance to network partition – only two can be achieved at any given time. CRDTs provide support for partition-tolerant high availability
  • 8. From sequential to concurrent executions Consensus and Primary-Backup provide illusion of a single replica This also preserves some illusion of sequential behaviour Sequential execution Ops O o // p // q Time // We have an ordered set (O, <). O = {o, p, q} and o < p < q
  • 9. From sequential to concurrent executions Consensus and Primary-Backup provide illusion of a single replica This also preserves some illusion of sequential behaviour Sequential execution Ops O o // p // q Time // We have an ordered set (O, <). O = {o, p, q} and o < p < q
  • 10. From sequential to concurrent executions EC Multi-master (or active-active) can expose concurrency Concurrent execution p // q Ops O o ?? r s 77 Time // Ordered set (O, ). o p q r and o s r Some ops in O are concurrent: p s and q s
  • 11. Design of Conflict-Free Replicated Data Types A full partially ordered log of operations can implement any CRDT Replicas keep increasing local views of an evolving distributed log Any query can be expressed from that log Example: Counter at i is |{inc | inc ∈ Oi }| − |{dec | dec ∈ Oi }| CRDTs are efficient representations that follow some general rules
  • 12. Design of Conflict-Free Replicated Data Types A full partially ordered log of operations can implement any CRDT Replicas keep increasing local views of an evolving distributed log Any query can be expressed from that log Example: Counter at i is |{inc | inc ∈ Oi }| − |{dec | dec ∈ Oi }| CRDTs are efficient representations that follow some general rules
  • 13. Design of Conflict-Free Replicated Data Types A full partially ordered log of operations can implement any CRDT Replicas keep increasing local views of an evolving distributed log Any query can be expressed from that log Example: Counter at i is |{inc | inc ∈ Oi }| − |{dec | dec ∈ Oi }| CRDTs are efficient representations that follow some general rules
  • 14. Principle of permutation equivalence If operations in sequence can commute, preserving a given result, then under concurrency they should preserve the same result Sequential inc(10) // inc(35) // dec(5) // inc(2) dec(5) // inc(2) // inc(10) // inc(35) Concurrent inc(35) inc(10) 88 inc(2) dec(5) 88 You guessed: Result is 42
  • 15. Implementing Counters Example: CRDT PNCounters A inc(35) B inc(10) 88 inc(2) C dec(5) 88 Lets track total number of incs and decs done at each replica {A(incs, decs), . . . , C(. . . , . . .)}
  • 16. Implementing Counters Example: CRDT PNCounters Separate positive and negative counts are kept per replica A {A(35, 0), B(10, 0)} ++ B {B(10, 0)} 66 (( {A(35, 0), B(12, 0), C(0, 5)} C {B(10, 0), C(0, 5)} 33 Joining does point-wise maximums among entries (semilattice) At any time, counter value is sum of incs minus the sum of decs
  • 17. Implementing Counters Example: CRDT PNCounters Separate positive and negative counts are kept per replica A {A(35, 0), B(10, 0)} ++ B {B(10, 0)} 66 (( {A(35, 0), B(12, 0), C(0, 5)} C {B(10, 0), C(0, 5)} 33 Joining does point-wise maximums among entries (semilattice) At any time, counter value is sum of incs minus the sum of decs
  • 18. Implementing Counters Redis CRDT Counters There are multiple ways to implement CRDT counters Redis has a distinct implementation that favours garbage collection Redis CRDT counters are 59 bits (not 64) to avoid overflows
  • 19. Registers Registers are an ordered set of write operations Sequential execution A wr(x) // wr(j) // wr(k) // wr(x) Sequential execution under distribution A wr(x) %% wr(x) B wr(j) // wr(k) 99 Register value is x, the last written value
  • 20. Implementing Registers Naif Last-Writer-Wins CRDT register implemented by attaching local wall-clock times Sequential execution under distribution A (11 : 00)x (( (11 : 30)? B (12 : 02)j // (12 : 05)k 66 Problem: Wall-clock on B is one hour ahead of A Value x might not be writeable again at A since 12:05 11:30
  • 21. Registers Sequential Semantics Register shows value v at replica i iff wr(v) ∈ Oi and wr( ) ∈ Oi · wr(v) wr( )
  • 22. Preservation of sequential semantics Concurrent semantics should preserve the sequential semantics This also ensures correct sequential execution under distribution
  • 23. Multi-value Registers Concurrency semantics shows all concurrent values {v | wr(v) ∈ Oi ∧ wr( ) ∈ Oi · wr(v) wr( )} Concurrent execution A wr(x) %% // wr(y) // {y, k} // wr(m) // {m} B wr(j) // wr(k) 99 Dynamo shopping carts are multi-value registers with payload sets The m value could be an application level merge of values y and k
  • 24. Implementing Multi-value Registers Concurrency can be preciselly tracked with version vectors Concurrent execution (version vectors) A [1, 0]x %% // [2, 0]y // [2, 0]y, [1, 2]k // [3, 2]m B [1, 1]j // [1, 2]k 66 Metadata can be compressed with a common causal context and a single scalar per value (dotted version vectors)
  • 25. Registers in Redis LWW arbitration Multi-value registers allows executions leading to concurrent values Presenting concurrent values is at odds with the sequential API Redis both tracks causality and registers wall-clock times Querying uses Last-Writer-Wins selection among concurrent values This preserves correctness of sequential semantics A value with clock 12:05 can still be causally overwritten at 11:30
  • 26. Sets Sequential Semantics Consider add and rmv operations X = {. . .}, add(a) −→ add(c) we observe that a, c ∈ X X = {. . .}, add(c) −→ rmv(c) we observe that c ∈ X In general, given Oi , the set has elements {e | add(e) ∈ Oi ∧ rmv(e) ∈ Oi · add(e) rmv(e)}
  • 27. Sets Sequential Semantics Consider add and rmv operations X = {. . .}, add(a) −→ add(c) we observe that a, c ∈ X X = {. . .}, add(c) −→ rmv(c) we observe that c ∈ X In general, given Oi , the set has elements {e | add(e) ∈ Oi ∧ rmv(e) ∈ Oi · add(e) rmv(e)}
  • 28. Sets Sequential Semantics Consider add and rmv operations X = {. . .}, add(a) −→ add(c) we observe that a, c ∈ X X = {. . .}, add(c) −→ rmv(c) we observe that c ∈ X In general, given Oi , the set has elements {e | add(e) ∈ Oi ∧ rmv(e) ∈ Oi · add(e) rmv(e)}
  • 29. Sets Sequential Semantics Consider add and rmv operations X = {. . .}, add(a) −→ add(c) we observe that a, c ∈ X X = {. . .}, add(c) −→ rmv(c) we observe that c ∈ X In general, given Oi , the set has elements {e | add(e) ∈ Oi ∧ rmv(e) ∈ Oi · add(e) rmv(e)}
  • 30. Sets Sequential Semantics Consider add and rmv operations X = {. . .}, add(a) −→ add(c) we observe that a, c ∈ X X = {. . .}, add(c) −→ rmv(c) we observe that c ∈ X In general, given Oi , the set has elements {e | add(e) ∈ Oi ∧ rmv(e) ∈ Oi · add(e) rmv(e)}
  • 31. Sets Concurrency Semantics Problem: Concurrently adding and removing the same element Concurrent execution A add(x) %% // rmv(x) // {?} // add(x) // {x} B rmv(x) // add(x) ::
  • 32. Concurrency Semantics Add-Wins Sets Lets choose Add-Wins Consider a set of known operations Oi , at node i, that is ordered by an happens-before partial order . Set has elements {e | add(e) ∈ Oi ∧ rmv(e) ∈ Oi · add(e) rmv(e)} Is this familiar? The sequential semantics applies identical rules on a total order Redis CRDT sets are Add-Wins Sets
  • 33. Concurrency Semantics Add-Wins Sets Lets choose Add-Wins Consider a set of known operations Oi , at node i, that is ordered by an happens-before partial order . Set has elements {e | add(e) ∈ Oi ∧ rmv(e) ∈ Oi · add(e) rmv(e)} Is this familiar? The sequential semantics applies identical rules on a total order Redis CRDT sets are Add-Wins Sets
  • 34. Concurrency Semantics Add-Wins Sets Lets choose Add-Wins Consider a set of known operations Oi , at node i, that is ordered by an happens-before partial order . Set has elements {e | add(e) ∈ Oi ∧ rmv(e) ∈ Oi · add(e) rmv(e)} Is this familiar? The sequential semantics applies identical rules on a total order Redis CRDT sets are Add-Wins Sets
  • 35. Concurrency Semantics Add-Wins Sets Lets choose Add-Wins Consider a set of known operations Oi , at node i, that is ordered by an happens-before partial order . Set has elements {e | add(e) ∈ Oi ∧ rmv(e) ∈ Oi · add(e) rmv(e)} Is this familiar? The sequential semantics applies identical rules on a total order Redis CRDT sets are Add-Wins Sets
  • 36. Equivalence to a sequential execution? Add-Wins Sets Can we always explain a concurrent execution by a sequential one? Concurrent execution A {x, y} // add(y) // rmv(x) // {y} // ## {x, y} B {x, y} // add(x) // rmv(y) // {x} // ;; {x, y} Two (failed) sequential explanations H1 {x, y} // . . . // rmv(x) // { x, y} H2 {x, y} // . . . // rmv(y) // {x, y} Concurrent executions can have richer outcomes
  • 37. Concurrency Semantics Remove-Wins Sets Alternative: Lets choose Remove-Wins Xi . = {e | add(e) ∈ Oi ∧ ∀ rmv(e) ∈ Oi · rmv(e) add(e)} Remove-Wins requires more metadata than Add-Wins Both Add and Remove-Wins have same semantics in a total order They are different but both preserve sequential semantics
  • 38. Concurrency Semantics Remove-Wins Sets Alternative: Lets choose Remove-Wins Xi . = {e | add(e) ∈ Oi ∧ ∀ rmv(e) ∈ Oi · rmv(e) add(e)} Remove-Wins requires more metadata than Add-Wins Both Add and Remove-Wins have same semantics in a total order They are different but both preserve sequential semantics
  • 39. Choice of semantics Design freedom is limited by preservation of sequential semantics Delaying choice of semantics to query time A CRDT Set data type could store enough information to allow a parametrized query that shows either Add-Wins or Remove-Wins Open question: Reason for an extended concurrency aware API?
  • 40. Sequence/List Weak/Strong Specification [Attiya et al, PODC 16] Element x is kept rpush(b) // xb $$ // lpush(x) // x 99 %% axb lpush(a) // ax :: Element x is removed (Redis enforces Strong Specification) rpush(b) // xb ## // lpush(x) // x :: $$ // rem(x) // // ab ¬ ba lpush(a) // ax ;;
  • 41. Causal Consistency Redis CRDTs provide per-key causal consistency Source FIFO A apple %% B apple pie ## hot ## C pie hot peach? apple Causal consistency A apple %% B apple %% pie $$ hot C apple pie hot tasty Strongest highly available consistency model
  • 42. Thanks and Questions Thanks to Yiftach, Yuval, Yossi, Meir, Cihan for their inputs Glad to try to answer any questions Carlos Baquero, cbm@di.uminho.pt, @xmal #Redis #RedisConf