SlideShare a Scribd company logo
1 of 24
Download to read offline
May 2016
Raft in details
Ivan Glushkov
ivan.glushkov@gmail.com
@gliush
Raft is a consensus algorithm for managing a
replicated log
Why
❖ Very important topic
❖ [Multi-]Paxos, the key player, is very difficult to
understand
❖ Paxos -> Multi-Paxos -> different approaches
❖ Different optimization, closed-source implementation
Raft goals
❖ Main goal - understandability:
❖ Decomposition: separated leader election, log
replication, safety, membership changes
❖ State space reduction
Replicated state machines
❖ The same state on each server
❖ Compute the state from replicated log
❖ Consensus algorithm: keep the replicated log consistent
Properties of consensus algorithm
❖ Never return an incorrect result: Safety
❖ Functional as long as any majority of the severs are
operational: Availability
❖ Do not depend on timing to ensure consistency (at
worst - unavailable)
Raft in a nutshell
❖ Elect a leader
❖ Give full responsibility for replicated log to leader:
❖ Leader accepts log entries from client
❖ Leader replicates log entries on other servers
❖ Tell servers when it is safe to apply changes on their
state
Raft basics: States
❖ Follower
❖ Candidate
❖ Leader
RequestVote RPC
❖ Become a candidate, if no communication from leader over
a timeout (random election timeout: e.g. 150-300ms)
❖ Spec: (term, candidateId, lastLogIndex, lastLogTerm) 

-> (term, voteGranted)
❖ Receiver:
❖ false if term < currentTerm
❖ true if not voted yet and term and log are Up-To-Date
❖ false
Leader Election
❖ Heartbeats from leader
❖ Election timeout
❖ Candidate results:
❖ It wins
❖ Another server wins
❖ No winner for a period of time -> new election
Leader Election Property
❖ Election Safety: at most one leader can be elected in a
given term
Log Replication
❖ log index
❖ log term
❖ log command
Log Replication
❖ Leader appends command from client to its log
❖ Leader issues AppendEntries RPC to replicate the entry
❖ Leader replicates entry to majority -> apply entry to
state -> “committed entry”
❖ Follower checks if entry is committed by server ->
commit it locally
Log Replication Properties
❖ Leader Append-Only: a leader never overwrites or
deletes entries in its log; it only appends new entries
❖ Log Matching: If two entries in different logs have the
same index and term, then they store the same
command.
❖ Log Matching: If two entries in different logs have the
same index and term, then the logs are identical in all
preceding entries.
Inconsistency
Solving Inconsistency
❖ Leader forces the followers’ logs to duplicate its own
❖ Conflicting entries in the follower logs will be
overwritten with entries from the Leader’s log:
❖ Find latest log where two servers agree
❖ Remove everything after this point
❖ Write new logs
Safety Property
❖ Leader Completeness: if a log entry is committed in a
given term, then that entry will be present in the logs of
the leaders for all higher-numbered terms
❖ State Machine Safety: if a server has applied a log entry
at a given index to its state machine, no other server will
ever apply a different log entry for the same index.
Safety Restriction
❖ Up-to-date: if the logs have last entries with different
terms, then the log with the later term is more up-to-
date. If the logs end with the same term, then whichever
log is longer is more up-to-date.
time
Follower and Candidate Crashes
❖ Retry indefinitely
❖ Raft RPC are idempotent
Raft Joint Consensus
❖ Two phase to change membership configuration
❖ In Joint Consensus state:
❖ replicate log entries to both configuration
❖ any server from either configuration may be a leader
❖ agreement (for election and commitment) requires
separate majorities from both configuration
Log compaction
❖ Independent

snapshotting
❖ Snapshot metadata
❖ InstallSnapshot RPC
Client Interaction
❖ Works with leader
❖ Leader respond to a request when it commits an entry
❖ Assign uniqueID to every command, leader stores latest
ID with response.
❖ Read-only - could be without writing to a log, possible
stale data. Or write “no-op” entry at start
Correctness
❖ TLA+ formal specification for consensus alg (400 LOC)
❖ Proof of linearizability in Raft using Coq (community)
Questions?

More Related Content

What's hot

Putting Kafka Into Overdrive
Putting Kafka Into OverdrivePutting Kafka Into Overdrive
Putting Kafka Into OverdriveTodd Palino
 
Storage As A Service (StAAS)
Storage As A Service (StAAS)Storage As A Service (StAAS)
Storage As A Service (StAAS)Shreyans Jain
 
Database , 13 Replication
Database , 13 ReplicationDatabase , 13 Replication
Database , 13 ReplicationAli Usman
 
What is gRPC introduction gRPC Explained
What is gRPC introduction gRPC ExplainedWhat is gRPC introduction gRPC Explained
What is gRPC introduction gRPC Explainedjeetendra mandal
 
Building your First gRPC Service
Building your First gRPC ServiceBuilding your First gRPC Service
Building your First gRPC ServiceJessie Barnett
 
MPI message passing interface
MPI message passing interfaceMPI message passing interface
MPI message passing interfaceMohit Raghuvanshi
 
Apache Kafka vs RabbitMQ: Fit For Purpose / Decision Tree
Apache Kafka vs RabbitMQ: Fit For Purpose / Decision TreeApache Kafka vs RabbitMQ: Fit For Purpose / Decision Tree
Apache Kafka vs RabbitMQ: Fit For Purpose / Decision TreeSlim Baltagi
 
Linux Terminal commands for Devops.pdf
Linux Terminal commands for Devops.pdfLinux Terminal commands for Devops.pdf
Linux Terminal commands for Devops.pdfNambi Nam
 
file system in operating system
file system in operating systemfile system in operating system
file system in operating systemtittuajay
 
Disaster Recovery Options Running Apache Kafka in Kubernetes with Rema Subra...
 Disaster Recovery Options Running Apache Kafka in Kubernetes with Rema Subra... Disaster Recovery Options Running Apache Kafka in Kubernetes with Rema Subra...
Disaster Recovery Options Running Apache Kafka in Kubernetes with Rema Subra...HostedbyConfluent
 
Kafka and Avro with Confluent Schema Registry
Kafka and Avro with Confluent Schema RegistryKafka and Avro with Confluent Schema Registry
Kafka and Avro with Confluent Schema RegistryJean-Paul Azar
 
Generations Of Programming Languages
Generations Of Programming LanguagesGenerations Of Programming Languages
Generations Of Programming Languagespy7rjs
 
Chapter 5: Names, Bindings and Scopes (review Questions and Problem Set)
Chapter 5: Names, Bindings and Scopes (review Questions and Problem Set)Chapter 5: Names, Bindings and Scopes (review Questions and Problem Set)
Chapter 5: Names, Bindings and Scopes (review Questions and Problem Set)Farwa Ansari
 
Distributed Operating System
Distributed Operating SystemDistributed Operating System
Distributed Operating SystemSanthiNivas
 
Exactly-Once Made Easy: Transactional Messaging in Apache Pulsar - Pulsar Sum...
Exactly-Once Made Easy: Transactional Messaging in Apache Pulsar - Pulsar Sum...Exactly-Once Made Easy: Transactional Messaging in Apache Pulsar - Pulsar Sum...
Exactly-Once Made Easy: Transactional Messaging in Apache Pulsar - Pulsar Sum...StreamNative
 
Google File Systems
Google File SystemsGoogle File Systems
Google File SystemsAzeem Mumtaz
 
Memory consistency models and basics
Memory consistency models and basicsMemory consistency models and basics
Memory consistency models and basicsRamdas Mozhikunnath
 
Resilient service to-service calls in a post-Hystrix world
Resilient service to-service calls in a post-Hystrix worldResilient service to-service calls in a post-Hystrix world
Resilient service to-service calls in a post-Hystrix worldRares Musina
 

What's hot (20)

Putting Kafka Into Overdrive
Putting Kafka Into OverdrivePutting Kafka Into Overdrive
Putting Kafka Into Overdrive
 
Storage As A Service (StAAS)
Storage As A Service (StAAS)Storage As A Service (StAAS)
Storage As A Service (StAAS)
 
Database , 13 Replication
Database , 13 ReplicationDatabase , 13 Replication
Database , 13 Replication
 
What is gRPC introduction gRPC Explained
What is gRPC introduction gRPC ExplainedWhat is gRPC introduction gRPC Explained
What is gRPC introduction gRPC Explained
 
Password Strength
Password StrengthPassword Strength
Password Strength
 
Building your First gRPC Service
Building your First gRPC ServiceBuilding your First gRPC Service
Building your First gRPC Service
 
MPI message passing interface
MPI message passing interfaceMPI message passing interface
MPI message passing interface
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
 
Apache Kafka vs RabbitMQ: Fit For Purpose / Decision Tree
Apache Kafka vs RabbitMQ: Fit For Purpose / Decision TreeApache Kafka vs RabbitMQ: Fit For Purpose / Decision Tree
Apache Kafka vs RabbitMQ: Fit For Purpose / Decision Tree
 
Linux Terminal commands for Devops.pdf
Linux Terminal commands for Devops.pdfLinux Terminal commands for Devops.pdf
Linux Terminal commands for Devops.pdf
 
file system in operating system
file system in operating systemfile system in operating system
file system in operating system
 
Disaster Recovery Options Running Apache Kafka in Kubernetes with Rema Subra...
 Disaster Recovery Options Running Apache Kafka in Kubernetes with Rema Subra... Disaster Recovery Options Running Apache Kafka in Kubernetes with Rema Subra...
Disaster Recovery Options Running Apache Kafka in Kubernetes with Rema Subra...
 
Kafka and Avro with Confluent Schema Registry
Kafka and Avro with Confluent Schema RegistryKafka and Avro with Confluent Schema Registry
Kafka and Avro with Confluent Schema Registry
 
Generations Of Programming Languages
Generations Of Programming LanguagesGenerations Of Programming Languages
Generations Of Programming Languages
 
Chapter 5: Names, Bindings and Scopes (review Questions and Problem Set)
Chapter 5: Names, Bindings and Scopes (review Questions and Problem Set)Chapter 5: Names, Bindings and Scopes (review Questions and Problem Set)
Chapter 5: Names, Bindings and Scopes (review Questions and Problem Set)
 
Distributed Operating System
Distributed Operating SystemDistributed Operating System
Distributed Operating System
 
Exactly-Once Made Easy: Transactional Messaging in Apache Pulsar - Pulsar Sum...
Exactly-Once Made Easy: Transactional Messaging in Apache Pulsar - Pulsar Sum...Exactly-Once Made Easy: Transactional Messaging in Apache Pulsar - Pulsar Sum...
Exactly-Once Made Easy: Transactional Messaging in Apache Pulsar - Pulsar Sum...
 
Google File Systems
Google File SystemsGoogle File Systems
Google File Systems
 
Memory consistency models and basics
Memory consistency models and basicsMemory consistency models and basics
Memory consistency models and basics
 
Resilient service to-service calls in a post-Hystrix world
Resilient service to-service calls in a post-Hystrix worldResilient service to-service calls in a post-Hystrix world
Resilient service to-service calls in a post-Hystrix world
 

Similar to Raft in details

Consensus algo with_distributed_key_value_store_in_distributed_system
Consensus algo with_distributed_key_value_store_in_distributed_systemConsensus algo with_distributed_key_value_store_in_distributed_system
Consensus algo with_distributed_key_value_store_in_distributed_systemAtin Mukherjee
 
Manging scalability of distributed system
Manging scalability of distributed systemManging scalability of distributed system
Manging scalability of distributed systemAtin Mukherjee
 
Module: Mutable Content in IPFS
Module: Mutable Content in IPFSModule: Mutable Content in IPFS
Module: Mutable Content in IPFSIoannis Psaras
 
Coordination in distributed systems
Coordination in distributed systemsCoordination in distributed systems
Coordination in distributed systemsAndrea Monacchi
 
New awesome features in MySQL 5.7
New awesome features in MySQL 5.7New awesome features in MySQL 5.7
New awesome features in MySQL 5.7Zhaoyang Wang
 
Unveiling etcd: Architecture and Source Code Deep Dive
Unveiling etcd: Architecture and Source Code Deep DiveUnveiling etcd: Architecture and Source Code Deep Dive
Unveiling etcd: Architecture and Source Code Deep DiveChieh (Jack) Yu
 
Top 10 interview question and answers for mcsa
Top 10 interview question and answers for mcsaTop 10 interview question and answers for mcsa
Top 10 interview question and answers for mcsahopesuresh
 
NFSv4 Replication for Grid Computing
NFSv4 Replication for Grid ComputingNFSv4 Replication for Grid Computing
NFSv4 Replication for Grid Computingpeterhoneyman
 
Loadbalancing In-depth study for scale @ 80K TPS
Loadbalancing In-depth study for scale @ 80K TPSLoadbalancing In-depth study for scale @ 80K TPS
Loadbalancing In-depth study for scale @ 80K TPSShrey Agarwal
 
Distributed transactions
Distributed transactionsDistributed transactions
Distributed transactionsAritra Das
 
M|18 Choosing the Right High Availability Strategy for You
M|18 Choosing the Right High Availability Strategy for YouM|18 Choosing the Right High Availability Strategy for You
M|18 Choosing the Right High Availability Strategy for YouMariaDB plc
 
Thriftypaxos
ThriftypaxosThriftypaxos
ThriftypaxosYongraeJo
 
February 2017 HUG: Exactly-once end-to-end processing with Apache Apex
February 2017 HUG: Exactly-once end-to-end processing with Apache ApexFebruary 2017 HUG: Exactly-once end-to-end processing with Apache Apex
February 2017 HUG: Exactly-once end-to-end processing with Apache ApexYahoo Developer Network
 
Raft_Diego_Ongaro.pdf
Raft_Diego_Ongaro.pdfRaft_Diego_Ongaro.pdf
Raft_Diego_Ongaro.pdfjsathy97
 
Denser, cooler, faster, stronger: PHP on ARM microservers
Denser, cooler, faster, stronger: PHP on ARM microserversDenser, cooler, faster, stronger: PHP on ARM microservers
Denser, cooler, faster, stronger: PHP on ARM microserversJez Halford
 
ONOS Platform Architecture
ONOS Platform ArchitectureONOS Platform Architecture
ONOS Platform ArchitectureOpenDaylight
 
Scylla Summit 2018: Consensus in Eventually Consistent Databases
Scylla Summit 2018: Consensus in Eventually Consistent DatabasesScylla Summit 2018: Consensus in Eventually Consistent Databases
Scylla Summit 2018: Consensus in Eventually Consistent DatabasesScyllaDB
 

Similar to Raft in details (20)

Consensus algo with_distributed_key_value_store_in_distributed_system
Consensus algo with_distributed_key_value_store_in_distributed_systemConsensus algo with_distributed_key_value_store_in_distributed_system
Consensus algo with_distributed_key_value_store_in_distributed_system
 
Manging scalability of distributed system
Manging scalability of distributed systemManging scalability of distributed system
Manging scalability of distributed system
 
Module: Mutable Content in IPFS
Module: Mutable Content in IPFSModule: Mutable Content in IPFS
Module: Mutable Content in IPFS
 
Coordination in distributed systems
Coordination in distributed systemsCoordination in distributed systems
Coordination in distributed systems
 
New awesome features in MySQL 5.7
New awesome features in MySQL 5.7New awesome features in MySQL 5.7
New awesome features in MySQL 5.7
 
Unveiling etcd: Architecture and Source Code Deep Dive
Unveiling etcd: Architecture and Source Code Deep DiveUnveiling etcd: Architecture and Source Code Deep Dive
Unveiling etcd: Architecture and Source Code Deep Dive
 
Top 10 interview question and answers for mcsa
Top 10 interview question and answers for mcsaTop 10 interview question and answers for mcsa
Top 10 interview question and answers for mcsa
 
NFSv4 Replication for Grid Computing
NFSv4 Replication for Grid ComputingNFSv4 Replication for Grid Computing
NFSv4 Replication for Grid Computing
 
Loadbalancing In-depth study for scale @ 80K TPS
Loadbalancing In-depth study for scale @ 80K TPSLoadbalancing In-depth study for scale @ 80K TPS
Loadbalancing In-depth study for scale @ 80K TPS
 
Distributed transactions
Distributed transactionsDistributed transactions
Distributed transactions
 
Real-Time Streaming Protocol -QOS
Real-Time Streaming Protocol -QOSReal-Time Streaming Protocol -QOS
Real-Time Streaming Protocol -QOS
 
M|18 Choosing the Right High Availability Strategy for You
M|18 Choosing the Right High Availability Strategy for YouM|18 Choosing the Right High Availability Strategy for You
M|18 Choosing the Right High Availability Strategy for You
 
Thriftypaxos
ThriftypaxosThriftypaxos
Thriftypaxos
 
February 2017 HUG: Exactly-once end-to-end processing with Apache Apex
February 2017 HUG: Exactly-once end-to-end processing with Apache ApexFebruary 2017 HUG: Exactly-once end-to-end processing with Apache Apex
February 2017 HUG: Exactly-once end-to-end processing with Apache Apex
 
Raft_Diego_Ongaro.pdf
Raft_Diego_Ongaro.pdfRaft_Diego_Ongaro.pdf
Raft_Diego_Ongaro.pdf
 
Denser, cooler, faster, stronger: PHP on ARM microservers
Denser, cooler, faster, stronger: PHP on ARM microserversDenser, cooler, faster, stronger: PHP on ARM microservers
Denser, cooler, faster, stronger: PHP on ARM microservers
 
Scalable Web Apps
Scalable Web AppsScalable Web Apps
Scalable Web Apps
 
ONOS Platform Architecture
ONOS Platform ArchitectureONOS Platform Architecture
ONOS Platform Architecture
 
Megastore by Google
Megastore by GoogleMegastore by Google
Megastore by Google
 
Scylla Summit 2018: Consensus in Eventually Consistent Databases
Scylla Summit 2018: Consensus in Eventually Consistent DatabasesScylla Summit 2018: Consensus in Eventually Consistent Databases
Scylla Summit 2018: Consensus in Eventually Consistent Databases
 

More from Ivan Glushkov

Distributed tracing with erlang/elixir
Distributed tracing with erlang/elixirDistributed tracing with erlang/elixir
Distributed tracing with erlang/elixirIvan Glushkov
 
Kubernetes is not needed to 90 percents of the companies.rus
Kubernetes is not needed to 90 percents of the companies.rusKubernetes is not needed to 90 percents of the companies.rus
Kubernetes is not needed to 90 percents of the companies.rusIvan Glushkov
 
Mystery Machine Overview
Mystery Machine OverviewMystery Machine Overview
Mystery Machine OverviewIvan Glushkov
 
Google Dataflow Intro
Google Dataflow IntroGoogle Dataflow Intro
Google Dataflow IntroIvan Glushkov
 
NewSQL overview, Feb 2015
NewSQL overview, Feb 2015NewSQL overview, Feb 2015
NewSQL overview, Feb 2015Ivan Glushkov
 
Comparing ZooKeeper and Consul
Comparing ZooKeeper and ConsulComparing ZooKeeper and Consul
Comparing ZooKeeper and ConsulIvan Glushkov
 

More from Ivan Glushkov (8)

Distributed tracing with erlang/elixir
Distributed tracing with erlang/elixirDistributed tracing with erlang/elixir
Distributed tracing with erlang/elixir
 
Kubernetes is not needed to 90 percents of the companies.rus
Kubernetes is not needed to 90 percents of the companies.rusKubernetes is not needed to 90 percents of the companies.rus
Kubernetes is not needed to 90 percents of the companies.rus
 
Mystery Machine Overview
Mystery Machine OverviewMystery Machine Overview
Mystery Machine Overview
 
Hashicorp Nomad
Hashicorp NomadHashicorp Nomad
Hashicorp Nomad
 
Google Dataflow Intro
Google Dataflow IntroGoogle Dataflow Intro
Google Dataflow Intro
 
NewSQL overview, Feb 2015
NewSQL overview, Feb 2015NewSQL overview, Feb 2015
NewSQL overview, Feb 2015
 
Comparing ZooKeeper and Consul
Comparing ZooKeeper and ConsulComparing ZooKeeper and Consul
Comparing ZooKeeper and Consul
 
fp intro
fp introfp intro
fp intro
 

Recently uploaded

+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...Health
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providermohitmore19
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionSolGuruz
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsArshad QA
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...OnePlan Solutions
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxComplianceQuest1
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️Delhi Call girls
 
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female serviceCALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female serviceanilsa9823
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...harshavardhanraghave
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxbodapatigopi8531
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsAndolasoft Inc
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsAlberto González Trastoy
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...kellynguyen01
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...ICS
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AIABDERRAOUF MEHENNI
 

Recently uploaded (20)

+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female serviceCALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptx
 
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS LiveVip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.js
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
 

Raft in details

  • 1. May 2016 Raft in details Ivan Glushkov ivan.glushkov@gmail.com @gliush
  • 2. Raft is a consensus algorithm for managing a replicated log
  • 3. Why ❖ Very important topic ❖ [Multi-]Paxos, the key player, is very difficult to understand ❖ Paxos -> Multi-Paxos -> different approaches ❖ Different optimization, closed-source implementation
  • 4. Raft goals ❖ Main goal - understandability: ❖ Decomposition: separated leader election, log replication, safety, membership changes ❖ State space reduction
  • 5. Replicated state machines ❖ The same state on each server ❖ Compute the state from replicated log ❖ Consensus algorithm: keep the replicated log consistent
  • 6. Properties of consensus algorithm ❖ Never return an incorrect result: Safety ❖ Functional as long as any majority of the severs are operational: Availability ❖ Do not depend on timing to ensure consistency (at worst - unavailable)
  • 7. Raft in a nutshell ❖ Elect a leader ❖ Give full responsibility for replicated log to leader: ❖ Leader accepts log entries from client ❖ Leader replicates log entries on other servers ❖ Tell servers when it is safe to apply changes on their state
  • 8. Raft basics: States ❖ Follower ❖ Candidate ❖ Leader
  • 9. RequestVote RPC ❖ Become a candidate, if no communication from leader over a timeout (random election timeout: e.g. 150-300ms) ❖ Spec: (term, candidateId, lastLogIndex, lastLogTerm) 
 -> (term, voteGranted) ❖ Receiver: ❖ false if term < currentTerm ❖ true if not voted yet and term and log are Up-To-Date ❖ false
  • 10. Leader Election ❖ Heartbeats from leader ❖ Election timeout ❖ Candidate results: ❖ It wins ❖ Another server wins ❖ No winner for a period of time -> new election
  • 11. Leader Election Property ❖ Election Safety: at most one leader can be elected in a given term
  • 12. Log Replication ❖ log index ❖ log term ❖ log command
  • 13. Log Replication ❖ Leader appends command from client to its log ❖ Leader issues AppendEntries RPC to replicate the entry ❖ Leader replicates entry to majority -> apply entry to state -> “committed entry” ❖ Follower checks if entry is committed by server -> commit it locally
  • 14. Log Replication Properties ❖ Leader Append-Only: a leader never overwrites or deletes entries in its log; it only appends new entries ❖ Log Matching: If two entries in different logs have the same index and term, then they store the same command. ❖ Log Matching: If two entries in different logs have the same index and term, then the logs are identical in all preceding entries.
  • 16. Solving Inconsistency ❖ Leader forces the followers’ logs to duplicate its own ❖ Conflicting entries in the follower logs will be overwritten with entries from the Leader’s log: ❖ Find latest log where two servers agree ❖ Remove everything after this point ❖ Write new logs
  • 17. Safety Property ❖ Leader Completeness: if a log entry is committed in a given term, then that entry will be present in the logs of the leaders for all higher-numbered terms ❖ State Machine Safety: if a server has applied a log entry at a given index to its state machine, no other server will ever apply a different log entry for the same index.
  • 18. Safety Restriction ❖ Up-to-date: if the logs have last entries with different terms, then the log with the later term is more up-to- date. If the logs end with the same term, then whichever log is longer is more up-to-date. time
  • 19. Follower and Candidate Crashes ❖ Retry indefinitely ❖ Raft RPC are idempotent
  • 20. Raft Joint Consensus ❖ Two phase to change membership configuration ❖ In Joint Consensus state: ❖ replicate log entries to both configuration ❖ any server from either configuration may be a leader ❖ agreement (for election and commitment) requires separate majorities from both configuration
  • 21. Log compaction ❖ Independent
 snapshotting ❖ Snapshot metadata ❖ InstallSnapshot RPC
  • 22. Client Interaction ❖ Works with leader ❖ Leader respond to a request when it commits an entry ❖ Assign uniqueID to every command, leader stores latest ID with response. ❖ Read-only - could be without writing to a log, possible stale data. Or write “no-op” entry at start
  • 23. Correctness ❖ TLA+ formal specification for consensus alg (400 LOC) ❖ Proof of linearizability in Raft using Coq (community)