REPLICATION
IN THE WILD
Ensar Basri Kahveci
Hello! Ensar Basri Kahveci
Distributed Systems Engineer @ Hazelcast
Ph.D. Candidate @ Bilkent CS
twitter: metanet | github: metanet | linkedin: basrikahveci
- IMDG: storage + computation + messaging
- Open source, distributed, highly available, elastic,
scalable
- Distributed Java collections, JCache, HD store
- Embedded or client - server deployment
- Clients: Java, Scala, C++, C#, Python, Node.js, ...
- Integration modules
- https://blog.hazelcast.com/announcing-hazelcast-imdg-3-8
- Rolling upgrades
- User code deployment
- Hot restart improvements
- WAN replication improvements
REPLICATION
- Putting a data set into
multiple nodes.
- Each replica has a full copy.
- A few reasons for replication:
- Performance
- Availability and fault tolerance
REPLICATION + PARTITIONING
- Mostly used with
partitioning.
- Two partitions: P1, P2
- Two replicas for each
partition.
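A minimal sketch of how keys might map to partitions and how each partition's replicas might be placed on distinct nodes. The hash-modulo mapping, the round-robin placement, and the names `partition_for` and `assign_replicas` are illustrative assumptions, not the scheme of any particular system.

```python
import zlib


def partition_for(key, partition_count):
    """Map a key to a partition id with a stable hash (assumed scheme)."""
    return zlib.crc32(key.encode("utf-8")) % partition_count


def assign_replicas(partition_count, nodes, replica_count):
    """Place each partition's replicas on distinct nodes, round-robin."""
    if replica_count > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    return {
        p: [nodes[(p + i) % len(nodes)] for i in range(replica_count)]
        for p in range(partition_count)
    }


# Two partitions with two replicas each, as on the slide.
# The first node in each list plays the role of that partition's primary.
placement = assign_replicas(partition_count=2, nodes=["node-a", "node-b"], replica_count=2)
```

With two nodes, each node ends up holding one replica of both partitions, so losing either node still leaves a full copy of the data.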
NOTHING FOR FREE!
- Very easy to do when the data is immutable.
- Problems start when we have multiple copies
of the data and we want to update them.
- Two main difficulties:
- Handling updates,
- Handling failures.
CAP PRINCIPLE
- Proposed by Eric Brewer in 2000 [13],
- Proved by Gilbert and Lynch in 2002 [19].
- A shared-data system cannot achieve perfect
consistency and availability in the presence of
partitions: CP vs AP.
- Widespread acceptance, and yet a lot of criticism
[15 - 21].
CONSISTENCY AND AVAILABILITY
- Degrees of consistency:
- Data centric, client centric
- Degrees of availability:
- High availability, sticky
availability, non-availability
- Replication is directly
related to C and A. [25]
THE DANGERS OF REPLICATION AND A SOLUTION
- Gray et al. [1] classify replication models by 2
parameters:
- Where to make updates: primary copy or update
anywhere
- When to make updates: eagerly or lazily
WHERE: PRIMARY COPY
- There is a single replica
managing the updates.
- Concurrency control is easy.
- No conflicts and no conflict-handling logic.
- Updates are executed on the primary and
secondaries in the same order.
- When the primary fails, a new primary is elected.
- Ensuring that there is exactly one healthy primary is hard.
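The ordering property can be sketched as follows: the primary assigns a sequence number to every update and each secondary applies updates strictly in that order. The class names and the in-process "forwarding" loop are assumptions made for brevity; failure detection and primary election are left out entirely.

```python
class Replica:
    def __init__(self):
        self.data = {}
        self.applied = 0  # sequence number of the last applied update

    def apply(self, seq, key, value):
        # Updates must arrive in the order chosen by the primary.
        if seq != self.applied + 1:
            raise RuntimeError(
                f"out-of-order update: expected {self.applied + 1}, got {seq}"
            )
        self.data[key] = value
        self.applied = seq


class Primary(Replica):
    def __init__(self, secondaries):
        super().__init__()
        self.secondaries = secondaries

    def write(self, key, value):
        # The single primary is the only place where order is decided,
        # so no conflict-handling logic is needed.
        seq = self.applied + 1
        self.apply(seq, key, value)
        for secondary in self.secondaries:
            secondary.apply(seq, key, value)
        return seq
```

Because every replica sees the same sequence, all copies converge to the same state without any merge step.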
WHERE: UPDATE ANYWHERE
- Each replica can initiate a
transaction to make an update.
- Complex concurrency control.
- Deadlocks or conflicts are
possible.
- In practice, there is also the
multi-leader variant.
WHEN: EAGER REPLICATION
- Synchronously updates all
replicas as part of one atomic
transaction.
- Provides strong consistency.
- Degree of availability can
degrade on node failures.
- Consensus algorithms.
WHEN: LAZY REPLICATION
- Updates each replica with a
separate transaction.
- Updates can execute quite fast.
- Degree of availability is high.
- Eventual consistency.
- Data copies can diverge.
- Data loss or conflicts can occur.
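A small sketch of why lazy replication is fast but can lose data: the primary acknowledges a write before the secondary has seen it. `LazyPrimary`, its backlog queue, and the `max_updates` parameter are hypothetical names used only for this illustration.

```python
from collections import deque


class LazyPrimary:
    def __init__(self):
        self.data = {}
        self.backlog = deque()  # acknowledged but not yet replicated updates

    def write(self, key, value):
        # The write is acknowledged immediately, before any replication.
        self.data[key] = value
        self.backlog.append((key, value))
        return "ack"

    def replicate(self, secondary, max_updates=None):
        """Ship queued updates to the secondary, each as its own step."""
        sent = 0
        while self.backlog and (max_updates is None or sent < max_updates):
            key, value = self.backlog.popleft()
            secondary[key] = value
            sent += 1
        return sent


primary = LazyPrimary()
secondary = {}
primary.write("a", 1)
primary.write("b", 2)
primary.replicate(secondary, max_updates=1)  # only the first update gets through
lag = len(primary.backlog)  # updates the secondary has not seen yet
```

If the primary crashed at this point and the secondary took over, the acknowledged write to `b` would be lost.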
WHERE? / WHEN?
- PRIMARY COPY + EAGER: strong consistency, simple concurrency, slow, inflexible
- UPDATE ANYWHERE + EAGER: strong consistency, complex concurrency, slow, expensive, deadlocks
- PRIMARY COPY + LAZY: fast, sticky availability, eventual consistency, simple concurrency, inconsistency
- UPDATE ANYWHERE + LAZY: fast, flexible, high availability, eventual consistency, inconsistency, conflicts
WHERE? / WHEN?
- PRIMARY COPY + EAGER: Multi Paxos [5], etcd and Consul (Raft) [6], ZooKeeper (Zab) [7], Kafka, VoltDB [24]
- UPDATE ANYWHERE + EAGER: Paxos [5], Hazelcast Cluster State Change [12], MySQL 5.7 Group Replication [23]
- PRIMARY COPY + LAZY: Hazelcast, MongoDB, ElasticSearch, Redis
- UPDATE ANYWHERE + LAZY: Dynamo [4], Cassandra, Riak, Hazelcast Active-Active WAN Replication [22]
PRIMARY COPY + EAGER REPLICATION
- When the primary fails, secondaries are
guaranteed to be up to date.
- Raft, Kafka etc.
- Majority approach can be used.
- In Kafka, in-sync-replica set [11] is maintained.
- Secondaries can be used for reads.
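The majority approach can be illustrated with a little arithmetic: an update counts as committed once more than half of the replicas have acknowledged it, which also bounds how many replicas may fail. The function names are hypothetical, and counting the primary itself among the acknowledgements is an assumption of this sketch.

```python
def majority(replica_count):
    """Smallest number of replicas that forms a majority."""
    if replica_count < 1:
        raise ValueError("replica_count must be positive")
    return replica_count // 2 + 1


def tolerated_failures(replica_count):
    """How many replicas can fail while a majority is still reachable."""
    return replica_count - majority(replica_count)


def is_committed(ack_count, replica_count):
    # ack_count includes the primary's own copy (assumption of this sketch).
    return ack_count >= majority(replica_count)
```

For example, 3 replicas need 2 acknowledgements and tolerate 1 failure, while 4 replicas need 3 acknowledgements and still tolerate only 1 failure, which is why odd replica counts are the usual choice.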
UPDATE ANYWHERE + EAGER REPLICATION
- Each replica can initiate a new transaction.
- Concurrent transactions can compete with
each other.
- Possibility of deadlocks.
- In the basic Paxos algorithm, there is no
designated leader.
PRIMARY COPY + LAZY REPLICATION
- The primary copy can execute updates fast.
- Secondaries can fall behind the primary. This is
called replication lag.
- It can lead to data loss during leader failover, but
still no conflicts.
- Implies sticky availability.
- Secondaries can be used for reads.
UPDATE ANYWHERE + LAZY REPLICATION
- Dynamo-style [4] highly available databases.
- Quorums.
- Concurrent updates are first-class citizens.
- Possibility of conflicts
- Avoiding, discarding, detecting & resolving conflicts
- Eventual convergence
- Write repair, read repair and anti-entropy
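Read repair can be sketched like this: a read collects the copies held by several replicas, returns the newest one, and writes it back to any replica that was stale or missing the key. The integer version numbers and the dict-per-replica layout are simplifying assumptions; Dynamo-style stores track versions with richer metadata.

```python
def read_with_repair(replicas, key):
    """Each replica is a dict mapping key -> (version, value).

    Returns the newest value and repairs stale replicas as a side effect.
    """
    found = [replica[key] for replica in replicas if key in replica]
    if not found:
        return None
    newest = max(found, key=lambda entry: entry[0])
    for replica in replicas:
        if replica.get(key) != newest:
            replica[key] = newest  # the repair step
    return newest[1]


replicas = [{"k": (2, "new")}, {"k": (1, "old")}, {}]
value = read_with_repair(replicas, "k")
```

After the call, all three replicas hold version 2, so the read itself moved the system toward convergence.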
QUORUMS
- W + R > N
- W = 3, R = 1, N = 3
- W = 2, R = 2, N = 3
- If W or R cannot be met:
sloppy quorums and
hinted handoff
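The condition W + R > N matters because it forces every read set to overlap every write set in at least one replica. The sketch below checks this by brute force for small N; the function names are hypothetical.

```python
from itertools import combinations


def is_strict_quorum(n, w, r):
    """The quorum condition from the slide."""
    return w + r > n


def every_read_overlaps_every_write(n, w, r):
    """Brute-force check: does each possible read set share a replica
    with each possible write set?"""
    replicas = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )
```

For N = 3, both configurations on the slide (W = 3, R = 1 and W = 2, R = 2) pass the check, whereas W = 1, R = 1 does not: a read may hit a replica that missed the latest write.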
CONFLICT-FREE REPLICATED DATA TYPES (CRDTs)
- Special data types that achieve strong
eventual consistency and monotonicity [2]
- No conflicts
- Merge function has 3 properties:
- Commutative: A+B=B+A
- Associative: A+(B+C)=(A+B)+C
- Idempotent: A+A=A
- Riak Data Types [3]
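As an illustration of the three merge properties, here is a sketch of a grow-only counter (G-Counter), a commonly described CRDT. Each replica increments only its own slot and merging takes the per-replica maximum. The class layout is an assumption for this example and is not taken from Riak.

```python
class GCounter:
    def __init__(self, counts=None):
        self.counts = dict(counts or {})  # replica id -> local increment total

    def increment(self, replica_id, amount=1):
        if amount < 0:
            raise ValueError("a grow-only counter cannot decrease")
        self.counts[replica_id] = self.counts.get(replica_id, 0) + amount

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        """Per-replica maximum: commutative, associative and idempotent."""
        keys = self.counts.keys() | other.counts.keys()
        return GCounter(
            {k: max(self.counts.get(k, 0), other.counts.get(k, 0)) for k in keys}
        )

    def __eq__(self, other):
        return isinstance(other, GCounter) and self.counts == other.counts


a, b = GCounter(), GCounter()
a.increment("replica-a", 2)  # concurrent updates on two replicas
b.increment("replica-b", 3)
merged = a.merge(b)
```

Because the merge is order-insensitive and safe to repeat, replicas can exchange state in any order and any number of times and still agree on the total.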
DISCARDING CONFLICTS: LAST WRITE WINS
- When 2 updates are concurrent, define an
arbitrary order among them.
- i.e., pretend that one of them is more recent.
- Attach a timestamp to each write.
- Cassandra uses physical timestamps [8], [9].
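A sketch of last-write-wins with physical timestamps, and of how clock skew can make it discard the write that really happened last. The `Write` record and the tie-break on replica id are assumptions of this example rather than a description of Cassandra's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    timestamp: float  # taken from the writing node's local clock
    replica_id: str
    value: str


def lww_merge(a, b):
    """Keep the write with the higher timestamp; break ties by replica id."""
    return max(a, b, key=lambda w: (w.timestamp, w.replica_id))


# replica-2 writes first, but its clock runs ahead.
earlier = Write(timestamp=100.0, replica_id="replica-2", value="old")
# replica-1 writes later in real time, but its clock is behind.
later = Write(timestamp=99.0, replica_id="replica-1", value="new")

winner = lww_merge(earlier, later)
```

Here `winner.value` is `"old"`: the more recent update is silently dropped, which is the kind of hazard discussed in [9].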
DETECTING CONFLICTS: VECTOR CLOCKS
- In the Dynamo paper [4], each update is done
against a particular version of a data entry.
- Multiple versions of a data entry can exist together.
- Vector clocks [10] are used to track causality.
- The system can determine the authoritative version:
syntactic reconciliation
- The system cannot reconcile multiple versions:
semantic reconciliation
VECTOR CLOCKS
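A minimal sketch of comparing two vector clocks, represented here as dicts from node id to counter (a representation chosen for this example). If one clock is less than or equal to the other in every entry, the versions are causally ordered and the newer one can be kept automatically; otherwise they are concurrent and need semantic reconciliation.

```python
def compare(a, b):
    """Return 'equal', 'before', 'after' or 'concurrent' for clocks a and b."""
    nodes = a.keys() | b.keys()
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"  # a happened before b: b supersedes a
    if b_le_a:
        return "after"   # b happened before a: a supersedes b
    return "concurrent"  # neither dominates: a conflict was detected


def merge(a, b):
    """Entry-wise maximum, used after the conflicting versions are reconciled."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}
```

For example, `{"x": 1}` is before `{"x": 2}`, while `{"x": 2}` and `{"x": 1, "y": 1}` are concurrent because each has seen an update the other has not.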
RESOLVING CONFLICTS AND EVENTUAL CONVERGENCE
- Write repair
- Read repair
- Anti-entropy
- Merkle trees
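The role of Merkle trees in anti-entropy can be sketched as follows: each replica hashes its entries into a tree, and two replicas whose root hashes match do not need to exchange any data. This sketch compares only the roots; a real implementation would walk down the tree to find the key ranges that differ. The leaf encoding and the function name are assumptions of this example.

```python
import hashlib


def _h(data):
    return hashlib.sha256(data).digest()


def merkle_root(items):
    """Root hash over a dict of key -> value, independent of insertion order."""
    level = [_h(f"{key}={items[key]}".encode("utf-8")) for key in sorted(items)]
    if not level:
        return _h(b"")
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last hash on odd levels
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


replica_a = {"k1": "v1", "k2": "v2", "k3": "v3"}
replica_b = {"k3": "v3", "k1": "v1", "k2": "stale"}
in_sync = merkle_root(replica_a) == merkle_root(replica_b)
```

Here `in_sync` is `False`, so the two replicas would go on to locate and repair the differing entry.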
Recap
- We apply replication to make distributed
systems performant, available and fault
tolerant.
- It suffers from the core problems of distributed systems.
- Various replication protocols are built based
on when and where to make updates.
- No silver bullet. It is a world of trade-offs.
- We are hiring!
- Senior Java Developer
http://stackoverflow.com/jobs/129435/senior-java-developer-hazelcast
- Software Quality and Performance Wiz
http://stackoverflow.com/jobs/126077/software-quality-and-performance-wizard-hazelcast
- Solutions Architect
http://stackoverflow.com/jobs/131938/solutions-architect-hazelcast
REFERENCES
[1] Gray, Jim, et al. "The dangers of replication and a solution." ACM SIGMOD Record 25.2 (1996): 173-182.
[2] Shapiro, Marc, et al. "Conflict-free replicated data types." Symposium on Self-Stabilizing Systems. Springer, Berlin, Heidelberg, 2011.
[3] http://docs.basho.com/riak/kv/2.2.0/learn/concepts/crdts/
[4] DeCandia, Giuseppe, et al. "Dynamo: Amazon's highly available key-value store." ACM SIGOPS Operating Systems Review 41.6 (2007): 205-220.
[5] Lamport, Leslie. "Paxos made simple." ACM SIGACT News 32.4 (2001): 18-25.
[6] Ongaro, Diego, and John K. Ousterhout. "In Search of an Understandable Consensus Algorithm." USENIX Annual Technical Conference. 2014.
[7] Hunt, Patrick, et al. "ZooKeeper: Wait-free Coordination for Internet-scale Systems." USENIX annual technical conference. Vol. 8. 2010.
[8] http://www.datastax.com/dev/blog/why-cassandra-doesnt-need-vector-clocks
[9] https://aphyr.com/posts/299-the-trouble-with-timestamps
[10] Raynal, Michel, and Mukesh Singhal. "Logical time: Capturing causality in distributed systems." Computer 29.2 (1996): 49-56.
[11] http://kafka.apache.org/documentation.html#replication
[12] http://docs.hazelcast.org/docs/latest/manual/html-single/index.html#managing-cluster-and-member-states
[13] E. Brewer, "Towards Robust Distributed Systems," Proc. 19th Ann. ACM Symp. Principles of Distributed Computing (PODC 00), ACM, 2000, pp. 7-10
[14] https://codahale.com/you-cant-sacrifice-partition-tolerance/
[15] http://blog.nahurst.com/visual-guide-to-nosql-systems
[16] http://www.allthingsdistributed.com/2008/12/eventually_consistent.html
[17] https://www.somethingsimilar.com/2013/01/14/notes-on-distributed-systems-for-young-bloods/
[18] https://www.infoq.com/articles/cap-twelve-years-later-how-the-rules-have-changed
[19] Gilbert, Seth, and Nancy Lynch. "Brewer's conjecture and the feasibility of consistent, available, partition-tolerant web services." ACM SIGACT News 33.2 (2002): 51-59.
[20] https://martin.kleppmann.com/2015/05/11/please-stop-calling-databases-cp-or-ap.html
[21] https://henryr.github.io/cap-faq/
[22] http://docs.hazelcast.org/docs/3.7/manual/html-single/index.html#wan-replication
[23] https://dev.mysql.com/doc/refman/5.7/en/group-replication.html
[24] https://www.voltdb.com/architecture
[25] Bailis, Peter, et al. "Highly available transactions: Virtues and limitations." Proceedings of the VLDB Endowment 7.3 (2013): 181-192.
THANKS! Any questions?

Replication in the Wild - Ankara Cloud Meetup - Feb 2017

  • 2. Hello! Ensar Basri Kahveci Distributed Systems Engineer @ Hazelcast Ph.D. Candidate @ Bilkent CS twitter: metanet | github: metanet | linkedin: basrikahveci
  • 3. - IMDG: storage + computation + messaging - Open source, distributed, highly available, elastic, scalable - Distributed Java collections, JCache, HD store - Embedded or client - server deployment - Clients: Java, Scala, C++, C#, Python, Node.js, ... - Integration modules
  • 4. - https://blog.hazelcast.com/announcing-hazel cast-imdg-3-8 - Rolling upgrades - User code deployment - Hot restart improvements - WAN replication improvements
  • 5. REPLICATION - Putting a data set into multiple nodes. - Each replica has a full copy. - A few reasons for replication: - Performance - Availability and fault tolerance
  • 6. REPLICATION + PARTITIONING - Mostly used with partitioning. - Two partitions: P1, P2 - Two replicas for each partition.
  • 7. NOTHING FOR FREE! - Very easy to do when the data is immutable. - Problems start when we have multiple copies of the data and we want to update them. - Two main difficulties: - Handling updates, - Handling failures.
  • 8. CAP PrIncIple - Proposed by Eric Brewer in 2000 [13], - Proved by Gilbert and Lynch in 2002 [14]. - A shared-data system cannot achieve perfect consistency and availability in the presence of partitions CP vs AP. - Widespread acceptance, and yet a lot of criticism [15 - 21].
  • 9. consIstency AND avaIlabIlIty - Degrees of consistency: - Data centric, client centric - Degrees of availability: - High availability, sticky availability, non-availability - Replication is directly related to C and A. [25]
  • 10. The dangers of replIcatIon and a solutIon - Gray et al. [1] classify replication models by 2 parameters: - Where to make updates: primary copy or update anywhere - When to make updates: eagerly or lazily
  • 11. WHERE: PRIMARY COPY - There is a single replica managing the updates. - Concurrency control is easy. - No conflicts and no conflict-handling logic. - Updates are executed on the primary and secondaries with the same order. - When primary fails, a new primary is elected. - Ensuring a single and good primary is hard.
  • 12. WHERE: UPDATE ANYWHERE - Each replica can initiate a transaction to make an update. - Complex concurrency control. - Deadlocks or conflicts are possible. - In practice, there is also multi-leader.
  • 13. WHEN: EAGER REPLICATION - Synchronously updates all replicas as part of one atomic transaction. - Provides strong consistency. - Degree of availability can degrade on node failures. - Consensus algorithms.
  • 14. WHEN: LAZY REPLICATION - Updates each replica with a separate transaction. - Updates can execute quite fast. - Degree of availability is high. - Eventual consistency. - Data copies can diverge. - Data loss or conflicts can occur.
  • 15. WHERE? WHEN? PRIMARY COPY UPDATE ANYWHERE EAGER strong consistency simple concurrency slow inflexible strong consistency complex concurrency slow expensive deadlocks LAZY fast sticky availability eventual consistency simple concurrency inconsistency fast flexible high availability eventual consistency inconsistency conflicts
  • 16. WHERE? WHEN? PRIMARY COPY UPDATE ANYWHERE EAGER Multi Paxos [5] etcd and Consul (RAFT) [6] Zookeeper (Zab) [7] Kafka VoltDB [24] Paxos [5] Hazelcast Cluster State Change [12] MySQL 5.7 Group Replication [23] LAZY Hazelcast MongoDB ElasticSearch Redis Dynamo [4] Cassandra Riak Hazelcast Active-Active WAN Replication [22]
  • 17. PRIMARY COPY + EAGER REPLICATION - When the primary fails, secondaries are guaranteed to be up to date. - Raft, Kafka etc. - Majority approach can be used. - In Kafka, in-sync-replica set [11] is maintained. - Secondaries can be used for reads.
  • 18. UPDATE ANYWHERE + EAGER REPLICATION - Each replica can initiate a new transaction. - Concurrent transactions can compete with each other. - Possibility of deadlocks. - In the basic Paxos algorithm, there is no designated leader.
  • 19. PRIMARy COPY + LAZY REPLICATION - The primary copy can execute updates fast. - Secondaries can fall behind the primary. It is called replication lag. - It can lead to data loss during leader failover, but still no conflicts. - Implies sticky availability. - Secondaries can be used for reads.
  • 20. UPDATE ANYWHERE + LAZY REPLICATION - Dynamo-style [4] highly available databases. - Quorums. - Concurrent updates are first-class citizens. - Possibility of conflicts - Avoiding, discarding, detecting & resolving conflicts - Eventual convergence - Write repair, read repair and anti-entropy
  • 21. QUORUMS - W + R > N - W = 3, R = 1, N = 3 - W = 2, R = 2, N = 3 - If W or R is not met - Sloppy quorums and hinted handoff
  • 22. ConflIct-free replIcated data types (CRDTS) - Special data types that achieve strong eventual consistency and monotonicity [2] - No conflicts - Merge function has 3 properties: - Commutative: A+B=B+A - Associative: A+(B+C)=(A+B)+C - Idempotent: f(f(x))=f(x) - Riak Data Types [3]
  • 23. DISCARDING CONFLICTS: LAST WRITE WINS - When 2 updates are concurrent, define an arbitrary order among them. - i.e., pretend that one of them is more recent. - Attach a timestamp to each write. - Cassandra uses physical timestamps [8], [9].
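Last write wins can be sketched as a merge that keeps the write with the larger timestamp. The `Write` tuple and the tie-break on replica ID are assumptions made for this example. They are not Cassandra's actual rules.

```python
from typing import NamedTuple


class Write(NamedTuple):
    timestamp: int   # physical clock reading attached by the writer
    replica_id: str  # hypothetical tie-breaker for equal timestamps
    value: str


def lww_merge(a: Write, b: Write) -> Write:
    """Keep the write that is 'more recent' according to its timestamp;
    the other write is silently discarded."""
    return max(a, b, key=lambda w: (w.timestamp, w.replica_id))
```

The order is arbitrary in the sense the slide describes: if a replica's clock runs behind, its write loses even when it actually happened later.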
  • 24. DETECTING CONFLICTS: VECTOR CLOCKS - In the Dynamo paper [4], each update is done against a particular version of a data entry. - Multiple versions of a data entry can exist together. - Vector clocks [10] are used to track causality. - If the system can determine the authoritative version: syntactic reconciliation - If the system cannot reconcile multiple versions, the client must do it: semantic reconciliation
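Comparing two vector clocks shows which kind of reconciliation applies. In this sketch a clock is a plain dict from node ID to counter, and a missing entry counts as 0. That representation and the function name are choices made for the example.

```python
def compare(a: dict, b: dict) -> str:
    """Return 'equal', 'before', 'after' or 'concurrent' for clocks a and b."""
    nodes = a.keys() | b.keys()
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"      # b descends from a, so b supersedes a
    if b_le_a:
        return "after"       # a descends from b, so a supersedes b
    return "concurrent"      # neither descends from the other: a conflict
```

A result of `before` or `after` means one version supersedes the other, which is the syntactic case. A result of `concurrent` means both versions must be kept until they are reconciled semantically.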
  • 26. RESOLVING CONFLICTS AND EVENTUAL CONVERGENCE - Write repair - Read repair - Anti-entropy - Merkle trees
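Anti-entropy with Merkle trees lets two replicas compare hashes instead of shipping all their data. The sketch below only computes and compares root hashes. The leaf encoding and the duplication of the last node on odd levels are assumptions made for this example. A real protocol would descend into differing subtrees to locate the divergent key ranges.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(entries: dict) -> bytes:
    """Root hash over the key/value pairs of one replica, in key order."""
    level = [_h(repr((k, v)).encode()) for k, v in sorted(entries.items())]
    if not level:
        return _h(b"")
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def in_sync(replica_a: dict, replica_b: dict) -> bool:
    """Replicas with equal roots hold the same data (barring hash collisions)."""
    return merkle_root(replica_a) == merkle_root(replica_b)
```

When the roots differ, the replicas exchange only the entries under the subtrees whose hashes differ, which keeps repair traffic proportional to the divergence.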
  • 27. Recap - We apply replication to make distributed systems performant, available and fault tolerant. - It suffers from core problems of distributed systems. - Various replication protocols are built based on when and where to make updates. - No silver bullet. It is a world of trade-offs.
  • 28. - We are hiring! - Senior Java Developer http://stackoverflow.com/jobs/129435/senior-java-developer-hazelcast - Software Quality and Performance Wiz http://stackoverflow.com/jobs/126077/software-quality-and-performance-wizard-hazelcast - Solution Architect http://stackoverflow.com/jobs/131938/solutions-architect-hazelcast
  • 29. REFERENCES
    [1] Gray, Jim, et al. "The dangers of replication and a solution." ACM SIGMOD Record 25.2 (1996): 173-182.
    [2] Shapiro, Marc, et al. "Conflict-free replicated data types." Symposium on Self-Stabilizing Systems. Springer, Berlin, Heidelberg, 2011.
    [3] http://docs.basho.com/riak/kv/2.2.0/learn/concepts/crdts/
    [4] DeCandia, Giuseppe, et al. "Dynamo: Amazon's highly available key-value store." ACM SIGOPS Operating Systems Review 41.6 (2007): 205-220.
    [5] Lamport, Leslie. "Paxos made simple." ACM SIGACT News 32.4 (2001): 18-25.
    [6] Ongaro, Diego, and John K. Ousterhout. "In Search of an Understandable Consensus Algorithm." USENIX Annual Technical Conference. 2014.
    [7] Hunt, Patrick, et al. "ZooKeeper: Wait-free Coordination for Internet-scale Systems." USENIX Annual Technical Conference. Vol. 8. 2010.
    [8] http://www.datastax.com/dev/blog/why-cassandra-doesnt-need-vector-clocks
    [9] https://aphyr.com/posts/299-the-trouble-with-timestamps
    [10] Raynal, Michel, and Mukesh Singhal. "Logical time: Capturing causality in distributed systems." Computer 29.2 (1996): 49-56.
    [11] http://kafka.apache.org/documentation.html#replication
    [12] http://docs.hazelcast.org/docs/latest/manual/html-single/index.html#managing-cluster-and-member-states
    [13] Brewer, E. "Towards Robust Distributed Systems." Proc. 19th Ann. ACM Symp. Principles of Distributed Computing (PODC 00), ACM, 2000, pp. 7-10.
    [14] https://codahale.com/you-cant-sacrifice-partition-tolerance/
    [15] http://blog.nahurst.com/visual-guide-to-nosql-systems
    [16] http://www.allthingsdistributed.com/2008/12/eventually_consistent.html
    [17] https://www.somethingsimilar.com/2013/01/14/notes-on-distributed-systems-for-young-bloods/
    [18] https://www.infoq.com/articles/cap-twelve-years-later-how-the-rules-have-changed
    [19] Gilbert, Seth, and Nancy Lynch. "Brewer's conjecture and the feasibility of consistent, available, partition-tolerant web services." ACM SIGACT News 33.2 (2002): 51-59.
    [20] https://martin.kleppmann.com/2015/05/11/please-stop-calling-databases-cp-or-ap.html
    [21] https://henryr.github.io/cap-faq/
    [22] http://docs.hazelcast.org/docs/3.7/manual/html-single/index.html#wan-replication
    [23] https://dev.mysql.com/doc/refman/5.7/en/group-replication.html
    [24] https://www.voltdb.com/architecture
    [25] Bailis, Peter, et al. "Highly available transactions: Virtues and limitations." Proceedings of the VLDB Endowment 7.3 (2013): 181-192.