Consistency and Availability
tradeoff in database clusters
Grokking Techtalk 40
About Me
● About Me
● Introduce Segmentation Platform
About Me
● Joined Grab 2 years ago, currently working as the lead engineer of the Segmentation Platform project
● Leads the Research Database group in Grokking Lab
About Segmentation Platform (SegP)
● Technology
○ Programming languages: Go, Java, Scala
○ Batch processing (Spark, Scala)
○ Caching (Redis)
○ Message queue (SQS, Kafka)
○ Relational database (MySQL)
○ Non-relational database (Cassandra, DynamoDB, Elasticsearch)
● Team's scope
○ Feature development. Coordinate with business owners to develop a platform for segmentation. Similar to segments.io but for internal users.
○ Batch data processing.
○ Real-time traffic. Build and maintain gRPC APIs to serve online traffic.
What we'll discuss in this talk
● CAP theorem
● The cluster architecture of Redis, Elasticsearch, Cassandra
● How the C-A tradeoff is reflected in their designs
CAP Theorem
Consistency
The system is considered consistent if a read request (2) that happens after a write request (1) returns the written value v1 to client 2.
[Diagram: the DB system currently holds v = v0. Client 1 sends (1) Update v=v1; Client 2 then sends (2) Get v.]
Availability
For each request it receives, the system has an algorithm (a sequence of steps) designed to handle that request.
If the system can't go through the algorithm designed for that request, it is considered "not available" to that client.
[Diagram: (1) The client sends a request. (2) If the system goes through the algorithm defined for this request, (3) it returns 2xx or 4xx. If the system cannot go through the algorithm defined for this request, the client gets a 500.]
Network partition
A network partition happens when some of the nodes cannot communicate properly with each other and each of them believes that the others are offline.
For example, Node 1 cannot communicate with Node 2, so Node 1 thinks that Node 2 is offline. But Node 2 is still alive and still serving requests.
[Diagram: Node 1 and Node 2, Client 1 and Client 2]
CAP Theorem
A distributed database has three very desirable properties:
1. Tolerance towards network partitions
2. Consistency
3. Availability
The CAP theorem states: you can have at most two of these properties for any shared-data system.
[Diagram: Consistency, Availability, Partition tolerance]
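As a rough illustration of that choice, the toy model below keeps one value on two nodes and shows what each option gives up once the nodes cannot reach each other. The class names and the "CP"/"AP" mode strings are assumptions made for this sketch; they are not part of any real database API.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.value = "v0"


class TwoNodeStore:
    """Toy two-node store. `mode` is "CP" or "AP" (labels used only in this sketch)."""

    def __init__(self, mode):
        self.mode = mode
        self.nodes = [Node("node1"), Node("node2")]
        self.partitioned = False

    def write(self, node_index, value):
        node = self.nodes[node_index]
        peer = self.nodes[1 - node_index]
        if self.partitioned:
            if self.mode == "CP":
                # Give up availability: refuse rather than let the copies diverge.
                raise RuntimeError("unavailable: cannot reach peer")
            # Give up consistency: accept locally, the peer keeps the old value.
            node.value = value
            return
        # No partition: both copies are updated together.
        node.value = value
        peer.value = value

    def read(self, node_index):
        return self.nodes[node_index].value
```

With `partitioned = True`, a "CP" store raises on `write`, while an "AP" store accepts the write on one node and keeps serving the old value from the other.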
Redis Cluster
What is Redis
- Stands for Remote Dictionary Server
- Is a fast, open-source, in-memory key-value data store for use as a database, cache, message broker, and queue.
- Delivers sub-millisecond response times enabling millions of requests per second for real-time applications in Gaming, Ad-Tech, Financial Services, Healthcare, and IoT.
- Popular choice for caching, session management, gaming, leaderboards, real-time analytics, geospatial, ride-hailing, chat/messaging, media streaming, and pub/sub apps.
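Redis cluster - Multi-master
Each key is hashed into one of 16384 hash slots (numbered 0-16383 in Redis Cluster; the example below uses 1-16384 for simplicity). Depending on the hash value, the key is read from (and written to) the node assigned that token.
[Diagram: the client sends key -> value pairs 5 -> "ho chi minh" and 6 -> "ha noi". Redis node 1 owns tokens 1-8000 and Redis node 2 owns tokens 8001-16384. Key 5 hashes to 18, so 5 -> "ho chi minh" is stored on node 1; key 6 hashes to 8003, so 6 -> "ha noi" is stored on node 2.]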
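The key-to-node mapping can be sketched as follows. Redis Cluster computes the slot as CRC16 of the key modulo 16384; Python's `binascii.crc_hqx` with an initial value of 0 is used here as that CRC16 variant. The two-node slot layout and the node names are hypothetical, and hash tags (`{...}`) are deliberately ignored.

```python
import binascii

NUM_SLOTS = 16384  # hash slots are numbered 0..16383


def key_slot(key: str) -> int:
    """Return the hash slot for a key: CRC16 of the key modulo 16384.

    Hash tags ({...}) are not handled in this sketch.
    """
    return binascii.crc_hqx(key.encode("utf-8"), 0) % NUM_SLOTS


# Hypothetical layout: two masters, each owning half of the slots.
SLOT_OWNERS = [
    (range(0, 8192), "redis-node-1"),
    (range(8192, 16384), "redis-node-2"),
]


def node_for(key: str) -> str:
    """Find the node whose slot range contains the key's slot."""
    slot = key_slot(key)
    for slots, node in SLOT_OWNERS:
        if slot in slots:
            return node
    raise ValueError(f"slot {slot} is not assigned to any node")


if __name__ == "__main__":
    for key in ("5", "6", "foo"):
        print(key, key_slot(key), node_for(key))
```

The real slot numbers for keys "5" and "6" will differ from the 18 and 8003 used on the slide, which are illustrative values.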
Redis cluster - Master/Replica
Redis uses asynchronous replication, with asynchronous replica-to-master acknowledgements of the amount of data processed.
A master can have multiple replicas.
Clients write to the master, but can read from a replica.
[Diagram: Client 1 sends a write command to the Redis master; the master sends async updates to two Redis replicas; Client 2 sends a read command to a replica.]
Ref: https://redis.io/topics/replication
C-A tradeoff
Redis uses asynchronous replication by default, which means that, by default, it's AP.
If a network partition happens between the master and a replica, we'll see inconsistent data.
[Diagram: Client 1 sends a write command to the Redis master; the async updates to the Redis replicas do not get through; Client 2's read command on a replica returns stale data.]
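The stale read described above can be reproduced with a small in-memory model. This is not Redis code: `Master`, `Replica` and the `partitioned` flag are names invented for the sketch, and the only behaviour modelled is that the master acknowledges a write before any replica has applied it.

```python
from collections import deque


class Replica:
    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key)


class Master:
    """Toy master that acknowledges writes before replicas have applied them."""

    def __init__(self, replicas):
        self.data = {}
        self.replicas = replicas
        self.pending = deque()  # updates not yet delivered to the replicas

    def set(self, key, value):
        self.data[key] = value
        self.pending.append((key, value))
        return "OK"  # acknowledged without waiting for any replica

    def replicate(self, partitioned=False):
        """Deliver pending updates unless the link to the replicas is down."""
        if partitioned:
            return
        while self.pending:
            key, value = self.pending.popleft()
            for replica in self.replicas:
                replica.data[key] = value
```

If `set("v", "v1")` is followed by `replicate(partitioned=True)`, the master holds v1 while the replica still answers v0; once `replicate()` runs without the partition, the replica catches up.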
Elastic Search cluster
What is Elasticsearch
● Elasticsearch is a distributed, open source search and analytics engine for all types of data,
including textual, numerical, geospatial, structured, and unstructured.
● Elasticsearch is built on Apache Lucene and was first released in 2010 by Elasticsearch N.V.
(now known as Elastic).
Roles in Elasticsearch Cluster
● Master node: manages the overall operation of a cluster and keeps track of the cluster state.
● Data node: stores and searches data. Performs all data-related operations (indexing, searching, aggregating) on local shards.
● Coordinator node: delegates client requests to the shards on the data nodes, collects and aggregates the results into one final result, and sends this result back to the client.
[Diagram: Client 1, Client 2, a coordinator node, a master node, and two data nodes]
Primary and Replica shards
Steps for primary shards:
● Validate the incoming operation and reject it if structurally invalid
● Execute the operation locally
● Forward the operation to each replica in the current in-sync copies set
● Once all replicas have successfully performed the operation and responded to the primary, the primary acknowledges the successful completion of the request to the client
[Diagram: the document "user1 auth a1 login from homepage" goes through the coordination node to the destination node; primary shards (p1, p2) forward the operation to replica shards (r11, r12, r21, r22).]
C-A tradeoff
If a network partition happens, the primary shard cannot write to a replica shard, so it cannot acknowledge the write and becomes unavailable.
By default, Elasticsearch is more CP.
[Diagram: on the destination node, primary shards (p1, p2) forward the operation to replica shards (r11, r12, r21, r22).]
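The primary shard steps and the partition case above can be sketched as a toy model. `ShardCopy` and `primary_index` are hypothetical names, not Elasticsearch APIs, and the model is simplified: an unreachable replica just means no acknowledgement, without any of the failure handling a real cluster performs.

```python
class ShardCopy:
    def __init__(self, name, reachable=True):
        self.name = name
        self.reachable = reachable
        self.docs = {}

    def apply(self, doc_id, doc):
        if not self.reachable:
            raise ConnectionError(f"{self.name} is unreachable")
        self.docs[doc_id] = doc


def primary_index(primary, in_sync_replicas, doc_id, doc):
    """Return True only if the primary and every in-sync replica applied the write."""
    # 1. Validate the operation and reject it if structurally invalid.
    if not isinstance(doc, dict):
        raise ValueError("document must be a JSON-like object")
    # 2. Execute the operation locally.
    primary.apply(doc_id, doc)
    # 3. Forward the operation to each replica in the in-sync set.
    for replica in in_sync_replicas:
        try:
            replica.apply(doc_id, doc)
        except ConnectionError:
            # A replica did not respond, so the write is not acknowledged.
            return False
    # 4. All replicas responded: acknowledge to the client.
    return True
```

With every replica reachable the call returns `True`; marking one replica as unreachable makes the same call return `False`, which is the consistency-over-availability behaviour the slide describes.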
Cassandra
What is Cassandra
Apache Cassandra is an open source, distributed NoSQL database that began internally at
Facebook and was released as an open-source project in July 2008.
Cassandra delivers continuous availability (zero downtime), high performance, and linear
scalability that modern applications require, while also offering operational simplicity and effortless
replication across data centers and geographies.
Cassandra Ring Cluster
- Each node is assigned a range of tokens
- A client can connect to any node to write; that node becomes the coordinator node
- Partition keys are hashed into a token. The coordinator uses the token to determine which node stores the data
[Diagram: a ring of eight nodes owning token ranges 1-20, 21-40, 41-60, 61-80, 81-100, 101-120, 121-140, 141-160. The row "user1 auth a1 login from homepage" hashes to token 44; the coordinator node sends it to the destination node, which owns range 41-60.]
Replication Factor
- Replication Factor (RF) = number of copies we want to store
- The replication nodes are determined by the replication strategy
- Simple strategy = the next two nodes on the ring will be the replication nodes (for RF = 3)
[Diagram: the same ring of token ranges 1-20 through 141-160. The row "user1 auth a1 login from homepage" with token 44 goes from the coordinator node to the node that owns the token and to the replication nodes.]
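A minimal sketch of the token lookup and the simple replication strategy, using the eight token ranges from the diagram. The node names `n1`..`n8` are hypothetical, and the step that hashes a partition key into a token is left out: the function starts from the token.

```python
import bisect

# Upper bound of each node's token range, in ring order
# (ranges from the slide: 1-20, 21-40, ..., 141-160).
RING = [(20, "n1"), (40, "n2"), (60, "n3"), (80, "n4"),
        (100, "n5"), (120, "n6"), (140, "n7"), (160, "n8")]
UPPER_BOUNDS = [upper for upper, _ in RING]


def replicas_for(token, rf=3):
    """Return the owner of `token` followed by the next rf - 1 nodes on the ring."""
    if not 1 <= token <= UPPER_BOUNDS[-1]:
        raise ValueError("token outside the ring")
    # First node whose upper bound is >= token owns the token.
    owner = bisect.bisect_left(UPPER_BOUNDS, token)
    # Walk clockwise, wrapping around at the end of the ring.
    return [RING[(owner + i) % len(RING)][1] for i in range(rf)]


if __name__ == "__main__":
    print(replicas_for(44))   # owner of 41-60 plus the next two nodes
    print(replicas_for(150))  # wraps around the end of the ring
```

For token 44 this picks the node owning 41-60 and the two nodes after it, matching the RF = 3 case on the slide.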
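Data Consistency
- Client 1 connects to C1 to write; C1 writes the data to three nodes (A, B, C), but the write fails at node B.
- Client 2 connects to C2 to read the same data.
What would happen?
[Diagram: Client 1 writes data with token 44 through coordinator C1, Client 2 reads data with token 44 through coordinator C2; two of the replicas hold v2 and the replica where the write failed still holds v1.]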
Consistency Level (Write)

Level: One
  Read: Returns a response from the closest replica, as determined by the snitch. By default, a read repair runs in the background to make the other replicas consistent.
  Write: A write must be written to the commit log and memtable of at least one replica node.

Level: Quorum
  Read: Returns the record after a quorum of replicas has responded.
  Write: A write must be written to the commit log and memtable on a quorum of replica nodes.

Level: All
  Read: Returns the record after all replicas have responded. The read operation will fail if a replica does not respond.
  Write: A write must be written to the commit log and memtable on all replica nodes in the cluster for that partition.
Write with CL=ALL
- All replicas succeeded -> success
- 1 replica failed -> failed
Result: Failed
[Diagram: Client 1 writes data with token 44 through coordinator C1 to replicas A, B and C; two replicas store v2 and one still holds v1.]
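Write with CL=QUORUM
Quorum = floor(RF / 2) + 1 = 2 (for RF = 3)
- Two replicas succeeded -> success
- Fewer than two succeeded -> failed
Result: Success
[Diagram: Client 1 writes data with token 44 through coordinator C1 to replicas A, B and C; two replicas store v2 and one still holds v1.]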
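The two write outcomes above can be checked with a few lines of code. This is a sketch, assuming quorum is computed as floor(RF / 2) + 1 (which gives 2 for RF = 3); the function names and the list-of-booleans representation of replica results are illustrative choices, not a driver API.

```python
def required_acks(level, rf):
    """Number of replica acknowledgements needed for a consistency level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return rf // 2 + 1
    if level == "ALL":
        return rf
    raise ValueError(f"unknown consistency level: {level}")


def write_succeeds(level, replica_results):
    """replica_results holds one boolean per replica: True if it stored the write."""
    rf = len(replica_results)
    return sum(replica_results) >= required_acks(level, rf)


if __name__ == "__main__":
    # Scenario from the slides: replicas A and C store v2, node B fails.
    results = [True, False, True]
    for level in ("ALL", "QUORUM", "ONE"):
        print(level, write_succeeds(level, results))
```

With one failed replica out of three, `ALL` reports failure while `QUORUM` and `ONE` report success.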
Consistency Level (Read)

Level: One
  Read: Returns a response from the closest replica, as determined by the snitch. By default, a read repair runs in the background to make the other replicas consistent.
  Write: A write must be written to the commit log and memtable of at least one replica node.

Level: Quorum
  Read: Returns the record after a quorum of replicas has responded.
  Write: A write must be written to the commit log and memtable on a quorum of replica nodes.

Level: All
  Read: Returns the record after all replicas have responded. The read operation will fail if a replica does not respond.
  Write: A write must be written to the commit log and memtable on all replica nodes in the cluster for that partition.
Write=QUORUM, Read=One
Potentially inconsistent read. If client 2 reads from node B, client 2 will receive stale data.
W (Quorum) + R (One) -> eventually consistent
[Diagram: Client 1 writes data with token 44 through coordinator C1, Client 2 reads data with token 44 through coordinator C2; of replicas A, B and C, two hold v2 and one holds v1.]
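Write=QUORUM, Read=QUORUM
Any two replicas that a quorum read contacts include at least one holding v2, so every read combination returns v2.
W (Quorum) + R (Quorum) -> consistent
[Diagram: Client 1 writes data with token 44 through coordinator C1, Client 2 reads data with token 44 through coordinator C2; of replicas A, B and C, two hold v2 and one holds v1.]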
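Write=All, Read=One
A write with CL=ALL succeeds only if every replica stores v2, so after a successful write a read from any single replica returns v2. If one replica fails (as node B does in the diagram), the write itself is reported as failed.
W (All) + R (One) -> consistent
[Diagram: Client 1 writes data with token 44 through coordinator C1, Client 2 reads data with token 44 through coordinator C2; of replicas A, B and C, two hold v2 and one holds v1.]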
Summary of Read and Write CL

WRITE    READ     Consistency    Read Availability   Write Availability
All      All      Consistent     Low                 Low
Quorum   All      Consistent     Low                 Medium
One      All      Consistent     Low                 High
All      Quorum   Consistent     Medium              Low
Quorum   Quorum   Consistent     Medium              Medium
One      Quorum   Inconsistent   Medium              High
All      One      Consistent     High                Low
Quorum   One      Inconsistent   High                Medium
One      One      Inconsistent   High                High
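Every row of this table is consistent with one rule: a read is guaranteed to see the latest successful write when the replicas written and the replicas read must overlap, that is, when W + R > RF. The sketch below regenerates the table for RF = 3 under that rule. The Low/Medium/High availability labels are simply mapped from the level, as on the slide; the function names are illustrative.

```python
def acks(level, rf):
    """Replicas that must respond for a given consistency level."""
    return {"All": rf, "Quorum": rf // 2 + 1, "One": 1}[level]


def is_consistent(write_level, read_level, rf=3):
    """Written and read replica sets are forced to overlap when W + R > RF."""
    return acks(write_level, rf) + acks(read_level, rf) > rf


def availability(level):
    """The fewer replicas an operation needs, the more failures it tolerates."""
    return {"All": "Low", "Quorum": "Medium", "One": "High"}[level]


def summary_rows(rf=3):
    """Rows of (write, read, consistency, read availability, write availability)."""
    rows = []
    for read_level in ("All", "Quorum", "One"):
        for write_level in ("All", "Quorum", "One"):
            consistent = is_consistent(write_level, read_level, rf)
            rows.append((
                write_level,
                read_level,
                "Consistent" if consistent else "Inconsistent",
                availability(read_level),
                availability(write_level),
            ))
    return rows


if __name__ == "__main__":
    for row in summary_rows():
        print("{:<8}{:<8}{:<14}{:<8}{:<8}".format(*row))
```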
Summary
● Redis: Availability > Consistency
● Cassandra: Tweakable availability and consistency
● Elasticsearch: Availability < Consistency
Q&A
Grokking TechTalk #19: Software Development Cycle In The International Moneta...Grokking TechTalk #19: Software Development Cycle In The International Moneta...
Grokking TechTalk #19: Software Development Cycle In The International Moneta...
 
Grokking TechTalk #18B: Giới thiệu về Viễn thông Di động
Grokking TechTalk #18B:  Giới thiệu về Viễn thông Di độngGrokking TechTalk #18B:  Giới thiệu về Viễn thông Di động
Grokking TechTalk #18B: Giới thiệu về Viễn thông Di động
 

Grokking Techtalk #40: Consistency and Availability tradeoff in database cluster

  • 1. Consistency and Availability tradeoff in database clusters (Grokking Techtalk 40)
  • 2. About Me
      ● About me
      ● Introduction to the Segmentation Platform
  • 3. About Me
      ● Joined Grab 2 years ago, currently working as the lead engineer of the Segmentation Platform project
      ● Lead the Research Database group in Grokking Lab
  • 4. About Segmentation Platform (SegP)
      ● Technology
        ○ Programming languages: Go, Java, Scala
        ○ Batch processing (Spark, Scala)
        ○ Caching (Redis)
        ○ Message queues (SQS, Kafka)
        ○ Relational database (MySQL)
        ○ Non-relational databases (Cassandra, DynamoDB, Elasticsearch)
      ● Team's scope
        ○ Feature development: coordinate with business owners to develop a platform for segmentation, similar to segments.io but for internal users
        ○ Batch data processing
        ○ Real-time traffic: build and maintain gRPC APIs to serve online traffic
  • 5. What we'll discuss in this talk
      ● CAP theorem
      ● The cluster architecture of Redis, Elasticsearch, and Cassandra
      ● How the C-A tradeoff is reflected in their designs
  • 7. Consistency
      The system is considered consistent if a read request (2) that happens after a write request (1) returns the written value v1 to client 2.
      Diagram: the value of v in the DB system is currently v0; Client 1 sends (1) Update v=v1; Client 2 then sends (2) Get v.
  • 8. Availability
      Each request is handled by an algorithm designed for it, made up of a number of steps. If the system cannot go through the algorithm designed for that request, it is considered "not available" to that client.
      Diagram: (1) the client sends a request; (2) if the system goes through the algorithm defined for this request, (3) it returns 2xx or 4xx; if the system cannot go through the algorithm, the client gets a 500.
  • 9. Network partition
      A network partition happens when some of the nodes cannot communicate properly with each other and each side believes the other is offline. For example, Node 1 cannot communicate with Node 2, so Node 1 thinks Node 2 is offline. But Node 2 is still alive and still serving requests.
      Diagram: Node 1, Node 2, Client 1, Client 2.
  • 10. CAP Theorem
      A distributed database has three very desirable properties:
        1. Tolerance towards network partition
        2. Consistency
        3. Availability
      The CAP theorem states: you can have at most two of these properties for any shared-data system.
      Diagram: Consistency, Availability, Partition tolerance.
  • 12. What is Redis
      - Stands for Remote Dictionary Server
      - Is a fast, open-source, in-memory key-value data store for use as a database, cache, message broker, and queue
      - Delivers sub-millisecond response times, enabling millions of requests per second for real-time applications in gaming, ad tech, financial services, healthcare, and IoT
      - Popular choice for caching, session management, gaming, leaderboards, real-time analytics, geospatial, ride-hailing, chat/messaging, media streaming, and pub/sub apps
  • 13. Redis cluster - Multi-master
      Each key is hashed into one of 16384 hash slots (Redis Cluster numbers them 0-16383). Depending on the hash value, the key is read from, and written to, the node that owns that slot.
      Diagram (simplified example): Redis node 1 owns tokens 1 -> 8000 and Redis node 2 owns tokens 8001 -> 16384. Key 5 hashes to 18, so 5 -> "ho chi minh" is stored on node 1; key 6 hashes to 8003, so 6 -> "ha noi" is stored on node 2.
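The key-to-node mapping on this slide can be sketched in a few lines of Python. This is a minimal illustration, not Redis's implementation: Redis Cluster computes the slot as CRC16 of the key modulo 16384, and the sketch uses the standard library's `binascii.crc_hqx` for the CRC16 step. The two-node slot layout and the node names are assumptions made up for the example, and hash tags (`{...}`) are not handled.

```python
import binascii

NUM_SLOTS = 16384  # Redis Cluster defines 16384 hash slots, numbered 0..16383

# Hypothetical two-node layout, chosen only for this illustration.
SLOT_RANGES = {
    "redis-node-1": range(0, 8192),
    "redis-node-2": range(8192, NUM_SLOTS),
}


def key_slot(key: str) -> int:
    """CRC16 (CCITT polynomial, initial value 0) of the key, modulo 16384."""
    return binascii.crc_hqx(key.encode("utf-8"), 0) % NUM_SLOTS


def node_for_key(key: str) -> str:
    """Return the name of the node whose slot range contains the key's slot."""
    slot = key_slot(key)
    for node, slots in SLOT_RANGES.items():
        if slot in slots:
            return node
    raise LookupError(f"no node owns slot {slot}")


if __name__ == "__main__":
    for key in ("5", "6", "123456789"):
        print(key, key_slot(key), node_for_key(key))
```

Because every client can compute the slot locally, a read or write for a key can be sent straight to the node that owns it.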
  • 14. Redis cluster - Master/Replica
      Redis uses asynchronous replication, with asynchronous replica-to-master acknowledgements of the amount of data processed. A master can have multiple replicas. Clients write to the master, but can read from a replica.
      Diagram: Client 1 sends a write command to the Redis master; the master sends async updates to two Redis replicas; Client 2 sends a read command to a replica.
      Ref: https://redis.io/topics/replication
  • 15. C-A tradeoff
      Redis uses asynchronous replication by default, which means that by default it is AP. If a network partition happens between the master and a replica, clients will see inconsistent data.
      Diagram: Client 1 sends a write command to the Redis master; the master sends async updates to two Redis replicas; Client 2's read command on a replica returns stale data.
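The stale read described on this slide can be reproduced with a toy model. The sketch below is an illustration under simplifying assumptions, not Redis code: `Master`, `Replica`, the `pending` queue, and the `partitioned` flag are hypothetical names, and replication is triggered by hand instead of running in the background.

```python
from collections import deque


class Replica:
    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key)


class Master:
    def __init__(self, replicas):
        self.data = {}
        self.replicas = replicas
        self.pending = deque()    # updates not yet delivered to the replicas
        self.partitioned = False  # True while the replicas are unreachable

    def set(self, key, value):
        # The client is acknowledged as soon as the local write is done;
        # the master does not wait for any replica (asynchronous replication).
        self.data[key] = value
        self.pending.append((key, value))
        return "OK"

    def replicate(self):
        # During a partition nothing is delivered; updates stay queued.
        if self.partitioned:
            return
        while self.pending:
            key, value = self.pending.popleft()
            for replica in self.replicas:
                replica.data[key] = value


def demo():
    replica = Replica()
    master = Master([replica])
    master.set("v", "v0")
    master.replicate()

    master.partitioned = True
    ack = master.set("v", "v1")        # the write is still accepted: available
    master.replicate()
    stale = replica.get("v")           # the replica still returns the old value

    master.partitioned = False
    master.replicate()
    healed = replica.get("v")          # the replica catches up after the partition
    return ack, stale, healed


if __name__ == "__main__":
    print(demo())
```

The write during the partition succeeds and the replica keeps answering reads, which is the availability side of the tradeoff; the price is the stale value returned in the meantime.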
  • 17. What is Elasticsearch
      ● Elasticsearch is a distributed, open source search and analytics engine for all types of data, including textual, numerical, geospatial, structured, and unstructured.
      ● Elasticsearch is built on Apache Lucene and was first released in 2010 by Elasticsearch N.V. (now known as Elastic).
  • 18. Roles in Elasticsearch Cluster
      ● Master node: manages the overall operation of a cluster and keeps track of the cluster state.
      ● Data node: stores and searches data. Performs all data-related operations (indexing, searching, aggregating) on local shards.
      ● Coordinator node: delegates client requests to the shards on the data nodes, collects and aggregates the results into one final result, and sends this result back to the client.
      Diagram: Client 1 and Client 2 talk to the coordinator node, which sits in front of the master node and two data nodes.
  • 19. Primary and Replica shards
      Steps for primary shards:
      ● Validate the incoming operation and reject it if structurally invalid
      ● Execute the operation locally
      ● Forward the operation to each replica in the current in-sync copies set
      ● Once all replicas have successfully performed the operation and responded to the primary, the primary acknowledges the successful completion of the request to the client
      Diagram: the record "user1 auth a1 login from homepage" goes from the coordination node to the destination node; shards p1, r11, r12, r21, p2, r22; primary shards forward the operation to their replicas.
  • 20. C-A tradeoff
      If a network partition happens, the primary shard cannot write to the replica shard, which leads to the primary shard becoming unavailable. By default, Elasticsearch leans towards CP.
      Diagram: destination node; shards p1, r11, r12, r21, p2, r22; primary shards forward the operation to their replicas.
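The four primary-shard steps from slide 19 and the partition case from slide 20 can be sketched as a toy model. This follows the slides' simplified description only; it is not Elasticsearch code, and real Elasticsearch has additional failure handling that is not modeled here. The class names, the `reachable` flag, and the returned status strings are all assumptions for the illustration.

```python
class ReplicaShard:
    def __init__(self, name):
        self.name = name
        self.docs = {}
        self.reachable = True  # set to False to simulate a network partition

    def apply(self, doc_id, doc):
        if not self.reachable:
            raise ConnectionError(f"{self.name} is unreachable")
        self.docs[doc_id] = doc


class PrimaryShard:
    def __init__(self, in_sync_replicas):
        self.docs = {}
        self.in_sync_replicas = in_sync_replicas

    def index(self, doc_id, doc):
        # Step 1: validate the operation and reject it if structurally invalid.
        if not doc_id or not isinstance(doc, dict):
            return "rejected"
        # Step 2: execute the operation locally.
        self.docs[doc_id] = doc
        # Step 3: forward the operation to each replica in the in-sync set.
        try:
            for replica in self.in_sync_replicas:
                replica.apply(doc_id, doc)
        except ConnectionError:
            # Slide 20's simplified model: the write cannot complete
            # while a replica is unreachable.
            return "unavailable"
        # Step 4: all replicas responded, so acknowledge to the client.
        return "acknowledged"


def demo():
    r11, r12 = ReplicaShard("r11"), ReplicaShard("r12")
    p1 = PrimaryShard([r11, r12])
    doc = {"user": "user1", "event": "login from homepage"}

    healthy = p1.index("a1", doc)
    r12.reachable = False              # partition between p1 and r12
    partitioned = p1.index("a2", doc)
    invalid = p1.index("a3", "not a document")
    return healthy, partitioned, invalid


if __name__ == "__main__":
    print(demo())
```

Waiting for every in-sync replica before acknowledging is what keeps the copies consistent, and it is also why an unreachable replica costs write availability in this model.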
  • 22. What is Cassandra
      Apache Cassandra is an open source, distributed NoSQL database that began internally at Facebook and was released as an open-source project in July 2008. Cassandra delivers continuous availability (zero downtime), high performance, and linear scalability that modern applications require, while also offering operational simplicity and effortless replication across data centers and geographies.
  • 23. Cassandra Ring Cluster
      - Each node is assigned a range of tokens
      - A client can connect to any node to write; that node becomes the coordinator node
      - Partition keys are hashed into a token. The coordinator uses the token to determine which node stores the data
      Diagram: a ring of eight nodes owning token ranges 1-20, 21-40, 41-60, 61-80, 81-100, 101-120, 121-140, and 141-160. The record "user1 auth a1 login from homepage" hashes to token 44, so the coordinator node sends it to the destination node that owns 41-60.
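The token lookup on this slide amounts to a search over sorted range boundaries. The sketch below uses the slide's toy ring of eight nodes with tokens 1 to 160. The node names and the MD5-based `token_for` function are assumptions made for the illustration; they stand in for Cassandra's real partitioner, which works over a far larger token space.

```python
import bisect
import hashlib

RING_SIZE = 160
# Upper bound of each node's token range, as drawn on the slide:
# 1-20, 21-40, 41-60, 61-80, 81-100, 101-120, 121-140, 141-160.
UPPER_BOUNDS = [20, 40, 60, 80, 100, 120, 140, 160]
NODES = [f"node-{i}" for i in range(1, 9)]  # hypothetical node names


def token_for(partition_key: str) -> int:
    """Toy hash of a partition key into 1..160 (not Cassandra's partitioner)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % RING_SIZE + 1


def owner_index(token: int) -> int:
    """Index of the node whose range contains the token."""
    return bisect.bisect_left(UPPER_BOUNDS, token)


if __name__ == "__main__":
    print(NODES[owner_index(44)])            # the node that owns 41-60
    print(NODES[owner_index(token_for("user1"))])
```

Any node can run this lookup, which is why any node a client connects to can act as the coordinator.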
  • 24. Replication Factor
      - Replication Factor (RF) = number of copies we want to store
      - The replication nodes are determined by the replication strategy
      - Simple strategy = the next two nodes on the ring become the replication nodes (for RF = 3)
      Diagram: the same ring; the record "user1 auth a1 login from homepage" with token 44 goes from the coordinator node to the node that owns 41-60 and to the replication nodes that follow it.
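Replica placement under the simple strategy, as the slide describes it, can be sketched as "the owner plus the next nodes clockwise". This is a minimal illustration on the slide's toy ring; `replica_indexes` is a hypothetical helper, and RF = 3 is assumed because the later slides write to three nodes.

```python
import bisect

# Upper bound of each node's token range on the slide's toy ring.
UPPER_BOUNDS = [20, 40, 60, 80, 100, 120, 140, 160]


def replica_indexes(token: int, rf: int = 3) -> list:
    """Owner of the token followed by the next rf - 1 nodes, wrapping around."""
    first = bisect.bisect_left(UPPER_BOUNDS, token)
    node_count = len(UPPER_BOUNDS)
    return [(first + step) % node_count for step in range(rf)]


if __name__ == "__main__":
    print(replica_indexes(44))    # owner of 41-60 and the two nodes after it
    print(replica_indexes(150))   # wraps around the end of the ring
```

The modulo is what makes the node list behave like a ring: a token owned by the last node places its extra copies on the first nodes.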
  • 25. Data Consistency
      - Client 1 connects to C1 to write data with token 44. C1 writes the data to three nodes (A, B, C), but the write fails at node B.
      - Client 2 connects to C2 to read data with token 44. What would happen?
      Diagram: coordinators C1 and C2; nodes A and C hold v2, node B still holds v1.
  • 26. Consistency Level (Write)
      ● ONE
        ○ Read: returns a response from the closest replica, as determined by the snitch. By default, a read repair runs in the background to make the other replicas consistent.
        ○ Write: a write must be written to the commit log and memtable of at least one replica node.
      ● QUORUM
        ○ Read: returns the record after a quorum of replicas has responded.
        ○ Write: a write must be written to the commit log and memtable on a quorum of replica nodes.
      ● ALL
        ○ Read: returns the record after all replicas have responded. The read operation will fail if a replica does not respond.
        ○ Write: a write must be written to the commit log and memtable on all replica nodes in the cluster for that partition.
  • 27. Write with CL=ALL
      - All replicas succeed -> success
      - 1 replica fails -> failure
      Diagram: Client 1 writes data with token 44 through C1; nodes A and C hold v2, node B still holds v1.
      Result: failed
  • 28. Write with CL=QUORUM
      Quorum = floor(RF / 2) + 1 = 2 (for RF = 3)
      - Two replicas succeed -> success
      - Fewer than two succeed -> failure
      Diagram: Client 1 writes data with token 44 through C1; nodes A and C hold v2, node B still holds v1.
      Result: success
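The outcomes on slides 27 and 28 follow from comparing the number of replica acknowledgements with what the consistency level requires. A minimal sketch, assuming RF = 3 and two successful replicas as in the example; `replicas_required` and `write_succeeds` are hypothetical helper names, not part of any Cassandra driver.

```python
def quorum(rf: int) -> int:
    """Majority of the replicas: floor(RF / 2) + 1."""
    return rf // 2 + 1


def replicas_required(level: str, rf: int) -> int:
    """Number of replica acknowledgements a consistency level needs."""
    return {"ONE": 1, "QUORUM": quorum(rf), "ALL": rf}[level]


def write_succeeds(level: str, rf: int, acks: int) -> bool:
    """A write succeeds when enough replicas acknowledged it."""
    return acks >= replicas_required(level, rf)


if __name__ == "__main__":
    # The slides' scenario: RF = 3, nodes A and C succeed, node B fails.
    for level in ("ALL", "QUORUM", "ONE"):
        print(level, write_succeeds(level, rf=3, acks=2))
```

Using integer division matters for even replication factors: with RF = 4 a quorum is 3 replicas, not 2.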
  • 29. Consistency Level (Read)
      ● ONE: returns a response from the closest replica, as determined by the snitch. By default, a read repair runs in the background to make the other replicas consistent.
      ● QUORUM: returns the record after a quorum of replicas has responded.
      ● ALL: returns the record after all replicas have responded. The read operation will fail if a replica does not respond.
  • 30. Write=QUORUM, Read=ONE
      Potentially inconsistent read. If client 2 reads from node B, it will receive stale data.
      Diagram: Client 1 writes data with token 44 through C1, Client 2 reads data with token 44 through C2; nodes A and C hold v2, node B still holds v1.
      W (QUORUM) + R (ONE) -> eventually consistent
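  • 31. Write=QUORUM, Read=QUORUM
      Any read combination will always return v2, because any two of the three replicas include at least one that holds v2.
      Diagram: Client 1 writes data with token 44 through C1, Client 2 reads data with token 44 through C2; nodes A and C hold v2, node B still holds v1.
      W (QUORUM) + R (QUORUM) -> consistent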
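The claims on slides 30 and 31 can be checked by enumerating every set of replicas a read might contact. The sketch below assumes the slides' state (A and C hold v2, B holds v1) and that a read returns the value with the newest write timestamp among the replicas contacted; the integer timestamps and the function names are made up for the example.

```python
from itertools import combinations

# Replica state after the quorum write: (value, write timestamp).
REPLICAS = {"A": ("v2", 2), "B": ("v1", 1), "C": ("v2", 2)}


def read(contacted) -> str:
    """Return the value with the newest timestamp among the contacted replicas."""
    newest = max((REPLICAS[name] for name in contacted), key=lambda item: item[1])
    return newest[0]


def possible_results(replica_count: int) -> set:
    """Every value a read could return when it contacts replica_count replicas."""
    return {read(group) for group in combinations(REPLICAS, replica_count)}


if __name__ == "__main__":
    print("Read=ONE   :", possible_results(1))
    print("Read=QUORUM:", possible_results(2))
    print("Read=ALL   :", possible_results(3))
```

A read at ONE can land on node B alone and return v1, while every pair of replicas contains A or C, so a quorum read always sees v2.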
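  • 32. Write=ALL, Read=ONE
      A write at CL=ALL succeeds only when every replica has stored v2, so after a successful write a read from any single replica returns v2. In the pictured scenario, where node B still holds v1, the write itself fails.
      Diagram: Client 1 writes data with token 44 through C1, Client 2 reads data with token 44 through C2; nodes A and C hold v2, node B still holds v1.
      W (ALL) + R (ONE) -> consistent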
  • 33. Summary of Read and Write CL
      Write    Read     Consistency   Read availability  Write availability
      All      All      Consistent    Low                Low
      Quorum   All      Consistent    Low                Medium
      One      All      Consistent    Low                High
      All      Quorum   Consistent    Medium             Low
      Quorum   Quorum   Consistent    Medium             Medium
      One      Quorum   Inconsistent  Medium             High
      All      One      Consistent    High               Low
      Quorum   One      Inconsistent  High               Medium
      One      One      Inconsistent  High               High
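The summary table can be regenerated from one rule. A commonly cited condition for consistent reads is that the replicas required by the write plus the replicas required by the read exceed the replication factor, so the two sets must overlap. The slides do not state this rule explicitly, so treat it as an assumption of this sketch, along with RF = 3 and the Low/Medium/High labels taken from the table.

```python
RF = 3  # replication factor assumed throughout the slides' example

LEVELS = ("ALL", "QUORUM", "ONE")


def replicas_required(level: str, rf: int = RF) -> int:
    """Number of replicas that must respond at a given consistency level."""
    return {"ONE": 1, "QUORUM": rf // 2 + 1, "ALL": rf}[level]


def availability(level: str) -> str:
    # The fewer replicas an operation needs, the more failures it tolerates.
    return {"ALL": "Low", "QUORUM": "Medium", "ONE": "High"}[level]


def summary_rows(rf: int = RF) -> list:
    """Rows of (write CL, read CL, consistency, read avail., write avail.)."""
    rows = []
    for read_level in LEVELS:
        for write_level in LEVELS:
            needed = replicas_required(write_level, rf) + replicas_required(read_level, rf)
            consistency = "Consistent" if needed > rf else "Inconsistent"
            rows.append((write_level, read_level, consistency,
                         availability(read_level), availability(write_level)))
    return rows


if __name__ == "__main__":
    for row in summary_rows():
        print("{:<7} {:<7} {:<13} {:<7} {:<7}".format(*row))
```

The rows come out in the same order as the slide's table, which makes it easy to see the tradeoff: each step towards higher availability on one side must be paid for on the other side to stay consistent.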
  • 34. Summary
      ● Redis: Availability > Consistency
      ● Cassandra: tweakable availability and consistency
      ● Elasticsearch: Availability < Consistency