Maintaining Consistency
Across Data Centers
or: How I Learned to Stop
Worrying About WAN Latency
Randy Fradin
BlackRock
2
• Part of BlackRock’s Aladdin Product Group
• Core Software Infrastructure: building scalable storage, compute, and messaging systems
• Joined BlackRock in 2009
• Using Cassandra since 2011
• Excited to be speaking at #CassandraSummit 2016
• Also check out my talk from Cassandra Summit 2015, “Multi-Tenancy in Cassandra at BlackRock”
About the Speaker
3
Who We Are
• BlackRock is the world’s largest investment manager
• Over $4.5 trillion in assets under management
• iShares (BlackRock’s ETF brand) is the world’s largest provider of exchange-traded funds
• #26 on list of the World’s Most Admired Companies 2016
• Advisor and technology provider
4
BlackRock as a Technology Provider
• Aladdin is BlackRock’s enterprise investment system
• Used by BlackRock and more than 160 other institutions around the world
• Generated over $500 million in revenue last year
5
Cassandra at BlackRock
• Started using Cassandra 0.6 in development in 2010
• First production usage in 2011 on version 0.8
• Currently on version 2.1
Using Cassandra in Multiple Data Centers
7
Support for Data Centers in Cassandra
• A cluster can span wide distances
• Disaster recovery
• Proximity to other systems
• In Cassandra, “data center” == replication group
• Usually you group by proximity
• Can also group by type of workload
SITE 2
analytic
workload
SITE 2
production
workload
SITE 1
analytic
workload
SITE 1
production
workload
8
Using Data Centers in Cassandra
1. Tell the cluster where your nodes are:
• Use a snitch!
2. Tell the cluster where you want your data to go:
• CREATE KEYSPACE example WITH REPLICATION =
{ 'class' : 'NetworkTopologyStrategy', 'DC1' : '3', 'DC2' : '3' };
3. Write your data and watch it replicate to all your data centers!
• (…if they’re all available)
• Otherwise, hinted handoff, read repair, and anti-entropy repair have your back.
[Diagram: a client connected to a ring of nodes; the legend distinguishes replica nodes from non-replica nodes]
* not discussed: racks, tokens, vnodes
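Step 2 above can be scripted when the set of data centers changes often. The following is a minimal sketch, assuming a hypothetical helper named create_keyspace_cql; it only builds the CQL text and does not talk to a cluster.

```python
def create_keyspace_cql(name, replicas_per_dc):
    """Build a CREATE KEYSPACE statement that uses NetworkTopologyStrategy.

    replicas_per_dc maps a data center name (as reported by the snitch)
    to the number of replicas wanted in that data center.
    """
    options = ["'class': 'NetworkTopologyStrategy'"]
    for dc, count in replicas_per_dc.items():
        options.append(f"'{dc}': {int(count)}")
    return f"CREATE KEYSPACE {name} WITH REPLICATION = {{ {', '.join(options)} }};"


# Same layout as the slide: three replicas in each of two data centers.
statement = create_keyspace_cql("example", {"DC1": 3, "DC2": 3})
print(statement)
```

The data center names must match what the snitch reports, otherwise the keyspace will reference data centers that have no nodes.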
9
Cross-Data Center Optimizations
Data moving between data centers is optimized:
• Cross-data center forwarding
• inter_dc_tcp_nodelay
• inter_dc_stream_throughput_outbound_megabits_per_sec
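[Diagram: a client write crossing data centers; the legend distinguishes messages carrying data only from messages carrying data plus forwarding addresses]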
10
Data Centers & Consistency Levels
• Every write or read is forwarded to corresponding replicas
• “consistency” = # of replies needed to succeed
• Reads reflect the latest writes (“strong consistency”) when:
read consistency + write consistency > replica count
• Some consistency levels are “aware” of data centers, others not:
– Data center “oblivious”: ANY, ONE, TWO, THREE, QUORUM, ALL
– Data center “aware”: LOCAL_ONE, LOCAL_QUORUM, EACH_QUORUM
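The strong-consistency rule above is plain arithmetic, so it can be checked before choosing levels. A minimal sketch follows; the function names are illustrative and not part of any Cassandra driver.

```python
def quorum(replica_count):
    """Replies needed for QUORUM: a strict majority of all replicas."""
    return replica_count // 2 + 1


def is_strongly_consistent(read_replies, write_replies, replica_count):
    """True when every read set must overlap every write set (R + W > N)."""
    return read_replies + write_replies > replica_count


# Keyspace from the earlier slide: 3 replicas in DC1 + 3 in DC2 = 6 replicas.
n = 6
q = quorum(n)
print(q)                                  # 4
print(is_strongly_consistent(q, q, n))    # True: QUORUM reads + QUORUM writes
print(is_strongly_consistent(1, 1, n))    # False: ONE + ONE can miss the latest write
```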
Strong Consistency Across Data Centers
12
Why Strong Consistency Across Data Centers?
• Typical Cassandra use cases prioritize low latency and high throughput.
• But, sometimes high availability and strong consistency are more important!
• Requirements:
1. Non-stop availability
2. Never lose data
13
Implementing Consistency Across Data Centers
What replication factor and consistency level should we use?
Requirements:
• Non-stop availability
• Never lose data
Locally consistent: 3 replicas per data center + LOCAL_QUORUM operations?
vs
Globally consistent: 1 replica per data center + QUORUM operations?
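The difference between the two options shows up when a write lands in one data center and the read happens in another. The sketch below computes the worst-case overlap between the replicas that acknowledged a write and the replicas that answer a read. Node names and the three-data-center layout are assumptions for illustration.

```python
def worst_case_overlap(write_pool, write_acks, read_pool, read_replies):
    """Fewest replicas guaranteed to be in both the write set and the read set.

    write_pool / read_pool are the replicas each operation may wait on.
    A result of 0 means a read can miss the latest write.
    """
    shared = len(set(write_pool) & set(read_pool))
    # Worst case: each operation uses its non-shared replicas first.
    writes_in_shared = max(0, write_acks - (len(write_pool) - shared))
    reads_in_shared = max(0, read_replies - (len(read_pool) - shared))
    return max(0, writes_in_shared + reads_in_shared - shared)


# Option 1: 3 replicas per data center, LOCAL_QUORUM (2 of the 3 local replicas).
dc_a = ["a1", "a2", "a3"]
dc_b = ["b1", "b2", "b3"]
same_dc = worst_case_overlap(dc_a, 2, dc_a, 2)
cross_dc = worst_case_overlap(dc_a, 2, dc_b, 2)

# Option 2: 1 replica per data center in 3 data centers, QUORUM (2 of 3 overall).
everywhere = ["a1", "b1", "c1"]
global_quorum = worst_case_overlap(everywhere, 2, everywhere, 2)

print(same_dc, cross_dc, global_quorum)  # 1 0 1
```

Under these assumptions LOCAL_QUORUM is consistent only for clients that stay in one data center, while global QUORUM keeps the overlap no matter where the client connects, at the cost of a WAN round trip.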
14
Challenge 1: Latency
With all that latency on each operation, isn’t performance terrible?
Actually, this wasn’t such a problem:
• 10ms+ latency per operation is acceptable for many apps
• Minimize use of sequential operations
• High throughput still achievable
[Diagram: a client waiting 10ms+ of synchronous latency per operation]
15
Challenge 2: Inconsistent Performance
Actually, the picture is not so simple…
• ~12ms reads + writes from the east coast
• ~74ms reads + writes from the west coast
• 6X performance difference after failover
[Diagram: three data centers with round trips of 12ms, 74ms, and 83ms between them; QUORUM takes 12ms+ for east coast clients and 74ms+ for west coast clients]
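The numbers above follow from one rule: an operation that needs k replies completes when the k-th fastest replica answers. Here is a minimal sketch, assuming one replica per site, zero latency to the local replica, and round trips read off the diagram; the site names and the pairing of sites to round trips are assumptions.

```python
# Assumed round-trip times in ms between sites (site names are hypothetical).
RTT_MS = {
    frozenset(["east1", "east2"]): 12,
    frozenset(["east1", "west"]): 74,
    frozenset(["east2", "west"]): 83,
}


def rtt_ms(a, b):
    """Round trip between two sites; the local replica is treated as free."""
    return 0 if a == b else RTT_MS[frozenset([a, b])]


def latency_ms(client_site, replica_sites, replies_needed):
    """Latency of an operation = round trip to the k-th fastest replica."""
    round_trips = sorted(rtt_ms(client_site, s) for s in replica_sites)
    return round_trips[replies_needed - 1]


sites = ["east1", "east2", "west"]
quorum = len(sites) // 2 + 1  # 2 of 3
quorum_latency = {s: latency_ms(s, sites, quorum) for s in sites}
print(quorum_latency)  # {'east1': 12, 'east2': 12, 'west': 74}
```

East coast clients reach a quorum with the neighboring east coast site, while a west coast client has to wait for a cross-country reply, which is the roughly 6X gap seen after failover.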
16
Challenge 2: Inconsistent Performance
• We expanded the cluster to a 4th data center, on the west coast
• Now QUORUM = 3 out of 4 replicas
• Now we have the same (slow) performance everywhere! yay?
[Diagram: four data centers, two per coast, with round trips of 12ms, 15ms, 74ms, 74ms, and 83ms; QUORUM takes 74ms+ from either coast]
17
Challenge 2: Inconsistent Performance
• But wait! For strong consistency we need R + W > N
• So we got creative: read TWO + write THREE > (N=4)
• Now reads take ~12-15ms and writes take ~74ms
• Swap for write-heavy workloads: read THREE + write TWO
[Diagram: the same four data centers; read @ TWO takes 12ms+ or 15ms+ depending on the client’s coast; write @ THREE takes 74ms+ from either coast]
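The same k-th fastest reply rule explains why splitting the quorum unevenly helps. The sketch below assumes four sites with one replica each; the east-to-west2 round trips are not legible from the slide and are filled in as assumptions.

```python
# Assumed round-trip times in ms (site names and the west2 links are assumptions).
RTT_MS = {
    frozenset(["east1", "east2"]): 12,
    frozenset(["west1", "west2"]): 15,
    frozenset(["east1", "west1"]): 74,
    frozenset(["east1", "west2"]): 74,
    frozenset(["east2", "west1"]): 83,
    frozenset(["east2", "west2"]): 83,
}


def rtt_ms(a, b):
    """Round trip between two sites; the local replica is treated as free."""
    return 0 if a == b else RTT_MS[frozenset([a, b])]


def latency_ms(client_site, replica_sites, replies_needed):
    """Latency of an operation = round trip to the k-th fastest replica."""
    round_trips = sorted(rtt_ms(client_site, s) for s in replica_sites)
    return round_trips[replies_needed - 1]


sites = ["east1", "east2", "west1", "west2"]
n = len(sites)

# QUORUM on 4 replicas needs 3 replies, so every client waits on a WAN hop.
quorum_latency = {s: latency_ms(s, sites, n // 2 + 1) for s in sites}

# read TWO + write THREE still satisfies R + W > N (2 + 3 > 4).
read_latency = {s: latency_ms(s, sites, 2) for s in sites}
write_latency = {s: latency_ms(s, sites, 3) for s in sites}

print(quorum_latency)
print(read_latency)
print(write_latency)
```

With these assumed round trips, reads drop to the 12-15ms range because two replies are always available on the client's own coast, and writes stay at the cross-country cost.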
18
Challenge 3: Migrating Data Centers
• Last year we migrated one of the east coast data centers
• Temporarily increased replica count from 4 to 5
• But TWO + THREE = 5, which is not > 5! This violates strong consistency!
• What we really wanted was TWO + ALL_BUT_ONE
• But there is no ALL_BUT_ONE…
19
Challenge 3: Migrating Data Centers
• So we made THREE == 4 !
• … rather, we patched Cassandra to redefine THREE -> replica count minus one
[Diagram: read @ TWO takes 12ms+ or 15ms+ depending on the client’s coast; write @ “THREE” takes 74ms+ from either coast, where THREE means 4]
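The effect of redefining THREE can be checked with a few lines of arithmetic. This is an illustrative model only, not the actual Cassandra patch; the function name and the patched flag are hypothetical.

```python
def replies_required(level, replica_count, patched=False):
    """Replies a coordinator waits for at a given consistency level.

    With patched=True, THREE is reinterpreted as "all but one replica",
    mirroring the idea described on the slide (the real change was made
    inside Cassandra itself and is not reproduced here).
    """
    if level == "THREE" and patched:
        return replica_count - 1
    fixed = {"ONE": 1, "TWO": 2, "THREE": 3}
    if level in fixed:
        return fixed[level]
    if level == "QUORUM":
        return replica_count // 2 + 1
    if level == "ALL":
        return replica_count
    raise ValueError(f"unsupported level: {level}")


def strongly_consistent(read_level, write_level, replica_count, patched=False):
    r = replies_required(read_level, replica_count, patched)
    w = replies_required(write_level, replica_count, patched)
    return r + w > replica_count


print(strongly_consistent("TWO", "THREE", 4))                # True:  2 + 3 > 4
print(strongly_consistent("TWO", "THREE", 5))                # False: 2 + 3 = 5
print(strongly_consistent("TWO", "THREE", 5, patched=True))  # True:  2 + 4 > 5
```

Because the patched level tracks the replica count, the guarantee holds both during the migration (5 replicas) and after it (4 replicas) without changing application code.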
20
Challenge 4: Performance Degradation
• If a single node fails, read latency goes from ~12ms to ~74ms
• Theoretical solution: 2 replicas per data center (8 in total), read THREE + write SIX
• But, once again, there is no SIX!
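The degradation and the theoretical fix can both be reproduced with the k-th fastest reply rule. The sketch below takes the point of view of a client in one east coast site; the site names and round trips are assumptions in line with the earlier slides.

```python
# Assumed round trips in ms from a client in site "east1" (names are hypothetical).
RTT_FROM_EAST1_MS = {"east1": 0, "east2": 12, "west1": 74, "west2": 74}


def replica_round_trips(replicas_per_site, failed_local=0):
    """Round trips to every live replica, with some local replicas down."""
    round_trips = []
    for site, rtt in RTT_FROM_EAST1_MS.items():
        live = replicas_per_site - (failed_local if site == "east1" else 0)
        round_trips.extend([rtt] * live)
    return round_trips


def latency_ms(round_trips, replies_needed):
    """Latency of an operation = round trip to the k-th fastest replica."""
    return sorted(round_trips)[replies_needed - 1]


# Today: 1 replica per site (N=4), read TWO.
healthy_read = latency_ms(replica_round_trips(1), 2)
degraded_read = latency_ms(replica_round_trips(1, failed_local=1), 2)

# Theoretical fix: 2 replicas per site (N=8), read THREE + write SIX (3 + 6 > 8).
healthy_read_8 = latency_ms(replica_round_trips(2), 3)
degraded_read_8 = latency_ms(replica_round_trips(2, failed_local=1), 3)
write_8 = latency_ms(replica_round_trips(2), 6)

print(healthy_read, degraded_read)       # 12 74
print(healthy_read_8, degraded_read_8)   # 12 12
print(write_8)                           # 74
```

With two replicas per site, losing one local node still leaves three replicas on the same coast, so reads stay fast; the catch, as the slide notes, is that no built-in level means six replies.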
21
Challenge 5: Isolating Different Workloads
It’s useful to isolate analytic workloads from production workloads …
… but this isn’t possible if “production” is doing quorum across all replicas.
22
Challenge 5: Isolating Different Workloads
• Potential solution:
– Configure production nodes as one data center
– Use Cassandra’s rack feature to distribute data evenly
– Use LOCAL_QUORUM on production nodes
– Configure analytic nodes into separate “data centers”
• Issues:
– Does not permit TWO + THREE “quorum”
– Can’t reuse the same cluster for other apps which want truly “local” LOCAL_QUORUM
[Diagram: Example: 3 physical sites, 4 Cassandra “data centers” (one analytic workload per site, plus one production workload spanning all sites)]
23
Making Consistency Pluggable
Many challenges could be solved if consistency were pluggable:
• Quorum across a subset of data centers
• “Uneven” quorums:
– read 2 + write N-1
– read (N+1)/2 + write (N/2)+1
– and so on…
• Local consistency with extra resiliency:
– LOCAL_QUORUM + X remote replicas
– LOCAL_QUORUM in 2 data centers
Consistency levels should be:
• User-definable
• Fully configurable
• Simple for operators to deploy
Discussion is ongoing: CASSANDRA-8119
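One way to picture a pluggable consistency level is as a predicate over the acknowledgements received so far. The sketch below is purely hypothetical: it is not Cassandra's API and not the design discussed in CASSANDRA-8119, and every name in it is invented for illustration.

```python
from collections import Counter


def local_quorum_plus_remote(local_dc, replicas_per_dc, extra_remote):
    """Policy: a quorum in the local data center plus X replies from elsewhere."""
    local_needed = replicas_per_dc[local_dc] // 2 + 1

    def satisfied(acked_dcs):
        # acked_dcs lists the data center of each replica that has replied.
        acks = Counter(acked_dcs)
        remote = sum(count for dc, count in acks.items() if dc != local_dc)
        return acks[local_dc] >= local_needed and remote >= extra_remote

    return satisfied


def all_but_one(replicas_per_dc):
    """Policy: every replica except one must reply (the missing ALL_BUT_ONE)."""
    total = sum(replicas_per_dc.values())

    def satisfied(acked_dcs):
        return len(acked_dcs) >= total - 1

    return satisfied


topology = {"DC1": 3, "DC2": 3}
resilient_local = local_quorum_plus_remote("DC1", topology, extra_remote=1)
print(resilient_local(["DC1", "DC1"]))         # False: no remote copy yet
print(resilient_local(["DC1", "DC1", "DC2"]))  # True

n_minus_one = all_but_one({"E1": 1, "E2": 1, "W1": 1, "W2": 1, "E3": 1})
print(n_minus_one(["E1", "E2", "W1", "W2"]))   # True: 4 of 5 replicas replied
```

Expressing levels as small functions like these would let operators define “uneven” quorums without patching the database, which is the gap the slide describes.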
24
Other Tips for Success
• Consider implications for fault-tolerance
• Two nodes offline in different data centers can cause failures
• Build a custom snitch to explicitly favor nearby data centers
• Increase native_transport_max_threads
• Enable inter_dc_tcp_nodelay
• Check your TCP window size settings
http://rockthecode.io/
@rockthecodeIO

 

Maintaining Consistency Across Data Centers (Randy Fradin, BlackRock) | Cassandra Summit 2016

  • 1. Maintaining Consistency Across Data Centers, or: How I Learned to Stop Worrying About WAN Latency. Randy Fradin, BlackRock
  • 2. About the Speaker • Part of BlackRock’s Aladdin Product Group • Core Software Infrastructure - building scalable storage, compute, and messaging systems • Joined BlackRock in 2009 • Using Cassandra since 2011 • Excited to be speaking at #CassandraSummit 2016 • Also check out my talk from Cassandra Summit 2015, “Multi-Tenancy in Cassandra at BlackRock”
  • 3. Who We Are • The world’s largest investment manager • Over $4.5 trillion in assets under management • The world’s largest provider of exchange-traded funds • #26 on the list of the World’s Most Admired Companies 2016 • Advisor and technology provider
  • 4. BlackRock as a Technology Provider • Aladdin is BlackRock’s enterprise investment system • Used by BlackRock and more than 160 other institutions around the world • Generated over $500 million in revenue last year
  • 5. Cassandra at BlackRock • Started using Cassandra 0.6 in development in 2010 • First production usage in 2011 on version 0.8 • Currently on version 2.1
  • 6. Using Cassandra in Multiple Data Centers
  • 7. Support for Data Centers in Cassandra • A cluster can span wide distances • Disaster recovery • Proximity to other systems • In Cassandra, “data center” == replication group • Usually you group by proximity • Can also group by type of workload [Figure: four Cassandra “data centers”: Site 1 production workload, Site 1 analytic workload, Site 2 production workload, Site 2 analytic workload]
  • 8. Using Data Centers in Cassandra 1. Tell the cluster where your nodes are: • Use a snitch! 2. Tell the cluster where you want your data to go: • CREATE KEYSPACE example WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'DC1' : '3', 'DC2' : '3' } 3. Write your data and watch it replicate to all your data centers! • (…if they’re all available) • Otherwise, hinted handoff, read repair, and anti-entropy repair have your back. [Figure legend: client, replica nodes, non-replica nodes] * not discussed: racks, tokens, vnodes
  • 9. Cross-Data Center Optimizations • Data moving between data centers is optimized: • Cross-data center forwarding • inter_dc_tcp_nodelay • inter_dc_stream_throughput_outbound_megabits_per_sec [Figure legend: client, data, data + forwarding addresses]
  • 10. Data Centers & Consistency Levels • Every write or read is forwarded to the corresponding replicas • “consistency” = # of replies needed to succeed • Reads reflect the latest writes (“strong consistency”) when: read consistency + write consistency > replica count • Some consistency levels are “aware” of data centers, others not: • Data center “oblivious”: ANY, ONE, TWO, THREE, QUORUM, ALL • Data center “aware”: LOCAL_ONE, LOCAL_QUORUM, EACH_QUORUM
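The overlap rule on slide 10 can be checked mechanically. The sketch below is illustrative only: `replies_needed` and `is_strongly_consistent` are hypothetical helper names, not part of Cassandra or any driver, and they cover only the data-center-oblivious levels for a given total replica count. ANY is left out because a write at ANY can be satisfied by a hint rather than a replica.

```python
def replies_needed(level: str, replica_count: int) -> int:
    """Number of replica replies a data-center-oblivious level waits for."""
    fixed = {"ONE": 1, "TWO": 2, "THREE": 3}
    if level in fixed:
        return fixed[level]
    if level == "QUORUM":
        # A quorum is a strict majority of all replicas.
        return replica_count // 2 + 1
    if level == "ALL":
        return replica_count
    raise ValueError(f"unsupported consistency level: {level}")


def is_strongly_consistent(read_level: str, write_level: str, replica_count: int) -> bool:
    """True when every read set must overlap every write set (R + W > N)."""
    r = replies_needed(read_level, replica_count)
    w = replies_needed(write_level, replica_count)
    return r + w > replica_count


if __name__ == "__main__":
    print(is_strongly_consistent("QUORUM", "QUORUM", 3))  # 2 + 2 > 3
    print(is_strongly_consistent("ONE", "ONE", 3))        # 1 + 1 is not > 3
```

With three replicas, QUORUM reads plus QUORUM writes satisfy the rule, while ONE plus ONE does not.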
  • 12. Why Strong Consistency Across Data Centers? • Typical Cassandra use cases prioritize low latency and high throughput. • But sometimes high availability and strong consistency are more important! • Requirements: 1. Non-stop availability 2. Never lose data
  • 13. Implementing Consistency Across Data Centers • What replication factor and consistency level should we use? • Requirements: non-stop availability; never lose data • Locally consistent: 3 replicas per data center + LOCAL_QUORUM operations? • vs. globally consistent: 1 replica per data center + QUORUM operations?
  • 14. Challenge 1: Latency • With all that latency on each operation, isn’t performance terrible? • Actually, this wasn’t such a problem: • 10ms+ latency per operation is acceptable for many apps • Minimize use of sequential operations • High throughput still achievable [Figure: client with 10ms+ synchronous latency]
  • 15. Challenge 2: Inconsistent Performance • Actually, the picture is not so simple… • ~12ms reads + writes from the east coast • ~74ms reads + writes from the west coast • 6X performance difference after failover [Figure: latencies labeled 74ms, 83ms, and 12ms; QUORUM takes 12ms+ from the east coast and 74ms+ from the west coast]
  • 16. Challenge 2: Inconsistent Performance • We expanded the cluster to a 4th data center, on the west coast • Now QUORUM = 3 out of 4 replicas • Now we have the same (slow) performance everywhere! yay? [Figure: latencies labeled 74ms, 83ms, 12ms, 74ms, and 15ms; QUORUM takes 74ms+ from both coasts]
  • 17. Challenge 2: Inconsistent Performance • But wait! For strong consistency we only need R + W > N • So we got creative: read TWO + write THREE, since 2 + 3 > 4 (N = 4) • Now reads take ~12-15ms and writes take ~74ms • Swap for write-heavy workloads: read THREE + write TWO [Figure: latencies labeled 74ms, 83ms, 12ms, 74ms, and 15ms; read @ TWO takes 12ms+ or 15ms+ depending on the client; write @ THREE takes 74ms+]
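The latency effect of choosing TWO versus THREE can be reasoned about with a simple model: an operation that waits for k replies finishes roughly when the k-th fastest replica answers. The sketch below makes that explicit. The round-trip times are assumed values loosely based on the figures on the slides, `operation_latency_ms` is a hypothetical helper, and the model ignores processing time on the nodes.

```python
from typing import Sequence


def operation_latency_ms(replica_rtts_ms: Sequence[float], replies_needed: int) -> float:
    """Approximate latency of an operation that waits for the k fastest replies."""
    if not 1 <= replies_needed <= len(replica_rtts_ms):
        raise ValueError("replies_needed must be between 1 and the replica count")
    # The operation completes when the k-th fastest replica has answered.
    return sorted(replica_rtts_ms)[replies_needed - 1]


if __name__ == "__main__":
    # Assumed round-trip times (ms) from one east coast client to the 4 replicas:
    # a local replica, the other east coast site, and two west coast sites.
    east_client_rtts = [1, 12, 74, 83]
    print(operation_latency_ms(east_client_rtts, 2))  # read at TWO
    print(operation_latency_ms(east_client_rtts, 3))  # write at THREE (same as QUORUM of 4)
```

Under these assumed numbers a read at TWO is bounded by the second-nearest replica, while a write at THREE has to cross the country, which matches the ~12ms versus ~74ms split described on slide 17.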
  • 18. Challenge 3: Migrating Data Centers • Last year we migrated one of the east coast data centers • Temporarily increased replica count from 4 to 5 • But TWO + THREE is not > 5! This violates strong consistency! • What we really wanted was TWO + ALL_BUT_ONE • But there is no ALL_BUT_ONE…
  • 19. Challenge 3: Migrating Data Centers • So we made THREE == 4! • …rather, we patched Cassandra to redefine THREE -> replica count minus one [Figure: read @ TWO takes 12ms+ or 15ms+ depending on the client; write @ “THREE” takes 74ms+ (where THREE means 4!)]
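The arithmetic behind the patch is short enough to spell out. This is only a sketch of the reasoning, not the patch itself: `min_write_replies` is a hypothetical helper that returns the smallest write count W satisfying R + W > N for a given read count R.

```python
def min_write_replies(read_replies: int, replica_count: int) -> int:
    """Smallest W such that read_replies + W > replica_count."""
    return replica_count - read_replies + 1


if __name__ == "__main__":
    for n in (4, 5):
        w = min_write_replies(2, n)
        # With reads at TWO, the required write count is always N - 1.
        print(f"N={n}: read 2 + write {w} (replica count minus one = {n - 1})")
```

With reads fixed at TWO, the required write count is always N - 1, which is why redefining THREE as "replica count minus one" keeps the overlap guarantee at both 4 and 5 replicas.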
  • 20. Challenge 4: Performance Degradation • If a single node fails, read latency goes from ~12ms to ~74ms • Theoretical solution: 2 replicas per data center (8 in total), read THREE + write SIX • But, once again, there is no SIX!
  • 21. Challenge 5: Isolating Different Workloads • It’s useful to isolate analytic workloads from production workloads… • …but this isn’t possible if “production” is doing quorum across all replicas.
  • 22. Challenge 5: Isolating Different Workloads • Potential solution: – Configure production nodes as one data center – Use Cassandra’s rack feature to distribute data evenly – Use LOCAL_QUORUM on production nodes – Configure analytic nodes into separate “data centers” • Issues: – Does not permit TWO + THREE “quorum” – Can’t reuse the same cluster for other apps which want truly “local” LOCAL_QUORUM [Figure: Example: 3 physical sites, 4 Cassandra “data centers” (one production workload, three analytic workloads)]
  • 23. Making Consistency Pluggable • Many challenges could be solved if consistency were pluggable: • Quorum across a subset of data centers • “Uneven” quorums: – read 2 + write N-1 – read (N+1)/2 + write (N/2)+1 – and so on… • Local consistency with extra resiliency: – LOCAL_QUORUM + X remote replicas – LOCAL_QUORUM in 2 data centers • Consistency levels should be: – User-definable – Fully configurable – Simple for operators to deploy • Discussion is ongoing: CASSANDRA-8119
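One way to picture a "pluggable" consistency level is as a function from replica count to the number of replies required. The sketch below models the two "uneven" quorum pairs from slide 23 that way and checks that each pair still satisfies R + W > N. It is a toy model under stated assumptions: the names `PAIRS` and `overlaps` are hypothetical, the divisions are assumed to be integer divisions, and nothing here reflects the design discussed in CASSANDRA-8119.

```python
from typing import Callable, Dict, Tuple

# A rule maps the replica count N to the number of replies required.
ConsistencyRule = Callable[[int], int]

# (read rule, write rule) pairs taken from the "uneven" quorums on the slide.
PAIRS: Dict[str, Tuple[ConsistencyRule, ConsistencyRule]] = {
    "read 2 + write N-1": (lambda n: 2, lambda n: n - 1),
    "read (N+1)/2 + write (N/2)+1": (lambda n: (n + 1) // 2, lambda n: n // 2 + 1),
}


def overlaps(pair_name: str, replica_count: int) -> bool:
    """True when the pair's read and write sets must intersect (R + W > N)."""
    read_rule, write_rule = PAIRS[pair_name]
    return read_rule(replica_count) + write_rule(replica_count) > replica_count


if __name__ == "__main__":
    for name in PAIRS:
        print(name, all(overlaps(name, n) for n in range(2, 10)))
```

Both pairs sum to exactly N + 1 replies for every replica count of 2 or more, so they give the same guarantee as QUORUM + QUORUM while shifting the cost toward either reads or writes.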
  • 24. Other Tips for Success • Consider implications for fault-tolerance • Two nodes offline in different data centers can cause failures • Build a custom snitch to explicitly favor nearby data centers • Increase native_transport_max_threads • Enable inter_dc_tcp_nodelay • Check your TCP window size settings
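The fault-tolerance tip follows from counting live replicas. The sketch below assumes the layout used earlier in the talk (4 data centers with one replica each, read TWO + write THREE); `operation_succeeds` is a hypothetical helper, and it assumes every replica that is still up can reply in time.

```python
def operation_succeeds(replies_needed: int, replica_count: int, replicas_down: int) -> bool:
    """True when enough replicas remain alive to satisfy the consistency level."""
    return replica_count - replicas_down >= replies_needed


if __name__ == "__main__":
    n = 4  # assumed: one replica in each of 4 data centers
    print(operation_succeeds(3, n, 1))  # write at THREE with one replica down: True
    print(operation_succeeds(3, n, 2))  # write at THREE with two replicas down: False
    print(operation_succeeds(2, n, 2))  # read at TWO with two replicas down: True
```

In this layout a single replica per data center means any two node failures in different data centers are enough to fail writes at THREE, even though reads at TWO keep working.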
