Migrating 500 Nodes from Rackspace to Google Cloud
With Zero Downtime
Gilberto Müller
• Engineering Manager
• 17 YoE
• XP - Infrastructure and datastores
• METRONOM for 2.5 years
• Previously HSBC, Wipro, MasterCard
• SRE enthusiast
Paul Chandler
• Independent Cassandra Consultant
• First used Cassandra in 2014
• Designed this Google Move process
• Historically based in the Travel Industry
British Airways, Avis, TUI etc
METRO
• Leading international wholesale and
retail food specialist company
• 50+ years old
• 35 countries
• 764 stores (in 25 countries)
• 150.000 people worldwide
• ~24mn customers
• €36.5bn in sales for fiscal year
2017/18
METRONOM
• The biggest software company
you never heard about (from our CEO)
• Digital transformation started in 2015
• Platform as a Service and Dev
• Cassandra started as the only option
• 8 Platform teams (changing over time)
• Multiple DCs in different countries,
hybrid-cloud (EU, CH, and RU*)
• 100+ application development teams
• MCC main customer
NoSQL Team
• 9 people from 10 different places
• Agile: Dash
• Shared responsibility
• Consultancy
• SRE
• DevOps
• Infrastructure as Code
• Provisioning, patch, upgrade
• Support
• Migrations
• We offer a platform, not DBA
• Service wrapper (whole platform)
• Backup and restore (whole platform)
• On-call
Products
• Apache Cassandra
• DataStax Enterprise
• Apache Solr (Solr Cloud)
• DSE Search
• Apache Spark
• HDFS*
DataStax is a registered trademark of DataStax, Inc. and its subsidiaries in the United States and/or other countries.
Apache, Apache Cassandra, Cassandra, Apache Solr, Apache Spark, Spark, Apache Zookeeper, Zookeeper, Apache Hadoop, and Hadoop are
either registered trademarks or trademarks of the Apache Software Foundation or its subsidiaries in Canada, the United States and/or other
countries.
Technologies
and Numbers
• Zookeeper
• HAProxy
• Nginx
• OpsCenter
• Grafana
• PostgreSQL
• Puppet
• Jenkins
• Java
• Linux
• 1200+ servers
• 300+ clusters
• 165+ C* (both flavours)
• 80+ Solr
Implementation
Steady State - 1 Datacenter RS_UK
RS_UK
• Multiple Clusters
• Move 1 cluster at a time
• No Downtime allowed
RS_UK
• Local consistency levels for reading and writing
• LOCAL_ONE
• LOCAL_QUORUM
• Application driver needs a DC-aware policy
• Lightweight Transactions (LWT) must use LOCAL_SERIAL
Application Pre Requisites
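The prerequisites above boil down to one rule: while two datacenters exist, every consistency level an application uses should be a DC-local one. A minimal Python sketch of such an audit follows. The function name `non_local_levels` and the idea of checking a list of configured levels are assumptions for illustration, not part of the process described in the slides.

```python
# Consistency levels that are satisfied entirely within the local datacenter.
LOCAL_LEVELS = {"LOCAL_ONE", "LOCAL_QUORUM", "LOCAL_SERIAL"}


def non_local_levels(configured_levels):
    """Return the configured consistency levels that are not DC-local."""
    return sorted({level.upper() for level in configured_levels} - LOCAL_LEVELS)


if __name__ == "__main__":
    # Hypothetical application settings: QUORUM and SERIAL would involve
    # replicas in both datacenters during the move.
    print(non_local_levels(["LOCAL_QUORUM", "QUORUM", "serial"]))
```

An empty result suggests the application is ready for the move; anything listed would need changing first.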
RS_UK
ALTER KEYSPACE system_auth WITH replication =
{'class': 'NetworkTopologyStrategy', 'RS_UK': 3,
'GL_EU': 3};
Keyspaces:
• system_auth
• system_schema
• dse_leases
• system_distributed
• dse_perf
• system_traces
• dse_security
Step 1 – Alter system keyspaces
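Since the same statement is repeated for each keyspace, it can be generated rather than typed. The following is a small Python sketch, assuming a helper named `alter_keyspace_cql` (hypothetical) and the keyspace list and replication factors shown on the slide. It only builds the CQL text and does not run it.

```python
def alter_keyspace_cql(keyspace, replication):
    """Build an ALTER KEYSPACE statement using NetworkTopologyStrategy.

    `replication` maps datacenter name to replication factor.
    """
    options = ", ".join(f"'{dc}': {rf}" for dc, rf in replication.items())
    return (
        f"ALTER KEYSPACE {keyspace} WITH replication = "
        f"{{'class': 'NetworkTopologyStrategy', {options}}};"
    )


# Keyspace names as listed on the slide.
SYSTEM_KEYSPACES = [
    "system_auth",
    "system_schema",
    "dse_leases",
    "system_distributed",
    "dse_perf",
    "system_traces",
    "dse_security",
]

if __name__ == "__main__":
    for keyspace in SYSTEM_KEYSPACES:
        print(alter_keyspace_cql(keyspace, {"RS_UK": 3, "GL_EU": 3}))
```

The same helper could produce the single-datacenter statements used later by passing only `{"GL_EU": 3}`.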
RS_UK
GL_EU
• Can be a different number of nodes
• Only system keyspaces automatically migrated
• Should be quick
Step 2 - Create Nodes in New Datacenter
RS_UK
GL_EU
cassandra.yaml
• cluster_name: must be the same for both datacenters
• seeds: should point to seeds in RS_UK
cassandra-rackdc.properties
• dc: should be the new datacenter
Continue using GossipingPropertyFileSnitch
Step 2 - Create Nodes in New Datacenter
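The settings above can be checked before a new node is started. The sketch below is one possible pre-flight check in Python. The flat `settings` dictionary is a hypothetical merged view of cassandra.yaml and cassandra-rackdc.properties (in the real cassandra.yaml the seeds sit under `seed_provider`), and the cluster name is made up for the example.

```python
def check_new_node(settings, cluster_name, old_dc_seeds, new_dc):
    """Return a list of problems with a new node's settings (empty if none)."""
    problems = []
    if settings.get("cluster_name") != cluster_name:
        problems.append("cluster_name must match the existing cluster")
    seeds = set(settings.get("seeds", []))
    # Every configured seed must be one of the existing datacenter's seeds.
    if not seeds or not seeds <= set(old_dc_seeds):
        problems.append("seeds should point to nodes in the existing datacenter")
    if settings.get("dc") != new_dc:
        problems.append("dc should be the new datacenter")
    if settings.get("endpoint_snitch") != "GossipingPropertyFileSnitch":
        problems.append("endpoint_snitch should stay GossipingPropertyFileSnitch")
    return problems


if __name__ == "__main__":
    # Hypothetical settings for one new GL_EU node.
    node = {
        "cluster_name": "example_cluster",
        "seeds": ["10.29.30.29"],
        "dc": "GL_EU",
        "endpoint_snitch": "GossipingPropertyFileSnitch",
    }
    print(check_new_node(node, "example_cluster", ["10.29.30.29", "10.29.30.33"], "GL_EU"))
```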
RS_UK
GL_EU
Nodes created and system keyspaces copied
RS_UK
GL_EU
• Must still connect to
RS_UK
• No Data in GL_EU
Nodes created and system keyspaces copied
RS_UK
GL_EU
• ALTER KEYSPACE user_keyspace1 WITH replication = {'class': 'NetworkTopologyStrategy',
'RS_UK': 3, 'GL_EU': 3};
• ALTER KEYSPACE user_keyspace2 WITH replication = {'class': 'NetworkTopologyStrategy',
'RS_UK': 3, 'GL_EU': 3};
• ALTER KEYSPACE user_keyspace3 WITH replication = {'class': 'NetworkTopologyStrategy',
'RS_UK': 3, 'GL_EU': 3};
Step 3 – Alter Replication for User Keyspaces
RS_UK
GL_EU
At this point:
• Newly inserted data is replicated
• Old data is not replicated (yet)
• Applications still must not connect to GL_EU
• Lots of data missing in GL_EU
Keyspaces Replicated
RS_UK
GL_EU
On each new node, run in turn:
• nodetool rebuild RS_UK
This will take some time; best to script this section
Step 4 – Rebuild Nodes
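One way to script the rebuild is sketched below in Python. It assumes the new nodes are reachable over SSH and that `nodetool` is on the remote path; both are assumptions, and the function names are hypothetical. The slides do not show the script that was actually used. With `dry_run=True` the commands are only printed.

```python
import subprocess


def rebuild_commands(new_nodes, source_dc):
    """Build one 'nodetool rebuild <source_dc>' command per new node."""
    return [["ssh", node, "nodetool", "rebuild", source_dc] for node in new_nodes]


def rebuild_in_turn(new_nodes, source_dc, dry_run=True):
    """Run the rebuilds one node at a time, stopping at the first failure."""
    for command in rebuild_commands(new_nodes, source_dc):
        print(" ".join(command))
        if not dry_run:
            # check=True raises CalledProcessError if a rebuild fails,
            # so later nodes are not started after an error.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    # Addresses of the GL_EU nodes, taken from the nodetool status example.
    rebuild_in_turn(["10.131.134.35", "10.131.134.39", "10.131.134.42"], "RS_UK")
```

Running the nodes strictly in sequence matches the "in turn" instruction and limits the streaming load placed on RS_UK at any moment.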
RS_UK
GL_EU
Nodes gain data one node at a time
Step 4 – Rebuild Nodes
RS_UK
GL_EU
Fully functioning cluster:
• Connect to either DC
• Data flows automatically
Nodes Rebuilt
RS_UK
GL_EU
cassandra.yaml: change seed nodes to be nodes in GL_EU
Point all applications to the new datacenter
Run a full repair on all nodes in the new datacenter
Prepare for Decommission
RS_UK
GL_EU
Prepare for Decommission
RS_UK
GL_EU
• ALTER KEYSPACE user_keyspace1 WITH replication = {'class': 'NetworkTopologyStrategy',
'GL_EU': 3};
• ALTER KEYSPACE user_keyspace2 WITH replication = {'class': 'NetworkTopologyStrategy',
'GL_EU': 3};
• ALTER KEYSPACE user_keyspace3 WITH replication = {'class': 'NetworkTopologyStrategy',
'GL_EU': 3};
• Plus system keyspaces
Alter Replication to one Datacenter for ALL keyspaces
RS_UK
GL_EU
Data now Disconnected
RS_UK
GL_EU
• Stop each node in RS_UK
• Decommission each node in turn
• nodetool removenode xxxxxxxxxxxxxxxx
Decommission RS_UK nodes
RS_UK
GL_EU
Decommission RS_UK nodes
Datacenter: RS_UK
===================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns  Host ID                               Rack
UN  10.29.30.29    11.66 GB  256     ?     ab479afd-c754-47f7-92fb-47790d734ac9  rack1
UN  10.29.30.33    12.32 GB  256     ?     9aa1c5c5-c6cd-4267-ba68-c6bd8b2ac460  rack2
UN  10.29.30.34    12.16 GB  256     ?     db454258-ac73-4a8a-9c75-226108c66889  rack3
Datacenter: GL_EU
===================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns  Host ID                               Rack
UN  10.131.134.35  13.19 GB  256     ?     114b4a37-7d69-40e5-988b-a4c998e7a02a  rack1
UN  10.131.134.39  12.14 GB  256     ?     4173fc2a-e65c-43aa-baa4-a5eefe0ceb60  rack2
UN  10.131.134.42  12.97 GB  256     ?     8b5dde02-1ff1-48cc-9900-6d8f2bb339bf  rack3
nodetool removenode ab479afd-c754-47f7-92fb-47790d734ac9
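The host IDs needed by `nodetool removenode` can be pulled out of the status output rather than copied by hand. The Python sketch below shows one way to do that, assuming the output has the layout shown above. The function names and the abbreviated sample text are for illustration only.

```python
import re

# Host IDs in nodetool status are UUIDs (8-4-4-4-12 hex digits).
HOST_ID = re.compile(r"[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}")

# Abbreviated copy of the status output shown on the slide.
SAMPLE_STATUS = """\
Datacenter: RS_UK
===================
UN 10.29.30.29 11.66 GB 256 ? ab479afd-c754-47f7-92fb-47790d734ac9 rack1
UN 10.29.30.33 12.32 GB 256 ? 9aa1c5c5-c6cd-4267-ba68-c6bd8b2ac460 rack2
Datacenter: GL_EU
===================
UN 10.131.134.35 13.19 GB 256 ? 114b4a37-7d69-40e5-988b-a4c998e7a02a rack1
"""


def host_ids_by_dc(status_output):
    """Map each datacenter name to the host IDs listed under it."""
    result = {}
    current_dc = None
    for line in status_output.splitlines():
        line = line.strip()
        if line.startswith("Datacenter:"):
            current_dc = line.split(":", 1)[1].strip()
            result[current_dc] = []
        elif current_dc is not None:
            match = HOST_ID.search(line)
            if match:
                result[current_dc].append(match.group(0))
    return result


def removenode_commands(status_output, old_dc):
    """Build a 'nodetool removenode' command for every node in old_dc."""
    return [f"nodetool removenode {host_id}"
            for host_id in host_ids_by_dc(status_output).get(old_dc, [])]


if __name__ == "__main__":
    for command in removenode_commands(SAMPLE_STATUS, "RS_UK"):
        print(command)
```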
RS_UK
GL_EU
• Data successfully moved
• Old datacenter decommissioned
Movement Complete
What Could Possibly Go Wrong?
Network Performance
Test the network performance between Datacenters
Network Performance
• Enough Bandwidth
• Not stealing all bandwidth
iperf3
• iperf3 -s
• iperf3 -c xxx.xxx.xxx.xxxx
• iperf3 -c xxx.xxx.xxx.xxxx -b 10G
• iperf3 -c xxx.xxx.xxx.xxxx -C yeah
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 17.1 GBytes 14.7 Gbits/sec
net.ipv4.tcp_congestion_control=yeah
nodetool setinterdcstreamthroughput xxx
Views
( Pre 5.0.12 only )
Views
• Views rebuilt – not streamed
• Uses selects on table to rebuild
= Tombstone Trouble
Memory
Heavy use of Heap memory
Heap Size
• Streaming and Compaction use up memory
• Heap size can be increased
• Don’t need to worry about GC pauses
• Change back before connecting applications
Compaction Throughput
• Large amount of data streamed
• Compaction Lag
• Lots of small sstables
• Update Compaction Throughput
nodetool setcompactionthroughput xxxxx
Streaming Throughput
• Reduce pressure if needed
• Reduce only streaming between datacenters
nodetool setinterdcstreamthroughput xxxxx
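When choosing a throttle value it helps to estimate how long a rebuild would take at that rate. The Python sketch below does the arithmetic, assuming the throttle is expressed in megabits per second (as in the Cassandra versions of that period) and ignoring compression, protocol overhead and parallel streams. The function name is hypothetical and the result is only a rough guide.

```python
def transfer_seconds(data_gb, throughput_megabits_per_sec):
    """Rough time in seconds to stream data_gb gigabytes at the given throttle."""
    if throughput_megabits_per_sec <= 0:
        # A value of 0 means "unthrottled" to nodetool, so no estimate is possible.
        raise ValueError("throughput must be positive")
    megabits = data_gb * 8 * 1000  # 1 GB = 8,000 megabits (decimal units)
    return megabits / throughput_megabits_per_sec


if __name__ == "__main__":
    # Roughly the per-node load from the status example (12 GB), at 200 Mb/s.
    minutes = transfer_seconds(12, 200) / 60
    print(f"about {minutes:.0f} minutes per node")
```

Multiplying by the number of nodes rebuilt in turn gives a first estimate for the whole cluster.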
Application Latency
RS_UK
GL_EU
select column from table
where id = 1
• 3 nodes holding data per DC
Multi DC Replication
RS_UK
GL_EU
2 nodes of:
Node3
Node4
Node5
LOCAL_QUORUM
RS_UK
GL_EU
4 nodes of:
Node3
Node4
Node5
Node8
Node9
Node10
At least one in
2nd DC
250 miles
22 ms
QUORUM
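The "2 nodes" and "4 nodes" figures follow from the quorum rule: a majority of the replicas counted, that is floor(RF / 2) + 1. LOCAL_QUORUM counts only the local datacenter's replicas, while QUORUM counts the replicas of all datacenters together. A short Python sketch of that arithmetic, with hypothetical function names:

```python
def quorum(replica_count):
    """Majority of replica_count replicas: floor(n / 2) + 1."""
    return replica_count // 2 + 1


def required_replicas(level, replication, local_dc):
    """Replicas that must respond for LOCAL_QUORUM or QUORUM.

    `replication` maps datacenter name to replication factor.
    """
    if level == "LOCAL_QUORUM":
        return quorum(replication[local_dc])
    if level == "QUORUM":
        return quorum(sum(replication.values()))
    raise ValueError(f"unsupported consistency level: {level}")


if __name__ == "__main__":
    replication = {"RS_UK": 3, "GL_EU": 3}
    print(required_replicas("LOCAL_QUORUM", replication, "RS_UK"))
    print(required_replicas("QUORUM", replication, "RS_UK"))
```

With 3 replicas per datacenter, QUORUM needs 4 of 6 responses, so at least one must come from the remote datacenter and every request pays the inter-DC round trip.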
Lightweight Transactions (LWT)
insert into table (id, name)
values (1, 'Name')
IF NOT EXISTS
Uses Paxos algorithm
Uses different consistency level for Paxos
SERIAL or LOCAL_SERIAL
RS_UK
GL_EU
select column from table
where id = 1
• Without a DC-aware policy there will be problems
Load Balancing Policy
Implementation
• DB of cluster and node names
• Automatic scripts to create cloud
instances
• Scale clusters up or down
• Puppet
• Jenkins jobs
• Rebuild stage
• Decommission stage
• Service wrapper to protect integrity of
cluster
Conclusion
Success
• 91 Clusters moved
• Solr migration (not covered here)
• No C* cluster downtime
• Incorrect consistency sometimes caused application downtime
• April 2018 - October 2018
• One cluster delayed until February 2019
• Padding 0s with compression
• Automation is a must
Process can also be used for
• Splitting clusters (e.g. multi-tenant)
• Updating non-trivial configuration
• num_tokens
• Upgrading underlying operating system
• Ubuntu upgrades (upstart -> systemd)
Thank You
More details can be found at:
https://bit.ly/2Lnosw6
Paul Chandler
Gilberto Müller
Any Questions?

More Related Content

What's hot

Storing Cassandra Metrics (Chris Lohfink, DataStax) | C* Summit 2016
Storing Cassandra Metrics (Chris Lohfink, DataStax) | C* Summit 2016Storing Cassandra Metrics (Chris Lohfink, DataStax) | C* Summit 2016
Storing Cassandra Metrics (Chris Lohfink, DataStax) | C* Summit 2016DataStax
 
Renegotiating the boundary between database latency and consistency
Renegotiating the boundary between database latency  and consistencyRenegotiating the boundary between database latency  and consistency
Renegotiating the boundary between database latency and consistencyScyllaDB
 
Realtime olap architecture in apache kylin 3.0
Realtime olap architecture in apache kylin 3.0Realtime olap architecture in apache kylin 3.0
Realtime olap architecture in apache kylin 3.0Shi Shao Feng
 
Terror & Hysteria: Cost Effective Scaling of Time Series Data with Cassandra ...
Terror & Hysteria: Cost Effective Scaling of Time Series Data with Cassandra ...Terror & Hysteria: Cost Effective Scaling of Time Series Data with Cassandra ...
Terror & Hysteria: Cost Effective Scaling of Time Series Data with Cassandra ...DataStax
 
Accumulo Summit 2015: Performance Models for Apache Accumulo: The Heavy Tail ...
Accumulo Summit 2015: Performance Models for Apache Accumulo: The Heavy Tail ...Accumulo Summit 2015: Performance Models for Apache Accumulo: The Heavy Tail ...
Accumulo Summit 2015: Performance Models for Apache Accumulo: The Heavy Tail ...Accumulo Summit
 
Distribute Key Value Store
Distribute Key Value StoreDistribute Key Value Store
Distribute Key Value StoreSantal Li
 
HBaseCon 2013: Scalable Network Designs for Apache HBase
HBaseCon 2013: Scalable Network Designs for Apache HBaseHBaseCon 2013: Scalable Network Designs for Apache HBase
HBaseCon 2013: Scalable Network Designs for Apache HBaseCloudera, Inc.
 
SOS: Optimizing Shuffle I/O with Brian Cho and Ergin Seyfe
SOS: Optimizing Shuffle I/O with Brian Cho and Ergin SeyfeSOS: Optimizing Shuffle I/O with Brian Cho and Ergin Seyfe
SOS: Optimizing Shuffle I/O with Brian Cho and Ergin SeyfeDatabricks
 
A glimpse of cassandra 4.0 features netflix
A glimpse of cassandra 4.0 features   netflixA glimpse of cassandra 4.0 features   netflix
A glimpse of cassandra 4.0 features netflixVinay Kumar Chella
 
Maintaining Consistency Across Data Centers (Randy Fradin, BlackRock) | Cassa...
Maintaining Consistency Across Data Centers (Randy Fradin, BlackRock) | Cassa...Maintaining Consistency Across Data Centers (Randy Fradin, BlackRock) | Cassa...
Maintaining Consistency Across Data Centers (Randy Fradin, BlackRock) | Cassa...DataStax
 
Scylla Summit 2022: IO Scheduling & NVMe Disk Modelling
 Scylla Summit 2022: IO Scheduling & NVMe Disk Modelling Scylla Summit 2022: IO Scheduling & NVMe Disk Modelling
Scylla Summit 2022: IO Scheduling & NVMe Disk ModellingScyllaDB
 
Scylla Summit 2016: Scylla at Samsung SDS
Scylla Summit 2016: Scylla at Samsung SDSScylla Summit 2016: Scylla at Samsung SDS
Scylla Summit 2016: Scylla at Samsung SDSScyllaDB
 
Shipping Data from Postgres to Clickhouse, by Murat Kabilov, Adjust
Shipping Data from Postgres to Clickhouse, by Murat Kabilov, AdjustShipping Data from Postgres to Clickhouse, by Murat Kabilov, Adjust
Shipping Data from Postgres to Clickhouse, by Murat Kabilov, AdjustAltinity Ltd
 
Wayfair Use Case: The four R's of Metrics Delivery
Wayfair Use Case: The four R's of Metrics DeliveryWayfair Use Case: The four R's of Metrics Delivery
Wayfair Use Case: The four R's of Metrics DeliveryInfluxData
 
DataStax: Extreme Cassandra Optimization: The Sequel
DataStax: Extreme Cassandra Optimization: The SequelDataStax: Extreme Cassandra Optimization: The Sequel
DataStax: Extreme Cassandra Optimization: The SequelDataStax Academy
 
HBase at Flurry
HBase at FlurryHBase at Flurry
HBase at Flurryddlatham
 
How netflix manages petabyte scale apache cassandra in the cloud
How netflix manages petabyte scale apache cassandra in the cloudHow netflix manages petabyte scale apache cassandra in the cloud
How netflix manages petabyte scale apache cassandra in the cloudVinay Kumar Chella
 
Load testing Cassandra applications
Load testing Cassandra applicationsLoad testing Cassandra applications
Load testing Cassandra applicationsBen Slater
 
Scylla Summit 2022: Scylla 5.0 New Features, Part 2
Scylla Summit 2022: Scylla 5.0 New Features, Part 2Scylla Summit 2022: Scylla 5.0 New Features, Part 2
Scylla Summit 2022: Scylla 5.0 New Features, Part 2ScyllaDB
 
TechTalk: Reduce Your Storage Footprint with a Revolutionary New Compaction S...
TechTalk: Reduce Your Storage Footprint with a Revolutionary New Compaction S...TechTalk: Reduce Your Storage Footprint with a Revolutionary New Compaction S...
TechTalk: Reduce Your Storage Footprint with a Revolutionary New Compaction S...ScyllaDB
 

What's hot (20)

Storing Cassandra Metrics (Chris Lohfink, DataStax) | C* Summit 2016
Storing Cassandra Metrics (Chris Lohfink, DataStax) | C* Summit 2016Storing Cassandra Metrics (Chris Lohfink, DataStax) | C* Summit 2016
Storing Cassandra Metrics (Chris Lohfink, DataStax) | C* Summit 2016
 
Renegotiating the boundary between database latency and consistency
Renegotiating the boundary between database latency  and consistencyRenegotiating the boundary between database latency  and consistency
Renegotiating the boundary between database latency and consistency
 
Realtime olap architecture in apache kylin 3.0
Realtime olap architecture in apache kylin 3.0Realtime olap architecture in apache kylin 3.0
Realtime olap architecture in apache kylin 3.0
 
Terror & Hysteria: Cost Effective Scaling of Time Series Data with Cassandra ...
Terror & Hysteria: Cost Effective Scaling of Time Series Data with Cassandra ...Terror & Hysteria: Cost Effective Scaling of Time Series Data with Cassandra ...
Terror & Hysteria: Cost Effective Scaling of Time Series Data with Cassandra ...
 
Accumulo Summit 2015: Performance Models for Apache Accumulo: The Heavy Tail ...
Accumulo Summit 2015: Performance Models for Apache Accumulo: The Heavy Tail ...Accumulo Summit 2015: Performance Models for Apache Accumulo: The Heavy Tail ...
Accumulo Summit 2015: Performance Models for Apache Accumulo: The Heavy Tail ...
 
Distribute Key Value Store
Distribute Key Value StoreDistribute Key Value Store
Distribute Key Value Store
 
HBaseCon 2013: Scalable Network Designs for Apache HBase
HBaseCon 2013: Scalable Network Designs for Apache HBaseHBaseCon 2013: Scalable Network Designs for Apache HBase
HBaseCon 2013: Scalable Network Designs for Apache HBase
 
SOS: Optimizing Shuffle I/O with Brian Cho and Ergin Seyfe
SOS: Optimizing Shuffle I/O with Brian Cho and Ergin SeyfeSOS: Optimizing Shuffle I/O with Brian Cho and Ergin Seyfe
SOS: Optimizing Shuffle I/O with Brian Cho and Ergin Seyfe
 
A glimpse of cassandra 4.0 features netflix
A glimpse of cassandra 4.0 features   netflixA glimpse of cassandra 4.0 features   netflix
A glimpse of cassandra 4.0 features netflix
 
Maintaining Consistency Across Data Centers (Randy Fradin, BlackRock) | Cassa...
Maintaining Consistency Across Data Centers (Randy Fradin, BlackRock) | Cassa...Maintaining Consistency Across Data Centers (Randy Fradin, BlackRock) | Cassa...
Maintaining Consistency Across Data Centers (Randy Fradin, BlackRock) | Cassa...
 
Scylla Summit 2022: IO Scheduling & NVMe Disk Modelling
 Scylla Summit 2022: IO Scheduling & NVMe Disk Modelling Scylla Summit 2022: IO Scheduling & NVMe Disk Modelling
Scylla Summit 2022: IO Scheduling & NVMe Disk Modelling
 
Scylla Summit 2016: Scylla at Samsung SDS
Scylla Summit 2016: Scylla at Samsung SDSScylla Summit 2016: Scylla at Samsung SDS
Scylla Summit 2016: Scylla at Samsung SDS
 
Shipping Data from Postgres to Clickhouse, by Murat Kabilov, Adjust
Shipping Data from Postgres to Clickhouse, by Murat Kabilov, AdjustShipping Data from Postgres to Clickhouse, by Murat Kabilov, Adjust
Shipping Data from Postgres to Clickhouse, by Murat Kabilov, Adjust
 
Wayfair Use Case: The four R's of Metrics Delivery
Wayfair Use Case: The four R's of Metrics DeliveryWayfair Use Case: The four R's of Metrics Delivery
Wayfair Use Case: The four R's of Metrics Delivery
 
DataStax: Extreme Cassandra Optimization: The Sequel
DataStax: Extreme Cassandra Optimization: The SequelDataStax: Extreme Cassandra Optimization: The Sequel
DataStax: Extreme Cassandra Optimization: The Sequel
 
HBase at Flurry
HBase at FlurryHBase at Flurry
HBase at Flurry
 
How netflix manages petabyte scale apache cassandra in the cloud
How netflix manages petabyte scale apache cassandra in the cloudHow netflix manages petabyte scale apache cassandra in the cloud
How netflix manages petabyte scale apache cassandra in the cloud
 
Load testing Cassandra applications
Load testing Cassandra applicationsLoad testing Cassandra applications
Load testing Cassandra applications
 
Scylla Summit 2022: Scylla 5.0 New Features, Part 2
Scylla Summit 2022: Scylla 5.0 New Features, Part 2Scylla Summit 2022: Scylla 5.0 New Features, Part 2
Scylla Summit 2022: Scylla 5.0 New Features, Part 2
 
TechTalk: Reduce Your Storage Footprint with a Revolutionary New Compaction S...
TechTalk: Reduce Your Storage Footprint with a Revolutionary New Compaction S...TechTalk: Reduce Your Storage Footprint with a Revolutionary New Compaction S...
TechTalk: Reduce Your Storage Footprint with a Revolutionary New Compaction S...
 

Similar to Migrating 500 Nodes from Rackspace to Google Cloud with Zero Downtime

Apache Cassandra at the Geek2Geek Berlin
Apache Cassandra at the Geek2Geek BerlinApache Cassandra at the Geek2Geek Berlin
Apache Cassandra at the Geek2Geek BerlinChristian Johannsen
 
Apache Cassandra in the Real World
Apache Cassandra in the Real WorldApache Cassandra in the Real World
Apache Cassandra in the Real WorldJeremy Hanna
 
Apache Cassandra in the Real World
Apache Cassandra in the Real WorldApache Cassandra in the Real World
Apache Cassandra in the Real WorldJeremy Hanna
 
iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...
iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...
iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...DataStax Academy
 
Leveraging Cassandra for real-time multi-datacenter public cloud analytics
Leveraging Cassandra for real-time multi-datacenter public cloud analyticsLeveraging Cassandra for real-time multi-datacenter public cloud analytics
Leveraging Cassandra for real-time multi-datacenter public cloud analyticsJulien Anguenot
 
Best Practices for Supercharging Cloud Analytics on Amazon Redshift
Best Practices for Supercharging Cloud Analytics on Amazon RedshiftBest Practices for Supercharging Cloud Analytics on Amazon Redshift
Best Practices for Supercharging Cloud Analytics on Amazon RedshiftSnapLogic
 
High Throughput Analytics with Cassandra & Azure
High Throughput Analytics with Cassandra & AzureHigh Throughput Analytics with Cassandra & Azure
High Throughput Analytics with Cassandra & AzureDataStax Academy
 
The Convergence of HPC and Deep Learning
The Convergence of HPC and Deep LearningThe Convergence of HPC and Deep Learning
The Convergence of HPC and Deep Learninginside-BigData.com
 
Göteborg Distributed: Eventual Consistency in Apache Cassandra
Göteborg Distributed: Eventual Consistency in Apache CassandraGöteborg Distributed: Eventual Consistency in Apache Cassandra
Göteborg Distributed: Eventual Consistency in Apache CassandraJeremy Hanna
 
Azure Cosmos DB - Technical Deep Dive
Azure Cosmos DB - Technical Deep DiveAzure Cosmos DB - Technical Deep Dive
Azure Cosmos DB - Technical Deep DiveAndre Essing
 
Maxwell siuc hpc_description_tutorial
Maxwell siuc hpc_description_tutorialMaxwell siuc hpc_description_tutorial
Maxwell siuc hpc_description_tutorialmadhuinturi
 
Update on Trinity System Procurement and Plans
Update on Trinity System Procurement and PlansUpdate on Trinity System Procurement and Plans
Update on Trinity System Procurement and Plansinside-BigData.com
 
Ambedded - how to build a true no single point of failure ceph cluster
Ambedded - how to build a true no single point of failure ceph cluster Ambedded - how to build a true no single point of failure ceph cluster
Ambedded - how to build a true no single point of failure ceph cluster inwin stack
 
Performance Analysis: new tools and concepts from the cloud
Performance Analysis: new tools and concepts from the cloudPerformance Analysis: new tools and concepts from the cloud
Performance Analysis: new tools and concepts from the cloudBrendan Gregg
 
A Dataflow Processing Chip for Training Deep Neural Networks
A Dataflow Processing Chip for Training Deep Neural NetworksA Dataflow Processing Chip for Training Deep Neural Networks
A Dataflow Processing Chip for Training Deep Neural Networksinside-BigData.com
 
Simulating Heterogeneous Resources in CloudLightning
Simulating Heterogeneous Resources in CloudLightningSimulating Heterogeneous Resources in CloudLightning
Simulating Heterogeneous Resources in CloudLightningCloudLightning
 

Similar to Migrating 500 Nodes from Rackspace to Google Cloud with Zero Downtime (20)

BigData Developers MeetUp
BigData Developers MeetUpBigData Developers MeetUp
BigData Developers MeetUp
 
Apache Cassandra at the Geek2Geek Berlin
Apache Cassandra at the Geek2Geek BerlinApache Cassandra at the Geek2Geek Berlin
Apache Cassandra at the Geek2Geek Berlin
 
Apache Cassandra in the Real World
Apache Cassandra in the Real WorldApache Cassandra in the Real World
Apache Cassandra in the Real World
 
Apache Cassandra in the Real World
Apache Cassandra in the Real WorldApache Cassandra in the Real World
Apache Cassandra in the Real World
 
iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...
iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...
iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...
 
Leveraging Cassandra for real-time multi-datacenter public cloud analytics
Leveraging Cassandra for real-time multi-datacenter public cloud analyticsLeveraging Cassandra for real-time multi-datacenter public cloud analytics
Leveraging Cassandra for real-time multi-datacenter public cloud analytics
 
Best Practices for Supercharging Cloud Analytics on Amazon Redshift
Best Practices for Supercharging Cloud Analytics on Amazon RedshiftBest Practices for Supercharging Cloud Analytics on Amazon Redshift
Best Practices for Supercharging Cloud Analytics on Amazon Redshift
 
Cassandra training
Cassandra trainingCassandra training
Cassandra training
 
High Throughput Analytics with Cassandra & Azure
High Throughput Analytics with Cassandra & AzureHigh Throughput Analytics with Cassandra & Azure
High Throughput Analytics with Cassandra & Azure
 
The Convergence of HPC and Deep Learning
The Convergence of HPC and Deep LearningThe Convergence of HPC and Deep Learning
The Convergence of HPC and Deep Learning
 
Göteborg Distributed: Eventual Consistency in Apache Cassandra
Göteborg Distributed: Eventual Consistency in Apache CassandraGöteborg Distributed: Eventual Consistency in Apache Cassandra
Göteborg Distributed: Eventual Consistency in Apache Cassandra
 
Azure Cosmos DB - Technical Deep Dive
Azure Cosmos DB - Technical Deep DiveAzure Cosmos DB - Technical Deep Dive
Azure Cosmos DB - Technical Deep Dive
 
Maxwell siuc hpc_description_tutorial
Maxwell siuc hpc_description_tutorialMaxwell siuc hpc_description_tutorial
Maxwell siuc hpc_description_tutorial
 
Devops kc
Devops kcDevops kc
Devops kc
 
Update on Trinity System Procurement and Plans
Update on Trinity System Procurement and PlansUpdate on Trinity System Procurement and Plans
Update on Trinity System Procurement and Plans
 
Scalable IoT platform
Scalable IoT platformScalable IoT platform
Scalable IoT platform
 
Ambedded - how to build a true no single point of failure ceph cluster
Ambedded - how to build a true no single point of failure ceph cluster Ambedded - how to build a true no single point of failure ceph cluster
Ambedded - how to build a true no single point of failure ceph cluster
 
Performance Analysis: new tools and concepts from the cloud
Performance Analysis: new tools and concepts from the cloudPerformance Analysis: new tools and concepts from the cloud
Performance Analysis: new tools and concepts from the cloud
 
A Dataflow Processing Chip for Training Deep Neural Networks
A Dataflow Processing Chip for Training Deep Neural NetworksA Dataflow Processing Chip for Training Deep Neural Networks
A Dataflow Processing Chip for Training Deep Neural Networks
 
Simulating Heterogeneous Resources in CloudLightning
Simulating Heterogeneous Resources in CloudLightningSimulating Heterogeneous Resources in CloudLightning
Simulating Heterogeneous Resources in CloudLightning
 

Recently uploaded

Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 

Migrating 500 Nodes from Rackspace to Google Cloud with Zero Downtime

  • 1. Migrating 500 Nodes from Rackspace to Google With Zero Downtime
  • 2. Gilberto Müller • Engineering Manager • 17 YoE • XP - Infrastructure and datastores • METRONOM for 2.5 years • Previously HSBC, Wipro, MasterCard • SRE enthusiast
  • 3. Paul Chandler • Independent Cassandra Consultant • First used Cassandra in 2014 • Designed this Google Move process • Historically based in the Travel Industry (British Airways, Avis, TUI, etc.)
  • 4. METRO • Leading international wholesale and retail food specialist company • 50+ years old • 35 countries • 764 stores (in 25 countries) • 150.000 people worldwide • ~24mn customers • €36.5bn in sales for fiscal year 2017/18
  • 5. METRONOM • The biggest software company you never heard about (from our CEO) • Digital transformation started in 2015 • Platform as a Service and Dev • Cassandra started as the only option • 8 Platform teams (changing over time) • Multiple DCs in different countries, hybrid-cloud (EU, CH, and RU*) • 100+ application development teams • MCC main customer
  • 6. NoSQL Team • 9 people from 10 different places • Agile: Dash • Shared responsibility • Consultancy • SRE • DevOps • Infrastructure as Code • Provisioning, patch, upgrade • Support • Migrations • We offer a platform, not DBA • Service wrapper (whole platform) • Backup and restore (whole platform) • On-call
  • 7. Products • Apache Cassandra • DataStax Enterprise • Apache Solr (Solr Cloud) • DSE Search • Apache Spark • HDFS* DataStax is a registered trademark of DataStax, Inc. and its subsidiaries in the United States and/or other countries. Apache, Apache Cassandra, Cassandra, Apache Solr, Apache Spark, Spark, Apache Zookeeper, Zookeeper, Apache Hadoop, and Hadoop are either registered trademarks or trademarks of the Apache Software Foundation or its subsidiaries in Canada, the United States and/or other countries.
  • 8. Technologies and Numbers • Zookeeper • HAProxy • Nginx • OpsCenter • Grafana • PostgreSQL • Puppet • Jenkins • Java • Linux • 1200+ servers • 300+ clusters • 165+ C* (both flavours) • 80+ Solr
  • 10. Steady State - 1 Datacenter (diagram: RS_UK) • Multiple clusters • Move 1 cluster at a time • No downtime allowed
  • 11. Application Prerequisites (diagram: RS_UK) • Local consistency levels for reading and writing: LOCAL_ONE, LOCAL_QUORUM • Application driver needs to use a DC-aware policy • Lightweight Transactions (LWT) must use LOCAL_SERIAL
  • 12. Step 1 – Alter system keyspaces (diagram: RS_UK) • ALTER KEYSPACE system_auth WITH replication = {'class': 'NetworkTopologyStrategy', 'RS_UK': 3, 'GL_EU': 3}; • Keyspaces: system_auth, system_schema, dse_leases, system_distributed, dse_perf, system_traces, dse_security
  • 13. Step 2 - Create Nodes in New Datacenter (diagram: RS_UK, GL_EU) • Can be a different number of nodes • Only system keyspaces automatically migrated • Should be quick
  • 14. Step 2 - Create Nodes in New Datacenter (diagram: RS_UK, GL_EU) • cassandra.yaml: cluster_name must be the same for both datacenters; seeds should point to seeds in RS_UK • cassandra-rackdc.properties: dc should be the new datacenter • Continue using GossipingPropertyFileSnitch
  • 15. Nodes created and system keyspaces copied (diagram: RS_UK, GL_EU)
  • 16. Nodes created and system keyspaces copied (diagram: RS_UK, GL_EU) • Applications must still connect to RS_UK • No data in GL_EU
  • 17. Step 3 – Alter Replication for User Keyspaces (diagram: RS_UK, GL_EU) • ALTER KEYSPACE user_keyspace1 WITH replication = {'class': 'NetworkTopologyStrategy', 'RS_UK': 3, 'GL_EU': 3}; • ALTER KEYSPACE user_keyspace2 WITH replication = {'class': 'NetworkTopologyStrategy', 'RS_UK': 3, 'GL_EU': 3}; • ALTER KEYSPACE user_keyspace3 WITH replication = {'class': 'NetworkTopologyStrategy', 'RS_UK': 3, 'GL_EU': 3};
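
With many keyspaces per cluster, the ALTER KEYSPACE statements from Step 3 can be generated rather than typed. The following is a minimal sketch, not the presenters' tooling: the function name `alter_replication_statements` is hypothetical, and it only builds CQL text, leaving execution (for example via cqlsh) to the operator.

```python
def alter_replication_statements(keyspaces, replication):
    """Return one ALTER KEYSPACE statement per keyspace.

    `replication` maps datacenter name -> replication factor,
    e.g. {"RS_UK": 3, "GL_EU": 3}.
    """
    options = ", ".join(f"'{dc}': {rf}" for dc, rf in replication.items())
    return [
        f"ALTER KEYSPACE {ks} WITH replication = "
        f"{{'class': 'NetworkTopologyStrategy', {options}}};"
        for ks in keyspaces
    ]


if __name__ == "__main__":
    # Keyspace names taken from the slide; replace with the real ones.
    for stmt in alter_replication_statements(
        ["user_keyspace1", "user_keyspace2", "user_keyspace3"],
        {"RS_UK": 3, "GL_EU": 3},
    ):
        print(stmt)
```

Passing `{"GL_EU": 3}` alone produces the single-datacenter form used later, before the old datacenter is decommissioned.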
  • 18. Keyspaces Replicated (diagram: RS_UK, GL_EU) At this point: • Newly inserted data is replicated • Old data not replicated (yet) • Applications still must not connect to GL_EU • Lots of data missing in GL_EU
  • 19. Step 4 – Rebuild Nodes (diagram: RS_UK, GL_EU) • On each new node, run in turn: nodetool rebuild RS_UK • This will take some time; best to script this section
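
One way to script the rebuild stage is a loop that runs the command on each new node strictly in sequence and stops at the first failure. This is a sketch under stated assumptions: it assumes SSH access to each node and `nodetool` on the remote PATH, neither of which the slides confirm, and the function names are hypothetical.

```python
import subprocess


def rebuild_command(host, source_dc):
    """Build the command that runs `nodetool rebuild` on one node over SSH."""
    # Assumption: the node is reachable via ssh and nodetool is on its PATH.
    return ["ssh", host, "nodetool", "rebuild", source_dc]


def rebuild_in_turn(hosts, source_dc, run=subprocess.run):
    """Rebuild the nodes one after another.

    Stops at the first failure so a problem is noticed before moving on.
    Returns the list of hosts that were rebuilt successfully.
    """
    done = []
    for host in hosts:
        result = run(rebuild_command(host, source_dc))
        if result.returncode != 0:
            raise RuntimeError(f"rebuild failed on {host}; completed: {done}")
        done.append(host)
    return done
```

A call such as `rebuild_in_turn(["gl-eu-node1", "gl-eu-node2"], "RS_UK")` (hostnames are placeholders) blocks until each rebuild returns. The `run` parameter exists so the loop can be exercised with a stub before it is pointed at real nodes.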
  • 20. Step 4 – Rebuild Nodes (diagram: RS_UK, GL_EU) • Nodes gain data one node at a time
  • 21. Nodes Rebuilt (diagram: RS_UK, GL_EU) • Fully functioning cluster • Connect to either DC • Data flows automatically
  • 22. Prepare for Decommission (diagram: RS_UK, GL_EU) • cassandra.yaml: change seed nodes to be nodes in GL_EU • Point all applications to new datacenter • Full repair on all nodes in new datacenter
  • 24. Alter Replication to one Datacenter for ALL keyspaces (diagram: RS_UK, GL_EU) • ALTER KEYSPACE user_keyspace1 WITH replication = {'class': 'NetworkTopologyStrategy', 'GL_EU': 3}; • ALTER KEYSPACE user_keyspace2 WITH replication = {'class': 'NetworkTopologyStrategy', 'GL_EU': 3}; • ALTER KEYSPACE user_keyspace3 WITH replication = {'class': 'NetworkTopologyStrategy', 'GL_EU': 3}; • Plus system keyspaces
  • 26. Decommission RS_UK nodes (diagram: RS_UK, GL_EU) • Stop each node in RS_UK • Decommission each node in turn • nodetool removenode xxxxxxxxxxxxxxxx
  • 27. Decommission RS_UK nodes (diagram: RS_UK, GL_EU) • nodetool status output, followed by the removenode command for the first RS_UK node:
      Datacenter: RS_UK
      ===================
      Status=Up/Down
      |/ State=Normal/Leaving/Joining/Moving
      --  Address        Load      Tokens  Owns  Host ID                               Rack
      UN  10.29.30.29    11.66 GB  256     ?     ab479afd-c754-47f7-92fb-47790d734ac9  rack1
      UN  10.29.30.33    12.32 GB  256     ?     9aa1c5c5-c6cd-4267-ba68-c6bd8b2ac460  rack2
      UN  10.29.30.34    12.16 GB  256     ?     db454258-ac73-4a8a-9c75-226108c66889  rack3
      Datacenter: GL_EU
      ===================
      Status=Up/Down
      |/ State=Normal/Leaving/Joining/Moving
      --  Address        Load      Tokens  Owns  Host ID                               Rack
      UN  10.131.134.35  13.19 GB  256     ?     114b4a37-7d69-40e5-988b-a4c998e7a02a  rack1
      UN  10.131.134.39  12.14 GB  256     ?     4173fc2a-e65c-43aa-baa4-a5eefe0ceb60  rack2
      UN  10.131.134.42  12.97 GB  256     ?     8b5dde02-1ff1-48cc-9900-6d8f2bb339bf  rack3
      nodetool removenode ab479afd-c754-47f7-92fb-47790d734ac9
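
The host IDs needed for `nodetool removenode` can be pulled out of the status output instead of being copied by hand. The sketch below assumes the text layout shown on the slide (a `Datacenter:` header followed by one row per node containing a UUID host ID); the function names are hypothetical and it only prints commands, it does not run them.

```python
import re

# Host IDs in the status output are lowercase UUIDs.
UUID_RE = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
)


def host_ids_by_datacenter(status_output):
    """Map datacenter name -> list of host IDs found in the status text."""
    result = {}
    current_dc = None
    for line in status_output.splitlines():
        line = line.strip()
        if line.startswith("Datacenter:"):
            current_dc = line.split(":", 1)[1].strip()
            result[current_dc] = []
            continue
        match = UUID_RE.search(line)
        if match and current_dc is not None:
            result[current_dc].append(match.group(0))
    return result


def removenode_commands(status_output, datacenter):
    """Return one `nodetool removenode <host id>` line per node in the DC."""
    ids = host_ids_by_datacenter(status_output).get(datacenter, [])
    return [f"nodetool removenode {host_id}" for host_id in ids]
```

Restricting the result to a named datacenter is a deliberate guard: only RS_UK host IDs should ever be passed to removenode.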
  • 28. Movement Complete (diagram: RS_UK, GL_EU) • Data successfully moved • Old datacenter decommissioned
  • 29. What Could Possibly Go Wrong?
  • 30. Network Performance • Test the network performance between datacenters
  • 31. Network Performance • Enough Bandwidth • Not stealing all bandwidth
  • 32. iperf3 • iperf3 -s • iperf3 -c xxx.xxx.xxx.xxxx • iperf3 -c xxx.xxx.xxx.xxxx -b 10G • iperf3 -c xxx.xxx.xxx.xxxx -C yeah • Sample output:
      [ ID] Interval       Transfer     Bandwidth
      [  3] 0.0-10.0 sec   17.1 GBytes  14.7 Gbits/sec
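
A measured bandwidth figure can be turned into a rough lower bound on streaming time. The arithmetic below is a sketch that assumes iperf3 reports GBytes in units of 2^30 bytes and Gbits in units of 10^9 bits, which is consistent with the sample output above. It ignores compression, protocol overhead and any streaming throttle, so real rebuilds will take longer. The function names are hypothetical.

```python
def gbits_per_second(gbytes_transferred, seconds):
    """Convert an iperf3 transfer (GBytes = 2**30 bytes) into Gbit/s (10**9 bits)."""
    bits = gbytes_transferred * 2**30 * 8
    return bits / seconds / 1e9


def transfer_hours(data_gib, bandwidth_gbit_per_s):
    """Rough lower bound, in hours, to move `data_gib` GiB over the link."""
    bits = data_gib * 2**30 * 8
    seconds = bits / (bandwidth_gbit_per_s * 1e9)
    return seconds / 3600


if __name__ == "__main__":
    # Figures from the sample output: 17.1 GBytes in 10 seconds.
    print(round(gbits_per_second(17.1, 10.0), 1))
    # Hypothetical example: 1000 GiB over a 1 Gbit/s link.
    print(round(transfer_hours(1000, 1.0), 2))
```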
  • 35. Views • Materialized views are rebuilt, not streamed • The rebuild uses selects on the base table, which means tombstone trouble
  • 36. Memory • Heavy use of heap memory
  • 37. Heap Size • Streaming and compaction use up memory • Heap size can be increased • No need to worry about GC pauses while no applications are connected • Change back before connecting applications
  • 38. Compaction Throughput • Large amount of data streamed • Compaction lag • Lots of small sstables • Update compaction throughput: nodetool setcompactionthroughput xxxxx
  • 39. Streaming Throughput • Reduce pressure if needed • Reduce only streaming between datacenters: nodetool setinterdcstreamthroughput xxxxx
  • 41. Multi DC Replication (diagram: RS_UK, GL_EU) • select column from table where id = 1 • 3 nodes holding data per DC
  • 43. QUORUM (diagram: RS_UK, GL_EU) • 4 nodes of: Node3, Node4, Node5, Node8, Node9, Node10 • At least one in 2nd DC • 250 miles • 22 ms
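
The "4 nodes of 6, at least one in the 2nd DC" figure follows from the usual quorum arithmetic: QUORUM needs a majority of all replicas across datacenters, while LOCAL_QUORUM needs a majority of the replicas in the local datacenter only. A small sketch of that calculation (function names are illustrative):

```python
def quorum(replication):
    """Replicas required for QUORUM: a majority of all replicas in all DCs."""
    return sum(replication.values()) // 2 + 1


def local_quorum(replication, dc):
    """Replicas required for LOCAL_QUORUM: a majority within one DC."""
    return replication[dc] // 2 + 1


def quorum_crosses_dc(replication, local_dc):
    """True when QUORUM cannot be satisfied by the local DC's replicas alone."""
    return quorum(replication) > replication[local_dc]


if __name__ == "__main__":
    rf = {"RS_UK": 3, "GL_EU": 3}
    print(quorum(rf), local_quorum(rf, "RS_UK"), quorum_crosses_dc(rf, "RS_UK"))
```

With 3 replicas per datacenter, QUORUM needs 4 responses, so every such request pays the inter-datacenter round trip, whereas LOCAL_QUORUM needs only 2 and stays inside one datacenter.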
  • 44. Lightweight Transactions (LWT) • insert into table (id, name) values (1, 'Name') IF NOT EXISTS • Uses Paxos algorithm • Uses a different consistency level for Paxos: SERIAL or LOCAL_SERIAL
  • 45. Load Balancing Policy (diagram: RS_UK, GL_EU) • select column from table where id = 1 • Without a DC-aware policy there will be problems
  • 46. Implementation • DB of cluster and node names • Automatic scripts to create cloud instances • Scale clusters up or down • Puppet • Jenkins jobs • Rebuild stage • Decommission stage • Service wrapper to protect integrity of cluster
  • 48. Success • 91 Clusters moved • Solr migration (not covered here) • No C* cluster downtime • Incorrect consistency sometimes caused application downtime • April 2018 - October 2018 • One cluster delayed until February 2019 • Padding 0s with compression • Automation is a must
  • 49. Process can also be used for • Splitting clusters (e.g. multi-tenant) • Updating non-trivial configuration • num_tokens • Upgrading underlying operating system • Ubuntu upgrades (upstart -> systemd)
  • 50. Thank You • More details can be found at: https://bit.ly/2Lnosw6 • Paul Chandler, Gilberto Müller • Any questions?