Introduction to
MongoDB sharding
Jordi Soucheiron - @jordixou
Barcelona MongoDB User Group – 2015-06-29
About me
• Product engineer at ServerDensity
• Working with MongoDB in production for more than 4 years
• Python and PHP programmer
• Pybcn co-organizer
• FOSDEM volunteer
What is sharding?
It’s the system MongoDB uses to:
• Distribute writes
• Distribute primary reads
• Distribute data
• Or, in other words, grow horizontally and scale
What does it look like?
• Like this:
What does it look like?
• Or like this:
Nomenclature:
• Shard:
• Logical data partition
• Each shard is handled by a server or replica set
• Shard key:
• Key that all documents MUST have
• Decided by the user
• Chunk:
• Logical data partition inside a shard
• They can be split into 2 smaller chunks
• They can be moved to another shard for balancing
What does each component do?
• Mongos processes route data
• Config servers hold metadata:
• What chunks are there
• What shard holds each chunk
• Which chunks are being migrated
• The shard servers hold the actual data
How does it work?
Whenever you read/write data this happens:
1. You run your query in your shell/driver
2. Your driver contacts the mongos process (a proxy)
3. The mongos process retrieves metadata from the config servers
4. Based on the metadata, mongos asks the shards affected by the query to run their part of the job
5. Mongos returns the result
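The steps above can be sketched as a toy router in plain JavaScript (runnable under Node.js, not in the mongo shell). The chunk table, the shard names and the `routeQuery` helper are all hypothetical stand-ins for the metadata that mongos gets from the config servers. This is only an illustration of the idea, not how mongos is implemented.

```javascript
// Toy model of mongos routing. Not MongoDB's implementation.
// Each chunk covers the half-open shard key range [min, max).
const chunkTable = [
  { min: -Infinity, max: 1000, shard: "shard0" },
  { min: 1000, max: 2000, shard: "shard1" },
  { min: 2000, max: Infinity, shard: "shard0" },
];

// Return the shards that must take part in a query.
// `query` is either { i: <value> } (filters on the shard key "i")
// or {} (no shard key, so every shard has to be asked).
function routeQuery(query, chunks) {
  if (!("i" in query)) {
    // Scatter-gather: the filter does not contain the shard key.
    return [...new Set(chunks.map((c) => c.shard))].sort();
  }
  // Targeted: only the shard owning the matching chunk is contacted.
  const chunk = chunks.find((c) => query.i >= c.min && query.i < c.max);
  return [chunk.shard];
}

console.log(routeQuery({ i: 1500 }, chunkTable)); // targeted query
console.log(routeQuery({}, chunkTable)); // broadcast query
```

A query that includes the shard key touches a single shard, while a query without it has to be sent to every shard.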
Data partitioning
Your data will be split into chunks based on your shard key. Each chunk covers a contiguous range of shard key values.
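As a rough illustration, the following plain JavaScript sketch (Node.js, not the mongo shell) places documents into chunks by shard key range. The `buildChunks` and `partition` helpers and the split points are made up for this example, and `-Infinity`/`Infinity` stand in for MongoDB's MinKey/MaxKey bounds.

```javascript
// Build chunk ranges from a sorted list of split points.
// Every chunk covers the half-open range [min, max).
function buildChunks(splitPoints) {
  const bounds = [-Infinity, ...splitPoints, Infinity];
  const chunks = [];
  for (let k = 0; k < bounds.length - 1; k++) {
    chunks.push({ min: bounds[k], max: bounds[k + 1], docs: [] });
  }
  return chunks;
}

// Place each document into the chunk whose range contains its shard key "i".
function partition(docs, splitPoints) {
  const chunks = buildChunks(splitPoints);
  for (const doc of docs) {
    chunks.find((c) => doc.i >= c.min && doc.i < c.max).docs.push(doc);
  }
  return chunks;
}

const docs = [{ i: 3 }, { i: 42 }, { i: 150 }, { i: 99 }, { i: 100 }];
const chunks = partition(docs, [50, 100]);
for (const c of chunks) {
  console.log(`[${c.min}, ${c.max}) -> ${c.docs.map((d) => d.i).join(", ")}`);
}
```

Note that the lower bound is inclusive and the upper bound is exclusive, so the document with `i: 100` lands in the last chunk.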
Choosing a good shard key
A good shard key has to:
• Be used in ALL queries
• Allow a huge number of possible values (high cardinality):
• Sha1 hash -> good
• Phone number -> not bad
• Zip code -> bad
• Boolean -> awful
• Have values evenly distributed across all the key space
If your shard key has high cardinality but its values are not evenly distributed across the key space, use a hashed shard key
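To see why hashing helps, here is a small plain JavaScript sketch (Node.js, not the mongo shell). It takes 1000 consecutive ids and counts how many fall into each quarter of a 32-bit key space, first as raw values and then after hashing. The FNV-1a function is only a simple stand-in chosen for this example; it is not the hash MongoDB uses for hashed shard keys, and the `spread` helper is hypothetical.

```javascript
// FNV-1a 32-bit hash: a simple stand-in, NOT the hash MongoDB uses.
function fnv1a(value) {
  let hash = 0x811c9dc5;
  for (const ch of String(value)) {
    hash ^= ch.charCodeAt(0);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep it an unsigned 32-bit int
  }
  return hash;
}

// Count how many keys fall into each of `n` equal slices of [0, keySpace).
function spread(keys, keySpace, n) {
  const counts = new Array(n).fill(0);
  for (const k of keys) {
    counts[Math.min(n - 1, Math.floor((k / keySpace) * n))]++;
  }
  return counts;
}

// 1000 keys clustered in a tiny part of the key space
// (think of an auto-incrementing id).
const rawKeys = Array.from({ length: 1000 }, (_, k) => 5000000 + k);
const hashedKeys = rawKeys.map(fnv1a);

const keySpace = 2 ** 32;
const ranged = spread(rawKeys, keySpace, 4);
const hashed = spread(hashedKeys, keySpace, 4);
console.log("ranged:", ranged);
console.log("hashed:", hashed);
```

With the raw values every key ends up in the first slice, which in a sharded cluster would mean one hot chunk. The hashed values are scattered over the key space instead. The price is that range queries on the original value can no longer be targeted at a single shard.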
Chunk partitioning
Whenever a chunk reaches a certain size, the mongos process will try to split it into two.
This will fail if all docs in the chunk share the same shard key value, because a single shard key value cannot span two chunks
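The split rule can be illustrated with a toy function in plain JavaScript (Node.js, not the mongo shell). The `splitChunk` helper and its median-based choice of split point are assumptions made for this sketch, not MongoDB's actual split algorithm. The point is only that a split needs at least two distinct shard key values.

```javascript
// Try to split a chunk, given as an array of shard key values, in two.
// Returns [left, right], or null when no split point exists.
function splitChunk(keys) {
  const sorted = [...keys].sort((a, b) => a - b);
  // Start from the median. If it equals the lowest key, fall back to the
  // first larger value so that both halves end up non-empty.
  let splitAt = sorted[Math.floor(sorted.length / 2)];
  if (splitAt === sorted[0]) {
    splitAt = sorted.find((k) => k > sorted[0]);
  }
  if (splitAt === undefined) {
    return null; // every document has the same shard key value
  }
  // Documents sharing a key value always stay together in one half.
  return [
    sorted.filter((k) => k < splitAt),
    sorted.filter((k) => k >= splitAt),
  ];
}

console.log(splitChunk([1, 2, 3, 4, 5, 6])); // splits at 4
console.log(splitChunk([7, 7, 7, 7])); // null: nothing to split on
```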
Balancing
• Inevitably, some shards will get more chunks than others
• The sharded cluster will automatically move chunks from crowded
shards to under-populated shards:
• It’s possible to start/stop and customize the balancing algorithm
• It’s possible to manually move chunks around
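The idea behind balancing can be sketched in plain JavaScript (Node.js, not the mongo shell). The `balance` function, its threshold and the shard names are hypothetical; the real balancer has its own migration thresholds and moves actual data. In the mongo shell, the balancer is controlled with helpers such as `sh.stopBalancer()` and `sh.startBalancer()`, and a chunk can be moved by hand with `sh.moveChunk()`.

```javascript
// Move one chunk at a time from the fullest to the emptiest shard until
// the gap between them is below `threshold` (must be at least 2, otherwise
// a single chunk would bounce back and forth forever).
function balance(chunkCounts, threshold = 2) {
  const counts = { ...chunkCounts };
  const moves = [];
  while (true) {
    // Sort shard names from most to least chunks.
    const shards = Object.keys(counts).sort((a, b) => counts[b] - counts[a]);
    const from = shards[0];
    const to = shards[shards.length - 1];
    if (counts[from] - counts[to] < threshold) break;
    counts[from]--;
    counts[to]++;
    moves.push({ from, to });
  }
  return { counts, moves };
}

const result = balance({ shard0: 10, shard1: 2, shard2: 3 });
console.log(result.counts);
console.log(result.moves.length, "chunk migrations");
```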
HA in a sharded cluster
In order to achieve HA in a sharded cluster you’ll need:
• 3 config servers:
• As long as 1 is up you’ll be able to read from and write to your collections
• If any config server is down the cluster metadata will be read-only, so you won’t be able to:
• Split chunks
• Balance the cluster
• Add shards
• N shards; each one with, at least:
• 2 data-bearing nodes
• An arbiter or another data-bearing node
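The reason each shard needs a third member comes down to simple arithmetic: a replica set can only elect a primary while a majority of its voting members is reachable. A minimal sketch in plain JavaScript (Node.js), with helper names chosen for this example:

```javascript
// Votes needed to elect a primary in a replica set with `members` voters.
function majority(members) {
  return Math.floor(members / 2) + 1;
}

// How many voting members can be lost while a primary can still be elected.
function faultTolerance(members) {
  return members - majority(members);
}

for (const n of [2, 3, 5]) {
  console.log(
    `${n} voting members: majority ${majority(n)}, tolerates ${faultTolerance(n)} failure(s)`
  );
}
```

With only 2 members the majority is 2, so losing either node leaves the shard without a primary. Adding an arbiter or a third data-bearing node brings the set to 3 voters, which tolerates 1 failure.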
Demo time!
Creating a new demo sharded cluster:
sudo service mongod stop
mkdir shard0
mkdir shard1
mkdir config
# Start the config server
mongod --fork --syslog --configsvr --dbpath config --port 27019
# Start the shard servers
mongod --fork --syslog --dbpath shard0 --port 30000
mongod --fork --syslog --dbpath shard1 --port 30001
# Start the mongos process
mongos --fork --syslog --configdb localhost:27019
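# Add shards, insert test data and shard the collection
# (the mongo shell connects to the mongos on the default port)
mongo initSharding.js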
Demo time!
The initSharding.js script run in the previous step:
//Creating shards
sh.addShard("localhost:30000");
sh.addShard("localhost:30001");
//Adding test data (to the "test" database, the shell default)
for (var i = 0; i < 10000; i++) {
    db.testdata.insert({"i": i});
}
//Creating index on the future shard key
db.testdata.createIndex({"i": 1});
//Enabling sharding
sh.enableSharding("test");
sh.shardCollection("test.testdata", {"i": 1});
//Manually splitting chunks at i = 500, 1000, ..., 9500
for (var i = 1; i < 20; i++) {
    sh.splitAt("test.testdata", {"i": i * 500});
}
//Status (sh.status() prints its own report)
sh.status(true);
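To check what the manual splits should produce, the arithmetic can be replayed in plain JavaScript (Node.js, no MongoDB needed). This sketch assumes the collection starts out as a single chunk before the splits, which is what you would expect for such a small data set. It only mirrors the loop bounds of the demo script and does not talk to a cluster.

```javascript
// Same loop bounds as the demo script: split points at i * 500 for i = 1..19.
const splitPoints = [];
for (let i = 1; i < 20; i++) {
  splitPoints.push(i * 500);
}

// Chunk k holds the documents with bounds[k] <= i < bounds[k + 1].
const bounds = [-Infinity, ...splitPoints, Infinity];
const docsPerChunk = new Array(bounds.length - 1).fill(0);
for (let i = 0; i < 10000; i++) {
  const k = bounds.findIndex((b, idx) => i >= b && i < bounds[idx + 1]);
  docsPerChunk[k]++;
}

console.log(splitPoints.length, "splits ->", docsPerChunk.length, "chunks");
console.log(docsPerChunk);
```

The 19 split points cut the key range into 20 chunks of 500 documents each, which the balancer can then distribute over the two shards.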
Questions?
We’re hiring!
We’re looking for awesome engineers!
Talk to me after the presentation or go to:
https://www.serverdensity.com/jobs/
Code
https://github.com/jsoucheiron/mongodb-barcelona-sharding-introduction
Slides
http://www.slideshare.net/jordixou (soon)
