SlideShare a Scribd company logo
1 of 31
Download to read offline
Apr-26-2017
Exploring the replication
and sharding in MongoDB
• Master degree, Software Engineering
• Working with databases since 2007
• MySQL, LAMP, Linux since 2010
• Pythian OSDB managed services since 2014
• Lead Database Consultant since 2016
• C100DBA: MongoDB certified DBA
https://mk.linkedin.com/in/igorle @igorLE
About me
© 2017 Pythian. Confidential
Overview
• What is replica set, how replication works, replication concepts
• Replica set features, deployment architectures
• Vertical vs Horizontal scaling
• What is a sharded cluster in MongoDB
• Cluster components - shards, config servers, mongos
• Shard keys and chunks
• Hashed vs range based sharding
• QA
© 2017 Pythian. Confidential
Replication
• Group of mongod processes that maintain the same data set
• Redundancy and high availability
• Increased read capacity
© 2017 Pythian. Confidential
Replication concept
1. Write operations go to the Primary node
2. All changes are recorded into operations log
3. Asynchronous replication to Secondary
4. Secondaries copy the Primary oplog
5. Secondary can use sync source Secondary*
Automatic failover on Primary failure
*settings.chainingAllowed (true by default)
© 2017 Pythian. Confidential
2. oplog
1.
3. 3.
4. 4.
5.
Replica set oplog
• Special capped collection that keeps a rolling record of all
operations that modify the data stored in the databases
• Idempotent
• Default oplog size
For Unix and Windows systems
Storage Engine Default Oplog Size Lower Bound Upper Bound
In-memory 5% of physical memory 50MB 50GB
WiredTiger 5% of free disk space 990MB 50GB
MMAPv1 5% of free disk space 990MB 50GB
© 2017 Pythian. Confidential
● Start each server with config options for replSet
/usr/bin/mongod --replSet "myRepl"
● Initiate the replica set on one node - rs.initialize()
● Verify the configuration - rs.conf()
Deployment
© 2017 Pythian. Confidential
Node 2 Node 3
rs.initialize() Node 1
• Add the rest of the nodes - rs.add() on the Primary node
rs.add(“node2:27017”) , rs.add(“node3:27017”)
• Check the status of the replica set - rs.status()
Deployment
© 2017 Pythian. Confidential
Node 2 Node 3
rs.add()
Configuration options
• 50 members per replica set (7 voting members)
• Arbiter node
• Priority 0 node
• Hidden node
• Delayed node
• Write concern
• Read preference
© 2017 Pythian. Confidential
Arbiter node
• Does not hold copy of data
• Votes in elections
hidden : true
© 2017 Pythian. Confidential
Arbiter
Priority 0 node
Priority - floating point (i.e. decimal) number between 0 and 1000
• Never becomes primary
• Visible to application
• Node with highest priority is eventually elected as Primary
© 2017 Pythian. Confidential
Secondary
priority : 0
Hidden node
• Never becomes primary
• Not visible to application
• Use cases
○ reporting
○ backups
hidden : true
© 2017 Pythian. Confidential
hidden: true priority:0
Secondary
hidden : true priority : 0
Delayed node
• Must be priority 0 member
• Should be hidden member (not mandatory)
• Has votes 1 by default
• Considerations (oplog size)
• Mainly used for backups
© 2017 Pythian. Confidential
Secondary
slaveDelay : 3600
priority : 0
hidden : true
Write concern
{ w: <value>, j: <boolean>, wtimeout: <number> }
● w - number of mongod instances that
acknowledged the write operation
● j - acknowledgement that the write operation
has been written to the journal
● wtimeout - time limit to prevent write operations
from blocking indefinitely
© 2017 Pythian. Confidential
Read preference
How read operations are routed to replica set members
1. primary (by default)
2. primaryPreferred
3. secondary
4. secondaryPreferred
5. nearest (least network latency)
MongoDB 3.4 maxStalenessSeconds ( >= 90 seconds)
© 2017 Pythian. Confidential
Driver
1.
2. 2.
2.
3. 3.4.4.
4.
5. 5.
5.
• CPU
• RAM
• DISK
Scaling (vertical)
© 2017 Pythian. Confidential
• Method for distributing data across multiple machines
• Splitting data across multiple horizontal partitions
Scaling (horizontal) - Sharding
© 2017 Pythian. Confidential
Sharding with MongoDB
Shard
© 2017 Pythian. Confidential
Shard
Data
Sharding with MongoDB
• Shard/Replica set
(subset of the sharded data)
• Config servers
(metadata and config settings)
• mongos
(query router, cluster interface)
© 2017 Pythian. Confidential
Shard
Data
Sharding with MongoDB
• Shard/Replica set
(subset of the sharded data)
• Config servers
(metadata and config settings)
• mongos
(query router, cluster interface)
sh.addShard("shardName")
© 2017 Pythian. Confidential
A B
Shards
• Contains subset of sharded data
• Replica set for redundancy and HA
• Primary shard
• Non sharded collections
• --shardsvr in config file (port 27018)
© 2017 Pythian. Confidential
Config servers
• Config servers as replica set only (>= 3.4)
• Stores the metadata for sharded cluster in
config database
• Authentication configuration information in
admin database
• Holds balancer on Primary node (>= 3.4)
• --configsvr in config file (port 27019)
© 2017 Pythian. Confidential
mongos
• Caching metadata from config servers
• Routes queries to shards
• No persistent state
• Updates cache on metadata changes
• Holds balancer (mongodb <= 3.2)
• mongos version 3.4 can not connect to
earlier mongod version
© 2017 Pythian. Confidential
Sharding collection
• Enable sharding on database
sh.enableSharding("users")
• Shard collection
sh.shardCollection("users.history", { user_id : 1 } )
• Shard key - indexed key that exists in every document
○ range based
sh.shardCollection("users.history", { user_id : 1 } )
○ hashed based
sh.shardCollection( "users.history", { user_id : "hashed" } )
© 2017 Pythian. Confidential
Shard keys and chunks
Choosing Shard key
• Large Shard Key Cardinality
• Low Shard Key Frequency
• Non-Monotonically changing shard keys
Chunks
• A contiguous range of shard key values within a particular shard
• Inclusive lower and exclusive upper range based on shard key
• Default size of 64MB
© 2017 Pythian. Confidential
Ranged sharding
• Dividing data into contiguous ranges determined by the shard key values
• Documents with “close” shard key values are likely to be in the same
chunk or shard
• Query Isolation - more likely to target single shard
© 2017 Pythian. Confidential
Hashed sharding
• Uses a hashed index of a single field as the shard key to partition data
• More even data distribution at the cost of reducing Query Isolation
• Applications do not need to compute hashes
• Keys that change monotonically like ObjectId or timestamps are ideal
© 2017 Pythian. Confidential
Shard keys limitations
• Must be ascending indexed key or indexed compound keys that
exists in every document in the collection
• Cannot be multikey index, a text index or a geospatial index
• Cannot exceed 512 bytes
• Immutable key - can not be updated/changed
• Update operations that affect a single document must include the
shard key or the _id field
• No option for sharding if unique indexes on other fields exist
• No option for second unique index if the shard key is unique index
© 2017 Pythian. Confidential
Summary
• Replica set with odd number of voting members
• Hidden or Delayed member for dedicated functions (backups …)
• Dataset fits into single server, keep unsharded deployment
• Horizontal scaling - shards running as replica sets
• Shard keys are immutable with max size of 512 bytes
• Shard keys must exists in every document in the collection
• Ranged sharding may not distribute the data evenly
• Hashed sharding distributes the data randomly
© 2017 Pythian. Confidential
Questions?
© 2017 Pythian. Confidential
We’re Hiring!
https://www.pythian.com/careers/
© 2017 Pythian. Confidential

More Related Content

What's hot

Manage Add-On Services with Apache Ambari
Manage Add-On Services with Apache AmbariManage Add-On Services with Apache Ambari
Manage Add-On Services with Apache AmbariDataWorks Summit
 
An Intro to Elasticsearch and Kibana
An Intro to Elasticsearch and KibanaAn Intro to Elasticsearch and Kibana
An Intro to Elasticsearch and KibanaObjectRocket
 
MongoDB Database Replication
MongoDB Database ReplicationMongoDB Database Replication
MongoDB Database ReplicationMehdi Valikhani
 
Time Series Data with InfluxDB
Time Series Data with InfluxDBTime Series Data with InfluxDB
Time Series Data with InfluxDBTuri, Inc.
 
Kafka on ZFS: Better Living Through Filesystems
Kafka on ZFS: Better Living Through Filesystems Kafka on ZFS: Better Living Through Filesystems
Kafka on ZFS: Better Living Through Filesystems confluent
 
Hudi architecture, fundamentals and capabilities
Hudi architecture, fundamentals and capabilitiesHudi architecture, fundamentals and capabilities
Hudi architecture, fundamentals and capabilitiesNishith Agarwal
 
Overview of new features in Apache Ranger
Overview of new features in Apache RangerOverview of new features in Apache Ranger
Overview of new features in Apache RangerDataWorks Summit
 
Elastic Stack Introduction
Elastic Stack IntroductionElastic Stack Introduction
Elastic Stack IntroductionVikram Shinde
 
Distributed Locking in Kubernetes
Distributed Locking in KubernetesDistributed Locking in Kubernetes
Distributed Locking in KubernetesRafał Leszko
 
RocksDB Performance and Reliability Practices
RocksDB Performance and Reliability PracticesRocksDB Performance and Reliability Practices
RocksDB Performance and Reliability PracticesYoshinori Matsunobu
 
Elasticsearch for beginners
Elasticsearch for beginnersElasticsearch for beginners
Elasticsearch for beginnersNeil Baker
 
Mongodb basics and architecture
Mongodb basics and architectureMongodb basics and architecture
Mongodb basics and architectureBishal Khanal
 
Building Data Product Based on Apache Spark at Airbnb with Jingwei Lu and Liy...
Building Data Product Based on Apache Spark at Airbnb with Jingwei Lu and Liy...Building Data Product Based on Apache Spark at Airbnb with Jingwei Lu and Liy...
Building Data Product Based on Apache Spark at Airbnb with Jingwei Lu and Liy...Databricks
 
Hadoop REST API Security with Apache Knox Gateway
Hadoop REST API Security with Apache Knox GatewayHadoop REST API Security with Apache Knox Gateway
Hadoop REST API Security with Apache Knox GatewayDataWorks Summit
 

What's hot (20)

Manage Add-On Services with Apache Ambari
Manage Add-On Services with Apache AmbariManage Add-On Services with Apache Ambari
Manage Add-On Services with Apache Ambari
 
An Intro to Elasticsearch and Kibana
An Intro to Elasticsearch and KibanaAn Intro to Elasticsearch and Kibana
An Intro to Elasticsearch and Kibana
 
MongoDB Database Replication
MongoDB Database ReplicationMongoDB Database Replication
MongoDB Database Replication
 
Time Series Data with InfluxDB
Time Series Data with InfluxDBTime Series Data with InfluxDB
Time Series Data with InfluxDB
 
Kafka on ZFS: Better Living Through Filesystems
Kafka on ZFS: Better Living Through Filesystems Kafka on ZFS: Better Living Through Filesystems
Kafka on ZFS: Better Living Through Filesystems
 
Hudi architecture, fundamentals and capabilities
Hudi architecture, fundamentals and capabilitiesHudi architecture, fundamentals and capabilities
Hudi architecture, fundamentals and capabilities
 
HDFS Selective Wire Encryption
HDFS Selective Wire EncryptionHDFS Selective Wire Encryption
HDFS Selective Wire Encryption
 
Overview of new features in Apache Ranger
Overview of new features in Apache RangerOverview of new features in Apache Ranger
Overview of new features in Apache Ranger
 
Elastic Stack Introduction
Elastic Stack IntroductionElastic Stack Introduction
Elastic Stack Introduction
 
Distributed Locking in Kubernetes
Distributed Locking in KubernetesDistributed Locking in Kubernetes
Distributed Locking in Kubernetes
 
Elasticsearch
ElasticsearchElasticsearch
Elasticsearch
 
RocksDB Performance and Reliability Practices
RocksDB Performance and Reliability PracticesRocksDB Performance and Reliability Practices
RocksDB Performance and Reliability Practices
 
Druid deep dive
Druid deep diveDruid deep dive
Druid deep dive
 
Elasticsearch for beginners
Elasticsearch for beginnersElasticsearch for beginners
Elasticsearch for beginners
 
Apache Spark Architecture
Apache Spark ArchitectureApache Spark Architecture
Apache Spark Architecture
 
Mongodb basics and architecture
Mongodb basics and architectureMongodb basics and architecture
Mongodb basics and architecture
 
Building Data Product Based on Apache Spark at Airbnb with Jingwei Lu and Liy...
Building Data Product Based on Apache Spark at Airbnb with Jingwei Lu and Liy...Building Data Product Based on Apache Spark at Airbnb with Jingwei Lu and Liy...
Building Data Product Based on Apache Spark at Airbnb with Jingwei Lu and Liy...
 
Apache hadoop hbase
Apache hadoop hbaseApache hadoop hbase
Apache hadoop hbase
 
Logstash
LogstashLogstash
Logstash
 
Hadoop REST API Security with Apache Knox Gateway
Hadoop REST API Security with Apache Knox GatewayHadoop REST API Security with Apache Knox Gateway
Hadoop REST API Security with Apache Knox Gateway
 

Similar to Exploring the replication and sharding in MongoDB

MySQL Data Encryption at Rest
MySQL Data Encryption at RestMySQL Data Encryption at Rest
MySQL Data Encryption at RestMydbops
 
Data dictionary pl17
Data dictionary pl17Data dictionary pl17
Data dictionary pl17Ståle Deraas
 
MySQL :What's New #GIDS16
MySQL :What's New #GIDS16MySQL :What's New #GIDS16
MySQL :What's New #GIDS16Sanjay Manwani
 
MongoDB : Scaling, Security & Performance
MongoDB : Scaling, Security & PerformanceMongoDB : Scaling, Security & Performance
MongoDB : Scaling, Security & PerformanceSasidhar Gogulapati
 
Pimping the ForgeRock Identity Platform for a Billion Users
Pimping the ForgeRock Identity Platform for a Billion UsersPimping the ForgeRock Identity Platform for a Billion Users
Pimping the ForgeRock Identity Platform for a Billion UsersForgeRock
 
MariaDB Server Compatibility with MySQL
MariaDB Server Compatibility with MySQLMariaDB Server Compatibility with MySQL
MariaDB Server Compatibility with MySQLColin Charles
 
Spring data presentation
Spring data presentationSpring data presentation
Spring data presentationOleksii Usyk
 
MySQL Load Balancers - Maxscale, ProxySQL, HAProxy, MySQL Router & nginx - A ...
MySQL Load Balancers - Maxscale, ProxySQL, HAProxy, MySQL Router & nginx - A ...MySQL Load Balancers - Maxscale, ProxySQL, HAProxy, MySQL Router & nginx - A ...
MySQL Load Balancers - Maxscale, ProxySQL, HAProxy, MySQL Router & nginx - A ...Severalnines
 
Exploring the replication in MongoDB
Exploring the replication in MongoDBExploring the replication in MongoDB
Exploring the replication in MongoDBIgor Donchovski
 
Scaling-MongoDB-with-Horizontal-and-Vertical-Sharding Mydbops Opensource Data...
Scaling-MongoDB-with-Horizontal-and-Vertical-Sharding Mydbops Opensource Data...Scaling-MongoDB-with-Horizontal-and-Vertical-Sharding Mydbops Opensource Data...
Scaling-MongoDB-with-Horizontal-and-Vertical-Sharding Mydbops Opensource Data...Mydbops
 
Scaling MongoDB with Horizontal and Vertical Sharding
Scaling MongoDB with Horizontal and Vertical Sharding Scaling MongoDB with Horizontal and Vertical Sharding
Scaling MongoDB with Horizontal and Vertical Sharding Mydbops
 
Keeping your Enterprise’s Big Data Secure by Owen O’Malley at Big Data Spain ...
Keeping your Enterprise’s Big Data Secure by Owen O’Malley at Big Data Spain ...Keeping your Enterprise’s Big Data Secure by Owen O’Malley at Big Data Spain ...
Keeping your Enterprise’s Big Data Secure by Owen O’Malley at Big Data Spain ...Big Data Spain
 
Oracle big data appliance and solutions
Oracle big data appliance and solutionsOracle big data appliance and solutions
Oracle big data appliance and solutionssolarisyougood
 
Move your on prem data to a lake in a Lake in Cloud
Move your on prem data to a lake in a Lake in CloudMove your on prem data to a lake in a Lake in Cloud
Move your on prem data to a lake in a Lake in CloudCAMMS
 
Webinar - DreamObjects/Ceph Case Study
Webinar - DreamObjects/Ceph Case StudyWebinar - DreamObjects/Ceph Case Study
Webinar - DreamObjects/Ceph Case StudyCeph Community
 

Similar to Exploring the replication and sharding in MongoDB (20)

How to scale MongoDB
How to scale MongoDBHow to scale MongoDB
How to scale MongoDB
 
Introduction to Mongodb
Introduction to MongodbIntroduction to Mongodb
Introduction to Mongodb
 
Novinky v Oracle Database 18c
Novinky v Oracle Database 18cNovinky v Oracle Database 18c
Novinky v Oracle Database 18c
 
MySQL Data Encryption at Rest
MySQL Data Encryption at RestMySQL Data Encryption at Rest
MySQL Data Encryption at Rest
 
Data dictionary pl17
Data dictionary pl17Data dictionary pl17
Data dictionary pl17
 
MySQL :What's New #GIDS16
MySQL :What's New #GIDS16MySQL :What's New #GIDS16
MySQL :What's New #GIDS16
 
MongoDB : Scaling, Security & Performance
MongoDB : Scaling, Security & PerformanceMongoDB : Scaling, Security & Performance
MongoDB : Scaling, Security & Performance
 
Pimping the ForgeRock Identity Platform for a Billion Users
Pimping the ForgeRock Identity Platform for a Billion UsersPimping the ForgeRock Identity Platform for a Billion Users
Pimping the ForgeRock Identity Platform for a Billion Users
 
MariaDB Server Compatibility with MySQL
MariaDB Server Compatibility with MySQLMariaDB Server Compatibility with MySQL
MariaDB Server Compatibility with MySQL
 
Spring data presentation
Spring data presentationSpring data presentation
Spring data presentation
 
Mongo db 3.4 Overview
Mongo db 3.4 OverviewMongo db 3.4 Overview
Mongo db 3.4 Overview
 
MySQL Load Balancers - Maxscale, ProxySQL, HAProxy, MySQL Router & nginx - A ...
MySQL Load Balancers - Maxscale, ProxySQL, HAProxy, MySQL Router & nginx - A ...MySQL Load Balancers - Maxscale, ProxySQL, HAProxy, MySQL Router & nginx - A ...
MySQL Load Balancers - Maxscale, ProxySQL, HAProxy, MySQL Router & nginx - A ...
 
Hadoop
HadoopHadoop
Hadoop
 
Exploring the replication in MongoDB
Exploring the replication in MongoDBExploring the replication in MongoDB
Exploring the replication in MongoDB
 
Scaling-MongoDB-with-Horizontal-and-Vertical-Sharding Mydbops Opensource Data...
Scaling-MongoDB-with-Horizontal-and-Vertical-Sharding Mydbops Opensource Data...Scaling-MongoDB-with-Horizontal-and-Vertical-Sharding Mydbops Opensource Data...
Scaling-MongoDB-with-Horizontal-and-Vertical-Sharding Mydbops Opensource Data...
 
Scaling MongoDB with Horizontal and Vertical Sharding
Scaling MongoDB with Horizontal and Vertical Sharding Scaling MongoDB with Horizontal and Vertical Sharding
Scaling MongoDB with Horizontal and Vertical Sharding
 
Keeping your Enterprise’s Big Data Secure by Owen O’Malley at Big Data Spain ...
Keeping your Enterprise’s Big Data Secure by Owen O’Malley at Big Data Spain ...Keeping your Enterprise’s Big Data Secure by Owen O’Malley at Big Data Spain ...
Keeping your Enterprise’s Big Data Secure by Owen O’Malley at Big Data Spain ...
 
Oracle big data appliance and solutions
Oracle big data appliance and solutionsOracle big data appliance and solutions
Oracle big data appliance and solutions
 
Move your on prem data to a lake in a Lake in Cloud
Move your on prem data to a lake in a Lake in CloudMove your on prem data to a lake in a Lake in Cloud
Move your on prem data to a lake in a Lake in Cloud
 
Webinar - DreamObjects/Ceph Case Study
Webinar - DreamObjects/Ceph Case StudyWebinar - DreamObjects/Ceph Case Study
Webinar - DreamObjects/Ceph Case Study
 

More from Igor Donchovski

MongoDB Backups and PITR
MongoDB Backups and PITRMongoDB Backups and PITR
MongoDB Backups and PITRIgor Donchovski
 
Sharding and things we'd like to see improved
Sharding and things we'd like to see improvedSharding and things we'd like to see improved
Sharding and things we'd like to see improvedIgor Donchovski
 
Maintenance for MongoDB Replica Sets
Maintenance for MongoDB Replica SetsMaintenance for MongoDB Replica Sets
Maintenance for MongoDB Replica SetsIgor Donchovski
 
MongoDB HA - what can go wrong
MongoDB HA - what can go wrongMongoDB HA - what can go wrong
MongoDB HA - what can go wrongIgor Donchovski
 
Enhancing the default MongoDB Security
Enhancing the default MongoDB SecurityEnhancing the default MongoDB Security
Enhancing the default MongoDB SecurityIgor Donchovski
 
Working with MongoDB as MySQL DBA
Working with MongoDB as MySQL DBAWorking with MongoDB as MySQL DBA
Working with MongoDB as MySQL DBAIgor Donchovski
 

More from Igor Donchovski (6)

MongoDB Backups and PITR
MongoDB Backups and PITRMongoDB Backups and PITR
MongoDB Backups and PITR
 
Sharding and things we'd like to see improved
Sharding and things we'd like to see improvedSharding and things we'd like to see improved
Sharding and things we'd like to see improved
 
Maintenance for MongoDB Replica Sets
Maintenance for MongoDB Replica SetsMaintenance for MongoDB Replica Sets
Maintenance for MongoDB Replica Sets
 
MongoDB HA - what can go wrong
MongoDB HA - what can go wrongMongoDB HA - what can go wrong
MongoDB HA - what can go wrong
 
Enhancing the default MongoDB Security
Enhancing the default MongoDB SecurityEnhancing the default MongoDB Security
Enhancing the default MongoDB Security
 
Working with MongoDB as MySQL DBA
Working with MongoDB as MySQL DBAWorking with MongoDB as MySQL DBA
Working with MongoDB as MySQL DBA
 

Recently uploaded

+97470301568>> buy weed in qatar,buy thc oil qatar,buy weed and vape oil in d...
+97470301568>> buy weed in qatar,buy thc oil qatar,buy weed and vape oil in d...+97470301568>> buy weed in qatar,buy thc oil qatar,buy weed and vape oil in d...
+97470301568>> buy weed in qatar,buy thc oil qatar,buy weed and vape oil in d...Health
 
COST-EFFETIVE and Energy Efficient BUILDINGS ptx
COST-EFFETIVE  and Energy Efficient BUILDINGS ptxCOST-EFFETIVE  and Energy Efficient BUILDINGS ptx
COST-EFFETIVE and Energy Efficient BUILDINGS ptxJIT KUMAR GUPTA
 
Minimum and Maximum Modes of microprocessor 8086
Minimum and Maximum Modes of microprocessor 8086Minimum and Maximum Modes of microprocessor 8086
Minimum and Maximum Modes of microprocessor 8086anil_gaur
 
Air Compressor reciprocating single stage
Air Compressor reciprocating single stageAir Compressor reciprocating single stage
Air Compressor reciprocating single stageAbc194748
 
Design For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the startDesign For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the startQuintin Balsdon
 
Employee leave management system project.
Employee leave management system project.Employee leave management system project.
Employee leave management system project.Kamal Acharya
 
Unleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapUnleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapRishantSharmaFr
 
Thermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VThermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VDineshKumar4165
 
2016EF22_0 solar project report rooftop projects
2016EF22_0 solar project report rooftop projects2016EF22_0 solar project report rooftop projects
2016EF22_0 solar project report rooftop projectssmsksolar
 
AIRCANVAS[1].pdf mini project for btech students
AIRCANVAS[1].pdf mini project for btech studentsAIRCANVAS[1].pdf mini project for btech students
AIRCANVAS[1].pdf mini project for btech studentsvanyagupta248
 
Work-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptxWork-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptxJuliansyahHarahap1
 
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXssuser89054b
 
A CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptx
A CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptxA CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptx
A CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptxmaisarahman1
 
School management system project Report.pdf
School management system project Report.pdfSchool management system project Report.pdf
School management system project Report.pdfKamal Acharya
 
Engineering Drawing focus on projection of planes
Engineering Drawing focus on projection of planesEngineering Drawing focus on projection of planes
Engineering Drawing focus on projection of planesRAJNEESHKUMAR341697
 
Tamil Call Girls Bhayandar WhatsApp +91-9930687706, Best Service
Tamil Call Girls Bhayandar WhatsApp +91-9930687706, Best ServiceTamil Call Girls Bhayandar WhatsApp +91-9930687706, Best Service
Tamil Call Girls Bhayandar WhatsApp +91-9930687706, Best Servicemeghakumariji156
 
Rums floating Omkareshwar FSPV IM_16112021.pdf
Rums floating Omkareshwar FSPV IM_16112021.pdfRums floating Omkareshwar FSPV IM_16112021.pdf
Rums floating Omkareshwar FSPV IM_16112021.pdfsmsksolar
 
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...Call Girls Mumbai
 

Recently uploaded (20)

+97470301568>> buy weed in qatar,buy thc oil qatar,buy weed and vape oil in d...
+97470301568>> buy weed in qatar,buy thc oil qatar,buy weed and vape oil in d...+97470301568>> buy weed in qatar,buy thc oil qatar,buy weed and vape oil in d...
+97470301568>> buy weed in qatar,buy thc oil qatar,buy weed and vape oil in d...
 
Integrated Test Rig For HTFE-25 - Neometrix
Integrated Test Rig For HTFE-25 - NeometrixIntegrated Test Rig For HTFE-25 - Neometrix
Integrated Test Rig For HTFE-25 - Neometrix
 
COST-EFFETIVE and Energy Efficient BUILDINGS ptx
COST-EFFETIVE  and Energy Efficient BUILDINGS ptxCOST-EFFETIVE  and Energy Efficient BUILDINGS ptx
COST-EFFETIVE and Energy Efficient BUILDINGS ptx
 
Minimum and Maximum Modes of microprocessor 8086
Minimum and Maximum Modes of microprocessor 8086Minimum and Maximum Modes of microprocessor 8086
Minimum and Maximum Modes of microprocessor 8086
 
Air Compressor reciprocating single stage
Air Compressor reciprocating single stageAir Compressor reciprocating single stage
Air Compressor reciprocating single stage
 
Design For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the startDesign For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the start
 
Employee leave management system project.
Employee leave management system project.Employee leave management system project.
Employee leave management system project.
 
Unleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapUnleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leap
 
Thermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VThermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - V
 
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
 
2016EF22_0 solar project report rooftop projects
2016EF22_0 solar project report rooftop projects2016EF22_0 solar project report rooftop projects
2016EF22_0 solar project report rooftop projects
 
AIRCANVAS[1].pdf mini project for btech students
AIRCANVAS[1].pdf mini project for btech studentsAIRCANVAS[1].pdf mini project for btech students
AIRCANVAS[1].pdf mini project for btech students
 
Work-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptxWork-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptx
 
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
 
A CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptx
A CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptxA CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptx
A CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptx
 
School management system project Report.pdf
School management system project Report.pdfSchool management system project Report.pdf
School management system project Report.pdf
 
Engineering Drawing focus on projection of planes
Engineering Drawing focus on projection of planesEngineering Drawing focus on projection of planes
Engineering Drawing focus on projection of planes
 
Tamil Call Girls Bhayandar WhatsApp +91-9930687706, Best Service
Tamil Call Girls Bhayandar WhatsApp +91-9930687706, Best ServiceTamil Call Girls Bhayandar WhatsApp +91-9930687706, Best Service
Tamil Call Girls Bhayandar WhatsApp +91-9930687706, Best Service
 
Rums floating Omkareshwar FSPV IM_16112021.pdf
Rums floating Omkareshwar FSPV IM_16112021.pdfRums floating Omkareshwar FSPV IM_16112021.pdf
Rums floating Omkareshwar FSPV IM_16112021.pdf
 
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
 

Exploring the replication and sharding in MongoDB

  • 2. • Master degree, Software Engineering • Working with databases since 2007 • MySQL, LAMP, Linux since 2010 • Pythian OSDB managed services since 2014 • Lead Database Consultant since 2016 • C100DBA: MongoDB certified DBA https://mk.linkedin.com/in/igorle @igorLE About me © 2017 Pythian. Confidential
  • 3. Overview • What is replica set, how replication works, replication concepts • Replica set features, deployment architectures • Vertical vs Horizontal scaling • What is a sharded cluster in MongoDB • Cluster components - shards, config servers, mongos • Shard keys and chunks • Hashed vs range based sharding • QA © 2017 Pythian. Confidential
  • 4. Replication • Group of mongod processes that maintain the same data set • Redundancy and high availability • Increased read capacity © 2017 Pythian. Confidential
  • 5. Replication concept 1. Write operations go to the Primary node 2. All changes are recorded into operations log 3. Asynchronous replication to Secondary 4. Secondaries copy the Primary oplog 5. Secondary can use sync source Secondary* Automatic failover on Primary failure *settings.chainingAllowed (true by default) © 2017 Pythian. Confidential 2. oplog 1. 3. 3. 4. 4. 5.
  • 6. Replica set oplog • Special capped collection that keeps a rolling record of all operations that modify the data stored in the databases • Idempotent • Default oplog size For Unix and Windows systems Storage Engine Default Oplog Size Lower Bound Upper Bound In-memory 5% of physical memory 50MB 50GB WiredTiger 5% of free disk space 990MB 50GB MMAPv1 5% of free disk space 990MB 50GB © 2017 Pythian. Confidential
  • 7. ● Start each server with config options for replSet /usr/bin/mongod --replSet "myRepl" ● Initiate the replica set on one node - rs.initialize() ● Verify the configuration - rs.conf() Deployment © 2017 Pythian. Confidential Node 2 Node 3 rs.initialize() Node 1
  • 8. • Add the rest of the nodes - rs.add() on the Primary node rs.add(“node2:27017”) , rs.add(“node3:27017”) • Check the status of the replica set - rs.status() Deployment © 2017 Pythian. Confidential Node 2 Node 3 rs.add()
  • 9. Configuration options • 50 members per replica set (7 voting members) • Arbiter node • Priority 0 node • Hidden node • Delayed node • Write concern • Read preference © 2017 Pythian. Confidential
  • 10. Arbiter node • Does not hold copy of data • Votes in elections hidden : true © 2017 Pythian. Confidential Arbiter
  • 11. Priority 0 node Priority - floating point (i.e. decimal) number between 0 and 1000 • Never becomes primary • Visible to application • Node with highest priority is eventually elected as Primary © 2017 Pythian. Confidential Secondary priority : 0
  • 12. Hidden node • Never becomes primary • Not visible to application • Use cases ○ reporting ○ backups hidden : true © 2017 Pythian. Confidential hidden: true priority:0 Secondary hidden : true priority : 0
  • 13. Delayed node • Must be priority 0 member • Should be hidden member (not mandatory) • Has votes 1 by default • Considerations (oplog size) • Mainly used for backups © 2017 Pythian. Confidential Secondary slaveDelay : 3600 priority : 0 hidden : true
  • 14. Write concern { w: <value>, j: <boolean>, wtimeout: <number> } ● w - number of mongod instances that acknowledged the write operation ● j - acknowledgement that the write operation has been written to the journal ● wtimeout - time limit to prevent write operations from blocking indefinitely © 2017 Pythian. Confidential
  • 15. Read preference How read operations are routed to replica set members 1. primary (by default) 2. primaryPreferred 3. secondary 4. secondaryPreferred 5. nearest (least network latency) MongoDB 3.4 maxStalenessSeconds ( >= 90 seconds) © 2017 Pythian. Confidential Driver 1. 2. 2. 2. 3. 3.4.4. 4. 5. 5. 5.
  • 16. • CPU • RAM • DISK Scaling (vertical) © 2017 Pythian. Confidential
  • 17. • Method for distributing data across multiple machines • Splitting data across multiple horizontal partitions Scaling (horizontal) - Sharding © 2017 Pythian. Confidential
  • 18. Sharding with MongoDB Shard © 2017 Pythian. Confidential Shard Data
  • 19. Sharding with MongoDB • Shard/Replica set (subset of the sharded data) • Config servers (metadata and config settings) • mongos (query router, cluster interface) © 2017 Pythian. Confidential Shard Data
  • 20. Sharding with MongoDB • Shard/Replica set (subset of the sharded data) • Config servers (metadata and config settings) • mongos (query router, cluster interface) sh.addShard("shardName") © 2017 Pythian. Confidential A B
  • 21. Shards • Contains subset of sharded data • Replica set for redundancy and HA • Primary shard • Non sharded collections • --shardsvr in config file (port 27018) © 2017 Pythian. Confidential
  • 22. Config servers • Config servers as replica set only (>= 3.4) • Stores the metadata for sharded cluster in config database • Authentication configuration information in admin database • Holds balancer on Primary node (>= 3.4) • --configsvr in config file (port 27019) © 2017 Pythian. Confidential
  • 23. mongos • Caching metadata from config servers • Routes queries to shards • No persistent state • Updates cache on metadata changes • Holds balancer (mongodb <= 3.2) • mongos version 3.4 can not connect to earlier mongod version © 2017 Pythian. Confidential
  • 24. Sharding collection • Enable sharding on database sh.enableSharding("users") • Shard collection sh.shardCollection("users.history", { user_id : 1 } ) • Shard key - indexed key that exists in every document ○ range based sh.shardCollection("users.history", { user_id : 1 } ) ○ hashed based sh.shardCollection( "users.history", { user_id : "hashed" } ) © 2017 Pythian. Confidential
  • 25. Shard keys and chunks Choosing Shard key • Large Shard Key Cardinality • Low Shard Key Frequency • Non-Monotonically changing shard keys Chunks • A contiguous range of shard key values within a particular shard • Inclusive lower and exclusive upper range based on shard key • Default size of 64MB © 2017 Pythian. Confidential
  • 26. Ranged sharding • Dividing data into contiguous ranges determined by the shard key values • Documents with “close” shard key values are likely to be in the same chunk or shard • Query Isolation - more likely to target single shard © 2017 Pythian. Confidential
  • 27. Hashed sharding • Uses a hashed index of a single field as the shard key to partition data • More even data distribution at the cost of reducing Query Isolation • Applications do not need to compute hashes • Keys that change monotonically like ObjectId or timestamps are ideal © 2017 Pythian. Confidential
  • 28. Shard keys limitations • Must be ascending indexed key or indexed compound keys that exists in every document in the collection • Cannot be multikey index, a text index or a geospatial index • Cannot exceed 512 bytes • Immutable key - can not be updated/changed • Update operations that affect a single document must include the shard key or the _id field • No option for sharding if unique indexes on other fields exist • No option for second unique index if the shard key is unique index © 2017 Pythian. Confidential
  • 29. Summary • Replica set with odd number of voting members • Hidden or Delayed member for dedicated functions (backups …) • Dataset fits into single server, keep unsharded deployment • Horizontal scaling - shards running as replica sets • Shard keys are immutable with max size of 512 bytes • Shard keys must exists in every document in the collection • Ranged sharding may not distribute the data evenly • Hashed sharding distributes the data randomly © 2017 Pythian. Confidential