This document provides an overview of MongoDB basics, including:
- A history of MongoDB and how it enables working with unstructured data and real-time analytics.
- MongoDB's ranking as the highest placed non-relational database and as a "Challenger" to relational databases.
- How MongoDB works using a clustered architecture with shards, replica sets, config servers, and mongos processes to provide scalability, high availability, and load balancing.
- Key MongoDB concepts like documents, collections, embedded documents, and schema flexibility compared to a traditional SQL schema.
- MongoDB utilities for backup, restore, and monitoring like mongoexport, mongorestore, mongostat, and mongotop.
3. Who am I?
Juan Antonio Roy Couto
❏ Financial Software Developer
❏ Email: juanroycouto@gmail.com
❏ Twitter: @juanroycouto
❏ LinkedIn: https://www.linkedin.com/in/juanroycouto
❏ SlideShare: slideshare.net/juanroycouto
❏ Personal blog: http://www.juanroy.es
❏ Contributor at: http://www.mongodbspain.com
❏ Charrosfera member: http://www.charrosfera.com
MongoDB Basics
4. Agenda
❏ History
❏ Ranking, Who, Community & Metrics, Drivers
❏ Products
❏ Cluster Overview
❏ Characteristics
❏ Schema Design
❏ How does MongoDB work?
❏ Utilities
❏ Data analytics
❏ Ops Manager
❏ Cloud Manager
5. History
[Diagram: MongoDB, Internet of things, cloud computing, wearables, apps, smart cities]
❏ Unstructured data
❏ Enabling Big Data analytics
❏ Faster development
❏ Real-time analytics
❏ Better strategic decisions
❏ Reduced costs and time to market
7. Who?
Who is using MongoDB?
https://www.mongodb.com/who-uses-mongodb
Who provides MongoDB?
https://www.mongodb.com/partners/list
8. Community & Metrics
https://www.mongodb.org/community
❏ 10 million downloads
❏ 2,000+ customers (including over one third of the Fortune 100)
❏ 100+ MongoDB User Groups and 40,000 members worldwide
❏ 300,000+ Education Registrations
❏ The only “Challenger” to relational databases in Gartner’s Operational Database Magic Quadrant
❏ Highest placed non-relational database in DB Engines rankings
17. SQL Schema Design
Tables
Customers
❏ Customer Key
❏ First Name
❏ Last Name
Addresses
❏ Address Key
❏ Customer Key
❏ Street
❏ Number
❏ Location
Pets
❏ Pet Key
❏ Customer Key
❏ Type
❏ Breed
❏ Name
18. MongoDB Schema Design
Customers Collection
Customers Info
❏ First Name
❏ Last Name
Addresses
❏ Street
❏ Number
❏ Location
Pets (array)
❏ Type
❏ Breed
❏ Name
> db.customers.findOne()
{
    "_id" : ObjectId("54131863041cd2e6181156ba"),
    "first_name" : "Peter",
    "last_name" : "Keil",
    "address" : {
        "street" : "C/Alcalá",
        "number" : 123,
        "location" : "Madrid"
    },
    "pets" : [
        {
            "type" : "Dog",
            "breed" : "Airedale Terrier",
            "name" : "Linda"
        },
        {
            "type" : "Dog",
            "breed" : "Akita",
            "name" : "Bruto"
        }
    ]
}
>
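To make the contrast between the two schema designs concrete, here is a minimal sketch in plain JavaScript (no database required) that folds rows from the three SQL tables into one embedded document shaped like the findOne() result above. The sample rows, the `customer_key`-style field names and the `toCustomerDocument` helper are illustrative assumptions, not part of the original slides.

```javascript
// Hypothetical rows, as they might come from the three SQL tables.
const customers = [{ customer_key: 1, first_name: "Peter", last_name: "Keil" }];
const addresses = [
  { address_key: 10, customer_key: 1, street: "C/Alcalá", number: 123, location: "Madrid" },
];
const pets = [
  { pet_key: 100, customer_key: 1, type: "Dog", breed: "Airedale Terrier", name: "Linda" },
  { pet_key: 101, customer_key: 1, type: "Dog", breed: "Akita", name: "Bruto" },
];

// Build one document per customer: the address becomes an embedded
// document and the pets become an embedded array, so the foreign keys
// (and the joins that use them) are no longer needed.
function toCustomerDocument(customer) {
  const address = addresses.find((a) => a.customer_key === customer.customer_key);
  const ownPets = pets.filter((p) => p.customer_key === customer.customer_key);
  return {
    first_name: customer.first_name,
    last_name: customer.last_name,
    address: address
      ? { street: address.street, number: address.number, location: address.location }
      : null,
    pets: ownPets.map(({ type, breed, name }) => ({ type, breed, name })),
  };
}

const doc = toCustomerDocument(customers[0]);
console.log(JSON.stringify(doc, null, 2));
```

In the mongo shell such a document would typically be stored with `db.customers.insertOne(doc)`, which also assigns the `_id` field.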
20. Cluster overview
Replica Set
❏ High Availability
❏ Data Safety
❏ Asynchronous replication
❏ Automatic Node Recovery
❏ Read Preference
❏ Write Concern
[Diagram: replica set with one primary and two secondaries]
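Both write concern and automatic recovery depend on a majority of the replica set members. The following is a minimal sketch in plain JavaScript, assuming every member is a voting, data-bearing node. The function names are illustrative and are not MongoDB APIs.

```javascript
// Number of members that form a majority in a set of `members` voting nodes.
function majority(members) {
  return Math.floor(members / 2) + 1;
}

// A write with write concern w: "majority" is acknowledged once enough
// members (the primary included) have applied it. A numeric `w` is
// treated as the required number of acknowledgements.
function isAcknowledged(acks, members, w) {
  const needed = w === "majority" ? majority(members) : w;
  return acks >= needed;
}

// A new primary can only be elected while a majority is reachable.
function canElectPrimary(reachable, members) {
  return reachable >= majority(members);
}

console.log(majority(3));                      // 2
console.log(isAcknowledged(1, 3, "majority")); // false
console.log(isAcknowledged(2, 3, "majority")); // true
console.log(canElectPrimary(1, 3));            // false
```

With the three-member set from the slide, the set survives the loss of one member but not of two.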
21. Cluster overview
Shards
❏ Scale out
❏ Even data distribution across all of the shards based on a shard key
❏ A shard key range belongs to only one shard
❏ More efficient queries (performance)
Cluster
Shard 0: A-I
Shard 1: J-Q
Shard 2: R-Z
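The rule that a shard key range belongs to only one shard can be sketched in plain JavaScript. This is a simplified illustration, not MongoDB's implementation. It assumes the ranges A-I, J-Q and R-Z map to Shard 0, Shard 1 and Shard 2 in that order, and that only the first letter of the key matters. The `shardFor` helper is hypothetical.

```javascript
// Key ranges from the slide: each range is owned by exactly one shard.
const ranges = [
  { shard: "Shard 0", from: "A", to: "I" },
  { shard: "Shard 1", from: "J", to: "Q" },
  { shard: "Shard 2", from: "R", to: "Z" },
];

// Find the shard that owns a shard key value, using its first letter.
function shardFor(key) {
  const initial = key.charAt(0).toUpperCase();
  const range = ranges.find((r) => initial >= r.from && initial <= r.to);
  if (!range) throw new Error(`no range covers key "${key}"`);
  return range.shard;
}

console.log(shardFor("Keil"));  // Shard 1
console.log(shardFor("Roy"));   // Shard 2
console.log(shardFor("Couto")); // Shard 0
```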
22. Cluster overview
Config servers
❏ config database
❏ Identical information (consistency check).
❏ Metadata:
❏ Cluster shards list
❏ Data per shard (chunk ranges)
❏ ...
❏ Deployed as a replica set (from version 3.2)
23. Cluster overview
mongos
❏ Receives client requests and returns results.
❏ Reads the metadata and sends the query to the necessary shard or shards.
❏ Does not store data.
❏ Keeps a cached copy of the metadata.
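The routing decision described above can be sketched in plain JavaScript. This is a rough illustration under stated assumptions, not how mongos is implemented: the cached metadata is a hypothetical array of letter ranges, and a query is a plain object that either contains the shard key field or does not.

```javascript
// Hypothetical cached metadata: which shard owns which shard key range.
const cachedMetadata = [
  { shard: "shard0", min: "A", max: "I" },
  { shard: "shard1", min: "J", max: "Q" },
  { shard: "shard2", min: "R", max: "Z" },
];

// Decide which shards must receive a query.
// `shardKey` is the name of the shard key field.
function targetShards(query, shardKey, metadata) {
  const value = query[shardKey];
  if (typeof value !== "string" || value.length === 0) {
    // The query does not include the shard key: every shard must be asked.
    return metadata.map((m) => m.shard);
  }
  // The query includes the shard key: only the owning shard is contacted.
  const initial = value.charAt(0).toUpperCase();
  return metadata
    .filter((m) => initial >= m.min && initial <= m.max)
    .map((m) => m.shard);
}

// Targeted query: goes to shard1 only.
console.log(targetShards({ last_name: "Keil" }, "last_name", cachedMetadata));
// Query without the shard key: broadcast to all three shards.
console.log(targetShards({ first_name: "Peter" }, "last_name", cachedMetadata));
```

This is why queries that include the shard key tend to be more efficient: they touch one shard instead of all of them.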
24. Definitions
❏ Range: Data division based on the values of the shard key.
❏ Chunk: Not physical data; a chunk is a logical grouping of data into ranges (64 MB by default).
❏ Split: Division of a chunk once its size exceeds 64 MB. No data is moved. Runs in the background.
❏ Migration: Movement of chunks between shards to achieve an even distribution. Only one chunk is moved at a time.
❏ Balanced system: The same number of chunks on every shard.
❏ Balancer: Checks whether a migration is needed and starts it (in the background).
❏ Pre-split: The key range is split into chunks first, and the data is stored afterwards.
❏ Tag-based sharding: Used when you want to pin ranges to a specific shard.
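The split and migration definitions above can be sketched as a small simulation in plain JavaScript. It is a simplification under stated assumptions: shard keys are numeric, data is spread uniformly so a chunk splits at the midpoint of its range, and the balancer stops once chunk counts differ by at most one. The real balancer uses its own thresholds. All names are hypothetical.

```javascript
const MAX_CHUNK_MB = 64; // default chunk size quoted in the definitions

// Split: a chunk that outgrows the limit becomes two chunks covering
// the two halves of its range. Only boundaries change; no data moves.
function splitChunk(chunk) {
  if (chunk.sizeMB <= MAX_CHUNK_MB) return [chunk];
  const mid = Math.floor((chunk.min + chunk.max) / 2);
  return [
    { min: chunk.min, max: mid, sizeMB: chunk.sizeMB / 2 },
    { min: mid, max: chunk.max, sizeMB: chunk.sizeMB / 2 },
  ];
}

// Balancer: move one chunk at a time from the shard with the most
// chunks to the shard with the fewest until the counts are even.
function balance(chunkCounts) {
  const counts = { ...chunkCounts };
  const migrations = [];
  for (;;) {
    const names = Object.keys(counts).sort((a, b) => counts[a] - counts[b]);
    const emptiest = names[0];
    const fullest = names[names.length - 1];
    if (counts[fullest] - counts[emptiest] <= 1) break;
    counts[fullest] -= 1;
    counts[emptiest] += 1;
    migrations.push({ from: fullest, to: emptiest });
  }
  return { counts, migrations };
}

const pieces = splitChunk({ min: 0, max: 100, sizeMB: 80 });
const balanced = balance({ shard0: 6, shard1: 1, shard2: 2 });
console.log(pieces);
console.log(balanced.counts, balanced.migrations.length);
```

In this run the 80 MB chunk is split into two 40 MB chunks, and three single-chunk migrations leave each shard with three chunks.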
25. How does MongoDB work?
[Diagram: Client, mongos, Shard 0, Shard 1, Shard 2, Shard 3; migrations between shards]
26. Utilities
Backup tools
Name          Description
mongoexport   Generates a JSON or CSV file from a MongoDB instance
mongoimport   Imports content from a JSON, CSV or TSV export
mongodump     Utility for creating a binary export
mongorestore  Writes data to a MongoDB instance from a binary file
27. Utilities
Track tools
Name       Description
mongostat  Provides a quick overview of the status of a running mongod or mongos instance
mongotop   Tracks the amount of time a MongoDB instance spends reading or writing data. Reports statistics at the collection level. By default, returns values every second.
29. Ops Manager
The best way to run MongoDB within your own data center or public cloud
❏ Monitors 100+ key database and system health metrics (operations, memory, CPU, ...)
❏ Customizable web dashboard
❏ Deploy new clusters (adding shards, replica set members, ...)
❏ Alerts
❏ Backup (point-in-time recovery)
❏ Automation (upgrades, scaling, ...)
30. Cloud Manager
❏ Simplify complex operational tasks (reduce tedious manual steps to the click of a button)
❏ Automated database management (deploy and upgrade with zero downtime)
❏ Continuous real-time backup (Cloud Manager provides disaster recovery)
❏ Full performance visibility
❏ Alerts
❏ Get the insights you need to make critical decisions fast.
❏ Cloud Manager saves you time and money, and helps you protect the customer experience by eliminating the guesswork from running MongoDB.
31. Summary
❏ High Performance
❏ Flexible
❏ Automatic Scaling
❏ Automatic Failover
❏ High Availability
❏ Reduced Administrative Tasks (replica set, sharding, disaster recovery)
❏ Real-Time Analytics Tools (aggregation framework, mapReduce, Hadoop, Spark, BI connectors, ...)
❏ Easy To Learn