SlideShare a Scribd company logo
1 of 73
MongoDB Best Practices
Jay Runkel
Principal Solutions Architect
jay.runkel@mongodb.com
@jayrunkel
About Me
• Solution Architect
• Part of Sales Organization
• Work with many organizations new to MongoDB
Everyone Loves MongoDB’s Flexibility
• Document Model
• Dynamic Schema
• Powerful Query Language
• Secondary Indexes
Everyone Loves MongoDB’s Flexibility
• Document Model
• Dynamic Schema
• Powerful Query Language
• Secondary Indexes
Sometimes Organizations Struggle with
Performance
Good News!
• Poor Performance Usually Due to Common (and often simple) mistakes
Agenda
• Quick MongoDB Introduction
• Best Practices
1. Hardware/OS
2. Schema/Queries
3. Loading Data
MongoDB Introduction
Document Data Model
Relational MongoDB
{
first_name: ‘Paul’,
surname: ‘Miller’,
city: ‘London’,
location: [45.123,47.232],
cars: [
{ model: ‘Bentley’,
year: 1973,
value: 100000, … },
{ model: ‘Rolls Royce’,
year: 1965,
value: 330000, … }
]
}
Documents are Rich Data Structures
{
first_name: ‘Paul’,
surname: ‘Miller’,
cell: 447557505611,
city: ‘London’,
location: [45.123,47.232],
Profession: [‘banking’, ‘finance’, ‘trader’],
cars: [
{ model: ‘Bentley’,
year: 1973,
value: 100000, … },
{ model: ‘Rolls Royce’,
year: 1965,
value: 330000, … }
Fields can contain an array of
sub-documents
Fields
Typed fields
Fields can
contain arrays
Do More With Your Data
{
first_name: ‘Paul’,
surname: ‘Miller’,
city: ‘London’,
location: [45.123,47.232],
cars: [
{ model: ‘Bentley’,
year: 1973,
value: 100000, … },
{ model: ‘Rolls Royce’,
year: 1965,
value: 330000, … }
}
}
Rich Queries
Find everybody in London
with a car built between
1970 and 1980
Geospatial
Find all of the car owners
within 5km of Trafalgar Sq.
Text Search
Find all the cars described
as having leather seats
Aggregation
Calculate the average value
of Paul’s car collection
Map Reduce
What is the ownership
pattern of colors by
geography over time?
(is purple trending up in
China?)
Automatic Sharding
Three types: hash-based, range-based, location-
aware
Increase or decrease capacity as you go
Automatic balancing
Query Routing
Multiple query optimization
models
Each sharding option
appropriate for different
apps
mongos
Replica Sets
Replica Set – 2 to 50 copies
Self-healing shard
Data Center Aware
Addresses availability considerations:
High Availability
Disaster Recovery
Maintenance
Workload Isolation: operational & analytics
Assumptions
Assumptions
MongoDB 3.0 or 3.2
Storage Engine Architecture in 3.2
Content
Repo
IoT Sensor
Backend
Ad Service
Customer
Analytics
Archive
MongoDB Query Language (MQL) + Native Drivers
MongoDB Document Data Model
WT MMAP
Supported in MongoDB 3.2
Management
Security
In-memory
(beta)
Encrypted 3rd party
Best Practices
Hardware/Operating System
Servers
• Specifications Good Fit For MongoDB?
• Correct Number of Servers?
• Properly Configured?
What Type of Servers
• RAM
– 64  256 GB+
• Fast IO Systems
– RAID-10/SSDs
• Many cores
– Compress/Uncompress
– Encrypt/Decrypt
– Aggregation queries
What about a SAN?
• Mostly Random Disk Access
• IOPS
• Need dedicated IOPS or performance will vary
• Configure your SAN properly
• Suitability of any IO system will depend upon IOPS
How Many Servers Do I Need?
• How Many Shards Do I Need?
MongoDB cluster sizing at 30,000 ft
• Disk Space
• RAM
• Query Throughput
• Sum of disk space across shards > greater than required storage size
Disk Space: How Many Shards Do I Need?
• Sum of disk space across shards > greater than required storage size
Disk Space: How Many Shards Do I Need?
Example
Data Size = 9 TB
WiredTiger Compression Ratio: .33
Storage size = 3 TB
Server disk capacity = 2 TB
2 Shards Required
• Working set should fit in RAM
– Sum of RAM across shards > Working Set
• WorkSet = Indexes plus the set of documents accessed frequently
• WorkSet in RAM 
– Shorter latency
– Higher Throughput
RAM: How Many Shards Do I Need?
• Measuring Index Size
– db.coll.stats() – index size of collection
• Estimate frequently accessed documents
– Ex: total size of documents accessed
per day
RAM: How Many Shards Do I Need?
• Measuring Index Size
– db.coll.stats() – index size of collection
• Estimate frequently accessed documents
– Ex: total size of documents accessed
per day
RAM: How Many Shards Do I Need?
Example
Working Set = 428 GB
Server RAM = 128 GB
428/128 = 3.34
4 Shards Required
• Measure max sustained query rate of a single server (with replication)
– build a prototype and measure
• Assume sharding overhead of 20-30%
Query Rate: How Many Shards Do I Need?
• Measure max sustained query rate of a single server (with replication)
– build a prototype and measure
• Assume sharding overhead of 20-30%
Query Rate: How Many Shards Do I Need?
Example
Require: 50K ops/sec
Prototype performance: 20 ops/sec (1
replica set)
4 Shards Required: 80 ops/sec * .7 =
56K ops/sec
Configure Them Properly
• Default OS Settings Often Don’t Provide Optimal Performance
• See MongoDB Production Notes
– https://docs.mongodb.org/manual/administration/production-notes
• Also Review:
– Amazon EC2: https://docs.mongodb.org/ecosystem/platforms/amazon-ec2/
– Azure: https://docs.mongodb.org/ecosystem/platforms/windows-azure/
Server/OS Configuration
• Server configuration recommendations
– XFS
– Turn off atime and diratime
– NOOP scheduler
– File descriptor limits
– Disable transparent huge pages and NUMA
– Read ahead of 32
– Separate data volumes for data files, the journal, and the log.
– Change the default TCP keepalive time to 300 seconds.
These are important
• Ignore them and your performance may suffer
• The first 100 lines of the MongoDB logs identifies
suboptimal OS settings
Best Practices
Schema Design
Don’t Use a Relational Schema
Taylor MongoDB Schema toApplication
Workload
• Design schema to provide good query performance
• Schema design will impact required number of shards!
Application
Query Workload
{
Name: “john”
Height: 12
Address: {…}
}
db.cust.find({…})
db.cust.aggregate({…})
Compare Alternative Schemas
• Build a spreadsheet
• Calculate # of shards for each schema
• Estimate query performance
– # of documents
– # of inserts
– # of deletes
– Required indexes
– Number of documents inspected
– Number of documents sent across network
Modeling Decisions
• Referencing vs. Embedding
• Aggregating data by device, customer, product, etc.
Referencing
Procedure
{
"_id" : 333,
"date" : "2003-02-09T05:00:00"),
"hospital" : “County Hills”,
"patient" : “John Doe”,
"physician" : “Stephen Smith”,
"type" : ”Chest X-ray",
”result" : 134
}
Results
{
“_id” : 134
"type" : "txt",
"size" : NumberInt(12),
"content" : {
value1: 343,
value2: “abc”,
…
}
}
EmbeddingProcedure
{
"_id" : 333,
"date" : "2003-02-09T05:00:00"),
"hospital" : “County Hills”,
"patient" : “John Doe”,
"physician" : “Stephen Smith”,
"type" : ”Chest X-ray",
”result" : {
"type" : "txt",
"size" : NumberInt(12),
"content" : {
value1: 343,
value2: “abc”,
…
}
}
}
Embedding
• Advantages
– Retrieve all relevant information in a single query/document
– Avoid implementing joins in application code
– Update related information as a single atomic operation
• MongoDB doesn’t offer multi-document transactions
• Limitations
– Large documents mean more overhead if most fields are not relevant
– Might mean replicating data
– 16 MB document size limit
Referencing
• Advantages
– Smaller documents
– Less likely to reach 16 MB document limit
– Infrequently accessed information not accessed on every query
– No duplication of data
• Limitations
– Two queries required to retrieve information
– Cannot update related information atomically
{
_id: 2,
first: “Joe”,
last: “Patient”,
addr: { …},
procedures: [
{
id: 12345,
date: 2015-02-15,
type: “Cat scan”,
…},
{
id: 12346,
date: 2015-02-15,
type: “blood test”,
…}]
}
Patients
Embed
One-to-Many & Many-to-Many Relationships
{
_id: 2,
first: “Joe”,
last: “Patient”,
addr: { …},
procedures: [12345, 12346]}
{
_id: 12345,
date: 2015-02-15,
type: “Cat scan”,
…}
{
_id: 12346,
date: 2015-02-15,
type: “blood test”,
…}
Patients
Reference
Procedures
Schema Alternatives – Do the math?
• How complex queries?
• How much hardware/shards will I need?
Vital Sign Monitoring Device
Vital Signs Measured:
• Blood Pressure
• Pulse
• Blood Oxygen Levels
Produces data at regular intervals
• Once per minute
We have a hospital(s) of devices
Data From Vital Signs Monitoring Device
{
deviceId: 123456,
spO2: 88,
pulse: 74,
bp: [128, 80],
ts: ISODate("2013-10-16T22:07:00.000-0500")
}
• One document per minute per device
• Relational approach
Document Per Hour (By minute)
{
deviceId: 123456,
spO2: { 0: 88, 1: 90, …, 59: 92},
pulse: { 0: 74, 1: 76, …, 59: 72},
bp: { 0: [122, 80], 1: [126, 84], …, 59: [124, 78]},
ts: ISODate("2013-10-16T22:00:00.000-0500")
}
• Store per-minute data at the hourly level
• Update-driven workload
• 1 document per device per hour
Characterizing Write Differences
• Example: data generated every minute
• Recording the data for 1 patient for 1 hour:
Document Per Event
60 inserts
Document Per Hour
1 insert, 59 updates
Characterizing Read Differences
• Want to graph 24 hour of vital signs for a patient:
• Read performance is greatly improved
Document Per Event
1440 reads
Document Per Hour
24 reads
Characterizing Memory and Storage
Differences
Document Per Minute Document Per Hour
Number Documents 52.6 B 876 M
Total Index Size 6364 GB 106 GB
_id index 1468 GB 24.5 GB
{ts: 1, deviceId: 1} 4895 GB 81.6 GB
Document Size 92 Bytes 758 Bytes
Database Size 4503 GB 618 GB
• 100K Devices
• 1 years worth of data
Characterizing Memory and Storage
Differences
Document Per Minute Document Per Hour
Number Documents 52.6 B 876 M
Total Index Size 6364 GB 106 GB
_id index 1468 GB 24.5 GB
{ts: 1, deviceId: 1} 4895 GB 81.6 GB
Document Size 92 Bytes 758 Bytes
Database Size 4503 GB 618 GB
• 100K Devices
• 1 years worth of data
100000 *
365 * 24 *
60
100000 *
365 * 24
Characterizing Memory and Storage
Differences
Document Per Minute Document Per Hour
Number Documents 52.6 B 876 M
Total Index Size 6364 GB 106 GB
_id index 1468 GB 24.5 GB
{ts: 1, deviceId: 1} 4895 GB 81.6 GB
Document Size 92 Bytes 758 Bytes
Database Size 4503 GB 618 GB
• 100K Devices
• 1 years worth of data
100000 *
365 * 24 *
60 * 130
100000 *
365 * 24 *
130
Characterizing Memory and Storage
Differences
Document Per Minute Document Per Hour
Number Documents 52.6 B 876 M
Total Index Size 6364 GB 106 GB
_id index 1468 GB 24.5 GB
{ts: 1, deviceId: 1} 4895 GB 81.6 GB
Document Size 92 Bytes 758 Bytes
Database Size 4503 GB 618 GB
• 100K Devices
• 1 years worth of data
100000 *
365 * 24 *
60 * 92
100000 *
365 * 24 *
758
Best Practices
Loading Data
Rule of Thumb
• To saturate a MongoDB cluster 
– loader hardware ~= mongodb hardware
• Many threads
• Many mongos
Loader Architecture
loader
mongos
primary
primary
primary
secondary
secondary
secondary
secondary
secondary
secondary
Loader Architecture
loader
mongos
primary
primary
primary
secondary
secondary
secondary
secondary
secondary
secondary
Where are the bottlenecks?
Loader Architecture
loader
mongos
primary
primary
primary
secondary
secondary
secondary
secondary
secondary
secondary
Where are the bottlenecks?
Loader Architecture
loader (8)
mongos (4)
primary
primary
primary
secondary
secondary
secondary
secondary
secondary
secondary
loader (8)
mongos (4)
loader (8)
mongos (4)
Use many
threads
Use
multiple
loader
servers
When Sharding
• If you care about initial performance, you must pre-split
• Otherwise, initial performance will be slow
• (hash sharding automatically presplits collection)
Without presplitting
Shard 1 Shard 2 Shard 3 Shard 4
-∞ … ∞
• sh.shardCollection(“records.patients”, {zipcode : 1})
Without presplitting
Shard 1 Shard 2 Shard 3 Shard 4
-∞ … 11305
• 64K chunks
• Splitting will occur quickly
• Balancing occurs much more slowly
• The entire query workload  Shard 1
11306 … 44506
44507 … ∞
Without presplitting
Shard 1 Shard 2 Shard 3 Shard 4
-∞ … 11305
11306 … 44506
44507 … ∞
Loader
mongos
Split collection
Shard 1 Shard 2 Shard 3 Shard 4
• Split and distribute empty chunks before loading any data
• Evenly distribute query load across cluster
-∞ … 08333
08334 … 16667
16668 … 25000
25001… 33334
33335 … 41668
41669 … 50000
50001 … 58334
58335 … 66668
66669 … 75000
75001 … 83334
88335 … 96668
96669 … 99999
Split collection
Shard 1 Shard 2 Shard 3 Shard 4
-∞ … 08333
08334 … 16667
16668 … 25000
25001… 33334
33335 … 41668
41669 … 50000
50001 … 58334
58335 … 66668
66669 … 75000
75001 … 83334
88335 … 96668
96669 … 99999
Loader
mongos
Summary
Best Practices
1. Use servers with specifications that will provide good MongoDB performance
– 64+ GB RAM, many cores, many IOPS (RAID-10/SSDs)
2. Calculate How Many Shards?
1. Calculate required RAM and Disk Space
2. Build a prototype to determine the ops/sec capacity of a server
3. Do the math
3. Configure OS for Optimal MongoDB Performance
– See MongoDB Production Notes
– Review logs for warnings (Don’t ignore)
Best Practices (cont.)
4. Create a Document Schema
– Denormalized
5. Tailor schema to application workload
– Use application queries to guide schema design decisions
– Consider alternative schemas
– Compare cluster size (# of shards) and performance
– Build a spreadsheet
Best Practices
6. Loading Data
– Loader Hardware ~= MongoDB hardware
– Many threads
– Many mongos
7. Pre-split
– Ensure query workload is evenly distributed across the cluster from the start
Questions?
jay.runkel@mongodb.com
@jayrunkel

More Related Content

What's hot

Spark Shuffle Deep Dive (Explained In Depth) - How Shuffle Works in Spark
Spark Shuffle Deep Dive (Explained In Depth) - How Shuffle Works in SparkSpark Shuffle Deep Dive (Explained In Depth) - How Shuffle Works in Spark
Spark Shuffle Deep Dive (Explained In Depth) - How Shuffle Works in SparkBo Yang
 
High Concurrency Architecture and Laravel Performance Tuning
High Concurrency Architecture and Laravel Performance TuningHigh Concurrency Architecture and Laravel Performance Tuning
High Concurrency Architecture and Laravel Performance TuningAlbert Chen
 
Distributed stream processing with Apache Kafka
Distributed stream processing with Apache KafkaDistributed stream processing with Apache Kafka
Distributed stream processing with Apache Kafkaconfluent
 
Elastic Search
Elastic SearchElastic Search
Elastic SearchNavule Rao
 
Hacking the browser with puppeteer sharp .NET conf AR 2018
Hacking the browser with puppeteer sharp .NET conf AR 2018Hacking the browser with puppeteer sharp .NET conf AR 2018
Hacking the browser with puppeteer sharp .NET conf AR 2018Darío Kondratiuk
 
Patroni - HA PostgreSQL made easy
Patroni - HA PostgreSQL made easyPatroni - HA PostgreSQL made easy
Patroni - HA PostgreSQL made easyAlexander Kukushkin
 
The Patterns of Distributed Logging and Containers
The Patterns of Distributed Logging and ContainersThe Patterns of Distributed Logging and Containers
The Patterns of Distributed Logging and ContainersSATOSHI TAGOMORI
 
PostgreSQL and Benchmarks
PostgreSQL and BenchmarksPostgreSQL and Benchmarks
PostgreSQL and BenchmarksJignesh Shah
 
Tools for Solving Performance Issues
Tools for Solving Performance IssuesTools for Solving Performance Issues
Tools for Solving Performance IssuesOdoo
 
Container Performance Analysis
Container Performance AnalysisContainer Performance Analysis
Container Performance AnalysisBrendan Gregg
 
jq: JSON - Like a Boss
jq: JSON - Like a Bossjq: JSON - Like a Boss
jq: JSON - Like a BossBob Tiernay
 
Kicking ass with redis
Kicking ass with redisKicking ass with redis
Kicking ass with redisDvir Volk
 
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudAmazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudNoritaka Sekiyama
 
MongoDB Administration 101
MongoDB Administration 101MongoDB Administration 101
MongoDB Administration 101MongoDB
 
Monitor Apache Spark 3 on Kubernetes using Metrics and Plugins
Monitor Apache Spark 3 on Kubernetes using Metrics and PluginsMonitor Apache Spark 3 on Kubernetes using Metrics and Plugins
Monitor Apache Spark 3 on Kubernetes using Metrics and PluginsDatabricks
 
Improve PostgreSQL replication with Oracle GoldenGate
Improve PostgreSQL replication with Oracle GoldenGateImprove PostgreSQL replication with Oracle GoldenGate
Improve PostgreSQL replication with Oracle GoldenGateBobby Curtis
 
Painless JavaScript Testing with Jest
Painless JavaScript Testing with JestPainless JavaScript Testing with Jest
Painless JavaScript Testing with JestMichał Pierzchała
 

What's hot (20)

Spark Shuffle Deep Dive (Explained In Depth) - How Shuffle Works in Spark
Spark Shuffle Deep Dive (Explained In Depth) - How Shuffle Works in SparkSpark Shuffle Deep Dive (Explained In Depth) - How Shuffle Works in Spark
Spark Shuffle Deep Dive (Explained In Depth) - How Shuffle Works in Spark
 
ELK Stack
ELK StackELK Stack
ELK Stack
 
High Concurrency Architecture and Laravel Performance Tuning
High Concurrency Architecture and Laravel Performance TuningHigh Concurrency Architecture and Laravel Performance Tuning
High Concurrency Architecture and Laravel Performance Tuning
 
Distributed stream processing with Apache Kafka
Distributed stream processing with Apache KafkaDistributed stream processing with Apache Kafka
Distributed stream processing with Apache Kafka
 
Elastic Search
Elastic SearchElastic Search
Elastic Search
 
Hacking the browser with puppeteer sharp .NET conf AR 2018
Hacking the browser with puppeteer sharp .NET conf AR 2018Hacking the browser with puppeteer sharp .NET conf AR 2018
Hacking the browser with puppeteer sharp .NET conf AR 2018
 
Patroni - HA PostgreSQL made easy
Patroni - HA PostgreSQL made easyPatroni - HA PostgreSQL made easy
Patroni - HA PostgreSQL made easy
 
The Patterns of Distributed Logging and Containers
The Patterns of Distributed Logging and ContainersThe Patterns of Distributed Logging and Containers
The Patterns of Distributed Logging and Containers
 
PostgreSQL and Benchmarks
PostgreSQL and BenchmarksPostgreSQL and Benchmarks
PostgreSQL and Benchmarks
 
Tools for Solving Performance Issues
Tools for Solving Performance IssuesTools for Solving Performance Issues
Tools for Solving Performance Issues
 
Container Performance Analysis
Container Performance AnalysisContainer Performance Analysis
Container Performance Analysis
 
jq: JSON - Like a Boss
jq: JSON - Like a Bossjq: JSON - Like a Boss
jq: JSON - Like a Boss
 
Kicking ass with redis
Kicking ass with redisKicking ass with redis
Kicking ass with redis
 
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudAmazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
 
MongoDB Administration 101
MongoDB Administration 101MongoDB Administration 101
MongoDB Administration 101
 
Kafka on Pulsar
Kafka on Pulsar Kafka on Pulsar
Kafka on Pulsar
 
Monitor Apache Spark 3 on Kubernetes using Metrics and Plugins
Monitor Apache Spark 3 on Kubernetes using Metrics and PluginsMonitor Apache Spark 3 on Kubernetes using Metrics and Plugins
Monitor Apache Spark 3 on Kubernetes using Metrics and Plugins
 
Improve PostgreSQL replication with Oracle GoldenGate
Improve PostgreSQL replication with Oracle GoldenGateImprove PostgreSQL replication with Oracle GoldenGate
Improve PostgreSQL replication with Oracle GoldenGate
 
Painless JavaScript Testing with Jest
Painless JavaScript Testing with JestPainless JavaScript Testing with Jest
Painless JavaScript Testing with Jest
 
The basics of fluentd
The basics of fluentdThe basics of fluentd
The basics of fluentd
 

Viewers also liked

Enterprise UX Industry Report 2017–2018
Enterprise UX Industry Report 2017–2018Enterprise UX Industry Report 2017–2018
Enterprise UX Industry Report 2017–2018Lewis Lin 🦊
 
Salary Negotiation Cheat Sheet
Salary Negotiation Cheat SheetSalary Negotiation Cheat Sheet
Salary Negotiation Cheat SheetLewis Lin 🦊
 
UI Design Patterns for the Web, Part 2
UI Design Patterns for the Web, Part 2UI Design Patterns for the Web, Part 2
UI Design Patterns for the Web, Part 2Lewis Lin 🦊
 
What Game Developers Look for in a New Graduate: Interviews and Surveys at On...
What Game Developers Look for in a New Graduate: Interviews and Surveys at On...What Game Developers Look for in a New Graduate: Interviews and Surveys at On...
What Game Developers Look for in a New Graduate: Interviews and Surveys at On...Lewis Lin 🦊
 
MongoDB Schema Design: Four Real-World Examples
MongoDB Schema Design: Four Real-World ExamplesMongoDB Schema Design: Four Real-World Examples
MongoDB Schema Design: Four Real-World ExamplesLewis Lin 🦊
 
Performance Based Interviewing (PBI) Questions
Performance Based Interviewing (PBI) QuestionsPerformance Based Interviewing (PBI) Questions
Performance Based Interviewing (PBI) QuestionsLewis Lin 🦊
 
MBA CSEA 2017 Attendees
MBA CSEA 2017 AttendeesMBA CSEA 2017 Attendees
MBA CSEA 2017 AttendeesLewis Lin 🦊
 
Facebook Rotational Product Manager Interview: Jewel Lim's Tips on Getting an...
Facebook Rotational Product Manager Interview: Jewel Lim's Tips on Getting an...Facebook Rotational Product Manager Interview: Jewel Lim's Tips on Getting an...
Facebook Rotational Product Manager Interview: Jewel Lim's Tips on Getting an...Lewis Lin 🦊
 
2016 VC Executive Compensation Trend Report
2016 VC Executive Compensation Trend Report2016 VC Executive Compensation Trend Report
2016 VC Executive Compensation Trend ReportLewis Lin 🦊
 
UI Design Patterns for the Web, Part 1
UI Design Patterns for the Web, Part 1UI Design Patterns for the Web, Part 1
UI Design Patterns for the Web, Part 1Lewis Lin 🦊
 
Creating social features at BranchOut using MongoDB
Creating social features at BranchOut using MongoDBCreating social features at BranchOut using MongoDB
Creating social features at BranchOut using MongoDBLewis Lin 🦊
 
Book Summary: Decode and Conquer by Lewis C. Lin
Book Summary: Decode and Conquer by Lewis C. LinBook Summary: Decode and Conquer by Lewis C. Lin
Book Summary: Decode and Conquer by Lewis C. LinLewis Lin 🦊
 

Viewers also liked (12)

Enterprise UX Industry Report 2017–2018
Enterprise UX Industry Report 2017–2018Enterprise UX Industry Report 2017–2018
Enterprise UX Industry Report 2017–2018
 
Salary Negotiation Cheat Sheet
Salary Negotiation Cheat SheetSalary Negotiation Cheat Sheet
Salary Negotiation Cheat Sheet
 
UI Design Patterns for the Web, Part 2
UI Design Patterns for the Web, Part 2UI Design Patterns for the Web, Part 2
UI Design Patterns for the Web, Part 2
 
What Game Developers Look for in a New Graduate: Interviews and Surveys at On...
What Game Developers Look for in a New Graduate: Interviews and Surveys at On...What Game Developers Look for in a New Graduate: Interviews and Surveys at On...
What Game Developers Look for in a New Graduate: Interviews and Surveys at On...
 
MongoDB Schema Design: Four Real-World Examples
MongoDB Schema Design: Four Real-World ExamplesMongoDB Schema Design: Four Real-World Examples
MongoDB Schema Design: Four Real-World Examples
 
Performance Based Interviewing (PBI) Questions
Performance Based Interviewing (PBI) QuestionsPerformance Based Interviewing (PBI) Questions
Performance Based Interviewing (PBI) Questions
 
MBA CSEA 2017 Attendees
MBA CSEA 2017 AttendeesMBA CSEA 2017 Attendees
MBA CSEA 2017 Attendees
 
Facebook Rotational Product Manager Interview: Jewel Lim's Tips on Getting an...
Facebook Rotational Product Manager Interview: Jewel Lim's Tips on Getting an...Facebook Rotational Product Manager Interview: Jewel Lim's Tips on Getting an...
Facebook Rotational Product Manager Interview: Jewel Lim's Tips on Getting an...
 
2016 VC Executive Compensation Trend Report
2016 VC Executive Compensation Trend Report2016 VC Executive Compensation Trend Report
2016 VC Executive Compensation Trend Report
 
UI Design Patterns for the Web, Part 1
UI Design Patterns for the Web, Part 1UI Design Patterns for the Web, Part 1
UI Design Patterns for the Web, Part 1
 
Creating social features at BranchOut using MongoDB
Creating social features at BranchOut using MongoDBCreating social features at BranchOut using MongoDB
Creating social features at BranchOut using MongoDB
 
Book Summary: Decode and Conquer by Lewis C. Lin
Book Summary: Decode and Conquer by Lewis C. LinBook Summary: Decode and Conquer by Lewis C. Lin
Book Summary: Decode and Conquer by Lewis C. Lin
 

Similar to MongoDB Best Practices

MongoDB for Time Series Data
MongoDB for Time Series DataMongoDB for Time Series Data
MongoDB for Time Series DataMongoDB
 
MongoDB IoT City Tour STUTTGART: Managing the Database Complexity, by Arthur ...
MongoDB IoT City Tour STUTTGART: Managing the Database Complexity, by Arthur ...MongoDB IoT City Tour STUTTGART: Managing the Database Complexity, by Arthur ...
MongoDB IoT City Tour STUTTGART: Managing the Database Complexity, by Arthur ...MongoDB
 
Sizing Your MongoDB Cluster
Sizing Your MongoDB ClusterSizing Your MongoDB Cluster
Sizing Your MongoDB ClusterMongoDB
 
Introduction to Azure DocumentDB
Introduction to Azure DocumentDBIntroduction to Azure DocumentDB
Introduction to Azure DocumentDBDenny Lee
 
MongoDB: What, why, when
MongoDB: What, why, whenMongoDB: What, why, when
MongoDB: What, why, whenEugenio Minardi
 
Webinar: Scaling MongoDB
Webinar: Scaling MongoDBWebinar: Scaling MongoDB
Webinar: Scaling MongoDBMongoDB
 
High Performance, Scalable MongoDB in a Bare Metal Cloud
High Performance, Scalable MongoDB in a Bare Metal CloudHigh Performance, Scalable MongoDB in a Bare Metal Cloud
High Performance, Scalable MongoDB in a Bare Metal CloudMongoDB
 
Simultaneous analysis of massive data streams in real time and batch
Simultaneous analysis of massive data streams in real time and batchSimultaneous analysis of massive data streams in real time and batch
Simultaneous analysis of massive data streams in real time and batchAnjana Fernando
 
MongoDB IoT City Tour LONDON: Managing the Database Complexity, by Arthur Vie...
MongoDB IoT City Tour LONDON: Managing the Database Complexity, by Arthur Vie...MongoDB IoT City Tour LONDON: Managing the Database Complexity, by Arthur Vie...
MongoDB IoT City Tour LONDON: Managing the Database Complexity, by Arthur Vie...MongoDB
 
MongoDB for Time Series Data: Setting the Stage for Sensor Management
MongoDB for Time Series Data: Setting the Stage for Sensor ManagementMongoDB for Time Series Data: Setting the Stage for Sensor Management
MongoDB for Time Series Data: Setting the Stage for Sensor ManagementMongoDB
 
Monitoring and Scaling Redis at DataDog - Ilan Rabinovitch, DataDog
 Monitoring and Scaling Redis at DataDog - Ilan Rabinovitch, DataDog Monitoring and Scaling Redis at DataDog - Ilan Rabinovitch, DataDog
Monitoring and Scaling Redis at DataDog - Ilan Rabinovitch, DataDogRedis Labs
 
WSO2Con Asia 2014 - Simultaneous Analysis of Massive Data Streams in real-tim...
WSO2Con Asia 2014 - Simultaneous Analysis of Massive Data Streams in real-tim...WSO2Con Asia 2014 - Simultaneous Analysis of Massive Data Streams in real-tim...
WSO2Con Asia 2014 - Simultaneous Analysis of Massive Data Streams in real-tim...WSO2
 
Agility and Scalability with MongoDB
Agility and Scalability with MongoDBAgility and Scalability with MongoDB
Agility and Scalability with MongoDBMongoDB
 
Time Series Databases for IoT (On-premises and Azure)
Time Series Databases for IoT (On-premises and Azure)Time Series Databases for IoT (On-premises and Azure)
Time Series Databases for IoT (On-premises and Azure)Ivo Andreev
 
Codemotion Milano 2014 - MongoDB and the Internet of Things
Codemotion Milano 2014 - MongoDB and the Internet of ThingsCodemotion Milano 2014 - MongoDB and the Internet of Things
Codemotion Milano 2014 - MongoDB and the Internet of ThingsMassimo Brignoli
 
Mongo db 2.4 time series data - Brignoli
Mongo db 2.4 time series data - BrignoliMongo db 2.4 time series data - Brignoli
Mongo db 2.4 time series data - BrignoliCodemotion
 
MongoDB for Time Series Data Part 1: Setting the Stage for Sensor Management
MongoDB for Time Series Data Part 1: Setting the Stage for Sensor ManagementMongoDB for Time Series Data Part 1: Setting the Stage for Sensor Management
MongoDB for Time Series Data Part 1: Setting the Stage for Sensor ManagementMongoDB
 
Capacity Planning
Capacity PlanningCapacity Planning
Capacity PlanningMongoDB
 
Realtime Analytics on AWS
Realtime Analytics on AWSRealtime Analytics on AWS
Realtime Analytics on AWSSungmin Kim
 
Webinar: MongoDB Use Cases within the Oil, Gas, and Energy Industries
Webinar: MongoDB Use Cases within the Oil, Gas, and Energy IndustriesWebinar: MongoDB Use Cases within the Oil, Gas, and Energy Industries
Webinar: MongoDB Use Cases within the Oil, Gas, and Energy IndustriesMongoDB
 

Similar to MongoDB Best Practices (20)

MongoDB for Time Series Data
MongoDB for Time Series DataMongoDB for Time Series Data
MongoDB for Time Series Data
 
MongoDB IoT City Tour STUTTGART: Managing the Database Complexity, by Arthur ...
MongoDB IoT City Tour STUTTGART: Managing the Database Complexity, by Arthur ...MongoDB IoT City Tour STUTTGART: Managing the Database Complexity, by Arthur ...
MongoDB IoT City Tour STUTTGART: Managing the Database Complexity, by Arthur ...
 
Sizing Your MongoDB Cluster
Sizing Your MongoDB ClusterSizing Your MongoDB Cluster
Sizing Your MongoDB Cluster
 
Introduction to Azure DocumentDB
Introduction to Azure DocumentDBIntroduction to Azure DocumentDB
Introduction to Azure DocumentDB
 
MongoDB: What, why, when
MongoDB: What, why, whenMongoDB: What, why, when
MongoDB: What, why, when
 
Webinar: Scaling MongoDB
Webinar: Scaling MongoDBWebinar: Scaling MongoDB
Webinar: Scaling MongoDB
 
High Performance, Scalable MongoDB in a Bare Metal Cloud
High Performance, Scalable MongoDB in a Bare Metal CloudHigh Performance, Scalable MongoDB in a Bare Metal Cloud
High Performance, Scalable MongoDB in a Bare Metal Cloud
 
Simultaneous analysis of massive data streams in real time and batch
Simultaneous analysis of massive data streams in real time and batchSimultaneous analysis of massive data streams in real time and batch
Simultaneous analysis of massive data streams in real time and batch
 
MongoDB IoT City Tour LONDON: Managing the Database Complexity, by Arthur Vie...
MongoDB IoT City Tour LONDON: Managing the Database Complexity, by Arthur Vie...MongoDB IoT City Tour LONDON: Managing the Database Complexity, by Arthur Vie...
MongoDB IoT City Tour LONDON: Managing the Database Complexity, by Arthur Vie...
 
MongoDB for Time Series Data: Setting the Stage for Sensor Management
MongoDB for Time Series Data: Setting the Stage for Sensor ManagementMongoDB for Time Series Data: Setting the Stage for Sensor Management
MongoDB for Time Series Data: Setting the Stage for Sensor Management
 
Monitoring and Scaling Redis at DataDog - Ilan Rabinovitch, DataDog
 Monitoring and Scaling Redis at DataDog - Ilan Rabinovitch, DataDog Monitoring and Scaling Redis at DataDog - Ilan Rabinovitch, DataDog
Monitoring and Scaling Redis at DataDog - Ilan Rabinovitch, DataDog
 
WSO2Con Asia 2014 - Simultaneous Analysis of Massive Data Streams in real-tim...
WSO2Con Asia 2014 - Simultaneous Analysis of Massive Data Streams in real-tim...WSO2Con Asia 2014 - Simultaneous Analysis of Massive Data Streams in real-tim...
WSO2Con Asia 2014 - Simultaneous Analysis of Massive Data Streams in real-tim...
 
Agility and Scalability with MongoDB
Agility and Scalability with MongoDBAgility and Scalability with MongoDB
Agility and Scalability with MongoDB
 
Time Series Databases for IoT (On-premises and Azure)
Time Series Databases for IoT (On-premises and Azure)Time Series Databases for IoT (On-premises and Azure)
Time Series Databases for IoT (On-premises and Azure)
 
Codemotion Milano 2014 - MongoDB and the Internet of Things
Codemotion Milano 2014 - MongoDB and the Internet of ThingsCodemotion Milano 2014 - MongoDB and the Internet of Things
Codemotion Milano 2014 - MongoDB and the Internet of Things
 
Mongo db 2.4 time series data - Brignoli
Mongo db 2.4 time series data - BrignoliMongo db 2.4 time series data - Brignoli
Mongo db 2.4 time series data - Brignoli
 
MongoDB for Time Series Data Part 1: Setting the Stage for Sensor Management
MongoDB for Time Series Data Part 1: Setting the Stage for Sensor ManagementMongoDB for Time Series Data Part 1: Setting the Stage for Sensor Management
MongoDB for Time Series Data Part 1: Setting the Stage for Sensor Management
 
Capacity Planning
Capacity PlanningCapacity Planning
Capacity Planning
 
Realtime Analytics on AWS
Realtime Analytics on AWSRealtime Analytics on AWS
Realtime Analytics on AWS
 
Webinar: MongoDB Use Cases within the Oil, Gas, and Energy Industries
Webinar: MongoDB Use Cases within the Oil, Gas, and Energy IndustriesWebinar: MongoDB Use Cases within the Oil, Gas, and Energy Industries
Webinar: MongoDB Use Cases within the Oil, Gas, and Energy Industries
 

More from Lewis Lin 🦊

Gaskins' memo pitching PowerPoint
Gaskins' memo pitching PowerPointGaskins' memo pitching PowerPoint
Gaskins' memo pitching PowerPointLewis Lin 🦊
 
P&G Memo: Creating Modern Day Brand Management
P&G Memo: Creating Modern Day Brand ManagementP&G Memo: Creating Modern Day Brand Management
P&G Memo: Creating Modern Day Brand ManagementLewis Lin 🦊
 
Jeffrey Katzenberg on Disney Studios
Jeffrey Katzenberg on Disney StudiosJeffrey Katzenberg on Disney Studios
Jeffrey Katzenberg on Disney StudiosLewis Lin 🦊
 
Carnegie Mellon MS PM Internships 2020
Carnegie Mellon MS PM Internships 2020Carnegie Mellon MS PM Internships 2020
Carnegie Mellon MS PM Internships 2020Lewis Lin 🦊
 
Gallup's Notes on Reinventing Performance Management
Gallup's Notes on Reinventing Performance ManagementGallup's Notes on Reinventing Performance Management
Gallup's Notes on Reinventing Performance ManagementLewis Lin 🦊
 
Twitter Job Opportunities for Students
Twitter Job Opportunities for StudentsTwitter Job Opportunities for Students
Twitter Job Opportunities for StudentsLewis Lin 🦊
 
Facebook's Official Guide to Technical Program Management Candidates
Facebook's Official Guide to Technical Program Management CandidatesFacebook's Official Guide to Technical Program Management Candidates
Facebook's Official Guide to Technical Program Management CandidatesLewis Lin 🦊
 
Performance Management at Google
Performance Management at GooglePerformance Management at Google
Performance Management at GoogleLewis Lin 🦊
 
Google Interview Prep Guide Software Engineer
Google Interview Prep Guide Software EngineerGoogle Interview Prep Guide Software Engineer
Google Interview Prep Guide Software EngineerLewis Lin 🦊
 
Google Interview Prep Guide Product Manager
Google Interview Prep Guide Product ManagerGoogle Interview Prep Guide Product Manager
Google Interview Prep Guide Product ManagerLewis Lin 🦊
 
Skills Assessment Offering by Lewis C. Lin
Skills Assessment Offering by Lewis C. LinSkills Assessment Offering by Lewis C. Lin
Skills Assessment Offering by Lewis C. LinLewis Lin 🦊
 
How Men and Women Differ Across Leadership Traits
How Men and Women Differ Across Leadership TraitsHow Men and Women Differ Across Leadership Traits
How Men and Women Differ Across Leadership TraitsLewis Lin 🦊
 
Product Manager Skills Survey
Product Manager Skills SurveyProduct Manager Skills Survey
Product Manager Skills SurveyLewis Lin 🦊
 
Uxpin Why Build a Design System
Uxpin Why Build a Design SystemUxpin Why Build a Design System
Uxpin Why Build a Design SystemLewis Lin 🦊
 
30-Day Google PM Interview Study Guide
30-Day Google PM Interview Study Guide30-Day Google PM Interview Study Guide
30-Day Google PM Interview Study GuideLewis Lin 🦊
 
30-Day Facebook PM Interview Study Guide
30-Day Facebook PM Interview Study Guide30-Day Facebook PM Interview Study Guide
30-Day Facebook PM Interview Study GuideLewis Lin 🦊
 
36-Day Amazon PM Interview Study Guide
36-Day Amazon PM Interview Study Guide36-Day Amazon PM Interview Study Guide
36-Day Amazon PM Interview Study GuideLewis Lin 🦊
 
McKinsey's Assessment on PM Careers
McKinsey's Assessment on PM CareersMcKinsey's Assessment on PM Careers
McKinsey's Assessment on PM CareersLewis Lin 🦊
 
Five Traits of Great Product Managers
Five Traits of Great Product ManagersFive Traits of Great Product Managers
Five Traits of Great Product ManagersLewis Lin 🦊
 

More from Lewis Lin 🦊 (20)

Gaskins' memo pitching PowerPoint
Gaskins' memo pitching PowerPointGaskins' memo pitching PowerPoint
Gaskins' memo pitching PowerPoint
 
P&G Memo: Creating Modern Day Brand Management
P&G Memo: Creating Modern Day Brand ManagementP&G Memo: Creating Modern Day Brand Management
P&G Memo: Creating Modern Day Brand Management
 
Jeffrey Katzenberg on Disney Studios
Jeffrey Katzenberg on Disney StudiosJeffrey Katzenberg on Disney Studios
Jeffrey Katzenberg on Disney Studios
 
Carnegie Mellon MS PM Internships 2020
Carnegie Mellon MS PM Internships 2020Carnegie Mellon MS PM Internships 2020
Carnegie Mellon MS PM Internships 2020
 
Gallup's Notes on Reinventing Performance Management
Gallup's Notes on Reinventing Performance ManagementGallup's Notes on Reinventing Performance Management
Gallup's Notes on Reinventing Performance Management
 
Twitter Job Opportunities for Students
Twitter Job Opportunities for StudentsTwitter Job Opportunities for Students
Twitter Job Opportunities for Students
 
Facebook's Official Guide to Technical Program Management Candidates
Facebook's Official Guide to Technical Program Management CandidatesFacebook's Official Guide to Technical Program Management Candidates
Facebook's Official Guide to Technical Program Management Candidates
 
Performance Management at Google
Performance Management at GooglePerformance Management at Google
Performance Management at Google
 
Google Interview Prep Guide Software Engineer
Google Interview Prep Guide Software EngineerGoogle Interview Prep Guide Software Engineer
Google Interview Prep Guide Software Engineer
 
Google Interview Prep Guide Product Manager
Google Interview Prep Guide Product ManagerGoogle Interview Prep Guide Product Manager
Google Interview Prep Guide Product Manager
 
Skills Assessment Offering by Lewis C. Lin
Skills Assessment Offering by Lewis C. LinSkills Assessment Offering by Lewis C. Lin
Skills Assessment Offering by Lewis C. Lin
 
How Men and Women Differ Across Leadership Traits
How Men and Women Differ Across Leadership TraitsHow Men and Women Differ Across Leadership Traits
How Men and Women Differ Across Leadership Traits
 
Product Manager Skills Survey
Product Manager Skills SurveyProduct Manager Skills Survey
Product Manager Skills Survey
 
Uxpin Why Build a Design System
Uxpin Why Build a Design SystemUxpin Why Build a Design System
Uxpin Why Build a Design System
 
Sourcing on GitHub
Sourcing on GitHubSourcing on GitHub
Sourcing on GitHub
 
30-Day Google PM Interview Study Guide
30-Day Google PM Interview Study Guide30-Day Google PM Interview Study Guide
30-Day Google PM Interview Study Guide
 
30-Day Facebook PM Interview Study Guide
30-Day Facebook PM Interview Study Guide30-Day Facebook PM Interview Study Guide
30-Day Facebook PM Interview Study Guide
 
36-Day Amazon PM Interview Study Guide
36-Day Amazon PM Interview Study Guide36-Day Amazon PM Interview Study Guide
36-Day Amazon PM Interview Study Guide
 
McKinsey's Assessment on PM Careers
McKinsey's Assessment on PM CareersMcKinsey's Assessment on PM Careers
McKinsey's Assessment on PM Careers
 
Five Traits of Great Product Managers
Five Traits of Great Product ManagersFive Traits of Great Product Managers
Five Traits of Great Product Managers
 

Recently uploaded

Innovate and Collaborate- Harnessing the Power of Open Source Software.pdf
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdfInnovate and Collaborate- Harnessing the Power of Open Source Software.pdf
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdfYashikaSharma391629
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作qr0udbr0
 
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...Cizo Technology Services
 
20240415 [Container Plumbing Days] Usernetes Gen2 - Kubernetes in Rootless Do...
20240415 [Container Plumbing Days] Usernetes Gen2 - Kubernetes in Rootless Do...20240415 [Container Plumbing Days] Usernetes Gen2 - Kubernetes in Rootless Do...
20240415 [Container Plumbing Days] Usernetes Gen2 - Kubernetes in Rootless Do...Akihiro Suda
 
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)jennyeacort
 
Powering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsPowering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsSafe Software
 
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfGOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfAlina Yurenko
 
VK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web DevelopmentVK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web Developmentvyaparkranti
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEEVICTOR MAESTRE RAMIREZ
 
Precise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalPrecise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalLionel Briand
 
Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...Rob Geurden
 
React Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaReact Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaHanief Utama
 
Machine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringMachine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringHironori Washizaki
 
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Matt Ray
 
Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)Hr365.us smith
 
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanySuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanyChristoph Pohl
 
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Angel Borroy López
 
Salesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZSalesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZABSYZ Inc
 
SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtimeandrehoraa
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesŁukasz Chruściel
 

Recently uploaded (20)

Innovate and Collaborate- Harnessing the Power of Open Source Software.pdf
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdfInnovate and Collaborate- Harnessing the Power of Open Source Software.pdf
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdf
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作
 
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
 
20240415 [Container Plumbing Days] Usernetes Gen2 - Kubernetes in Rootless Do...
20240415 [Container Plumbing Days] Usernetes Gen2 - Kubernetes in Rootless Do...20240415 [Container Plumbing Days] Usernetes Gen2 - Kubernetes in Rootless Do...
20240415 [Container Plumbing Days] Usernetes Gen2 - Kubernetes in Rootless Do...
 
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
 
Powering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsPowering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data Streams
 
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfGOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
 
VK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web DevelopmentVK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web Development
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEE
 
Precise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalPrecise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive Goal
 
Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...
 
React Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaReact Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief Utama
 
Machine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringMachine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their Engineering
 
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
 
Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)
 
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanySuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
 
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
 
Salesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZSalesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZ
 
SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtime
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New Features
 

MongoDB Best Practices

  • 1.
  • 2. MongoDB Best Practices Jay Runkel Principal Solutions Architect jay.runkel@mongodb.com @jayrunkel
  • 3. About Me • Solution Architect • Part of Sales Organization • Work with many organizations new to MongoDB
  • 4. Everyone Loves MongoDB’s Flexibility • Document Model • Dynamic Schema • Powerful Query Language • Secondary Indexes
  • 5. Everyone Loves MongoDB’s Flexibility • Document Model • Dynamic Schema • Powerful Query Language • Secondary Indexes
  • 7. Good News! • Poor Performance Usually Due to Common (and often simple) mistakes
  • 8. Agenda • Quick MongoDB Introduction • Best Practices 1. Hardware/OS 2. Schema/Queries 3. Loading Data
  • 10. Document Data Model Relational MongoDB { first_name: ‘Paul’, surname: ‘Miller’, city: ‘London’, location: [45.123,47.232], cars: [ { model: ‘Bentley’, year: 1973, value: 100000, … }, { model: ‘Rolls Royce’, year: 1965, value: 330000, … } ] }
  • 11. Documents are Rich Data Structures { first_name: ‘Paul’, surname: ‘Miller’, cell: 447557505611, city: ‘London’, location: [45.123,47.232], Profession: [‘banking’, ‘finance’, ‘trader’], cars: [ { model: ‘Bentley’, year: 1973, value: 100000, … }, { model: ‘Rolls Royce’, year: 1965, value: 330000, … } Fields can contain an array of sub-documents Fields Typed fields Fields can contain arrays
  • 12. Do More With Your Data { first_name: ‘Paul’, surname: ‘Miller’, city: ‘London’, location: [45.123,47.232], cars: [ { model: ‘Bentley’, year: 1973, value: 100000, … }, { model: ‘Rolls Royce’, year: 1965, value: 330000, … } } } Rich Queries Find everybody in London with a car built between 1970 and 1980 Geospatial Find all of the car owners within 5km of Trafalgar Sq. Text Search Find all the cars described as having leather seats Aggregation Calculate the average value of Paul’s car collection Map Reduce What is the ownership pattern of colors by geography over time? (is purple trending up in China?)
  • 13. Automatic Sharding Three types: hash-based, range-based, location- aware Increase or decrease capacity as you go Automatic balancing
  • 14. Query Routing Multiple query optimization models Each sharding option appropriate for different apps mongos
  • 15. Replica Sets Replica Set – 2 to 50 copies Self-healing shard Data Center Aware Addresses availability considerations: High Availability Disaster Recovery Maintenance Workload Isolation: operational & analytics
  • 18. Storage Engine Architecture in 3.2 Content Repo IoT Sensor Backend Ad Service Customer Analytics Archive MongoDB Query Language (MQL) + Native Drivers MongoDB Document Data Model WT MMAP Supported in MongoDB 3.2 Management Security In-memory (beta) Encrypted 3rd party
  • 20. Servers • Specifications Good Fit For MongoDB? • Correct Number of Servers? • Properly Configured?
  • 21. What Type of Servers • RAM – 64  256 GB+ • Fast IO Systems – RAID-10/SSDs • Many cores – Compress/Uncompress – Encrypt/Decrypt – Aggregation queries
  • 22. What about a SAN? • Mostly Random Disk Access • IOPS • Need dedicated IOPS or performance will vary • Configure your SAN properly • Suitability of any IO system will depend upon IOPS
  • 23. How Many Servers Do I Need? • How Many Shards Do I Need?
  • 24. MongoDB cluster sizing at 30,000 ft • Disk Space • RAM • Query Throughput
  • 25. • Sum of disk space across shards > greater than required storage size Disk Space: How Many Shards Do I Need?
  • 26. • Sum of disk space across shards > greater than required storage size Disk Space: How Many Shards Do I Need? Example Data Size = 9 TB WiredTiger Compression Ratio: .33 Storage size = 3 TB Server disk capacity = 2 TB 2 Shards Required
  • 27. • Working set should fit in RAM – Sum of RAM across shards > Working Set • WorkSet = Indexes plus the set of documents accessed frequently • WorkSet in RAM  – Shorter latency – Higher Throughput RAM: How Many Shards Do I Need?
  • 28. • Measuring Index Size – db.coll.stats() – index size of collection • Estimate frequently accessed documents – Ex: total size of documents accessed per day RAM: How Many Shards Do I Need?
  • 29. • Measuring Index Size – db.coll.stats() – index size of collection • Estimate frequently accessed documents – Ex: total size of documents accessed per day RAM: How Many Shards Do I Need? Example Working Set = 428 GB Server RAM = 128 GB 428/128 = 3.34 4 Shards Required
  • 30. • Measure max sustained query rate of a single server (with replication) – build a prototype and measure • Assume sharding overhead of 20-30% Query Rate: How Many Shards Do I Need?
  • 31. • Measure max sustained query rate of a single server (with replication) – build a prototype and measure • Assume sharding overhead of 20-30% Query Rate: How Many Shards Do I Need? Example Require: 50K ops/sec Prototype performance: 20 ops/sec (1 replica set) 4 Shards Required: 80 ops/sec * .7 = 56K ops/sec
  • 32.
  • 33. Configure Them Properly • Default OS Settings Often Don’t Provide Optimal Performance • See MongoDB Production Notes – https://docs.mongodb.org/manual/administration/production-notes • Also Review: – Amazon EC2: https://docs.mongodb.org/ecosystem/platforms/amazon-ec2/ – Azure: https://docs.mongodb.org/ecosystem/platforms/windows-azure/
  • 34. Server/OS Configuration • Server configuration recommendations – XFS – Turn off atime and diratime – NOOP scheduler – File descriptor limits – Disable transparent huge pages and NUMA – Read ahead of 32 – Separate data volumes for data files, the journal, and the log. – Change the default TCP keepalive time to 300 seconds.
  • 35. These are important • Ignore them and your performance may suffer • The first 100 lines of the MongoDB logs identifies suboptimal OS settings
  • 37. Don’t Use a Relational Schema
  • 38. Taylor MongoDB Schema toApplication Workload • Design schema to provide good query performance • Schema design will impact required number of shards! Application Query Workload { Name: “john” Height: 12 Address: {…} } db.cust.find({…}) db.cust.aggregate({…})
  • 39. Compare Alternative Schemas • Build a spreadsheet • Calculate # of shards for each schema • Estimate query performance – # of documents – # of inserts – # of deletes – Required indexes – Number of documents inspected – Number of documents sent across network
  • 40. Modeling Decisions • Referencing vs. Embedding • Aggregating data by device, customer, product, etc.
  • 41. Referencing Procedure { "_id" : 333, "date" : "2003-02-09T05:00:00"), "hospital" : “County Hills”, "patient" : “John Doe”, "physician" : “Stephen Smith”, "type" : ”Chest X-ray", ”result" : 134 } Results { “_id” : 134 "type" : "txt", "size" : NumberInt(12), "content" : { value1: 343, value2: “abc”, … } }
  • 42. EmbeddingProcedure { "_id" : 333, "date" : "2003-02-09T05:00:00"), "hospital" : “County Hills”, "patient" : “John Doe”, "physician" : “Stephen Smith”, "type" : ”Chest X-ray", ”result" : { "type" : "txt", "size" : NumberInt(12), "content" : { value1: 343, value2: “abc”, … } } }
  • 43. Embedding • Advantages – Retrieve all relevant information in a single query/document – Avoid implementing joins in application code – Update related information as a single atomic operation • MongoDB doesn’t offer multi-document transactions • Limitations – Large documents mean more overhead if most fields are not relevant – Might mean replicating data – 16 MB document size limit
  • 44. Referencing • Advantages – Smaller documents – Less likely to reach 16 MB document limit – Infrequently accessed information not accessed on every query – No duplication of data • Limitations – Two queries required to retrieve information – Cannot update related information atomically
  • 45. { _id: 2, first: “Joe”, last: “Patient”, addr: { …}, procedures: [ { id: 12345, date: 2015-02-15, type: “Cat scan”, …}, { id: 12346, date: 2015-02-15, type: “blood test”, …}] } Patients Embed One-to-Many & Many-to-Many Relationships { _id: 2, first: “Joe”, last: “Patient”, addr: { …}, procedures: [12345, 12346]} { _id: 12345, date: 2015-02-15, type: “Cat scan”, …} { _id: 12346, date: 2015-02-15, type: “blood test”, …} Patients Reference Procedures
  • 46. Schema Alternatives – Do the math? • How complex queries? • How much hardware/shards will I need?
  • 47. Vital Sign Monitoring Device Vital Signs Measured: • Blood Pressure • Pulse • Blood Oxygen Levels Produces data at regular intervals • Once per minute
  • 48. We have a hospital(s) of devices
  • 49. Data From Vital Signs Monitoring Device { deviceId: 123456, spO2: 88, pulse: 74, bp: [128, 80], ts: ISODate("2013-10-16T22:07:00.000-0500") } • One document per minute per device • Relational approach
  • 50. Document Per Hour (By minute) { deviceId: 123456, spO2: { 0: 88, 1: 90, …, 59: 92}, pulse: { 0: 74, 1: 76, …, 59: 72}, bp: { 0: [122, 80], 1: [126, 84], …, 59: [124, 78]}, ts: ISODate("2013-10-16T22:00:00.000-0500") } • Store per-minute data at the hourly level • Update-driven workload • 1 document per device per hour
  • 51. Characterizing Write Differences • Example: data generated every minute • Recording the data for 1 patient for 1 hour: Document Per Event 60 inserts Document Per Hour 1 insert, 59 updates
  • 52. Characterizing Read Differences • Want to graph 24 hour of vital signs for a patient: • Read performance is greatly improved Document Per Event 1440 reads Document Per Hour 24 reads
  • 53. Characterizing Memory and Storage Differences Document Per Minute Document Per Hour Number Documents 52.6 B 876 M Total Index Size 6364 GB 106 GB _id index 1468 GB 24.5 GB {ts: 1, deviceId: 1} 4895 GB 81.6 GB Document Size 92 Bytes 758 Bytes Database Size 4503 GB 618 GB • 100K Devices • 1 years worth of data
  • 54. Characterizing Memory and Storage Differences Document Per Minute Document Per Hour Number Documents 52.6 B 876 M Total Index Size 6364 GB 106 GB _id index 1468 GB 24.5 GB {ts: 1, deviceId: 1} 4895 GB 81.6 GB Document Size 92 Bytes 758 Bytes Database Size 4503 GB 618 GB • 100K Devices • 1 years worth of data 100000 * 365 * 24 * 60 100000 * 365 * 24
  • 55. Characterizing Memory and Storage Differences Document Per Minute Document Per Hour Number Documents 52.6 B 876 M Total Index Size 6364 GB 106 GB _id index 1468 GB 24.5 GB {ts: 1, deviceId: 1} 4895 GB 81.6 GB Document Size 92 Bytes 758 Bytes Database Size 4503 GB 618 GB • 100K Devices • 1 years worth of data 100000 * 365 * 24 * 60 * 130 100000 * 365 * 24 * 130
  • 56. Characterizing Memory and Storage Differences Document Per Minute Document Per Hour Number Documents 52.6 B 876 M Total Index Size 6364 GB 106 GB _id index 1468 GB 24.5 GB {ts: 1, deviceId: 1} 4895 GB 81.6 GB Document Size 92 Bytes 758 Bytes Database Size 4503 GB 618 GB • 100K Devices • 1 years worth of data 100000 * 365 * 24 * 60 * 92 100000 * 365 * 24 * 758
  • 58. Rule of Thumb • To saturate a MongoDB cluster  – loader hardware ~= mongodb hardware • Many threads • Many mongos
  • 62. Loader Architecture loader (8) mongos (4) primary primary primary secondary secondary secondary secondary secondary secondary loader (8) mongos (4) loader (8) mongos (4) Use many threads Use multiple loader servers
  • 63. When Sharding • If you care about initial performance, you must pre-split • Otherwise, initial performance will be slow • (hash sharding automatically presplits collection)
  • 64. Without presplitting Shard 1 Shard 2 Shard 3 Shard 4 -∞ … ∞ • sh.shardCollection(“records.patients”, {zipcode : 1})
  • 65. Without presplitting Shard 1 Shard 2 Shard 3 Shard 4 -∞ … 11305 • 64K chunks • Splitting will occur quickly • Balancing occurs much more slowly • The entire query workload  Shard 1 11306 … 44506 44507 … ∞
  • 66. Without presplitting Shard 1 Shard 2 Shard 3 Shard 4 -∞ … 11305 11306 … 44506 44507 … ∞ Loader mongos
  • 67. Split collection Shard 1 Shard 2 Shard 3 Shard 4 • Split and distribute empty chunks before loading any data • Evenly distribute query load across cluster -∞ … 08333 08334 … 16667 16668 … 25000 25001… 33334 33335 … 41668 41669 … 50000 50001 … 58334 58335 … 66668 66669 … 75000 75001 … 83334 88335 … 96668 96669 … 99999
  • 68. Split collection Shard 1 Shard 2 Shard 3 Shard 4 -∞ … 08333 08334 … 16667 16668 … 25000 25001… 33334 33335 … 41668 41669 … 50000 50001 … 58334 58335 … 66668 66669 … 75000 75001 … 83334 88335 … 96668 96669 … 99999 Loader mongos
  • 70. Best Practices 1. Use servers with specifications that will provide good MongoDB performance – 64+ GB RAM, many cores, many IOPS (RAID-10/SSDs) 2. Calculate How Many Shards? 1. Calculate required RAM and Disk Space 2. Build a prototype to determine the ops/sec capacity of a server 3. Do the math 3. Configure OS for Optimal MongoDB Performance – See MongoDB Production Notes – Review logs for warnings (Don’t ignore)
  • 71. Best Practices (cont.) 4. Create a Document Schema – Denormalized 5. Tailor schema to application workload – Use application queries to guide schema design decisions – Consider alternative schemas – Compare cluster size (# of shards) and performance – Build a spreadsheet
  • 72. Best Practices 6. Loading Data – Loader Hardware ~= MongoDB hardware – Many threads – Many mongos 7. Pre-split – Ensure query workload is evenly distributed across the cluster from the start

Editor's Notes

  1. Here we have greatly reduced the relational data model for this application to two tables. In reality no database has two tables. It is much more common to have hundreds or thousands of tables. And as a developer where do you begin when you have a complex data model?? If you’re building an app you’re really thinking about just a hand full of common things, like products, and these can be represented in a document much more easily that a complex relational model where the data is broken up in a way that doesn’t really reflect the way you think about the data or write an application.
  2. MongoDB provides horizontal scale-out for databases using a technique called sharding, which is trans- parent to applications. Sharding distributes data across multiple physical partitions called shards. Sharding allows MongoDB deployments to address the hardware limitations of a single server, such as bottlenecks in RAM or disk I/O, without adding complexity to the application. MongoDB supports three types of sharding: • Range-based Sharding. Documents are partitioned across shards according to the shard key value. Documents with shard key values “close” to one another are likely to be co-located on the same shard. This approach is well suited for applications that need to optimize range- based queries. • Hash-based Sharding. Documents are uniformly distributed according to an MD5 hash of the shard key value. Documents with shard key values “close” to one another are unlikely to be co-located on the same shard. This approach guarantees a uniform distribution of writes across shards, but is less optimal for range-based queries. • Tag-aware Sharding. Documents are partitioned according to a user-specified configuration that associates shard key ranges with shards. Users can optimize the physical location of documents for application requirements such as locating data in specific data centers. MongoDB automatically balances the data in the cluster as the data grows or the size of the cluster increases or decreases.
  3. Sharding is transparent to applications; whether there is one or one hundred shards, the application code for querying MongoDB is the same. Applications issue queries to a query router that dispatches the query to the appropriate shards. For key-value queries that are based on the shard key, the query router will dispatch the query to the shard that manages the document with the requested key. When using range-based sharding, queries that specify ranges on the shard key are only dispatched to shards that contain documents with values within the range. For queries that don’t use the shard key, the query router will dispatch the query to all shards and aggregate and sort the results as appropriate. Multiple query routers can be used with a MongoDB system, and the appropriate number is determined based on performance and availability requirements of the application.
  4. High Availability – Ensure application availability during many types of failures Disaster Recovery – Address the RTO and RPO goals for business continuity Maintenance – Perform upgrades and other maintenance operations with no application downtime Secondaries can be used for a variety of applications – failover, hot backup, rolling upgrades, data locality and privacy and workload isolation