MongoDB Schema Design
Tips & Tricks
Grupo Undanet
August 2017, Salamanca
Who am I
Juan Roy
Twitter: @juanroycouto
Email: juanroycouto@gmail.com
MongoDB DBA at Grupo Undanet
Agenda
● What is MongoDB
● What is a JSON Document
● What a Document Must Contain
● Relational Approach vs Document Model
● Normalization vs Denormalization
● Embedding Documents
● Things to Keep in Mind
● Goals
● Over Normalization
● Overloaded Documents
● Working Set
● Historic Information
● 1-1
● 1-Few (Embedding & Referencing)
● N-1
● 1-Many
● Many-Many
● Recap
What is MongoDB
● Non-Relational Database
● NoSQL Multipurpose Database
● Main Characteristics:
○ Scalability
○ High Availability
○ Automatic Failover
○ …
● Document-based (JSON)
SQL            MongoDB
Database       Database
Table          Collection
Row (record)   Document
What is a JSON Document
{
  "_id" : ObjectId("59400587962fe33db2194129"),
  "description" : "MICHELIN 285/30 ZR21 PILOT SUPER SPORT 2012",
  "date" : ISODate("2017-08-28T04:02:32Z"),
  "property" : {
    "tag" : {
      "noisebands" : "1",
      "rollingresistance" : "B",
      "noise" : "69",
      "wetgrip" : "A"
    },
    "ratio" : 30
  },
  "ecotasa" : [
    {
      "country" : "724",
      "price" : NumberDecimal("1.380000")
    },
    {
      "country" : "620",
      "price" : NumberDecimal("0.000000")
    }
  ],
  "location" : {
    "type" : "Point",
    "coordinates" : [ -5.724332, 40.959219 ]
  }
}
Field types in this document: _id, string, date, subdocument, number, array, geo-location.
What a Document must Contain
● Ideally
○ All data related to the application's principal item
○ 1 document per item

Application    Principal Item
Catalog        Article
Finance        Client

● In practice
○ The most frequently accessed data
Relational Approach vs Document Model
{
  "_id" : ObjectId("59400587962fe33db2194129"),
  "description" : "MICHELIN 285/30 ZR21 PILOT SUPER SPORT 2012",
  "date" : ISODate("2017-08-28T04:02:32Z"),
  "property" : {
    "tag" : {
      "noisebands" : "1",
      "rollingresistance" : "B",
      "noise" : "69",
      "wetgrip" : "A"
    },
    "ratio" : 30
  },
  "ecotasa" : [
    {
      "country" : "724",
      "price" : NumberDecimal("1.380000")
    },
    {
      "country" : "620",
      "price" : NumberDecimal("0.000000")
    }
  ],
  "location" : {
    "type" : "Point",
    "coordinates" : [ -5.724332, 40.959219 ]
  }
}
Normalization vs Denormalization
People
{
_id : 1,
name : 'Peter',
city : 'Salamanca'
}
Motorbikes
{
_id : 1,
owner : 1,
color : 'red',
model : 'Suzuki'
}
{
_id : 2,
owner : 1,
color : 'black',
model : 'Harley Davidson'
}
People
{
_id : 1,
name : 'Peter',
city : 'Salamanca',
motorbikes : [
{
model : 'Suzuki',
color : 'red'
},
{
model : 'Harley Davidson',
color : 'black'
}
]
}
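The first layout (separate People and Motorbikes collections linked by owner) is normalization. The second (motorbikes embedded as an array in the People document) is denormalization.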
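As a rough illustration of how the two layouts relate, the sketch below folds the normalized People and Motorbikes documents from this slide into the embedded shape. It is plain JavaScript over in-memory arrays rather than a MongoDB call, and the helper name embedMotorbikes is hypothetical.

```javascript
// Normalized data, as on the slide: two collections linked by `owner`.
const people = [{ _id: 1, name: 'Peter', city: 'Salamanca' }];
const motorbikes = [
  { _id: 1, owner: 1, color: 'red', model: 'Suzuki' },
  { _id: 2, owner: 1, color: 'black', model: 'Harley Davidson' },
];

// Hypothetical helper: build one denormalized document per person by
// embedding that person's motorbikes as an array of { model, color }.
function embedMotorbikes(people, motorbikes) {
  return people.map((person) => ({
    ...person,
    motorbikes: motorbikes
      .filter((bike) => bike.owner === person._id)
      .map(({ model, color }) => ({ model, color })),
  }));
}

const denormalized = embedMotorbikes(people, motorbikes);
console.log(JSON.stringify(denormalized, null, 2));
```

The embedded copies drop _id and owner, since the enclosing document already identifies the owner.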
Embedding Documents
People
{
_id : 1,
name : 'Peter',
city : 'Salamanca'
}
Motorbikes
{
_id : 1,
owner : 1,
color : 'red',
model : 'Suzuki'
}
{
_id : 2,
owner : 1,
color : 'black',
model : 'Harley Davidson'
}
People
{
_id : 1,
name : 'Peter',
city : 'Salamanca',
motorbikes : [
{
model : 'Suzuki',
color : 'red'
},
{
model : 'Harley Davidson',
color : 'black'
}
]
}
Things to Keep in Mind
● Avoid Relational Approach
● What will happen if we scale
● Size of:
○ Data
○ Index
○ Document
● How will users access the data
○ Normal users
○ Machine Learning
○ Business Intelligence
Goals
● Performance
● Scalability
● Simplicity
Over Normalization
● The relational model has been copied directly into the MongoDB model.
● In the relational world it is common to have one table per concept, because tables do not have arrays.
● A single action then requires multiple queries instead of querying the data once.
Overloaded Documents
● This problem arises when the application packs lots of rarely used data into its frequently accessed documents.
● Every read of such a document pulls the rarely used data into the cache as well, which makes it more likely to evict other important data.
● Multiply this across a collection and the server can end up paging far more data than necessary to service the application.
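One way to picture the fix is to split an overloaded document into a small "hot" part and a "cold" part that shares the same _id and can live in a second collection. The sketch below does this in plain JavaScript; the sample fields, the HOT_FIELDS list, and the splitDocument helper are all assumptions made up for illustration.

```javascript
// Hypothetical user document mixing hot (frequently read) and cold fields.
const user = {
  _id: 1,
  name: 'Peter',
  city: 'Salamanca',
  loginHistory: ['2017-08-01', '2017-08-14', '2017-08-28'], // rarely read
  marketingSurvey: { answers: ['a', 'c', 'b'] }, // rarely read
};

// Assumption: the application knows which fields it reads on most requests.
const HOT_FIELDS = ['_id', 'name', 'city'];

// Split one overloaded document into a small hot document and a cold
// document that carries the same _id.
function splitDocument(doc, hotFields) {
  const hot = {};
  const cold = { _id: doc._id };
  for (const [key, value] of Object.entries(doc)) {
    if (hotFields.includes(key)) hot[key] = value;
    else cold[key] = value;
  }
  return { hot, cold };
}

const { hot, cold } = splitDocument(user, HOT_FIELDS);
console.log(hot);
console.log(cold);
```

Which fields count as hot depends entirely on the application's access patterns, so the list would need to come from observing real queries.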
Working Set
The Working Set is the size of:
● Our most frequently accessed data
plus
● Our indexes
The Working Set must fit in RAM!
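The rule can be turned into a back-of-the-envelope check. The sketch below adds the hot share of the data to the index size and compares the result with available memory; every figure in it is a made-up assumption, and workingSetBytes is a hypothetical helper, not a MongoDB function.

```javascript
// All figures are made-up assumptions for illustration.
const GiB = 1024 ** 3;
const totalDataBytes = 200 * GiB; // size of all documents
const hotDataFraction = 0.25; // share of the data that is accessed most often
const indexBytes = 12 * GiB; // size of all indexes
const ramBytes = 64 * GiB; // memory available to the database

// Working set = most accessed data + indexes.
function workingSetBytes(dataBytes, hotFraction, idxBytes) {
  return dataBytes * hotFraction + idxBytes;
}

const workingSet = workingSetBytes(totalDataBytes, hotDataFraction, indexBytes);
const fitsInRam = workingSet <= ramBytes;
console.log(`working set: ${(workingSet / GiB).toFixed(1)} GiB, fits in RAM: ${fitsInRam}`);
```

In practice the hot fraction is the hard number to pin down, and some memory is also needed by the operating system and other processes, so a real estimate should leave headroom.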
Working Set
If the Working Set does not fit in RAM, what should I do?
● Add more RAM to our machine
● Shard
● Reduce the size of our Working Set:
○ Limit our arrays
○ Limit our embedded documents
○ …
○ Benefits:
■ Fast data retrieval
■ One query brings all the information needed
Historic Information
● When data grows continuously (historical data) and we embed it in our main collection, each document carries a lot of information that is rarely needed. We may still want to store it for analytics, so we keep it out of the user document, in a separate collection.
● This does not apply to information with limited growth (addresses, phone numbers, etc.).
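A minimal sketch of this idea, assuming a login history as the ever-growing data: the user document keeps only a short tail of recent events, while every event is also written to its own collection for analytics. The collections are simulated with in-memory structures, and the names recentLogins, loginHistory, MAX_RECENT, and recordLogin are hypothetical.

```javascript
// In-memory stand-ins for two collections (hypothetical names).
const users = new Map([[1, { _id: 1, name: 'Peter', recentLogins: [] }]]);
const loginHistory = []; // one document per event, kept for analytics

const MAX_RECENT = 3; // assumption: the application only shows the last 3

function recordLogin(userId, when) {
  // The unbounded history goes to its own collection.
  loginHistory.push({ user_id: userId, date: when });

  // The user document keeps only a bounded tail of recent events.
  const user = users.get(userId);
  user.recentLogins.push(when);
  if (user.recentLogins.length > MAX_RECENT) {
    user.recentLogins = user.recentLogins.slice(-MAX_RECENT);
  }
}

const days = ['2017-08-01', '2017-08-07', '2017-08-14', '2017-08-21', '2017-08-28'];
days.forEach((day) => recordLogin(1, day));

console.log(users.get(1).recentLogins);
console.log(loginHistory.length);
```

On a real server, the $push update operator with its $each and $slice modifiers can cap an embedded array in a similar way, so the trimming would not have to happen in application code.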
1-1
id  name  phone_number  zip_code
1   Rick  555-111-1234  01209
2   Mike  555-222-2345  30062
Users
{
_id : 1,
name : 'Rick',
phone_number : '555-111-1234',
zip_code : '01209'
}
{
_id : 2,
name : 'Mike',
phone_number : '555-222-2345',
zip_code : '30062'
}
1-Few
● Referencing (or Normalization)
○ To show a user's information we need joins (or more than one query). This implies random seeks, which are very slow.
● Embedding (or Denormalization)
○ We can avoid joins via denormalization. This can introduce redundant data, and the application becomes more complex because it must avoid creating inconsistencies.
○ Arrays let us embed without redundancy, which gives us performance benefits.
○ With denormalization there are many possible data models, which makes it harder to choose one.
1-Few
id  name  zip_code
1   Rick  01209
2   Mike  30062

id  user_id  phone_number
1   1        555-111-1234
2   2        555-222-2345
3   2        555-333-3456
1-Few (MongoDB-Embedding)
● This approach gives us the best performance and data consistency guarantees.
● Locality: MongoDB stores each document contiguously on disk, so putting all the data you need into one document means you are never more than one seek away from everything you need.
● Atomicity and isolation: with embedding, a write to the document is atomic (transactional).
{
_id : 2,
name : 'Mike',
zip_code : '30062',
phone_numbers : [ '555-222-2345', '555-333-3456' ]
}
1-Few (MongoDB-Referencing)
{
_id : 2,
name : 'Mike',
zip_code : '30062',
phone_numbers : [ 2, 3 ]
}
{
_id : 2,
user_id : 2,
phone_number : '555-222-2345'
}
{
_id : 3,
user_id : 2,
phone_number : '555-333-3456'
}
● With referencing we lose transactionality.
● We need:
○ More than one query, or
○ $lookup (joins)
● This approach performs worse than embedding.
● If we read the data frequently, it is better to embed it.
● Referencing gives flexibility to project only the desired fields.
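The cost of referencing can be sketched as an application-side join: one query for the user, then a second query to resolve the ids in phone_numbers. The code below simulates both collections in memory and counts the round trips; findUser, findPhones, and loadUserWithPhones are hypothetical helpers, not driver calls.

```javascript
// In-memory stand-ins for the two collections on this slide.
const usersById = new Map([
  [2, { _id: 2, name: 'Mike', zip_code: '30062', phone_numbers: [2, 3] }],
]);
const phonesById = new Map([
  [2, { _id: 2, user_id: 2, phone_number: '555-222-2345' }],
  [3, { _id: 3, user_id: 2, phone_number: '555-333-3456' }],
]);

let queryCount = 0; // counts simulated round trips to the database

function findUser(id) {
  queryCount += 1;
  return usersById.get(id);
}

function findPhones(ids) {
  queryCount += 1; // a single query fetching all referenced ids
  return ids.map((id) => phonesById.get(id)).filter(Boolean);
}

// Application-side join: resolve the references with a second query.
function loadUserWithPhones(id) {
  const user = findUser(id);
  const phones = findPhones(user.phone_numbers);
  return { ...user, phone_numbers: phones.map((p) => p.phone_number) };
}

const mike = loadUserWithPhones(2);
console.log(mike.phone_numbers, `queries: ${queryCount}`);
```

With the embedded layout the same information would come back from the first query alone.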
N-1
{
_id : 2,
name : 'Mike',
zip_code : '30062',
phone_numbers : [ 2, 3 ],
address : '13, Rue del Percebe'
}
{
_id : 1,
name : 'Rick',
zip_code : '01209',
phone_numbers : [ 1 ],
address : '13, Rue del Percebe'
}
What if two people share an address?
● Does that mean you have to store the address twice? Yes, you store it twice, three times, or as many times as needed.
● This is better than making unnecessary joins. The extra disk space you need buys you faster queries.
1-Many
Case: A blog with hundreds, or even thousands, of comments for a given post.
Embedding carries significant penalties:
● The larger a document is, the more RAM it uses. The fewer documents fit in RAM, the more likely the server is to page fault to retrieve documents, and page faults ultimately lead to random disk I/O.
● Growing documents must eventually be copied to larger spaces.
● The document never stops growing.
● MongoDB documents have a hard size limit of 16MB.
Referencing:
● The post document does not grow, because each comment is its own document in a second collection.
● Suited to very large or unpredictable one-to-many relationships.
Solution: We may only wish to display the first three comments when showing a blog entry; loading more simply wastes RAM.
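To get a feel for the 16MB limit, the following sketch estimates how many embedded comments a single post document could hold. The post and comment sizes are assumptions rather than measurements, the estimate ignores per-element storage overhead, and maxEmbeddedComments is a hypothetical helper.

```javascript
const MAX_DOCUMENT_BYTES = 16 * 1024 * 1024; // the 16MB document size limit

// Assumptions, not measurements: sizes in bytes for one blog post.
const postBodyBytes = 20 * 1024; // post text and metadata
const avgCommentBytes = 512; // one embedded comment

// How many comments fit before the post document reaches the limit?
function maxEmbeddedComments(limitBytes, baseBytes, commentBytes) {
  return Math.floor((limitBytes - baseBytes) / commentBytes);
}

const capacity = maxEmbeddedComments(MAX_DOCUMENT_BYTES, postBodyBytes, avgCommentBytes);
console.log(`about ${capacity} comments fit in one document`);
```

The hard limit is only the last line of defense: the RAM and page-fault costs listed on the slide are likely to hurt well before a document gets anywhere near it.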
Many-Many
● We will embed a list of _id values in both directions
● We no longer have redundant information
Product
{ _id : 'My product',
category_ids : [ 'My category',... ]
}
Category
{ _id : 'My category',
product_ids : [ 'My product', … ]
}
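A small sketch of the two-way reference lists, assuming a hypothetical link helper that records the relationship on both sides and skips duplicates. The Product and Category collections are simulated with in-memory maps; nothing here is a MongoDB API call.

```javascript
// In-memory stand-ins for the Product and Category collections.
const products = new Map([['My product', { _id: 'My product', category_ids: [] }]]);
const categories = new Map([['My category', { _id: 'My category', product_ids: [] }]]);

// Hypothetical helper: record the link on both sides, without duplicates.
function link(productId, categoryId) {
  const product = products.get(productId);
  const category = categories.get(categoryId);
  if (!product.category_ids.includes(categoryId)) product.category_ids.push(categoryId);
  if (!category.product_ids.includes(productId)) category.product_ids.push(productId);
}

link('My product', 'My category');
link('My product', 'My category'); // repeating the call changes nothing

// Either direction can now be read from a single document.
console.log(products.get('My product').category_ids);
console.log(categories.get('My category').product_ids);
```

On a real server the two updates touch two different documents, so the application has to keep both lists in step. The $addToSet update operator offers the same "no duplicates" behaviour on the server side.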
Recap
● Avoid round trips to the database.
● User events should only generate a small number of queries.
● Use arrays when needed and of course when they won’t grow indefinitely.
● Don’t just migrate relational schemas.
● Data that is queried together should be in the same document whenever possible.
● Store the last login time, plus the shopping cart, in the user document since that is all
we need for the landing page.
● Embedding for performance and atomicity (transactionality).
● Referencing for huge relationships.
Ultimately, the decision depends on the access patterns of your application.
Questions?
Thank you!
Thank you for your attention!
27

More Related Content

What's hot

MongoDB Europe 2016 - Big Data meets Big Compute
MongoDB Europe 2016 - Big Data meets Big ComputeMongoDB Europe 2016 - Big Data meets Big Compute
MongoDB Europe 2016 - Big Data meets Big ComputeMongoDB
 
Conceptos básicos. Seminario web 2: Su primera aplicación MongoDB
 Conceptos básicos. Seminario web 2: Su primera aplicación MongoDB Conceptos básicos. Seminario web 2: Su primera aplicación MongoDB
Conceptos básicos. Seminario web 2: Su primera aplicación MongoDBMongoDB
 
Conceptos básicos. Seminario web 6: Despliegue de producción
Conceptos básicos. Seminario web 6: Despliegue de producciónConceptos básicos. Seminario web 6: Despliegue de producción
Conceptos básicos. Seminario web 6: Despliegue de producciónMongoDB
 
How to leverage MongoDB for Big Data Analysis and Operations with MongoDB's A...
How to leverage MongoDB for Big Data Analysis and Operations with MongoDB's A...How to leverage MongoDB for Big Data Analysis and Operations with MongoDB's A...
How to leverage MongoDB for Big Data Analysis and Operations with MongoDB's A...Gianfranco Palumbo
 
MongoDB Days Silicon Valley: Jumpstart: Ops/Admin 101
MongoDB Days Silicon Valley: Jumpstart: Ops/Admin 101MongoDB Days Silicon Valley: Jumpstart: Ops/Admin 101
MongoDB Days Silicon Valley: Jumpstart: Ops/Admin 101MongoDB
 
Conceptos básicos. seminario web 3 : Diseño de esquema pensado para documentos
Conceptos básicos. seminario web 3 : Diseño de esquema pensado para documentosConceptos básicos. seminario web 3 : Diseño de esquema pensado para documentos
Conceptos básicos. seminario web 3 : Diseño de esquema pensado para documentosMongoDB
 
Webinar: Best Practices for Getting Started with MongoDB
Webinar: Best Practices for Getting Started with MongoDBWebinar: Best Practices for Getting Started with MongoDB
Webinar: Best Practices for Getting Started with MongoDBMongoDB
 
Back to Basics Webinar 2: Your First MongoDB Application
Back to Basics Webinar 2: Your First MongoDB ApplicationBack to Basics Webinar 2: Your First MongoDB Application
Back to Basics Webinar 2: Your First MongoDB ApplicationMongoDB
 
Doing Joins in MongoDB: Best Practices for Using $lookup
Doing Joins in MongoDB: Best Practices for Using $lookupDoing Joins in MongoDB: Best Practices for Using $lookup
Doing Joins in MongoDB: Best Practices for Using $lookupMongoDB
 
Back to Basics Webinar 1: Introduction to NoSQL
Back to Basics Webinar 1: Introduction to NoSQLBack to Basics Webinar 1: Introduction to NoSQL
Back to Basics Webinar 1: Introduction to NoSQLMongoDB
 
Conceptos básicos. Seminario web 5: Introducción a Aggregation Framework
Conceptos básicos. Seminario web 5: Introducción a Aggregation FrameworkConceptos básicos. Seminario web 5: Introducción a Aggregation Framework
Conceptos básicos. Seminario web 5: Introducción a Aggregation FrameworkMongoDB
 
Mongo db – document oriented database
Mongo db – document oriented databaseMongo db – document oriented database
Mongo db – document oriented databaseWojciech Sznapka
 
Webinar: Back to Basics: Thinking in Documents
Webinar: Back to Basics: Thinking in DocumentsWebinar: Back to Basics: Thinking in Documents
Webinar: Back to Basics: Thinking in DocumentsMongoDB
 
Webinar: Getting Started with MongoDB - Back to Basics
Webinar: Getting Started with MongoDB - Back to BasicsWebinar: Getting Started with MongoDB - Back to Basics
Webinar: Getting Started with MongoDB - Back to BasicsMongoDB
 
Jumpstart: Introduction to MongoDB
Jumpstart: Introduction to MongoDBJumpstart: Introduction to MongoDB
Jumpstart: Introduction to MongoDBMongoDB
 
MongoDB .local Munich 2019: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local Munich 2019: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local Munich 2019: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local Munich 2019: MongoDB Atlas Data Lake Technical Deep DiveMongoDB
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB
 
User Data Management with MongoDB
User Data Management with MongoDB User Data Management with MongoDB
User Data Management with MongoDB MongoDB
 

What's hot (20)

MongoDB 101
MongoDB 101MongoDB 101
MongoDB 101
 
MongoDB Europe 2016 - Big Data meets Big Compute
MongoDB Europe 2016 - Big Data meets Big ComputeMongoDB Europe 2016 - Big Data meets Big Compute
MongoDB Europe 2016 - Big Data meets Big Compute
 
Conceptos básicos. Seminario web 2: Su primera aplicación MongoDB
 Conceptos básicos. Seminario web 2: Su primera aplicación MongoDB Conceptos básicos. Seminario web 2: Su primera aplicación MongoDB
Conceptos básicos. Seminario web 2: Su primera aplicación MongoDB
 
Conceptos básicos. Seminario web 6: Despliegue de producción
Conceptos básicos. Seminario web 6: Despliegue de producciónConceptos básicos. Seminario web 6: Despliegue de producción
Conceptos básicos. Seminario web 6: Despliegue de producción
 
How to leverage MongoDB for Big Data Analysis and Operations with MongoDB's A...
How to leverage MongoDB for Big Data Analysis and Operations with MongoDB's A...How to leverage MongoDB for Big Data Analysis and Operations with MongoDB's A...
How to leverage MongoDB for Big Data Analysis and Operations with MongoDB's A...
 
MongoDB Days Silicon Valley: Jumpstart: Ops/Admin 101
MongoDB Days Silicon Valley: Jumpstart: Ops/Admin 101MongoDB Days Silicon Valley: Jumpstart: Ops/Admin 101
MongoDB Days Silicon Valley: Jumpstart: Ops/Admin 101
 
Conceptos básicos. seminario web 3 : Diseño de esquema pensado para documentos
Conceptos básicos. seminario web 3 : Diseño de esquema pensado para documentosConceptos básicos. seminario web 3 : Diseño de esquema pensado para documentos
Conceptos básicos. seminario web 3 : Diseño de esquema pensado para documentos
 
Webinar: Best Practices for Getting Started with MongoDB
Webinar: Best Practices for Getting Started with MongoDBWebinar: Best Practices for Getting Started with MongoDB
Webinar: Best Practices for Getting Started with MongoDB
 
Back to Basics Webinar 2: Your First MongoDB Application
Back to Basics Webinar 2: Your First MongoDB ApplicationBack to Basics Webinar 2: Your First MongoDB Application
Back to Basics Webinar 2: Your First MongoDB Application
 
Doing Joins in MongoDB: Best Practices for Using $lookup
Doing Joins in MongoDB: Best Practices for Using $lookupDoing Joins in MongoDB: Best Practices for Using $lookup
Doing Joins in MongoDB: Best Practices for Using $lookup
 
Back to Basics Webinar 1: Introduction to NoSQL
Back to Basics Webinar 1: Introduction to NoSQLBack to Basics Webinar 1: Introduction to NoSQL
Back to Basics Webinar 1: Introduction to NoSQL
 
Conceptos básicos. Seminario web 5: Introducción a Aggregation Framework
Conceptos básicos. Seminario web 5: Introducción a Aggregation FrameworkConceptos básicos. Seminario web 5: Introducción a Aggregation Framework
Conceptos básicos. Seminario web 5: Introducción a Aggregation Framework
 
Mongo db – document oriented database
Mongo db – document oriented databaseMongo db – document oriented database
Mongo db – document oriented database
 
Webinar: Back to Basics: Thinking in Documents
Webinar: Back to Basics: Thinking in DocumentsWebinar: Back to Basics: Thinking in Documents
Webinar: Back to Basics: Thinking in Documents
 
Webinar: Getting Started with MongoDB - Back to Basics
Webinar: Getting Started with MongoDB - Back to BasicsWebinar: Getting Started with MongoDB - Back to Basics
Webinar: Getting Started with MongoDB - Back to Basics
 
Jumpstart: Introduction to MongoDB
Jumpstart: Introduction to MongoDBJumpstart: Introduction to MongoDB
Jumpstart: Introduction to MongoDB
 
MongoDB .local Munich 2019: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local Munich 2019: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local Munich 2019: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local Munich 2019: MongoDB Atlas Data Lake Technical Deep Dive
 
MongoDB and Spark
MongoDB and SparkMongoDB and Spark
MongoDB and Spark
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
 
User Data Management with MongoDB
User Data Management with MongoDB User Data Management with MongoDB
User Data Management with MongoDB
 

Similar to MongoDB Schema Design Tips & Tricks

An introduction to MongoDB by César Trigo #OpenExpoDay 2014
An introduction to MongoDB by César Trigo #OpenExpoDay 2014An introduction to MongoDB by César Trigo #OpenExpoDay 2014
An introduction to MongoDB by César Trigo #OpenExpoDay 2014OpenExpoES
 
An introduction to MongoDB
An introduction to MongoDBAn introduction to MongoDB
An introduction to MongoDBCésar Trigo
 
MongoDB Workshop Universidad de Huelva
MongoDB Workshop Universidad de HuelvaMongoDB Workshop Universidad de Huelva
MongoDB Workshop Universidad de HuelvaJuan Antonio Roy Couto
 
MongoDB 3.2 - a giant leap. What’s new?
MongoDB 3.2 - a giant leap. What’s new?MongoDB 3.2 - a giant leap. What’s new?
MongoDB 3.2 - a giant leap. What’s new?Binary Studio
 
[MongoDB.local Bengaluru 2018] Jumpstart: Introduction to Schema Design
[MongoDB.local Bengaluru 2018] Jumpstart: Introduction to Schema Design[MongoDB.local Bengaluru 2018] Jumpstart: Introduction to Schema Design
[MongoDB.local Bengaluru 2018] Jumpstart: Introduction to Schema DesignMongoDB
 
Jumpstart: Introduction to Schema Design
Jumpstart: Introduction to Schema DesignJumpstart: Introduction to Schema Design
Jumpstart: Introduction to Schema DesignMongoDB
 
kranonit S06E01 Игорь Цинько: High load
kranonit S06E01 Игорь Цинько: High loadkranonit S06E01 Игорь Цинько: High load
kranonit S06E01 Игорь Цинько: High loadKrivoy Rog IT Community
 
Webinar: Scaling MongoDB
Webinar: Scaling MongoDBWebinar: Scaling MongoDB
Webinar: Scaling MongoDBMongoDB
 
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB Atlas
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB AtlasMongoDB World 2019: Packing Up Your Data and Moving to MongoDB Atlas
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB AtlasMongoDB
 
MongoDB Introduction - Document Oriented Nosql Database
MongoDB Introduction - Document Oriented Nosql DatabaseMongoDB Introduction - Document Oriented Nosql Database
MongoDB Introduction - Document Oriented Nosql DatabaseSudhir Patil
 
MongoDB Design Patterns
MongoDB Design PatternsMongoDB Design Patterns
MongoDB Design PatternsHaim Michael
 
MongoDB Tick Data Presentation
MongoDB Tick Data PresentationMongoDB Tick Data Presentation
MongoDB Tick Data PresentationMongoDB
 
MongoDB NoSQL - Developer Guide
MongoDB NoSQL - Developer GuideMongoDB NoSQL - Developer Guide
MongoDB NoSQL - Developer GuideShiv K Sah
 
Precog & MongoDB User Group: Skyrocket Your Analytics
Precog & MongoDB User Group: Skyrocket Your Analytics Precog & MongoDB User Group: Skyrocket Your Analytics
Precog & MongoDB User Group: Skyrocket Your Analytics MongoDB
 

Similar to MongoDB Schema Design Tips & Tricks (20)

An introduction to MongoDB by César Trigo #OpenExpoDay 2014
An introduction to MongoDB by César Trigo #OpenExpoDay 2014An introduction to MongoDB by César Trigo #OpenExpoDay 2014
An introduction to MongoDB by César Trigo #OpenExpoDay 2014
 
An introduction to MongoDB
An introduction to MongoDBAn introduction to MongoDB
An introduction to MongoDB
 
MongoDB Workshop Universidad de Huelva
MongoDB Workshop Universidad de HuelvaMongoDB Workshop Universidad de Huelva
MongoDB Workshop Universidad de Huelva
 
MongoDB DOC v1.5
MongoDB DOC v1.5MongoDB DOC v1.5
MongoDB DOC v1.5
 
MongoDB 3.2 - a giant leap. What’s new?
MongoDB 3.2 - a giant leap. What’s new?MongoDB 3.2 - a giant leap. What’s new?
MongoDB 3.2 - a giant leap. What’s new?
 
MongoDB
MongoDBMongoDB
MongoDB
 
[MongoDB.local Bengaluru 2018] Jumpstart: Introduction to Schema Design
[MongoDB.local Bengaluru 2018] Jumpstart: Introduction to Schema Design[MongoDB.local Bengaluru 2018] Jumpstart: Introduction to Schema Design
[MongoDB.local Bengaluru 2018] Jumpstart: Introduction to Schema Design
 
MongoDB Basics Unileon
MongoDB Basics UnileonMongoDB Basics Unileon
MongoDB Basics Unileon
 
Jumpstart: Introduction to Schema Design
Jumpstart: Introduction to Schema DesignJumpstart: Introduction to Schema Design
Jumpstart: Introduction to Schema Design
 
kranonit S06E01 Игорь Цинько: High load
kranonit S06E01 Игорь Цинько: High loadkranonit S06E01 Игорь Цинько: High load
kranonit S06E01 Игорь Цинько: High load
 
Webinar: Scaling MongoDB
Webinar: Scaling MongoDBWebinar: Scaling MongoDB
Webinar: Scaling MongoDB
 
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB Atlas
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB AtlasMongoDB World 2019: Packing Up Your Data and Moving to MongoDB Atlas
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB Atlas
 
MongoDB Introduction - Document Oriented Nosql Database
MongoDB Introduction - Document Oriented Nosql DatabaseMongoDB Introduction - Document Oriented Nosql Database
MongoDB Introduction - Document Oriented Nosql Database
 
MongoDB Design Patterns
MongoDB Design PatternsMongoDB Design Patterns
MongoDB Design Patterns
 
New paradigms
New paradigmsNew paradigms
New paradigms
 
Workflow Engines + Luigi
Workflow Engines + LuigiWorkflow Engines + Luigi
Workflow Engines + Luigi
 
MongoDB Tick Data Presentation
MongoDB Tick Data PresentationMongoDB Tick Data Presentation
MongoDB Tick Data Presentation
 
MongoDB NoSQL - Developer Guide
MongoDB NoSQL - Developer GuideMongoDB NoSQL - Developer Guide
MongoDB NoSQL - Developer Guide
 
Mongodb
MongodbMongodb
Mongodb
 
Precog & MongoDB User Group: Skyrocket Your Analytics
Precog & MongoDB User Group: Skyrocket Your Analytics Precog & MongoDB User Group: Skyrocket Your Analytics
Precog & MongoDB User Group: Skyrocket Your Analytics
 

Recently uploaded

Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionfulawalesam
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxEmmanuel Dauda
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxolyaivanovalion
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptxthyngster
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxStephen266013
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingNeil Barnes
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationshipsccctableauusergroup
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxJohnnyPlasten
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxolyaivanovalion
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxolyaivanovalion
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...Suhani Kapoor
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsappssapnasaifi408
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxfirstjob4
 
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls DubaiDubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls Dubaihf8803863
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfLars Albertsson
 

Recently uploaded (20)

Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptx
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docx
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data Storytelling
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
E-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptxE-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptx
 
Smarteg dropshipping via API with DroFx.pptx

MongoDB Schema Design Tips & Tricks

  • 1. MongoDB Schema Design Tips & Tricks Grupo Undanet August 2017, Salamanca
  • 2. Who am I
      Juan Roy
      Twitter: @juanroycouto
      Email: juanroycouto@gmail.com
      MongoDB DBA at Grupo Undanet
  • 3. Agenda
      ● What is MongoDB
      ● What is a JSON Document
      ● What a Document Must Contain
      ● Relational Approach vs Document Model
      ● Normalization vs Denormalization
      ● Embedding Documents
      ● Things to Keep in Mind
      ● Goals
      ● Over Normalization
      ● Overloaded Documents
      ● Working Set
      ● Historic Information
      ● 1-1
      ● 1-Few (Embedding & Referencing)
      ● N-1
      ● 1-Many
      ● Many-Many
      ● Recap
  • 4. What is MongoDB
      ● Non-Relational Database
      ● NoSQL Multipurpose Database
      ● Main Characteristics:
          ○ Scalability
          ○ High Availability
          ○ Automatic Failover
          ○ …
      ● Document-based (JSON)

      SQL             MongoDB
      Database        Database
      Table           Collection
      Row (record)    Document
  • 5. What is a JSON Document
      {
        "_id" : ObjectId("59400587962fe33db2194129"),
        "description" : "MICHELIN 285/30 ZR21 PILOT SUPER SPORT 2012",
        "date" : ISODate("2017-08-28T04:02:32Z"),
        "property" : {
          "tag" : {
            "noisebands" : "1",
            "rollingresistance" : "B",
            "noise" : "69",
            "wetgrip" : "A"
          },
          "ratio" : 30
        },
        "ecotasa" : [
          { "country" : "724", "price" : NumberDecimal("1.380000") },
          { "country" : "620", "price" : NumberDecimal("0.000000") }
        ],
        "location" : {
          "type" : "Point",
          "coordinates" : [ -5.724332, 40.959219 ]
        }
      }
      Field types shown: _id (_id), string (description), date (date), subdocument (property), number (ratio), array (ecotasa), geo-location (location).
  • 6. What a Document Must Contain
      ● Ideally
          ○ All data related to the application's principal item
          ○ 1 document per item
      ● In practice
          ○ The most frequently accessed data

      Application    Principal Item
      Catalog        Article
      Finance        Client
  • 7. Relational Approach vs Document Model
      {
        "_id" : ObjectId("59400587962fe33db2194129"),
        "description" : "MICHELIN 285/30 ZR21 PILOT SUPER SPORT 2012",
        "date" : ISODate("2017-08-28T04:02:32Z"),
        "property" : {
          "tag" : {
            "noisebands" : "1",
            "rollingresistance" : "B",
            "noise" : "69",
            "wetgrip" : "A"
          },
          "ratio" : "30"
        },
        "ecotasa" : [
          { "country" : "724", "price" : NumberDecimal("1.380000") },
          { "country" : "620", "price" : NumberDecimal("0.000000") }
        ],
        "location" : {
          "type" : "Point",
          "coordinates" : [ -5.724332, 40.959219 ]
        }
      }
  • 8. Normalization vs Denormalization
      Normalization (two collections, linked through owner):
        People
          { _id : 1, name : 'Peter', city : 'Salamanca' }
        Motorbikes
          { _id : 1, owner : 1, color : 'red', model : 'Suzuki' }
          { _id : 2, owner : 1, color : 'black', model : 'Harley Davidson' }
      Denormalization (one collection, motorbikes embedded):
        People
          { _id : 1, name : 'Peter', city : 'Salamanca',
            motorbikes : [
              { model : 'Suzuki', color : 'red' },
              { model : 'Harley Davidson', color : 'black' }
            ] }
  • 9. Embedding Documents
      Two collections:
        People
          { _id : 1, name : 'Peter', city : 'Salamanca' }
        Motorbikes
          { _id : 1, owner : 1, color : 'red', model : 'Suzuki' }
          { _id : 2, owner : 1, color : 'black', model : 'Harley Davidson' }
      One collection with embedded documents:
        People
          { _id : 1, name : 'Peter', city : 'Salamanca',
            motorbikes : [
              { model : 'Suzuki', color : 'red' },
              { model : 'Harley Davidson', color : 'black' }
            ] }
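The step from the two-collection layout to the embedded one can be sketched in plain JavaScript. This is only an illustration under assumptions: the arrays stand in for the People and Motorbikes collections, and `embedMotorbikes` is a hypothetical helper name, not something from the slides or from MongoDB.

```javascript
// Hypothetical in-memory stand-ins for the People and Motorbikes collections.
const people = [{ _id: 1, name: "Peter", city: "Salamanca" }];
const motorbikes = [
  { _id: 1, owner: 1, color: "red", model: "Suzuki" },
  { _id: 2, owner: 1, color: "black", model: "Harley Davidson" },
];

// Build one denormalized document per person by embedding that person's motorbikes.
function embedMotorbikes(people, motorbikes) {
  return people.map((person) => ({
    ...person,
    motorbikes: motorbikes
      .filter((bike) => bike.owner === person._id)
      // Keep only the descriptive fields; _id and owner are no longer needed.
      .map(({ model, color }) => ({ model, color })),
  }));
}

const denormalized = embedMotorbikes(people, motorbikes);
console.log(JSON.stringify(denormalized, null, 2));
```

After this transformation, a single read of Peter's document returns his motorbikes as well, which is the point of embedding.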
  • 10. Things to Keep in Mind
      ● Avoid the relational approach
      ● What will happen if we scale?
      ● Size of:
          ○ Data
          ○ Index
          ○ Document
      ● How will users access the data?
          ○ Normal users
          ○ Machine Learning
          ○ Business Intelligence
  • 11. Goals
      ● Performance
      ● Scalability
      ● Simplicity
  • 12. Over Normalization
      ● Over-normalization happens when the relational model is moved directly into the MongoDB model.
      ● In the relational world it is common to have one table per concept, because tables do not have arrays.
      ● As a result, a single action requires multiple queries instead of querying the data just once.
  • 13. Overloaded Documents
      ● This problem can arise if the application is packing lots of rarely used data into its frequently accessed documents.
      ● If your application is packing rarely used data into a document that needs to be touched frequently, it is more likely to evict other important data from the cache when that document is read.
      ● Multiply this across a collection and the net result is that the server could be paging a lot more data than necessary in order to service the application.
  • 14. Working Set
      The Working Set is the size of:
      ● Our data*, plus
      ● Our indexes*
      * But only the size of our most accessed data.
      The Working Set must fit in RAM!
  • 15. Working Set
      The Working Set does not fit in RAM. What should I do?
      ● Add more RAM to our machine
      ● Shard
      ● Reduce the size of our Working Set:
          ○ Limit our arrays
          ○ Limit our embedded documents
          ○ …
          ○ Benefits:
              ■ Fast data retrieval
              ■ One query brings all the information needed
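The rule that the Working Set must fit in RAM comes down to simple arithmetic. The sketch below uses made-up sizes and hypothetical names (`workingSetBytes`, `fitsInRam`, `hotDataBytes`, `hotIndexBytes`); real figures would have to come from the server's own statistics.

```javascript
// All sizes are hypothetical and expressed in bytes.
const GiB = 1024 ** 3;

// Working set = most accessed data + most accessed indexes.
function workingSetBytes({ hotDataBytes, hotIndexBytes }) {
  return hotDataBytes + hotIndexBytes;
}

// True when the working set fits in the given amount of RAM.
function fitsInRam(stats, ramBytes) {
  return workingSetBytes(stats) <= ramBytes;
}

const stats = { hotDataBytes: 50 * GiB, hotIndexBytes: 8 * GiB };
console.log(workingSetBytes(stats) / GiB); // 58
console.log(fitsInRam(stats, 64 * GiB));   // true
console.log(fitsInRam(stats, 32 * GiB));   // false
```

With 32 GiB of RAM this example would call for one of the options listed above: more RAM, sharding, or a smaller working set.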
  • 16. Historic Information
      ● When our data grows continuously (historical data) and we embed it in our main collection, each document carries a lot of information that is rarely needed. We may still want to store it for analytics purposes, so we keep it outside the user document.
      ● This does not apply to information with limited growth (addresses, phone numbers, etc.).
  • 17. 1-1
      id    name    phone_number    zip_code
      1     Rick    555-111-1234    01209
      2     Mike    555-222-2345    30062

      Users
        { _id : 1, name : 'Rick', phone_number : '555-111-1234', zip_code : '01209' }
        { _id : 2, name : 'Mike', phone_number : '555-222-2345', zip_code : '30062' }
  • 18. 1-Few
      ● Referencing (or Normalization)
          ○ To show a user's information we need joins (or more than one query). This implies random seeks, a very slow operation.
      ● Embedding (or Denormalization)
          ○ We can avoid joins via denormalization. This implies redundant data and a more complex application in order to avoid inconsistencies.
          ○ Arrays help us avoid redundancy, and this solution gives us performance benefits.
          ○ With denormalization there are many possible data models, which makes it harder to define ours.
  • 19. 1-Few
      id    name    zip_code
      1     Rick    01209
      2     Mike    30062

      id    user_id    phone_number
      1     1          555-111-1234
      2     2          555-222-2345
      3     2          555-333-3456
  • 20. 1-Few (MongoDB-Embedding)
      { _id : 2, name : 'Mike', zip_code : '30062', phone_numbers : [ '555-222-2345', '555-333-3456' ] }
      ● Embedding is the approach that gives us the best performance and data consistency guarantees.
      ● Locality: MongoDB stores each document contiguously on disk, so putting all the data you need into one document means that you are never more than one seek away from everything you need.
      ● Atomicity and isolation: by embedding we get atomicity (transactionality), because the data is written as a single document.
  • 21. 1-Few (MongoDB-Referencing)
      { _id : 2, name : 'Mike', zip_code : '30062', phone_numbers : [ 2, 3 ] }
      { _id : 2, user_id : 2, phone_number : '555-222-2345' }
      { _id : 3, user_id : 2, phone_number : '555-333-3456' }
      ● By referencing we lose transactionality.
      ● We need:
          ○ More than one query, or
          ○ To use $lookup (joins)
      ● This approach is worse than embedding for performance.
      ● If we have to read our data frequently, it is better to embed it.
      ● In exchange, we gain flexibility to project only the desired fields.
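The extra work that referencing implies can be sketched with in-memory arrays standing in for the two collections. The function `findUserWithPhones` is a hypothetical name used for illustration; against a real server, the second step would be a second query or a `$lookup` stage.

```javascript
// Hypothetical in-memory stand-ins for the users and phone collections.
const users = [
  { _id: 2, name: "Mike", zip_code: "30062", phone_numbers: [2, 3] },
];
const phones = [
  { _id: 2, user_id: 2, phone_number: "555-222-2345" },
  { _id: 3, user_id: 2, phone_number: "555-333-3456" },
];

// Step 1 fetches the user; step 2 resolves the referenced phone documents.
function findUserWithPhones(userId) {
  const user = users.find((u) => u._id === userId);
  if (!user) return null;
  const resolved = phones
    .filter((p) => user.phone_numbers.includes(p._id))
    .map((p) => p.phone_number);
  return { ...user, phone_numbers: resolved };
}

console.log(findUserWithPhones(2));
```

With the embedded model from the previous slide, step 2 disappears because the numbers already live inside the user document.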
  • 22. N-1
      { _id : 2, name : 'Mike', zip_code : '30062', phone_numbers : [ 2, 3 ], address : '13, Rue del Percebe' }
      { _id : 1, name : 'Rick', zip_code : '01209', phone_numbers : [ 2, 3 ], address : '13, Rue del Percebe' }
      What if two people share an address?
      ● Does that mean that you have to store the address twice? Yes, you do have to store it twice, three times, etc.
      ● This is better than making unnecessary joins. The extra disk space you will need makes your queries faster.
  • 23. 1-Many
      Case: a blog with hundreds, or even thousands, of comments for a given post.
      Embedding carries significant penalties:
      ● The larger a document is, the more RAM it uses. The fewer documents fit in RAM, the more likely the server is to page fault to retrieve documents, and ultimately page faults lead to random disk I/O.
      ● Growing documents must eventually be copied to larger spaces.
      ● The document never stops growing.
      ● MongoDB documents have a hard size limit of 16MB.
      Referencing:
      ● The document will not grow, because we will have one document per comment in a second collection.
      ● Suited to one-to-many relationships with a very high or unpredictable number of items.
      Solution: we may only wish to display the first three comments when showing a blog entry; embedding more is simply wasting RAM.
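One way to read the solution above is to keep a short preview of comments inside the post and store every comment as its own document in a second collection. The sketch below works on plain objects; the field names `first_comments`, `comment_count` and `post_id`, and the helper `splitComments`, are assumptions made for illustration.

```javascript
// Number of comments kept inside the post document (three, as in the slide).
const PREVIEW_SIZE = 3;

// Split a post and its comments into a bounded post document
// plus one document per comment for a second collection.
function splitComments(post, comments, previewSize = PREVIEW_SIZE) {
  const postDoc = {
    ...post,
    comment_count: comments.length,
    first_comments: comments.slice(0, previewSize),
  };
  const commentDocs = comments.map((comment) => ({
    post_id: post._id,
    ...comment,
  }));
  return { postDoc, commentDocs };
}

const post = { _id: "post-1", title: "Schema design" };
const comments = ["a", "b", "c", "d", "e"].map((text, i) => ({ n: i + 1, text }));
const { postDoc, commentDocs } = splitComments(post, comments);
console.log(postDoc.first_comments.length, commentDocs.length); // 3 5
```

The post document stays bounded in size however many comments arrive, while the full list remains available through the second collection.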
  • 24. Many-Many
      ● We will embed a list of _id values in both directions
      ● We no longer have redundant information
      Product
        { _id : 'My product', category_ids : [ 'My category',... ] }
      Category
        { _id : 'My category', product_ids : [ 'My product', … ] }
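Keeping the two lists of `_id` values in step is the application's job. A minimal sketch on plain objects, assuming a hypothetical helper named `link` (in MongoDB itself, an update using `$addToSet` on each side plays a similar role):

```javascript
const product = { _id: "My product", category_ids: [] };
const category = { _id: "My category", product_ids: [] };

// Record the relationship in both directions, without duplicating ids.
function link(product, category) {
  if (!product.category_ids.includes(category._id)) {
    product.category_ids.push(category._id);
  }
  if (!category.product_ids.includes(product._id)) {
    category.product_ids.push(product._id);
  }
}

link(product, category);
link(product, category); // the second call changes nothing
console.log(product.category_ids, category.product_ids);
```

Only identifiers are copied, so renaming a category's other fields never requires touching the product documents.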
  • 25. Recap
      ● Avoid round trips to the database.
      ● User events should only generate a small number of queries.
      ● Use arrays when needed and, of course, when they won't grow indefinitely.
      ● Don't just migrate relational schemas.
      ● Data that is queried together should be in the same document whenever possible.
      ● Store the last login time, plus the shopping cart, in the user document since that is all we need for the landing page.
      ● Embedding for performance and atomicity (transactionality).
      ● Referencing for huge relationships.
      Ultimately, the decision depends on the access patterns of your application.
  • 27. Thank you!
      Thank you for your attention!