Perl Engineer & Evangelist, 10gen
Mike Friedman
#MongoDBdays
Schema Design
Four Real-World Use
Cases
Agenda
• Why is schema design important
• 4 Real World Schemas
– Inbox
– History
– IndexedAttributes
– Multiple Identities
• Conclusions
Why is Schema Design
important?
• Largest factor for a performant system
• Schema design with MongoDB is different
• RDBMS – "What answers do I have?"
• MongoDB – "What question will I have?"
#1 - Message Inbox
Let’s get
Social
Sending Messages
?
Design Goals
• Efficiently send new messages to recipients
• Efficiently read inbox
Reading my Inbox
?
3 Approaches (there are
more)
• Fan out on Read
• Fan out on Write
• Fan out on Write with Bucketing
// Shard on "from"
sh.shardCollection( "mongodbdays.inbox", { from: 1 } )
// Make sure we have an index to handle inbox reads
db.inbox.ensureIndex( { to: 1, sent: 1 } )
msg = {
from: "Joe",
to: [ "Bob", "Jane" ],
sent: new Date(),
message: "Hi!",
}
// Send a message
db.inbox.save( msg )
// Read my inbox
db.inbox.find( { to: "Joe" } ).sort( { sent: -1 } )
Fan out on read
Fan out on read – Send Message
[Diagram: Shard 1, Shard 2, Shard 3; label: Send Message]
Fan out on read – Inbox Read
[Diagram: Shard 1, Shard 2, Shard 3; label: Read Inbox]
Considerations
• One document per message sent
• Reading an inbox means finding all messages
with my own name in the recipient field
• Requires scatter-gather on sharded cluster
• Then a lot of random IO on a shard to find
everything
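The scatter-gather cost can be sketched without a database. This is a minimal in-memory illustration, assuming a toy `shardFor` hash and plain arrays standing in for shards; both are hypothetical and are not how MongoDB routes chunks.

```javascript
// Hypothetical stand-in for a cluster sharded on "from".
// shardFor() is an illustrative hash, not MongoDB's real routing.
function shardFor(key, shardCount) {
  let sum = 0;
  for (const ch of String(key)) sum += ch.charCodeAt(0);
  return sum % shardCount;
}

function makeCluster(shardCount) {
  return Array.from({ length: shardCount }, () => []);
}

// Fan out on read: one document per message, stored on the sender's shard.
function sendMessage(cluster, msg) {
  cluster[shardFor(msg.from, cluster.length)].push(msg);
}

// Reading an inbox has to visit every shard (scatter-gather), because the
// recipient is not part of the shard key.
function readInboxScatterGather(cluster, user) {
  let shardsVisited = 0;
  const found = [];
  for (const shard of cluster) {
    shardsVisited += 1;
    for (const m of shard) {
      if (m.to.includes(user)) found.push(m);
    }
  }
  found.sort((a, b) => b.sent - a.sent); // newest first
  return { messages: found, shardsVisited: shardsVisited };
}
```

The write touches one shard, while the read always touches all of them, whatever the size of the inbox.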
// Shard on "recipient" and "sent"
sh.shardCollection( "mongodbdays.inbox", { "recipient": 1, "sent": 1 } )
msg = {
from: "Joe",
to: [ "Bob", "Jane" ],
sent: new Date(),
message: "Hi!",
}
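// Send a message: insert a separate copy for each recipient
// (re-saving the same object would reuse its _id and overwrite the first copy)
msg.to.forEach( function( recipient ) {
  db.inbox.insert( { from: msg.from, to: msg.to, recipient: recipient,
                     sent: msg.sent, message: msg.message } )
} )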
// Read my inbox
db.inbox.find( { recipient: "Joe" } ).sort( { sent: -1 } )
Fan out on write
Fan out on write – Send Message
[Diagram: Shard 1, Shard 2, Shard 3; label: Send Message]
Fan out on write – Read Inbox
[Diagram: Shard 1, Shard 2, Shard 3; label: Read Inbox]
Considerations
• One document per recipient
• Reading my inbox is just finding all of the
messages with me as the recipient
• Can shard on recipient, so inbox reads hit one
shard
• But still lots of random IO on the shard
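A small sketch of the per-recipient copy, in plain JavaScript with no database. The helper names `fanOutOnWrite` and `readInboxByRecipient` are illustrative assumptions, not part of the talk.

```javascript
// Build one inbox document per recipient from a single outgoing message.
// Each copy is a fresh object, so no _id or recipient value is shared.
function fanOutOnWrite(msg) {
  return msg.to.map((recipient) => ({
    from: msg.from,
    to: msg.to.slice(),
    recipient: recipient,
    sent: msg.sent,
    message: msg.message,
  }));
}

// Reading is a single-key lookup on "recipient", newest first.
function readInboxByRecipient(docs, user) {
  return docs
    .filter((d) => d.recipient === user)
    .sort((a, b) => b.sent - a.sent);
}
```

A message to N recipients costs N writes, and in exchange each inbox read filters on a single field that can also serve as the shard key.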
// Shard on "owner / sequence"
sh.shardCollection( "mongodbdays.inbox", { owner: 1, sequence: 1 } )
sh.shardCollection( "mongodbdays.users", { user_name: 1 } )
msg = {
from: "Joe",
to: [ "Bob", "Jane" ],
sent: new Date(),
message: "Hi!",
}
Fan out on write with buckets
// Send a message
for( recipient in msg.to) {
count = db.users.findAndModify({
query: { user_name: msg.to[recipient] },
update: { "$inc": { "msg_count": 1 } },
upsert: true,
new: true }).msg_count;
sequence = Math.floor(count / 50);
db.inbox.update({
owner: msg.to[recipient], sequence: sequence },
{ $push: { "messages": msg } },
{ upsert: true } );
}
// Read my inbox
db.inbox.find( { owner: "Joe" } ).sort ( { sequence: -1 } ).limit( 2 )
Fan out on write with buckets
Fan out on write with buckets
• Each “inbox” document is an array of messages
• Append a message onto “inbox” of recipient
• Bucket inboxes so there aren't too many
messages per document
• Can shard on recipient, so inbox reads hit one
shard
• 1 or 2 documents to read the whole inbox
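The bucketing arithmetic can be checked in isolation. The sketch below emulates the counter increment, the upsert and the `$push` with in-memory Maps; `makeStore`, `deliver` and `latestBuckets` are hypothetical names used only for illustration.

```javascript
const BUCKET_SIZE = 50; // same constant the slide code divides by

// Bucket number for the Nth message (count is 1-based, as returned by the
// $inc counter). With floor(count / 50), bucket 0 ends up with 49 messages.
function bucketFor(count) {
  return Math.floor(count / BUCKET_SIZE);
}

// In-memory stand-in for the users and inbox collections (hypothetical).
function makeStore() {
  return { counters: new Map(), buckets: new Map() };
}

function deliver(store, owner, msg) {
  const count = (store.counters.get(owner) || 0) + 1; // emulates $inc + upsert
  store.counters.set(owner, count);
  const sequence = bucketFor(count);
  const key = owner + "/" + sequence;
  if (!store.buckets.has(key)) {
    store.buckets.set(key, { owner: owner, sequence: sequence, messages: [] });
  }
  store.buckets.get(key).messages.push(msg); // emulates $push
  return sequence;
}

// Latest buckets for an owner, highest sequence first (like sort + limit).
function latestBuckets(store, owner, limit) {
  return [...store.buckets.values()]
    .filter((b) => b.owner === owner)
    .sort((a, b) => b.sequence - a.sequence)
    .slice(0, limit);
}
```

Reading the two newest buckets returns between 51 and 100 messages once an inbox has filled its first two buckets, which is why a small `limit` is enough to render a typical inbox page.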
Fan out on write with buckets - Send
[Diagram: Shard 1, Shard 2, Shard 3; label: Send Message]
Fan out on write with buckets - Read
[Diagram: Shard 1, Shard 2, Shard 3; label: Read Inbox]
#2 – History
Design Goals
• Need to retain a limited amount of history e.g.
– Hours, Days, Weeks
– May be a legislative requirement (e.g. HIPAA, SOX, DPA)
• Need to query efficiently by
– match
– ranges
3 Approaches (there are
more)
• Bucket by Number of messages
• Fixed size Array
• Bucket by Date + TTL Collections
db.inbox.find()
{ owner: "Joe", sequence: 25,
messages: [
{ from: "Joe",
to: [ "Bob", "Jane" ],
sent: ISODate("2013-03-01T09:59:42.689Z"),
message: "Hi!"
},
…
] }
// Query with a date range
db.inbox.find ({owner: "friend1",
messages: {
$elemMatch: {sent:{$gte: ISODate("…") }}}})
// Remove elements older than a cutoff date
db.inbox.update({owner: "friend1" },
{ $pull: { messages: {
sent: { $lt: ISODate("…") } } } } )
Inbox – Bucket by #
messages
Considerations
• After documents shrink, space can be reclaimed
with
– db.runCommand ( { compact: '<collection>' } )
• Remove the document after the last element in
the array has been removed
– { "_id" : …, "messages" : [ ], "owner" : "friend1",
"sequence" : 0 }
msg = {
from: "Your Boss",
to: [ "Bob" ],
sent: new Date(),
message: "CALL ME NOW!"
}
// 2.4 Introduces $each, $sort and $slice for $push
db.messages.update(
{ _id: 1 },
{ $push: { messages: { $each: [ msg ],
$sort: { sent: 1 },
$slice: -50 }
}
}
)
Maintain the latest – Fixed
Size Array
Considerations
• Need to compute the size of the array based on
retention period
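One way to size the array, and to see what the `$push` modifiers do to it, is sketched below. The peak message rate is an assumption you would have to measure yourself, and `pushCapped` is only a plain JavaScript emulation of `$each`, `$sort: { sent: 1 }` and a negative `$slice`, not a MongoDB call.

```javascript
// Array cap needed to cover a retention window, assuming a known peak
// message rate (both inputs are assumptions supplied by the caller).
function arrayCap(maxMessagesPerDay, retentionDays) {
  return Math.ceil(maxMessagesPerDay * retentionDays);
}

// Emulates $push with $each, $sort: { sent: 1 } and $slice: -cap:
// append, order by sent ascending, keep only the last `cap` elements.
function pushCapped(messages, newMessages, cap) {
  if (cap <= 0) return []; // a zero-sized cap keeps nothing
  const merged = messages.concat(newMessages);
  merged.sort((a, b) => a.sent - b.sent);
  return merged.slice(-cap);
}
```

If the real message rate exceeds the assumed peak, the oldest messages fall off the array before the retention period ends, so the cap should be chosen with headroom.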
// messages: one doc per user per day
db.inbox.findOne()
{
_id: 1,
to: "Joe",
sequence: ISODate("2013-02-04T00:00:00.392Z"),
messages: [ ]
}
// Auto expires data after 31536000 seconds = 1 year
db.messages.ensureIndex( { sequence: 1 },
{ expireAfterSeconds: 31536000 } )
TTL Collections
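The two numbers this approach depends on, the per-day bucket key and the expiry window, can be computed as follows. This is a sketch in plain JavaScript; `dayBucket` and `isExpired` are hypothetical helpers, and the real TTL deletion is done by a background task in the server, so documents disappear some time after they become eligible rather than at that exact instant.

```javascript
// Seconds in a retention window, for use as expireAfterSeconds.
function retentionSeconds(days) {
  return days * 24 * 60 * 60;
}

// Truncate a timestamp to midnight UTC so that all of a user's messages
// for one day land in the same bucket document.
function dayBucket(date) {
  return new Date(Date.UTC(
    date.getUTCFullYear(), date.getUTCMonth(), date.getUTCDate()));
}

// True once a bucket's date is older than the retention window.
function isExpired(bucketDate, now, days) {
  return now.getTime() - bucketDate.getTime() > retentionSeconds(days) * 1000;
}
```

Because the TTL index is on the bucket date, a whole day of messages expires together; the granularity of the bucket sets the granularity of retention.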
#3 – Indexed Attributes
Design Goal
• Application needs to store a variable number of
attributes e.g.
– User defined Form
– Meta Data tags
• Queries needed
– Equality
– Range based
• Need to be efficient, regardless of the number of
attributes
2 Approaches (there are
more)
• Attributes as Embedded Document
• Attributes as Objects in an Array
db.files.insert( { _id: "local.0",
attr: { type: "text", size: 64,
created: ISODate("...") } } )
db.files.insert( { _id: "local.1",
attr: { type: "text", size: 128} } )
db.files.insert( { _id: "mongod",
attr: { type: "binary", size: 256,
created: ISODate("...") } } )
// Need to create an index for each item in the sub-document
db.files.ensureIndex( { "attr.type": 1 } )
db.files.find( { "attr.type": "text"} )
// Can perform range queries
db.files.ensureIndex( { "attr.size": 1 } )
db.files.find( { "attr.size": { $gt: 64, $lte: 16384 } } )
Attributes as a Sub-Document
Considerations
• Each attribute needs an Index
• Each time you extend, you add an index
• Lots and lots of indexes
db.files.insert( {_id: "local.0",
attr: [ { type: "text" },
{ size: 64 },
{ created: ISODate("...") } ] } )
db.files.insert( { _id: "local.1",
attr: [ { type: "text" },
{ size: 128 } ] } )
db.files.insert( { _id: "mongod",
attr: [ { type: "binary" },
{ size: 256 },
{ created: ISODate("...") } ] } )
db.files.ensureIndex( { attr: 1 } )
Attributes as Objects in Array
Considerations
• Only one index needed on attr
• Can support range queries, etc.
• Index can be used only once per query
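Moving from the sub-document form to the array form is mechanical. The sketch below shows the conversion and an in-memory emulation of an equality match on one array element; `toAttrArray` and `hasAttr` are hypothetical helper names, and the matching rule shown (an element must be exactly that single key and value) is an assumption that mirrors how the documents above are shaped.

```javascript
// Convert { type: "text", size: 64 } into [ { type: "text" }, { size: 64 } ].
function toAttrArray(attrs) {
  return Object.keys(attrs).map((name) => ({ [name]: attrs[name] }));
}

// Emulates an equality match such as find( { attr: { type: "text" } } ):
// a document matches if any array element is exactly that key and value.
function hasAttr(doc, name, value) {
  return doc.attr.some((a) => {
    const keys = Object.keys(a);
    return keys.length === 1 && keys[0] === name && a[name] === value;
  });
}
```

Adding a new attribute then means adding one more array element, with no new index to create.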
#4 – Multiple Identities
Design Goal
• Ability to look up by a number of different
identities e.g.
• Username
• Email address
• FB Handle
• LinkedIn URL
2 Approaches (there are
more)
• Identifiers in a single document
• Separate Identifiers from Content
db.users.findOne()
{ _id: "joe",
email: "joe@example.com",
fb: "joe.smith", // facebook
li: "joe.e.smith", // linkedin
other: {…}
}
// Shard collection by _id
sh.shardCollection( "mongodbdays.users", { _id: 1 } )
// Create indexes on each key
db.users.ensureIndex( { email: 1} )
db.users.ensureIndex( { fb: 1 } )
db.users.ensureIndex( { li: 1 } )
Single Document by User
Read by _id (shard key)
[Diagram: Shard 1, Shard 2, Shard 3]
find( { _id: "joe" } )
Read by email (non-shard key)
[Diagram: Shard 1, Shard 2, Shard 3]
find( { email: "joe@example.com" } )
Considerations
• Lookup by shard key is routed to 1 shard
• Lookup by any other identifier is scatter-gathered
across all shards
• On a sharded collection, secondary keys cannot
have a unique index
// Create unique index
db.identities.ensureIndex( { identifier : 1} , { unique: true} )
// Create one identity document for each identifier of a user
db.identities.save(
{ identifier : { hndl: "joe" }, user: "1200-42" } )
db.identities.save(
{ identifier : { email: "joe@abc.com" }, user: "1200-42" } )
db.identities.save(
{ identifier : { li: "joe.e.smith" }, user: "1200-42" } )
// Shard collection by identifier
sh.shardCollection( "mydb.identities", { identifier : 1 } )
// Create unique index
db.users.ensureIndex( { _id: 1} , { unique: true} )
// Shard collection by _id
sh.shardCollection( "mydb.users", { _id: 1 } )
Document per Identity
Read requires 2 reads
[Diagram: Shard 1, Shard 2, Shard 3]
db.identities.find( { identifier: { hndl: "joe" } } )
db.users.find( { _id: "1200-42" } )
Considerations
• Lookup to Identities is a routed query
• Lookup to Users is a routed query
• Unique indexes available
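The two-read lookup can be sketched with two Maps standing in for the `identities` and `users` collections. The helper names and the `kind:value` key format are assumptions for illustration; the duplicate check plays the role of the unique index.

```javascript
// Hypothetical in-memory stand-ins for the identities and users collections.
function makeDirectory() {
  return { identities: new Map(), users: new Map() };
}

// Key an identity by kind and value, e.g. "email:joe@abc.com".
function identityKey(kind, value) {
  return kind + ":" + value;
}

// Emulates the unique index: registering an identifier twice is an error.
function addIdentity(dir, kind, value, userId) {
  const key = identityKey(kind, value);
  if (dir.identities.has(key)) {
    throw new Error("duplicate identifier: " + key);
  }
  dir.identities.set(key, userId);
}

// Read 1: identifier -> user id. Read 2: user id -> user document.
function findUser(dir, kind, value) {
  const userId = dir.identities.get(identityKey(kind, value));
  if (userId === undefined) return null;
  return dir.users.get(userId) || null;
}
```

Each of the two reads is keyed on its collection's shard key, so both can be routed to a single shard at the cost of one extra round trip.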
Conclusion
Summary
• Multiple ways to model a domain problem
• Understand the key use cases of your app
• Balance between ease of query vs. ease of write
• Random IO should be avoided
Thank You

Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptx
 
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Exploring iOS App Development: Simplifying the Process
Exploring iOS App Development: Simplifying the ProcessExploring iOS App Development: Simplifying the Process
Exploring iOS App Development: Simplifying the Process
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service Consultant
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.js
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
Test Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and BackendTest Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and Backend
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
Clustering techniques data mining book ....
Clustering techniques data mining book ....Clustering techniques data mining book ....
Clustering techniques data mining book ....
 

MongoDB Schema Design: Four Real-World Examples

  • 1. Perl Engineer & Evangelist, 10gen Mike Friedman #MongoDBdays Schema Design Four Real-World Use Cases
  • 2. Agenda • Why is schema design important • 4 Real World Schemas – Inbox – History – Indexed Attributes – Multiple Identities • Conclusions
  • 3. Why is Schema Design important? • Largest factor for a performant system • Schema design with MongoDB is different • RDBMS – "What answers do I have?" • MongoDB – "What question will I have?"
  • 4. #1 - Message Inbox
  • 7. Design Goals • Efficiently send new messages to recipients • Efficiently read inbox
  • 9. 3 Approaches (there are more) • Fan out on Read • Fan out on Write • Fan out on Write with Bucketing
  • 10. Fan out on read
      // Shard on "from"
      sh.shardCollection( "mongodbdays.inbox", { from: 1 } )
      // Make sure we have an index to handle inbox reads
      db.inbox.ensureIndex( { to: 1, sent: 1 } )
      msg = {
          from: "Joe",
          to: [ "Bob", "Jane" ],
          sent: new Date(),
          message: "Hi!"
      }
      // Send a message
      db.inbox.save( msg )
      // Read my inbox
      db.inbox.find( { to: "Joe" } ).sort( { sent: -1 } )
  • 11. Fan out on read – Send Message Shard 1 Shard 2 Shard 3 Send Message
  • 12. Fan out on read – Inbox Read Shard 1 Shard 2 Shard 3 Read Inbox
  • 13. Considerations • One document per message sent • Reading an inbox means finding all messages with my own name in the recipient field • Requires scatter-gather on sharded cluster • Then a lot of random IO on a shard to find everything
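  • 14. Fan out on write
      // Shard on "recipient" and "sent"
      sh.shardCollection( "mongodbdays.inbox", { recipient: 1, sent: 1 } )
      msg = {
          from: "Joe",
          to: [ "Bob", "Jane" ],
          sent: new Date(),
          message: "Hi!"
      }
      // Send a message: save a fresh copy per recipient, so that each save
      // inserts a new document instead of reusing the _id the shell adds to msg
      for ( recipient in msg.to ) {
          copy = { from: msg.from, to: msg.to, sent: msg.sent, message: msg.message }
          copy.recipient = msg.to[recipient]
          db.inbox.save( copy )
      }
      // Read my inbox
      db.inbox.find( { recipient: "Joe" } ).sort( { sent: -1 } )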
  • 15. Fan out on write – Send Message Shard 1 Shard 2 Shard 3 Send Message
  • 16. Fan out on write– Read Inbox Shard 1 Shard 2 Shard 3 Read Inbox
  • 17. Considerations • One document per recipient • Reading my inbox is just finding all of the messages with me as the recipient • Can shard on recipient, so inbox reads hit one shard • But still lots of random IO on the shard
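The "one document per recipient" idea can be sketched without a database. This is a minimal in-memory illustration, not the deck's code: fanOutOnWrite is a hypothetical helper name, and the message fields are the ones used on the slides.

```javascript
// Sketch only: expands one message into one document per recipient.
// fanOutOnWrite is a hypothetical helper, not part of any MongoDB API.
function fanOutOnWrite(msg) {
  return msg.to.map(function (recipient) {
    // Copy the message so every recipient gets an independent document,
    // then add the field that the slides use as the shard key.
    return Object.assign({}, msg, { recipient: recipient });
  });
}

const msg = { from: "Joe", to: ["Bob", "Jane"], sent: new Date(), message: "Hi!" };
const docs = fanOutOnWrite(msg);

console.log(docs.length);                                         // 2
console.log(docs.map(function (d) { return d.recipient; }));      // [ 'Bob', 'Jane' ]
```

Each copy would then be saved separately, so the write cost grows with the number of recipients while an inbox read only has to match on recipient.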
  • 18. Fan out on write with buckets
      // Shard on "owner / sequence"
      sh.shardCollection( "mongodbdays.inbox", { owner: 1, sequence: 1 } )
      sh.shardCollection( "mongodbdays.users", { user_name: 1 } )
      msg = {
          from: "Joe",
          to: [ "Bob", "Jane" ],
          sent: new Date(),
          message: "Hi!"
      }
  • 19. Fan out on write with buckets
      // Send a message
      for ( recipient in msg.to ) {
          // Increment the recipient's message counter and read the new value
          count = db.users.findAndModify({
              query: { user_name: msg.to[recipient] },
              update: { "$inc": { "msg_count": 1 } },
              upsert: true,
              new: true }).msg_count;
          // 50 messages per bucket
          sequence = Math.floor(count / 50);
          db.inbox.update(
              { owner: msg.to[recipient], sequence: sequence },
              { $push: { "messages": msg } },
              { upsert: true } );
      }
      // Read my inbox
      db.inbox.find( { owner: "Joe" } ).sort( { sequence: -1 } ).limit( 2 )
  • 20. Fan out on write with buckets • Each “inbox” document is an array of messages • Append a message onto “inbox” of recipient • Bucket inboxes so there’s not too many messages per document • Can shard on recipient, so inbox reads hit one shard • 1 or 2 documents to read the whole inbox
  • 21. Fan out on write with buckets - Send Shard 1 Shard 2 Shard 3 Send Message
  • 22. Fan out on write with buckets - Read Shard 1 Shard 2 Shard 3 Read Inbox
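The bucketing arithmetic from slide 19 can be checked in isolation. The sketch below replaces the users and inbox collections with in-memory maps; bucketFor and deliver are hypothetical names, and only the bucket size of 50 and the Math.floor(count / 50) rule come from the slides.

```javascript
const BUCKET_SIZE = 50; // from the slides: Math.floor(count / 50)

// Hypothetical helper: maps a running per-user message count to a bucket number.
function bucketFor(msgCount) {
  return Math.floor(msgCount / BUCKET_SIZE);
}

// In-memory stand-ins for the "users" and "inbox" collections.
const counters = new Map(); // user_name -> msg_count
const buckets = new Map();  // "owner/sequence" -> array of messages

function deliver(owner, msg) {
  const count = (counters.get(owner) || 0) + 1;  // like $inc with new: true
  counters.set(owner, count);
  const key = owner + "/" + bucketFor(count);
  if (!buckets.has(key)) buckets.set(key, []);   // like upsert: true
  buckets.get(key).push(msg);                    // like $push
  return key;
}

for (let i = 1; i <= 120; i++) {
  deliver("Bob", { from: "Joe", message: "msg " + i });
}

console.log(buckets.get("Bob/0").length); // 49
console.log(buckets.get("Bob/1").length); // 50
console.log(buckets.get("Bob/2").length); // 21
```

Because the counter is incremented before the bucket number is computed, the first bucket ends up with 49 messages and every later full bucket with 50.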
  • 24. #2 – History
  • 25. Design Goals • Need to retain a limited amount of history e.g. – Hours, Days, Weeks – May be legislative requirement (e.g. HIPAA, SOX, DPA) • Need to query efficiently by – match – ranges
  • 26. 3 Approaches (there are more) • Bucket by Number of messages • Fixed size Array • Bucket by Date + TTL Collections
  • 27. Inbox – Bucket by # messages
      db.inbox.find()
      { owner: "Joe", sequence: 25,
        messages: [
          { from: "Joe", to: [ "Bob", "Jane" ], sent: ISODate("2013-03-01T09:59:42.689Z"), message: "Hi!" },
          …
        ] }
      // Query with a date range
      db.inbox.find( { owner: "friend1", messages: { $elemMatch: { sent: { $gte: ISODate("…") } } } } )
      // Remove elements based on a date
      db.inbox.update( { owner: "friend1" }, { $pull: { messages: { sent: { $gte: ISODate("…") } } } } )
  • 28. Considerations • Shrinking documents, space can be reclaimed with – db.runCommand( { compact: '<collection>' } ) • Removing the document after the last element in the array has been removed – { "_id" : …, "messages" : [ ], "owner" : "friend1", "sequence" : 0 }
  • 29. Maintain the latest – Fixed Size Array
      msg = { from: "Your Boss", to: [ "Bob" ], sent: new Date(), message: "CALL ME NOW!" }
      // 2.4 introduces $each, $sort and $slice for $push
      db.messages.update( { _id: 1 },
          { $push: { messages: { $each: [ msg ], $sort: { sent: 1 }, $slice: -50 } } } )
  • 30. Considerations • Need to compute the size of the array based on retention period
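To make the fixed-size array behaviour concrete, here is a small in-memory imitation of $push with $each, $sort: { sent: 1 } and a negative $slice. Both pushCapped and arraySizeFor are hypothetical helpers, and the sizing rule (expected messages per day times retention days) is an assumption about how one might pick the array size, not something the slides specify.

```javascript
// Hypothetical stand-in for $push with $each, $sort: { sent: 1 }, $slice: -limit.
function pushCapped(messages, newMessages, limit) {
  const combined = messages.concat(newMessages);
  // Sort ascending by the "sent" date, as $sort: { sent: 1 } would.
  combined.sort(function (a, b) { return a.sent - b.sent; });
  // Keep only the last `limit` elements, as a negative $slice would.
  return limit === 0 ? [] : combined.slice(-limit);
}

// Assumed sizing rule: expected messages per day times days of retention.
function arraySizeFor(messagesPerDay, retentionDays) {
  return Math.ceil(messagesPerDay * retentionDays);
}

let inbox = [];
const limit = arraySizeFor(5, 10); // 50, the value used on the slide
for (let i = 0; i < 60; i++) {
  const msg = { sent: new Date(2013, 0, 1, 0, i), message: "msg " + i };
  inbox = pushCapped(inbox, [msg], limit);
}

console.log(inbox.length);      // 50
console.log(inbox[0].message);  // msg 10
```

After 60 pushes with a cap of 50, the ten oldest messages have been dropped and the array starts at "msg 10".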
  • 31. TTL Collections
      // messages: one doc per user per day
      db.inbox.findOne()
      { _id: 1, to: "Joe", sequence: ISODate("2013-02-04T00:00:00.392Z"), messages: [ ] }
      // Auto expires data after 31536000 seconds = 1 year
      db.messages.ensureIndex( { sequence: 1 }, { expireAfterSeconds: 31536000 } )
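The expireAfterSeconds value on the slide is simply 365 days expressed in seconds. The sketch below derives it and approximates the expiry rule in memory; ttlSeconds and isExpired are hypothetical helpers, and the real TTL monitor runs periodically in the background, so actual deletion can lag behind the moment a document becomes eligible.

```javascript
const SECONDS_PER_DAY = 24 * 60 * 60;

// Hypothetical helper: converts a retention period in days to seconds.
function ttlSeconds(days) {
  return days * SECONDS_PER_DAY;
}

// Hypothetical helper: a document becomes eligible for removal once its
// indexed date plus expireAfterSeconds is no later than the current time.
function isExpired(indexedDate, expireAfterSeconds, now) {
  return indexedDate.getTime() + expireAfterSeconds * 1000 <= now.getTime();
}

const oneYear = ttlSeconds(365);
console.log(oneYear); // 31536000

const bucketDate = new Date("2013-02-04T00:00:00Z");
console.log(isExpired(bucketDate, oneYear, new Date("2014-02-03T00:00:00Z"))); // false
console.log(isExpired(bucketDate, oneYear, new Date("2014-02-04T00:00:00Z"))); // true
```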
  • 32. #3 – Indexed Attributes
  • 33. Design Goal • Application needs to store a variable number of attributes e.g. – User defined Form – Metadata tags • Queries needed – Equality – Range based • Need to be efficient, regardless of the number of attributes
  • 34. 2 Approaches (there are more) • Attributes as Embedded Document • Attributes as Objects in an Array
  • 35. Attributes as a Sub-Document
      db.files.insert( { _id: "local.0", attr: { type: "text", size: 64, created: ISODate("...") } } )
      db.files.insert( { _id: "local.1", attr: { type: "text", size: 128 } } )
      db.files.insert( { _id: "mongod", attr: { type: "binary", size: 256, created: ISODate("...") } } )
      // Need to create an index for each item in the sub-document
      db.files.ensureIndex( { "attr.type": 1 } )
      db.files.find( { "attr.type": "text" } )
      // Can perform range queries
      db.files.ensureIndex( { "attr.size": 1 } )
      db.files.find( { "attr.size": { $gt: 64, $lte: 16384 } } )
  • 36. Considerations • Each attribute needs an Index • Each time you extend, you add an index • Lots and lots of indexes
  • 37. Attributes as Objects in Array
      db.files.insert( { _id: "local.0", attr: [ { type: "text" }, { size: 64 }, { created: ISODate("...") } ] } )
      db.files.insert( { _id: "local.1", attr: [ { type: "text" }, { size: 128 } ] } )
      db.files.insert( { _id: "mongod", attr: [ { type: "binary" }, { size: 256 }, { created: ISODate("...") } ] } )
      db.files.ensureIndex( { attr: 1 } )
  • 38. Considerations • Only one index needed on attr • Can support range queries, etc. • Index can be used only once per query
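The difference between the two attribute layouts is easiest to see as a data transformation. The sketch below converts a sub-document into the array-of-single-key-objects shape and runs an equality match in memory; toAttrArray and hasAttr are hypothetical helpers, and the in-memory filter only stands in for a query such as find( { attr: { size: 64 } } ) against the single index on attr.

```javascript
// Hypothetical helper: { type: "text", size: 64 } -> [ { type: "text" }, { size: 64 } ]
function toAttrArray(attrs) {
  return Object.keys(attrs).map(function (key) {
    const element = {};
    element[key] = attrs[key];
    return element;
  });
}

// Hypothetical helper: in-memory equality match on one array element.
function hasAttr(doc, key, value) {
  return doc.attr.some(function (element) {
    return Object.keys(element).length === 1 && element[key] === value;
  });
}

const files = [
  { _id: "local.0", attr: toAttrArray({ type: "text", size: 64 }) },
  { _id: "local.1", attr: toAttrArray({ type: "text", size: 128 }) },
  { _id: "mongod",  attr: toAttrArray({ type: "binary", size: 256 }) }
];

const textFiles = files
  .filter(function (doc) { return hasAttr(doc, "type", "text"); })
  .map(function (doc) { return doc._id; });

console.log(JSON.stringify(files[0].attr)); // [{"type":"text"},{"size":64}]
console.log(textFiles);                     // [ 'local.0', 'local.1' ]
```

Adding a new attribute in this layout only adds another array element, which is why a single index on attr can cover all of them.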
  • 39. #4 – Multiple Identities
  • 40. Design Goal • Ability to look up by a number of different identities e.g. • Username • Email address • FB Handle • LinkedIn URL
  • 41. 2 Approaches (there are more) • Identifiers in a single document • Separate Identifiers from Content
  • 42. Single Document by User
      db.users.findOne()
      { _id: "joe",
        email: "joe@example.com",
        fb: "joe.smith",    // facebook
        li: "joe.e.smith",  // linkedin
        other: {…} }
      // Shard collection by _id
      sh.shardCollection( "mongodbdays.users", { _id: 1 } )
      // Create indexes on each key
      db.users.ensureIndex( { email: 1 } )
      db.users.ensureIndex( { fb: 1 } )
      db.users.ensureIndex( { li: 1 } )
  • 43. Read by _id (shard key) Shard 1 Shard 2 Shard 3 find( { _id: "joe"} )
  • 44. Read by email (non-shard key) Shard 1 Shard 2 Shard 3 find( { email: "joe@example.com" } )
  • 45. Considerations • Lookup by shard key is routed to 1 shard • Lookup by other identifier is scatter gathered across all shards • Secondary keys cannot have a unique index
  • 46. Document per Identity
      // Create unique index
      db.identities.ensureIndex( { identifier: 1 }, { unique: true } )
      // Create a document for each identity of a user
      db.identities.save( { identifier: { hndl: "joe" }, user: "1200-42" } )
      db.identities.save( { identifier: { email: "joe@abc.com" }, user: "1200-42" } )
      db.identities.save( { identifier: { li: "joe.e.smith" }, user: "1200-42" } )
      // Shard collection by identifier
      sh.shardCollection( "mydb.identities", { identifier: 1 } )
      // Create unique index
      db.users.ensureIndex( { _id: 1 }, { unique: true } )
      // Shard collection by _id
      sh.shardCollection( "mydb.users", { _id: 1 } )
  • 47. Read requires 2 reads Shard 1 Shard 2 Shard 3 db.identities.find({"identifier" : { "hndl" : "joe" }}) db.users.find( { _id: "1200-42"} )
  • 48. Considerations • Lookup to Identities is a routed query • Lookup to Users is a routed query • Unique indexes available
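The two-read lookup can be sketched with two in-memory maps standing in for the identities and users collections. The helper names addIdentity and findUser and the name field on the user document are hypothetical; the identifier kinds and the user id "1200-42" are taken from slide 46.

```javascript
// In-memory stand-ins for the two collections.
const identities = new Map(); // "kind:value" -> user id
const users = new Map();      // user id -> user document

// Hypothetical helper: rejects duplicates, imitating the unique index.
function addIdentity(kind, value, userId) {
  const key = kind + ":" + value;
  if (identities.has(key)) {
    throw new Error("duplicate identifier: " + key);
  }
  identities.set(key, userId);
}

// Hypothetical helper: read 1 resolves the identifier, read 2 fetches the user.
function findUser(kind, value) {
  const userId = identities.get(kind + ":" + value);
  if (userId === undefined) return null;
  return users.get(userId) || null;
}

users.set("1200-42", { _id: "1200-42", name: "Joe" }); // "name" is illustrative
addIdentity("hndl", "joe", "1200-42");
addIdentity("email", "joe@abc.com", "1200-42");
addIdentity("li", "joe.e.smith", "1200-42");

console.log(findUser("email", "joe@abc.com")._id); // 1200-42
console.log(findUser("hndl", "nobody"));           // null
```

Both reads are keyed lookups, which mirrors why each query in this design can be routed to a single shard.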
  • 50. Summary • Multiple ways to model a domain problem • Understand the key use cases of your app • Balance between ease of query vs. ease of write • Random IO should be avoided
  • 51. Perl Engineer & Evangelist, 10gen Mike Friedman #MongoDBdays Thank You