5. MADRID · NOV 18-19 · 2016
#Data Model
5
● All the information in one document:
○ No joins
○ Single read
○ Higher performance
○ ACID compliance at the document level
● MMAPv1: avoid unbounded document growth
● WiredTiger rewrites the document for each update
6. MADRID · NOV 18-19 · 2016
#Application Patterns
6
● Modify only those fields that have changed
● Project only the fields you need
● Avoid negation in queries
● Use Covered Queries when possible
● Avoid scatter-gather queries
● Write Concern:
○ w : 0 → faster than w : all
7. MADRID · NOV 18-19 · 2016
#.explain()
7
● Use .explain() to get the time a query needs
● Verbose options:
○ queryPlanner (default)
○ executionStats (stats for the winner plan)
○ allPlansExecution (stats for all candidate
plans)
8. MADRID · NOV 18-19 · 2016
#.explain()
8
● In-memory sorting data is limited to 32MB (you
need an index)
● My index selectivity is the relationship between
nReturned and totalKeysExamined
10. MADRID · NOV 18-19 · 2016
#Indexes
10
● Improve the performance of
read-intensive apps
● Overhead:
○ Write operations
○ Disk usage
○ Memory consumption
11. MADRID · NOV 18-19 · 2016
● Working Set is made of:
○ Indexes
○ Your most frequently accessed data
#Indexes-Working Set
11
12. MADRID · NOV 18-19 · 2016
● If your Working Set does not fit into RAM:
○ Increment your RAM
○ Shard the database
○ Avoid scan all documents in the collection
■ Page faults
■ You will fill in your Working Set with
infrequent accessed data
#Indexes-Working Set
12
13. MADRID · NOV 18-19 · 2016
#Indexes-Before creating indexes
13
● Understand your data
● How your queries will be
● Create as few indexes as possible
● Put your highly selective fields first in the index
● Check your current indexes before creating new
ones
14. MADRID · NOV 18-19 · 2016
#Indexes-Advices
14
● Do not index a little cardinality field
● Check your index performance
● Monitor your system to check if:
○ Your strategy is still up to date
○ You need new indexes
○ You can remove unnecessary indexes
15. MADRID · NOV 18-19 · 2016
#Indexes-Compound
15
● A compound index has better performance than:
○ Single indexes
○ Intersection
● Remove indexes that are prefixes of other
● The value of the combined fields must have high
cardinality
17. MADRID · NOV 18-19 · 2016
> use test
> db.example.createIndex( { a : 1, b : 1 } );
> db.example.explain("executionStats").find( { a : 56, b : 87 } );
"nReturned" : 100,
"executionTimeMillis" : 2,
"totalKeysExamined" : 100,
"totalDocsExamined" : 100,
> db.example.dropIndex( { a : 1, b : 1 } );
> db.example.createIndex( { a : 1 } );
> db.example.createIndex( { b : 1 } );
> db.example.explain("executionStats").find( { a : 56, b : 87 } );
"nReturned" : 100,
"executionTimeMillis" : 36,
"totalKeysExamined" : 10000,
"totalDocsExamined" : 10000
#Indexes-Compound example
17
18. MADRID · NOV 18-19 · 2016
#Indexes-Covered Queries
18
● They return results from the indexes
● A covered query does not read from the
document level:
○ ndocsExamined : 0
● In the index must be all the fields:
○ Included in the query
○ Returned by the query
19. MADRID · NOV 18-19 · 2016
#Indexes-Covered Queries example
19
> use test
> db.example.createIndex( { a : 1, b : 1 } );
> db.example.explain("executionStats").find( { a : 56, b : 87 } );
"nReturned" : 100,
"executionTimeMillis" : 3,
"totalKeysExamined" : 100,
"totalDocsExamined" : 100
> db.example.explain("executionStats").find( { a : 56, b : 87 }, {
_id : 0, a : 1, b : 1 } );
"nReturned" : 100,
"executionTimeMillis" : 3,
"totalKeysExamined" : 100,
"totalDocsExamined" : 0
20. MADRID · NOV 18-19 · 2016
#Indexes-TTL
20
● TTL (TimeToLive) Indexes
○ Allow the user to specify a period of time after
which the data will be automatically deleted
21. MADRID · NOV 18-19 · 2016
#Indexes-TTL example
21
> use test
> db.timetolive.createIndex( { date : 1 } , { expireAfterSeconds: 20 } );
> db.timetolive.insert( { date : new Date() } );
> db.timetolive.find().pretty()
{
"_id" : ObjectId("581fb43f5626d57bcf66fe73"),
"date" : ISODate("2016-11-06T22:52:47.373Z")
}
> db.timetolive.count()
0
>
The background task that removes expired documents
runs every 60 seconds.
22. MADRID · NOV 18-19 · 2016
#Indexes-Partial
22
● Partial Indexes
○ Allow to index a subset of your data by
specifying a filtering expression during the
index creation
○ Better query performance while reducing
system overhead
24. MADRID · NOV 18-19 · 2016
#Indexes-Sparse
24
● Sparse Indexes
○ Only contain entries for documents that
contain a specific field
○ Smaller and more efficient indexes
25. MADRID · NOV 18-19 · 2016
> use test
> db.test.createIndex( { b : 1 } , { sparse : true } );
> db.test.stats()
16384
>
> db.test.createIndex( { b : 1 } );
> db.test.stats()
2580480
>
#Indexes-Sparse example
25
26. MADRID · NOV 18-19 · 2016
#Storage Engines-WiredTiger
26
● Reduced storage space and higher I/O scalability
● It compresses indexes into RAM, freeing up
Working Set for documents
● Performance (granular concurrency control)
● Native compression:
○ per collection
○ per index
27. MADRID · NOV 18-19 · 2016
#Storage Engines-WiredTiger
27
● Types of compression:
○ Snappy (default): Data and Journal
○ zlib: maximum compression
○ Prefix: for indexes
28. MADRID · NOV 18-19 · 2016
#Storage Engines: In-Memory
28
● Extreme performance coupled with real time
analytics for the most demanding,
latency-sensitive applications
● Delivers the extreme throughput and predictable
latency
● Eliminates the need for separate caching layers
29. MADRID · NOV 18-19 · 2016
#Query Optimizer
29
● It selects the best index to use
● It periodically runs alternate query plans and
selects the index with the best response time for
each query type
● The results of these tests are stored as a cached
query plan and are updated periodically
30. MADRID · NOV 18-19 · 2016
#Aggregation Framework
30
31. MADRID · NOV 18-19 · 2016
#Aggregation Framework
31
● Use $indexStats to determine how frequently
your index is used
● Each shard has its own statistics
● Stats are reset when you restart your mongod
service
>db.collection.aggregate([
{ $indexStats : {} }
])
34. MADRID · NOV 18-19 · 2016
#Tools-Compass
34
● MongoDB Compass
○ Analyze the size and usage of the indexes
○ It offers visual plan explanations
○ Index creation
○ Real time stats
○ Identify and fix performance and data problems
35. MADRID · NOV 18-19 · 2016
#Tools-Ops Manager
35
● The Visual Query Profiler allows to analyze a
query performance and recommends on the
addition of indexes
● It can be used to visualize output from the
profiler when identifying slow queries
36. MADRID · NOV 18-19 · 2016
#Tools-Cloud Manager & Atlas
36
● Use the Query Targeting chart
37. MADRID · NOV 18-19 · 2016
#Tools-Profiling
37
● Guess the slowest queries for improving their
indexes
● Log information for all events or only those whose
duration exceeds a configurable threshold (100ms
default)
39. MADRID · NOV 18-19 · 2016
#Tuning-SSD
39
● Most disk access patterns are not sequential
● Data files benefit from SSDs
● MongoDB journal files have high sequential write
patterns (good for spinning disks)
● RAM/SSDs better than spinning drives
● Use high performance storage
● Avoid networked storage
40. MADRID · NOV 18-19 · 2016
#Tuning-CPU
40
● MongoDB has better performance on faster CPUs
● The MongoDB WiredTiger Storage Engine is better
able to saturate multi-core processor resources than
the MMAPv1 Storage Engine
41. MADRID · NOV 18-19 · 2016
#Tuning-Thread
41
● The WiredTiger Storage Engine is multi-threaded
and can take advantage of many CPU Cores
42. MADRID · NOV 18-19 · 2016
#Sharding
42
● The balancer has been moved from the mongos to
the primary member of the config server
● Sharding improves the performance of reads
● Benefits (even with a ‘poor’ shard key and
scatter-gather):
○ A larger amount of memory to store indexes
○ Parallelize queries across shards reducing latency
43. MADRID · NOV 18-19 · 2016
#Storage & Compression types
43
● Balance query latency with storage density and
cost by assigning data sets based on a value
such as a timestamp to specific storage devices
44. MADRID · NOV 18-19 · 2016
#Storage & Compression types
44
● Recent, frequently accessed data can be assigned to
high performance SSDs with Snappy compression
● Older, less frequently accessed data is tagged to
lower-throughput hard disk drives compressed with
zlib to attain maximum storage density with a lower
cost-per-bit
● As data ages, MongoDB will automatically migrates it
between shards
45. MADRID · NOV 18-19 · 2016
#Questions?
45
Questions?
46. MongoDB:
Improve the performance of your application
by @juanroycouto
MADRID · NOV 18-19 · 2016
46
Thank you for your attention!