Performance Tuning Cheat Sheet for MongoDB

September 2018
Bartłomiej Oleś
Presenter
bart@severalnines.com

Copyright 2017 Severalnines AB
Supported Databases

Free to download
Initial 30 days Enterprise trial
Converts into free Community Edition
Enterprise / paid versions available

Automation & Management
Deployment (Free Community)
● Deploy a Cluster in Minutes
○ On-Prem
○ Cloud (AWS/Azure/Google) - paid
Monitoring (Free Community)
● Systems View with 1 sec Resolution
● DB / OS stats & Performance Advisors
● Configurable Dashboards
● Query Analyzer
● Real-time / historical
Management (Paid Features)
● Backup Management
● Upgrades & Patching
● Security & Compliance
● Operational Reports
● Automatic Recovery & Repair
● Performance Management
● Automatic Performance Advisors

Agenda
● Why a performance cheat sheet?
● Free monitoring for performance
● Logging database operations
● Capturing queries - database profiler
● Checking operating system parameters
● Working with the Explain Plan
● Measuring replication lag performance
● Live demo
● Other

Why a performance cheat sheet?

Performance complexity
● Services running on multiple hosts
○ Replication
○ Sharding
○ Clustering
● Multiple Data Centers
○ Cloud and/or On-prem
○ Disaster Recovery
● Load balancing and Single point of contact IP
○ For workload management, HA, query caching...
○ E.g., HAProxy, KeepAlived/VIP, ProxySQL, MaxScale

Why we need a database monitoring system
● Data is a key asset of the organisation
● Databases are important as they manage the source of truth
● A database is complex: I/O, transaction engine, query optimizer, caches, locks, versioning, ...
● Very dependent on OS, IO subsystems, network
● Distribution across multiple instances makes it even more complex
● Good database monitoring helps make sense of all that

MongoDB
● Similar to most other databases
● Understand the utilization of the hardware
● Capacity planning
● Determine the type of an issue
● I/O related?
● CPU related?
● Network related?

● CPU utilization (should I add more nodes to the cluster?)
● Network utilization (am I running out of bandwidth?)
● Ping (how badly does latency affect my MongoDB cluster?)
● Disk throughput and IOPS (am I within my hardware limits?)
● Disk space (do I have to plan for larger disks?)
● Memory utilization (do I suffer from a memory leak?)

● Storage engine specific
● MMAP
● WiredTiger
● MongoRocks
● Insight in how the engine performs
● Internal congestion

● CPU, IO or lock related
● Outcome: similar to Galera
● Lagging behind could cause a full sync

Performance monitoring vs metrics
● Similar to most other databases
● Throughput of the cluster
● Relate throughput to cluster performance
● Determine the type of an issue
● Request spikes?
● Write amplification related?
● Queueing?

Monitoring vs Trending
● Monitoring system (e.g. Nagios)
● Checks if services are healthy
● Sends pages
● Trending system (e.g. Cacti, Graphite)
● Collects metrics
● Generates graphs
● Availability
● Do more than just opening a connection
● Measure true status of nodes and cluster
● Test read/write
● Open essential databases and collections
● Keep an eye on the replication lag
● Increase oplog size?
● Check the full topology

Monitoring vs Trending
● Trending
● Plot trends of key (performance) metrics
● Find problems before they arise
● Pre-emptive problem management
● Trending tools
● Granularity of sampling
● More datapoints = better
● Periodical (daily/weekly) healthchecks
● Insight into all aspects of the database operations
● Post mortem and proactive monitoring
● Capacity planning

Performance monitoring

MongoDB monitoring
● Enable Free Monitoring
db.enableFreeMonitoring()
● Disable Free Monitoring
db.disableFreeMonitoring()

Logging database operations
Operation Execution Times (READ, WRITES, COMMANDS)
Disk utilization (MAX UTIL % OF ANY DRIVE, AVERAGE UTIL % OF ALL DRIVES)
Memory (RESIDENT, VIRTUAL, MAPPED)
Network - Input / Output (BYTES IN, BYTES OUT)
Network - Num Requests (NUM REQUESTS)
Opcounters (INSERT, QUERY, UPDATE, DELETE, GETMORE, COMMAND)
Opcounters - Replication (INSERT, QUERY, UPDATE, DELETE, GETMORE,
COMMAND)
Query Targeting (SCANNED / RETURNED, SCANNED OBJECTS / RETURNED)
Queues (READERS, WRITERS, TOTAL)
System Cpu Usage (USER, NICE, KERNEL, IOWAIT, IRQ, SOFT IRQ, STEAL,
GUEST)

db.getFreeMonitoringStatus()
{ resource: { cluster : true }, actions: [ "setFreeMonitoring", "checkFreeMonitoringStatus" ] }
db.serverStatus()

{
  "state" : "enabled",
  "message" : "To see your monitoring data, navigate to the unique URL below. Anyone you share the URL with will also be able to view this page. You can disable monitoring at any time by running db.disableFreeMonitoring().",
  "url" : "https://cloud.mongodb.com/freemonitoring/cluster/XEARVO6RB2OTXEAHKHLKJ5V6KV3FAM6B",
  "userReminder" : "",
  "ok" : 1
}

Database profiler
db.getProfilingLevel()
To capture all queries set:
db.setProfilingLevel(2)
profile = <0/1/2>
slowms = <value>

Logging database operations
db.getLogComponents()
Each log message is tagged with a component, which gives a functional categorization of the messages. You can set a different log verbosity for each component. The current list of components is:
ACCESS, COMMAND, CONTROL, FTDC, GEO, INDEX, NETWORK, QUERY, REPL, REPL_HB, ROLLBACK, SHARDING, STORAGE, RECOVERY, JOURNAL, WRITE.

Examples
To list the 10 most recent operations:
db.system.profile.find().sort( { ts : -1 } ).limit(10).pretty()
To list all operations except commands:
db.system.profile.find( { op: { $ne : 'command' } } ).pretty()
To list all operations on the mydb.test namespace:
db.system.profile.find( { ns : 'mydb.test' } ).pretty()
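Once the profiler has collected some operations, it can help to aggregate them per namespace instead of reading documents one by one. The following is a minimal sketch, not part of the original deck: `summarizeProfile` is a hypothetical helper, and it assumes each entry carries the `ns` and `millis` fields that profiler documents contain. The sample data is made up for illustration.

```javascript
// Group profiler documents by namespace and report count, average and
// maximum duration. `entries` is assumed to be an array of documents
// shaped like those stored in system.profile (fields used: ns, millis).
function summarizeProfile(entries) {
  const byNs = new Map();
  for (const entry of entries) {
    const stats = byNs.get(entry.ns) ||
      { ns: entry.ns, count: 0, totalMillis: 0, maxMillis: 0 };
    stats.count += 1;
    stats.totalMillis += entry.millis;
    stats.maxMillis = Math.max(stats.maxMillis, entry.millis);
    byNs.set(entry.ns, stats);
  }
  // Slowest namespaces (by average duration) first.
  return Array.from(byNs.values())
    .map(s => ({
      ns: s.ns,
      count: s.count,
      avgMillis: s.totalMillis / s.count,
      maxMillis: s.maxMillis,
    }))
    .sort((a, b) => b.avgMillis - a.avgMillis);
}

// Hypothetical sample entries, for illustration only.
const sampleProfile = [
  { ns: "mydb.test", op: "query", millis: 120 },
  { ns: "mydb.test", op: "update", millis: 80 },
  { ns: "mydb.other", op: "query", millis: 5 },
];
console.log(summarizeProfile(sampleProfile));
```

In the mongo shell the same function could be fed with `db.system.profile.find().toArray()`, assuming the shell version in use supports ES6 syntax.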

MongoDB logging
/var/log/mongodb/mongod.log
You can find the MongoDB configuration file at /etc/mongod.conf.
Here is sample log output:
2018-07-01T23:09:27.101+0000 I ASIO [NetworkInterfaceASIO-Replication-0] Connecting to node1:27017
2018-07-01T23:09:27.102+0000 I ASIO [NetworkInterfaceASIO-Replication-0] Failed to connect to node1:27017 - HostUnreachable: Connection refused
2018-07-01T23:09:27.102+0000 I ASIO [NetworkInterfaceASIO-Replication-0] Dropping all pooled connections to node1:27017 due to failed operation on a connection
2018-07-01T23:09:27.102+0000 I REPL_HB [replexec-2] Error in heartbeat (requestId: 21589) to node1:27017, response status: HostUnreachable: Connection refused
2018-07-01T23:09:27.102+0000 I ASIO [NetworkInterfaceASIO-Replication-0] Connecting to node1:27017
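The log lines above follow a "timestamp, severity, component, [context], message" layout, which makes them easy to split for filtering or counting. The sketch below is an illustration only: `parseLogLine` and `countByComponent` are hypothetical helpers, and the regular expression assumes the plain-text log format shown in the sample (newer MongoDB versions log structured JSON instead).

```javascript
// Matches: <timestamp> <severity> <component> [<context>] <message>
const LOG_LINE = /^(\S+)\s+([A-Z]\d?)\s+(\S+)\s+\[([^\]]+)\]\s+(.*)$/;

// Split one log line into its parts; returns null if the line
// does not follow the expected layout.
function parseLogLine(line) {
  const m = LOG_LINE.exec(line);
  if (!m) return null;
  return {
    timestamp: m[1],
    severity: m[2],
    component: m[3],
    context: m[4],
    message: m[5],
  };
}

// Count how many parsable lines each component produced.
function countByComponent(lines) {
  const counts = {};
  for (const line of lines) {
    const parsed = parseLogLine(line);
    if (parsed) counts[parsed.component] = (counts[parsed.component] || 0) + 1;
  }
  return counts;
}

const heartbeatLine =
  "2018-07-01T23:09:27.102+0000 I REPL_HB [replexec-2] Error in heartbeat " +
  "(requestId: 21589) to node1:27017, response status: HostUnreachable: Connection refused";
console.log(parseLogLine(heartbeatLine));
```

Counting by component is a quick way to see whether a burst of messages comes from replication heartbeats, networking or queries.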

MongoDB logging
db.adminCommand({ logRotate : 1 });
db.setLogLevel(2, "query")

MongoDB Oplog
● Similar to MySQL binary logs
● Oplog: a special collection
● Limited size
● Eviction of transactions (FIFO)
● Replication window
● Time between first and last transaction in the oplog
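The replication window can be derived from the time span the oplog covers. The sketch below assumes a document shaped like the output of `db.getReplicationInfo()`, whose `timeDiff` field holds that span in seconds. `oplogWindow` is a hypothetical helper, and the 24-hour minimum is an arbitrary example threshold, not a recommendation from the deck.

```javascript
// Convert the oplog time span into hours and flag it when it is shorter
// than the minimum window we want to be able to survive.
// `info` is assumed to look like the result of db.getReplicationInfo().
function oplogWindow(info, minHours) {
  const hours = info.timeDiff / 3600;
  return {
    hours: Math.round(hours * 100) / 100,
    belowMinimum: hours < minHours,
  };
}

// timeDiff of 4787 seconds, checked against a hypothetical 24 hour minimum.
console.log(oplogWindow({ timeDiff: 4787 }, 24));
// → { hours: 1.33, belowMinimum: true }
```

A window that is shorter than the longest expected outage of a secondary suggests the oplog size should be increased.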

MongoDB Connections
● Similar to MySQL when handling connections
● Client drivers may support connection pooling
● Multiple non-blocking queries can use the same
connection
● Spawns new connections when low on threshold
● Increase of connections
● Locking issues
● Application request bursts

Checking operating system parameters -
memory limits

network
net.core.somaxconn (increase the value)
net.ipv4.tcp_max_syn_backlog (increase the value)
net.ipv4.tcp_fin_timeout (reduce the value)
net.ipv4.tcp_keepalive_intvl (reduce the value)
net.ipv4.tcp_keepalive_time (reduce the value)

Network stack
net.core.somaxconn = 4096
net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_keepalive_intvl = 30
net.ipv4.tcp_keepalive_time = 120
net.ipv4.tcp_max_syn_backlog = 4096
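To check a host against the values above, the output of `sysctl -a` can be parsed and compared. This is a minimal sketch under stated assumptions: `parseSysctl`, `checkSysctl` and `NETWORK_TARGETS` are hypothetical names, the targets simply encode the "increase" and "reduce" directions from the slides, and the sample input is made up.

```javascript
// Targets taken from the values above: "increase" settings become lower
// bounds (atLeast), "reduce" settings become upper bounds (atMost).
const NETWORK_TARGETS = {
  "net.core.somaxconn": { atLeast: 4096 },
  "net.ipv4.tcp_fin_timeout": { atMost: 30 },
  "net.ipv4.tcp_keepalive_intvl": { atMost: 30 },
  "net.ipv4.tcp_keepalive_time": { atMost: 120 },
  "net.ipv4.tcp_max_syn_backlog": { atLeast: 4096 },
};

// Parse "key = value" lines as printed by `sysctl -a`.
function parseSysctl(text) {
  const values = {};
  for (const line of text.split("\n")) {
    const idx = line.indexOf("=");
    if (idx === -1) continue;
    values[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
  }
  return values;
}

// Return the settings that are missing or outside their target range.
function checkSysctl(text, targets) {
  const current = parseSysctl(text);
  const findings = [];
  for (const [key, target] of Object.entries(targets)) {
    const actual = key in current ? Number(current[key]) : null;
    const ok = actual !== null &&
      (target.atLeast === undefined || actual >= target.atLeast) &&
      (target.atMost === undefined || actual <= target.atMost);
    if (!ok) findings.push({ key, actual });
  }
  return findings;
}

// Hypothetical sysctl output, for illustration only.
const sampleSysctl = [
  "net.core.somaxconn = 128",
  "net.ipv4.tcp_fin_timeout = 60",
  "net.ipv4.tcp_keepalive_intvl = 30",
  "net.ipv4.tcp_keepalive_time = 7200",
  "net.ipv4.tcp_max_syn_backlog = 4096",
].join("\n");
console.log(checkSysctl(sampleSysctl, NETWORK_TARGETS));
```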

memory limits
$ sysctl -a | egrep "vm.dirty.*_ratio"
vm.dirty_background_ratio = 10
vm.dirty_ratio = 20

Security
sudo setenforce Enforcing
sudo getenforce

Swappiness
vi /etc/sysctl.conf
vm.swappiness = 1

Transparent huge pages
cat /sys/kernel/mm/transparent_hugepage/enabled
always madvise [never]
cat /proc/sys/vm/nr_hugepages
0

Filesystem options
ext4 rw,seclabel,noatime,data=ordered 0 0
XFS (MongoDB 3.0+)
Disable access-time updates (mount with noatime)

NTP Daemon
#Red Hat
sudo yum install ntp
#Debian
sudo apt-get install ntp

Explain plan
db.inventory.find( {
status: "A",
$or: [ { qty: { $lt: 30 } }, { item: /^p/ } ]
} ).explain('executionStats');
or call explain() on the collection first:
db.inventory.explain('executionStats').find( {
status: "A",
$or: [ { qty: { $lt: 30 } }, { item: /^p/ } ]
} );
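The `executionStats` output is verbose, so it can help to reduce it to a few numbers: which stages the winning plan uses, whether it contains a collection scan, and how many documents were examined per document returned. The sketch below is illustrative only: `collectStages` and `summarizeExplain` are hypothetical helpers, and the sample explain document is hand-written rather than captured from a real server.

```javascript
// Walk the winning plan tree and collect the stage names in order.
function collectStages(plan, out = []) {
  if (!plan) return out;
  out.push(plan.stage);
  if (plan.inputStage) collectStages(plan.inputStage, out);
  for (const child of plan.inputStages || []) collectStages(child, out);
  return out;
}

// Reduce an explain('executionStats') document to a few key indicators.
function summarizeExplain(explain) {
  const stats = explain.executionStats;
  const stages = collectStages(explain.queryPlanner.winningPlan);
  return {
    stages,
    collectionScan: stages.indexOf("COLLSCAN") !== -1,
    docsExaminedPerReturned:
      stats.nReturned > 0 ? stats.totalDocsExamined / stats.nReturned : null,
    executionTimeMillis: stats.executionTimeMillis,
  };
}

// Hand-written sample document, for illustration only.
const sampleExplain = {
  queryPlanner: {
    winningPlan: {
      stage: "SUBPLAN",
      inputStage: {
        stage: "FETCH",
        inputStage: {
          stage: "OR",
          inputStages: [{ stage: "IXSCAN" }, { stage: "IXSCAN" }],
        },
      },
    },
  },
  executionStats: {
    nReturned: 5,
    totalKeysExamined: 8,
    totalDocsExamined: 10,
    executionTimeMillis: 2,
  },
};
console.log(summarizeExplain(sampleExplain));
```

A ratio close to 1 means the query examines roughly only the documents it returns; a high ratio or a COLLSCAN stage usually points at a missing or unsuitable index.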

Measuring replication lag performance
db.getReplicationInfo()
{
"logSizeMB" : 2157.1845703125,
"usedMB" : 0.05,
"timeDiff" : 4787,
"timeDiffHours" : 1.33,
"tFirst" : "Sun Jul 01 2018 21:40:32 GMT+0000 (UTC)",
"tLast" : "Sun Jul 01 2018 23:00:19 GMT+0000 (UTC)",
"now" : "Sun Jul 01 2018 23:00:26 GMT+0000 (UTC)"
}

Measuring replication lag performance
rs.printSlaveReplicationInfo()
rs.status()
● Replication Metrics
● Throughput of the replication
● Durability of the oplog
● Replication lag
● Comparable to Galera replication
● Quorum based
● At least one secondary needs to acknowledge
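Replication lag can be approximated by comparing the last applied operation time of each secondary with that of the primary. The sketch below assumes a document shaped like the output of `rs.status()`, where each entry in `members` has `name`, `stateStr` and an `optimeDate` date. `replicationLagSeconds` is a hypothetical helper and the member data is made up.

```javascript
// For every secondary, report how many seconds its last applied
// operation is behind the primary's. Returns null without a primary.
function replicationLagSeconds(status) {
  const primary = status.members.find(m => m.stateStr === "PRIMARY");
  if (!primary) return null;
  return status.members
    .filter(m => m.stateStr === "SECONDARY")
    .map(m => ({
      name: m.name,
      // Subtracting two dates yields milliseconds.
      lagSeconds: (primary.optimeDate - m.optimeDate) / 1000,
    }));
}

// Hypothetical replica set status, for illustration only.
const sampleStatus = {
  members: [
    { name: "node1:27017", stateStr: "PRIMARY", optimeDate: new Date("2018-07-01T23:00:19Z") },
    { name: "node2:27017", stateStr: "SECONDARY", optimeDate: new Date("2018-07-01T23:00:12Z") },
  ],
};
console.log(replicationLagSeconds(sampleStatus));
// → [ { name: 'node2:27017', lagSeconds: 7 } ]
```

On an idle cluster this number also reflects the time since the last write, so it is best read together with the write throughput.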

MongoDB locking
● Three levels of (generic) locking
Global, Database, Collection
mongo_replica_0:PRIMARY> db.serverStatus().locks
{
  "Global" : {
    "acquireCount" : { "r" : NumberLong(6050583), "w" : NumberLong(2416551), "R" : NumberLong(1), "W" : NumberLong(7) },
    "acquireWaitCount" : { "r" : NumberLong(1), "w" : NumberLong(1), "W" : NumberLong(1) },
    …
  }
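The raw lock counters are easier to judge as a ratio of waits to acquisitions per lock mode. The following is a minimal sketch: `lockWaitRatios` is a hypothetical helper, and it assumes the counters are plain numbers. In the mongo shell they are NumberLong values, so coercing them with `Number()` as done here is an assumption about how the shell converts them.

```javascript
// For one lock level (e.g. the "Global" entry), compute the fraction of
// lock acquisitions that had to wait, per lock mode (r, w, R, W).
function lockWaitRatios(lockStats) {
  const waits = lockStats.acquireWaitCount || {};
  const ratios = {};
  for (const mode of Object.keys(lockStats.acquireCount)) {
    const acquired = Number(lockStats.acquireCount[mode]);
    const waited = Number(waits[mode] || 0);
    ratios[mode] = acquired > 0 ? waited / acquired : 0;
  }
  return ratios;
}

// Counter values from the output above, written as plain numbers.
const globalLocks = {
  acquireCount: { r: 6050583, w: 2416551, R: 1, W: 7 },
  acquireWaitCount: { r: 1, w: 1, W: 1 },
};
console.log(lockWaitRatios(globalLocks));
```

With these counters the intent locks (r, w) almost never wait, while one of the seven exclusive global (W) acquisitions did.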

MongoDB locking
(WiredTiger)
Document level locking
Tickets (threads)
Read/Write
mongo_replica_0:PRIMARY> db.serverStatus().wiredTiger.concurrentTransactions
{
  "write" : { "out" : 0, "available" : 128, "totalTickets" : 128 },
  "read" : { "out" : 0, "available" : 128, "totalTickets" : 128 }
}

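The ticket counters can be turned into a utilization figure per ticket type. This sketch assumes a document shaped like the `concurrentTransactions` output above; `ticketUsage` is a hypothetical helper and the second sample is made up to show a busier server.

```javascript
// Report, for read and write tickets, the fraction currently in use.
function ticketUsage(concurrentTransactions) {
  const usage = {};
  for (const kind of ["read", "write"]) {
    const t = concurrentTransactions[kind];
    usage[kind] = {
      out: t.out,
      available: t.available,
      usedFraction: t.totalTickets > 0 ? t.out / t.totalTickets : 0,
    };
  }
  return usage;
}

// Values from the output above: no tickets in use.
console.log(ticketUsage({
  write: { out: 0, available: 128, totalTickets: 128 },
  read: { out: 0, available: 128, totalTickets: 128 },
}));

// Hypothetical busier server: 96 of 128 write tickets in use.
console.log(ticketUsage({
  write: { out: 96, available: 32, totalTickets: 128 },
  read: { out: 4, available: 124, totalTickets: 128 },
}).write.usedFraction);
// → 0.75
```

A used fraction that stays close to 1 means operations are queueing for tickets inside the storage engine.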
MongoDB cache
● MongoDB uses three tiers of cache
○ Filesystem
○ Active memory
○ Storage engine (WiredTiger / MongoRocks)
● Page faults
● Evictions
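Cache pressure and page faults can both be read from `db.serverStatus()`. The sketch below assumes a WiredTiger deployment on Linux, where the status document exposes `wiredTiger.cache["bytes currently in the cache"]`, `wiredTiger.cache["maximum bytes configured"]` and `extra_info.page_faults`. `cacheSummary` is a hypothetical helper and the sample numbers are made up.

```javascript
// Summarize how full the WiredTiger cache is and how many page faults
// the process has recorded. `status` is assumed to look like the
// result of db.serverStatus() on a WiredTiger node.
function cacheSummary(status) {
  const cache = status.wiredTiger.cache;
  const used = cache["bytes currently in the cache"];
  const max = cache["maximum bytes configured"];
  return {
    cacheFillFraction: max > 0 ? used / max : null,
    pageFaults: status.extra_info.page_faults,
  };
}

// Hypothetical status excerpt, for illustration only.
const sampleServerStatus = {
  wiredTiger: {
    cache: {
      "bytes currently in the cache": 512 * 1024 * 1024,
      "maximum bytes configured": 1024 * 1024 * 1024,
    },
  },
  extra_info: { page_faults: 42 },
};
console.log(cacheSummary(sampleServerStatus));
// → { cacheFillFraction: 0.5, pageFaults: 42 }
```

Sampling these values over time shows whether the working set still fits in the cache or whether evictions and page faults are climbing.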

How to automate?

Live demo

Links & Resources
● Download / install ClusterControl
● ClusterControl Community Edition Page
● Contact us: info@severalnines.com
