Monitoring MySQL with OpenTSDB
Percona Live 2013, Geoffrey Anderson, Box Inc.
@geodbz
Who
Geoffrey Anderson
• Database Operations Engineer @ Box, Inc.
• a.k.a. DBA
• Tooling for MySQL and HBase
• #DBHangOps
The Situation
Then You Get More Servers
Enter OpenTSDB
OpenTSDB is...
• Distributed
• Scalable
• Time Series Database
• Runs on HBase
• Created by Benoit Sigoure
[Architecture diagram. Labels: HBase; TSD for Storing; TSD for Querying; HAProxy; mydb.example.com; fe1.example.com; "Push Metrics"; "Query via API"]
Why OpenTSDB?
• FAST
• EASY to Scale
• EASY to Populate
• EASY to collect data
• EASY to Query
Collecting Data
Example: mysql_collector.sh
#!/usr/bin/env bash
# Print each global status variable as: metric timestamp value host=<hostname>
timestamp=$(date +%s)
mysql -ss -e "SHOW GLOBAL STATUS" | while read -r var val
do
    echo "mysql.$var $timestamp $val host=$HOSTNAME"
done

ganderson@mydb.example.com:~$ ./mysql_collector.sh
mysql.Aborted_connects 1366399993 0 host=mydb.example.com
mysql.Binlog_cache_disk_use 1366399993 0 host=mydb.example.com
mysql.Binlog_cache_use 1366399993 0 host=mydb.example.com
mysql.Binlog_stmt_cache_disk_use 1366399993 0 host=mydb.example.com
mysql.Binlog_stmt_cache_use 1366399993 0 host=mydb.example.com
mysql.Bytes_received 1366399993 19453687 host=mydb.example.com
mysql.Bytes_sent 1366399993 1238166682 host=mydb.example.com
mysql.Com_admin_commands 1366399993 1 host=mydb.example.com
mysql.Com_assign_to_keycache 1366399993 0 host=mydb.example.com
...
Each output line has four fields: metric name, timestamp, value, and "tags" (key=val).
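The same line format can be produced from Python. The sketch below is an illustration, not the collector from the talk: the function name `format_status_lines` and its `prefix` parameter are assumptions. It also skips non-numeric values, on the assumption that some status variables (for example `Ssl_cipher`) hold strings that OpenTSDB would not accept as a data point value.

```python
def format_status_lines(rows, timestamp, host, prefix="mysql"):
    """Turn (variable, value) rows into 'metric timestamp value host=...' lines."""
    lines = []
    for name, value in rows:
        try:
            float(value)  # status values arrive as strings; keep only numeric ones
        except ValueError:
            continue
        lines.append(f"{prefix}.{name} {timestamp} {value} host={host}")
    return lines
```

The `rows` argument would come from whatever client runs `SHOW GLOBAL STATUS`; the function itself has no database dependency, which keeps it easy to test.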
Example: adding a cron for OpenTSDB
* * * * * mysql_collector.sh | sed 's/^/put /' | nc opentsdb.example.com 4242
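When lines are sent straight to a TSD socket, each data point is expected as a `put` command. A minimal sketch of doing that without `nc` follows; the function names `to_put_commands` and `send_to_tsd` are hypothetical, and port 4242 is taken from the cron example.

```python
import socket


def to_put_commands(lines):
    """Prefix each data line with 'put', one newline-terminated command per line."""
    return "".join(f"put {line.strip()}\n" for line in lines if line.strip())


def send_to_tsd(lines, host, port=4242, timeout=5.0):
    """Open a TCP connection to a TSD and send the put commands."""
    payload = to_put_commands(lines).encode("ascii")
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(payload)
```

A call such as `send_to_tsd(lines, "opentsdb.example.com")` would replace the `nc` stage of the pipeline; error handling and retries are left out of this sketch.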
ganderson@mydb.example.com:tcollector$ tree
.
|-- collectors
|   |-- 0
|   |   |-- ifstat.py
|   |   |-- iostat.py
|   |   |-- procnettcp.py
|   |   |-- procstats.py
|   |-- 15
|   |   `-- dfstat.py
|   |-- 30
|   |   |-- mysql_collector.sh
|   |-- 300
|   |   `-- ptTcpModel.sh
|   `-- etc
|       |-- config.py
|-- config
|-- startstop
`-- tcollector.py

Collector directories are named for their interval in seconds:
0: run forever
15: run every 15 seconds
30: run every 30 seconds
300: run every 5 minutes
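The directory-name convention can be sketched in a few lines. This is not tcollector's actual scheduling code; `collector_interval` and `is_due` are hypothetical helpers that only illustrate how an interval could be read from the path and applied.

```python
import os


def collector_interval(path):
    """Return the interval in seconds encoded in a collector's parent directory.

    0 means "start once and keep running"; non-numeric directories such as
    etc are not collectors and yield None.
    """
    dirname = os.path.basename(os.path.dirname(path))
    return int(dirname) if dirname.isdigit() else None


def is_due(interval, last_run, now):
    """Decide whether a collector with a numeric interval should be started."""
    if interval == 0:
        return last_run is None  # long-running: start once only
    return last_run is None or now - last_run >= interval
```

With this layout, moving `mysql_collector.sh` from `30` to `15` would be the only change needed to double its sampling frequency.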
Querying Data
http://opentsdb.example.com
/#start=2013/04/10-07:32:29
&end=2013/04/10-07:57:57
&m=sum:proc.stat.cpu.percentage_idle{host=db22}
&o=axis x1y1
&m=sum:db.threads_running{host=db22}
&o=axis x1y2
&ylabel=CPU idle
&y2label=Threads Running
&yrange=[0:]
&wxh=1475x600
&png
http://opentsdb.example.com
/q?start=2013/04/10-07:32:29
&end=2013/04/10-07:57:57
&m=sum:proc.stat.cpu.percentage_idle{host=db22}
&o=axis x1y1
&m=sum:db.threads_running{host=db22}
&o=axis x1y2
&ylabel=CPU idle
&y2label=Threads Running
&yrange=[0:]
&ascii
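The `/q` request with `&ascii` can be assembled and its response parsed programmatically. The sketch below assumes the plain-text response uses one `metric timestamp value tag=val` line per data point; `build_ascii_query` and `parse_ascii` are hypothetical names, and graph-only options such as `o`, `ylabel`, and `yrange` are omitted.

```python
from urllib.parse import urlencode


def build_ascii_query(base_url, start, end, metrics):
    """Build a /q URL that returns plain-text data points instead of a PNG."""
    params = [("start", start), ("end", end)] + [("m", m) for m in metrics]
    return f"{base_url}/q?{urlencode(params)}&ascii"


def parse_ascii(body):
    """Parse 'metric timestamp value tag=val ...' lines into dictionaries."""
    points = []
    for line in body.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        metric, timestamp, value, *tags = fields
        points.append({
            "metric": metric,
            "timestamp": int(timestamp),
            "value": float(value),
            "tags": dict(tag.split("=", 1) for tag in tags),
        })
    return points
```

The URL could then be fetched with any HTTP client (for example `urllib.request.urlopen`) and the decoded body passed to `parse_ascii`.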
Leveraging OpenTSDB For MySQL
user_statistics monitoring
table_statistics monitoring
Table Info from I_S
SELECT *, DATA_LENGTH + INDEX_LENGTH AS TOTAL_LENGTH
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_SCHEMA NOT IN
    ('performance_schema', 'information_schema');
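Rows from that query can be turned into tagged data points, one per table. The following is a sketch under assumptions: the metric name `mysql.table.total_length`, the tag keys `schema` and `table`, and the function name are all illustrative rather than taken from the talk. Characters outside the set OpenTSDB allows in tag values are replaced with underscores.

```python
import re

# OpenTSDB accepts only letters, digits, '-', '_', '.' and '/' in tag values.
_UNSAFE = re.compile(r"[^A-Za-z0-9._/-]")


def table_size_lines(rows, timestamp, host):
    """rows: (table_schema, table_name, data_length, index_length) tuples."""
    lines = []
    for schema, table, data_length, index_length in rows:
        if data_length is None or index_length is None:
            continue  # views report NULL sizes
        tags = (f"host={host} schema={_UNSAFE.sub('_', schema)} "
                f"table={_UNSAFE.sub('_', table)}")
        total = data_length + index_length
        lines.append(f"mysql.table.total_length {timestamp} {total} {tags}")
    return lines
```

Tagging by schema and table lets a single metric be aggregated per host, per schema, or per table at query time.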
Query Throughput
And other “common” metrics
• Various MySQL status counters
  • QPS (questions)
  • Threads connected
  • Temporary tables on disk
  • Etc.
• Various server statistics
  • %CPU Idle
  • Free disk space
  • I/O utilization
  • Network traffic
  • Etc.
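Counters such as Questions only ever increase, so QPS is the difference between two samples divided by the elapsed time. OpenTSDB can do this at query time with the `rate` modifier (for example `sum:rate:mysql.Questions`); the sketch below shows the same arithmetic by hand, with `per_second_rate` as a hypothetical helper name.

```python
def per_second_rate(prev, cur):
    """prev and cur are (timestamp, counter_value) samples of one counter."""
    (t0, v0), (t1, v1) = prev, cur
    if t1 <= t0 or v1 < v0:
        return None  # out-of-order samples, or the counter reset (server restart)
    return (v1 - v0) / (t1 - t0)
```

For two samples taken 30 seconds apart with the counter rising from 1,000 to 4,000, the result is 100 queries per second.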
Future collectors
• pt-query-digest/mysqlslow query statistics
• Data from “show engine innodb status”
  • (that is missing from counters)
• PERFORMANCE_SCHEMA (MySQL 5.6+)
  • Query statistics
  • Processlist information
  • Background thread information
How does this change things?
In all seriousness, though...
• Easily see aggregate graphs
• Easily build graphs on-the-fly
• Full granularity forever
• API request for raw data
• Cluster-wide Nagios checks with check_tsd
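The idea behind a cluster-wide check is to query one aggregated series and compare it against thresholds. The sketch below is not check_tsd's implementation; `check_values` is a hypothetical helper showing only the threshold logic, using the standard Nagios plugin exit codes and assuming that higher values are worse.

```python
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3  # standard Nagios plugin exit codes


def check_values(values, warn, crit):
    """Return a Nagios exit code for data points where higher is worse."""
    if not values:
        return UNKNOWN  # no data is reported rather than treated as healthy
    worst = max(values)
    if worst >= crit:
        return CRITICAL
    if worst >= warn:
        return WARNING
    return OK
```

Because the input is an aggregate across hosts, one such check can cover a whole cluster instead of one check per server.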
Challenges Switching
• Aggregates are the default
• Mouse-zooming (patched!)
• Auto-suggest for metrics
• “The graphs aren’t pretty”
• Migrating from proof of concept
  • Plan for 3+ machines
  • Data pruning may be required
Some Quick Numbers
OpenTSDB @ Box
• 21,294 metrics
• 72 tag keys
• 5,145,745 tag values
• 90% of interactive graphs return in <300ms
Next Steps
Enjoy #PerconaLive 2013
We’re hiring!
https://www.box.com/about-us/careers/
geoff@box.com
Image credits
• http://upload.wikimedia.org/wikipedia/commons/7/7b/Batelco_Network_Operations_Centre_(NOC).JPG
• http://www.flickr.com/photos/hoyvinmayvin/5873697252/
• http://www.percona.com/doc/percona-monitoring-plugins
• http://www.2cto.com/uploadfile/2012/0731/20120731112415744.jpg
• http://media.tumblr.com/tumblr_lvfspoenWU1qi19a2.png
• http://img.izismile.com/img/img4/20110527/640/you_can_be_a_superhero_640_01.jpg
• http://openclipart.org/image/250px/svg_to_png/26427/Anonymous_notebook.png
• http://images.alphacoders.com/768/2560-1600-76893.jpg
• http://www.flickr.com/photos/in365/4861180503/
• http://openclipart.org/image/250px/svg_to_png/130915/Prohibido_3D.png
• http://www.flickr.com/photos/61114149@N02/5566484951/
• http://opentsdb.net/img/tsd-sample.png
• http://images2.wikia.nocookie.net/__cb20080911160202/bttf/images/5/57/WhatdidItellyou-HQ.jpg
• http://www.flickr.com/photos/lisakayaks/3028350539/
• http://www.flickr.com/photos/25566302@N00/1472400115
• http://www.flickr.com/photos/grandmaitre/5846058698/
• http://www.flickr.com/photos/7518432@N06/2673347604/

More Related Content

What's hot

HBaseCon2017 gohbase: Pure Go HBase Client
HBaseCon2017 gohbase: Pure Go HBase ClientHBaseCon2017 gohbase: Pure Go HBase Client
HBaseCon2017 gohbase: Pure Go HBase ClientHBaseCon
 
Keynote: Apache HBase at Yahoo! Scale
Keynote: Apache HBase at Yahoo! ScaleKeynote: Apache HBase at Yahoo! Scale
Keynote: Apache HBase at Yahoo! ScaleHBaseCon
 
Gnocchi v3 brownbag
Gnocchi v3 brownbagGnocchi v3 brownbag
Gnocchi v3 brownbagGordon Chung
 
Gnocchi Profiling 2.1.x
Gnocchi Profiling 2.1.xGnocchi Profiling 2.1.x
Gnocchi Profiling 2.1.xGordon Chung
 
Gnocchi v4 (preview)
Gnocchi v4 (preview)Gnocchi v4 (preview)
Gnocchi v4 (preview)Gordon Chung
 
Advanced Apache Cassandra Operations with JMX
Advanced Apache Cassandra Operations with JMXAdvanced Apache Cassandra Operations with JMX
Advanced Apache Cassandra Operations with JMXzznate
 
ELK: Moose-ively scaling your log system
ELK: Moose-ively scaling your log systemELK: Moose-ively scaling your log system
ELK: Moose-ively scaling your log systemAvleen Vig
 
Monitoring with Prometheus
Monitoring with PrometheusMonitoring with Prometheus
Monitoring with PrometheusShiao-An Yuan
 
Gnocchi Profiling v2
Gnocchi Profiling v2Gnocchi Profiling v2
Gnocchi Profiling v2Gordon Chung
 
Gnocchi v4 - past and present
Gnocchi v4 - past and presentGnocchi v4 - past and present
Gnocchi v4 - past and presentGordon Chung
 
Bucket Your Partitions Wisely (Markus Höfer, codecentric AG) | Cassandra Summ...
Bucket Your Partitions Wisely (Markus Höfer, codecentric AG) | Cassandra Summ...Bucket Your Partitions Wisely (Markus Höfer, codecentric AG) | Cassandra Summ...
Bucket Your Partitions Wisely (Markus Höfer, codecentric AG) | Cassandra Summ...DataStax
 
Anatomy of an action
Anatomy of an actionAnatomy of an action
Anatomy of an actionGordon Chung
 
Ted Dunning – Very High Bandwidth Time Series Database Implementation - NoSQL...
Ted Dunning – Very High Bandwidth Time Series Database Implementation - NoSQL...Ted Dunning – Very High Bandwidth Time Series Database Implementation - NoSQL...
Ted Dunning – Very High Bandwidth Time Series Database Implementation - NoSQL...NoSQLmatters
 
Back to Basics Webinar 6: Production Deployment
Back to Basics Webinar 6: Production DeploymentBack to Basics Webinar 6: Production Deployment
Back to Basics Webinar 6: Production DeploymentMongoDB
 
openTSDB - Metrics for a distributed world
openTSDB - Metrics for a distributed worldopenTSDB - Metrics for a distributed world
openTSDB - Metrics for a distributed worldOliver Hankeln
 
InfluxDB IOx Tech Talks: Intro to the InfluxDB IOx Read Buffer - A Read-Optim...
InfluxDB IOx Tech Talks: Intro to the InfluxDB IOx Read Buffer - A Read-Optim...InfluxDB IOx Tech Talks: Intro to the InfluxDB IOx Read Buffer - A Read-Optim...
InfluxDB IOx Tech Talks: Intro to the InfluxDB IOx Read Buffer - A Read-Optim...InfluxData
 
ScyllaDB: NoSQL at Ludicrous Speed
ScyllaDB: NoSQL at Ludicrous SpeedScyllaDB: NoSQL at Ludicrous Speed
ScyllaDB: NoSQL at Ludicrous SpeedJ On The Beach
 
Let's Compare: A Benchmark review of InfluxDB and Elasticsearch
Let's Compare: A Benchmark review of InfluxDB and ElasticsearchLet's Compare: A Benchmark review of InfluxDB and Elasticsearch
Let's Compare: A Benchmark review of InfluxDB and ElasticsearchInfluxData
 

What's hot (20)

HBaseCon2017 gohbase: Pure Go HBase Client
HBaseCon2017 gohbase: Pure Go HBase ClientHBaseCon2017 gohbase: Pure Go HBase Client
HBaseCon2017 gohbase: Pure Go HBase Client
 
Keynote: Apache HBase at Yahoo! Scale
Keynote: Apache HBase at Yahoo! ScaleKeynote: Apache HBase at Yahoo! Scale
Keynote: Apache HBase at Yahoo! Scale
 
Gnocchi v3 brownbag
Gnocchi v3 brownbagGnocchi v3 brownbag
Gnocchi v3 brownbag
 
Gnocchi Profiling 2.1.x
Gnocchi Profiling 2.1.xGnocchi Profiling 2.1.x
Gnocchi Profiling 2.1.x
 
Gnocchi v4 (preview)
Gnocchi v4 (preview)Gnocchi v4 (preview)
Gnocchi v4 (preview)
 
Advanced Apache Cassandra Operations with JMX
Advanced Apache Cassandra Operations with JMXAdvanced Apache Cassandra Operations with JMX
Advanced Apache Cassandra Operations with JMX
 
ELK: Moose-ively scaling your log system
ELK: Moose-ively scaling your log systemELK: Moose-ively scaling your log system
ELK: Moose-ively scaling your log system
 
Monitoring with Prometheus
Monitoring with PrometheusMonitoring with Prometheus
Monitoring with Prometheus
 
Gnocchi v3
Gnocchi v3Gnocchi v3
Gnocchi v3
 
Gnocchi Profiling v2
Gnocchi Profiling v2Gnocchi Profiling v2
Gnocchi Profiling v2
 
Gnocchi v4 - past and present
Gnocchi v4 - past and presentGnocchi v4 - past and present
Gnocchi v4 - past and present
 
Bucket Your Partitions Wisely (Markus Höfer, codecentric AG) | Cassandra Summ...
Bucket Your Partitions Wisely (Markus Höfer, codecentric AG) | Cassandra Summ...Bucket Your Partitions Wisely (Markus Höfer, codecentric AG) | Cassandra Summ...
Bucket Your Partitions Wisely (Markus Höfer, codecentric AG) | Cassandra Summ...
 
Anatomy of an action
Anatomy of an actionAnatomy of an action
Anatomy of an action
 
Ted Dunning – Very High Bandwidth Time Series Database Implementation - NoSQL...
Ted Dunning – Very High Bandwidth Time Series Database Implementation - NoSQL...Ted Dunning – Very High Bandwidth Time Series Database Implementation - NoSQL...
Ted Dunning – Very High Bandwidth Time Series Database Implementation - NoSQL...
 
Back to Basics Webinar 6: Production Deployment
Back to Basics Webinar 6: Production DeploymentBack to Basics Webinar 6: Production Deployment
Back to Basics Webinar 6: Production Deployment
 
openTSDB - Metrics for a distributed world
openTSDB - Metrics for a distributed worldopenTSDB - Metrics for a distributed world
openTSDB - Metrics for a distributed world
 
Aerospike & GCE (LSPE Talk)
Aerospike & GCE (LSPE Talk)Aerospike & GCE (LSPE Talk)
Aerospike & GCE (LSPE Talk)
 
InfluxDB IOx Tech Talks: Intro to the InfluxDB IOx Read Buffer - A Read-Optim...
InfluxDB IOx Tech Talks: Intro to the InfluxDB IOx Read Buffer - A Read-Optim...InfluxDB IOx Tech Talks: Intro to the InfluxDB IOx Read Buffer - A Read-Optim...
InfluxDB IOx Tech Talks: Intro to the InfluxDB IOx Read Buffer - A Read-Optim...
 
ScyllaDB: NoSQL at Ludicrous Speed
ScyllaDB: NoSQL at Ludicrous SpeedScyllaDB: NoSQL at Ludicrous Speed
ScyllaDB: NoSQL at Ludicrous Speed
 
Let's Compare: A Benchmark review of InfluxDB and Elasticsearch
Let's Compare: A Benchmark review of InfluxDB and ElasticsearchLet's Compare: A Benchmark review of InfluxDB and Elasticsearch
Let's Compare: A Benchmark review of InfluxDB and Elasticsearch
 

Similar to Monitoring MySQL with OpenTSDB

Why and How Powershell will rule the Command Line - Barcamp LA 4
Why and How Powershell will rule the Command Line - Barcamp LA 4Why and How Powershell will rule the Command Line - Barcamp LA 4
Why and How Powershell will rule the Command Line - Barcamp LA 4Ilya Haykinson
 
Functional Hostnames and Why they are Bad
Functional Hostnames and Why they are BadFunctional Hostnames and Why they are Bad
Functional Hostnames and Why they are BadPuppet
 
Digdagによる大規模データ処理の自動化とエラー処理
Digdagによる大規模データ処理の自動化とエラー処理Digdagによる大規模データ処理の自動化とエラー処理
Digdagによる大規模データ処理の自動化とエラー処理Sadayuki Furuhashi
 
The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...
The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...
The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...Glenn K. Lockwood
 
Devoxx france 2015 influxdb
Devoxx france 2015 influxdbDevoxx france 2015 influxdb
Devoxx france 2015 influxdbNicolas Muller
 
Devoxx france 2015 influx db
Devoxx france 2015 influx dbDevoxx france 2015 influx db
Devoxx france 2015 influx dbNicolas Muller
 
fog or: How I Learned to Stop Worrying and Love the Cloud (OpenStack Edition)
fog or: How I Learned to Stop Worrying and Love the Cloud (OpenStack Edition)fog or: How I Learned to Stop Worrying and Love the Cloud (OpenStack Edition)
fog or: How I Learned to Stop Worrying and Love the Cloud (OpenStack Edition)Wesley Beary
 
Percona Toolkit for Effective MySQL Administration
Percona Toolkit for Effective MySQL AdministrationPercona Toolkit for Effective MySQL Administration
Percona Toolkit for Effective MySQL AdministrationMydbops
 
fog or: How I Learned to Stop Worrying and Love the Cloud
fog or: How I Learned to Stop Worrying and Love the Cloudfog or: How I Learned to Stop Worrying and Love the Cloud
fog or: How I Learned to Stop Worrying and Love the CloudWesley Beary
 
MySQL 5.7 in a Nutshell
MySQL 5.7 in a NutshellMySQL 5.7 in a Nutshell
MySQL 5.7 in a NutshellEmily Ikuta
 
How ElasticSearch lives in my DevOps life
How ElasticSearch lives in my DevOps lifeHow ElasticSearch lives in my DevOps life
How ElasticSearch lives in my DevOps life琛琳 饶
 
Automating Complex Setups with Puppet
Automating Complex Setups with PuppetAutomating Complex Setups with Puppet
Automating Complex Setups with PuppetKris Buytaert
 
Postgres Vienna DB Meetup 2014
Postgres Vienna DB Meetup 2014Postgres Vienna DB Meetup 2014
Postgres Vienna DB Meetup 2014Michael Renner
 
How (not) to kill your MySQL infrastructure
How (not) to kill your MySQL infrastructureHow (not) to kill your MySQL infrastructure
How (not) to kill your MySQL infrastructureMiklos Szel
 
How to use the new Domino Query Language
How to use the new Domino Query LanguageHow to use the new Domino Query Language
How to use the new Domino Query LanguageTim Davis
 
Beyond PHP - It's not (just) about the code
Beyond PHP - It's not (just) about the codeBeyond PHP - It's not (just) about the code
Beyond PHP - It's not (just) about the codeWim Godden
 
Performance & Scalability Improvements in Perforce
Performance & Scalability Improvements in PerforcePerformance & Scalability Improvements in Perforce
Performance & Scalability Improvements in PerforcePerforce
 
10 Key MongoDB Performance Indicators
10 Key MongoDB Performance Indicators  10 Key MongoDB Performance Indicators
10 Key MongoDB Performance Indicators iammutex
 

Similar to Monitoring MySQL with OpenTSDB (20)

Why and How Powershell will rule the Command Line - Barcamp LA 4
Why and How Powershell will rule the Command Line - Barcamp LA 4Why and How Powershell will rule the Command Line - Barcamp LA 4
Why and How Powershell will rule the Command Line - Barcamp LA 4
 
Functional Hostnames and Why they are Bad
Functional Hostnames and Why they are BadFunctional Hostnames and Why they are Bad
Functional Hostnames and Why they are Bad
 
Digdagによる大規模データ処理の自動化とエラー処理
Digdagによる大規模データ処理の自動化とエラー処理Digdagによる大規模データ処理の自動化とエラー処理
Digdagによる大規模データ処理の自動化とエラー処理
 
The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...
The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...
The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...
 
Jk rubyslava 25
Jk rubyslava 25Jk rubyslava 25
Jk rubyslava 25
 
Devoxx france 2015 influxdb
Devoxx france 2015 influxdbDevoxx france 2015 influxdb
Devoxx france 2015 influxdb
 
Devoxx france 2015 influx db
Devoxx france 2015 influx dbDevoxx france 2015 influx db
Devoxx france 2015 influx db
 
fog or: How I Learned to Stop Worrying and Love the Cloud (OpenStack Edition)
fog or: How I Learned to Stop Worrying and Love the Cloud (OpenStack Edition)fog or: How I Learned to Stop Worrying and Love the Cloud (OpenStack Edition)
fog or: How I Learned to Stop Worrying and Love the Cloud (OpenStack Edition)
 
Percona Toolkit for Effective MySQL Administration
Percona Toolkit for Effective MySQL AdministrationPercona Toolkit for Effective MySQL Administration
Percona Toolkit for Effective MySQL Administration
 
fog or: How I Learned to Stop Worrying and Love the Cloud
fog or: How I Learned to Stop Worrying and Love the Cloudfog or: How I Learned to Stop Worrying and Love the Cloud
fog or: How I Learned to Stop Worrying and Love the Cloud
 
MySQL 5.7 in a Nutshell
MySQL 5.7 in a NutshellMySQL 5.7 in a Nutshell
MySQL 5.7 in a Nutshell
 
How ElasticSearch lives in my DevOps life
How ElasticSearch lives in my DevOps lifeHow ElasticSearch lives in my DevOps life
How ElasticSearch lives in my DevOps life
 
Logstash
LogstashLogstash
Logstash
 
Automating Complex Setups with Puppet
Automating Complex Setups with PuppetAutomating Complex Setups with Puppet
Automating Complex Setups with Puppet
 
Postgres Vienna DB Meetup 2014
Postgres Vienna DB Meetup 2014Postgres Vienna DB Meetup 2014
Postgres Vienna DB Meetup 2014
 
How (not) to kill your MySQL infrastructure
How (not) to kill your MySQL infrastructureHow (not) to kill your MySQL infrastructure
How (not) to kill your MySQL infrastructure
 
How to use the new Domino Query Language
How to use the new Domino Query LanguageHow to use the new Domino Query Language
How to use the new Domino Query Language
 
Beyond PHP - It's not (just) about the code
Beyond PHP - It's not (just) about the codeBeyond PHP - It's not (just) about the code
Beyond PHP - It's not (just) about the code
 
Performance & Scalability Improvements in Perforce
Performance & Scalability Improvements in PerforcePerformance & Scalability Improvements in Perforce
Performance & Scalability Improvements in Perforce
 
10 Key MongoDB Performance Indicators
10 Key MongoDB Performance Indicators  10 Key MongoDB Performance Indicators
10 Key MongoDB Performance Indicators
 

Recently uploaded

Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Zilliz
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
Cyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdfCyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdfOverkill Security
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Orbitshub
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistandanishmna97
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024The Digital Insurer
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native ApplicationsWSO2
 
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKJago de Vreede
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024The Digital Insurer
 

Recently uploaded (20)

Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Cyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdfCyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdf
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 

Monitoring MySQL with OpenTSDB

  • 1. Monitoring MySQL with OpenTSDB Percona live 2013 Geoffrey Anderson, Box Inc. @geodbz
  • 2. Who Geoffrey Anderson • Database Operations Engineer @ Box, Inc. • a.k.a. DBA • Tooling for MySQL and HBase • #DBHangOps
  • 4.
  • 5.
  • 6.
  • 8.
  • 10. OpenTSDB is... • Distributed • Scalable • Time Series Database • Runs on HBase • Created By Benoit Sigoure HBase TSD for Querying mydb.example.com HAProxy fe1.example.com TSD for Storing Push Metrics Query via API
  • 11. • FAST • EASY to Scale • EASY to Populate • EASY to collect data • EASY to Query Why OpenTSDB?
  • 13. #!/usr/bin/env bash timestamp=$(date +%s) mysql -ss -e "SHOW GLOBAL STATUS" | while read var val do echo "mysql.$var $timestamp $val host=$HOSTNAME" done ganderson@mydb.example.com:~$ _./mysql_collector.sh mysql.Aborted_connects 1366399993 0 host=mydb.example.com mysql.Binlog_cache_disk_use 1366399993 0 host=mydb.example.com mysql.Binlog_cache_use 1366399993 0 host=mydb.example.com mysql.Binlog_stmt_cache_disk_use 1366399993 0 host=mydb.example.com mysql.Binlog_stmt_cache_use 1366399993 0 host=mydb.example.com mysql.Bytes_received 1366399993 19453687 host=mydb.example.com mysql.Bytes_sent 1366399993 1238166682 host=mydb.example.com mysql.Com_admin_commands 1366399993 1 host=mydb.example.com mysql.Com_assign_to_keycache 1366399993 0 host=mydb.example.com ... Example: mysql_collector.sh
  • 14. #!/usr/bin/env bash timestamp=$(date +%s) mysql -ss -e "SHOW GLOBAL STATUS" | while read var val do echo "mysql.$var $timestamp $val host=$HOSTNAME" done ganderson@mydb.example.com:~$ _./mysql_collector.sh mysql.Aborted_connects 1366399993 0 host=mydb.example.com mysql.Binlog_cache_disk_use 1366399993 0 host=mydb.example.com mysql.Binlog_cache_use 1366399993 0 host=mydb.example.com mysql.Binlog_stmt_cache_disk_use 1366399993 0 host=mydb.example.com mysql.Binlog_stmt_cache_use 1366399993 0 host=mydb.example.com mysql.Bytes_received 1366399993 19453687 host=mydb.example.com mysql.Bytes_sent 1366399993 1238166682 host=mydb.example.com mysql.Com_admin_commands 1366399993 1 host=mydb.example.com mysql.Com_assign_to_keycache 1366399993 0 host=mydb.example.com ... Example: mysql_collector.sh Metric name Timestamp Value “Tags” (key=val)
  • 15. * * * * * mysql_collector.sh | nc opentsdb.example.com 4242 Example: adding a cron for OpenTSDB
  • 16.
  • 17. ganderson@mydb.example.com:tcollector$ tree . |-- collectors | |-- 0 | | |-- ifstat.py | | |-- iostat.py | | |-- procnettcp.py | | |-- procstats.py | |-- 15 | | `-- dfstat.py | |-- 30 | | |-- mysql_collector.sh | |-- 300 | | `-- ptTcpModel.sh | `-- etc | |-- config.py |-- config |-- startstop `-- tcollector.py Run forever Run every 15 seconds Run every 5 minutes Run every 30 seconds
  • 19.
  • 20.
  • 21.
  • 22.
  • 23.
  • 24.
  • 25.
  • 31. Table Info from I_S SELECT *, DATA_LENGTH+INDEX_LENGTH AS TOTAL_LENGTH FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA NOT IN ('PERFORMANCE_SCHEMA','INFORMATION_SCHEMA')
  • 33. And other “common” metrics
        • Various MySQL status counters
            • QPS (questions)
            • Threads connected
            • Temporary tables on disk
            • Etc.
        • Various server statistics
            • %CPU Idle
            • Free disk space
            • I/O utilization
            • Network traffic
            • Etc.
  • 34. Future collectors
        • pt-query-digest/mysqlslow query statistics
        • Data from “show engine innodb status”
            • (that is missing from counters)
        • PERFORMANCE_SCHEMA (MySQL 5.6+)
            • Query statistics
            • Processlist information
            • Background thread information
  • 35. How does this change things?
  • 37. In all seriousness, though...
        • Easily see aggregate graphs
        • Easily build graphs on-the-fly
        • Full granularity forever
        • API request for raw data
        • Cluster-wide nagios checks with check_tsd
  • 38. Challenges Switching
        • Aggregates are the default
        • Mouse-zooming (patched!)
        • Auto-suggest for metrics
        • “The graphs aren’t pretty”
        • Migrating from proof of concept
        • Plan for 3+ machines
        • Data pruning may be required
  • 39. Some Quick Numbers: OpenTSDB @ Box
        • 21,294 metrics
        • 72 tag keys
        • 5,145,745 tag values
        • 90% of interactive graphs return in <300ms
  • 41. Enjoy #PerconaLive 2013 We’re hiring! https://www.box.com/about-us/careers/ geoff@box.com
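Slides 13 to 15 pipe the collector's output straight into a TSD with nc. OpenTSDB's telnet-style interface expects every data point on a line that starts with `put`; tcollector adds that prefix itself, but a bare nc pipeline does not. A minimal sketch of the sending side, assuming a hypothetical helper named `to_put_lines` (not part of the deck) and the host and port from slide 15:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the slides): prefix each
# "metric timestamp value tag=val" line with the "put" command
# that OpenTSDB's telnet-style interface expects.
to_put_lines() {
    while read -r metric timestamp value tags
    do
        # Skip blank or incomplete lines rather than sending bad data points.
        [ -n "$tags" ] || continue
        echo "put $metric $timestamp $value $tags"
    done
}

# Usage (assumes nc is installed and the TSD listens on port 4242):
#   ./mysql_collector.sh | to_put_lines | nc opentsdb.example.com 4242
```

Keeping the prefix out of the collector itself means the same script can be dropped into a tcollector directory unchanged.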

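The table-info query on slide 31 can feed the same line format as the status collector. What follows is only a sketch under assumptions: the function name `table_rows_to_metrics`, the `mysql.table.*` metric names, and the `schema`/`table` tag keys are all made up for illustration and are not from the deck.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn tab-separated rows of
#   schema  table  data_length  index_length
# into OpenTSDB-style lines. Metric and tag names are assumptions.
table_rows_to_metrics() {
    local timestamp=$1 host=$2
    while IFS=$'\t' read -r schema table data_length index_length
    do
        # Views report NULL sizes; skip them instead of emitting bad values.
        [ "$data_length" = "NULL" ] && continue
        [ "$index_length" = "NULL" ] && continue
        local tags="host=$host schema=$schema table=$table"
        echo "mysql.table.data_length $timestamp $data_length $tags"
        echo "mysql.table.index_length $timestamp $index_length $tags"
        echo "mysql.table.total_length $timestamp $((data_length + index_length)) $tags"
    done
}

# Usage (assumes a mysql client with credentials already configured):
#   mysql -ss -e "SELECT TABLE_SCHEMA, TABLE_NAME, DATA_LENGTH, INDEX_LENGTH
#                 FROM INFORMATION_SCHEMA.TABLES
#                 WHERE TABLE_SCHEMA NOT IN ('PERFORMANCE_SCHEMA','INFORMATION_SCHEMA')" \
#     | table_rows_to_metrics "$(date +%s)" "$HOSTNAME"
```

Putting the schema and table in tags rather than in the metric name keeps the metric count small and lets a single graph aggregate across tables or drill into one.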
Editor's Notes

  1. Will be talking about OpenTSDB: how OpenTSDB changed monitoring at Box, and how we leverage its abilities for day-to-day management of MySQL DBs.
  2. You probably have the Percona Cacti graphs and monitoring plugins.
  3. You add some other Nagios checks for fun edge cases.
  4. And you use different tools from the Percona Toolkit, like: Stalk; Poor man’s profiler (PMP); Query Digest.
  5. Suddenly, finding problems and correlating issues is difficult. Maybe you don’t have a NOC yet. Maybe you do, and they need better graphs.
  6. IT’S BIGGER ON THE INSIDE – just kidding. Fast! Easy to build graphs on the fly. Hella easy to scale – just add nodes (HBase or TSDs). Very easy to put data into it – NEXT SLIDES TALK ABOUT THIS YO
  7. Running threads follows the CPU spikes PERFECTLY. Box has a “long query” killer that gets more aggressive as more threads stack up. Should get a look at queries on the server.
  8. Zoom in to get the exact time interval
  9. Know the exact time of a high stack-up. Go check Box Anemometer to see what query is there.
  10. This is the URL for that. Can easily paste this to anyone to see the same interactive graph.
  11. If you prefer text, that’s also an option via the API. You can build cool tools using the API: week-over-week graphs; simplifies anomaly detection. The URL is pretty simple: effectively just use “q?” and add “&ascii”.
  12. Get audit log: logins, types of statements issued, etc.
  13. Get performance information about: row and index change activity; row read activity.
  14. Generate daily reports of: Are auto-increment columns nearing a boundary on a table? Number of records in a table. Size of a datafile for a table.
  15. Using pt-tcp-model. Allows us to identify when a server stops doing work. 5-minute interval.
  16. Aggregate graphs are the default. Drill down only when there are problems in the aggregate.
  17. Aggregates are the default – a shift in thinking from looking at specific important servers. Zooming in on a timeslice was painfully manual – I wrote up a patch to add mouse-zooming and upstreamed it. This cemented OpenTSDB as a powerful monitoring tool for Box, overnight. Auto-suggest for metrics is spotty – we wrote a quick cron job that dumps the full metric list into JSON. “Graphs aren’t pretty” – a few changes to the base GNUPlot options solved this. There’s also a “Smooth” option in the interface now. Migrating from POC – we had a single-node setup for the longest time until that fell over... a lot. Plan for 3+ machines – it’s enough to run all the needed bits for a light-weight distributed HBase and TSD setup. Data pruning – ~4 bytes per metric before HDFS replication adds up quickly. mysql_tcollector: 370 metrics, ~1.5k per server per collection; at a 30s interval that is ~4.2MB/day. Either have a plan to prune old data, or build out extra capacity and predict storage needs per server/metric added.
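
The storage estimate in note 17 is simple enough to check by hand or script. A rough sketch, assuming a hypothetical helper `bytes_per_day` and taking the note's own figures (370 metrics, ~4 bytes each, a 30-second interval) as inputs; it ignores HDFS replication and any row or tag overhead:

```shell
#!/usr/bin/env bash
# Hypothetical helper: estimate bytes stored per server per day, before
# HDFS replication, from metric count, bytes per data point, and interval.
bytes_per_day() {
    local metrics=$1 bytes_per_point=$2 interval_seconds=$3
    local collections_per_day=$((86400 / interval_seconds))
    echo $((metrics * bytes_per_point * collections_per_day))
}

# 370 metrics, ~4 bytes each, collected every 30 seconds.
bytes_per_day 370 4 30
```

With those inputs the result is 4262400 bytes, which lines up with the ~4.2MB/day per server quoted in the note; multiply by server count and the HDFS replication factor to size the cluster.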