SlideShare a Scribd company logo
1 of 41
Download to read offline
OpenTSDB 2.0
Distributed, Scalable Time Series Database
Benoit Sigoure tsunanet@gmail.com
Chris Larsen clarsen@llnw.com
Who We Are
Benoit Sigoure
● Created OpenTSDB at StumbleUpon
● Software Engineer @ Arista Networks
Chris Larsen
● Release manager for OpenTSDB 2.0
● Operations Engineer @ Limelight Networks
What Is OpenTSDB?
● Open Source Time Series Database
● Store trillions of data points
● Sucks up all data and keeps going
● Never lose precision
● Scales using HBase
What good is it?
● Systems Monitoring & Measurement
○ Servers
○ Networks
● Sensor Data
○ The Internet of Things
○ SCADA
● Financial Data
● Scientific Experiment Results
Use Cases
OVH: #3 largest cloud/hosting provider
Monitor everything: networking, temperature, voltage,
application performance, resource utilization,
customer-facing metrics, etc.
● 35 servers, 100k writes/s, 25TB raw data
● 5-day moving window of HBase snapshots
● Redis cache on top for customer-facing data
Use Cases
Yahoo
Monitoring application performance and statistics
● 15 servers, 280k writes/s
● Increased UID size to 4 bytes instead of 3, allowing
for over 4 billion values
● Looking at using HBase Append requests to avoid
TSD compactions
Use Cases
Arista Networks: High performance networking
● Single-node HBase (no HDFS) + 2 TSDs (one for
writing, one for reading)
● 5K writes per second, 500G of data, piece of cake
to deploy/maintain
● Varnish for caching
Some Other Users
● Box: 23 servers, 90K wps, System, app network, business metrics
● Limelight Networks: 8 servers, 30k wps, 24TB of data
● Ticketmaster: 13 servers, 90K wps, ~40GB a day
What Are Time Series?
● Time Series: data points for an identity
over time
● Typical Identity:
○ Dotted string: web01.sys.cpu.user.0
● OpenTSDB Identity:
○ Metric: sys.cpu.user
○ Tags (name/value pairs):
host=web01 cpu=0
What Are Time Series?
Data Point:
● Metric + Tags
● + Value: 42
● + Timestamp: 1234567890
sys.cpu.user 1234567890 42 host=web01 cpu=0
^ a data point ^
How it Works
Writing Data
1) Open Telnet style socket, write:
put sys.cpu.user 1234567890 42 host=web01 cpu=0
2) ..or with 2.0, post JSON to:
http://<host>:<port>/api/put
3) .. or import big files with CLI
● No schema definition
● No RRD file creation
● Just write!
Querying Data
● Graph with the GUI
● CLI tools
● HTTP API
● Aggregate multiple series
● Simple query language
To average all CPUs on host:
start=1h-ago
avg sys.cpu.user host=web01
HBase Data Tables
● tsdb - Data point table. Massive
● tsdb-uid - Name to UID and UID to
name mappings
● tsdb-meta - Time series index and
meta-data (new in 2.0)
● tsdb-tree - Config and index for
heirarchical naming schema (new
in 2.0)
Lets see how OpenTSDB uses
HBase...
UID Table Schema
● Integer UIDs assigned to each value per type (metric,
tagk, tagv) in tsdb-uid table
● 64 bit integers in row x00 reflect last used UID
CF:Qualifier Row Key UID
id:metric sys.cpu.user 1
id:tagk host 1
id:tagv web01 1
id:tagk cpu 2
id:tagv 0 2
Improved UID Assignment
● Pre 1.2 Assignment:
○ Client acquires lock on row x00
○ uid = getRequest(UID type)
○ Increment uid
○ getRequest(name) confirm name hasn’t been assigned
○ putRequest(type, uid)
○ putRequest(reverse map)
○ putRequest(forward map)
○ Release lock
● Lock held for a long time
● Puts overwrite data
Improved UID Assignment
● 1.2 & 2.0 Assignment:
○ atomicIncrementRequest(UID type)
○ getRequest(name) confirm name hasn’t been assigned
○ compareAndSet(reverse map)
○ compareAndSet(forward map)
● 3 atomic operations on different rows
● Much better concurrency with multiple TSDs assigning
UIDs
● CAS calls fail on existing data, log the error
Data Table Schema
● Row key is a concatenation of UIDs and time:
○ metric + timestamp + tagk1 + tagv1… + tagkN + tagvN
● sys.cpu.user 1234567890 42 host=web01 cpu=0
x00x00x01x49x95xFBx70x00x00x01x00x00x01x00x00x02x00x00x02
● Timestamp normalized on 1 hour boundaries
● All data points for an hour are stored in one row
● Enables fast scans of all time series for a metric
● …or pass a row key regex filter for specific time series
with particular tags
Data Table Schema
1.x Column Qualifiers:
● Type: floating point or integer value
● Value Length: number of bytes the value is encoded on
● Delta: offset from row timestamp in seconds
● Compacted columns concatenate an hour of qualifiers
and values into a single column
[ 0b11111111, 0b1111 0111 ]
<----------------> ^<->
delta (seconds) type value length
Data Table Schema
OpenTSDB 2.x Schema Design Goals
● Must maintain backwards compatibility
● Support millisecond precision timestamps
● Support other objects, e.g. annotations, blobs
○ E.g. could store quality measurements for each data
point
● Store millisecond and other objects in the same row as
data points
○ Scoops up all relevant time series data in one scan
Data Table Schema
Millisecond Support:
● Need 3,599,999 possible time offsets instead of 3,600
● Solution: 4 byte qualifier instead of 2
● Prefixed with xF0 to differentiate from a 2 value
compacted column
● Can mix second and millisecond precision in one row
● Still concatenates into one compacted column
[ 0b11111101, 0b10111011, 0b10011111, 0b11 000111 ]
<--><--------------------------------> ^^^<->
msec precision delta (milliseconds) type value length
2 unused bits
Data Table Schema
Annotations and Other Objects
● Odd number of bytes in qualifier (3 or 5)
● Prefix IDs: x01 = annotation, x02 = blob
● Remainder is offset in seconds or milliseconds
● Unbounded length, not meant for compaction
{
"tsuid": "000001000001000001",
"description": "Server Maintenance",
"notes": "Upgrading the server, ignore these values, winter is coming",
"endTime": 1369159800,
"startTime": 1369159200
}
TSUID Index and Meta Table
● Time Series UID = data table row key without timestamp
E.g. sys.cpu.user host=web01 cpu=0
x00x00x01x00x00x01x00x00x01x00x00x02x00x00x02
● Use TSUID as the row key
○ Can use a row key regex filter to scan for metrics with a tag pair
or the tags associated with a metric
● Atomic Increment for each data point
○ ts_counter column: Track number of data points written
○ ts_meta column: Async callback chain to create meta object if
increment returns a 1
○ Potentially doubles RPC count
OpenTSDB Trees
● Provide a hierarchical
representation of time series
○ Useful for Graphite or
browsing the data store
● Flexible rule system processes
metrics and tags
● Out of band creation or
process in real-time with
TSUID increment callback
chaining
OpenTSDB Trees
Example Time Series:
myapp.bytes_sent dc=dal host=web01
myapp.bytes_received dc=dal host=web01
Example Ruleset:
Level Order Rule
Type
Field Regex
0 0 tagk dc
1 0 tagk host
2 0 metric (.*)..*
3 0 metric .*.(*.)
OpenTSDB Trees
Example Results:
Flattened Names:
dal.web01.myapp.bytes_sent
dal.web01.myapp.bytes_received
Tree View :
● dal
○ web01
■ myapp
● bytes_sent
● bytes_received
^ leaves ^
Tree Table Schema
Column Qualifiers:
● tree: JSON object with tree description
● rule:<level>:<order>: JSON object definition of a rule
belonging to a tree
● branch: JSON object linking to child branches and/or leaves
● leaf:<tsuid>: JSON object linking to a specific TSUID
● tree_collision:<tsuid>: Time series was already included in
tree
● tree_not_matched:<tsuid>: Time series did not appear in tree
Tree Table Schema
● Row Keys = Tree ID + Branch ID
● Branch ID = 4 byte hashes of branch hierarchy
E.g. dal.web01.myapp.bytes_sent
dal = 0001838F(hex)
web01 = 06BC4C55
myapp = 06387CF5
bytes_sent = leaf pointing to a TSUID
Key for branch on Tree #1 =
00010001838F06BC4C5506387CF5
Tree Table Schema
Row Key CF: “t” Keys truncated, 1B tree ID, 2 byte hashes
x01 “tree”: {“name”:”Test Tree”} “rule0:0”: {<rule>} “rule1:0”: {<rule>}
x01x83x8F “branch”: {“name”:”dal”, “branches”:[{ “name”:”web01”},{“name”:”web01”}]}
x01x83x8Fx4Cx55 “branch”: {“name”:”web01”, “branches”:[{ “name”:”myapp”}]}
x01x83x8Fx4Cx55x7Cx5
F
“branch”: {“name”:”myapp”,
“branches”:null}
“leaf:<tsuid1>”: {“name”:”
bytes_sent”}
“leaf:<tsuid2>”: {“name”:”
bytes_received”}
x01x83x8Fx4Dx00 “branch”: {“name”:”web02”, “branches”:[{ “name”:”myapp”}]}
x01x83x8Fx4Dx00x7Cx5
F
“branch”: {“name”:”myapp”,
“branches”:null}
“leaf:<tsuid3>”: {“name”:”
bytes_sent”}
“leaf:<tsuid4>”: {“name”:”
bytes_sent”}
Row Key Regex for branch “dal”: ^Qx01x83x8FE(?:.{2})$
Matches branch “web01” and “web02” only.
Time Series Naming Optimization
● Designed for fast aggregation:
○ Average CPU usage across hosts in web pool
○ Total bytes sent from all hosts running application
MySQL
● High cardinality increases query latency
● Aim to minimize rows scanned
● Determine where aggregation is useful
○ May not care about average CPU usage across
entire network.
Therefore shift web tag to the metric name
Time Series Naming Optimization
Example Time Series
● sys.cpu.user host=web01 pool=web
● sys.cpu.user host=db01 pool=db
● host = 1m total unique values, 1,000 in web and 50 in db
● pool = 20 unique values
Very high cardinality for “sys.cpu.user”
● Query 1) 30 day query for “sys.cpu.user pool=db” scans 720m
rows (24h * 30d * 1m hosts) but only returns 36k (24h * 30d * 50)
● Query 2) 30 day query for “sys.cpu.user host=db01 pool=db” still
scans 720m rows but returns 36
Time Series Naming Optimization
Solution: Move pool tag values into metric name
● web.sys.cpu.user host=web01
● db.sys.cpu.user host=db01
Much lower cardinality for “sys.cpu.user”
● Query 1) 30 day query for “db.sys.cpu.user” scans 36k rows (24h
* 30d * 50 hosts) and returns all 36k (24h * 30d * 50)
● Query 2) 30 day query for “db.sys.cpu.user host=db01” still scans
36k rows and returns 36
New for OpenTSDB 2.0
● RESTful HTTP API
● Plugin support for de/serialization
● Emitter plugin support for publishing
○ Use for WebSockets/display
updates
○ Stream Processing
● Non-interpolating aggregation
functions
New for OpenTSDB 2.0
Special rate and counter functions
● Suppress spikes
Raw Rate: Counter Reset = 1000
OpenTSDB Front-Ends
Metrilyx from TicketMaster https://github.com/Ticketmaster/metrilyx-2.0
OpenTSDB Front-Ends
Status Wolf from Box https://github.com/box/StatusWolf
New in AsyncHBase 1.5
● AsyncHBase is a fully asynchronous, multi-
threaded HBase client
● Now supports HBase 0.96 / 0.98
● Remains 2x faster than HTable in
PerformanceEvaluation
● Support for scanner filters, META prefetch,
“fail-fast” RPCs
The Future of OpenTSDB
The Future
● Parallel scanners to improve
queries
● Coprocessor support for query
improvements similar to Salesforce’
s Phoenix with SkipScans
● Time series searching, lookups
(which tags belong to which
series?)
The Future
● Duplicate timestamp handling
● Counter increment and blob storage support
● Rollups/pre-aggregations
● Greater query flexibility:
○ Aggregate metrics
○ Regex on tags
○ Scalar calculations
More Information
● Thank you to everyone who has helped test, debug and add to OpenTSDB
2.0!
● Contribute at github.com/OpenTSDB/opentsdb
● Website: opentsdb.net
● Documentation: opentsdb.net/docs/build/html
● Mailing List: groups.google.com/group/opentsdb
Images
● http://photos.jdhancock.com/photo/2013-06-04-212438-the-lonely-vacuum-of-space.html
● http://en.wikipedia.org/wiki/File:Semi-automated-external-monitor-defibrillator.jpg
● http://upload.wikimedia.org/wikipedia/commons/1/17/Dining_table_for_two.jpg
● http://upload.wikimedia.org/wikipedia/commons/9/92/Easy_button.JPG
● http://upload.wikimedia.org/wikipedia/commons/4/42/PostItNotePad.JPG
● http://openclipart.org/image/300px/svg_to_png/68557/green_leaf_icon.png
● http://nickapedia.com/2012/06/05/api-all-the-things-razor-api-wiki/
● http://lego.cuusoo.com/ideas/view/96
● http://upload.wikimedia.org/wikipedia/commons/3/35/KL_Cyrix_FasMath_CX83D87.jpg
● http://www.flickr.com/photos/smkybear/4624182536/

More Related Content

What's hot

Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDBMike Dirolf
 
Graph Databases & OrientDB
Graph Databases & OrientDBGraph Databases & OrientDB
Graph Databases & OrientDBArpit Poladia
 
Disk and File System Management in Linux
Disk and File System Management in LinuxDisk and File System Management in Linux
Disk and File System Management in LinuxHenry Osborne
 
Course 102: Lecture 14: Users and Permissions
Course 102: Lecture 14: Users and PermissionsCourse 102: Lecture 14: Users and Permissions
Course 102: Lecture 14: Users and PermissionsAhmed El-Arabawy
 
Linux Presentation
Linux PresentationLinux Presentation
Linux Presentationnishantsri
 
MongoDB WiredTiger Internals
MongoDB WiredTiger InternalsMongoDB WiredTiger Internals
MongoDB WiredTiger InternalsNorberto Leite
 
Working with Databases and MySQL
Working with Databases and MySQLWorking with Databases and MySQL
Working with Databases and MySQLNicole Ryan
 
Web servers – features, installation and configuration
Web servers – features, installation and configurationWeb servers – features, installation and configuration
Web servers – features, installation and configurationwebhostingguy
 
Non relational databases-no sql
Non relational databases-no sqlNon relational databases-no sql
Non relational databases-no sqlRam kumar
 
Management file and directory in linux
Management file and directory in linuxManagement file and directory in linux
Management file and directory in linuxZkre Saleh
 
Percona Xtrabackup - Highly Efficient Backups
Percona Xtrabackup - Highly Efficient BackupsPercona Xtrabackup - Highly Efficient Backups
Percona Xtrabackup - Highly Efficient BackupsMydbops
 

What's hot (20)

database
databasedatabase
database
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
 
CouchDB
CouchDBCouchDB
CouchDB
 
Graph Databases & OrientDB
Graph Databases & OrientDBGraph Databases & OrientDB
Graph Databases & OrientDB
 
Disk and File System Management in Linux
Disk and File System Management in LinuxDisk and File System Management in Linux
Disk and File System Management in Linux
 
Session 14 - Hive
Session 14 - HiveSession 14 - Hive
Session 14 - Hive
 
Course 102: Lecture 14: Users and Permissions
Course 102: Lecture 14: Users and PermissionsCourse 102: Lecture 14: Users and Permissions
Course 102: Lecture 14: Users and Permissions
 
Linux Presentation
Linux PresentationLinux Presentation
Linux Presentation
 
MongoDB WiredTiger Internals
MongoDB WiredTiger InternalsMongoDB WiredTiger Internals
MongoDB WiredTiger Internals
 
Working with Databases and MySQL
Working with Databases and MySQLWorking with Databases and MySQL
Working with Databases and MySQL
 
Web servers – features, installation and configuration
Web servers – features, installation and configurationWeb servers – features, installation and configuration
Web servers – features, installation and configuration
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to Redis
 
Non relational databases-no sql
Non relational databases-no sqlNon relational databases-no sql
Non relational databases-no sql
 
MongoDB
MongoDBMongoDB
MongoDB
 
Management file and directory in linux
Management file and directory in linuxManagement file and directory in linux
Management file and directory in linux
 
6.hive
6.hive6.hive
6.hive
 
Percona Xtrabackup - Highly Efficient Backups
Percona Xtrabackup - Highly Efficient BackupsPercona Xtrabackup - Highly Efficient Backups
Percona Xtrabackup - Highly Efficient Backups
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to Redis
 
Linux
Linux Linux
Linux
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
 

Similar to OpenTSDB 2.0

HBaseCon 2015: OpenTSDB and AsyncHBase Update
HBaseCon 2015: OpenTSDB and AsyncHBase UpdateHBaseCon 2015: OpenTSDB and AsyncHBase Update
HBaseCon 2015: OpenTSDB and AsyncHBase UpdateHBaseCon
 
Update on OpenTSDB and AsyncHBase
Update on OpenTSDB and AsyncHBase Update on OpenTSDB and AsyncHBase
Update on OpenTSDB and AsyncHBase HBaseCon
 
Update on OpenTSDB and AsyncHBase
Update on OpenTSDB and AsyncHBase Update on OpenTSDB and AsyncHBase
Update on OpenTSDB and AsyncHBase HBaseCon
 
Argus Production Monitoring at Salesforce
Argus Production Monitoring at SalesforceArgus Production Monitoring at Salesforce
Argus Production Monitoring at SalesforceHBaseCon
 
Argus Production Monitoring at Salesforce
Argus Production Monitoring at Salesforce Argus Production Monitoring at Salesforce
Argus Production Monitoring at Salesforce HBaseCon
 
Extending Spark for Qbeast's SQL Data Source​ with Paola Pardo and Cesare Cug...
Extending Spark for Qbeast's SQL Data Source​ with Paola Pardo and Cesare Cug...Extending Spark for Qbeast's SQL Data Source​ with Paola Pardo and Cesare Cug...
Extending Spark for Qbeast's SQL Data Source​ with Paola Pardo and Cesare Cug...Qbeast
 
OpenTSDB: HBaseCon2017
OpenTSDB: HBaseCon2017OpenTSDB: HBaseCon2017
OpenTSDB: HBaseCon2017HBaseCon
 
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...Flink Forward
 
Lambda at Weather Scale - Cassandra Summit 2015
Lambda at Weather Scale - Cassandra Summit 2015Lambda at Weather Scale - Cassandra Summit 2015
Lambda at Weather Scale - Cassandra Summit 2015Robbie Strickland
 
Streaming Data from Cassandra into Kafka
Streaming Data from Cassandra into KafkaStreaming Data from Cassandra into Kafka
Streaming Data from Cassandra into KafkaAbrar Sheikh
 
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streamin...
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streamin...Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streamin...
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streamin...HostedbyConfluent
 
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streaming
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to StreamingBravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streaming
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to StreamingYaroslav Tkachenko
 
How Netflix Uses Amazon Kinesis Streams to Monitor and Optimize Large-scale N...
How Netflix Uses Amazon Kinesis Streams to Monitor and Optimize Large-scale N...How Netflix Uses Amazon Kinesis Streams to Monitor and Optimize Large-scale N...
How Netflix Uses Amazon Kinesis Streams to Monitor and Optimize Large-scale N...Amazon Web Services
 
Parquet performance tuning: the missing guide
Parquet performance tuning: the missing guideParquet performance tuning: the missing guide
Parquet performance tuning: the missing guideRyan Blue
 
RMLL 2013 - Synchronize OpenLDAP and Active Directory with LSC
RMLL 2013 - Synchronize OpenLDAP and Active Directory with LSCRMLL 2013 - Synchronize OpenLDAP and Active Directory with LSC
RMLL 2013 - Synchronize OpenLDAP and Active Directory with LSCClément OUDOT
 
Introducing TiDB [Delivered: 09/27/18 at NYC SQL Meetup]
Introducing TiDB [Delivered: 09/27/18 at NYC SQL Meetup]Introducing TiDB [Delivered: 09/27/18 at NYC SQL Meetup]
Introducing TiDB [Delivered: 09/27/18 at NYC SQL Meetup]Kevin Xu
 
Go Big or Go Home: Approaching Kafka Replication at Scale
Go Big or Go Home: Approaching Kafka Replication at ScaleGo Big or Go Home: Approaching Kafka Replication at Scale
Go Big or Go Home: Approaching Kafka Replication at ScaleHostedbyConfluent
 
Re-Engineering PostgreSQL as a Time-Series Database
Re-Engineering PostgreSQL as a Time-Series DatabaseRe-Engineering PostgreSQL as a Time-Series Database
Re-Engineering PostgreSQL as a Time-Series DatabaseAll Things Open
 
Sorry - How Bieber broke Google Cloud at Spotify
Sorry - How Bieber broke Google Cloud at SpotifySorry - How Bieber broke Google Cloud at Spotify
Sorry - How Bieber broke Google Cloud at SpotifyNeville Li
 
Neo4j after 1 year in production
Neo4j after 1 year in productionNeo4j after 1 year in production
Neo4j after 1 year in productionAndrew Nikishaev
 

Similar to OpenTSDB 2.0 (20)

HBaseCon 2015: OpenTSDB and AsyncHBase Update
HBaseCon 2015: OpenTSDB and AsyncHBase UpdateHBaseCon 2015: OpenTSDB and AsyncHBase Update
HBaseCon 2015: OpenTSDB and AsyncHBase Update
 
Update on OpenTSDB and AsyncHBase
Update on OpenTSDB and AsyncHBase Update on OpenTSDB and AsyncHBase
Update on OpenTSDB and AsyncHBase
 
Update on OpenTSDB and AsyncHBase
Update on OpenTSDB and AsyncHBase Update on OpenTSDB and AsyncHBase
Update on OpenTSDB and AsyncHBase
 
Argus Production Monitoring at Salesforce
Argus Production Monitoring at SalesforceArgus Production Monitoring at Salesforce
Argus Production Monitoring at Salesforce
 
Argus Production Monitoring at Salesforce
Argus Production Monitoring at Salesforce Argus Production Monitoring at Salesforce
Argus Production Monitoring at Salesforce
 
Extending Spark for Qbeast's SQL Data Source​ with Paola Pardo and Cesare Cug...
Extending Spark for Qbeast's SQL Data Source​ with Paola Pardo and Cesare Cug...Extending Spark for Qbeast's SQL Data Source​ with Paola Pardo and Cesare Cug...
Extending Spark for Qbeast's SQL Data Source​ with Paola Pardo and Cesare Cug...
 
OpenTSDB: HBaseCon2017
OpenTSDB: HBaseCon2017OpenTSDB: HBaseCon2017
OpenTSDB: HBaseCon2017
 
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
 
Lambda at Weather Scale - Cassandra Summit 2015
Lambda at Weather Scale - Cassandra Summit 2015Lambda at Weather Scale - Cassandra Summit 2015
Lambda at Weather Scale - Cassandra Summit 2015
 
Streaming Data from Cassandra into Kafka
Streaming Data from Cassandra into KafkaStreaming Data from Cassandra into Kafka
Streaming Data from Cassandra into Kafka
 
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streamin...
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streamin...Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streamin...
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streamin...
 
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streaming
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to StreamingBravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streaming
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streaming
 
How Netflix Uses Amazon Kinesis Streams to Monitor and Optimize Large-scale N...
How Netflix Uses Amazon Kinesis Streams to Monitor and Optimize Large-scale N...How Netflix Uses Amazon Kinesis Streams to Monitor and Optimize Large-scale N...
How Netflix Uses Amazon Kinesis Streams to Monitor and Optimize Large-scale N...
 
Parquet performance tuning: the missing guide
Parquet performance tuning: the missing guideParquet performance tuning: the missing guide
Parquet performance tuning: the missing guide
 
RMLL 2013 - Synchronize OpenLDAP and Active Directory with LSC
RMLL 2013 - Synchronize OpenLDAP and Active Directory with LSCRMLL 2013 - Synchronize OpenLDAP and Active Directory with LSC
RMLL 2013 - Synchronize OpenLDAP and Active Directory with LSC
 
Introducing TiDB [Delivered: 09/27/18 at NYC SQL Meetup]
Introducing TiDB [Delivered: 09/27/18 at NYC SQL Meetup]Introducing TiDB [Delivered: 09/27/18 at NYC SQL Meetup]
Introducing TiDB [Delivered: 09/27/18 at NYC SQL Meetup]
 
Go Big or Go Home: Approaching Kafka Replication at Scale
Go Big or Go Home: Approaching Kafka Replication at ScaleGo Big or Go Home: Approaching Kafka Replication at Scale
Go Big or Go Home: Approaching Kafka Replication at Scale
 
Re-Engineering PostgreSQL as a Time-Series Database
Re-Engineering PostgreSQL as a Time-Series DatabaseRe-Engineering PostgreSQL as a Time-Series Database
Re-Engineering PostgreSQL as a Time-Series Database
 
Sorry - How Bieber broke Google Cloud at Spotify
Sorry - How Bieber broke Google Cloud at SpotifySorry - How Bieber broke Google Cloud at Spotify
Sorry - How Bieber broke Google Cloud at Spotify
 
Neo4j after 1 year in production
Neo4j after 1 year in productionNeo4j after 1 year in production
Neo4j after 1 year in production
 

More from HBaseCon

hbaseconasia2017: Building online HBase cluster of Zhihu based on Kubernetes
hbaseconasia2017: Building online HBase cluster of Zhihu based on Kuberneteshbaseconasia2017: Building online HBase cluster of Zhihu based on Kubernetes
hbaseconasia2017: Building online HBase cluster of Zhihu based on KubernetesHBaseCon
 
hbaseconasia2017: HBase on Beam
hbaseconasia2017: HBase on Beamhbaseconasia2017: HBase on Beam
hbaseconasia2017: HBase on BeamHBaseCon
 
hbaseconasia2017: HBase Disaster Recovery Solution at Huawei
hbaseconasia2017: HBase Disaster Recovery Solution at Huaweihbaseconasia2017: HBase Disaster Recovery Solution at Huawei
hbaseconasia2017: HBase Disaster Recovery Solution at HuaweiHBaseCon
 
hbaseconasia2017: Removable singularity: a story of HBase upgrade in Pinterest
hbaseconasia2017: Removable singularity: a story of HBase upgrade in Pinteresthbaseconasia2017: Removable singularity: a story of HBase upgrade in Pinterest
hbaseconasia2017: Removable singularity: a story of HBase upgrade in PinterestHBaseCon
 
hbaseconasia2017: HareQL:快速HBase查詢工具的發展過程
hbaseconasia2017: HareQL:快速HBase查詢工具的發展過程hbaseconasia2017: HareQL:快速HBase查詢工具的發展過程
hbaseconasia2017: HareQL:快速HBase查詢工具的發展過程HBaseCon
 
hbaseconasia2017: Apache HBase at Netease
hbaseconasia2017: Apache HBase at Neteasehbaseconasia2017: Apache HBase at Netease
hbaseconasia2017: Apache HBase at NeteaseHBaseCon
 
hbaseconasia2017: HBase在Hulu的使用和实践
hbaseconasia2017: HBase在Hulu的使用和实践hbaseconasia2017: HBase在Hulu的使用和实践
hbaseconasia2017: HBase在Hulu的使用和实践HBaseCon
 
hbaseconasia2017: 基于HBase的企业级大数据平台
hbaseconasia2017: 基于HBase的企业级大数据平台hbaseconasia2017: 基于HBase的企业级大数据平台
hbaseconasia2017: 基于HBase的企业级大数据平台HBaseCon
 
hbaseconasia2017: HBase at JD.com
hbaseconasia2017: HBase at JD.comhbaseconasia2017: HBase at JD.com
hbaseconasia2017: HBase at JD.comHBaseCon
 
hbaseconasia2017: Large scale data near-line loading method and architecture
hbaseconasia2017: Large scale data near-line loading method and architecturehbaseconasia2017: Large scale data near-line loading method and architecture
hbaseconasia2017: Large scale data near-line loading method and architectureHBaseCon
 
hbaseconasia2017: Ecosystems with HBase and CloudTable service at Huawei
hbaseconasia2017: Ecosystems with HBase and CloudTable service at Huaweihbaseconasia2017: Ecosystems with HBase and CloudTable service at Huawei
hbaseconasia2017: Ecosystems with HBase and CloudTable service at HuaweiHBaseCon
 
hbaseconasia2017: HBase Practice At XiaoMi
hbaseconasia2017: HBase Practice At XiaoMihbaseconasia2017: HBase Practice At XiaoMi
hbaseconasia2017: HBase Practice At XiaoMiHBaseCon
 
hbaseconasia2017: hbase-2.0.0
hbaseconasia2017: hbase-2.0.0hbaseconasia2017: hbase-2.0.0
hbaseconasia2017: hbase-2.0.0HBaseCon
 
HBaseCon2017 Democratizing HBase
HBaseCon2017 Democratizing HBaseHBaseCon2017 Democratizing HBase
HBaseCon2017 Democratizing HBaseHBaseCon
 
HBaseCon2017 Removable singularity: a story of HBase upgrade in Pinterest
HBaseCon2017 Removable singularity: a story of HBase upgrade in PinterestHBaseCon2017 Removable singularity: a story of HBase upgrade in Pinterest
HBaseCon2017 Removable singularity: a story of HBase upgrade in PinterestHBaseCon
 
HBaseCon2017 Quanta: Quora's hierarchical counting system on HBase
HBaseCon2017 Quanta: Quora's hierarchical counting system on HBaseHBaseCon2017 Quanta: Quora's hierarchical counting system on HBase
HBaseCon2017 Quanta: Quora's hierarchical counting system on HBaseHBaseCon
 
HBaseCon2017 Transactions in HBase
HBaseCon2017 Transactions in HBaseHBaseCon2017 Transactions in HBase
HBaseCon2017 Transactions in HBaseHBaseCon
 
HBaseCon2017 Highly-Available HBase
HBaseCon2017 Highly-Available HBaseHBaseCon2017 Highly-Available HBase
HBaseCon2017 Highly-Available HBaseHBaseCon
 
HBaseCon2017 Apache HBase at Didi
HBaseCon2017 Apache HBase at DidiHBaseCon2017 Apache HBase at Didi
HBaseCon2017 Apache HBase at DidiHBaseCon
 
HBaseCon2017 gohbase: Pure Go HBase Client
HBaseCon2017 gohbase: Pure Go HBase ClientHBaseCon2017 gohbase: Pure Go HBase Client
HBaseCon2017 gohbase: Pure Go HBase ClientHBaseCon
 

More from HBaseCon (20)

hbaseconasia2017: Building online HBase cluster of Zhihu based on Kubernetes
hbaseconasia2017: Building online HBase cluster of Zhihu based on Kuberneteshbaseconasia2017: Building online HBase cluster of Zhihu based on Kubernetes
hbaseconasia2017: Building online HBase cluster of Zhihu based on Kubernetes
 
hbaseconasia2017: HBase on Beam
hbaseconasia2017: HBase on Beamhbaseconasia2017: HBase on Beam
hbaseconasia2017: HBase on Beam
 
hbaseconasia2017: HBase Disaster Recovery Solution at Huawei
hbaseconasia2017: HBase Disaster Recovery Solution at Huaweihbaseconasia2017: HBase Disaster Recovery Solution at Huawei
hbaseconasia2017: HBase Disaster Recovery Solution at Huawei
 
hbaseconasia2017: Removable singularity: a story of HBase upgrade in Pinterest
hbaseconasia2017: Removable singularity: a story of HBase upgrade in Pinteresthbaseconasia2017: Removable singularity: a story of HBase upgrade in Pinterest
hbaseconasia2017: Removable singularity: a story of HBase upgrade in Pinterest
 
hbaseconasia2017: HareQL:快速HBase查詢工具的發展過程
hbaseconasia2017: HareQL:快速HBase查詢工具的發展過程hbaseconasia2017: HareQL:快速HBase查詢工具的發展過程
hbaseconasia2017: HareQL:快速HBase查詢工具的發展過程
 
hbaseconasia2017: Apache HBase at Netease
hbaseconasia2017: Apache HBase at Neteasehbaseconasia2017: Apache HBase at Netease
hbaseconasia2017: Apache HBase at Netease
 
hbaseconasia2017: HBase在Hulu的使用和实践
hbaseconasia2017: HBase在Hulu的使用和实践hbaseconasia2017: HBase在Hulu的使用和实践
hbaseconasia2017: HBase在Hulu的使用和实践
 
hbaseconasia2017: 基于HBase的企业级大数据平台
hbaseconasia2017: 基于HBase的企业级大数据平台hbaseconasia2017: 基于HBase的企业级大数据平台
hbaseconasia2017: 基于HBase的企业级大数据平台
 
hbaseconasia2017: HBase at JD.com
hbaseconasia2017: HBase at JD.comhbaseconasia2017: HBase at JD.com
hbaseconasia2017: HBase at JD.com
 
hbaseconasia2017: Large scale data near-line loading method and architecture
hbaseconasia2017: Large scale data near-line loading method and architecturehbaseconasia2017: Large scale data near-line loading method and architecture
hbaseconasia2017: Large scale data near-line loading method and architecture
 
hbaseconasia2017: Ecosystems with HBase and CloudTable service at Huawei
hbaseconasia2017: Ecosystems with HBase and CloudTable service at Huaweihbaseconasia2017: Ecosystems with HBase and CloudTable service at Huawei
hbaseconasia2017: Ecosystems with HBase and CloudTable service at Huawei
 
hbaseconasia2017: HBase Practice At XiaoMi
hbaseconasia2017: HBase Practice At XiaoMihbaseconasia2017: HBase Practice At XiaoMi
hbaseconasia2017: HBase Practice At XiaoMi
 
hbaseconasia2017: hbase-2.0.0
hbaseconasia2017: hbase-2.0.0hbaseconasia2017: hbase-2.0.0
hbaseconasia2017: hbase-2.0.0
 
HBaseCon2017 Democratizing HBase
HBaseCon2017 Democratizing HBaseHBaseCon2017 Democratizing HBase
HBaseCon2017 Democratizing HBase
 
HBaseCon2017 Removable singularity: a story of HBase upgrade in Pinterest
HBaseCon2017 Removable singularity: a story of HBase upgrade in PinterestHBaseCon2017 Removable singularity: a story of HBase upgrade in Pinterest
HBaseCon2017 Removable singularity: a story of HBase upgrade in Pinterest
 
HBaseCon2017 Quanta: Quora's hierarchical counting system on HBase
HBaseCon2017 Quanta: Quora's hierarchical counting system on HBaseHBaseCon2017 Quanta: Quora's hierarchical counting system on HBase
HBaseCon2017 Quanta: Quora's hierarchical counting system on HBase
 
HBaseCon2017 Transactions in HBase
HBaseCon2017 Transactions in HBaseHBaseCon2017 Transactions in HBase
HBaseCon2017 Transactions in HBase
 
HBaseCon2017 Highly-Available HBase
HBaseCon2017 Highly-Available HBaseHBaseCon2017 Highly-Available HBase
HBaseCon2017 Highly-Available HBase
 
HBaseCon2017 Apache HBase at Didi
HBaseCon2017 Apache HBase at DidiHBaseCon2017 Apache HBase at Didi
HBaseCon2017 Apache HBase at Didi
 
HBaseCon2017 gohbase: Pure Go HBase Client
HBaseCon2017 gohbase: Pure Go HBase ClientHBaseCon2017 gohbase: Pure Go HBase Client
HBaseCon2017 gohbase: Pure Go HBase Client
 

Recently uploaded

Xen Safety Embedded OSS Summit April 2024 v4.pdf
Xen Safety Embedded OSS Summit April 2024 v4.pdfXen Safety Embedded OSS Summit April 2024 v4.pdf
Xen Safety Embedded OSS Summit April 2024 v4.pdfStefano Stabellini
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作qr0udbr0
 
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanySuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanyChristoph Pohl
 
Machine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringMachine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringHironori Washizaki
 
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company OdishaBalasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odishasmiwainfosol
 
Sending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdfSending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdf31events.com
 
Intelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmIntelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmSujith Sukumaran
 
Comparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdfComparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdfDrew Moseley
 
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Angel Borroy López
 
Precise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalPrecise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalLionel Briand
 
Implementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureImplementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureDinusha Kumarasiri
 
Folding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesFolding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesPhilip Schwarz
 
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfGOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfAlina Yurenko
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEEVICTOR MAESTRE RAMIREZ
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesŁukasz Chruściel
 
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...OnePlan Solutions
 
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Natan Silnitsky
 
MYjobs Presentation Django-based project
MYjobs Presentation Django-based projectMYjobs Presentation Django-based project
MYjobs Presentation Django-based projectAnoyGreter
 
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Matt Ray
 
Powering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsPowering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsSafe Software
 

Recently uploaded (20)

Xen Safety Embedded OSS Summit April 2024 v4.pdf
Xen Safety Embedded OSS Summit April 2024 v4.pdfXen Safety Embedded OSS Summit April 2024 v4.pdf
Xen Safety Embedded OSS Summit April 2024 v4.pdf
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作
 
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanySuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
 
Machine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringMachine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their Engineering
 
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company OdishaBalasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
 
Sending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdfSending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdf
 
Intelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmIntelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalm
 
Comparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdfComparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdf
 
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
 
Precise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalPrecise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive Goal
 
Implementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureImplementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with Azure
 
Folding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesFolding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a series
 
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfGOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEE
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New Features
 
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
 
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
 
MYjobs Presentation Django-based project
MYjobs Presentation Django-based projectMYjobs Presentation Django-based project
MYjobs Presentation Django-based project
 
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
 
Powering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsPowering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data Streams
 

OpenTSDB 2.0

  • 1. OpenTSDB 2.0 Distributed, Scalable Time Series Database Benoit Sigoure tsunanet@gmail.com Chris Larsen clarsen@llnw.com
  • 2. Who We Are Benoit Sigoure ● Created OpenTSDB at StumbleUpon ● Software Engineer @ Arista Networks Chris Larsen ● Release manager for OpenTSDB 2.0 ● Operations Engineer @ Limelight Networks
  • 3. What Is OpenTSDB? ● Open Source Time Series Database ● Store trillions of data points ● Sucks up all data and keeps going ● Never lose precision ● Scales using HBase
  • 4. What good is it? ● Systems Monitoring & Measurement ○ Servers ○ Networks ● Sensor Data ○ The Internet of Things ○ SCADA ● Financial Data ● Scientific Experiment Results
  • 5. Use Cases OVH: #3 largest cloud/hosting provider Monitor everything: networking, temperature, voltage, application performance, resource utilization, customer-facing metrics, etc. ● 35 servers, 100k writes/s, 25TB raw data ● 5-day moving window of HBase snapshots ● Redis cache on top for customer-facing data
  • 6. Use Cases Yahoo Monitoring application performance and statistics ● 15 servers, 280k writes/s ● Increased UID size to 4 bytes instead of 3, allowing for over 4 billion values ● Looking at using HBase Append requests to avoid TSD compactions
  • 7. Use Cases Arista Networks: High performance networking ● Single-node HBase (no HDFS) + 2 TSDs (one for writing, one for reading) ● 5K writes per second, 500G of data, piece of cake to deploy/maintain ● Varnish for caching
  • 8. Some Other Users ● Box: 23 servers, 90K wps, System, app network, business metrics ● Limelight Networks: 8 servers, 30k wps, 24TB of data ● Ticketmaster: 13 servers, 90K wps, ~40GB a day
  • 9. What Are Time Series? ● Time Series: data points for an identity over time ● Typical Identity: ○ Dotted string: web01.sys.cpu.user.0 ● OpenTSDB Identity: ○ Metric: sys.cpu.user ○ Tags (name/value pairs): host=web01 cpu=0
  • 10. What Are Time Series? Data Point: ● Metric + Tags ● + Value: 42 ● + Timestamp: 1234567890 sys.cpu.user 1234567890 42 host=web01 cpu=0 ^ a data point ^
  • 12. Writing Data 1) Open Telnet style socket, write: put sys.cpu.user 1234567890 42 host=web01 cpu=0 2) ..or with 2.0, post JSON to: http://<host>:<port>/api/put 3) .. or import big files with CLI ● No schema definition ● No RRD file creation ● Just write!
  • 13. Querying Data ● Graph with the GUI ● CLI tools ● HTTP API ● Aggregate multiple series ● Simple query language To average all CPUs on host: start=1h-ago avg sys.cpu.user host=web01
  • 14. HBase Data Tables ● tsdb - Data point table. Massive ● tsdb-uid - Name to UID and UID to name mappings ● tsdb-meta - Time series index and meta-data (new in 2.0) ● tsdb-tree - Config and index for heirarchical naming schema (new in 2.0) Lets see how OpenTSDB uses HBase...
  • 15. UID Table Schema ● Integer UIDs assigned to each value per type (metric, tagk, tagv) in tsdb-uid table ● 64 bit integers in row x00 reflect last used UID CF:Qualifier Row Key UID id:metric sys.cpu.user 1 id:tagk host 1 id:tagv web01 1 id:tagk cpu 2 id:tagv 0 2
  • 16. Improved UID Assignment ● Pre 1.2 Assignment: ○ Client acquires lock on row x00 ○ uid = getRequest(UID type) ○ Increment uid ○ getRequest(name) confirm name hasn’t been assigned ○ putRequest(type, uid) ○ putRequest(reverse map) ○ putRequest(forward map) ○ Release lock ● Lock held for a long time ● Puts overwrite data
  • 17. Improved UID Assignment ● 1.2 & 2.0 Assignment: ○ atomicIncrementRequest(UID type) ○ getRequest(name) confirm name hasn’t been assigned ○ compareAndSet(reverse map) ○ compareAndSet(forward map) ● 3 atomic operations on different rows ● Much better concurrency with multiple TSDs assigning UIDs ● CAS calls fail on existing data, log the error
  • 18. Data Table Schema ● Row key is a concatenation of UIDs and time: ○ metric + timestamp + tagk1 + tagv1… + tagkN + tagvN ● sys.cpu.user 1234567890 42 host=web01 cpu=0 x00x00x01x49x95xFBx70x00x00x01x00x00x01x00x00x02x00x00x02 ● Timestamp normalized on 1 hour boundaries ● All data points for an hour are stored in one row ● Enables fast scans of all time series for a metric ● …or pass a row key regex filter for specific time series with particular tags
  • 19. Data Table Schema 1.x Column Qualifiers: ● Type: floating point or integer value ● Value Length: number of bytes the value is encoded on ● Delta: offset from row timestamp in seconds ● Compacted columns concatenate an hour of qualifiers and values into a single column [ 0b11111111, 0b1111 0111 ] <----------------> ^<-> delta (seconds) type value length
  • 20. Data Table Schema OpenTSDB 2.x Schema Design Goals ● Must maintain backwards compatibility ● Support millisecond precision timestamps ● Support other objects, e.g. annotations, blobs ○ E.g. could store quality measurements for each data point ● Store millisecond and other objects in the same row as data points ○ Scoops up all relevant time series data in one scan
  • 21. Data Table Schema Millisecond Support: ● Need 3,599,999 possible time offsets instead of 3,600 ● Solution: 4 byte qualifier instead of 2 ● Prefixed with xF0 to differentiate from a 2 value compacted column ● Can mix second and millisecond precision in one row ● Still concatenates into one compacted column [ 0b11111101, 0b10111011, 0b10011111, 0b11 000111 ] <--><--------------------------------> ^^^<-> msec precision delta (milliseconds) type value length 2 unused bits
  • 22. Data Table Schema Annotations and Other Objects ● Odd number of bytes in qualifier (3 or 5) ● Prefix IDs: x01 = annotation, x02 = blob ● Remainder is offset in seconds or milliseconds ● Unbounded length, not meant for compaction { "tsuid": "000001000001000001", "description": "Server Maintenance", "notes": "Upgrading the server, ignore these values, winter is coming", "endTime": 1369159800, "startTime": 1369159200 }
  • 23. TSUID Index and Meta Table ● Time Series UID = data table row key without timestamp E.g. sys.cpu.user host=web01 cpu=0 x00x00x01x00x00x01x00x00x01x00x00x02x00x00x02 ● Use TSUID as the row key ○ Can use a row key regex filter to scan for metrics with a tag pair or the tags associated with a metric ● Atomic Increment for each data point ○ ts_counter column: Track number of data points written ○ ts_meta column: Async callback chain to create meta object if increment returns a 1 ○ Potentially doubles RPC count
  • 24. OpenTSDB Trees ● Provide a hierarchical representation of time series ○ Useful for Graphite or browsing the data store ● Flexible rule system processes metrics and tags ● Out of band creation or process in real-time with TSUID increment callback chaining
  • 25. OpenTSDB Trees Example Time Series: myapp.bytes_sent dc=dal host=web01 myapp.bytes_received dc=dal host=web01 Example Ruleset: Level Order Rule Type Field Regex 0 0 tagk dc 1 0 tagk host 2 0 metric (.*)..* 3 0 metric .*.(*.)
  • 26. OpenTSDB Trees Example Results: Flattened Names: dal.web01.myapp.bytes_sent dal.web01.myapp.bytes_received Tree View : ● dal ○ web01 ■ myapp ● bytes_sent ● bytes_received ^ leaves ^
  • 27. Tree Table Schema Column Qualifiers: ● tree: JSON object with tree description ● rule:<level>:<order>: JSON object definition of a rule belonging to a tree ● branch: JSON object linking to child branches and/or leaves ● leaf:<tsuid>: JSON object linking to a specific TSUID ● tree_collision:<tsuid>: Time series was already included in tree ● tree_not_matched:<tsuid>: Time series did not appear in tree
  • 28. Tree Table Schema ● Row Keys = Tree ID + Branch ID ● Branch ID = 4 byte hashes of branch hierarchy E.g. dal.web01.myapp.bytes_sent dal = 0001838F(hex) web01 = 06BC4C55 myapp = 06387CF5 bytes_sent = leaf pointing to a TSUID Key for branch on Tree #1 = 00010001838F06BC4C5506387CF5
  • 29. Tree Table Schema Row Key CF: “t” Keys truncated, 1B tree ID, 2 byte hashes x01 “tree”: {“name”:”Test Tree”} “rule0:0”: {<rule>} “rule1:0”: {<rule>} x01x83x8F “branch”: {“name”:”dal”, “branches”:[{ “name”:”web01”},{“name”:”web01”}]} x01x83x8Fx4Cx55 “branch”: {“name”:”web01”, “branches”:[{ “name”:”myapp”}]} x01x83x8Fx4Cx55x7Cx5 F “branch”: {“name”:”myapp”, “branches”:null} “leaf:<tsuid1>”: {“name”:” bytes_sent”} “leaf:<tsuid2>”: {“name”:” bytes_received”} x01x83x8Fx4Dx00 “branch”: {“name”:”web02”, “branches”:[{ “name”:”myapp”}]} x01x83x8Fx4Dx00x7Cx5 F “branch”: {“name”:”myapp”, “branches”:null} “leaf:<tsuid3>”: {“name”:” bytes_sent”} “leaf:<tsuid4>”: {“name”:” bytes_sent”} Row Key Regex for branch “dal”: ^Qx01x83x8FE(?:.{2})$ Matches branch “web01” and “web02” only.
  • 30. Time Series Naming Optimization ● Designed for fast aggregation: ○ Average CPU usage across hosts in web pool ○ Total bytes sent from all hosts running application MySQL ● High cardinality increases query latency ● Aim to minimize rows scanned ● Determine where aggregation is useful ○ May not care about average CPU usage across entire network. Therefore shift web tag to the metric name
  • 31. Time Series Naming Optimization Example Time Series ● sys.cpu.user host=web01 pool=web ● sys.cpu.user host=db01 pool=db ● host = 1m total unique values, 1,000 in web and 50 in db ● pool = 20 unique values Very high cardinality for “sys.cpu.user” ● Query 1) 30 day query for “sys.cpu.user pool=db” scans 720m rows (24h * 30d * 1m hosts) but only returns 36k (24h * 30d * 50) ● Query 2) 30 day query for “sys.cpu.user host=db01 pool=db” still scans 720m rows but returns 36
  • 32. Time Series Naming Optimization Solution: Move pool tag values into metric name ● web.sys.cpu.user host=web01 ● db.sys.cpu.user host=db01 Much lower cardinality for “sys.cpu.user” ● Query 1) 30 day query for “db.sys.cpu.user” scans 36k rows (24h * 30d * 50 hosts) and returns all 36k (24h * 30d * 50) ● Query 2) 30 day query for “db.sys.cpu.user host=db01” still scans 36k rows and returns 36
  • 33. New for OpenTSDB 2.0 ● RESTful HTTP API ● Plugin support for de/serialization ● Emitter plugin support for publishing ○ Use for WebSockets/display updates ○ Stream Processing ● Non-interpolating aggregation functions
  • 34. New for OpenTSDB 2.0 Special rate and counter functions ● Suppress spikes Raw Rate: Counter Reset = 1000
  • 35. OpenTSDB Front-Ends Metrilyx from TicketMaster https://github.com/Ticketmaster/metrilyx-2.0
  • 36. OpenTSDB Front-Ends Status Wolf from Box https://github.com/box/StatusWolf
  • 37. New in AsyncHBase 1.5 ● AsyncHBase is a fully asynchronous, multi- threaded HBase client ● Now supports HBase 0.96 / 0.98 ● Remains 2x faster than HTable in PerformanceEvaluation ● Support for scanner filters, META prefetch, “fail-fast” RPCs
  • 38. The Future of OpenTSDB
  • 39. The Future ● Parallel scanners to improve queries ● Coprocessor support for query improvements similar to Salesforce’ s Phoenix with SkipScans ● Time series searching, lookups (which tags belong to which series?)
  • 40. The Future ● Duplicate timestamp handling ● Counter increment and blob storage support ● Rollups/pre-aggregations ● Greater query flexibility: ○ Aggregate metrics ○ Regex on tags ○ Scalar calculations
  • 41. More Information ● Thank you to everyone who has helped test, debug and add to OpenTSDB 2.0! ● Contribute at github.com/OpenTSDB/opentsdb ● Website: opentsdb.net ● Documentation: opentsdb.net/docs/build/html ● Mailing List: groups.google.com/group/opentsdb Images ● http://photos.jdhancock.com/photo/2013-06-04-212438-the-lonely-vacuum-of-space.html ● http://en.wikipedia.org/wiki/File:Semi-automated-external-monitor-defibrillator.jpg ● http://upload.wikimedia.org/wikipedia/commons/1/17/Dining_table_for_two.jpg ● http://upload.wikimedia.org/wikipedia/commons/9/92/Easy_button.JPG ● http://upload.wikimedia.org/wikipedia/commons/4/42/PostItNotePad.JPG ● http://openclipart.org/image/300px/svg_to_png/68557/green_leaf_icon.png ● http://nickapedia.com/2012/06/05/api-all-the-things-razor-api-wiki/ ● http://lego.cuusoo.com/ideas/view/96 ● http://upload.wikimedia.org/wikipedia/commons/3/35/KL_Cyrix_FasMath_CX83D87.jpg ● http://www.flickr.com/photos/smkybear/4624182536/