SlideShare a Scribd company logo
1 of 57
Download to read offline
© 2019 Bloomberg Finance L.P. All rights reserved.
HBase Internals and
Operations
SRECon19 Asia/Pacific
June 13, 2019
Biju Nair
Software Engineer
bnair10@bloomberg.net
© 2019 Bloomberg Finance L.P. All rights reserved.
Agenda
• Introduction to HBase
• Operating HBase
• Questions
© 2019 Bloomberg Finance L.P. All rights reserved.
Bloomberg in a nutshell
The Bloomberg Terminal delivers a diverse
array of information on a single platform to
facilitate financial decision-making.
© 2019 Bloomberg Finance L.P. All rights reserved.
Bloomberg technology by the numbers
• 5,000+ software engineers
• 150+ technologists and data scientists devoted to machine learning
• One of the largest private networks in the world
• 120 billion pieces of data from the financial markets each day, with a peak of more than
10 million messages/second
• 2 million news stories ingested / published each day (500+ news stories ingested/second)
• News content from 125K+ sources
• Over 1 billion messages and Instant Bloomberg (IB) chats handled daily
© 2019 Bloomberg Finance L.P. All rights reserved.
HBase at Bloomberg
• Started with v0.94.6
• >2 billion reads per day
• >1 billion writes per day
• 51+ TB of compressed data stored in HBase
© 2019 Bloomberg Finance L.P. All rights reserved.
HBase Principles
• Ordered Key Value Store
• Distributed shared nothing
© 2019 Bloomberg Finance L.P. All rights reserved.
Key Value
Key-9999 Value-a
Key-9998 Value-b
Key-9997 Value-c
Key-9996 Value-d
…
…
Key-9995 Value-e
Key-9994 Value-f
© 2019 Bloomberg Finance L.P. All rights reserved.
Ordered Key Value
Key-9999 Value-a
Key-9998 Value-b
Key-9997 Value-c
Key-9996 Value-d
…
…
Key-9995 Value-e
Key-9994 Value-a
Lexicographicorder
Key-9993 Value-g
© 2019 Bloomberg Finance L.P. All rights reserved.
Ordered Key Value
Key-9999 Value-a
Key-9998 Value-b
Key-9997 Value-c
Key-9996 Value-d
…
…
Key-9995 Value-e
Key-9994 Value-a
Lexicographicorder
Key-9993 Value-g
Region
© 2019 Bloomberg Finance L.P. All rights reserved.
Distributed Ordered Key Value
Key-199 Value
Key-198 Value
Key-197 Value
Key-499 Value
…
Key-498 Value
Key-497 Value
ordered
…
Key-299 Value
…
Key-298 Value
Key-297 Value
ordered
…
…
…
Key-599 Value
…
Key-598 Value
Key-597 Value
…
Key-399 Value
…
Key-398 Value
Key-397 Value
…
Key-999 Value
…
Key-998 Value
Key-997 Value
…
Region
Region
© 2019 Bloomberg Finance L.P. All rights reserved.
Distributed Ordered Key Value
RegionServer
Key-199 Value
Key-198 Value
Key-197 Value
Key-499 Value
…
Key-498 Value
Key-497 Value
…
Key-299 Value
…
Key-298 Value
Key-297 Value
…
…
…
Key-599 Value
…
Key-598 Value
Key-597 Value
…
Key-399 Value
…
Key-398 Value
Key-397 Value
…
Key-999 Value
…
Key-998 Value
Key-997 Value
…
Region
Region
RegionServer RegionServer
RegionServer RegionServer RegionServer
© 2019 Bloomberg Finance L.P. All rights reserved.
Table Row View
Key Value
Column Id ValueRow Id Timestamp
© 2019 Bloomberg Finance L.P. All rights reserved.
Table Row View
R11|col1|1234567 Value-A
R11|col2|1234567 Value-B
R11|col3|1234567 Value-C
R11|col4|1234567 Value-D
R11
Value-A Value-B Value-C
Col1 Col2 Col3
Value-D
Col4
© 2019 Bloomberg Finance L.P. All rights reserved.
Versioning
R11|col1|1234567 Value-A1
R11|col1|1234566 Value-A
R11|col2|1234567 Value-B
R11|col3|1234567 Value-CC
R11|col3|1234563 Value-C
R11|col4|1234567 Value-DD
R11|col4|1234560 Value-D1
R11|col4|1234557 Value-D
Descendingorder
© 2019 Bloomberg Finance L.P. All rights reserved.
cf2cf1
Column Family
R11|col1|1234567 Value-A
R11|col2|1234567 Value-B
R11|col3|1234567 Value-C
R11
Value-A Value-B Value-C
cf1:col1 cf1:col2 cf1:col3
Value-1
cf2:col1
R11|col1|1234567 Value-1
R11|col2|1234567 Value-2
R11|col3|1234567 Value-3
Value-2
cf2:col2
Value-3
cf2:col3
© 2019 Bloomberg Finance L.P. All rights reserved.
ACIDity
• Atomic at row level
• Consistent to a point in time before the request
• Isolation through MVCC (reads) and row locks (mutations)
• Durability is guaranteed for all successful mutations
© 2019 Bloomberg Finance L.P. All rights reserved.
Namespace
Namespace: Blue Namespace: Yellow
ACL ACL
ACL ACL
ACL
© 2019 Bloomberg Finance L.P. All rights reserved.
HBase Write
HBase Region Server
File System
WAL
Write
1
© 2019 Bloomberg Finance L.P. All rights reserved.
HBase Write
HBase RS
Memstore
File System
WAL
Write
1
2
© 2019 Bloomberg Finance L.P. All rights reserved.
HBase Write
HBase RS
Memstore
File System
WAL t1 t1 t1
Store files
Write
1
2
3
Block Block Block
Idx BlmData
© 2019 Bloomberg Finance L.P. All rights reserved.
HBase Write
HBase RS
Memstore
DFSClient
WAL t1 t1 t1
Store files
Write
1
2
3
© 2019 Bloomberg Finance L.P. All rights reserved.
HBase Write
HBase RS
Memstore
DFSClient
WAL t1 t1 t1
Store files
Write
1
2
3
HDFS DataNode
WAL t1 t1 t1
© 2019 Bloomberg Finance L.P. All rights reserved.
Compaction
File System
K-x K-x D-1
Store files
K-xK-xK-x
© 2019 Bloomberg Finance L.P. All rights reserved.
Compaction
File System
Store files
K-x
Compaction
File System
K-x K-x D-1
Store files
K-xK-xK-x
© 2019 Bloomberg Finance L.P. All rights reserved.
HBase Read
HBase RS Memstore
File System
WAL
Read
1
Block Cache Block
t1 t1 t1
Store files
© 2019 Bloomberg Finance L.P. All rights reserved.
HBase Read
HBase RS Memstore
File System
WAL
Read
1
Block Cache
2
Block
t1 t1 t1
Store files
© 2019 Bloomberg Finance L.P. All rights reserved.
HDFS Read
HBase/
DFSClient
HDFS
File System
Data
TCP
© 2019 Bloomberg Finance L.P. All rights reserved.
HDFS Short-Circuit Read
HBase/
DFSClient
HDFS
File System
Data
TCP
Open
File
Pass
FD
© 2019 Bloomberg Finance L.P. All rights reserved.
Large Read Cache
HBase Memstore
File System
WAL t1 t1 t1
Store files
Read
1
Block Cache
2
Block
© 2019 Bloomberg Finance L.P. All rights reserved.
Large Read Cache
HBase
Memstore
File System
WAL t1 t1 t1
Store files
Read
1
Block Cache
3
Block
Off-heap Cache Block
2
© 2019 Bloomberg Finance L.P. All rights reserved.
HBase Complete
HMaster
RS1 RS2 RS3 RS4 RS5 RS6
system
© 2019 Bloomberg Finance L.P. All rights reserved.
HMaster
HBase Complete
HMaster
RS1 RS2 RS3 RS4 RS5 RS6
system
© 2019 Bloomberg Finance L.P. All rights reserved.
HMaster
HBase Complete
HMaster
RS1 RS2 RS3 RS4 RS5 RS6
system
ZK
© 2019 Bloomberg Finance L.P. All rights reserved.
HMaster
HBase Complete
HMaster
RS1 RS2 RS3 RS4 RS5 RS6
system
ZK
© 2019 Bloomberg Finance L.P. All rights reserved.
HMaster
HBase Complete
HMaster
RS1 RS2 RS3 RS4 RS5 RS6
system
ZK
HBase
Client
© 2019 Bloomberg Finance L.P. All rights reserved.
Region Server Failure
HMaster
RS1 RS2 RS3 RS4 RS5 RS6
system
ZK
T1 r1
HBase
Client
© 2019 Bloomberg Finance L.P. All rights reserved.
Region Server Failure
HMaster
RS1 RS2 RS3 RS4 RS5 RS6
system
ZK
T1 r1
HBase
Client
© 2019 Bloomberg Finance L.P. All rights reserved.
Region Server Failure
HMaster
RS1 RS2 RS3 RS4 RS5 RS6
system
ZK
T1 r1
HBase
Client
© 2019 Bloomberg Finance L.P. All rights reserved.
Region Replication
HMaster
RS1 RS2 RS3 RS4 RS5 RS6
system
ZK
T1 r1
HBase
Client
T1 r1
© 2019 Bloomberg Finance L.P. All rights reserved.
Region Replication
HMaster
RS1 RS2 RS3 RS4 RS5 RS6
system
ZK
T1 r1T1 r1
HBase
Client
© 2019 Bloomberg Finance L.P. All rights reserved.
Region Replication
HMaster
RS1 RS2 RS3 RS4 RS5 RS6
system
ZK
T1 r1T1 r1
HBase
Client
https://www.youtube.com/watch?v=l6S-Vbs9WsU
© 2019 Bloomberg Finance L.P. All rights reserved.
Load Balancing
HMaster
RS1 RS2 RS3 RS4 RS5 RS6
system
ZK
HBase
Client
© 2019 Bloomberg Finance L.P. All rights reserved.
Load Balancing
HMaster
Balancer/AM
RS1 RS2 RS3 RS4 RS5 RS6
system
ZK
HBase
Client
© 2019 Bloomberg Finance L.P. All rights reserved.
Balancer
• Region Count Cost
• Primary Region Count Cost
• Table Skew Cost
• Locality Cost
• Rack Locality Cost
• Region Replica Host Cost
• Region Replica Rack Cost
• Read Request Cost
• Write Request Cost
• Memstore Size Cost
• Storefile Size Cost
• Move Cost
© 2019 Bloomberg Finance L.P. All rights reserved.
Other Features
• HBase Replication
• HBase multi-tenancy support
— https://www.youtube.com/watch?v=bZjz2G38Ju0
• HBase Co-processors and Filters
— https://www.slideshare.net/Hadoop_Summit/hbase-coprocessors-uses-abuses-solutions
© 2019 Bloomberg Finance L.P. All rights reserved.
NameNode
RS4
HMaster
ZQuorum
ZQuorum
Operator View
HMaster
RS1 RS2 RS4 RS5 RS6
HBase
ClientZK
Quorum
DN1 DN2 DN3 DN4 DN5 DN6
NameNode
JVM
OS
JVM
OS
JVM
OS
JVM
OS
JVM
OS
JVM
OS
HW HW HW HW HW HW
© 2019 Bloomberg Finance L.P. All rights reserved.
ZooKeeper Availability
• ZK Quorum
• One leader and remaining followers
— stat
— ruok
— mntr
• Test for availability
— e.g., List children of a znode
© 2019 Bloomberg Finance L.P. All rights reserved.
HBase Availability
• Master Availability - http://hmaster-node:16010/jmx
— "name" : "Hadoop:service=HBase,name=Master,sub=Server","tag.isActiveMaster" : "true"
• Dead RegionServers
— "name" : "Hadoop:service=HBase,name=Master,sub=Server", "numDeadRegionServers" : 0
• Region In Transition
— "name" : "Hadoop:service=HBase,name=Master,sub=AssignmentManger","ritCount" : 0
• Test for availability
— e.g., Query system table by listing tables
© 2019 Bloomberg Finance L.P. All rights reserved.
HDFS Availability
• Namenode Availability - http://namenode-host:50070/jmx
—"name" : "Hadoop:service=NameNode,name=FSNamesystem", "tag.HAState" : "active"
• Dead Datanodes
— "name" : "Hadoop:service=NameNode,name=NameNodeInfo", "DeadNodes" : "{}“
• Missing Blocks
— "name" : "Hadoop:service=NameNode,name=NameNodeInfo", "NumberOfMissingBlocks" : 0
• Percentage Used
— "name" : "Hadoop:service=NameNode,name=NameNodeInfo", "PercentUsed" : 59
• Under replicated blocks
— "name" : "Hadoop:service=NameNode,name=FSNamesystemState","UnderReplicatedBlocks":0
• Test for availability
— e.g., Append data to a test file
© 2019 Bloomberg Finance L.P. All rights reserved.
HBase Performance
• RegionServer JMX metrics - http://rs-node:60300/jmx
— "name“:"Hadoop:service=HBase,name=RegionServer,sub=Server“
• Blockcache hit ratio
• Request counts
• Request response time
• Compaction related metrics
• Region count
• Flush related metrics
• Percentage of files local
• Split related metrics
— "name" : "Hadoop:service=HBase,name=RegionServer,sub=Tables",
• Table level metrics
https://www.slideshare.net/MichaelStack4/hbaseconasia2018-track31-serving-billions-of-queries-in-millisecond-latencies
© 2019 Bloomberg Finance L.P. All rights reserved.
JVM
• GC – JMX Metrics
— "name" : "java.lang:type=GarbageCollector,name=ParNew",
— "name" : "java.lang:type=GarbageCollector,name=ConcurrentMarkSweep",
• GC Logging
— -verbose:gc
— -XX:+PrintHeapAtGC
— -XX:+PrintGCDetails
— -XX:+PrintGCTimeStamps
— -XX:+PrintGCDateStamps
— -XX:+PrintGCApplicationStoppedTime
— -XX:+PrintClassHistogram
— -XX:+PrintGCApplicationConcurrentTime
— -XX:+PrintTenuringDistribution
— -Xloggc:
© 2019 Bloomberg Finance L.P. All rights reserved.
OS/HW
• Memory
• CPU
• Disk
• Networking
© 2019 Bloomberg Finance L.P. All rights reserved.
Logs
• ZooKeeper Log
• HDFS
— Namenode log
— Datanode log
• HBase
— Master log
— RegionServer log
• OS
— Syslog
© 2019 Bloomberg Finance L.P. All rights reserved.
Interacting with HBase
• HBase shell
— DDL: create namespace/table, alter
— Security: grant, revoke
— DML: get, put, scan
— Tools: assign, compact, balance
— General: status
• HBase admin API
• HBase client API
© 2019 Bloomberg Finance L.P. All rights reserved.
Data Backup / Restore
• Snapshot
— hbase shell > snapshot 'table', 'table_mmddyy’
• Restore from snapshot
— hbase shell > restore_snapshot 'table_mmddyy'
• Export Snapshot
— $ hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot
• CopyTable
— hbase org.apache.hadoop.hbase.mapreduce.CopyTable
© 2019 Bloomberg Finance L.P. All rights reserved.
Thank You!
Acknowledgement: Apache HBase Community
Reference: http://hbase.apache.org
Connect with Hadoop Team: hadoop@bloomberg.net
© 2019 Bloomberg Finance L.P. All rights reserved.
We are hiring!
Questions?
https://www.bloomberg.com/careers

More Related Content

What's hot

Hadoop development series(1)
Hadoop development series(1)Hadoop development series(1)
Hadoop development series(1)Amar kumar
 
Geospatial Big Data - Foss4gNA
Geospatial Big Data - Foss4gNAGeospatial Big Data - Foss4gNA
Geospatial Big Data - Foss4gNAnormanbarker
 
ElastiCache & Redis: Database Week San Francisco
ElastiCache & Redis: Database Week San FranciscoElastiCache & Redis: Database Week San Francisco
ElastiCache & Redis: Database Week San FranciscoAmazon Web Services
 
Intro to big data and hadoop ubc cs lecture series - g fawkes
Intro to big data and hadoop   ubc cs lecture series - g fawkesIntro to big data and hadoop   ubc cs lecture series - g fawkes
Intro to big data and hadoop ubc cs lecture series - g fawkesgfawkesnew2
 
ElastiCache & Redis: Database Week SF
ElastiCache & Redis: Database Week SFElastiCache & Redis: Database Week SF
ElastiCache & Redis: Database Week SFAmazon Web Services
 
Hive - Cost Based Optimizer
Hive - Cost Based OptimizerHive - Cost Based Optimizer
Hive - Cost Based OptimizerJohn pullokkaran
 
Federated Queries Across Both Different Storage Mediums and Different Data En...
Federated Queries Across Both Different Storage Mediums and Different Data En...Federated Queries Across Both Different Storage Mediums and Different Data En...
Federated Queries Across Both Different Storage Mediums and Different Data En...VMware Tanzu
 

What's hot (10)

Hadoop development series(1)
Hadoop development series(1)Hadoop development series(1)
Hadoop development series(1)
 
Geospatial Big Data - Foss4gNA
Geospatial Big Data - Foss4gNAGeospatial Big Data - Foss4gNA
Geospatial Big Data - Foss4gNA
 
ElastiCache & Redis: Database Week San Francisco
ElastiCache & Redis: Database Week San FranciscoElastiCache & Redis: Database Week San Francisco
ElastiCache & Redis: Database Week San Francisco
 
Intro to big data and hadoop ubc cs lecture series - g fawkes
Intro to big data and hadoop   ubc cs lecture series - g fawkesIntro to big data and hadoop   ubc cs lecture series - g fawkes
Intro to big data and hadoop ubc cs lecture series - g fawkes
 
ElastiCache & Redis: Database Week SF
ElastiCache & Redis: Database Week SFElastiCache & Redis: Database Week SF
ElastiCache & Redis: Database Week SF
 
Hive - Cost Based Optimizer
Hive - Cost Based OptimizerHive - Cost Based Optimizer
Hive - Cost Based Optimizer
 
CBO-2
CBO-2CBO-2
CBO-2
 
Federated Queries Across Both Different Storage Mediums and Different Data En...
Federated Queries Across Both Different Storage Mediums and Different Data En...Federated Queries Across Both Different Storage Mediums and Different Data En...
Federated Queries Across Both Different Storage Mediums and Different Data En...
 
Big data course
Big data  courseBig data  course
Big data course
 
ElastiCache and Redis
ElastiCache and RedisElastiCache and Redis
ElastiCache and Redis
 

Similar to HBase Internals And Operations

How data modelling helps serve billions of queries in millisecond latency wit...
How data modelling helps serve billions of queries in millisecond latency wit...How data modelling helps serve billions of queries in millisecond latency wit...
How data modelling helps serve billions of queries in millisecond latency wit...DataWorks Summit
 
The Power of Event Driven Caches (Brendan Powers, Bloomberg L.P) Kafka Summit...
The Power of Event Driven Caches (Brendan Powers, Bloomberg L.P) Kafka Summit...The Power of Event Driven Caches (Brendan Powers, Bloomberg L.P) Kafka Summit...
The Power of Event Driven Caches (Brendan Powers, Bloomberg L.P) Kafka Summit...confluent
 
Real-Time Market Data Analytics Using Kafka Streams
Real-Time Market Data Analytics Using Kafka StreamsReal-Time Market Data Analytics Using Kafka Streams
Real-Time Market Data Analytics Using Kafka Streamsconfluent
 
Kafka Connect and KSQL: Useful Tools in Migrating from a Legacy System to Kaf...
Kafka Connect and KSQL: Useful Tools in Migrating from a Legacy System to Kaf...Kafka Connect and KSQL: Useful Tools in Migrating from a Legacy System to Kaf...
Kafka Connect and KSQL: Useful Tools in Migrating from a Legacy System to Kaf...confluent
 
높은 가용성과 성능 향상을 위한 ElastiCache 활용 팁 - 임근택, SendBird :: AWS Summit Seoul 2019
높은 가용성과 성능 향상을 위한 ElastiCache 활용 팁 - 임근택, SendBird :: AWS Summit Seoul 2019 높은 가용성과 성능 향상을 위한 ElastiCache 활용 팁 - 임근택, SendBird :: AWS Summit Seoul 2019
높은 가용성과 성능 향상을 위한 ElastiCache 활용 팁 - 임근택, SendBird :: AWS Summit Seoul 2019 Amazon Web Services Korea
 
HBaseConAsia2018 Track3-1: Serving billions of queries in millisecond latencies
HBaseConAsia2018 Track3-1: Serving billions of queries in millisecond latenciesHBaseConAsia2018 Track3-1: Serving billions of queries in millisecond latencies
HBaseConAsia2018 Track3-1: Serving billions of queries in millisecond latenciesMichael Stack
 
MongoDB @ Fiverr: The Road to Atlas
MongoDB @ Fiverr: The Road to AtlasMongoDB @ Fiverr: The Road to Atlas
MongoDB @ Fiverr: The Road to AtlasMongoDB
 
HBase coprocessors, Uses, Abuses, Solutions
HBase coprocessors, Uses, Abuses, SolutionsHBase coprocessors, Uses, Abuses, Solutions
HBase coprocessors, Uses, Abuses, SolutionsDataWorks Summit
 
BuildingGlobalServices_DevDays2019.pdf
BuildingGlobalServices_DevDays2019.pdfBuildingGlobalServices_DevDays2019.pdf
BuildingGlobalServices_DevDays2019.pdfAmazon Web Services
 
BuildingGlobalServices_DevDays2019.pdf
BuildingGlobalServices_DevDays2019.pdfBuildingGlobalServices_DevDays2019.pdf
BuildingGlobalServices_DevDays2019.pdfAmazon Web Services
 
Building a Data Subscription Service with Kafka Connect (Danica Fine & Ajay V...
Building a Data Subscription Service with Kafka Connect (Danica Fine & Ajay V...Building a Data Subscription Service with Kafka Connect (Danica Fine & Ajay V...
Building a Data Subscription Service with Kafka Connect (Danica Fine & Ajay V...confluent
 
Architecture Patterns for Multi-Region Active-Active Applications (ARC209-R2)...
Architecture Patterns for Multi-Region Active-Active Applications (ARC209-R2)...Architecture Patterns for Multi-Region Active-Active Applications (ARC209-R2)...
Architecture Patterns for Multi-Region Active-Active Applications (ARC209-R2)...Amazon Web Services
 
Building a Streaming Microservices Architecture - Data + AI Summit EU 2020
Building a Streaming Microservices Architecture - Data + AI Summit EU 2020Building a Streaming Microservices Architecture - Data + AI Summit EU 2020
Building a Streaming Microservices Architecture - Data + AI Summit EU 2020Databricks
 
"How to build a global serverless service", Alex Casalboni, AWS Dev Day Kyiv ...
"How to build a global serverless service", Alex Casalboni, AWS Dev Day Kyiv ..."How to build a global serverless service", Alex Casalboni, AWS Dev Day Kyiv ...
"How to build a global serverless service", Alex Casalboni, AWS Dev Day Kyiv ...Provectus
 
AWS DevDay Berlin 2019 - Going Global With Serverless
AWS DevDay Berlin 2019 - Going Global With ServerlessAWS DevDay Berlin 2019 - Going Global With Serverless
AWS DevDay Berlin 2019 - Going Global With ServerlessDarko Mesaroš
 
Scaling a database with Amazon RDS for Oracle - ADB208 - Chicago AWS Summit
Scaling a database with Amazon RDS for Oracle - ADB208 - Chicago AWS SummitScaling a database with Amazon RDS for Oracle - ADB208 - Chicago AWS Summit
Scaling a database with Amazon RDS for Oracle - ADB208 - Chicago AWS SummitAmazon Web Services
 
Budget management with Cloud Economics | AWS Summit Tel Aviv 2019
Budget management with Cloud Economics | AWS Summit Tel Aviv 2019Budget management with Cloud Economics | AWS Summit Tel Aviv 2019
Budget management with Cloud Economics | AWS Summit Tel Aviv 2019AWS Summits
 
Module 1: Introduction to the AWS Cloud - AWSome Day Online Conference 2019
Module 1: Introduction to the AWS Cloud - AWSome Day Online Conference 2019Module 1: Introduction to the AWS Cloud - AWSome Day Online Conference 2019
Module 1: Introduction to the AWS Cloud - AWSome Day Online Conference 2019Amazon Web Services
 
Databases in the Cloud em Amazon Web Services
Databases in the Cloud em Amazon Web Services Databases in the Cloud em Amazon Web Services
Databases in the Cloud em Amazon Web Services Amazon Web Services LATAM
 

Similar to HBase Internals And Operations (20)

How data modelling helps serve billions of queries in millisecond latency wit...
How data modelling helps serve billions of queries in millisecond latency wit...How data modelling helps serve billions of queries in millisecond latency wit...
How data modelling helps serve billions of queries in millisecond latency wit...
 
The Power of Event Driven Caches (Brendan Powers, Bloomberg L.P) Kafka Summit...
The Power of Event Driven Caches (Brendan Powers, Bloomberg L.P) Kafka Summit...The Power of Event Driven Caches (Brendan Powers, Bloomberg L.P) Kafka Summit...
The Power of Event Driven Caches (Brendan Powers, Bloomberg L.P) Kafka Summit...
 
Real-Time Market Data Analytics Using Kafka Streams
Real-Time Market Data Analytics Using Kafka StreamsReal-Time Market Data Analytics Using Kafka Streams
Real-Time Market Data Analytics Using Kafka Streams
 
Kafka Connect and KSQL: Useful Tools in Migrating from a Legacy System to Kaf...
Kafka Connect and KSQL: Useful Tools in Migrating from a Legacy System to Kaf...Kafka Connect and KSQL: Useful Tools in Migrating from a Legacy System to Kaf...
Kafka Connect and KSQL: Useful Tools in Migrating from a Legacy System to Kaf...
 
높은 가용성과 성능 향상을 위한 ElastiCache 활용 팁 - 임근택, SendBird :: AWS Summit Seoul 2019
높은 가용성과 성능 향상을 위한 ElastiCache 활용 팁 - 임근택, SendBird :: AWS Summit Seoul 2019 높은 가용성과 성능 향상을 위한 ElastiCache 활용 팁 - 임근택, SendBird :: AWS Summit Seoul 2019
높은 가용성과 성능 향상을 위한 ElastiCache 활용 팁 - 임근택, SendBird :: AWS Summit Seoul 2019
 
HBaseConAsia2018 Track3-1: Serving billions of queries in millisecond latencies
HBaseConAsia2018 Track3-1: Serving billions of queries in millisecond latenciesHBaseConAsia2018 Track3-1: Serving billions of queries in millisecond latencies
HBaseConAsia2018 Track3-1: Serving billions of queries in millisecond latencies
 
MongoDB @ Fiverr: The Road to Atlas
MongoDB @ Fiverr: The Road to AtlasMongoDB @ Fiverr: The Road to Atlas
MongoDB @ Fiverr: The Road to Atlas
 
HBase coprocessors, Uses, Abuses, Solutions
HBase coprocessors, Uses, Abuses, SolutionsHBase coprocessors, Uses, Abuses, Solutions
HBase coprocessors, Uses, Abuses, Solutions
 
BuildingGlobalServices_DevDays2019.pdf
BuildingGlobalServices_DevDays2019.pdfBuildingGlobalServices_DevDays2019.pdf
BuildingGlobalServices_DevDays2019.pdf
 
BuildingGlobalServices_DevDays2019.pdf
BuildingGlobalServices_DevDays2019.pdfBuildingGlobalServices_DevDays2019.pdf
BuildingGlobalServices_DevDays2019.pdf
 
Building a Data Subscription Service with Kafka Connect (Danica Fine & Ajay V...
Building a Data Subscription Service with Kafka Connect (Danica Fine & Ajay V...Building a Data Subscription Service with Kafka Connect (Danica Fine & Ajay V...
Building a Data Subscription Service with Kafka Connect (Danica Fine & Ajay V...
 
Architecture Patterns for Multi-Region Active-Active Applications (ARC209-R2)...
Architecture Patterns for Multi-Region Active-Active Applications (ARC209-R2)...Architecture Patterns for Multi-Region Active-Active Applications (ARC209-R2)...
Architecture Patterns for Multi-Region Active-Active Applications (ARC209-R2)...
 
Building a Streaming Microservices Architecture - Data + AI Summit EU 2020
Building a Streaming Microservices Architecture - Data + AI Summit EU 2020Building a Streaming Microservices Architecture - Data + AI Summit EU 2020
Building a Streaming Microservices Architecture - Data + AI Summit EU 2020
 
"How to build a global serverless service", Alex Casalboni, AWS Dev Day Kyiv ...
"How to build a global serverless service", Alex Casalboni, AWS Dev Day Kyiv ..."How to build a global serverless service", Alex Casalboni, AWS Dev Day Kyiv ...
"How to build a global serverless service", Alex Casalboni, AWS Dev Day Kyiv ...
 
Cloud Migration Workshop
Cloud Migration WorkshopCloud Migration Workshop
Cloud Migration Workshop
 
AWS DevDay Berlin 2019 - Going Global With Serverless
AWS DevDay Berlin 2019 - Going Global With ServerlessAWS DevDay Berlin 2019 - Going Global With Serverless
AWS DevDay Berlin 2019 - Going Global With Serverless
 
Scaling a database with Amazon RDS for Oracle - ADB208 - Chicago AWS Summit
Scaling a database with Amazon RDS for Oracle - ADB208 - Chicago AWS SummitScaling a database with Amazon RDS for Oracle - ADB208 - Chicago AWS Summit
Scaling a database with Amazon RDS for Oracle - ADB208 - Chicago AWS Summit
 
Budget management with Cloud Economics | AWS Summit Tel Aviv 2019
Budget management with Cloud Economics | AWS Summit Tel Aviv 2019Budget management with Cloud Economics | AWS Summit Tel Aviv 2019
Budget management with Cloud Economics | AWS Summit Tel Aviv 2019
 
Module 1: Introduction to the AWS Cloud - AWSome Day Online Conference 2019
Module 1: Introduction to the AWS Cloud - AWSome Day Online Conference 2019Module 1: Introduction to the AWS Cloud - AWSome Day Online Conference 2019
Module 1: Introduction to the AWS Cloud - AWSome Day Online Conference 2019
 
Databases in the Cloud em Amazon Web Services
Databases in the Cloud em Amazon Web Services Databases in the Cloud em Amazon Web Services
Databases in the Cloud em Amazon Web Services
 

More from Biju Nair

Chef conf-2015-chef-patterns-at-bloomberg-scale
Chef conf-2015-chef-patterns-at-bloomberg-scaleChef conf-2015-chef-patterns-at-bloomberg-scale
Chef conf-2015-chef-patterns-at-bloomberg-scaleBiju Nair
 
Apache Kafka Reference
Apache Kafka ReferenceApache Kafka Reference
Apache Kafka ReferenceBiju Nair
 
Hadoop security
Hadoop securityHadoop security
Hadoop securityBiju Nair
 
Chef patterns
Chef patternsChef patterns
Chef patternsBiju Nair
 
HBase Application Performance Improvement
HBase Application Performance ImprovementHBase Application Performance Improvement
HBase Application Performance ImprovementBiju Nair
 
HDFS User Reference
HDFS User ReferenceHDFS User Reference
HDFS User ReferenceBiju Nair
 
NENUG Apr14 Talk - data modeling for netezza
NENUG Apr14 Talk - data modeling for netezzaNENUG Apr14 Talk - data modeling for netezza
NENUG Apr14 Talk - data modeling for netezzaBiju Nair
 
Netezza workload management
Netezza workload managementNetezza workload management
Netezza workload managementBiju Nair
 
Row or Columnar Database
Row or Columnar DatabaseRow or Columnar Database
Row or Columnar DatabaseBiju Nair
 
Using Netezza Query Plan to Improve Performace
Using Netezza Query Plan to Improve PerformaceUsing Netezza Query Plan to Improve Performace
Using Netezza Query Plan to Improve PerformaceBiju Nair
 
Netezza fundamentals for developers
Netezza fundamentals for developersNetezza fundamentals for developers
Netezza fundamentals for developersBiju Nair
 
Project Risk Management
Project Risk ManagementProject Risk Management
Project Risk ManagementBiju Nair
 
Websphere MQ (MQSeries) fundamentals
Websphere MQ (MQSeries) fundamentalsWebsphere MQ (MQSeries) fundamentals
Websphere MQ (MQSeries) fundamentalsBiju Nair
 

More from Biju Nair (14)

Chef conf-2015-chef-patterns-at-bloomberg-scale
Chef conf-2015-chef-patterns-at-bloomberg-scaleChef conf-2015-chef-patterns-at-bloomberg-scale
Chef conf-2015-chef-patterns-at-bloomberg-scale
 
Apache Kafka Reference
Apache Kafka ReferenceApache Kafka Reference
Apache Kafka Reference
 
Hadoop security
Hadoop securityHadoop security
Hadoop security
 
Chef patterns
Chef patternsChef patterns
Chef patterns
 
HBase Application Performance Improvement
HBase Application Performance ImprovementHBase Application Performance Improvement
HBase Application Performance Improvement
 
HDFS User Reference
HDFS User ReferenceHDFS User Reference
HDFS User Reference
 
NENUG Apr14 Talk - data modeling for netezza
NENUG Apr14 Talk - data modeling for netezzaNENUG Apr14 Talk - data modeling for netezza
NENUG Apr14 Talk - data modeling for netezza
 
Netezza workload management
Netezza workload managementNetezza workload management
Netezza workload management
 
Row or Columnar Database
Row or Columnar DatabaseRow or Columnar Database
Row or Columnar Database
 
Using Netezza Query Plan to Improve Performace
Using Netezza Query Plan to Improve PerformaceUsing Netezza Query Plan to Improve Performace
Using Netezza Query Plan to Improve Performace
 
Netezza fundamentals for developers
Netezza fundamentals for developersNetezza fundamentals for developers
Netezza fundamentals for developers
 
Concurrency
ConcurrencyConcurrency
Concurrency
 
Project Risk Management
Project Risk ManagementProject Risk Management
Project Risk Management
 
Websphere MQ (MQSeries) fundamentals
Websphere MQ (MQSeries) fundamentalsWebsphere MQ (MQSeries) fundamentals
Websphere MQ (MQSeries) fundamentals
 

Recently uploaded

The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...Wes McKinney
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Hiroshi SHIBATA
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityIES VE
 
React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...Karmanjay Verma
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsRavi Sanghani
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxfnnc6jmgwh
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentPim van der Noll
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality AssuranceInflectra
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Nikki Chapple
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI AgeCprime
 
Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)
Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)
Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)Mark Simos
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observabilityitnewsafrica
 
QCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesQCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesBernd Ruecker
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical InfrastructureVarsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructureitnewsafrica
 
Infrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platformsInfrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platformsYoss Cohen
 

Recently uploaded (20)

The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 
Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)
Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)
Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
 
QCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesQCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architectures
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical InfrastructureVarsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
 
Infrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platformsInfrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platforms
 

HBase Internals And Operations

  • 1. © 2019 Bloomberg Finance L.P. All rights reserved. HBase Internals and Operations SRECon19 Asia/Pacific June 13, 2019 Biju Nair Software Engineer bnair10@bloomberg.net
  • 2. © 2019 Bloomberg Finance L.P. All rights reserved. Agenda • Introduction to HBase • Operating HBase • Questions
  • 3. © 2019 Bloomberg Finance L.P. All rights reserved. Bloomberg in a nutshell The Bloomberg Terminal delivers a diverse array of information on a single platform to facilitate financial decision-making.
  • 4. © 2019 Bloomberg Finance L.P. All rights reserved. Bloomberg technology by the numbers • 5,000+ software engineers • 150+ technologists and data scientists devoted to machine learning • One of the largest private networks in the world • 120 billion pieces of data from the financial markets each day, with a peak of more than 10 million messages/second • 2 million news stories ingested / published each day (500+ news stories ingested/second) • News content from 125K+ sources • Over 1 billion messages and Instant Bloomberg (IB) chats handled daily
  • 5. © 2019 Bloomberg Finance L.P. All rights reserved. HBase at Bloomberg • Started with v0.94.6 • >2 billion reads per day • >1 billion writes per day • 51+ TB of compressed data stored in HBase
  • 6. © 2019 Bloomberg Finance L.P. All rights reserved. HBase Principles • Ordered Key Value Store • Distributed shared nothing
  • 7. © 2019 Bloomberg Finance L.P. All rights reserved. Key Value Key-9999 Value-a Key-9998 Value-b Key-9997 Value-c Key-9996 Value-d … … Key-9995 Value-e Key-9994 Value-f
  • 8. © 2019 Bloomberg Finance L.P. All rights reserved. Ordered Key Value Key-9999 Value-a Key-9998 Value-b Key-9997 Value-c Key-9996 Value-d … … Key-9995 Value-e Key-9994 Value-a Lexicographicorder Key-9993 Value-g
  • 9. © 2019 Bloomberg Finance L.P. All rights reserved. Ordered Key Value Key-9999 Value-a Key-9998 Value-b Key-9997 Value-c Key-9996 Value-d … … Key-9995 Value-e Key-9994 Value-a Lexicographicorder Key-9993 Value-g Region
  • 10. © 2019 Bloomberg Finance L.P. All rights reserved. Distributed Ordered Key Value Key-199 Value Key-198 Value Key-197 Value Key-499 Value … Key-498 Value Key-497 Value ordered … Key-299 Value … Key-298 Value Key-297 Value ordered … … … Key-599 Value … Key-598 Value Key-597 Value … Key-399 Value … Key-398 Value Key-397 Value … Key-999 Value … Key-998 Value Key-997 Value … Region Region
  • 11. © 2019 Bloomberg Finance L.P. All rights reserved. Distributed Ordered Key Value RegionServer Key-199 Value Key-198 Value Key-197 Value Key-499 Value … Key-498 Value Key-497 Value … Key-299 Value … Key-298 Value Key-297 Value … … … Key-599 Value … Key-598 Value Key-597 Value … Key-399 Value … Key-398 Value Key-397 Value … Key-999 Value … Key-998 Value Key-997 Value … Region Region RegionServer RegionServer RegionServer RegionServer RegionServer
  • 12. © 2019 Bloomberg Finance L.P. All rights reserved. Table Row View Key Value Column Id ValueRow Id Timestamp
  • 13. © 2019 Bloomberg Finance L.P. All rights reserved. Table Row View R11|col1|1234567 Value-A R11|col2|1234567 Value-B R11|col3|1234567 Value-C R11|col4|1234567 Value-D R11 Value-A Value-B Value-C Col1 Col2 Col3 Value-D Col4
  • 14. © 2019 Bloomberg Finance L.P. All rights reserved. Versioning R11|col1|1234567 Value-A1 R11|col1|1234566 Value-A R11|col2|1234567 Value-B R11|col3|1234567 Value-CC R11|col3|1234563 Value-C R11|col4|1234567 Value-DD R11|col4|1234560 Value-D1 R11|col4|1234557 Value-D Descendingorder
  • 15. © 2019 Bloomberg Finance L.P. All rights reserved. cf2cf1 Column Family R11|col1|1234567 Value-A R11|col2|1234567 Value-B R11|col3|1234567 Value-C R11 Value-A Value-B Value-C cf1:col1 cf1:col2 cf1:col3 Value-1 cf2:col1 R11|col1|1234567 Value-1 R11|col2|1234567 Value-2 R11|col3|1234567 Value-3 Value-2 cf2:col2 Value-3 cf2:col3
  • 16. © 2019 Bloomberg Finance L.P. All rights reserved. ACIDity • Atomic at row level • Consistent to a point in time before the request • Isolation through MVCC (reads) and row locks (mutations) • Durability is guaranteed for all successful mutations
  • 17. © 2019 Bloomberg Finance L.P. All rights reserved. Namespace Namespace: Blue Namespace: Yellow ACL ACL ACL ACL ACL
  • 18. © 2019 Bloomberg Finance L.P. All rights reserved. HBase Write HBase Region Server File System WAL Write 1
  • 19. © 2019 Bloomberg Finance L.P. All rights reserved. HBase Write HBase RS Memstore File System WAL Write 1 2
  • 20. © 2019 Bloomberg Finance L.P. All rights reserved. HBase Write HBase RS Memstore File System WAL t1 t1 t1 Store files Write 1 2 3 Block Block Block Idx BlmData
  • 21. © 2019 Bloomberg Finance L.P. All rights reserved. HBase Write HBase RS Memstore DFSClient WAL t1 t1 t1 Store files Write 1 2 3
  • 22. © 2019 Bloomberg Finance L.P. All rights reserved. HBase Write HBase RS Memstore DFSClient WAL t1 t1 t1 Store files Write 1 2 3 HDFS DataNode WAL t1 t1 t1
  • 23. © 2019 Bloomberg Finance L.P. All rights reserved. Compaction File System K-x K-x D-1 Store files K-xK-xK-x
  • 24. © 2019 Bloomberg Finance L.P. All rights reserved. Compaction File System Store files K-x Compaction File System K-x K-x D-1 Store files K-xK-xK-x
  • 25. © 2019 Bloomberg Finance L.P. All rights reserved. HBase Read HBase RS Memstore File System WAL Read 1 Block Cache Block t1 t1 t1 Store files
  • 26. © 2019 Bloomberg Finance L.P. All rights reserved. HBase Read HBase RS Memstore File System WAL Read 1 Block Cache 2 Block t1 t1 t1 Store files
  • 27. © 2019 Bloomberg Finance L.P. All rights reserved. HDFS Read HBase/ DFSClient HDFS File System Data TCP
  • 28. © 2019 Bloomberg Finance L.P. All rights reserved. HDFS Short-Circuit Read HBase/ DFSClient HDFS File System Data TCP Open File Pass FD
  • 29. © 2019 Bloomberg Finance L.P. All rights reserved. Large Read Cache HBase Memstore File System WAL t1 t1 t1 Store files Read 1 Block Cache 2 Block
  • 30. © 2019 Bloomberg Finance L.P. All rights reserved. Large Read Cache HBase Memstore File System WAL t1 t1 t1 Store files Read 1 Block Cache 3 Block Off-heap Cache Block 2
  • 31. © 2019 Bloomberg Finance L.P. All rights reserved. HBase Complete HMaster RS1 RS2 RS3 RS4 RS5 RS6 system
  • 32. © 2019 Bloomberg Finance L.P. All rights reserved. HMaster HBase Complete HMaster RS1 RS2 RS3 RS4 RS5 RS6 system
  • 33. © 2019 Bloomberg Finance L.P. All rights reserved. HMaster HBase Complete HMaster RS1 RS2 RS3 RS4 RS5 RS6 system ZK
  • 34. © 2019 Bloomberg Finance L.P. All rights reserved. HMaster HBase Complete HMaster RS1 RS2 RS3 RS4 RS5 RS6 system ZK
  • 35. © 2019 Bloomberg Finance L.P. All rights reserved. HMaster HBase Complete HMaster RS1 RS2 RS3 RS4 RS5 RS6 system ZK HBase Client
  • 36. © 2019 Bloomberg Finance L.P. All rights reserved. Region Server Failure HMaster RS1 RS2 RS3 RS4 RS5 RS6 system ZK T1 r1 HBase Client
  • 37. © 2019 Bloomberg Finance L.P. All rights reserved. Region Server Failure HMaster RS1 RS2 RS3 RS4 RS5 RS6 system ZK T1 r1 HBase Client
  • 38. © 2019 Bloomberg Finance L.P. All rights reserved. Region Server Failure HMaster RS1 RS2 RS3 RS4 RS5 RS6 system ZK T1 r1 HBase Client
  • 39. © 2019 Bloomberg Finance L.P. All rights reserved. Region Replication HMaster RS1 RS2 RS3 RS4 RS5 RS6 system ZK T1 r1 HBase Client T1 r1
  • 40. © 2019 Bloomberg Finance L.P. All rights reserved. Region Replication HMaster RS1 RS2 RS3 RS4 RS5 RS6 system ZK T1 r1T1 r1 HBase Client
  • 41. © 2019 Bloomberg Finance L.P. All rights reserved. Region Replication HMaster RS1 RS2 RS3 RS4 RS5 RS6 system ZK T1 r1T1 r1 HBase Client https://www.youtube.com/watch?v=l6S-Vbs9WsU
  • 42. © 2019 Bloomberg Finance L.P. All rights reserved. Load Balancing HMaster RS1 RS2 RS3 RS4 RS5 RS6 system ZK HBase Client
  • 43. © 2019 Bloomberg Finance L.P. All rights reserved. Load Balancing HMaster Balancer/AM RS1 RS2 RS3 RS4 RS5 RS6 system ZK HBase Client
  • 44. © 2019 Bloomberg Finance L.P. All rights reserved. Balancer • Region Count Cost • Primary Region Count Cost • Table Skew Cost • Locality Cost • Rack Locality Cost • Region Replica Host Cost • Region Replica Rack Cost • Read Request Cost • Write Request Cost • Memstore Size Cost • Storefile Size Cost • Move Cost
  • 45. © 2019 Bloomberg Finance L.P. All rights reserved. Other Features • HBase Replication • HBase multi-tenancy support — https://www.youtube.com/watch?v=bZjz2G38Ju0 • HBase Co-processors and Filters — https://www.slideshare.net/Hadoop_Summit/hbase-coprocessors-uses-abuses-solutions
  • 46. © 2019 Bloomberg Finance L.P. All rights reserved. NameNode RS4 HMaster ZQuorum ZQuorum Operator View HMaster RS1 RS2 RS4 RS5 RS6 HBase ClientZK Quorum DN1 DN2 DN3 DN4 DN5 DN6 NameNode JVM OS JVM OS JVM OS JVM OS JVM OS JVM OS HW HW HW HW HW HW
  • 47. © 2019 Bloomberg Finance L.P. All rights reserved. ZooKeeper Availability • ZK Quorum • One leader and remaining followers — stat — ruok — mntr • Test for availability — e.g., List children of a znode
  • 48. © 2019 Bloomberg Finance L.P. All rights reserved. HBase Availability • Master Availability - http://hmaster-node:16010/jmx — "name" : "Hadoop:service=HBase,name=Master,sub=Server","tag.isActiveMaster" : "true" • Dead RegionServers — "name" : "Hadoop:service=HBase,name=Master,sub=Server", "numDeadRegionServers" : 0 • Region In Transition — "name" : "Hadoop:service=HBase,name=Master,sub=AssignmentManger","ritCount" : 0 • Test for availability — e.g., Query system table by listing tables
  • 49. © 2019 Bloomberg Finance L.P. All rights reserved. HDFS Availability • Namenode Availability - http://namenode-host:50070/jmx —"name" : "Hadoop:service=NameNode,name=FSNamesystem", "tag.HAState" : "active" • Dead Datanodes — "name" : "Hadoop:service=NameNode,name=NameNodeInfo", "DeadNodes" : "{}“ • Missing Blocks — "name" : "Hadoop:service=NameNode,name=NameNodeInfo", "NumberOfMissingBlocks" : 0 • Percentage Used — "name" : "Hadoop:service=NameNode,name=NameNodeInfo", "PercentUsed" : 59 • Under replicated blocks — "name" : "Hadoop:service=NameNode,name=FSNamesystemState","UnderReplicatedBlocks":0 • Test for availability — e.g., Append data to a test file
  • 50. © 2019 Bloomberg Finance L.P. All rights reserved. HBase Performance • RegionServer JMX metrics - http://rs-node:60300/jmx — "name“:"Hadoop:service=HBase,name=RegionServer,sub=Server“ • Blockcache hit ratio • Request counts • Request response time • Compaction related metrics • Region count • Flush related metrics • Percentage of files local • Split related metrics — "name" : "Hadoop:service=HBase,name=RegionServer,sub=Tables", • Table level metrics https://www.slideshare.net/MichaelStack4/hbaseconasia2018-track31-serving-billions-of-queries-in-millisecond-latencies
  • 51. © 2019 Bloomberg Finance L.P. All rights reserved. JVM • GC – JMX Metrics — "name" : "java.lang:type=GarbageCollector,name=ParNew", — "name" : "java.lang:type=GarbageCollector,name=ConcurrentMarkSweep", • GC Logging — -verbose:gc — -XX:+PrintHeapAtGC — -XX:+PrintGCDetails — -XX:+PrintGCTimeStamps — -XX:+PrintGCDateStamps — -XX:+PrintGCApplicationStoppedTime — -XX:+PrintClassHistogram — -XX:+PrintGCApplicationConcurrentTime — -XX:+PrintTenuringDistribution — -Xloggc:
  • 52. © 2019 Bloomberg Finance L.P. All rights reserved. OS/HW • Memory • CPU • Disk • Networking
  • 53. © 2019 Bloomberg Finance L.P. All rights reserved. Logs • ZooKeeper Log • HDFS — Namenode log — Datanode log • HBase — Master log — RegionServer log • OS — Syslog
  • 54. © 2019 Bloomberg Finance L.P. All rights reserved. Interacting with HBase • HBase shell — DDL: create namespace/table, alter — Security: grant, revoke — DML: get, put, scan — Tools: assign, compact, balance — General: status • HBase admin API • HBase client API
  • 55. © 2019 Bloomberg Finance L.P. All rights reserved. Data Backup / Restore • Snapshot — hbase shell > snapshot 'table', 'table_mmddyy’ • Restore from snapshot — hbase shell > restore_snapshot 'table_mmddyy' • Export Snapshot — $ hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot • CopyTable — hbase org.apache.hadoop.hbase.mapreduce.CopyTable
  • 56. © 2019 Bloomberg Finance L.P. All rights reserved. Thank You! Acknowledgement: Apache HBase Community Reference: http://hbase.apache.org Connect with Hadoop Team: hadoop@bloomberg.net
  • 57. © 2019 Bloomberg Finance L.P. All rights reserved. We are hiring! Questions? https://www.bloomberg.com/careers