Covers numerous internal features, concepts, and implementations of Apache HBase. The focus is operational: investigating each component just enough to understand its role in Apache HBase and the generic problems it is trying to solve. Topics will range from HBase’s RPC system to the new Procedure v2 framework, to filesystem and ZooKeeper use, to backup and replication features, to region assignment and row locks. Each topic will be covered at a high level, attempting to distill the often complicated details down to the most salient information.
Architecture-wise, BigTable as a system is well understood and simple; it has been a decade since the paper was published.
Distributed systems are complex! They are easier to reason about if we consider them as smaller units.
It is important to be able to grep the logs and to know what to look for. DNS is important for ensuring consistent naming across all nodes.
HBase needs a distributed, resilient filesystem (see also Azure tech). Data that is written and sync’ed must be present! It relies on one writer per file (HDFS leases).
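A minimal sketch of that durability contract using the stock Hadoop FileSystem API (the path below is illustrative, not a real HBase WAL path):

    // Illustrative only: the HDFS durability primitive HBase's WAL relies on.
    import java.nio.charset.StandardCharsets;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class WalDurabilitySketch {
      public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        // HDFS grants a lease, so only one writer may hold this file at a time.
        try (FSDataOutputStream out = fs.create(new Path("/tmp/example-wal"))) {
          out.write("edit-1".getBytes(StandardCharsets.UTF_8));
          // After hsync() returns, the data must survive a crash -- that is
          // the "written and sync'ed must be present" guarantee above.
          out.hsync();
        }
      }
    }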
HBase tables on the filesystem: not just Key-Values (HFiles) but also serialized table metadata.
WAL durability is key here.
/hbase/data = All table data
/hbase/archive = HFiles before deletion
/hbase/WALs = Write-ahead logs
/hbase/oldWALs = WALs before deletion
/hbase/corrupt = Corrupt WALs
.regioninfo = metadata about this region
.tmp = general temporary space (compactions)
recovered.edits = artifact of WAL recovery
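To see the layout above on a live cluster, one can walk the root directory with the Hadoop FileSystem API (a hedged sketch; /hbase is the default hbase.rootdir, but it is configurable):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ListHBaseRoot {
      public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        // Expect children like data/, archive/, WALs/, oldWALs/, corrupt/
        for (FileStatus status : fs.listStatus(new Path("/hbase"))) {
          System.out.println(status.getPath().getName());
        }
      }
    }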
Compactions == fewer files, more efficient lookups
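For illustration, a major compaction can also be requested by hand through the client Admin API, rewriting many small HFiles into fewer, larger ones (the table name t1 is an assumption):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Admin;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;

    public class CompactExample {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Admin admin = conn.getAdmin()) {
          admin.majorCompact(TableName.valueOf("t1"));  // asynchronous request
        }
      }
    }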
“What happens when meta is unassigned?”
ZooKeeper provides authentication and authorization as well (for HBase: either no authentication, or Kerberos authentication via SASL).
ACLs are used to prevent users from changing sensitive data in ZK; only the HBase nodes themselves can change it.
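As a hedged sketch, the effective ACLs on a znode can be inspected with the plain ZooKeeper Java client (the connect string and znode path are illustrative; adjust for your cluster):

    import java.util.List;
    import org.apache.zookeeper.ZooKeeper;
    import org.apache.zookeeper.data.ACL;
    import org.apache.zookeeper.data.Stat;

    public class ZkAclCheck {
      public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("localhost:2181", 30000, event -> { });
        // On a secure cluster, expect SASL-scheme ACLs granting full rights
        // only to the HBase principals.
        List<ACL> acls = zk.getACL("/hbase/master", new Stat());
        acls.forEach(System.out::println);
        zk.close();
      }
    }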
Resilience is hard. How do we make sure that an operation will succeed if servers fail? How do we distinguish previous failed attempts from users concurrently trying to perform the same operation?
Table creation: a unique name, directories in HDFS, creating the initial region in HDFS, updating meta, enabling the table, etc.
The ProcV2 implementation is tricky/complicated, but it provides an internal API that makes operations easy to implement and reason about going forward. State is easy to inspect.
The model is proven by Accumulo’s FATE; a toy sketch of the state-machine idea follows.
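This is NOT the real ProcV2 API, only an illustration of the core idea: every step of (e.g.) table creation is an explicit state, and the current state is persisted before it executes, so a restarted master can resume from the last durable step:

    public class CreateTableSketch {
      enum State { WRITE_FS_LAYOUT, ADD_TO_META, ASSIGN_REGIONS, ENABLE_TABLE, DONE }

      private State state = State.WRITE_FS_LAYOUT;

      // The real framework serializes state to a procedure store (WAL);
      // this stand-in only marks where that durable write would happen.
      private void persist(State s) { /* write s durably before acting on it */ }

      public void run() {
        while (state != State.DONE) {
          persist(state);
          switch (state) {
            case WRITE_FS_LAYOUT: /* create table dirs, .regioninfo */  state = State.ADD_TO_META; break;
            case ADD_TO_META:     /* insert region rows in hbase:meta */ state = State.ASSIGN_REGIONS; break;
            case ASSIGN_REGIONS:  /* hand regions to RegionServers */    state = State.ENABLE_TABLE; break;
            case ENABLE_TABLE:    /* flip table state to ENABLED */      state = State.DONE; break;
            default: break;
          }
        }
      }
    }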
Lots of knobs, because we want to be able to optimize for throughput, latency, and fairness, which are often at odds with one another.
The Listener accepts the socket and dispatches to Readers. A Reader reads bytes off the wire (its Selector channel), deserializes the request, and sends it to the Scheduler, which places it on a call queue that a Handler will eventually drain.
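A much-simplified sketch of that reader/queue/handler split (illustrative only; the real RpcServer is considerably more involved):

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    public class RpcPipelineSketch {
      // The "call queue" the Scheduler feeds; its size is one of the many knobs.
      private final BlockingQueue<Runnable> callQueue = new ArrayBlockingQueue<>(100);

      // A Reader would deserialize bytes from its Selector channel into a call,
      // then hand it to the Scheduler, which enqueues it:
      void schedule(Runnable call) throws InterruptedException {
        callQueue.put(call);  // blocks when the queue is full
      }

      // Handlers are a fixed pool of threads draining the queue:
      void startHandlers(int numHandlers) {
        for (int i = 0; i < numHandlers; i++) {
          Thread handler = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
              try {
                callQueue.take().run();  // execute the deserialized request
              } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
              }
            }
          });
          handler.setDaemon(true);
          handler.start();
        }
      }
    }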
Aka “you dun goofed up”
CopyTable – slow, and requires both the source and destination clusters to be up. Not really desirable.
Snapshots – great for one-offs, though they can grow DFS usage. Requires coordinating a flush for a full backup (see the snapshot sketch after this list).
B&R (Backup & Restore) – snapshots plus the ability to track WALs for incremental backups since the last full backup.
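As a concrete example of the snapshot option, a hedged sketch using the client Admin API (table and snapshot names are assumptions for illustration):

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Admin;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;

    public class SnapshotExample {
      public static void main(String[] args) throws Exception {
        try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Admin admin = conn.getAdmin()) {
          TableName table = TableName.valueOf("t1");
          admin.snapshot("t1-backup", table);  // one-off, point-in-time copy
          // Restoring to a new table leaves the original untouched:
          admin.cloneSnapshot("t1-backup", TableName.valueOf("t1_restored"));
        }
      }
    }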