Improvements to Apache HBase and Its Applications in Alibaba Search

HBase in Alibaba Search
ShaoXuan Wang, Yu Li
{shaoxuan.wsx, jueding.ly}@alibaba-inc.com

Agenda
n  History of HBase in Alibaba Search
n  Scenarios
l  Search Indexing
l  Machine Learning for Recommendation and Targeting
n  HBase Improvements
l  Multiple WAL, Fine-grained IO Control, etc.
n  HBase Extensions
l  HQueue: light-weight message queue impl on HBase
l  HTunnel: update notification service based on HBase/HQueue
n  Challenges and future

About Us
n  Alibaba
l  Operates the world largest online and mobile marketplace, thriving in the world
largest e-commerce market.
l  Annual GMV $394Billion in year 2015
l  alibaba.com, aliexpress.com, taobao.com, tmall.com
n  Alibaba Search
l  Serving 410 million monthly active users
l  Personalized recommendation and targeting via machine learning system
l  Major contributor to GMV

HBase in Alibaba Search
n HBase is the core storage in Alibaba search system, since 2010
n History of version used online
l  2010~2014: 0.20.6à0.90.3à0.92.1à0.94.1à0.94.2à0.94.5
l  2014~2015: 0.94à0.98.1à0.98.4à0.98.8à0.98.12
l  2016: 0.98.12à1.1.2
n Current scale
l  3 clusters each with 1,000+ nodes
l  Shared with Flink/Yarn
l  Serving over 10Million/s Ops throughout the day

Data Platform for Search Indexing
n Data Storage for Batch and Streaming Processing
Data Source
Hadoop cluster
HBase
Batch &
Streaming Event
Offline & Real
Time Processing
Exporting
Ali ODPS MySQL
Search Engines
HBase HBase
HDFS HDFS HDFS

Data Platform for Search Indexing
n Continuous Updated Materialized View on HBase
Streaming
Data
Join (Apply UDF)
Source Source Source
Join (Apply UDF)
Materialized View
Materialized View
User Defined
Processing DAG
Batch
Data
Continuous
Updated
Result

Database and Queue for Machine Learning
[Figure: machine learning pipeline diagram. Labels: Online System; Online log; Log Parsing; UDF; HQueue; User ID; Item ID; User Models; Item Models; Training; Aggregate Updates; Machine Learning Models; Δw; Export Model; Model Updates; Flink+MR Processing over Yarn.]

HBase Improvements: Multiple WAL
n Multiple WAL: HBASE-14457
l  Fix replication under multiwal (HBASE-6617)
l  Namespace-based region grouping strategy (HBASE-14456)
l  Performance improvement observed in production usage
n  Pure SATA disk: ~20% better than single WAL
n  ONE_SSD storage policy with SATA-SSD: ~40% better
n  PCIe-SSD: hsync acceptable with multiwal

HBase Improvements: IO Isolation
n Challenge from shared storage/computing nodes
n Take usage of Storage Policy
l  ALL_SSD for WALs
l  ONE_SSD for HFile: Support CF-level storage policy (HBASE-14061)
l  Support setting storage policy in Bulkload (HBASE-15172)
l  Only use SATA disk for MR temporary data (mapreduce.cluster.local.dir)

HBase Improvements: IO Isolation
n Better control of Disk/Network IO spike
l  Compaction throttling (HBASE-8329) + Flush throttling (HBASE-14969)
l  Per-CF flush improvement: further less flush on small CF (HBASE-14906)

HBase Improvements: Machine Learning specific
n Remove some synchronous in the RpcServer responder (HBASE-11297)
l  Verified to be important if heavy access from one single client
l  Not included in 0.98, but nice to have
n Improve parallel reading a single key from BucketCache (HBASE-14463)
l  Synchronous => read/write lock
l  Not included in 0.98, but nice to have

HBase Improvements: Monitoring
n Add metrics for get/scan/multi/mutate count separately (HBASE-15163)

n Add back HFile HDFS op latency metrics (HBASE-15160)

HBase Extension: HQueue
n A light-weight message queue service embedded in HBase
n Why not simply using Kafka?
l  Easier deployment/management: queue and storage service in a whole
l  Comparable perf/features through native features HBase supplies
n  TTL
n  Load balance
n  Fault tolerant/Failover
n  High throughput
n  Good scalability
n  Replication

n Message write process

n Message read process

n An update-notification service based on HBase/HQueue
n Similar mechanism to replication
n Architecture
HBase Extension: HTunnel

HBase Extension: HTunnel
n Working process

Challenges and future
n Meet hardware revolution
l  PCIe-SSD plus 10Gb network card
l  Optimization required on RPC/HDFS layer
l  CPU may become bottleneck
n Meet challenges from new database
l  HBase v.s. Rocksdb
