HBase Low Latency
Nick Dimiduk, Hortonworks (@xefyr)
Nicolas Liochon, Scaled Risk (@nkeywal)
HBaseCon, May 5, 2014
  
Agenda
• Latency: what is it, how to measure it
• Write path
• Read path
• Next steps
  
What’s low latency
Latency is about percentiles
• Average != 50th percentile
• There are often orders of magnitude between « average » and « 95th percentile »
• Post 99% = « magical 1% ». Work in progress here.
• Meaning ranges from microseconds (high-frequency trading) to seconds (interactive queries)
• In this talk: milliseconds
  
Measure latency
bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation
• More options related to HBase: autoflush, replicas, …
• Latency measured in microseconds
• Easier for internal analysis
YCSB - Yahoo! Cloud Serving Benchmark
• Useful for comparison between databases
• Set of workloads already defined
  
Write path
• Two parts
• Single put (WAL)
• The client just sends the put
• Multiple puts from the client (new behavior since 0.96)
• The client is much smarter
• Four stages to look at for latency
• Start (establish TCP connections, etc.)
• Steady: when expected conditions are met
• Machine failure: expected as well
• Overloaded system
  
Single put: communication & scheduling
• Client: TCP connection to the server
• Shared: multiple threads on the same client use the same TCP connection
• Pooling is possible and does improve performance in some circumstances (see the sketch below)
• hbase.client.ipc.pool.size
• Server: multiple calls from multiple threads on multiple machines
• Can become thousands of simultaneous queries
• Scheduling is required
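A minimal sketch of opting into connection pooling from the client, assuming the 0.96-era HTable API; the pool size and table name are arbitrary examples, not recommendations:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.HTable;

  public class PooledClient {
    public static void main(String[] args) throws Exception {
      Configuration conf = HBaseConfiguration.create();
      // Use several TCP connections per server instead of one shared socket
      // (5 is only an example value; measure before and after).
      conf.setInt("hbase.client.ipc.pool.size", 5);
      HTable table = new HTable(conf, "usertable");  // placeholder table name
      // ... issue puts/gets from multiple threads ...
      table.close();
    }
  }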
  
	
  
Single put: real work
• The server must
• Write into the WAL queue
• Sync the WAL queue (HDFS flush)
• Write into the memstore
• The WAL queue is shared between all the regions/handlers
• Sync is avoided if another handler already did the work
• You may flush more than expected
  
Simple put: a small run

Percentile    Time in ms
Mean          1.21
50%           0.95
95%           1.50
99%           2.12
  
Latency sources
• Candidate one: network
• 0.5 ms within a datacenter
• Much less between nodes in the same rack

Percentile    Time in ms
Mean          0.13
50%           0.12
95%           0.15
99%           0.47
  
Latency sources
• Candidate two: HDFS flush
• We can still do better: HADOOP-7714 & sons.

Percentile    Time in ms
Mean          0.33
50%           0.26
95%           0.59
99%           1.24
  
Latency sources
• Millisecond world: everything can go wrong
• JVM
• Network
• OS scheduler
• File system
• All this goes into the post-99% percentile
• Requires monitoring
• Usually, using the latest version helps.
  
Latency sources
• Splits (and presplits)
• Autosharding is great!
• Puts have to wait
• Impact: seconds
• Balance
• Regions move
• Triggers a retry for the client
• hbase.client.pause = 100ms since HBase 0.96
• Garbage collection
• Impact: 10's of ms, even with a good config
• Covered with the read path of this talk
  
From steady to loaded and overloaded
• Number of concurrent tasks is a factor of
• Number of cores
• Number of disks
• Number of remote machines used
• Difficult to estimate
• Queues are doomed to happen
• hbase.regionserver.handler.count
• So, for low latency
• Pluggable scheduler since HBase 0.98 (HBASE-8884). Requires specific code.
• RPC priorities: work in progress (HBASE-11048)
  
From loaded to overloaded
• MemStore takes too much room: flush, then blocks quite quickly
• hbase.regionserver.global.memstore.size.lower.limit
• hbase.regionserver.global.memstore.size
• hbase.hregion.memstore.block.multiplier
• Too many HFiles: block until compactions keep up
• hbase.hstore.blockingStoreFiles
• Too many WAL files: flush and block
• hbase.regionserver.maxlogs
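These are RegionServer-side properties (normally set in hbase-site.xml); the sketch below only illustrates their names and types through the Hadoop Configuration API, and the values are made-up examples rather than recommendations:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;

  public class WriteBlockingLimits {
    public static void main(String[] args) {
      Configuration conf = HBaseConfiguration.create();
      conf.setFloat("hbase.regionserver.global.memstore.size", 0.40f);             // heap fraction where writes block (example)
      conf.setFloat("hbase.regionserver.global.memstore.size.lower.limit", 0.95f); // force flushes at this fraction of the global limit (example)
      conf.setInt("hbase.hregion.memstore.block.multiplier", 4);                   // per-region slack before blocking updates (example)
      conf.setInt("hbase.hstore.blockingStoreFiles", 20);                          // HFiles per store tolerated before blocking (example)
      conf.setInt("hbase.regionserver.maxlogs", 64);                               // WAL files tolerated before a forced flush (example)
    }
  }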
Machine failure
• Failure
• Detect
• Reallocate
• Replay WAL
• Replaying the WAL is NOT required for puts
• hbase.master.distributed.log.replay
• (default true in 1.0)
• Failure = Detect + Reallocate + Retry
• That's in the range of ~1s for simple failures
• Silent failures put you in the 10s range if the hardware does not help
• zookeeper.session.timeout
Single puts
• Millisecond range
• Spikes do happen in steady mode
• 100ms
• Causes: GC, load, splits
  
Streaming puts
HTable#setAutoFlushTo(false)
HTable#put
HTable#flushCommits
• As simple puts, but
• Puts are grouped and sent in the background
• Load is taken into account
• Does not block
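A minimal, self-contained sketch of the streaming pattern above, using the 0.96-era HTable API; table, family, and qualifier names are placeholders:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.HTable;
  import org.apache.hadoop.hbase.client.Put;
  import org.apache.hadoop.hbase.util.Bytes;

  public class StreamingPuts {
    public static void main(String[] args) throws Exception {
      Configuration conf = HBaseConfiguration.create();
      HTable table = new HTable(conf, "events");   // placeholder table name
      table.setAutoFlushTo(false);                 // buffer puts on the client
      try {
        for (long i = 0; i < 1000000; i++) {
          Put put = new Put(Bytes.toBytes("row-" + i));
          put.add(Bytes.toBytes("f"), Bytes.toBytes("q"), Bytes.toBytes(i));
          table.put(put);                          // returns quickly; sent in the background
        }
        table.flushCommits();                      // drain whatever is still buffered
      } finally {
        table.close();                             // close() also flushes
      }
    }
  }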
  
Multiple puts
hbase.client.max.total.tasks (default 100)
hbase.client.max.perserver.tasks (default 5)
hbase.client.max.perregion.tasks (default 1)
• Decouple the client from a latency spike of a region server
• Increase the throughput by 50% compared to the old multiput
• Makes splits and GC more transparent
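A sketch combining the client-side limits above with the multi-put API; the values shown just restate the defaults, and the table and column names are placeholders:

  import java.util.ArrayList;
  import java.util.List;
  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.HTable;
  import org.apache.hadoop.hbase.client.Put;
  import org.apache.hadoop.hbase.util.Bytes;

  public class MultiPut {
    public static void main(String[] args) throws Exception {
      Configuration conf = HBaseConfiguration.create();
      conf.setInt("hbase.client.max.total.tasks", 100);    // total in-flight requests from this client
      conf.setInt("hbase.client.max.perserver.tasks", 5);  // a slow RegionServer cannot monopolize the client
      conf.setInt("hbase.client.max.perregion.tasks", 1);  // at most one outstanding request per region
      HTable table = new HTable(conf, "events");           // placeholder table name
      try {
        List<Put> batch = new ArrayList<Put>();
        for (int i = 0; i < 1000; i++) {
          Put put = new Put(Bytes.toBytes("row-" + i));
          put.add(Bytes.toBytes("f"), Bytes.toBytes("q"), Bytes.toBytes(i));
          batch.add(put);
        }
        table.put(batch);   // one call; the client spreads it across RegionServers
      } finally {
        table.close();
      }
    }
  }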
  
Conclusion on write path
• Single puts can be very fast
• It's not a « hard real time » system: there are spikes
• Most latency spikes can be hidden when streaming puts
• Failures are NOT that difficult for the write path
• No WAL to replay
  
And now for the read path
Read path
• Gets/short scans are assumed for low-latency operations
• Again, two APIs
• Single get: HTable#get(Get)
• Multi-get: HTable#get(List<Get>)
• Four stages, same as the write path
• Start (TCP connection, …)
• Steady: when expected conditions are met
• Machine failure: expected as well
• Overloaded system: you may need to add machines or tune your workload
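A minimal sketch of both read APIs; table, row, and column names are placeholders:

  import java.util.ArrayList;
  import java.util.List;
  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.Get;
  import org.apache.hadoop.hbase.client.HTable;
  import org.apache.hadoop.hbase.client.Result;
  import org.apache.hadoop.hbase.util.Bytes;

  public class Reads {
    public static void main(String[] args) throws Exception {
      Configuration conf = HBaseConfiguration.create();
      HTable table = new HTable(conf, "events");   // placeholder table name
      try {
        // Single get: one RPC to the RegionServer hosting the row.
        Result one = table.get(new Get(Bytes.toBytes("row-42")));
        byte[] value = one.getValue(Bytes.toBytes("f"), Bytes.toBytes("q"));

        // Multi-get: the client groups the Gets by RegionServer (next slide).
        List<Get> gets = new ArrayList<Get>();
        for (int i = 0; i < 100; i++) {
          gets.add(new Get(Bytes.toBytes("row-" + i)));
        }
        Result[] results = table.get(gets);
      } finally {
        table.close();
      }
    }
  }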
  
Multi get / Client
Group Gets by RegionServer
Execute them one by one
  
Multi get / Server
Multi get / Server
Access latency magnitudes
Storage hierarchy: a different view (Dean, 2009)
Memory is 100,000x faster than disk!
Disk seek = 10ms
  
Known unknowns
• For each candidate HFile
• Exclude by file metadata
• Timestamp
• Rowkey range
• Exclude by bloom filter
StoreFileScanner#shouldUseScanner()
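Bloom filters are a per-column-family schema choice; a hedged sketch of enabling a row-level bloom when defining a table (table and family names are placeholders):

  import org.apache.hadoop.hbase.HColumnDescriptor;
  import org.apache.hadoop.hbase.HTableDescriptor;
  import org.apache.hadoop.hbase.TableName;
  import org.apache.hadoop.hbase.regionserver.BloomType;

  public class BloomSchema {
    public static void main(String[] args) {
      HTableDescriptor desc = new HTableDescriptor(TableName.valueOf("events")); // placeholder table
      HColumnDescriptor family = new HColumnDescriptor("f");                     // placeholder family
      family.setBloomFilterType(BloomType.ROW);  // lets a get skip HFiles that cannot contain the row
      desc.addFamily(family);
      // hand 'desc' to HBaseAdmin#createTable(...) when creating the table
    }
  }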
  
Unknown knowns
• Merge sort results polled from Stores
• Seek each scanner to a reference KeyValue
• Retrieve candidate data from disk
• Multiple HFiles => multiple seeks
• hbase.storescanner.parallel.seek.enable=true
• Short-circuit reads
• dfs.client.read.shortcircuit=true
• Block locality
• Happy clusters compact!
HFileBlock#readBlockData()
  
BlockCache
• Reuse previously read data
• Maximize cache hit rate
• Larger cache
• Temporal access locality
• Physical access locality
BlockCache#getBlock()
  
BlockCache Showdown
• LruBlockCache
• Default, on-heap
• Quite good most of the time
• Evictions impact GC
• BucketCache
• Off-heap alternative
• Serialization overhead
• Large memory configurations
http://www.n10k.com/blog/blockcache-showdown/
The L2 off-heap BucketCache makes a strong showing
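A hedged sketch of turning on an off-heap BucketCache with the 0.96/0.98-era property names. These are RegionServer-side settings (normally hbase-site.xml), shown here only through the Configuration API; the sizes are arbitrary examples, and the JVM additionally needs a large enough -XX:MaxDirectMemorySize:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;

  public class OffHeapCache {
    public static void main(String[] args) {
      Configuration conf = HBaseConfiguration.create();
      conf.set("hbase.bucketcache.ioengine", "offheap"); // back the BucketCache with direct memory
      conf.set("hbase.bucketcache.size", "4096");        // example capacity in MB
      conf.setFloat("hfile.block.cache.size", 0.2f);     // keep the on-heap LRU portion modest (example)
    }
  }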
  
Latency enemies: Garbage Collection
• Use heap. Not too much. With CMS.
• Max heap
• 30GB (compressed pointers)
• 8-16GB if you care about 9's
• Healthy cluster load
• regular, reliable collections
• 25-100ms pause on regular interval
• Overloaded RegionServer suffers GC overmuch
  
	
  
Off-heap to the rescue?
• BucketCache (0.96, HBASE-7404)
• Network interfaces (HBASE-9535)
• MemStore et al (HBASE-10191)
  
Latency enemies: Compactions
• Fewer HFiles => fewer seeks
• Evict data blocks!
• Evict index blocks!!
• hfile.block.index.cacheonwrite
• Evict bloom blocks!!!
• hfile.block.bloom.cacheonwrite
• OS buffer cache to the rescue
• Compacted data is still fresh
• Better than going all the way back to disk
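The two cache-on-write properties above are RegionServer-side settings; a small sketch of their names and types (whether enabling them pays off is workload-dependent):

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;

  public class CacheOnWrite {
    public static void main(String[] args) {
      Configuration conf = HBaseConfiguration.create();
      // Re-populate the BlockCache with index and bloom blocks as compactions write new HFiles,
      // instead of paying a disk seek the first time they are read back.
      conf.setBoolean("hfile.block.index.cacheonwrite", true);
      conf.setBoolean("hfile.block.bloom.cacheonwrite", true);
    }
  }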
  
Failure
• Detect + Reassign + Replay
• Strong consistency requires replay
• Locality drops to 0
• Cache starts from scratch
  
Hedging our bets
• HDFS hedged reads (2.4, HDFS-5776)
• Reads on secondary DataNodes
• Strongly consistent
• Works at the HDFS level
• Timeline consistency (HBASE-10070)
• Reads on « replica regions »
• Not strongly consistent
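Two hedged notes on the options above. HDFS hedged reads are enabled on the RegionServer's HDFS client, and the property values below are placeholders. The timeline-consistency API is shown as it eventually shipped with HBASE-10070 (HBase 1.0+), so it is not available in the 0.96/0.98 releases this talk targets:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.Consistency;
  import org.apache.hadoop.hbase.client.Get;
  import org.apache.hadoop.hbase.client.HTable;
  import org.apache.hadoop.hbase.client.Result;
  import org.apache.hadoop.hbase.util.Bytes;

  public class HedgedAndTimeline {
    public static void main(String[] args) throws Exception {
      Configuration conf = HBaseConfiguration.create();
      // HDFS-5776: fire a second read to another DataNode when the first one is slow.
      // These are RegionServer-side settings; the values are placeholders.
      conf.setInt("dfs.client.hedged.read.threadpool.size", 20);
      conf.setLong("dfs.client.hedged.read.threshold.millis", 10);

      // HBASE-10070 (HBase 1.0+): accept a possibly stale answer from a replica region.
      HTable table = new HTable(conf, "events");   // placeholder table name
      try {
        Get get = new Get(Bytes.toBytes("row-42"));
        get.setConsistency(Consistency.TIMELINE);
        Result result = table.get(get);
        boolean stale = result.isStale();          // true when served by a replica
      } finally {
        table.close();
      }
    }
  }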
  
Read latency in summary
• Steady mode
• Cache hit: < 1 ms
• Cache miss: + 10 ms per seek
• Writing while reading => cache churn
• GC: 25-100ms pause on regular interval

Network request + (1 - P(cache hit)) * (10 ms * seeks)

For example, with a 95% cache hit rate and two seeks per miss, that adds 0.05 * 20 ms = 1 ms on top of the network round trip.

• Same long-tail issues as the write path
• Overloaded: same scheduling issues as the write path
• Partial failures hurt a lot
  
HBase ranges for 99% latency

          Put                Streamed Multiput    Get                Timeline get
Steady    milliseconds       milliseconds         milliseconds       milliseconds
Failure   seconds            seconds              seconds            milliseconds
GC        10's of ms         milliseconds         10's of ms         milliseconds
  
What’s next
• Less GC
• Use fewer objects
• Off-heap
• Compressed BlockCache (HBASE-8894)
• Preferred location (HBASE-4755)
• The « magical 1% »
• Most tools stop at the 99% latency
• What happens after is much more complex
  
Thanks!
Nick Dimiduk, Hortonworks (@xefyr)
Nicolas Liochon, Scaled Risk (@nkeywal)
HBaseCon, May 5, 2014
  
