HBase Storage Internals, present and future!

Matteo Bertozzi | @Cloudera
March 2013 - Hadoop Summit Europe
What is HBase

• Open source storage manager that provides random read/write on top of HDFS
• Provides tables with a "Key:Column/Value" interface
    • Dynamic columns (qualifiers), no schema needed
    • "Fixed" column groups (families)
    • table[row:family:column] = value
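A minimal sketch of that logical data model (this is an illustrative toy, not the real HBase client API): a table behaves like a sorted map of row -> family -> qualifier -> value, where families are fixed at creation time but qualifiers can be added freely.

```python
# Toy model of the HBase data model: families are "fixed", qualifiers are dynamic.
class ToyTable:
    def __init__(self, families):
        self.families = set(families)   # "fixed" column groups
        self.rows = {}                  # row -> {family -> {qualifier -> value}}

    def put(self, row, family, qualifier, value):
        if family not in self.families:
            raise KeyError("unknown family: %s" % family)
        self.rows.setdefault(row, {}).setdefault(family, {})[qualifier] = value

    def get(self, row, family, qualifier):
        return self.rows[row][family][qualifier]

t = ToyTable(["cf"])
t.put("row-1", "cf", "any-new-qualifier", "value")  # no schema change needed
print(t.get("row-1", "cf", "any-new-qualifier"))    # prints "value"
```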
HBase EcoSystem

• Apache Hadoop HDFS for data durability and reliability (Write-Ahead Log)
• Apache ZooKeeper for distributed coordination
• Apache Hadoop MapReduce: built-in support for running MapReduce jobs

[Diagram: applications and MapReduce jobs sit on top of HBase, which relies on ZooKeeper and HDFS]
How Does HBase Work?
"View from 10,000 ft"
Master, Region Servers and Regions

[Diagram: a Client talks to ZooKeeper and the Master; the Master coordinates several Region Servers, each hosting a set of Regions backed by HDFS]

• Region Server
    • Server that hosts a set of Regions
    • Responsible for handling reads and writes
• Region
    • The basic unit of scalability in HBase
    • Subset of the table's data
    • Contiguous, sorted range of rows stored together
• Master
    • Coordinates the HBase cluster
    • Assignment/balancing of the Regions
    • Handles admin operations
        • create/delete/modify table, ...
Autosharding and the .META. Table

• A Region is a subset of the table's data
• When there is too much data in a Region...
    • a split is triggered, creating 2 regions
• The association "Region -> Server" is stored in a system table
• The location of .META. is stored in ZooKeeper

    Table       Start Key   Region ID   Region Server
    testTable   Key-00      1           machine01.host
    testTable   Key-31      2           machine03.host
    testTable   Key-65      3           machine02.host
    testTable   Key-83      4           machine01.host
    ...         ...         ...         ...
    users       Key-AB      1           machine03.host
    users       Key-KG      2           machine02.host

[Diagram: machine01 hosts Regions 1 and 4 of testTable; machine02 hosts Region 3 of testTable and Region 1 of users; machine03 hosts Region 2 of testTable and Region 2 of users]
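The lookup a client performs against .META. can be sketched as a search over start keys: find the rightmost region whose start key is less than or equal to the requested key. This toy uses the testTable entries from the table above (the tuple layout is invented for illustration):

```python
# Sketch of "which region holds this key?" -- .META. is effectively a
# sorted map of start-key -> (region, server).
import bisect

META = [  # sorted by start key, mirroring the testTable rows above
    ("Key-00", "region-1", "machine01.host"),
    ("Key-31", "region-2", "machine03.host"),
    ("Key-65", "region-3", "machine02.host"),
    ("Key-83", "region-4", "machine01.host"),
]
START_KEYS = [e[0] for e in META]

def locate(key):
    # rightmost region whose start key is <= key
    i = bisect.bisect_right(START_KEYS, key) - 1
    return META[max(i, 0)]

print(locate("Key-42"))  # falls in the region starting at Key-31
```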
The Write Path – Create a New Table

• The client asks the Master to create a new table
    • hbase> create 'myTable', 'cf'
• The Master
    • Stores the table information ("schema")
    • Creates Regions based on the key-splits provided
        • if no splits are provided, one single region by default
    • Assigns the Regions to the Region Servers
        • The assignment Region -> Server is written to a system table called ".META."

[Diagram: the Client calls createTable() on the Master; the Master stores the table "metadata", then assigns the Regions across the Region Servers and "enables" the table]
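How split keys turn into region boundaries can be sketched in a few lines (a simplification of what the Master does; `None` here stands for an open-ended boundary):

```python
# Sketch: derive region key ranges from user-provided split keys at table
# creation time. No splits -> one region covering the whole key space.
def regions_for(split_keys):
    bounds = [None] + sorted(split_keys) + [None]   # None = open-ended
    return list(zip(bounds[:-1], bounds[1:]))

print(regions_for([]))          # one region: [(None, None)]
print(regions_for(["m", "t"]))  # three regions: (None,'m'), ('m','t'), ('t',None)
```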
The Write Path – "Inserting" Data

• table.put(row-key:family:column, value)
• The client asks ZooKeeper for the location of .META.
• The client scans .META. searching for the Region Server responsible for handling the key
• The client asks that Region Server to insert/update/delete the specified key/value
• The Region Server processes the request and dispatches it to the Region responsible for the key
    • The operation is written to a Write-Ahead Log (WAL)
    • ...and the KeyValues are added to the store: the "MemStore"

[Diagram: Client -> ZooKeeper ("Where is .META.?") -> Region Server hosting .META. (scan) -> Region Server hosting the key (insert KeyValue)]
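The last two sub-bullets, WAL first and MemStore second, can be sketched like this (a toy model; in the real system the WAL lives on HDFS and the MemStore is a sorted structure):

```python
# Sketch of the region-server write step: append the mutation to the
# write-ahead log first (durability), then apply it to the MemStore.
WAL = []        # append-only log; on HDFS in the real system
MEMSTORE = {}   # key-sorted in the real system; a dict is enough here

def put(row, family, qualifier, value):
    record = (row, family, qualifier, value)
    WAL.append(record)                          # 1. write-ahead log
    MEMSTORE[(row, family, qualifier)] = value  # 2. in-memory store

put("row-1", "cf", "col", "v1")
put("row-1", "cf", "col", "v2")  # update: the WAL keeps both entries
print(len(WAL), MEMSTORE[("row-1", "cf", "col")])  # 2 v2
```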
The Write Path – Append Only to Random R/W

• Files in HDFS are
    • Append-only
    • Immutable once closed
• Does HBase provide random writes?
    • ...not really, from a storage point of view
    • KeyValues are stored in memory and written to disk under pressure
        • Don't worry, your data is safe in the WAL!
            • (The Region Server can recover data from the WAL in case of a crash)
    • But this allows data to be sorted by key before being written to disk
• Deletes are like inserts, but with a "remove me" flag

[Diagram: a Region Server with its WAL and Regions; MemStore + Store Files (HFiles); a store file holds Key0 through Key5 with their values, sorted]
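A flush can be sketched as: take whatever order the MemStore received the writes in, sort by key, and emit one immutable file (tuples stand in for HFile records here):

```python
# Sketch of a MemStore flush: under memory pressure the in-memory entries
# are sorted by key and written out as a single immutable store file.
memstore = {"Key3": "value 3", "Key0": "value 0", "Key1": "value 1"}
store_files = []

def flush():
    global memstore
    # an immutable, key-sorted "file"
    store_files.append(tuple(sorted(memstore.items())))
    memstore = {}

flush()
print(store_files[0][0])  # ('Key0', 'value 0') -- sorted before hitting disk
```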
The Read Path – "Reading" Data

• The client asks ZooKeeper for the location of .META.
• The client scans .META. searching for the Region Server responsible for handling the key
• The client asks that Region Server to get the specified key/value
• The Region Server processes the request and dispatches it to the Region responsible for the key
    • The MemStore and the Store Files are scanned to find the key

[Diagram: Client -> ZooKeeper ("Where is .META.?") -> Region Server hosting .META. (scan) -> Region Server hosting the key (Get Key)]
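The final step, scanning the MemStore and the store files, can be sketched as a lookup that consults every source, newest first, so that an in-memory update shadows older on-disk versions (a toy model; real HBase tracks versions by timestamp):

```python
# Sketch of a point get: the key may live in the MemStore or in any store
# file, so all of them are consulted; the newest source wins.
mem = {"Key1": "value 1.1"}
disk_files = [  # oldest to newest, each sorted by key in the real system
    {"Key0": "value 0.0", "Key1": "value 1.0"},
    {"Key2": "value 2.0"},
]

def get(key):
    for source in [mem] + disk_files[::-1]:  # newest first
        if key in source:
            return source[key]
    return None

print(get("Key1"))  # value 1.1 -- the MemStore shadows value 1.0 on disk
```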
The Read Path – Append Only to Random R/W

• Each flush creates a new file
• Each file has its KeyValues sorted by key
• Two or more files can contain the same key (updates/deletes)
• To find a key you need to scan all the files
    • ...with some optimizations:
    • Filter files by start/end key
    • Keep a bloom filter on each file

    File 1              File 2
    Key0 – value 0.0    Key0 – value 0.1
    Key2 – value 2.0    Key1 – value 1.0
    Key3 – value 3.0    Key5 – [deleted]
    Key5 – value 5.0    Key6 – value 6.0
    Key8 – value 8.0    Key7 – value 7.0
    Key9 – value 9.0
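The two pruning optimizations can be sketched together: skip a file when the key falls outside its start/end range, or when its bloom filter says the key is absent. A plain set stands in for the bloom filter here (a real bloom filter can also return false positives, which only costs an extra file scan, never a wrong answer):

```python
# Sketch of store-file pruning before a scan: key-range check, then a
# bloom-style membership check (an exact set here, for illustration).
files = [
    {"range": ("Key0", "Key5"), "bloom": {"Key0", "Key2", "Key3", "Key5"}},
    {"range": ("Key6", "Key9"), "bloom": {"Key6", "Key9"}},
]

def files_to_scan(key):
    out = []
    for f in files:
        lo, hi = f["range"]
        if lo <= key <= hi and key in f["bloom"]:
            out.append(f)
    return out

# Key9: in range and in the bloom set of file 2 -> 1 file to scan.
# Key7: in range of file 2, but the bloom check prunes it -> 0 files.
print(len(files_to_scan("Key9")), len(files_to_scan("Key7")))
```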
HFile
HBase Store File Format
HFile Format

• Only sequential writes, just append(key, value)
• Large sequential reads perform better
• Why group records into blocks?
    • Easy to split
    • Easy to read
    • Easy to cache
    • Easy to index (if records are sorted)
    • Block compression (snappy, lz4, gz, ...)

    Key/Value (record) layout:      File layout (blocks):
    Key Length : int                Block: Header, Record 0 ... Record N
    Value Length : int              Block: Header, Record 0 ... Record N
    Key : byte[]                    ...
    Value : byte[]                  Index 0 ... Index N
                                    Trailer
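The record layout above, length-prefixed key and value, can be encoded and decoded with a few lines (a sketch of the framing only; real HFiles add block headers, indexes, checksums and a trailer):

```python
# Sketch of the HFile record framing: key length (int), value length (int),
# then the raw key and value bytes, big-endian as in Java serialization.
import struct

def encode_record(key, value):
    return struct.pack(">ii", len(key), len(value)) + key + value

def decode_record(buf, offset=0):
    klen, vlen = struct.unpack_from(">ii", buf, offset)
    off = offset + 8
    return buf[off:off + klen], buf[off + klen:off + klen + vlen]

rec = encode_record(b"row-1", b"hello")
print(decode_record(rec))  # (b'row-1', b'hello')
```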
Data Block Encoding

• "Be aware of the data"
• Block encoding allows compressing the key based on what we know:
    • Keys are sorted... prefixes are likely similar in most cases
    • One file contains keys from one family only
    • Timestamps are "similar"; we can store the diff "on-disk"
    • Type is "put" most of the time...

    KeyValue layout:
    Row Length : short
    Row : byte[]
    Family Length : byte
    Family : byte[]
    Qualifier : byte[]
    Timestamp : long
    Type : byte
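The "sorted keys share prefixes" observation can be sketched as prefix encoding: store, for each key, the length of the prefix it shares with the previous key plus the differing suffix. This is close in spirit to HBase's prefix block encoding, but simplified to whole keys:

```python
# Sketch of prefix encoding over sorted keys: (shared-prefix length, suffix).
def common_prefix_len(a, b):
    n = 0
    while n < min(len(a), len(b)) and a[n] == b[n]:
        n += 1
    return n

def encode(keys):
    out, prev = [], b""
    for k in keys:
        p = common_prefix_len(prev, k)
        out.append((p, k[p:]))
        prev = k
    return out

def decode(encoded):
    keys, prev = [], b""
    for p, suffix in encoded:
        k = prev[:p] + suffix
        keys.append(k)
        prev = k
    return keys

keys = [b"row-001:cf:a", b"row-001:cf:b", b"row-002:cf:a"]
enc = encode(keys)
print(enc[1])  # (11, b'b') -- only one suffix byte stored for the 2nd key
```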
Compactions
Optimize the read-path
Compactions

• Reduce the number of files to look into during a scan
    • Removing duplicated keys (updated values)
    • Removing deleted keys
• Create a new file by merging the content of 2+ files
• Remove the old files

    File 1              File 2              Compacted file
    Key0 – value 0.0    Key0 – value 0.1    Key0 – value 0.1
    Key2 – value 2.0    Key1 – value 1.0    Key1 – value 1.0
    Key3 – value 3.0    Key4 – value 4.0    Key2 – value 2.0
    Key5 – value 5.0    Key5 – [deleted]    Key3 – value 3.0
    Key8 – value 8.0    Key6 – value 6.0    Key4 – value 4.0
    Key9 – value 9.0    Key7 – value 7.0    Key6 – value 6.0
                                            Key7 – value 7.0
                                            Key8 – value 8.0
                                            Key9 – value 9.0
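The merge shown above can be sketched directly: combine the two key-sorted files, let the newer file win on duplicate keys, and drop keys whose newest entry is a delete marker (a toy major compaction; real HBase streams the sorted files instead of materializing dicts):

```python
# Sketch of a major compaction: newest value per key wins, deletes vanish.
DELETED = object()  # stand-in for the "remove me" tombstone flag

older = [("Key0", "value 0.0"), ("Key2", "value 2.0"), ("Key5", "value 5.0")]
newer = [("Key0", "value 0.1"), ("Key5", DELETED), ("Key6", "value 6.0")]

def compact(old, new):
    newest = dict(old)
    newest.update(new)  # the newer file wins on duplicate keys
    return [(k, v) for k, v in sorted(newest.items()) if v is not DELETED]

print(compact(older, newer))
# Key0 keeps the newer 0.1, Key5's tombstone removes it entirely.
```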
Pluggable Compactions

• Try different algorithms
• Be aware of the data
    • Time series? I guess no updates from the '80s
• Be aware of the requests
    • Compact based on statistics
    • which files are hot and which are not
    • which keys are hot and which are not

[Diagram: the same two store files as before, merged into a single compacted file]
Snapshots
Zero-copy snapshots and table clones
How Does Taking a Snapshot Work?

• The Master orchestrates the Region Servers
    • the communication is done via ZooKeeper
    • using a "2-phase-commit-like" transaction (prepare/commit)
• Each Region Server is responsible for taking its "piece" of the snapshot
    • For each Region, store the metadata information needed
    • (list of Store Files, WALs, region start/end keys, ...)

[Diagram: the Master coordinates two Region Servers through ZooKeeper; each Region Server has its WAL, Regions and Store Files (HFiles)]
What Is a Snapshot?

• "A snapshot is not a copy of the table"
• A snapshot is a set of metadata information:
    • The table "schema" (column families and attributes)
    • The Regions information (start key, end key, ...)
    • The list of Store Files
    • The list of active WALs

[Diagram: the Master and Region Servers as before; the snapshot references the existing WALs and Store Files (HFiles) in place]
Cloning a Table from a Snapshot

• hbase> clone_snapshot 'snapshotName', 'tableName' ...
• Creates a new table with the data "contained" in the snapshot
• No data copies involved
    • HFiles are immutable
    • and shared between tables and snapshots
• You can insert/update/remove data from the new table
    • No repercussions on the snapshot, the original table or other cloned tables
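Why a clone needs no data copy can be sketched in a few lines: the clone starts out referencing the snapshot's immutable files, and any later flush lands in the clone's own file list, leaving the snapshot untouched (names here are invented for illustration):

```python
# Sketch of a zero-copy clone: shared immutable files + private new files.
snapshot = {"files": ["hfile-1", "hfile-2"]}   # metadata only, no data copy

class ClonedTable:
    def __init__(self, snap):
        self.shared = list(snap["files"])  # referenced, not copied
        self.own = []                      # flushes after the clone land here

    def flush_new_file(self, name):
        self.own.append(name)              # the snapshot is unaffected

clone = ClonedTable(snapshot)
clone.flush_new_file("hfile-3")
print(clone.shared, clone.own, snapshot["files"])
```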
Compactions & Archiving

• HFiles are immutable, and shared between tables and snapshots
• On compaction or table deletion, files are removed from disk
• If files are referenced by a snapshot or a cloned table:
    • The file is moved to an "archive" directory
    • and deleted later, when there are no references to it
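The archiving rule amounts to a reference check before deletion, which can be sketched like this (a toy model; the real cleaner chore inspects snapshot and clone metadata rather than a counter):

```python
# Sketch of the archiving decision: referenced files are archived, not deleted.
references = {"hfile-1": 2, "hfile-2": 0}  # snapshot/clone references per file
archive, deleted = [], []

def remove_file(name):
    if references.get(name, 0) > 0:
        archive.append(name)    # kept until the last reference goes away
    else:
        deleted.append(name)    # safe to remove from disk

remove_file("hfile-1")
remove_file("hfile-2")
print(archive, deleted)  # ['hfile-1'] ['hfile-2']
```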
0.96 Is Coming Up

• Moving RPC to Protobuf
    • Allows rolling upgrades with no surprises
• HBase Snapshots
• Pluggable Compactions
• Removal of -ROOT-
• Table Locks
0.98 and Beyond

• Transparent Table/Column-Family Encryption
• Cell-level security
• Multiple WALs per Region Server (MTTR)
• Data Placement Awareness (MTTR)
• Data Type Awareness
• Compaction policies based on the data needs
• Managing blocks directly (instead of files)
Questions?
Thank you!
Matteo Bertozzi, @cloudera | @th30z
HBase Storage Internals

  • 1. HBase  Storage  Internals,  present  and  future!   Ma6eo  Bertozzi  |  @Cloudera    March  2013  -­‐  Hadoop  Summit  Europe   1
  • 2. What  is  HBase   •  Open  source  Storage  Manager  that  provides  random   read/write  on  top  of  HDFS   •  Provides  Tables  with  a  “Key:Column/Value”  interface   •  Dynamic  columns  (qualifiers),  no  schema  needed   •  “Fixed”  column  groups  (families)   •  table[row:family:column]  =  value   2
  • 3. HBase  EcoSystem   •  Apache  Hadoop  HDFS  for  data   durability  and  reliability  (Write-­‐Ahead   App   MR   Log)   •  Apache  ZooKeeper  for  distributed   coordina]on   ZK   HDFS   •  Apache  Hadoop  MapReduce  built-­‐in   support  for  running  MapReduce  jobs   3
  • 4. How  HBase  Works?   “View  from  10000c”   4
  • 5. Master,  Region  Servers  and  Regions   •  Region  Server   Client   •  Server  that  contains  a  set  of  Regions   ZooKeeper   •  Responsible  to  handle  reads  and  writes   •  Region   Master   •  The  basic  unit  of  scalability  in  HBase   •  Subset  of  the  table’s  data   Region  Server   Region  Server   Region  Server   •  Con]guous,  sorted  range  of  rows  stored   Region   Region   Region   together.   Region   Region   Region   •  Master   Region   Region   Region   •  Coordinates  the  HBase  Cluster   HDFS   •  Assignment/Balancing  of  the  Regions   •  Handles  admin  opera]ons   •  create/delete/modify  table,  …   5
  • 6. Autosharding and the .META. table
    • A Region is a subset of the table's data
    • When there is too much data in a Region...
      • a split is triggered, creating 2 regions
    • The association "Region -> Server" is stored in a system table (.META.)
    • The location of .META. is stored in ZooKeeper

      Table      | Start Key | Region ID | Region Server
      testTable  | Key-00    | 1         | machine01.host
      testTable  | Key-31    | 2         | machine03.host
      testTable  | Key-65    | 3         | machine02.host
      testTable  | Key-83    | 4         | machine01.host
      ...        | ...       | ...       | ...
      users      | Key-AB    | 1         | machine03.host
      users      | Key-KG    | 2         | machine02.host
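The "Region -> Server" lookup amounts to finding the region whose key range covers the requested row: the region with the greatest start key less than or equal to the row key. A hedged sketch using a sorted list and binary search, reusing the example entries from the slide:

```python
import bisect

# (start_key, region_server) entries for testTable, sorted by start key,
# mirroring the .META. example on the slide. Illustrative data only.
regions = [
    ("Key-00", "machine01.host"),
    ("Key-31", "machine03.host"),
    ("Key-65", "machine02.host"),
    ("Key-83", "machine01.host"),
]
start_keys = [start for start, _ in regions]

def find_region_server(row_key):
    # The covering region is the one with the greatest start key <= row_key.
    idx = bisect.bisect_right(start_keys, row_key) - 1
    return regions[idx][1]

print(find_region_server("Key-42"))  # falls in [Key-31, Key-65) -> machine03.host
```

After a split, two new (start_key, server) rows simply replace the old one in this table.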
  • 7. The Write Path - Create a New Table
    • The client asks the master to create a new table
      • hbase> create 'myTable', 'cf'
    • The Master
      • Stores the table information ("schema")
      • Creates Regions based on the key-splits provided
        • if no splits are provided, one single region by default
      • Assigns the Regions to the Region Servers
        • The assignment Region -> Server is written to a system table called ".META."
  • 8. The Write Path - "Inserting" Data
    • table.put(row-key:family:column, value)
    • The client asks ZooKeeper for the location of .META.
    • The client scans .META. searching for the Region Server responsible for handling the key
    • The client asks that Region Server to insert/update/delete the specified key/value
    • The Region Server processes the request and dispatches it to the Region responsible for the key
      • The operation is written to a Write-Ahead Log (WAL)
      • ...and the KeyValues are added to the store: the "MemStore"
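The WAL-then-MemStore ordering on the Region Server can be sketched as a small simulation (illustrative only, not the Region Server implementation):

```python
# Sketch of the write path inside a Region Server: append to the WAL
# first (durability), then update the in-memory MemStore.
wal = []        # stand-in for the Write-Ahead Log on HDFS
memstore = {}   # stand-in for the in-memory store

def put(key, value):
    wal.append(("put", key, value))  # durable first: WAL append
    memstore[key] = value            # then the in-memory update

put("row1:cf:col", "v1")
put("row2:cf:col", "v2")

# After a crash, the MemStore can be rebuilt by replaying the WAL.
recovered = {}
for op, key, value in wal:
    if op == "put":
        recovered[key] = value
assert recovered == memstore
```

The replay loop at the end is exactly why losing the MemStore is safe: every operation it held is also in the log.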
  • 9. The Write Path - Append-Only to Random R/W
    • Files in HDFS are
      • Append-only
      • Immutable once closed
    • Does HBase provide random writes?
      • ...not really, from a storage point of view
    • KeyValues are stored in memory and written to disk on pressure
      • Don't worry, your data is safe in the WAL!
      • (The Region Server can recover data from the WAL in case of crash)
    • But this allows sorting data by key before writing to disk
    • Deletes are like inserts, but with a "remove me flag"
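Flushing the MemStore into an immutable, key-sorted file, with deletes recorded as tombstone markers, might be sketched like this (an illustrative model, not the actual store):

```python
# Sketch: the MemStore accumulates KeyValues (deletes included, as
# tombstones) and a flush writes them out sorted by key.
memstore = {}
store_files = []  # each flush produces a new immutable "file"

TOMBSTONE = object()  # the "remove me flag"

def put(key, value):
    memstore[key] = value

def delete(key):
    # A delete is just another insert, carrying a tombstone marker.
    memstore[key] = TOMBSTONE

def flush():
    # Sort by key before writing: the file is append-only and immutable.
    store_files.append(sorted(memstore.items()))
    memstore.clear()

put("key1", "value1")
put("key0", "value0")
delete("key1")
flush()
print([k for k, _ in store_files[0]])  # -> ['key0', 'key1'] (sorted)
```

The tombstone stays in the file; it is only physically dropped later, at compaction time (slide 16).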
  • 10. The Read Path - "Reading" Data
    • The client asks ZooKeeper for the location of .META.
    • The client scans .META. searching for the Region Server responsible for handling the key
    • The client asks that Region Server to get the specified key/value
    • The Region Server processes the request and dispatches it to the Region responsible for the key
      • MemStore and Store Files are scanned to find the key
  • 11. The Read Path - Append-Only to Random R/W
    • Each flush creates a new file
    • Each file has KeyValues sorted by key
    • Two or more files can contain the same key (updates/deletes)
    • To find a key you need to scan all the files
      • ...with some optimizations
      • Filter files by start/end key
      • Have a bloom filter on each file
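Looking up one key across several store files, where newer files shadow older values and a tombstone hides a deleted key, can be sketched as follows (simplified: real readers also skip files via start/end keys and bloom filters):

```python
# Sketch: each flush produced one file; newer files take precedence.
TOMBSTONE = "__deleted__"

# Oldest first; mirrors the slide: key5 was written, then deleted.
store_files = [
    {"key0": "value 0.0", "key2": "value 2.0", "key5": "value 5.0"},
    {"key0": "value 0.1", "key5": TOMBSTONE, "key6": "value 6.0"},
]

def get(key):
    # Scan files newest-to-oldest; the first hit wins.
    for sf in reversed(store_files):
        if key in sf:
            value = sf[key]
            return None if value == TOMBSTONE else value
    return None

print(get("key0"))  # -> value 0.1 (the newer file wins)
print(get("key5"))  # -> None     (hidden by the tombstone)
```

This is why read latency degrades as flushes accumulate, and why compactions (next slides) matter.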
  • 12. HFile
    HBase Store File Format
  • 13. HFile Format
    • Only sequential writes, just append(key, value)
    • Large sequential reads are better
    • Why group records in blocks?
      • Easy to split
      • Easy to read
      • Easy to cache
      • Easy to index (if records are sorted)
      • Block compression (snappy, lz4, gz, ...)
    • Key/Value (record) layout: Key Length : int | Value Length : int | Key : byte[] | Value : byte[]
    • File layout: Header, Record 0 ... Record N, ..., Index 0 ... Index N, Trailer
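The record layout above (key length, value length, then the raw bytes) can be sketched with Python's struct module. This is only the per-record framing; real HFiles add block headers, indexes, a trailer, checksums, and compression.

```python
import struct

def encode_record(key: bytes, value: bytes) -> bytes:
    # int key length, int value length, then the raw bytes,
    # matching the Key/Value record layout on the slide.
    return struct.pack(">ii", len(key), len(value)) + key + value

def decode_record(buf: bytes, offset: int = 0):
    key_len, value_len = struct.unpack_from(">ii", buf, offset)
    offset += 8  # skip the two 4-byte length fields
    key = buf[offset:offset + key_len]
    value = buf[offset + key_len:offset + key_len + value_len]
    return key, value

rec = encode_record(b"row1", b"hello")
print(decode_record(rec))  # -> (b'row1', b'hello')
```

Because each record carries its own lengths, a reader can walk a block sequentially without any external index, which is what makes the append-only format so simple to scan.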
  • 14. Data Block Encoding
    • "Be aware of the data"
    • Block encoding allows compressing the key based on what we know
      • Keys are sorted... prefixes may be similar in most cases
      • One file contains keys from one family only
      • Timestamps are "similar", we can store the diff "on-disk"
      • Type is "put" most of the time...
    • KeyValue layout: Row Length : short | Row : byte[] | Family Length : byte | Family : byte[] | Qualifier : byte[] | Timestamp : long | Type : byte
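The core idea — keys are sorted, so consecutive keys often share a prefix that need not be stored again — can be sketched as a simple prefix encoding (a simplified model, not the actual HBase block encoders):

```python
def prefix_encode(sorted_keys):
    # Store (shared_prefix_len, suffix) instead of each full key.
    encoded, prev = [], ""
    for key in sorted_keys:
        common = 0
        while common < min(len(prev), len(key)) and prev[common] == key[common]:
            common += 1
        encoded.append((common, key[common:]))
        prev = key
    return encoded

def prefix_decode(encoded):
    keys, prev = [], ""
    for common, suffix in encoded:
        key = prev[:common] + suffix
        keys.append(key)
        prev = key
    return keys

keys = ["row-0001:cf:a", "row-0001:cf:b", "row-0002:cf:a"]
enc = prefix_encode(keys)
assert prefix_decode(enc) == keys
print(enc[1])  # -> (12, 'b'): only 1 new character stored for the second key
```

The same "we know the data" reasoning applies to timestamps (store deltas) and to the type byte (almost always "put").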
  • 15. Compactions
    Optimize the read-path
  • 16. Compactions
    • Reduce the number of files to look into during a scan
      • Removing duplicated keys (updated values)
      • Removing deleted keys
    • Creates a new file by merging the content of 2+ files
    • Removes the old files
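Merging 2+ files into one, keeping only the newest version of each key and dropping tombstones, is the heart of a major compaction. A hedged sketch (files as sorted key/value lists, later files newer):

```python
TOMBSTONE = "__deleted__"

def compact(*files):
    # Each file is a list of (key, value) pairs sorted by key; later
    # files are newer. Keep the newest value per key and drop deleted
    # keys (major-compaction behaviour, sketched).
    latest = {}
    for f in files:  # oldest to newest: later writes overwrite earlier ones
        for key, value in f:
            latest[key] = value
    return sorted(
        (key, value) for key, value in latest.items() if value != TOMBSTONE
    )

old = [("key0", "value 0.0"), ("key2", "value 2.0"), ("key5", "value 5.0")]
new = [("key0", "value 0.1"), ("key5", TOMBSTONE), ("key6", "value 6.0")]
print(compact(old, new))
# -> [('key0', 'value 0.1'), ('key2', 'value 2.0'), ('key6', 'value 6.0')]
```

After this merge a reader consults one file instead of two, which is exactly the read-path win the slide describes.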
  • 17. Pluggable Compactions
    • Try different algorithms
    • Be aware of the data
      • Time series? I guess no updates from the 80s
    • Be aware of the requests
    • Compact based on statistics
      • which files are hot and which are not
      • which keys are hot and which are not
  • 18. Snapshots
    Zero-copy snapshots and table clones
  • 19. How Does Taking a Snapshot Work?
    • The master orchestrates the Region Servers
      • the communication is done via ZooKeeper
      • using a "2-phase-commit-like" transaction (prepare/commit)
    • Each Region Server is responsible for taking its "piece" of the snapshot
      • For each Region, store the metadata information needed
        • (list of Store Files, WALs, region start/end keys, ...)
  • 20. What Is a Snapshot?
    • "a Snapshot is not a copy of the table"
    • a Snapshot is a set of metadata information
      • The table "schema" (column families and attributes)
      • The Regions information (start key, end key, ...)
      • The list of Store Files
      • The list of active WALs
  • 21. Cloning a Table from a Snapshot
    • hbase> clone_snapshot 'snapshotName', 'tableName'
    • Creates a new table with the data "contained" in the snapshot
    • No data copies involved
      • HFiles are immutable
      • and shared between tables and snapshots
    • You can insert/update/remove data from the new table
      • No repercussions on the snapshot, original table or other cloned tables
  • 22. Compactions & Archiving
    • HFiles are immutable, and shared between tables and snapshots
    • On compaction or table deletion, files are removed from disk
    • If files are referenced by a snapshot or a cloned table
      • The file is moved to an "archive" directory
      • and deleted later, when there are no references to it
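The archive-then-delete rule can be sketched as simple reference tracking. This is illustrative only: HBase does not keep an explicit counter, it resolves references by having a cleaner scan snapshot and clone manifests; the file and owner names below are made up.

```python
# Sketch: a store file may be removed from disk only when no table,
# snapshot, or clone references it; otherwise it is parked in the archive.
references = {"hfile-001": {"myTable", "snapshot-1"}}  # hypothetical owners
archive = set()
deleted = set()

def release(hfile, owner):
    # Drop one reference; archive while others remain, delete when none do.
    refs = references.get(hfile, set())
    refs.discard(owner)
    if refs:
        archive.add(hfile)      # still referenced: keep it in the archive dir
    else:
        archive.discard(hfile)  # last reference gone: safe to remove for good
        deleted.add(hfile)

release("hfile-001", "myTable")      # snapshot-1 still holds it -> archived
print("hfile-001" in archive)        # -> True
release("hfile-001", "snapshot-1")   # no references left -> deleted
print("hfile-001" in deleted)        # -> True
```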
  • 23. Compactions
    Optimize the read-path
  • 24. 0.96 Is Coming Up
    • Moving RPC to Protobuf
      • Allows rolling upgrades with no surprises
    • HBase Snapshots
    • Pluggable Compactions
    • Removal of -ROOT-
    • Table Locks
  • 25. 0.98 and Beyond
    • Transparent Table/Column-Family Encryption
    • Cell-level security
    • Multiple WALs per Region Server (MTTR)
    • Data Placement Awareness (MTTR)
    • Data Type Awareness
    • Compaction policies, based on the data needs
    • Managing blocks directly (instead of files)
  • 27. Thank you!
    Matteo Bertozzi, @cloudera (@th30z)