HBase In Action - Chapter 10 - Operations


Learning HBase, Real-time Access to Your Big Data, Data Manipulation at Scale, Big Data, Text Mining, HBase, Deploying HBase

Published in: Education

  1. 1. Chapter 10: Operations HBase In Action
  2. 2. Overview: Operations
     • Monitoring your cluster
     • Performance of your HBase cluster
     • Cluster management
     • Backup and replication
     • Summary
  3. 3. 09/24/15 10.1 Monitoring Your Cluster
     • A critical aspect of any production system is the ability of its operators to monitor its state and behavior.
     • This section covers how HBase exposes metrics and which frameworks are available to capture those metrics and make sense of how your cluster is performing.
     • How HBase exposes metrics
     • Collecting and graphing the metrics
     • The metrics HBase exposes
     • Application-side monitoring
  4. 4. 10.1.1 How HBase exposes metrics
     • The metrics framework is another of the many ways that HBase depends on Hadoop.
     • HBase is tightly integrated with Hadoop and uses Hadoop’s underlying metrics framework to expose its metrics.
     • The metrics framework outputs metrics through a context, a class that implements the MetricsContext interface.
     • Available contexts include the Ganglia context and the File context.
     • HBase also exposes metrics using Java Management Extensions (JMX).
  5. 5. Hbase Course Data Manipulation at Scale: Systems and Algorithms Using HBase for Real-time Access to Your Big Data
  6. 6. 10.1.2 Collecting and graphing the metrics
     • Metrics solutions involve two aspects: collection and graphing.
     • Collection frameworks collect the metrics generated by the monitored system and store them efficiently so they can be used later.
     • Graphing tools take the data captured and stored by collection frameworks and make it easy for the end user to consume as graphs.
     • Numerous collection and graphing tools are available, but not all of them are tightly integrated with how Hadoop and HBase expose metrics.
     • GANGLIA
     • JMX
  7. 7. 10.1.2 Collecting and graphing the metrics
     • GANGLIA
     • Ganglia is a distributed monitoring framework designed to monitor clusters.
     • It was developed at UC Berkeley and open-sourced.
     • To configure HBase to output metrics to Ganglia, set the parameters in the hadoop- file, which resides in the $HBASE_HOME/conf/ directory.
  8. 8. 10.1.2 Collecting and graphing the metrics
     • JMX
     • Several open source tools, such as Cacti and OpenTSDB, can be used to collect metrics via JMX. JMX metrics can also be viewed as JSON from the Master and RegionServer web UIs:
     • JMX metrics from the Master: http://master_ip_address:port/jmx
     • JMX metrics from a particular RegionServer: http://region_server_ip_address:port/jmx
     • The default port for the Master is 60010 and for the RegionServer is 60030.
     • FILE BASED
     • HBase can also be configured to output metrics into a flat file.
     • File-based metrics aren’t a useful way of recording metrics because they’re hard to consume thereafter.
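As a rough illustration of consuming the /jmx JSON view mentioned above, the sketch below builds the endpoint URL and filters the returned beans by name. It assumes the payload is a JSON object with a top-level "beans" list; the sample payload, bean names, and attribute names are made up for illustration and will differ from what a real cluster returns.

```python
import json
from urllib.request import urlopen


def jmx_url(host, port):
    """Build the /jmx URL for a Master or RegionServer web UI."""
    return f"http://{host}:{port}/jmx"


def find_beans(payload, name_fragment):
    """Return the beans whose 'name' contains name_fragment.

    Assumption: the payload is a JSON object with a top-level "beans" list.
    """
    doc = json.loads(payload)
    return [b for b in doc.get("beans", []) if name_fragment in b.get("name", "")]


def fetch_beans(host, port, name_fragment, timeout=5):
    """Fetch /jmx from a live server and filter the beans (needs network access)."""
    with urlopen(jmx_url(host, port), timeout=timeout) as resp:
        return find_beans(resp.read().decode("utf-8"), name_fragment)


# Hypothetical payload for illustration only; real bean names and
# attributes depend on the HBase version.
SAMPLE = (
    '{"beans": ['
    '{"name": "java.lang:type=Memory", "HeapUsed": 1024}, '
    '{"name": "example:service=RegionServer", "requests": 7}'
    ']}'
)

if __name__ == "__main__":
    print(find_beans(SAMPLE, "RegionServer"))
```

Against a live cluster, `fetch_beans("master_ip_address", 60010, "Memory")` would be the equivalent call, using the default Master port given on the slide.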
  9. 9. 10.1.3 The metrics HBase exposes
     • The Master and RegionServers expose metrics. The metrics of interest depend on the workload the cluster is sustaining, and we’ll categorize them accordingly.
     • GENERAL METRICS
       • HDFS throughput and latency
       • HDFS usage
       • Underlying disk throughput
       • Network throughput and latency from each node
     • WRITE-RELATED METRICS
       • To understand the system state during writes, the metrics of interest are the ones that are collected as data is written into the system.
     • READ-RELATED METRICS
       • Reads are different from writes, and so are the metrics you should monitor to understand them.
  10. 10. 10.1.3 The metrics HBase exposes (cont'd)
  11. 11. 10.1.3 The metrics HBase exposes (cont'd)
  12. 12. 10.1.3 The metrics HBase exposes (cont'd)
  14. 14. 10.1.4 Application-side monitoring
      • In a production environment, we recommend that you supplement the system-level monitoring that Ganglia and other tools provide by also monitoring how HBase looks from your application’s perspective:
      • Put performance as seen by the client (the application) for every RegionServer
      • Get performance as seen by the client for every RegionServer
      • Scan performance as seen by the client for every RegionServer
      • Connectivity to all RegionServers
      • Network latencies between the application tier and the HBase cluster
      • Number of concurrent client connections open to HBase at any point in time
      • Connectivity to ZooKeeper
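One way to capture Put, Get, and Scan performance as seen by the client is to time each call in the application and keep the samples per RegionServer. The sketch below is a minimal, in-memory illustration of that idea; `ClientLatencyTracker` and its methods are hypothetical names, not part of any HBase client API, and a real deployment would more likely ship the samples to a metrics system.

```python
import math
import time
from collections import defaultdict


class ClientLatencyTracker:
    """Collects operation latencies as seen by the application, keyed by (server, op)."""

    def __init__(self):
        self._samples = defaultdict(list)

    def record(self, server, op, seconds):
        """Store one latency sample for an operation against a server."""
        self._samples[(server, op)].append(seconds)

    def timed(self, server, op, fn, *args, **kwargs):
        """Run fn and record how long it took, even if it raises."""
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            self.record(server, op, time.perf_counter() - start)

    def percentile(self, server, op, pct):
        """Nearest-rank percentile of the recorded latencies; None if no samples."""
        data = sorted(self._samples.get((server, op), []))
        if not data:
            return None
        rank = max(1, math.ceil(pct / 100 * len(data)))
        return data[rank - 1]


if __name__ == "__main__":
    tracker = ClientLatencyTracker()
    # Hypothetical samples in milliseconds for puts against one RegionServer.
    for ms in (10, 20, 30, 40, 50, 60, 70, 80, 90, 100):
        tracker.record("rs1", "put", ms)
    print(tracker.percentile("rs1", "put", 50))
```

Wrapping the application's own put, get, and scan calls in `timed(...)` gives per-server percentiles that can be compared against the server-side metrics.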
  15. 15. 10.2 Performance of your HBase cluster
      • Performance of any database is measured in terms of the response times of the operations that it supports.
      • This is important to measure in the context of your application so you can set the right expectations for users.
      • To make sure your HBase cluster is performing within the expected SLAs, you must test performance thoroughly and tune the cluster to extract the maximum performance you can get out of it.
      • Performance testing
      • What impacts HBase’s performance?
      • Tuning dependency systems
      • Tuning HBase
  16. 16. 10.2.1 Performance testing
      • There are different ways you can test the performance of your HBase cluster.
      • PERFORMANCEEVALUATION TOOL: BUNDLED WITH HBASE
      • HBase ships with a tool called PerformanceEvaluation, which you can use to evaluate the performance of your HBase cluster in terms of various operations.
      • Examples, each running a single evaluation client:
          $ bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation sequentialWrite 1
          $ hbase org.apache.hadoop.hbase.PerformanceEvaluation --rows=10 sequentialWrite 1
  17. 17. 10.2.1 Performance testing (cont'd)
      • YCSB: YAHOO! CLOUD SERVING BENCHMARK
      • YCSB is the closest we have come to a standard benchmarking tool for measuring and comparing the performance of different distributed databases.
      • YCSB is available from the project’s GitHub repository.
      • Before running the workload, you need to create the HBase table YCSB will write to. You can do that from the shell:
          hbase(main):002:0> create 'mytable', 'myfamily'
      • Then load the workload data into that table:
          $ bin/ycsb load hbase -P workloads/workloada -p columnfamily=myfamily -p table=mytable
      • YCSB workloads are highly configurable: you can run multiple clients, use multiple threads, and run mixed workloads with different statistical distributions of the data.
  18. 18. 10.2.2 What impacts HBase’s performance?
      • HBase is a distributed database and is tightly coupled with Hadoop. That makes its performance dependent on the entire stack under it (figure 10.8).
  19. 19. 10.2.3 Tuning dependency systems
      • Tuning an HBase cluster to extract maximum performance involves tuning all dependencies:
      • HARDWARE CHOICES
      • NETWORK CONFIGURATION
      • OPERATING SYSTEM
      • LOCAL FILE SYSTEM
      • HDFS
  20. 20. 10.2.4 Tuning HBase
      • Tuning an HBase cluster typically involves tuning several configuration parameters to suit the workload that you plan to put on the cluster:
      • Random-read-heavy
      • Sequential-read-heavy
      • Write-heavy
      • Mixed
      • Each of these workloads demands a different kind of configuration tuning.
  21. 21. 10.2.4 Tuning HBase
      • RANDOM-READ-HEAVY: For random-read-heavy workloads, effective use of the cache and better indexing will get you higher performance.
  22. 22. 10.2.4 Tuning HBase
      • SEQUENTIAL-READ-HEAVY: For sequential-read-heavy workloads, the read cache doesn’t buy you a lot; chances are you’ll be hitting the disk more often than not unless the sequential reads are small and limited to a particular key range.
  23. 23. 10.2.4 Tuning HBase
      • WRITE-HEAVY: Write-heavy workloads need different tuning than read-heavy ones. The cache no longer plays an important role. Writes always go into the MemStore and are flushed to form new HFiles, which are later compacted.
      • The way to get good write performance is to avoid flushing, compacting, or splitting too often, because the I/O load goes up during those operations and slows the system.
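To get a feel for how often flushes happen under a given write load, a back-of-the-envelope estimate can help. The sketch below assumes writes are spread evenly across regions, one column family per table, and a flush triggered purely by MemStore size; the 128 MB default is an assumed placeholder for the configured flush size, and the estimate ignores in-memory overhead and flushes triggered by WAL or global memory pressure.

```python
def flushes_per_hour(write_bytes_per_sec, regions, flush_size_bytes=128 * 1024 * 1024):
    """Rough number of MemStore flushes per hour for each region.

    Assumptions: writes are spread evenly over `regions`, and a flush
    happens only when a region's MemStore reaches `flush_size_bytes`.
    """
    if regions <= 0 or flush_size_bytes <= 0:
        raise ValueError("regions and flush_size_bytes must be positive")
    per_region_rate = write_bytes_per_sec / regions
    return per_region_rate * 3600 / flush_size_bytes


if __name__ == "__main__":
    mb = 1024 * 1024
    # Hypothetical load: 20 MB/s of writes on one RegionServer hosting 10 regions.
    print(flushes_per_hour(20 * mb, regions=10))
    # The same load with a 256 MB flush size halves the flush frequency.
    print(flushes_per_hour(20 * mb, regions=10, flush_size_bytes=256 * mb))
```

Each flush creates a new HFile, so fewer flushes per hour also means fewer files for compactions to merge, which is the trade-off the slide describes.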
  24. 24. 10.2.4 Tuning HBase
  25. 25. 10.2.4 Tuning HBase
      • MIXED: With completely mixed workloads, tuning becomes slightly trickier. You have to tweak a mix of the parameters described earlier to achieve the optimal combination. Iterate over various combinations, and run performance tests to see where you get the best results.
      • Compression
      • Rowkey design
      • Major compactions
      • RegionServer handler count
  26. 26. 10.3 Cluster management
      • During the course of running a production system, management tasks need to be performed at different stages.
      • Starting or stopping the cluster, upgrading the OS on the nodes, replacing bad hardware, and backing up data are important tasks and need to be done right to keep the cluster running smoothly.
      • This section highlights some of the important tasks you may need to perform and shows how to do them.
  27. 27. 10.3.1 Starting and stopping HBase
      • The order in which the HBase daemons are stopped and started matters only to the extent that the dependency systems (HDFS and ZooKeeper) need to be up before HBase is started and should be shut down only after HBase has shut down.
      • SCRIPTS: located in the $HBASE_HOME/bin directory.
      • CENTRALIZED MANAGEMENT: Cluster-management frameworks like Puppet and Chef can be used to manage the starting and stopping of daemons from a central location.
  28. 28. 10.3.2 Graceful stop and decommissioning nodes
      • When you need to shut down daemons on individual servers for any management purpose (upgrading, replacing hardware, and so on), you need to ensure that the rest of the cluster keeps working and that client applications see minimal outage.
      • The script follows these steps (in order) to gracefully stop a RegionServer:
        1. Disable the region balancer.
        2. Move the regions off the RegionServer, and randomly assign them to other servers in the cluster.
        3. Stop the REST and Thrift services if they’re active.
        4. Stop the RegionServer process.
          $ bin/
          Usage: [--config <conf-dir>] [--restart] [--reload] [--thrift] [--rest] <hostname>
           thrift If we should stop/start thrift before/after the
  29. 29. 10.3.3 Adding nodes
      • As your application gets more successful or more use cases crop up, chances are you’ll need to scale up your HBase cluster.
      • You may also be replacing a node for some reason. The process to add a node to the HBase cluster is the same in both cases.
  30. 30. 10.3.4 Rolling restarts and upgrading
      • It’s not rare to patch or upgrade Hadoop and HBase releases in running clusters.
      • In production systems, upgrades can be tricky. Often, it isn’t possible to take downtime on the cluster to do upgrades.
      • But not all upgrades are between major releases and require downtime.
      • To do upgrades without taking downtime, follow these steps:
        1. Deploy the new HBase version to all nodes in the cluster, including the new ZooKeeper if that needs an update as well.
        2. Turn off the balancer process. One by one, gracefully stop the RegionServers and bring them back up.
        3. Restart the HBase Masters one by one.
        4. If ZooKeeper requires a restart, restart all the nodes in the quorum one by one.
        5. Upgrade the clients.
      • You can use the same steps to do a rolling restart for any other purpose as well.
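The ordering of the rolling-upgrade steps above can be made explicit with a small planning helper. The sketch below only produces an ordered list of human-readable steps; it does not run anything against a cluster. The function name, the step wording, and the host names are hypothetical.

```python
def rolling_restart_plan(region_servers, masters, zookeeper_nodes=(), upgrade=True):
    """Return the ordered steps for a rolling restart or upgrade.

    Only the ordering is modeled here: RegionServers first (one at a
    time), then Masters, then ZooKeeper nodes if they need a restart.
    """
    steps = []
    if upgrade:
        steps.append("deploy the new version to all nodes")
    steps.append("turn off the balancer")
    for rs in region_servers:
        steps.append(f"gracefully restart RegionServer {rs}")
    for master in masters:
        steps.append(f"restart Master {master}")
    for zk in zookeeper_nodes:
        steps.append(f"restart ZooKeeper node {zk}")
    if upgrade:
        steps.append("upgrade the clients")
    return steps


if __name__ == "__main__":
    # Hypothetical host names for illustration.
    for number, step in enumerate(
        rolling_restart_plan(["rs1", "rs2"], ["m1", "m2"], ["zk1"]), start=1
    ):
        print(number, step)
```

Passing `upgrade=False` yields the plain rolling-restart variant, which skips the deployment and client-upgrade steps but keeps the same order.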
  31. 31. 10.3.5 bin/hbase and the HBase shell
      • The script runs the Java class associated with the command you pass it.
  32. 32. 10.3.5 bin/hbase and the HBase shell
  33. 33. 10.3.5 bin/hbase and the HBase shell
      • We’ll focus on the tools group of commands (shown in bold). To get a description for any command, you can run help 'command_name' in the shell.
      • ZK_DUMP: You can use the zk_dump command to find out the current state of ZooKeeper.
      • STATUS COMMAND: You can use the status command to determine the status of the cluster.
      • COMPACTIONS
      • BALANCER
      • SPLITTING TABLES OR REGIONS
      • ALTERING TABLE SCHEMAS
      • TRUNCATING TABLES
  34. 34. 10.3.6 Maintaining consistency: hbck
      • HBase comes with a tool called hbck (or HBaseFsck) that checks the consistency and integrity of the HBase cluster.
      • Hbck recently underwent an overhaul, and the resulting tool was nicknamed uberhbck.
      • Inconsistencies can occur at two levels:
        • Region inconsistencies
        • Table inconsistencies
      • Hbck performs two primary functions: detecting inconsistencies and fixing them.
      • DETECTING INCONSISTENCIES:
          $ $HBASE_HOME/bin/hbase hbck
          $ $HBASE_HOME/bin/hbase hbck -details
      • FIXING INCONSISTENCIES:
        • Incorrect assignments
        • Missing or extra regions
  35. 35. 10.3.7 Viewing HFiles and HLogs
      • HBase provides utilities to examine the HFiles and HLogs (WAL) that are created at write time.
      • The HLogs are located in the .logs directory in the HBase root directory on the file system. You can examine them by using the hlog command of the bin/hbase script.
  36. 36. 10.3.7 Viewing HFiles and HLogs
      • The script has a similar utility for examining the HFiles. To print the help for the command, run the command without any arguments.
      • The output contains a lot of information about the HFile. Other options can be used to get different bits of information.
  37. 37. 10.3.8 Presplitting tables
      • Table splitting during heavy write loads can result in increased latencies. Splitting is typically followed by regions moving around to balance the cluster, which adds to the overhead.
      • Presplitting tables is also desirable for bulk loads, which we cover later in the chapter. If the key distribution is well known, you can split the table into the desired number of regions at the time of table creation.
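When the key distribution is known, the split points can be computed up front. The sketch below covers one simple case only: it assumes rowkeys start with a fixed-length, uniformly distributed lowercase hex prefix (for example, a hash prefix) and returns evenly spaced split points. The function name and the prefix scheme are assumptions for illustration, not something the slides prescribe.

```python
def hex_split_points(num_regions, prefix_len=2):
    """Evenly spaced split points over a lowercase hex prefix key space.

    Assumption: rowkeys begin with `prefix_len` hex characters that are
    uniformly distributed. N regions need N - 1 split points.
    """
    if num_regions < 2:
        return []
    space = 16 ** prefix_len
    if num_regions > space:
        raise ValueError("more regions requested than distinct prefixes")
    return [
        format(i * space // num_regions, f"0{prefix_len}x")
        for i in range(1, num_regions)
    ]


if __name__ == "__main__":
    print(hex_split_points(4))
```

The resulting list can then be supplied at table-creation time, for example through the shell's SPLITS option as in `create 'mytable', 'myfamily', SPLITS => ['40', '80', 'c0']`.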
  38. 38. 10.4 Backup and replication
      • Inter-cluster replication
      • Backup using MapReduce jobs
      • Backing up the root directory
  39. 39. 10.4.1 Inter-cluster replication
      • Inter-cluster replication can be of three types:
      • Master-slave
      • Master-master
      • Cyclic
  40. 40. 10.4.2 Backup using MapReduce jobs
      • MapReduce jobs can be configured to use HBase tables as the source and sink, as we covered in chapter 3. This ability can come in handy for point-in-time backups of tables: scan through them and output the data into flat files or other HBase tables.
      • This is different from inter-cluster replication, which the last section described.
      • Inter-cluster replication is a push mechanism.
      • Running MapReduce jobs over tables is a pull mechanism.
      • EXPORT/IMPORT
      • The prebundled Export MapReduce job can be used to export data from HBase tables into flat files.
      • That data can later be imported into another HBase table on the same or a different cluster using the Import job.
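A backup script built on the Export and Import jobs mostly comes down to launching them with the right arguments. The sketch below assembles the command lines for the two bundled jobs, assuming the `hbase` launcher is on the PATH; the table names and directory paths are hypothetical, and `run` is only a thin wrapper that is not invoked here.

```python
import subprocess

EXPORT_CLASS = "org.apache.hadoop.hbase.mapreduce.Export"
IMPORT_CLASS = "org.apache.hadoop.hbase.mapreduce.Import"


def export_command(table, output_dir):
    """Command line that exports `table` into flat files under `output_dir`."""
    return ["hbase", EXPORT_CLASS, table, output_dir]


def import_command(table, input_dir):
    """Command line that imports previously exported files into `table`."""
    return ["hbase", IMPORT_CLASS, table, input_dir]


def run(command):
    """Launch the job and raise if it exits with a non-zero status."""
    subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical table names and paths; printing instead of running.
    print(" ".join(export_command("mytable", "/backup/mytable")))
    print(" ".join(import_command("mytable_copy", "/backup/mytable")))
```

The destination table for the Import job needs to exist already, with the column families the exported data expects.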
  41. 41. 10.4.2 Backup using MapReduce jobs
  42. 42. 10.4.2 Backup using MapReduce jobs
      • ADVANCED IMPORT WITH IMPORTTSV
      • ImportTsv is more feature-rich than the Import job.
      • It allows you to load data from newline-terminated, delimited text files.
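To make the input format concrete, the sketch below mimics how one line of a delimited file maps to a rowkey and a set of cells under an ImportTsv-style column specification, where `HBASE_ROW_KEY` marks the field used as the rowkey. This is a local illustration of the mapping only, not the ImportTsv implementation (which runs as a MapReduce job); the function name and sample data are hypothetical.

```python
def parse_tsv_line(line, columns, separator="\t"):
    """Map one delimited line to (rowkey, {"family:qualifier": value}).

    `columns` is a comma-separated spec such as
    "HBASE_ROW_KEY,myfamily:name,myfamily:age", one entry per field.
    """
    spec = columns.split(",")
    if spec.count("HBASE_ROW_KEY") != 1:
        raise ValueError("spec must contain HBASE_ROW_KEY exactly once")
    fields = line.rstrip("\n").split(separator)
    if len(fields) != len(spec):
        raise ValueError("field count does not match the column spec")
    row_key = fields[spec.index("HBASE_ROW_KEY")]
    cells = {col: val for col, val in zip(spec, fields) if col != "HBASE_ROW_KEY"}
    return row_key, cells


if __name__ == "__main__":
    spec = "HBASE_ROW_KEY,myfamily:name,myfamily:age"
    print(parse_tsv_line("row1\talice\t30\n", spec))
```

Lines whose field count does not match the specification are rejected here, which is a design choice for this sketch; it keeps malformed input from silently producing partial rows.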
  43. 43. 10.4.3 Backing up the root directory
      • HBase stores its data in the directory specified by the hbase.rootdir configuration property. This directory contains all the region information, all the HFiles for the tables, and the WALs for all RegionServers.
      • When an HBase cluster is up and running, several things are going on: MemStore flushes, region splits, compactions, and so on.
      • But if you stop the HBase daemons cleanly, the MemStore is flushed and the root directory isn’t altered by any process, so a copy of the directory taken at that point can serve as a point-in-time backup.
  45. 45. 10.5 Summary
      • Production-quality operations of any software system are learned over time. This chapter covered several aspects of operating HBase in production with the intention of getting you started on the path to understanding the concepts. New tools and scripts will probably be developed by HBase users and will benefit you.
      • The first aspect of operations is instrumenting and monitoring the system.
      • From monitoring, the chapter transitioned into performance testing, measuring performance, and tuning HBase for different kinds of workloads.
      • From there we covered a list of common management tasks and how and when to do them.
      • Mastering HBase operations requires an understanding of the internals and experience gained by working with the system.