Tuning Java Hotspot G1 GC for BigData
A HBase Case Study
Yanping Wang (Intel/Data Center Java Technologies)
Liqi Yi (Intel/Data Center Java Technologies)
Yu Zhang (Oracle/Java GC performance)
Who are we?
Liqi
Yanping
YU (Jenny)
Why Java?
In 2014, IEEE Spectrum weighted and combined 12 metrics from 10 sources and ranked the most popular programming languages (http://spectrum.ieee.org/computing/software/top-10-programming-languages).
The #1 is: Java
When BigData Meets Java: HotSpot jdk8 G1 GC to replace CMS for Latency
HBase G1 Tuning Case
[Charts: GC pause before and after tuning; throughput; latency]
HBase configuration and JDK 8u40 build 23
HotSpot G1 GC: jdk8u40 build 23, https://jdk8.java.net/download.html
• Single-socket Intel® Xeon® Ivy Bridge EP processor, serving as both datanode and regionserver, with 128 GB DDR3-1600 RAM and three 400GB SSDs as HDFS storage.
• Apache HBase version 0.98.3-hadoop2, with HDFS 2.2.0 for HFile storage.
• The HBase test table was configured with 400 million rows and was 580GB in size. The Snappy codec was used to compress the HFiles, and short-circuit read was enabled for faster HDFS block access.
• Default HBase heap strategy: 40% for blockcache, 40% for memstore, no off-heap caches.
• A YCSB client, residing on a separate system, drove 600 worker threads sending 50% read and 50% write requests to the HBase regionserver for 3600 seconds.
G1 Concepts
• The heap is divided into ~2K equal-sized regions. Eden, survivor, and old spaces are non-contiguous sets of these regions. Region size can be 1MB, 2MB, 4MB, 8MB, 16MB, or 32MB.
• Humongous regions are old regions that hold large objects, meaning objects larger than 1/2 of the region size. A humongous object is stored in contiguous regions in the heap.
• The number of regions in eden and survivor can change between GCs.
• A young GC starts when eden is full. Live objects from eden and survivor regions are evacuated to a set of unused regions, which become the new survivor regions.
• A concurrent cycle starts when the heap is 45% full (by default).
• After the concurrent cycle, mixed GCs start: old regions are collected at the same time as the young regions of a minor GC.
• A full GC starts when the heap is 95% full or on a heap allocation failure.
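The region arithmetic above can be sketched in a few lines of Java. This is a simplified model and not the HotSpot source: it assumes the region size is the heap size divided by 2048, rounded down to a power of two and clamped to the 1MB to 32MB range listed on the slide. The class and method names are made up for illustration.

```java
// Simplified sketch of G1 region arithmetic (an assumption-based model, not HotSpot code).
class G1RegionMath {
    static final long MB = 1024L * 1024L;
    static final long MIN_REGION = 1 * MB;    // smallest region size on the slide
    static final long MAX_REGION = 32 * MB;   // largest region size on the slide
    static final long TARGET_REGIONS = 2048;  // the "~2K regions" target

    // heap / 2048, rounded down to a power of two, clamped to [1MB, 32MB]
    static long regionSizeBytes(long heapBytes) {
        long raw = Math.max(heapBytes / TARGET_REGIONS, MIN_REGION);
        long pow2 = Long.highestOneBit(raw);
        return Math.min(pow2, MAX_REGION);
    }

    // Number of regions the heap splits into at that region size
    static long regionCount(long heapBytes) {
        return heapBytes / regionSizeBytes(heapBytes);
    }

    // Objects larger than half a region are treated as humongous
    static long humongousThresholdBytes(long heapBytes) {
        return regionSizeBytes(heapBytes) / 2;
    }

    // Heap occupancy at which a concurrent cycle is requested (45% by default)
    static long concurrentCycleTriggerBytes(long heapBytes, int ihopPercent) {
        return heapBytes / 100 * ihopPercent;
    }

    public static void main(String[] args) {
        long heap = 100L * 1024 * MB; // 100 GB heap, as in the tuning case
        System.out.println("region size (MB): " + regionSizeBytes(heap) / MB);
        System.out.println("region count: " + regionCount(heap));
        System.out.println("humongous threshold (MB): " + humongousThresholdBytes(heap) / MB);
        System.out.println("concurrent cycle trigger (MB): "
                + concurrentCycleTriggerBytes(heap, 45) / MB);
    }
}
```

Under this model a 4 GB heap gives 2MB regions and exactly 2048 of them, while a 100 GB heap hits the 32MB cap and ends up with 3200 regions, so "~2K" is a target rather than a fixed count.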
G1 Command Line Options
• To get started:
  • -XX:+UseG1GC -Xms -Xmx
  • -XX:MaxGCPauseMillis (default 200ms)
  • -XX:ParallelGCThreads=? -XX:+ParallelRefProcEnabled
• To print GC information in the log:
  • -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintReferenceGC -XX:+PrintAdaptiveSizePolicy
• First-line G1 tuning flags:
  • -XX:G1HeapWastePercent (default 5)
  • -XX:InitiatingHeapOccupancyPercent (default 45%)
  • -XX:G1MixedGCLiveThresholdPercent (default 85)
  • -XX:ConcGCThreads (default ~1/4 of parallel threads)
  • -XX:G1MixedGCCountTarget (default 8)
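The logging flags write detailed events to the GC log. As a complement, cumulative GC counts and times can also be read from inside the JVM through the standard java.lang.management API. The sketch below is an illustration and not part of the case study. The class name GcTotals is hypothetical, and the reported times are the JVM's approximate accumulated collection times, which will not match the log pause-for-pause.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Minimal sketch: read cumulative GC statistics via the standard MXBeans.
class GcTotals {
    // Total number of collections across all collectors; undefined (-1) values are skipped
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    // Approximate accumulated collection time in milliseconds across all collectors
    static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Collector names depend on the JVM and on the collector selected at startup
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": count=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
        System.out.println("total collections: " + totalCollections());
        System.out.println("total GC time (ms): " + totalCollectionMillis());
    }
}
```

Sampling these totals at the start and end of a benchmark run gives a rough before/after comparison of total GC time without parsing the log.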
HBase G1 Tuning Case
(A single G1 tuning flag can make a huge difference)
• Before tuning: -XX:+UseG1GC -Xms100g -Xmx100g -XX:MaxGCPauseMillis=100 -XX:ParallelGCThreads=16 -XX:+ParallelRefProcEnabled
• After tuning: -XX:+UseG1GC -Xms100g -Xmx100g -XX:MaxGCPauseMillis=100 -XX:ParallelGCThreads=16 -XX:+ParallelRefProcEnabled -XX:G1HeapWastePercent=2
• 29.3% reduction of total GC pause time
• 18.6% throughput improvement
• 15.7% latency reduction
• Much tamer GC behavior and improved HBase performance
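To see what the flag change means in absolute terms on a 100 GB heap, the following Java sketch works through the arithmetic. It assumes a simplified model of the flag: G1 keeps running mixed GCs only while the reclaimable space in candidate old regions exceeds G1HeapWastePercent of the heap. The class and method names are hypothetical, and the sketch does not attempt to explain the measured gains.

```java
// Simplified sketch of the G1HeapWastePercent budget (an illustrative model, not HotSpot code).
class HeapWasteMath {
    static final long GB = 1L << 30;

    // Amount of reclaimable garbage G1 is willing to leave uncollected
    static long wasteBudgetBytes(long heapBytes, int wastePercent) {
        return heapBytes / 100 * wastePercent;
    }

    // Mixed GCs are considered worthwhile only while reclaimable space exceeds the budget
    static boolean mixedGcWorthwhile(long reclaimableBytes, long heapBytes, int wastePercent) {
        return reclaimableBytes > wasteBudgetBytes(heapBytes, wastePercent);
    }

    public static void main(String[] args) {
        long heap = 100 * GB; // -Xms100g -Xmx100g from the tuning case
        System.out.println("budget at 5% (GB): " + wasteBudgetBytes(heap, 5) / GB);
        System.out.println("budget at 2% (GB): " + wasteBudgetBytes(heap, 2) / GB);

        long reclaimable = 3 * GB; // hypothetical amount of reclaimable old-region garbage
        System.out.println("3 GB reclaimable, 5%: " + mixedGcWorthwhile(reclaimable, heap, 5));
        System.out.println("3 GB reclaimable, 2%: " + mixedGcWorthwhile(reclaimable, heap, 2));
    }
}
```

With the default of 5, up to 5 GB of reclaimable garbage may be left in old regions; with the value 2, mixed GCs keep going until less than 2 GB remains.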
Questions?
Join the OpenJDK community:
http://openJDK.java.net/
GC Tuning email list: hotspot-gc-use@openjdk.java.net
Contribute & Get the latest source:
http://hg.openJDK.java.net/jdk9/hs-gc/hotspot
Contact: yanping.wang@intel.com
liqi.yi@intel.com
