5. [Diagram: layers of a complete system. Internet | Physical Server, Hypervisor, Operating System, JVM Process, Your code | Physical Server, Operating System, RDBMS, Your Schema/Queries, Local Disk | Physical Server, OS, SAN Storage, Disks]
Performance tuning requires looking at a complete picture of the system and addressing the right issues
(usually your code)
● Tuning the wrong component can have no
impact and can sometimes make things worse!
● This presentation is focused on the JVM -
others will follow...
JVM Tuning Technical Talk … Overview …
6. There are many implementations of the JVM in existence today, and it is important to be aware of what you
are working with because they may be tuned differently
• There are many JVM implementations in existence today (75+)
• 2 main implementations to be aware of:
– OpenJDK
▪ This is a completely open source JVM choice
▪ This is the default packaged JVM with many Linux distros
– HotSpot
▪ Currently owned by Oracle
▪ Based on OpenJDK, but swaps in other implementations of various
pieces (some closed source)
– Other commercial JVMs exist: be aware which one you are running
▪ Running “java -version” will tell you
Reference: https://plumbr.eu/blog/java/java-version-and-vendor-data-analyzed-2016-edition
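To check from inside a running application which JVM you are working with, the standard system properties carry much the same information as "java -version". A minimal sketch (the class name JvmInfo is only illustrative):

```java
// Reads the JVM vendor, name, and version from standard system properties.
class JvmInfo {
    static String describe() {
        return String.format("%s %s (%s), Java %s",
                System.getProperty("java.vm.vendor"),
                System.getProperty("java.vm.name"),
                System.getProperty("java.vm.version"),
                System.getProperty("java.version"));
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

The exact strings differ by vendor and build, so treat the output as a hint about which tuning documentation applies rather than as a stable identifier.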
7. Tuning your JVM can have a significant impact on application performance
• JVM Tuning - what and why?
– Involves using various tools to get more insight on what the JVM is doing “under the hood”
– Generally involves various command line arguments to tune behaviors primarily centered around
modifying garbage collection
– Garbage collection is expensive!
• When approaching JVM tuning, remember 2 things:
– Tuning your JVM can have a significant impact on application performance
– “Premature optimization is the root of all evil” - Donald Knuth
9. Garbage collection “eliminates” the need for a programmer to have to manage memory themselves in code
● Trivial C++ example of manual memory management:
{
foo* f = new foo();
// Do interesting things with f
delete f; // manually destroy the object and release its memory
}
● In Java, there is no “delete”
● The JVM shields the developer from the complexities of manual memory management
● The memory still needs to be reclaimed to become available for future use
● Garbage collection automatically identifies “dead objects” to reclaim the memory they occupy and make it
available again for future use
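For contrast with the C++ snippet, here is a rough Java counterpart. It is a sketch under assumptions: the class name NoDeleteDemo is made up, and System.gc() is only a request, so the second printed value may differ between runs and JVMs.

```java
import java.lang.ref.WeakReference;

class NoDeleteDemo {
    /** While a strong reference exists, the collector must keep the object alive. */
    static boolean aliveWhileReferenced() {
        Object foo = new Object();
        WeakReference<Object> watcher = new WeakReference<>(foo);
        System.gc();                  // only a hint; the JVM may ignore it
        return watcher.get() == foo;  // foo is still strongly reachable here
    }

    public static void main(String[] args) {
        System.out.println("alive while referenced: " + aliveWhileReferenced());

        Object foo = new Object();
        WeakReference<Object> watcher = new WeakReference<>(foo);
        foo = null;                   // no delete: just drop the last strong reference
        System.gc();                  // request a collection (not guaranteed to run)
        System.out.println("reclaimed after GC hint: " + (watcher.get() == null));
    }
}
```

The weak reference is used only to observe the object without keeping it alive; application code normally never needs to check whether something has been collected.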
JVM Tuning Technical Talk … Garbage Collection …
Garbage collection is a form of automatic memory management which relieves the developer of
doing it manually
10. JVM Tuning Technical Talk … Garbage Collection …
● “Generational hypothesis”: most objects survive for only
a short period of time
○ The majority of objects “die young”
○ GC can typically be performed more efficiently
by focusing on collecting younger objects
● The JVM uses the concept of “generations” of an
object, based on analysis of how typical object
allocation/deallocation occurs over time
Common terms when tuning GC:
● Throughput - how much actual work your application
is doing (i.e. the share of total time not spent doing GC)
● Pauses - Times your application appears unresponsive
due to performing GC
Reference: https://docs.oracle.com/javase/8/docs/technotes/guides/vm/gctuning/generations.html
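The throughput definition above can be approximated at runtime from the standard management beans. This is a sketch, not a precise measurement: getCollectionTime() is documented as an approximate accumulated time, and for collectors that do work concurrently it is not purely pause time. The class and method names are illustrative.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcThroughput {
    /** Approximate milliseconds spent in GC so far, summed across all collectors. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 when the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    /** Percentage of elapsed time that was not spent in GC. */
    static double throughputPercent(long gcMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            return 100.0;
        }
        return 100.0 * (uptimeMillis - gcMillis) / uptimeMillis;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        long gc = totalGcMillis();
        System.out.printf("uptime=%d ms, gc=%d ms, throughput=%.2f%%%n",
                uptime, gc, throughputPercent(gc, uptime));
    }
}
```

For example, 50 ms of GC over 1000 ms of uptime gives a throughput of 95%.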
GC algorithms in HotSpot/OpenJDK operate under the “generational hypothesis”
11. With the Java 8-based memory architecture, objects move
through the heap in stages
1. When objects are first created, they are allocated into eden
2. Once eden fills up, a GC event is triggered
○ Dead objects are effectively removed from eden
○ Surviving objects are moved into the survivor spaces
3. When a survivor space fills up, a GC event is triggered
○ Objects alternate between S0/S1 through GC
○ Objects which survive a certain number of GC passes
move to tenured
4. When oldgen/tenured fills up, a GC event is triggered
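The stages above correspond to memory pools that a running JVM exposes through java.lang.management. The sketch below lists the heap pools and their current usage. Pool names depend on the JVM and the selected collector (for example, names containing "Eden", "Survivor", or "Old Gen" on HotSpot), so nothing here relies on a particular name; HeapPools is an illustrative class name.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

class HeapPools {
    /** Names of all memory pools the JVM classifies as heap. */
    static List<String> heapPoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // null if the pool is no longer valid
            if (pool.getType() != MemoryType.HEAP || usage == null) {
                continue;
            }
            System.out.printf("%-28s used=%,d KB committed=%,d KB%n",
                    pool.getName(), usage.getUsed() / 1024, usage.getCommitted() / 1024);
        }
    }
}
```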
JVM Memory management utilizes “generations” to bucket objects in memory based on their age
[Diagram: heap layout within -Xmx (max heap). Young gen: Eden, S0 (Survivor), S1 (Survivor). Old gen (tenured): Tenured]
12. Different GC events:
● Minor GC - cleans up younggen
○ Happens when eden or a survivor space fills up
○ Generally fast
● Major GC - cleans up oldgen
○ Generally takes longer (more objects to deal with)
● Full GC - cleans up both younggen and oldgen
Notes:
● Minor GC collections happen much more frequently - this
is how the generational hypothesis is implemented
● All of these GC events are “stop the world” operations at
some point during operation
Three main types of GC events can occur to trigger the cleanup of dead objects and graduate live ones
[Diagram: heap layout within -Xmx (max heap). Young gen: Eden, S0 (Survivor), S1 (Survivor). Old gen (tenured): Tenured]
○ “Stop the world” events cause ALL application threads to stop while the GC code does its work
● Different GC algorithms take different approaches in cleaning these spaces
○ Some algorithms do different parts of this concurrently
● GC tuning revolves around minimizing the occurrence and reducing the duration of these application
pauses
13. Metaspace (new in Java 8) is a block of “native memory” where class and method metadata is loaded
Java 8 replaced PermGen with Metaspace
[Diagram: entire Java process memory allocation. Heap within -Xmx (max heap): young gen (Eden, S0 (Survivor), S1 (Survivor)) and old gen (Tenured). Metaspace sits outside the heap]
14. GC roots are special to the garbage collector in that they cause other objects to stay alive and reachable
During live object identification, the JVM:
● Begins by finding all “GC roots”
● Traverses from the GC roots to every object that is reachable and “marks” it
● Marking always performs a “stop the world” operation at some point
○ more alive objects == more time marking == more time with all threads paused
○ Sometimes more heap memory can be bad - it can take a lot more time to perform GC
Example GC roots:
● Local variables (active method)
● Live threads
● Static fields of loaded classes
● Some JNI references
Image courtesy of: https://plumbr.eu/handbook/garbage-collection-algorithms
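The marking idea can be illustrated with a toy model. This is not how the JVM is implemented; it is a small reachability walk over a hand-built object graph, with made-up names (ToyMark, Node), to show why an object stays alive only if some chain of references leads to it from a root.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

class ToyMark {
    /** A toy heap object that only knows which other objects it references. */
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    /** Marks every node reachable from the given roots (breadth-first walk). */
    static Set<Node> mark(List<Node> roots) {
        Set<Node> marked = Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node node = pending.poll();
            if (marked.add(node)) {       // first visit: follow its outgoing references
                pending.addAll(node.refs);
            }
        }
        return marked;
    }

    public static void main(String[] args) {
        Node root = new Node("static field");
        Node a = new Node("a");
        Node b = new Node("b");
        Node orphan = new Node("orphan");
        root.refs.add(a);
        a.refs.add(b);
        b.refs.add(a);        // cycles do not keep objects alive on their own
        orphan.refs.add(a);   // pointing at a live object does not make the orphan live

        Set<Node> live = mark(Collections.singletonList(root));
        for (Node n : new Node[] {root, a, b, orphan}) {
            System.out.println(n.name + " -> " + (live.contains(n) ? "live" : "dead"));
        }
    }
}
```

The cost of the walk grows with the number of live objects, which matches the point above that more live objects means more time marking.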
15. There are four main choices for GC implementations, each with their own strengths and weaknesses
Common Name Details
Serial GC
-XX:+UseSerialGC
● All threads are stopped and then live objects are marked, copied/made contiguous
● NEVER use this for any multi-core machine; it can be beneficial for single-core boxes
● Ideal use case: Single processor machines (think embedded devices)
Parallel GC
-XX:+UseParallelGC
-XX:+UseParallelOldGC
● All phases are run with multiple threads (a parallelized serial GC)
● All cores do GC work; no CPU cycles are used for GC when not running GC
● Can result in increased latency (stop the world collections)
● Ideal use case: High throughput, don’t mind long pauses
Concurrent Mark and Sweep (CMS)
-XX:+UseConcMarkSweepGC
-XX:+UseParNewGC
● 1/4 of the CPU cores are regularly used to perform background object analysis
● Minimizes pause times, but results in lower throughput
● Can be less predictable on large heaps
● Ideal use case: Short pause times matter more than total throughput
G1
-XX:+UseG1GC
● This is the latest in GC algorithms and is regularly being improved with JVM updates
● This divides young and oldgen into over 2000 different regions, allowing for incremental collection
● Most advanced and most rapidly changing implementation
● Will be the default for Java 9
● Ideal use case: Large (6 GB+) heaps where you want predictable GC pauses with minimal
impact on throughput
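To confirm which of these collectors a process actually ended up with, one option is to look at the names of the registered GarbageCollectorMXBeans. The name strings matched below are the ones HotSpot-based Java 8 JVMs commonly report; they are an assumption about that family of JVMs rather than part of any specification, and other vendors may use different names. WhichCollector is an illustrative class name.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class WhichCollector {
    /** Best-effort mapping from collector bean names to the common names used above. */
    static String classify(List<String> beanNames) {
        boolean g1 = false, cms = false, parallel = false, serial = false;
        for (String name : beanNames) {
            if (name.startsWith("G1")) g1 = true;
            else if (name.equals("ConcurrentMarkSweep")) cms = true;
            else if (name.startsWith("PS ")) parallel = true;
            else if (name.equals("Copy") || name.equals("MarkSweepCompact")) serial = true;
        }
        if (g1) return "G1";
        if (cms) return "CMS";
        if (parallel) return "Parallel";
        if (serial) return "Serial";
        return "unknown";
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        System.out.println(names + " -> " + classify(names));
    }
}
```

An "unknown" result simply means the names did not match this short list, which is expected on newer or non-HotSpot JVMs.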
17. The high-level approach to tuning garbage collection involves trying different combinations, measuring
throughput and pauses, and making educated guesses
Practical approach to performing GC tuning at a high-level:
1. Pick a GC algorithm which initially makes sense for your application based on known
strengths/weaknesses
2. Derive a load test for your application (JMeter, LoadUI, etc.)
○ Must independently measure throughput (i.e., how well did the test perform versus my goals)
○ Must be consistent, repeatable
○ Minimize your variables
3. Enable verbose GC logging (and dump to a separate file - see recommended JVM args later)
○ Allows you to collect the data you need to understand your results
4. Execute load test
5. Analyze load test results, GC logs
6. Make adjustments (1 variable at a time), rinse, repeat
If you aren’t sure which GC algorithm is best for your application, you can try multiple!
● Recommend doing this before tuning individual settings for specific GC algorithms
● Attempt load test with GC algorithm-specific default settings first
18. ● GCHisto
○ Allows for visual analysis of verbose GC logs
○ Free, originally a plugin for VisualVM
○ https://github.com/jewes/gchisto
● GCViewer
○ Allows for visual analysis of verbose GC logs similar to GCHisto
○ https://github.com/chewiebug/GCViewer/wiki
● JConsole
○ Free, shipped with the JDK
○ Can use JMX connection to get access to and insight about a running JVM “on the fly”
○ Gives you insight into current memory usage and GC
○ Can be used to view and manipulate specifically exposed data within the JVM on the fly without
restarting the process (future talk)
There are a few useful tools to help analyze garbage collection performance
20. JVM Tuning Technical Talk … JVM Profiling …
A Java process can be inspected to get insight on data related to performance and memory usage
Main goals around JVM profiling:
● Analyze code behavior
○ Inspect individual method invocation counts and durations
○ Inspect individual thread states
○ Can be used to help find your performance bottlenecks
● Analyze memory usage
○ Every object in the heap is available for inspection
○ Identify GC roots (or lack thereof) for each object
These things can be done in real-time via attaching a profiler, or offline inspecting data files:
○ thread dump file
■ The state of all the threads in the application
○ hprof file
■ Snapshot of everything in the JVM heap with other useful metadata
■ Can be generated via: JConsole, JVisualVM, jmap, etc.
■ hprof files can be generated automatically on OutOfMemoryErrors - *** very useful ***
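Besides external tools, a thread dump can also be produced from inside the process with the standard ThreadMXBean. The sketch below prints each thread's name, state, and stack; the output format is simplified and is not the same as the format the JDK command line tools emit. ThreadDump is an illustrative class name.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;

class ThreadDump {
    /** Builds a simplified textual dump of every live thread. */
    static String dump() {
        StringBuilder out = new StringBuilder();
        // false, false: skip locked monitor and synchronizer details to keep the dump cheap
        ThreadInfo[] threads = ManagementFactory.getThreadMXBean().dumpAllThreads(false, false);
        for (ThreadInfo info : threads) {
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```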
21. JVM Tuning Technical Talk … JVM Profiling …
Profiling is made available locally and remotely via different mechanisms and it is important to understand
how you get access to this information
Your profiling tools are running in different processes than your target JVMs
● They can even be on different machines
● It is important to be aware of these boundaries
● The profiling tools can connect via a socket connection - be aware of firewall rules, etc.
JVM monitoring connection options:
● jstatd
○ this is a standalone daemon process which needs to be running on the same machine as the JVM
you want to monitor/manage
○ This requires starting jstatd on the machine you want to get access to
○ Monitoring capabilities are limited
○ Can be useful in the case where you don’t have JMX enabled and can’t restart your process
● JMX
○ More powerful access to the JVM
○ Requires starting your target java process with some command line arguments
○ This is enabled by default on Java 6+ locally
■ Be warned - if the target process isn’t running as the same user or JVM as the monitoring tool,
they won’t find each other unless you explicitly define ports
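As a sketch of what a JMX client does over that socket connection, the code below reads heap usage from a target JVM. It assumes the target was started with remote JMX enabled, for example with -Dcom.sun.management.jmxremote.port=9010 plus -Dcom.sun.management.jmxremote.authenticate=false and -Dcom.sun.management.jmxremote.ssl=false (suitable only for a trusted test network). The port 9010 and the class name JmxHeapProbe are illustrative. When run without arguments it falls back to the local platform MBean server so it can be tried without a remote target.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

class JmxHeapProbe {
    /** Standard RMI-based JMX service URL for a host and port. */
    static String serviceUrl(String host, int port) {
        return "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi";
    }

    /** Reads the used heap size in bytes through any MBean server connection. */
    static long heapUsedBytes(MBeanServerConnection connection) {
        try {
            MemoryMXBean memory = ManagementFactory.newPlatformMXBeanProxy(
                    connection, ManagementFactory.MEMORY_MXBEAN_NAME, MemoryMXBean.class);
            return memory.getHeapMemoryUsage().getUsed();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            // No target given: inspect this JVM through its own platform MBean server
            System.out.println("local heap used: "
                    + heapUsedBytes(ManagementFactory.getPlatformMBeanServer()));
            return;
        }
        JMXServiceURL url = new JMXServiceURL(serviceUrl(args[0], Integer.parseInt(args[1])));
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            System.out.println("remote heap used: "
                    + heapUsedBytes(connector.getMBeanServerConnection()));
        }
    }
}
```

If the remote connection hangs or is refused, the firewall rules mentioned above are the first thing to check.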
22. ● VisualVM
○ Free, shipped with the JDK
○ Allows for deep insight into the performance of a given JVM
○ Will demo this today
● YourKit
○ Has many useful features above and beyond VisualVM, but usually not necessary
○ https://www.yourkit.com/java/profiler/
○ Paid commercial tool with free trial period
● JProfiler
○ Has a graphical representation of call stacks, can be easier to navigate
○ Paid commercial tool
It’s demo time!!!
There are a plethora of tools available for getting insight into JVM activities and performance
23. ● Many other command line tools are shipped with the JDK
○ jmap - can be used to force a heap dump from a running JVM
○ jstatd - allows remote access to already running JVMs
○ http://docs.oracle.com/javase/8/docs/technotes/tools/
JVM Tuning Technical Talk … JVM Tuning Tools …
Other notable tools available for getting insight into and analysis of JVM performance
25. ● -server
○ This causes the JVM’s JIT to be more aggressive
○ This increases startup times, but improves performance
○ Usually set by default based on “physical” system profile, but doesn’t hurt to force it
● -Xms<size> and -Xmx<size>
○ These set the minimum and maximum heap allocation sizes (note: no “=” between the flag and the size)
○ If not specified, the JVM will use some defaults based on the amount of memory on the machine -
can be dangerous
○ Protip: set them to the same value (e.g. -Xms512m -Xmx512m)
■ Removing the JVM’s need to grow and shrink the heap avoids resizing overhead and can improve performance
■ Also helps ensure you have enough memory to handle all the processes you intend to run
● -XX:MaxMetaspaceSize=<size>
○ This limits the metaspace memory allocation, which is important since a metaspace leak is outside of
the heap
○ Without this, “worst case scenario” is that your java process grows unbounded
○ Must monitor load test to get a feel for how much will be needed
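A quick way to verify that the heap settings took effect is to ask the Runtime from inside the process. This is a sketch with an illustrative class name (HeapSettings); note that maxMemory() is only approximately the -Xmx value, since some collectors report slightly less than the configured maximum.

```java
class HeapSettings {
    static long toMb(long bytes) {
        return bytes / (1024 * 1024);
    }

    /** Summarizes the heap limits the running JVM reports. */
    static String report() {
        Runtime rt = Runtime.getRuntime();
        return String.format("max=%d MB, committed=%d MB, free within committed=%d MB",
                toMb(rt.maxMemory()), toMb(rt.totalMemory()), toMb(rt.freeMemory()));
    }

    public static void main(String[] args) {
        System.out.println(report());
    }
}
```

With -Xms and -Xmx set to the same value, the max and committed figures should stay close to each other for the life of the process.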
JVM Tuning Technical Talk … JVM Arguments for Production …
The following JVM arguments should be utilized in production for most server-side java processes
26. ● -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=<path where the heap dump should be written>
○ Forces the generation of an hprof file when OOME occurs
○ Setting the path makes the JVM write the hprof (which can be big) somewhere you expect
○ Make sure that appropriate space will exist (the hprof can be roughly as large as the -Xmx value)
○ Make sure it won’t cause disk space issues in production
● -XX:+PrintGCDateStamps -verbose:gc -XX:+PrintGCDetails -Xloggc:<file path and name>
○ These arguments enable GC logging which is ingestible via the GC analysis tools
○ Necessary to get insight into JVM GC behavior and performance
○ Be aware that this increases logging
● -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=<X> -XX:GCLogFileSize=<max size>
○ This causes the GC logs to be automatically rotated
○ This will limit the disk space the GC logs occupy (making it easier to have verbose GC logs on in
prod)
● -XX:+PrintCommandLineFlags
○ This will show you which GC is being used (as well as other defaulted JVM args)
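One way to guard against a production process starting without these arguments is a small startup check against the JVM's own input arguments. The sketch below is only an illustration: the list of expected prefixes is taken from the Java 8 flags in these slides, and the class name ProdFlagCheck is made up.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ProdFlagCheck {
    // Flag prefixes from the slides; adjust to whatever your deployment requires
    static final List<String> EXPECTED = Arrays.asList(
            "-Xms", "-Xmx", "-XX:MaxMetaspaceSize=",
            "-XX:+HeapDumpOnOutOfMemoryError", "-XX:HeapDumpPath=", "-Xloggc:");

    /** Returns the expected prefixes that no JVM argument starts with. */
    static List<String> missing(List<String> jvmArgs) {
        List<String> result = new ArrayList<>();
        for (String prefix : EXPECTED) {
            boolean found = false;
            for (String arg : jvmArgs) {
                if (arg.startsWith(prefix)) {
                    found = true;
                    break;
                }
            }
            if (!found) {
                result.add(prefix);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("JVM arguments: " + jvmArgs);
        System.out.println("Missing recommended flags: " + missing(jvmArgs));
    }
}
```

Logging the result at startup is usually enough; failing the process outright is a stricter choice that depends on your environment.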
The following JVM arguments should be utilized in production for most server-side java processes
27. Oracle documentation: https://docs.oracle.com/javase/8/docs/technotes/guides/vm/gctuning/
Plumbr garbage collection handbook: https://plumbr.eu/java-garbage-collection-handbook
SE Radio podcast (which contains its own useful links):
http://www.se-radio.net/2016/04/se-radio-episode-255-monica-beckwith-on-java-garbage-collection/
G1GC tuning reference: http://www.oracle.com/technetwork/tutorials/tutorials-1876574.html
Links for further study