So You Want To Write Your Own Benchmark
Dror Bereznitsky
December 18th, 2008
Agenda


  • Introduction
  • Java™ micro benchmarking pitfalls
  • Writing your own benchmark
  • Micro benchmarking tools
  • Summary



                                        2
Microbenchmark – simple definition




 1. Start the clock → 2. Run the code → 3. Stop the clock → 4. Report




                                                            3
Better microbenchmark definition


   • Small program
   • Goal: Measure something about a few
     lines of code
   • All other variables should be removed
   • Returns some kind of numeric result



                                             4
Why do I need microbenchmarks?


 • Discover something about my code:
  • How fast is it?
  • Calculate throughput – TPS, KB/s
 • Measure the result of changing my code:
  • Should I replace a HashMap with a TreeMap?
  • What is the cost of synchronizing a method?


                                                  5
Why are you talking about this?


   • It’s hard to write a robust
     microbenchmark
   • It’s even harder to do it in Java™
   • There are not enough Java
     microbenchmarking tools
   • There are too many flawed
     microbenchmarks out there


                                          6
Agenda


  • Introduction
  • Java micro benchmarking pitfalls
  • Writing your own benchmark
  • Micro benchmarking tools
  • Summary



                                       7
A microbenchmark story: the problem


The boss asks you to solve a performance issue
  in one of the components


                                Blah, blah …




                                               8
A microbenchmark story: the cause


   You find out that the cause is excessive use
     of Math.sqrt()




                                              9
A microbenchmark story: a solution?




   • You decide to develop a state of the art
     square root approximation
   • After developing the square root
     approximation you want to benchmark it
     against the java.lang.Math
     implementation


                                                10
SQRT approximation microbenchmark


  Let’s run this little piece of code in a loop
    and see what happens …

   public static void main(String[] args) {
       long start = System.currentTimeMillis(); // start the clock
       for (double i = 0; i < 10 * 1000 * 1000; i++) {
           mySqrt(i); // little piece of code
       }
       long end = System.currentTimeMillis(); // stop the clock
       long duration = end - start;
       System.out.format("Test duration: %d (ms) %n", duration);
   }


                                                                     11
SQRT microbenchmark results


  Wow, this is really fast !

  Test duration: 0 (ms)




                               12
Flawed microbenchmark




                        13
SQRT microbenchmark: what’s wrong?


    Dynamic optimizations
   Garbage collection         Dead code elimination


The Java™ HotSpot virtual machine
    Classloading
                       Dynamic Compilation
          On Stack Replacement

                                                      14
The HotSpot: a mixed mode system



   1. Code is interpreted
   2. Profiling
   3. Dynamic compilation
   4. Stuff happens
   5. Interpreted again or recompiled
                                                          15
Dynamic compilation


 • Dynamic compilation is unpredictable
  • Don’t know when the compiler will run
  • Don’t know how long the compiler will run
  • Same code may be compiled more than once
  • The JVM can switch to compiled code at will



                                                  16
Dynamic compilation cont.


   • Dynamic compilation can seriously
     influence microbenchmark results

   Continuous recompilation (interpreted execution + dynamic compilation +
   compiled code execution)  ≠  steady-state execution of compiled (or interpreted) code



                                                           17
Dynamic optimizations



   • The HotSpot server compiler performs a
     large variety of optimizations:
     • loop unrolling
     • range check elimination
     • dead-code elimination
     • code hoisting …


                                            18
Code hoisting ?


                  Did he just say
                  “code hoisting”?




                                     19
What the heck is code hoisting ?


   • Hoist = to raise or lift
   • Size optimization
   • Eliminate duplicated pieces
     of code in method bodies
     by hoisting expressions
     or statements




                                   20
Code hoisting example


   Before: a + b is a busy expression. After hoisting the expression a + b,
   a new local variable t has been introduced (illustrated in the sketch below).
Optimizing Java for Size: Compiler Techniques for Code Compaction, Samuli Heilala

                                                                                     21
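A minimal Java illustration of the idea (this is hand-written source, not the compiler's actual transformation, which happens on the compiled representation):

   class HoistingExample {
       // Before hoisting: a + b is a busy expression, evaluated on both branches
       static int before(int a, int b, boolean flag) {
           if (flag) {
               return (a + b) * 2;
           }
           return (a + b) * 3;
       }

       // After hoisting: the expression is computed once into a new local variable t
       static int after(int a, int b, boolean flag) {
           int t = a + b;
           if (flag) {
               return t * 2;
           }
           return t * 3;
       }
   }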
Dynamic optimizations cont.




  • Most of the optimizations are performed
    at runtime
  • Profiling data is used by the compiler to
    improve optimization decisions
  • You don’t have access to the dynamically
    compiled code


                                                22
Example: Very fast square root?



 10,000,000 calls to Math.sqrt() ~ 4 ms

  public static void main(String[] args) {
      long start = System.nanoTime();
      int result = 0;
      for (int i = 0; i < 10 * 1000 * 1000; i++) {
          result += Math.sqrt(i);
      }
      long duration = (System.nanoTime() - start) / 1000000;
      System.out.format("Test duration: %d (ms) %n", duration);
  }


                                                                  23
Example: not so fast?


 Now it takes ~ 2000 ms ?!?
  public static void main(String[] args) {
      long start = System.nanoTime();
      int result = 0;
      for (int i = 0; i < 10 * 1000 * 1000; i++) {
          result += Math.sqrt(i);
      }
      System.out.format("Result: %d %n", result); // single line of code added
      long duration = (System.nanoTime() - start) / 1000000;
      System.out.format("Test duration: %d (ms) %n", duration);
  }

                                                                    24
DCE - Dead Code Elimination


 • Dead code - code that has no effect on the
   outcome of the program execution
  public static void main(String[] args) {
      long start = System.nanoTime();
      int result = 0;
      for (int i = 0; i < 10 * 1000 * 1000; i++) {
          result += Math.sqrt(i); // dead code: result is never used afterwards
      }
      long duration = (System.nanoTime() - start) / 1000000;
      System.out.format("Test duration: %d (ms) %n", duration);
  }


                                                                      25
OSR - On Stack Replacement



   • Methods are HOT if they cumulatively
     execute more than 10,000 loop iterations
   • Older JVM versions did not switch to the
     compiled version until the method exited
     and was re-entered
   • OSR - switch from interpretation to
     compiled code in the middle of a loop

                                                26
OSR and microbenchmarking


 • OSR’d code may be less performant
   • Some optimizations are not performed
 • OSR usually happens when you put
   everything into one long method
   • Developers tend to write long main()
     methods when benchmarking
   • Real-life applications are (hopefully) divided
     into more fine-grained methods (see the sketch below)
                                                    27
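A minimal sketch of that advice (a hypothetical Math.sqrt workload, not the original mySqrt benchmark): keep the timed loop in a small method that is called repeatedly, so HotSpot can compile the whole method on re-entry instead of relying on OSR of a long-running main().

   public class SqrtLoopBenchmark {

       // The measured work lives in its own small method, not inside main()
       private static double runIterations(int iterations) {
           double sum = 0;
           for (int i = 0; i < iterations; i++) {
               sum += Math.sqrt(i);
           }
           return sum;
       }

       public static void main(String[] args) {
           double sink = 0;
           // Calling the method many times lets it become hot and be compiled normally
           for (int round = 0; round < 50; round++) {
               long start = System.nanoTime();
               sink += runIterations(1000 * 1000);
               long durationUs = (System.nanoTime() - start) / 1000;
               System.out.format("Round %d: %d (us) %n", round, durationUs);
           }
           System.out.println("Sink: " + sink); // keep the result alive (avoid DCE)
       }
   }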
Classloading



  • Classes are usually loaded only when
     they are first used
  • Class loading takes time
    • I/O
    • Parsing
    • Verification
  • May skew your benchmark results
                                           28
Garbage Collection


   • The JVM automatically reclaims resources via
     • Garbage collection
     • Object finalization
   • Outside of developer’s control
   • Unpredictable
   • Should be measured if invoked as a result
     of the benchmarked code

                                             29
Time measurement


     How long is one millisecond?
 public static void main(String[] args) throws InterruptedException {
     long start = System.currentTimeMillis();
     Thread.sleep(1);
     final long end = System.currentTimeMillis();
     final long duration = end - start;
     System.out.format("Test duration: %d (ms) %n", duration);
 }


Test duration: 16 (ms)

                                                              30
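A small sketch (not from the slides) that estimates the granularity of System.currentTimeMillis() on the current machine by spinning until the reported value changes:

   public class TimerGranularity {
       public static void main(String[] args) {
           long previous = System.currentTimeMillis();
           for (int i = 0; i < 10; i++) {
               long current = System.currentTimeMillis();
               while (current == previous) {          // busy-wait for the next tick
                   current = System.currentTimeMillis();
               }
               System.out.format("Timer advanced by %d (ms) %n", current - previous);
               previous = current;
           }
       }
   }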
System.currentTimeMillis()


 • Accuracy varies with platform

  Resolution    Platform                   Source
  55 ms         Windows 95/98              Java Glossary
  10 – 15 ms    Windows NT, 2K, XP, 2003   David Holmes
  1 ms          Mac OS X                   Java Glossary
  1 ms          Linux – 2.6 kernel         Markus Kobler


                                                       31
Wrong target platform


   • Choosing the wrong platform for your
     microbenchmark
     • Benchmarking on Windows when your
       target platform is Linux
     • Benchmarking a highly threaded
       application on a single core machine
     • Benchmarking on a Sun JVM when the
       target platform is Oracle (BEA) JRockit

                                                 32
Caching



  • Caching
    • Hardware – CPU caching
    • Operating System – File system caching
    • Database – query caching



                                               33
Caching: CPU L1 and L2 caches


 • The farther the accessed data is from the
   CPU, the higher the access latency
 • The size of the data set affects access cost (see the sketch below)
    Array size              Time (us)                 Cost (ns)
    16k                     413451                    9.821
    8192K                   5743812                   136.446


      Jcachev2 results for Intel® core™2 duo T8300, L1 = 32 KB, L2 = 3 MB


                                                                            34
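A rough sketch of the same effect (array sizes assumed, not the Jcache benchmark itself). Sequential traversal is prefetch-friendly, so the gap will be smaller than with the random access pattern a dedicated cache benchmark uses, but a per-access cost difference should still be visible:

   public class CacheSizeEffect {

       // Traverse the array until at least minAccesses elements have been read,
       // and return the average cost of a single access in nanoseconds
       private static double perAccessNanos(int[] data, long minAccesses) {
           long sum = 0;
           long accesses = 0;
           long start = System.nanoTime();
           while (accesses < minAccesses) {
               for (int i = 0; i < data.length; i++) {
                   sum += data[i];
               }
               accesses += data.length;
           }
           double result = (System.nanoTime() - start) / (double) accesses;
           if (sum == 42) System.out.println("unlikely"); // keep sum alive (avoid DCE)
           return result;
       }

       public static void main(String[] args) {
           int[] small = new int[4 * 1024];         // 16 KB of ints, fits in cache
           int[] large = new int[8 * 1024 * 1024];  // 32 MB of ints, well beyond L2
           perAccessNanos(small, 50 * 1000 * 1000); // warm-up runs
           perAccessNanos(large, 50 * 1000 * 1000);
           System.out.format("small array: %.3f ns/access %n", perAccessNanos(small, 200 * 1000 * 1000));
           System.out.format("large array: %.3f ns/access %n", perAccessNanos(large, 200 * 1000 * 1000));
       }
   }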
Busy environment


 • Running in a busy environment – CPU,
   IO, Memory




                                          35
Agenda


  • Introduction
  • Java micro benchmarking pitfalls
  • Writing your own benchmark
  • Micro benchmarking tools
  • Summary



                                       36
Warm up your code




                    37
Warm up your code




  • Let the JVM reach a steady-state execution
     profile before you start benchmarking
  • All classes should be loaded before
     benchmarking
  • Usually executing your code for ~10
     seconds is enough (see the warm-up sketch below)


                                               38
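A minimal warm-up sketch (the benchmarkIteration() workload here is a stand-in, not the original code): run the code under test untimed for about ten seconds, then start measuring.

   public class WarmedUpBenchmark {

       private static volatile double sink; // consume results so the work is not dead code

       // Stand-in for the code under test
       private static void benchmarkIteration() {
           sink += Math.sqrt(sink + 1);
       }

       public static void main(String[] args) {
           // Warm-up: run untimed for ~10 seconds so classes load and the JIT compiles the code
           long warmupEnd = System.nanoTime() + 10L * 1000 * 1000 * 1000;
           while (System.nanoTime() < warmupEnd) {
               benchmarkIteration();
           }

           // Measurement phase
           long start = System.nanoTime();
           for (int i = 0; i < 10 * 1000 * 1000; i++) {
               benchmarkIteration();
           }
           long durationMs = (System.nanoTime() - start) / 1000000;
           System.out.format("Test duration: %d (ms) %n", durationMs);
       }
   }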
Warm up your code – cont.


   • Detect JIT compilations by using
     • CompilationMXBean.getTotalCompilationTime()
     • -XX:+PrintCompilation
   • Measure classloading time
     • Use the ClassLoadingMXBean (see the sketch below)


                                        39
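A small sketch of the classloading check (the warm-up step itself is elided): compare the total number of loaded classes before and after warm-up, so late classloading does not land inside the measured interval.

   import java.lang.management.ClassLoadingMXBean;
   import java.lang.management.ManagementFactory;

   public class ClassLoadingCheck {
       public static void main(String[] args) {
           ClassLoadingMXBean classBean = ManagementFactory.getClassLoadingMXBean();
           long before = classBean.getTotalLoadedClassCount();

           // ... run the warm-up phase of the benchmark here ...

           long loadedDuringWarmup = classBean.getTotalLoadedClassCount() - before;
           System.out.format("Classes loaded during warm-up: %d %n", loadedDuringWarmup);
       }
   }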
CompilationMXBean usage



 import java.lang.management.ManagementFactory;
 import java.lang.management.CompilationMXBean;

 long compilationTimeTotal = 0;

 // Ask the JVM how much time the JIT compiler has spent so far
 CompilationMXBean compBean =
    ManagementFactory.getCompilationMXBean();

 if (compBean.isCompilationTimeMonitoringSupported())
    compilationTimeTotal = compBean.getTotalCompilationTime(); // milliseconds




                                                                 40
Dynamic optimizations


   • Avoid on-stack replacement
     • Don’t put all your benchmark code in one
       big main() method
   • Avoid dead code elimination
     • Print (or otherwise consume) the final result (see the sketch below)
     • Treat unreasonable speedups as a warning sign


                                                  41
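Besides printing the result, one common idiom (not from the slides) is to let the result escape into a field and report it once at the end, so the JIT cannot prove the measured loop is dead code:

   public class ResultSink {

       private static double sink; // the result escapes here, so the loop is not dead code

       public static void main(String[] args) {
           long start = System.nanoTime();
           double result = 0;
           for (int i = 0; i < 10 * 1000 * 1000; i++) {
               result += Math.sqrt(i);
           }
           sink = result;
           long durationMs = (System.nanoTime() - start) / 1000000;
           System.out.format("Test duration: %d (ms), sink = %f %n", durationMs, sink);
       }
   }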
Garbage Collection


   • Measure garbage collection time
     • Force garbage collection and finalization
       before benchmarking (see the sketch below)
     • Perform enough iterations to reach garbage
       collection steady state
     • Gather GC stats:
       -XX:+PrintGCTimeStamps
       -XX:+PrintGCDetails

                                                   42
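A best-effort sketch of the first bullet: request collection and finalization a few times before the measured run. System.gc() is only a hint to the JVM, so this cleans the heap on a best-effort basis only.

   public class CleanHeapFirst {
       public static void main(String[] args) throws InterruptedException {
           // Try to start the measurement from a quiet heap
           for (int i = 0; i < 3; i++) {
               System.gc();
               System.runFinalization();
               Thread.sleep(100); // give the collector a moment to finish
           }

           long start = System.nanoTime();
           // ... run the benchmarked code here ...
           long durationMs = (System.nanoTime() - start) / 1000000;
           System.out.format("Test duration: %d (ms) %n", durationMs);
       }
   }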
Time measurement


 • Use System.nanoTime()
   • Microsecond accuracy on modern operating
     systems and hardware
   • Accuracy is no worse than currentTimeMillis()
   • Note for Windows users:
    • the call itself takes microseconds to execute
    • don’t overuse it!

                                             43
JVM configuration


  • Use JVM options similar to your target
    environment (illustrative command line below):
    • -server or -client JVM
    • Enough heap space (-Xmx)
    • Garbage collection options
    • Thread stack size (-Xss)
    • JIT compilation options

                                             44
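For example, a benchmark aimed at a server-class deployment might be launched with something like: java -server -Xmx1024m -Xss256k -verbose:gc MyBenchmark. The class name and flag values here are illustrative; copy the real options from the target environment.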
Other issues


   • Use fixed size data sets
     • Too large data sets can cause L1 cache
       blowout
   • Notice system load
     • Don’t play GTA while benchmarking !




                                                45
Agenda


  • Introduction
  • Java micro benchmarking pitfalls
  • Writing your own benchmark
  • Micro benchmarking tools
  • Summary



                                       46
Java™ benchmarking tools


  • Various specialized benchmarks
    • SPECjAppServer ®
    • SPECjvm™
    • CaffeineMark 3.0™
    • SciMark 2.0
  • Only a few benchmarking frameworks


                                         47
Japex Micro-Benchmark framework


  • Similar in spirit to JUnit
  • Measures throughput – work over time
    • Transactions Per Second (Default)
    • KBs per second
  • XML based configuration
  • XML/HTML reports


                                           48
Japex: Drivers


   • Encapsulates knowledge about a specific
      algorithm implementation
   • Must extend JapexDriverBase
  public interface JapexDriver extends Runnable {
      public void initializeDriver();
      public void prepare(TestCase testCase);
      public void warmup(TestCase testCase);
      public void run(TestCase testCase);
      public void finish(TestCase testCase);
      public void terminateDriver();
  }
                                                    49
Japex: Writing your own driver



public class SqrtNewtonApproxDriver extends JapexDriverBase {
    private long tmp;
    …
    @Override
    public void warmup(TestCase testCase) {
        tmp += sqrt(getNextRandomNumber());
    }
    …
}




                                                                50
Japex: Test suite


 <testSuite name="SQRT Test Suite"
        xmlns="http://www.sun.com/japex/testSuite" …>
    <param name="libraryDir" value="C:/java/japex/lib"/>
    <param name="japex.classPath" value="./target/classes"/>
    <param name="japex.runIterations" value="1000000" />
    <driver name="SqrtApproxNewtonDriver">
       <param name="Description" value="Newton Driver"/>
       <param name="japex.driverClass"
              value="com.alphacsp.javaedge.benchmark.
                     japex.driver.SqrtNewtonApproxDriver"/>
    </driver>
    <testCase name="testcase1"/>
 </testSuite>

                                                               51
Japex: HTML Reports




                      52
Japex: more chart types




                          Scatter chart




        Line chart




                                          53
Japex: pros and cons


   • Pros
     • Similar to JUnit
     • Nice HTML reports
   • Cons
     • Last stable release in March 2007
     • HotSpot issues are not handled
     • XML configuration
                                           54
Brent Boyer’s Benchmark framework


 • Part of the “Robust Java benchmarking”
   article by Brent Boyer
 • Automate as many aspects as possible:
   • Resource reclamation
   • Class loading
   • Dead code elimination
   • Statistics

                                            55
Benchmark framework example


 Benchmark.Params params = new Benchmark.Params(true);
 params.setExecutionTimeGoal(0.5);
 params.setNumberMeasurements(50);

 Runnable task = new Runnable() {
     public void run() {
         sqrt(getNextRandomNumber());
     }
 };

 Benchmark benchmark = new Benchmark(task, params);
 System.out.println(benchmark.toString());


                                                         56
Benchmark single line summary


  Benchmark output:
  first = 25.702 us,
  mean = 91.070 ns
    (CI deltas: -115.591 ps, +171.423 ps)
  sd = 1.451 us (CI deltas: -461.523 ns, +676.964 ns)


  WARNING: execution times have mild outliers, SD
   VALUES MAY BE INACCURATE

                                                        57
Outlier and serial correlation issues

   • Records outlier and serial correlation
     issues
   • Outliers indicate that a major
     measurement error happened
      • Large outliers - some other activity started on the
         computer during measurement
      • Small outliers might hint that DCE occurred
      • Serial correlation indicates that the JVM has not
         reached its steady-state performance profile


                                                              58
Benchmark : pros and cons


   • Pros
    • Handles HotSpot related issues
    • Detailed statistics
   • Cons
    • Each run takes a lot of time
    • Not a formal project
    • Lacks documentation
                                       59
Agenda


  • Introduction
  • Java micro benchmarking pitfalls
  • Writing your own benchmark
  • Micro benchmarking tools
  • Summary



                                       60
Summary 1


  • Micro benchmarking is hard when it
    comes to Java™
  • Define what you want to measure and
    how you want to do it; pick your goals
  • Know what you are doing
   • Always warm-up your code
   • Handle DCE, OSR, GC issues
   • Use fixed size data sets and fixed work
                                               61
Summary 2


 • Do not rely solely on microbenchmark
   results
   • Sanity check results
   • Use a profiler
   • Test your code in real life scenarios under
     realistic load (macro-benchmark)



                                                   62
Summary: resources


  • http://www.ibm.com/developerworks/java/library/j-benchmark1.html
  • http://www.azulsystems.com/events/javaone_2002/microbenchmarks.pdf
  • https://japex.dev.java.net/
  • http://www.ibm.com/developerworks/java/library/j-jtp12214/
  • http://www.dei.unipd.it/~bertasi/jcache/


                                                    63
Thank You!
        64
