JMH Notes

Microbenchmarking with JMH: measure, don't guess!

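Over the past few days I read a couple of interesting articles about lambdas and JVM performance measurement:
How Java 8 lambda expressions and stream operations can make your code 5 times slower
Microbenchmarking with JMH: measure, don't guess!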

JMH

JMH is a Java harness for building, running, and analysing nano/micro/milli/macro benchmarks written in Java and other languages targeting the JVM.

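JMH is a tool for measuring the performance of code that runs on the JVM
(Java, Scala, Kotlin, Groovy, Clojure, etc.).
The official recommendation is to create the benchmark project with Maven, which avoids odd setup problems that could skew the results.
Replace groupId with your own package name,
and artifactId with the name of the benchmark project; a folder with that name is created in the current directory.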


mvn archetype:generate \
          -DinteractiveMode=false \
          -DarchetypeGroupId=org.openjdk.jmh \
          -DarchetypeArtifactId=jmh-java-benchmark-archetype \
          -DgroupId=com.example \
          -DartifactId=jmh-benchmark \
          -Dversion=1.0

The code generated by Maven


import org.openjdk.jmh.annotations.Benchmark;

public class MyBenchmark {

    @Benchmark
    public void testMethod() {
        // This is a demo/sample template for building your JMH benchmarks. Edit as needed.
        // Put your benchmark code here.
    }

}

Never mind the details for now, just run it first:


$ cd jmh-benchmark
$ mvn clean install
$ java -jar target/benchmarks.jar

Output


# JMH version: 1.19
# VM version: JDK 1.8.0_92, VM 25.92-b14
# VM invoker: /Library/Java/JavaVirtualMachines/jdk1.8.0_92.jdk/Contents/Home/jre/bin/java
# VM options: 
# Warmup: 20 iterations, 1 s each
# Measurement: 20 iterations, 1 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Throughput, ops/time
# Benchmark: com.example.MyBenchmark.testMethod

...

# Run progress: 90.00% complete, ETA 00:00:40
# Fork: 10 of 10
# Warmup Iteration   1: 3189934685.250 ops/s
# Warmup Iteration   2: 3068794073.424 ops/s
# Warmup Iteration   3: 3236114041.508 ops/s
# Warmup Iteration   4: 3134197602.346 ops/s
# Warmup Iteration   5: 3148638128.516 ops/s
# Warmup Iteration   6: 3173960761.557 ops/s
# Warmup Iteration   7: 3169265377.660 ops/s
# Warmup Iteration   8: 3077919609.443 ops/s
# Warmup Iteration   9: 3166464153.044 ops/s
# Warmup Iteration  10: 3078477796.372 ops/s
# Warmup Iteration  11: 3099929982.724 ops/s
# Warmup Iteration  12: 3036217732.999 ops/s
# Warmup Iteration  13: 3006113218.527 ops/s
# Warmup Iteration  14: 3181689542.757 ops/s
# Warmup Iteration  15: 3119538331.842 ops/s
# Warmup Iteration  16: 3105301506.039 ops/s
# Warmup Iteration  17: 3048849508.645 ops/s
# Warmup Iteration  18: 3072101635.390 ops/s
# Warmup Iteration  19: 3144030955.986 ops/s
# Warmup Iteration  20: 3151332956.538 ops/s
Iteration   1: 3164452829.947 ops/s
Iteration   2: 3113814464.802 ops/s
Iteration   3: 3052097465.786 ops/s
Iteration   4: 2980582092.738 ops/s
Iteration   5: 3042852421.053 ops/s
Iteration   6: 3091791507.999 ops/s
Iteration   7: 3137421146.240 ops/s
Iteration   8: 2921380582.360 ops/s
Iteration   9: 2991112676.148 ops/s
Iteration  10: 3141795516.024 ops/s
Iteration  11: 3106155855.084 ops/s
Iteration  12: 3099071004.994 ops/s
Iteration  13: 3165802127.937 ops/s
Iteration  14: 3117700695.037 ops/s
Iteration  15: 3184560049.227 ops/s
Iteration  16: 3176507506.414 ops/s
Iteration  17: 3133869785.413 ops/s
Iteration  18: 3150455744.882 ops/s
Iteration  19: 3189445594.500 ops/s
Iteration  20: 3089706559.098 ops/s


Result "com.example.MyBenchmark.testMethod":
  3133357710.097 ±(99.9%) 12882845.404 ops/s [Average]
  (min, avg, max) = (2921380582.360, 3133357710.097, 3284444409.646), stdev = 54546774.121
  CI (99.9%): [3120474864.693, 3146240555.501] (assumes normal distribution)


# Run complete. Total time: 00:06:44

Benchmark                Mode  Cnt           Score          Error  Units
MyBenchmark.testMethod  thrpt  200  3133357710.097 ± 12882845.404  ops/s

Explanation

I could not find a clear reference document, so the notes below are partly collected from the web and partly my own guesses.

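Close unnecessary applications while the benchmark runs, so that nothing else interferes with the measurement.
The more iterations the better: a larger sample reduces the influence of max / min outliers, and the reported score is the mean.
At the top, Warmup: 20 iterations, 1 s each means 20 warmup iterations of 1 s each. My guess was that this reduces the noise caused by JVM startup, and that is roughly right: warmup results are not counted in the score, they give the JVM time to load classes and JIT-compile the hot code so that measurement starts from a steady state.
At the top, Measurement: 20 iterations, 1 s each means 20 measured iterations of 1 s each.
The unit is ops/s (operations per second): how many times per second the method annotated with @Benchmark was invoked.
Benchmark mode: Throughput, ops/time means throughput, the number of operations completed per unit of time, is the metric. The other modes are covered below.
In the Result at the bottom, 3133357710.097 ±(99.9%) 12882845.404 ops/s means the mean is 3133357710.097 ops/s, with an error margin of ±12882845.404 ops/s at a 99.9% confidence level.
(min, avg, max) = (2921380582.360, 3133357710.097, 3284444409.646), stdev = 54546774.121: stdev is the sample standard deviation, 54546774.121.
CI (99.9%): [3120474864.693, 3146240555.501] is the 99.9% confidence interval of the mean, that is, the mean minus and plus the error margin above. As the output itself notes, it assumes the samples follow a normal (Gaussian) distribution.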
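To check how the numbers in the Result block fit together, here is a small sketch in plain Java (no JMH needed). The class and method names are made up for illustration; the constants are copied from the output above. It assumes Cnt is simply forks times measurement iterations, which matches 10 forks of 20 iterations here.

```java
public class JmhResultMath {
    // Values copied from the Result block above.
    static final double SCORE = 3133357710.097; // mean throughput, ops/s
    static final double ERROR = 12882845.404;   // half-width of the 99.9% CI, ops/s
    static final double STDEV = 54546774.121;   // sample standard deviation, ops/s

    // Number of measured samples: forks x measurement iterations per fork.
    static int sampleCount(int forks, int iterationsPerFork) {
        return forks * iterationsPerFork;
    }

    // Lower and upper bounds of the confidence interval: mean -/+ error.
    static double ciLower() { return SCORE - ERROR; }
    static double ciUpper() { return SCORE + ERROR; }

    // How many standard errors (stdev / sqrt(n)) the reported error spans.
    static double errorInStandardErrors(int n) {
        return ERROR / (STDEV / Math.sqrt(n));
    }

    public static void main(String[] args) {
        int n = sampleCount(10, 20);
        System.out.printf("Cnt = %d%n", n);
        System.out.printf("CI  = [%.3f, %.3f]%n", ciLower(), ciUpper());
        System.out.printf("error / standard error = %.2f%n", errorInStandardErrors(n));
    }
}
```

The bounds come out the same as the CI line, and Cnt is the 200 shown in the summary table. The last ratio is about 3.34 standard errors, a little wider than the roughly 3.29 a pure normal approximation would give for 99.9%.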

Mode

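Mode.Throughput - measures throughput: the number of invocations per unit of time.
Mode.AverageTime - measures the average time per invocation, averaged over many invocations.
Mode.SampleTime - samples the time of individual invocations and reports the distribution, e.g. n% of the samples finished within a given time.
Mode.SingleShotTime - a cold measurement: by default there is no JVM warmup and the method annotated with @Benchmark is invoked only once, to evaluate how the code performs right after the JVM starts.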
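Throughput and AverageTime look at the same thing from opposite directions: operations per unit of time versus time per operation. The sketch below, in plain Java with made-up class and method names, converts the throughput score from the run above into nanoseconds per operation. It is only an approximation, since JMH measures each mode in its own run and the two scores will not be exact reciprocals.

```java
public class ThroughputVsAverageTime {
    static final double NANOS_PER_SECOND = 1_000_000_000.0;

    // Throughput (ops/s) -> approximate average time per operation (ns/op).
    static double nanosPerOp(double opsPerSecond) {
        return NANOS_PER_SECOND / opsPerSecond;
    }

    // Average time per operation (ns/op) -> approximate throughput (ops/s).
    static double opsPerSecond(double nanosPerOp) {
        return NANOS_PER_SECOND / nanosPerOp;
    }

    public static void main(String[] args) {
        double score = 3133357710.097; // ops/s, from the run above
        System.out.printf("%.3f ns/op%n", nanosPerOp(score));
    }
}
```

For the empty testMethod this works out to roughly a third of a nanosecond per call.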

Dead Code Elimination

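Dead code is code whose result is never used, such as int sum = a + b; below.
Because sum is not used for anything afterwards, the JVM's JIT compiler may decide the computation is unnecessary and remove it,
so the benchmark ends up not measuring a + b at all and the result is misleading.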


    @Benchmark @BenchmarkMode(Mode.Throughput)
    public void testMethod() {
        // This is a demo/sample template for building your JMH benchmarks. Edit as needed.
        // Put your benchmark code here.

        int a = 1;
        int b = 2;
        int sum = a + b;
    }

Avoiding Dead Code Elimination

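There are two ways to prevent dead code elimination:

Return sum, so that sum is really used.
Pass the value to a black hole: hand the variable to a Blackhole, which acts as if it used the value (loosely similar in spirit to Mockito's @Mock). It looks like this: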


import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.infra.Blackhole;

public class MyBenchmark {

    @Benchmark
    public void testMethod(Blackhole blackhole) {
        int a = 1;
        int b = 2;
        int sum = a + b;
        blackhole.consume(sum);
    }
}

Return the result of your code from the benchmark method.
Pass the calculated value into a "black hole" provided by JMH.

Notes from the experts

tutorials.jenkov.com, a fairly detailed one - http://tutorials.jenkov.com/java-performance/jmh.html#your-first-jmh-benchmark
java-performance.info - http://java-performance.info/jmh/
importnew - http://www.importnew.com/12548.html
blog.dyngr.com - http://blog.dyngr.com/blog/2016/10/29/introduction-of-jmh/
