
Thanks for the details on your benchmarks. I would like to extend BLP to a more generic setting sometime; as I said, I think any trick used with RH would also work with BLP. I just used an integer set because that's all I needed for my use case and it was easy to implement several different approaches for benchmarking. As you note, it favors use cases where the hash function is cheap (or invertible) and elements are cheap to move around.

About your question on load factors: no, the benchmarks are measuring exactly what they claim to be. The hash table constructor divides max data size by load factor to get the table size (https://github.com/senderista/hashtable-benchmarks/blob/mast...), and the benchmark code instantiates each hash table for exactly the measured data set size and load factor (https://github.com/senderista/hashtable-benchmarks/blob/mast...).
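A minimal sketch of the sizing arithmetic described above, assuming the constructor simply divides the maximum data size by the target load factor (the class and method names here are illustrative, not taken from the linked repo):

```java
public class TableSizing {
    // Size the backing array so that maxEntries entries occupy it at
    // exactly the requested load factor (rounding up so the factor is
    // never exceeded). E.g. 1,000,000 entries at 0.9 -> 1,111,112 slots.
    static int capacityFor(int maxEntries, double loadFactor) {
        return (int) Math.ceil(maxEntries / loadFactor);
    }

    public static void main(String[] args) {
        System.out.println(capacityFor(1_000_000, 0.9));
    }
}
```

The point is that each benchmark run constructs a fresh table sized for exactly its measured element count, so no resizes happen during measurement and the load factor is held constant across data sizes.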

I can't explain the peaks around 1M in many of the plots; I didn't investigate them at the time and I don't have time now. It could be a JVM artifact, but I did try to use JMH "best practices", and there's no dynamic memory allocation or GC happening during the benchmark at all. It would be interesting to port these tables to Rust and repeat the measurements with Criterion. For more informative graphs I might try a log-linear approach: divide the intervals between the logarithmically spaced data sizes into a fixed number of subintervals (say 4).
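The log-linear idea above could be sketched like this, assuming each interval between consecutive powers of the base is split into a fixed number of equal-width subintervals (a hypothetical helper, not code from the repo):

```java
import java.util.ArrayList;
import java.util.List;

public class LogLinearSizes {
    // Data sizes from base^minExp to base^maxExp, with each interval
    // [base^e, base^(e+1)] divided into `sub` equal subintervals.
    static List<Long> sizes(int base, int minExp, int maxExp, int sub) {
        List<Long> out = new ArrayList<>();
        for (int e = minExp; e < maxExp; e++) {
            long lo = (long) Math.pow(base, e);
            long hi = (long) Math.pow(base, e + 1);
            long step = (hi - lo) / sub;
            for (int i = 0; i < sub; i++) {
                out.add(lo + i * step);
            }
        }
        out.add((long) Math.pow(base, maxExp)); // closing endpoint
        return out;
    }

    public static void main(String[] args) {
        // Two octaves starting at 2^10, 4 subintervals each:
        // 1024, 1280, 1536, 1792, 2048, 2560, 3072, 3584, 4096
        System.out.println(sizes(2, 10, 12, 4));
    }
}
```

With 4 subintervals per power of two, any anomaly like the 1M peak would be bracketed by several nearby data points instead of a single isolated one, making it easier to tell a genuine trend from a one-off artifact.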



I'll try to download and play around with your benchmarks when I have a chance. After reading your explanation of how you create the tables at the desired load factor, some of those plots definitely look rather odd to me. What I'd expect to see across all your benchmarks is a bunch of upward curves tapering off at the top (or perhaps just upward linear lines, given that your horizontal scale is exponential). Basically, the performance of a given table at the same load factor should be fundamentally similar irrespective of how many elements are in the table, except that the higher the count gets, the less frequently the table will benefit from incidental cache hits when consecutive lookups coincidentally hit the same part of the bucket arrays. You can see this in my benchmarks (except for the cumulative "Total time to insert N nonexisting elements" benchmarks and the iteration benchmarks, which are a whole other can of worms). Notice how my peaks grow higher with the element count but taper off on the right-hand side? In contrast, your plots' data points (which I think should be analogous to the peaks in my graphs) seem to jump around, with the high-element-count points often appearing lower than the low-element-count ones. This seems very unexpected.

[struck through:] Here's one idea: It's been many years since I touched Java. How does Java's garbage collector work? Does it kick in intermittently? Could the garbage collector be muddling your measurements, and if so, can it be disabled? Edit: Sorry, I just reread your comment and saw that you already addressed garbage collection.


Despite my disclaimer about GC (and my effort to use JMH properly), I find it difficult to trust microbenchmarks on the JVM. I don't know when I'll have time for this, but "someday" I'd like to port this whole codebase to Rust/Criterion (which should be straightforward because the algorithms/data structures are "trivial"), and see if the more surprising artifacts persist. I do find the overall differentiation between RH and BLP surprising; I expected them to have pretty similar performance profiles.

In any case, I would definitely appreciate someone else rerunning the benchmarks on a different platform/JVM!



