> There shouldn't be any swapping during the tests - I've got RAM fairly
> carefully allocated and I believe swappiness was tuned down on those
> machines, though I will double check to be certain.

Does HBase mmap() significant amounts of memory for I/O purposes? I'm
not very familiar with HBase and a quick Googling didn't yield an
answer.

With extensive mmap()ed I/O, excessive swapping of the application
seems to be a common problem even with significant memory margins,
sometimes even with swappiness turned down to 0. I've seen it happen
under several circumstances, and based on reports on the
cassandra-user mailing list during the past couple of months it seems
I'm not alone.

To be sure, I recommend checking the actual swapping history (or at
least checking that the absolute amount of memory swapped out stays
reasonable over time).
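
One way to do that check on Linux is to sample the cumulative pswpin
and pswpout counters in /proc/vmstat before and after a test run and
compare them. The following is only a minimal sketch: the function
names are made up for illustration, and it assumes a Linux kernel that
exposes those two fields (pages swapped in and out since boot).

```python
def parse_swap_counters(vmstat_text):
    """Return (pswpin, pswpout) page counts parsed from /proc/vmstat text."""
    counters = {}
    for line in vmstat_text.splitlines():
        parts = line.split()
        # Each line of /proc/vmstat is "<name> <value>".
        if len(parts) == 2 and parts[0] in ("pswpin", "pswpout"):
            counters[parts[0]] = int(parts[1])
    return counters.get("pswpin", 0), counters.get("pswpout", 0)


def read_swap_counters(path="/proc/vmstat"):
    """Read the current cumulative swap counters (Linux only)."""
    with open(path) as f:
        return parse_swap_counters(f.read())


def swap_delta(before, after):
    """Pages swapped (in, out) between two samples."""
    return after[0] - before[0], after[1] - before[1]
```

Calling read_swap_counters() once before the test and once after, then
passing both samples to swap_delta(), gives the number of pages swapped
during the run. A result of (0, 0) would rule swapping out as a factor.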

> I'll try to read through your full email in detail while looking at the
> source and the G1 paper -- right now it's a bit above my head :)

Well, just to reiterate: I have really only begun looking at it
myself, and my ramblings may be completely off the mark.

> FWIW, my tests on JRockit JRRT's gcprio:deterministic collector didn't go
> much better - eventually it fell back to a full compaction which lasted 45
> seconds or so. HBase must be doing something that's really hard for GCs to
> deal with - either on the heuristics front or on the allocation pattern
> front.

Interesting. I don't know a lot about JRockit's implementation since
not a lot of information seems to be available. I did my LRU
micro-benchmark with a ~20-30 GB heap and JRockit. I could definitely
press it hard enough to cause a fallback, but that seemed to be
directly as a result of high allocation rates simply exceeding the
forward progress made by the GC (based on blackbox observation

(The other problem was that the compaction pauses were never able to
complete; it seems compaction is O(n) with respect to the number of
objects being compacted, and I was unable to make it compact less than
1% per GC (because the command line option only accepted integral
percents), and with my object count the 1% was enough to hit the pause
time requirement so compaction was aborted every time. Likely this
would have poor results over time as fragmentation becomes

Does HBase go into periodic modes of very high allocation rate, or is
it fairly constant over time? I'm thinking that perhaps the concurrent
marking is just not triggered early enough and if large bursts of
allocations happen when the heap is relatively full, that might be the
triggering factor?
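
To make that hypothesis concrete, here is a back-of-the-envelope
sketch. It is a deliberately crude model and not how G1 or any real
collector decides when to start marking: it assumes a constant
allocation rate, a fixed marking duration, and no memory reclaimed
while marking runs. All names and numbers are hypothetical.

```python
def marking_finishes_in_time(heap_bytes, used_bytes,
                             alloc_bytes_per_sec, marking_secs):
    """True if marking would complete before the free space is used up.

    Crude model: nothing is reclaimed while marking runs, and the
    allocation rate stays constant for the whole period.
    """
    free_bytes = heap_bytes - used_bytes
    if alloc_bytes_per_sec <= 0:
        return True  # no allocation, so the heap never fills
    secs_until_full = free_bytes / alloc_bytes_per_sec
    return marking_secs < secs_until_full


GB = 1024 ** 3
# Steady state: 4 GB free at 0.1 GB/s leaves 40 s, enough for 10 s of marking.
steady = marking_finishes_in_time(20 * GB, 16 * GB, 0.1 * GB, 10)
# Burst: the same 4 GB free at 1 GB/s leaves only 4 s, so marking loses.
burst = marking_finishes_in_time(20 * GB, 16 * GB, 1 * GB, 10)
```

Under these assumptions the same heap occupancy that is safe at the
steady rate becomes a fallback to a full collection during a burst,
which is why the shape of the allocation rate over time matters.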

