Extend NMT to JDK native libraries?
thomas.stuefe at gmail.com
Wed Nov 21 14:28:38 UTC 2018
(Yet again I am not sure whether this belongs on serviceability-dev -
I start at hs-dev, feel free to move this mail around.)
Do we have any plans to extend NMT to cover native JDK libraries too?
That would be a really cool feature.
We at SAP have done a similar thing in the past:
We have a monitoring facility in our port which tracks C-heap
allocations, non-imaginatively called "malloc statistic". This feature
predates NMT somewhat - had we had NMT at that time, we would not have
bothered. Our malloc statistic is less powerful than NMT and
implementation-wise completely at odds with it, so I never felt the
urge to bring it upstream. However, one thing we did do is we extended
its coverage to the JDK native code.
This has been quite helpful in the past to find leaks in JDK, see
We did this by exposing os::malloc, os::free etc from libjvm.so
("JVM_malloc", "JVM_realloc", "JVM_free"). In the JDK native code, we
then either manually replaced calls to raw ::malloc(), ::free() etc.
with JVM_malloc(), JVM_free(), or, in places where this was possible,
did the replacement wholesale by employing a header which redefined
malloc(), free() etc. to JVM_malloc(), JVM_free() etc. Of course,
we also had to add a number of linkage dependencies on libjvm.so.
All this is pretty standard stuff.
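To make the wholesale-redefinition technique concrete, here is a minimal
sketch. It is not our actual code: the counters, the redirect macros and
the make_counter()/drop_counter() helpers are made up for illustration,
and std::malloc/std::free stand in for os::malloc/os::free.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical statistics; a real libjvm.so would forward to os::malloc/os::free.
static std::atomic<std::size_t> g_malloc_calls{0};
static std::atomic<std::size_t> g_free_calls{0};

// Entry points exported with C linkage so JDK native libraries can link to them.
extern "C" void* JVM_malloc(std::size_t size) {
    g_malloc_calls.fetch_add(1, std::memory_order_relaxed);
    return std::malloc(size);            // stand-in for os::malloc
}

extern "C" void JVM_free(void* p) {
    if (p != nullptr) {
        g_free_calls.fetch_add(1, std::memory_order_relaxed);
    }
    std::free(p);                        // stand-in for os::free
}

// --- what a hypothetical redirect header would contain -------------------
// It must come after the system headers, otherwise the function-like macros
// would rewrite the declarations inside <cstdlib>.
#define malloc(size) JVM_malloc(size)
#define free(ptr)    JVM_free(ptr)

// --- library code, unchanged, compiled with the redirect in effect -------
static int* make_counter() {
    int* p = static_cast<int*>(malloc(sizeof(int)));   // expands to JVM_malloc
    if (p != nullptr) {
        *p = 0;
    }
    return p;
}

static void drop_counter(int* p) {
    assert(p != nullptr);
    free(p);                                           // expands to JVM_free
}

// End of the redirected region: later code sees the raw functions again.
#undef malloc
#undef free
```

Note that the redirect only catches calls that are textually visible to
the preprocessor. Anything allocated or freed inside the C library or
another uninstrumented library stays raw.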
One detail stood out: malloc headers are evil. In our experience, JDK
native code was more difficult to control, and "unbalanced
malloc/frees" kept creeping in - especially with the
wholesale-redefinition technique. Unbalanced malloc/frees are cases
where malloc() is instrumented but free() stays raw, or the other
way around. Both combinations are catastrophic since os::malloc uses
malloc headers: the pointer handed to the caller is not the pointer
the C library allocated. We typically corrupted the C-heap and
crashed, often much later in completely unrelated places.
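A small sketch of why the mix is fatal. The header layout below is
invented for illustration (HotSpot's real malloc header looks different),
and tracked_malloc()/tracked_free() are hypothetical stand-ins for
os::malloc/os::free.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical header placed in front of every tracked block. Aligning it
// to max_align_t keeps the payload behind it suitably aligned.
struct alignas(std::max_align_t) MallocHeader {
    std::size_t size;
};

// Allocate header + payload, record the size, hand out the payload.
void* tracked_malloc(std::size_t size) {
    void* raw = std::malloc(sizeof(MallocHeader) + size);
    if (raw == nullptr) {
        return nullptr;
    }
    MallocHeader* h = static_cast<MallocHeader*>(raw);
    h->size = size;
    return h + 1;                        // caller sees the address past the header
}

// The address the C library actually knows about.
void* raw_block_of(void* payload) {
    return static_cast<MallocHeader*>(payload) - 1;
}

std::size_t tracked_size(void* payload) {
    return (static_cast<MallocHeader*>(payload) - 1)->size;
}

// Step back to the header before freeing. This is only correct for
// pointers that came from tracked_malloc().
void tracked_free(void* payload) {
    if (payload == nullptr) {
        return;
    }
    std::free(raw_block_of(payload));
}

// Unbalanced case 1: std::free(tracked_malloc(n)) passes the C library an
// address it never returned.
// Unbalanced case 2: tracked_free(std::malloc(n)) steps back in front of
// the real block and frees that.
// Both are undefined behavior, so neither is executed here.
```

The caller-visible pointer and the allocator-visible pointer differ by
sizeof(MallocHeader), which is exactly what makes the two families
incompatible.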
These types of bugs were very hard to spot and hence very expensive.
And they can creep in in many ways. One example: a surprising number
of system APIs return results in C-heap memory and require the caller
to free them, which of course must happen with raw ::free(), not
os::free().
We fixed this by not using malloc headers. That means a pointer
returned by os::malloc() is compatible with raw ::free() and vice
versa. The worst that can happen is that our statistic numbers are
slightly off.
Instead of malloc headers we use a hand-groomed hash table to track
the malloced memory. It is actually quite fast, fast enough that this
malloc statistic feature is on-by-default in our port.
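A rough sketch of the header-less idea. Our real implementation uses a
hand-written hash table; the std::unordered_map, the global mutex and the
stat_* names below are simplifications chosen for illustration only, and
would not be fast enough for on-by-default use.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <mutex>
#include <unordered_map>
#include <utility>

namespace {
std::mutex g_lock;
std::unordered_map<void*, std::size_t> g_live;   // live pointer -> size
std::size_t g_live_bytes = 0;
}

// Returns exactly what the C library returned, so raw free() stays legal.
void* stat_malloc(std::size_t size) {
    void* p = std::malloc(size);
    if (p != nullptr) {
        std::lock_guard<std::mutex> guard(g_lock);
        std::pair<std::unordered_map<void*, std::size_t>::iterator, bool> res =
            g_live.emplace(p, size);
        if (!res.second) {
            // Stale entry: this address was freed behind our back with raw
            // free() and has now been reused. Replace the old record.
            assert(g_live_bytes >= res.first->second);
            g_live_bytes -= res.first->second;
            res.first->second = size;
        }
        g_live_bytes += size;
    }
    return p;
}

// Accepts tracked and raw pointers alike. An unknown pointer (for example
// one returned by a system API) is simply freed without touching the numbers.
void stat_free(void* p) {
    if (p == nullptr) {
        return;
    }
    {
        std::lock_guard<std::mutex> guard(g_lock);
        std::unordered_map<void*, std::size_t>::iterator it = g_live.find(p);
        if (it != g_live.end()) {
            g_live_bytes -= it->second;
            g_live.erase(it);
        }
    }
    // Drop the record before freeing, so another thread that gets the same
    // address from malloc cannot collide with a record still in the table.
    std::free(p);
}

std::size_t stat_live_bytes() {
    std::lock_guard<std::mutex> guard(g_lock);
    return g_live_bytes;
}
```

With this layout an unbalanced pair no longer corrupts the heap. A raw
free() of a tracked pointer just leaves a stale record behind, and the
reported number stays too high until that address is reused.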
Of course, if we extend NMT to JDK native code we also would want to
extend it to mmap() etc - we never did this with our statistic, since
it only tracked malloc.
What do you think? Did anyone else play with similar ideas? Would it
be worth the effort?