JNI-performance - Is it really that fast?
linuxhippy at gmail.com
Tue Mar 25 16:13:27 PDT 2008
Thanks a lot for the detailed answer. Congratulations on the
BiasedLocking work, it's really great to see such innovative features in
the JVM :)
> good choice when "synchronized" doesn't fit the bill, such as when you
> might need timed waits, trylock, hand-over-hand "coupled" locking,
> etc. ReentrantLock also tends to be used in situations where the
> programmer is sure multiple threads are actively coordinating their
> operation, meaning that ReentrantLock would benefit little from biased
> locking. For most synchronization -- contended or uncontended --
> you're better off with synchronized as you get the benefits of biased
> locking, adaptive spinning, potential lock elision via escape
> analysis, and in the future, hardware transactional lock elision (http://blogs.sun.com/dave/entry/rock_style_transactional_memory_lock
I was asking because I did some benchmarking and (on my dual-core
machine, with an obscure microbenchmark) grabbing the AWTLock for
a 1x1 rectangle takes almost as much time as the whole Xlib-processing
The code looks like:
long xgc = validate(sg2d); //simple, pure Java method
XFillRect(sg2d.surfaceData.getNativeOps(), .....); // native method
10 million 1x1 rects:
 600 ms  native method commented out
 850 ms  locking commented out
1400 ms  locking + native method
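
Taking the three figures at face value, locking accounts for roughly
1400 - 850 = 550 ms over 10 million calls, i.e. about 55 ns per
lock/unlock pair, and the native call for roughly 1400 - 600 = 800 ms,
i.e. about 80 ns per call. To look at the lock cost in isolation, a minimal
sketch along the following lines could be used. It is not the benchmark
behind the numbers above; the class and method names are made up, and
single-threaded timing loops like this are easily distorted by JIT
warm-up and lock optimizations.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: uncontended lock/unlock cost of ReentrantLock
// versus a synchronized block, measured from a single thread.
public class LockCostSketch {
    private static final ReentrantLock LOCK = new ReentrantLock();
    private static final Object MONITOR = new Object();
    // Incremented inside the critical section so the loop body is not empty.
    private static long counter;

    // Returns the elapsed nanoseconds for `iterations` lock/unlock pairs.
    public static long timeReentrantLock(int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            LOCK.lock();
            try {
                counter++;
            } finally {
                LOCK.unlock();
            }
        }
        return System.nanoTime() - start;
    }

    // Same loop, but using the built-in monitor of a plain object.
    public static long timeSynchronized(int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            synchronized (MONITOR) {
                counter++;
            }
        }
        return System.nanoTime() - start;
    }

    public static long counter() {
        return counter;
    }

    public static void main(String[] args) {
        int n = 10_000_000;
        // Warm-up pass so both loops are compiled before the measured pass.
        timeReentrantLock(n);
        timeSynchronized(n);
        long viaLock = timeReentrantLock(n);
        long viaMonitor = timeSynchronized(n);
        System.out.printf("ReentrantLock: %.1f ns per lock/unlock%n", (double) viaLock / n);
        System.out.printf("synchronized:  %.1f ns per lock/unlock%n", (double) viaMonitor / n);
    }
}
```

The results depend heavily on the JVM version and flags, so they are only
a rough indication of how the two locking styles compare on one machine.
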
The numbers include all the code-path from Graphics.fillRect() up to
As you can see locking (at least on my machine) is almost as expensive
as the JNI-Downcall and the real work together. I used the
The AWTLock was a Java monitor until JDK 5 (not 100% sure), but it was a
victim of contention because it was also used from native code and
sometimes from multiple threads (although I guess it was not heavily
contended in most cases).
In JDK 6 it was replaced with a ReentrantLock; some features like
tryLock() were used to implement the new OpenGL pipeline ...
performance also improved.
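
To illustrate what tryLock() offers over a plain monitor, here is a
minimal sketch. It is not the actual AWT or OpenGL pipeline code; the
class, the lock field and the method names are all hypothetical.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch of non-blocking and timed lock acquisition,
// neither of which can be expressed with a synchronized block.
public class TryLockSketch {
    // Stand-in for a toolkit-wide lock; not the real AWT lock.
    private static final ReentrantLock LOCK = new ReentrantLock();

    // Runs the work only if the lock is free right now; never blocks.
    public static boolean runIfFree(Runnable work) {
        if (!LOCK.tryLock()) {
            return false; // another thread holds the lock, caller can retry later
        }
        try {
            work.run();
            return true;
        } finally {
            LOCK.unlock();
        }
    }

    // Waits at most timeoutMillis for the lock before giving up.
    public static boolean runWithin(long timeoutMillis, Runnable work)
            throws InterruptedException {
        if (!LOCK.tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
            return false;
        }
        try {
            work.run();
            return true;
        } finally {
            LOCK.unlock();
        }
    }

    public static boolean isLocked() {
        return LOCK.isLocked();
    }
}
```
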
> In your case if the lock is ever shared -- that is, locked by multiple
> threads during its lifetime -- then biased locking probably won't
> provide the latency reduction benefit you're after. The object will
> likely become unbiased at some point. I suspect that sharing will
> ultimately occur in your case, but be infrequent, correct?
Exactly, the most likely scenario is that there is one rendering thread
which does millions of lock operations, and a few other calls from native
code (currently they upcall from C to lock the ReentrantLock).
It could also happen that there are two or more active rendering
threads at the same time, but this is not really common and a fallback
to unbiased would be totally ok.
Wouldn't a BiasedLock be something worth implementing, maybe with a way
to control how quickly/likely the lock can become unbiased?
However, this really does not have a high priority for me ... I really
should care about other things ... somehow I got trapped in this when
deciding on the design of my XRender-Java2d pipeline. Sorry for all the traffic...