<AWT Dev>  Review request for 8015730: PIT: On Linux, OGL=true and fbobject=false leads to deadlock or infinite loop
artem.ananiev at oracle.com
Wed Jul 3 07:47:58 PDT 2013
On 7/1/2013 8:50 PM, Anthony Petrov wrote:
> Thanks for the additional information, Anton. Since this fix simply
> reverts the behavior in GLXSurfaceData.c back to the pre-8005607 era, it
> could probably be considered a good interim solution for the problem.
> I'd like to hear Artem's opinion on this, though. Should we file a P4
> bug to investigate the issue further so that in the future we could
> avoid calling XSync() w/o the AWTLock?
I raised exactly the same concerns about calling XSync() and waiting on
the AWT lock when we discussed this issue offline with Anton. Before
the fix for 8005607, XSync() was called without the AWT lock. That is not
what the current fix does: now XSync() is not called at all, so the
behavior will differ from what it was before 8005607.
Anton, did you investigate why XSync() was in the native macros before
8005607? Note that all the XErrorHandler code in XAWT worked fine
without XSync() before that fix, so why does the native Java2D code
require it?
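The lock pattern under discussion can be sketched roughly as follows. This is
only an illustration, not the JDK code: AwtLockSketch, AWT_LOCK and
dispatchAndWait are hypothetical names, a plain ReentrantLock stands in for
the AWT lock, and a short tryLock timeout replaces the real blocking
acquisition so that the sketch terminates instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-ins: AWT_LOCK models the AWT lock and the worker thread
// models "Java2D Queue Flusher". None of these names come from the JDK sources.
final class AwtLockSketch {
    static final ReentrantLock AWT_LOCK = new ReentrantLock();

    /**
     * Dispatches a task to a worker thread and waits for it to finish.
     * The task tries to take AWT_LOCK, as a lock-guarded XSync() call would,
     * but gives up after 200 ms so that the sketch cannot hang.
     * Returns true if the worker obtained the lock.
     */
    static boolean dispatchAndWait(boolean callerHoldsLock) {
        final boolean[] acquired = {false};
        final CountDownLatch done = new CountDownLatch(1);
        Thread flusher = new Thread(() -> {
            try {
                if (AWT_LOCK.tryLock(200, TimeUnit.MILLISECONDS)) {
                    try {
                        acquired[0] = true; // XSync() would run here
                    } finally {
                        AWT_LOCK.unlock();
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                done.countDown();
            }
        }, "flusher-sketch");

        if (callerHoldsLock) {
            AWT_LOCK.lock();
        }
        try {
            flusher.start();
            done.await(); // the caller waits for the worker, possibly holding the lock
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            if (callerHoldsLock) {
                AWT_LOCK.unlock();
            }
        }
        return acquired[0];
    }

    public static void main(String[] args) {
        System.out.println("caller holds lock: worker got lock = " + dispatchAndWait(true));
        System.out.println("caller lock-free:  worker got lock = " + dispatchAndWait(false));
    }
}
```

When the caller keeps the lock while waiting, the worker can never acquire it.
With a blocking lock() in place of the timed tryLock(), that case is the
deadlock reported in 8015730.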
> best regards,
> On 07/01/2013 07:24 PM, Anton Litvinov wrote:
>> Hello Anthony,
>> Thank you for the review of this fix. I would like to point out that this
>> deadlock is a regression caused by the fix for bug 8005607. In
>> "jdk/src/solaris/native/sun/java2d/opengl/GLXSurfaceData.c" before the
>> 8005607 fix, which used the previous XError handling mechanism that did
>> not involve the "sun.awt.X11.XErrorHandlerUtil" class, the native
>> "XSync()" function was called without acquiring the AWT lock. So the fix
>> for the current deadlock simply reverts the part of the 8005607 fix
>> that enforced taking the AWT lock in the function
>> "Java_sun_java2d_opengl_GLXSurfaceData_initPbuffer" in the file
>> "jdk/src/solaris/native/sun/java2d/opengl/GLXSurfaceData.c". Answers to
>> your questions are provided below.
>> 1. "AWT EventQueue" holds the AWT lock and waits until the "Java2D Queue
>> Flusher" thread finishes its job, because in the method
>> "sun.java2d.opengl.OGLSurfaceData.initSurface(final int width, final int
>> height)" the execution of "initSurfaceNow(int width, int height)" is
>> dispatched to "Java2D Queue Flusher". Before this dispatch, the
>> method "initSurface" takes the AWT lock in the lines
>> 308 OGLRenderQueue rq = OGLRenderQueue.getInstance();
>> 309 rq.lock();
>> and then, still holding the AWT lock, the "AWT EventQueue" thread starts
>> waiting on the second lock,
>> "sun.java2d.opengl.OGLRenderQueue.flusher", in the method
>> 181 wait();
>> 2. Yes, I investigated the option of waiting on the AWT lock instead of
>> the "sun.java2d.opengl.OGLRenderQueue.flusher" lock in the class
>> "sun.java2d.opengl.OGLRenderQueue", but this is not feasible, because
>> access to the always-running "Java2D Queue Flusher" thread should be
>> synchronized on some lock other than the AWT lock. Otherwise there will
>> be a performance degradation, because the thread will be trying to get
>> the AWT lock every 100 milliseconds. As I understand it, a possible
>> solution for this problem would be either not taking the AWT lock before
>> dispatching execution of any code to "Java2D Queue Flusher", or a
>> complete refactoring of the locking mechanism in the class
>> "sun.java2d.opengl.OGLRenderQueue". Since the current bug blocks SQE
>> from running any tests involving OpenGL and makes it impossible to run
>> any Java GUI application with OpenGL rendering enabled on Linux,
>> I suppose the refactoring variant is not acceptable. That is why, as
>> the safest solution, I decided just to call XSync() from
>> "jdk/src/solaris/native/sun/java2d/opengl/GLXSurfaceData.c" as it was
>> before the fix for 8005607.
>> Thank you,
>> On 7/1/2013 3:11 PM, Anthony Petrov wrote:
>>> Hi Anton,
>>> I'm not sure this is a good fix, since it enables the GL thread to call
>>> Xlib APIs w/o acquiring the AWTLock. This may not present a problem
>>> currently, since we know exactly when this method is called and that
>>> another thread is holding the lock and isn't calling other X11
>>> functions at the moment. But I doubt this will be widely known and
>>> remembered in the future, and if another thread starts
>>> calling X11 routines, we'll get into trouble...
>>> Why would another thread (the AWT EventQueue if I got the problem
>>> right) hold the AWTLock and wait till the GL thread finishes its job?
>>> I'd assume it should release the lock for the period of waiting. This
>>> would allow the GL thread to acquire the lock and perform the XSync()
>>> call w/o any potential issues. Have you investigated this option?
>>> best regards,
>>> On 06/28/2013 09:16 PM, Anton Litvinov wrote:
>>>> Could you please review the following fix for a bug that consists of a
>>>> deadlock caused by contention between AWT-EventQueue and Java2D
>>>> Flusher for the AWT lock when OpenGL rendering is enabled.
>>>> Bug: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=8015730
>>>> Webrev: http://cr.openjdk.java.net/~alitvinov/8015730/webrev.00
>>>> The fix allows the code in the native function
>>>> "Java_sun_java2d_opengl_GLXSurfaceData_initPbuffer" of the file
>>>> "jdk/src/solaris/native/sun/java2d/opengl/GLXSurfaceData.c" to execute
>>>> all XError handling procedures using the "sun.awt.X11.XErrorHandlerUtil"
>>>> class without acquiring the AWT lock. It is the only available solution
>>>> for this problem, because the current design of the
>>>> "sun.java2d.opengl.OGLRenderQueue" class does not allow taking the AWT
>>>> lock in the Java2D Queue Flusher thread without reaching a deadlock,
>>>> since all calls to the method
>>>> "sun.java2d.opengl.OGLRenderQueue.flushAndInvokeNow(Runnable r)" are
>>>> guarded by the AWT lock.
>>>> Thank you,
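
The alternative raised in the thread, having the waiting thread release the
lock for the duration of the wait, corresponds to the generic pattern sketched
below. It is an assumption-laden illustration, not a proposal for
OGLRenderQueue, and Anton's reply explains why waiting on the AWT lock was not
considered practical there. ReleaseWhileWaitingSketch, LOCK, DONE and
flushAndWait are hypothetical names, and a ReentrantLock with a Condition
stands in for the AWT lock.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: LOCK stands in for the AWT lock, the worker thread for
// the flusher. Condition.awaitNanos() releases LOCK while the caller waits.
final class ReleaseWhileWaitingSketch {
    static final ReentrantLock LOCK = new ReentrantLock();
    static final Condition DONE = LOCK.newCondition();
    static boolean finished; // guarded by LOCK

    /** Returns true if the worker ran its lock-guarded section within 2 seconds. */
    static boolean flushAndWait() {
        LOCK.lock();
        try {
            finished = false;
            Thread worker = new Thread(() -> {
                LOCK.lock(); // succeeds once the caller is parked in awaitNanos()
                try {
                    finished = true; // XSync() would run here, under the lock
                    DONE.signalAll();
                } finally {
                    LOCK.unlock();
                }
            }, "flusher-sketch");
            worker.start();

            long nanos = TimeUnit.SECONDS.toNanos(2);
            while (!finished && nanos > 0) {
                nanos = DONE.awaitNanos(nanos); // releases LOCK while waiting
            }
            return finished;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            LOCK.unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println("worker completed under the lock: " + flushAndWait());
    }
}
```

Because the caller gives up the lock while it waits, the worker can acquire it
and perform the guarded call, and the caller reacquires the lock before it
continues.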