<AWT Dev>  Review request for 8015730: PIT: On Linux, OGL=true and fbobject=false leads to deadlock or infinite loop
anthony.petrov at oracle.com
Wed Jul 3 08:46:29 PDT 2013
On 07/03/2013 06:47 PM, Artem Ananiev wrote:
> On 7/1/2013 8:50 PM, Anthony Petrov wrote:
>> Thanks for the additional information, Anton. Since this fix simply
>> reverts the behavior in GLXSurfaceData.c back to the pre-8005607 era, it
>> could probably be considered a good interim solution for the problem.
>> I'd like to hear Artem's opinion on this, though. Should we file a P4
>> bug to investigate the issue further so that in the future we could
>> avoid calling XSync() w/o the AWTLock?
> I raised exactly the same concerns about calling XSync and waiting on
> the AWT lock, when we discussed this issue offline with Anton. Before
> the fix for 8005607, XSync() was called without AWT lock, which is not
> what the current fix is: now XSync() is not called at all. So behavior
> will be different than it was before 8005607.
I believe there's a misunderstanding. With the current fix (8015730)
XSync is being called:
> 398 XSync(awt_display, False);
All other usages of the EXEC_WITH_XERROR_HANDLER() pass TRUE to the
RESTORE_XERROR_HANDLER(), and hence call XSync(), too (in
XErrorHandlerUtil.java) with or without the current fix...
> Anton, did you investigate, why XSync() was in the native macros before
> 8005607? Note that all the XErrorHandler code in XAWT worked fine
> without XSync() before that fix, so why does the native Java2D require
This isn't correct. XSync() is really required when using an
XErrorHandler to ensure that the request that may generate an error is
sent to the X server. While this doesn't guarantee that the request gets
processed, it's nevertheless better (and works in most practical cases)
than going w/o an XSync() call.
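To illustrate the point about flushing before the handler is restored, here is a toy model in Java. It is only a sketch: FakeDisplay, its methods and the request name are hypothetical stand-ins and not Xlib or XErrorHandlerUtil code. It assumes nothing more than that requests are buffered and that an error is delivered to whichever handler is installed at the moment the buffer is flushed.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.Consumer;

public class XSyncSketch {
    /** Hypothetical stand-in for an X connection: requests are buffered,
     *  and errors surface only when the buffer is flushed by sync(). */
    static class FakeDisplay {
        private final Queue<String> pending = new ArrayDeque<>();
        private Consumer<String> handler;

        FakeDisplay(Consumer<String> defaultHandler) {
            handler = defaultHandler;
        }

        /** Installs a new error handler and returns the previous one. */
        Consumer<String> setErrorHandler(Consumer<String> h) {
            Consumer<String> old = handler;
            handler = h;
            return old;
        }

        /** Queues a request; nothing is sent yet. */
        void request(String name) {
            pending.add(name);
        }

        /** Flushes the buffer; requests named "bad..." report an error
         *  to the handler that is installed right now. */
        void sync() {
            String r;
            while ((r = pending.poll()) != null) {
                if (r.startsWith("bad")) {
                    handler.accept(r);
                }
            }
        }
    }

    /** Runs one guarded request, with or without a sync before restoring. */
    static List<String> run(boolean syncBeforeRestore) {
        List<String> log = new ArrayList<>();
        FakeDisplay d = new FakeDisplay(r -> log.add("default:" + r));
        Consumer<String> saved = d.setErrorHandler(r -> log.add("temporary:" + r));
        d.request("badCreatePbuffer");
        if (syncBeforeRestore) {
            d.sync();              // error reaches the temporary handler
        }
        d.setErrorHandler(saved);  // restore the previous handler
        d.sync();                  // some later, unrelated flush
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run(true));   // [temporary:badCreatePbuffer]
        System.out.println(run(false));  // [default:badCreatePbuffer]
    }
}
```

In the second run the error is reported only after the temporary handler is gone, which is the situation the sync before restoring is meant to avoid.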
>> best regards,
>> On 07/01/2013 07:24 PM, Anton Litvinov wrote:
>>> Hello Anthony,
>>> Thank you for the review of this fix. I would like to remark that this
>>> deadlock is a regression of the fix for the bug 8005607. In the file
>>> "jdk/src/solaris/native/sun/java2d/opengl/GLXSurfaceData.c" before the
>>> 8005607 fix, the previous XError handling mechanism, which did not
>>> involve the "sun.awt.X11.XErrorHandlerUtil" class, was used, and the
>>> native "XSync()" function was called without acquiring the AWT lock. So
>>> the fix for the current deadlock just reverts the part of the 8005607 fix
>>> that enforced taking the AWT lock in the function
>>> "Java_sun_java2d_opengl_GLXSurfaceData_initPbuffer" in the file
>>> "jdk/src/solaris/native/sun/java2d/opengl/GLXSurfaceData.c". Answers to
>>> your questions are provided below.
>>> 1. "AWT EventQueue" holds AWT lock and waits till "Java2D Queue Flusher"
>>> thread finishes its job, because in the method
>>> "sun.java2d.opengl.OGLSurfaceData.initSurface(final int width, final int
>>> height)" execution of "initSurfaceNow(int width, int height)" is
>>> dispatched to "Java2D Queue Flusher". Before this dispatching in the
>>> method "initSurface" AWT lock is taken by the lines
>>> 308 OGLRenderQueue rq = OGLRenderQueue.getInstance();
>>> 309 rq.lock();
>>> and then, with the AWT lock held, the "AWT EventQueue" thread starts waiting on
>>> the second lock "sun.java2d.opengl.OGLRenderQueue.flusher" in the method
>>> 181 wait();
>>> 2. Yes, I investigated the option of waiting on AWT lock instead of
>>> "sun.java2d.opengl.OGLRenderQueue.flusher" lock in the class
>>> "sun.java2d.opengl.OGLRenderQueue", but this is impossible, because
>>> access to the always running thread "Java2D Queue Flusher" should be
>>> synchronized on some lock other than AWT lock, otherwise there will be a
>>> performance degradation, because it will be trying to get AWT lock each
>>> 100 milliseconds. As I understand it, a possible solution for this problem
>>> is either not taking the AWT lock before dispatching execution of any code
>>> to "Java2D Queue Flusher", or a complete refactoring of the locking
>>> mechanism in the class "sun.java2d.opengl.OGLRenderQueue". Since the
>>> current bug blocks SQE from running any tests involving OpenGL and makes it
>>> impossible to run any Java GUI application with OpenGL rendering enabled on
>>> Linux, I suppose the refactoring option is not acceptable. That is why, as
>>> the safest solution, I decided just to call XSync() from
>>> "jdk/src/solaris/native/sun/java2d/opengl/GLXSurfaceData.c" as it was
>>> before the fix for 8005607.
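The locking pattern described in points 1 and 2 can be reproduced in a few lines of Java. This is a minimal sketch under stated assumptions, not the OGLRenderQueue code: awtLock, initSurface and the single-thread executor are hypothetical stand-ins for the AWT lock, the surface initialization path and the "Java2D Queue Flusher" thread. A timed tryLock() is used in place of a plain lock() so that the example reports the problem instead of hanging.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class FlusherDeadlockSketch {
    // Hypothetical stand-in for the AWT lock.
    static final ReentrantLock awtLock = new ReentrantLock();

    /**
     * Models the calling thread taking the lock, dispatching a job to the
     * flusher thread and waiting for it while still holding the lock.
     * Returns true if the flusher job completed, false if it could not
     * obtain the lock (a plain lock() would block forever here).
     */
    static boolean initSurface(boolean flusherTakesAwtLock) throws Exception {
        ExecutorService flusher = Executors.newSingleThreadExecutor();
        awtLock.lock();                       // caller holds the lock
        try {
            Future<Boolean> job = flusher.submit(() -> {
                if (!flusherTakesAwtLock) {
                    return true;              // no lock needed, job finishes
                }
                // The lock is held by the thread that waits for this job.
                if (awtLock.tryLock(200, TimeUnit.MILLISECONDS)) {
                    awtLock.unlock();
                    return true;
                }
                return false;                 // would be a deadlock with lock()
            });
            return job.get();                 // wait with the lock still held
        } finally {
            awtLock.unlock();
            flusher.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(initSurface(true));   // false
        System.out.println(initSurface(false));  // true
    }
}
```

The first call corresponds to the flusher trying to take the lock its caller already holds; the second corresponds to the flusher doing its work without that lock.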
>>> Thank you,
>>> On 7/1/2013 3:11 PM, Anthony Petrov wrote:
>>>> Hi Anton,
>>>> I'm not sure if this is a good fix since it enables the GL thread to call
>>>> Xlib APIs w/o acquiring the AWTLock. This may not present a problem
>>>> currently since we know exactly when this method is called and that
>>>> another thread is holding the lock and isn't calling other X11
>>>> functions at the moment. But I doubt this knowledge will be widely
>>>> known and remembered in the future, and if another thread starts
>>>> calling X11 routines, we'll get into trouble...
>>>> Why would another thread (the AWT EventQueue if I got the problem
>>>> right) hold the AWTLock and wait till the GL thread finishes its job?
>>>> I'd assume it should release the lock for the period of waiting. This
>>>> would allow the GL thread to acquire the lock and perform the XSync()
>>>> call w/o any potential issues. Have you investigated this option?
>>>> best regards,
>>>> On 06/28/2013 09:16 PM, Anton Litvinov wrote:
>>>>> Could you please review the following fix for a bug, which consists in a
>>>>> deadlock provoked by contention between AWT-EventQueue and Java2D
>>>>> Flusher for the AWT lock, when OpenGL rendering is enabled.
>>>>> Bug: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=8015730
>>>>> Webrev: http://cr.openjdk.java.net/~alitvinov/8015730/webrev.00
>>>>> The fix allows the code from the native function
>>>>> "Java_sun_java2d_opengl_GLXSurfaceData_initPbuffer" of the file
>>>>> "jdk/src/solaris/native/sun/java2d/opengl/GLXSurfaceData.c" to execute
>>>>> XError handling procedures using the "sun.awt.X11.XErrorHandlerUtil"
>>>>> class without acquiring the AWT lock. It is the only available solution
>>>>> for this problem, because the current design of the
>>>>> "sun.java2d.opengl.OGLRenderQueue" class does not allow taking the AWT
>>>>> lock in the Java2D Queue Flusher thread without reaching a deadlock,
>>>>> since all calls to the method
>>>>> "sun.java2d.opengl.OGLRenderQueue.flushAndInvokeNow(Runnable r)" are
>>>>> guarded by the AWT lock.
>>>>> Thank you,