<AWT Dev> [8] Review request for 8015730: PIT: On Linux, OGL=true and fbobject=false leads to deadlock or infinite loop

Anton Litvinov anton.litvinov at oracle.com
Wed Jul 3 08:45:16 PDT 2013

Hello Artem,

Thank you for the review of this fix. No, I am still convinced that the 
fix reverts XError handling part of the function 
"Java_sun_java2d_opengl_GLXSurfaceData_initPbuffer" in the file 
"jdk/src/solaris/native/sun/java2d/opengl/GLXSurfaceData.c" to not 
taking AWT lock for a call to XSync() function as it was before 8005607. 
It is so, because the fix adds an explicit call to XSync() function, 
before a call of the macro RESTORE_XERROR_HANDLER. Please look at the 
next line in the file 
"src/solaris/native/sun/java2d/opengl/GLXSurfaceData.c" from the webrev.

398     XSync(awt_display, False);

I suppose that XSync() function calls were in the native macros 
dispatching of the XErrors, which were collected during the previous 
calls to Xlib, to the XError handler registered by the application.

Unfortunately, I do not see how XSync() calls were introduced by the fix 
8005607 to XAWT and Java2D code, since that fix just moved XError 
handling code from "sun.awt.X11.XToolkit" class to a new separate class 
"sun.awt.X11.XErrorHandlerUtil". So I am sure that XSync() function was 
called for all calls of both the macro RESTORE_XERROR_HANDLER and the 
method "sun.awt.X11.XToolkit.RESTORE_XERROR_HANDLER()" before the fix 

Thank you,

On 7/3/2013 6:47 PM, Artem Ananiev wrote:
> On 7/1/2013 8:50 PM, Anthony Petrov wrote:
>> Thanks for the additional information, Anton. Since this fix simply
>> reverts the behavior in GLXSurfaceData.c back to the pre-8005607 era, it
>> could probably be considered a good interim solution for the problem.
>> I'd like to hear Artem's opinion on this, though. Should we file a P4
>> bug to investigate the issue further so that in the future we could
>> avoid calling XSync() w/o the AWTLock?
> I raised exactly the same concerns about calling XSync and waiting on 
> the AWT lock, when we discussed this issue offline with Anton. Before 
> the fix for 8005607, XSync() was called without AWT lock, which is not 
> what the current fix is: now XSync() is not called at all. So behavior 
> will be different than it was before 8005607.
> Anton, did you investigate, why XSync() was in the native macros 
> before 8005607? Note that all the XErrorHandler code in XAWT worked 
> fine without XSync() before that fix, so why does the native Java2D 
> require this?
> Thanks,
> Artem
>> -- 
>> best regards,
>> Anthony
>> On 07/01/2013 07:24 PM, Anton Litvinov wrote:
>>> Hello Anthony,
>>> Thank you for the review of this fix. I would like to remark that this
>>> deadlock is a regression of the fix for the bug 8005607, and in the 
>>> code
>>> of the file "jdk/src/solaris/native/sun/java2d/opengl/GLXSurfaceData.c"
>>> before 8005607 fix, where the previous XError handling mechanism not
>>> involving "sun.awt.X11.XErrorHandlerUtil" class was used, native
>>> "XSync()" function was called without acquiring of AWT lock. So a fix
>>> for the current bug with a deadlock just reverted a part of the fix
>>> 8005607 which enforced taking AWT lock from the function
>>> "Java_sun_java2d_opengl_GLXSurfaceData_initPbuffer" in the file
>>> "jdk/src/solaris/native/sun/java2d/opengl/GLXSurfaceData.c". Answers to
>>> your questions are provided below.
>>> 1. "AWT EventQueue" holds AWT lock and waits till "Java2D Queue 
>>> Flusher"
>>> thread finishes its job, because in the method
>>> "sun.java2d.opengl.OGLSurfaceData.initSurface(final int width, final 
>>> int
>>> height)" execution of "initSurfaceNow(int width, int height)" is
>>> dispatched to "Java2D Queue Flusher". Before this dispatching in the
>>> method "initSurface" AWT lock is taken by the lines
>>>      308        OGLRenderQueue rq = OGLRenderQueue.getInstance();
>>>      309        rq.lock();
>>> and then with held AWT lock "AWT EventQueue" thread starts waiting on
>>> the second lock "sun.java2d.opengl.OGLRenderQueue.flusher" in the 
>>> method
>>> "sun.java2d.opengl.OGLRenderQueue.QueueFlusher.flushNow()"
>>>      181                    wait();
>>> 2. Yes, I investigated the option of waiting on AWT lock instead of
>>> "sun.java2d.opengl.OGLRenderQueue.flusher" lock in the class
>>> "sun.java2d.opengl.OGLRenderQueue", but this is impossible, because
>>> access to the always running thread "Java2D Queue Flusher" should be
>>> synchronized on some lock other than AWT lock, otherwise there will 
>>> be a
>>> performance degradation, because it will be trying to get AWT lock each
>>> 100 milliseconds. As I understood a possible solution for this problem
>>> can be not locking on AWT lock before dispatching execution of any code
>>> to "Java2D Queue Flusher" or complete refactoring of locking mechanism
>>> in the class "sun.java2d.opengl.OGLRenderQueue". Since the current bug
>>> blocks SQE from running any tests involving OpenGL and does not 
>>> allow to
>>> run any Java GUI application with enabled OpenGL rendering on Linux OS,
>>> I suppose the variant of refactoring is not acceptable. That is why as
>>> the most secure solution I decided just to call XSync() from
>>> "jdk/src/solaris/native/sun/java2d/opengl/GLXSurfaceData.c" as it was
>>> before the fix for 8005607.
>>> Thank you,
>>> Anton
>>> On 7/1/2013 3:11 PM, Anthony Petrov wrote:
>>>> Hi Anton,
>>>> I'm not sure if this a good fix since it enabled the GL thread to call
>>>> Xlib APIs w/o acquiring the AWTLock. This may not present a problem
>>>> currently since we know exactly when this method is called and that
>>>> another thread is holding the lock and isn't calling other X11
>>>> functions at the moment. But I doubt this knowledge will be widely
>>>> known and remembered in the future, and if another thread starts
>>>> calling X11 routines, we'll get into trouble...
>>>> Why would another thread (the AWT EventQueue if I got the problem
>>>> right) hold the AWTLock and wait till the GL thread finishes its job?
>>>> I'd assume it should release the lock for the period of waiting. This
>>>> would allow the GL thread to acquire the lock and perform the XSync()
>>>> call w/o any potential issues. Have you investigated this option?
>>>> -- 
>>>> best regards,
>>>> Anthony
>>>> On 06/28/2013 09:16 PM, Anton Litvinov wrote:
>>>>> Hello,
>>>>> Could you please review the following fix for a bug, which consists
>>>>> in a
>>>>> deadlock provoked by concurrency between AWT-EventQueue and Java2D
>>>>> Queue
>>>>> Flusher for taking AWT lock, when OpenGL rendering is enabled.
>>>>> Bug: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=8015730
>>>>> Webrev: http://cr.openjdk.java.net/~alitvinov/8015730/webrev.00
>>>>> The fix allows the code from the native function
>>>>> "Java_sun_java2d_opengl_GLXSurfaceData_initPbuffer" of the file
>>>>> "jdk/src/solaris/native/sun/java2d/opengl/GLXSurfaceData.c" 
>>>>> execute all
>>>>> XError handling procedures using "sun.awt.X11.XErrorHandlerUtil" 
>>>>> class
>>>>> without acquiring AWT lock. It is the only available solution for 
>>>>> this
>>>>> problem, because the current design of
>>>>> "sun.java2d.opengl.OGLRenderQueue" class does not allow to take AWT
>>>>> lock
>>>>> in Java2D Queue Flusher thread without reaching a deadlock, since all
>>>>> calls to the method
>>>>> "sun.java2d.opengl.OGLRenderQueue.flushAndInvokeNow(Runnable r)" are
>>>>> guarded by AWT lock.
>>>>> Thank you,
>>>>> Anton

More information about the awt-dev mailing list