[9] RFR(S): 8153527: break_tty_lock_for_safepoint causes "assert(false) failed: bad tag in log" and broken compile log

Nils Eliasson nils.eliasson at oracle.com
Fri Apr 22 13:53:11 UTC 2016


The compilelog can get corrupted and the VM may assert on "failed: bad 
tag in log" when printing opto_assembly. (Print assembly turns on print 
opto_assembly if hs_dis is not present.)

When printing opto_assembly in output.cpp we may loose the ttylock 
(break_tty_lock_for_safepoint) due to a safepoint in both print_metadata 
and dump_asm. Another thread can claim the lock and start printing. When 
the safepoint is over both threads will think they own the lock. The 
content will look ok thanks to the xml stream adding the writing thread 
tag to the log. The closing xml-tag has two problems: 1) It uses a 
raw_print and may get intermingled with other output 2) The xml tag 
stack tracking may see a bad sequence of tags.

Retake the ttylock before printing the closing print_optoassembly tag. 
(I have only observed this safepoint issue with print_optoassembly.)

If another tag already has the lock and is printing print_nmethod for 
example, print opto_assembly will block. Here we can have two variants:

1) the other thread will print something else (like print_nmethod) - 
then that tag will be closed before releaseing the lock, and the tag 
stack will be consistent but the output may look like 

2) the other thread is also printing opto_assembly. Then that thread may 
yield the look during a safepoint while the first one is retaking the 
look. Then we can get <print_optoassembly id=1><print_optoassembly 
id=2></print_optoassembly (for id 1)></print_optoassembly (for id 2)> or 
<print_optoassembly id=1><print_optoassembly id=2></print_optoassembly 
(for id 2)></print_optoassembly (for id 1)>

Fortunately the xml stack consistency will be ok since it doesn't make 
any difference on what thread wrote the print_optoassembly tag.

If this issue pops-up again we must investigate if there are more places 
in the compile log code that yields the tty lock on a safepoint. 
NoSafePointVerifiers don't seem to check on all transitions.

Bug: https://bugs.openjdk.java.net/browse/JDK-8153527
Webrev: http://cr.openjdk.java.net/~neliasso/8153527/webrev.02

All regression tests.

Nils Eliasson

On 2016-04-20 22:07, Vladimir Kozlov wrote:
> On 4/20/16 12:56 PM, Nils Eliasson wrote:
>> Hi,
>> Thanks for the help,
>> I got it to work, and added NoSafePointVerifiers to make sure I hadn't
>> missed anything. Then after many test iterations it failed again. It
>> didn't fail on the NSPV, but in dump_asm we blocked on a VM entry to get
>> a ciSymbol->as_utf8. Now I am considering if I should direct dump_asm to
>> the temporary buffer too, or relax the tag checks in the xml and accept
>> that the output may need to be sorted by writer-thread before use. The
>> output looks like:
>> <writer id=1/>
>> <print_optoassembly>
>> ...
>> releases tty when blocking on a safepoint
>> <writer id=2/>
>> <print_nmethod>
>> ...
>> <writer id=1/>         // back again after safepoint writing without
>> ttylock now.
>> </print_optoassembly>  // Here we fail on an assert today when we expect
>> a closing print_nmethod tag
>> <writer id=2/>
>> </print_nmethod>
>> This is malformed xml but has enough information to be reconstructed.
>> Would this be an acceptable output?
> Yes, I think it is acceptable - we don't loose information. And it is 
> not worse than it was before.
> Thanks,
> Vladimir
>> Regards,
>> Nils
>> On 2016-04-18 19:30, Vladimir Kozlov wrote:
>>> tty would have the same problem but it use C_HEAP to allocate:
>>>     defaultStream::instance = new(ResourceObj::C_HEAP, mtInternal)
>>> defaultStream();
>>> Please, look if you can do something similar.
>>> Thanks,
>>> Vladimir
>>> On 4/18/16 4:24 AM, Nils Eliasson wrote:
>>>> Resizeable is better, but then we assert on expanding the stringbuffer
>>>> while being under a different ResourceMark.
>>>> Regards,
>>>> Nils
>>>> On 2016-04-15 19:44, Vladimir Kozlov wrote:
>>>>> Use resizable stream:
>>>>> stringStream(size_t initial_bufsize = 256);
>>>>> 1024 may not be enough.
>>>>> Thanks,
>>>>> Vladimir
>>>>> On 4/15/16 8:10 AM, Nils Eliasson wrote:
>>>>>> Hi,
>>>>>> Please review this fix of print opto_assembly.
>>>>>> Summary:
>>>>>> The compilelog can get corrupted and the VM may assert on "failed:
>>>>>> bad tag in log".
>>>>>> When printing assembly in output.cpp we first take the ttylock, 
>>>>>> print
>>>>>> the head and then the method metadata. However the
>>>>>> metadata printing makes a vm entry and may block for a safepoint and
>>>>>> will then release the lock
>>>>>> (break_tty_lock_for_safepoint). After that some of the other 
>>>>>> compiler
>>>>>> thread that haven't safepointed will take the lock
>>>>>> and the broken log will be a fact when the safepoint is over and the
>>>>>> first thread starts logging again.
>>>>>> Solution:
>>>>>> Print the method metadata to a temporary buffer, then take the tty
>>>>>> lock.
>>>>>> Testing:
>>>>>> Repro from bug stops failing.
>>>>>> Running :hotspot_all
>>>>>> (http://jdash.se.oracle.com/rbt/rbt-nils.eliasson-compiler_control-20160415-1508-10854) 
>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8153527
>>>>>> Webrev: http://cr.openjdk.java.net/~neliasso/8153527/webrev.01/
>>>>>> Regards,
>>>>>> Nils Eliasson

More information about the hotspot-compiler-dev mailing list