Undefined behaviour in hotspot

Volker Simonis volker.simonis at gmail.com
Wed Apr 23 17:47:05 UTC 2014


this didn't let me sleep so I did some deeper investigation. The outcome is
that I'm pretty sure this is a bug in gcc 4.9. Following the details:

It seems that 'MacroAssembler::jump_cc(Assembler::Condition,
AddressLiteral)' is getting compiled incorrectly. Looking at the assembly
of this function in gdb shows the following:

 push   %ebp
 mov    %esp,%ebp
 push   %esi
 push   %ebx
 call   0xf67a88e5 <__x86.get_pc_thunk.bx>
 add    $0x7b61de,%ebx
 mov    0x8(%ebp),%esi
 mov    0xc(%esi),%eax
 mov    0x4(%eax),%edx
 test   %edx,%edx
 je     0xf7195df7
 lea    -0x386e28(%ebx),%eax
 push   %eax
 lea    -0x386884(%ebx),%eax
 push   %eax
 lea    -0x3872f0(%ebx),%eax
 push   $0xe3
 push   %eax
 call   0xf6c9b730 <_Z15report_vm_errorPKciS0_S0_>
 call   0xf72bc1c0 <breakpoint>
 mov    0xc(%esi),%eax
 add    $0x10,%esp
 mov    0x8(%eax),%edx
 mov    %edx,0x4(%eax)
   0xf7195dfd:    nop
   0xf7195dfe:    xchg   %ax,%ax
   0xf7195e00 <_ZN14MacroAssembler10verify_oopEP12RegisterImplPKc.part.49>:
   push   %ebp
<_ZN14MacroAssembler10verify_oopEP12RegisterImplPKc.part.49+1>:    mov

Besides the fact that the assembly for this functions is much too small,
you can see that the function has no return statement. Ath the end it just
runs into the following function (in this case
''MacroAssembler::verify_oop(RegisterImpl*, char const*)").

The check at:

   0xf7195dc6 :    mov    0x4(%eax),%edx
   0xf7195dc9 :    test   %edx,%edx

is from the assertion in InstructionMark constructor:

    InstructionMark(AbstractAssembler* assm) : _assm(assm) {
      assert(assm->inst_mark() == NULL, "overlapping instructions");

which is at the beginning of MacroAssembler::jump_cc():

void MacroAssembler::jump_cc(Condition cc, AddressLiteral dst) {
  if (reachable(dst)) {
    InstructionMark im(this);

If 'assm->inst_mark()' is not zero, we jump right to 0xf7195df7 where we
set the instruction mark of the code section:

   0xf7195df7 :    mov    0x8(%eax),%edx
   0xf7195dfa :    mov    %edx,0x4(%eax)

'_assm->set_inst_mark()' expands to 'code_section()->set_mark()' which in
turn expands to '{ _mark = _end; }'. In register 'eax' we have the
CodeSection object where the members '_end' and '_mark' have the offsets 8
and 4 respectively. So far so good - everything looks pretty until here.

But afterwards we have two NOPs and then we unconditionally enter
'MacroAssembler::verify_oop()" which is obviously wrong!
Later we indirectly call "Assembler::push(RegisterImpl*)" from
'MacroAssembler::verify_oop()" but at that point we're already lost and we
end up with a wrong CodeSection pointer as described in my previous mail.

Funny enough, the "-fsanitize=undefined" warnings are simply wrong for the
exactly same reason. If we disassemble a version of
'MacroAssembler::jump_cc(Assembler::Condition, AddressLiteral)' wich was
compiled with '-fsanitize=undefined' we will see:

   0xf71935d4 <+100>:    mov    -0x1c(%ebp),%esi
   0xf71935d7 <+103>:    test   %esi,%esi
   0xf71935d9 <+105>:    je     0xf719362f
   0xf71935db <+107>:    mov    0xc(%esi),%esi
   0xf71935de <+110>:    test   %esi,%esi
   0xf71935e0 <+112>:    je     0xf719361a
   0xf71935e2 <+114>:    mov    0x8(%esi),%edi
   0xf71935e5 <+117>:    test   %esi,%esi
   0xf71935e7 <+119>:    je     0xf7193605
   0xf71935e9 <+121>:    mov    %edi,0x4(%esi)
   0xf71935ec <+124>:    cmpl   $0xfffffff8,0x10(%ebp)
   0xf71935f0 <+128>:    je     0xf71935f2
   0xf71935f2 <+130>:    push   %eax
   0xf71935f3 <+131>:    push   %eax
   0xf71935f4 <+132>:    lea    0xb750(%ebx),%eax
   0xf71935fa <+138>:    push   $0x0
   0xf71935fc <+140>:    push   %eax
   0xf71935fd <+141>:    call   0xf6797260 <__ubsan_handle_type_mismatch at plt
   0xf7193602 <+146>:    add    $0x10,%esp
   0xf7193605 <+149>:    lea    0xb768(%ebx),%eax
   0xf719360b <+155>:    push   %edx
   0xf719360c <+156>:    push   %edx
   0xf719360d <+157>:    push   $0x0
   0xf719360f <+159>:    push   %eax
   0xf7193610 <+160>:    call   0xf6797260 <__ubsan_handle_type_mismatch at plt

We now have NULL-checks all over in the code (which doesn't fire in our
case) but after we executed 'code_section()->set_mark()' at address
0xf71935e2 to 0xf71935e9 we unconditionally run into the ubsan ("undefined
behavior sanitizer) erro handlers which print the two warnings (noticed
that I've debugged the code to verify that we didn't jump into the ubsan
handlers after a 'test' instruction).

So to cut a long story short, I've identified '-fdevirtualize' as the
culprit. '-fdevirtualize' is automatically set if we compile with -O2 but
if I compile macroAssembler_x86.o only with '-O2 -fno-devirtualize' I get a
VM which at least starts up and passes some smoke tests. @Omair: could you
please verify this. I haven't tried x86_64 until now so I can not say if
this fix also helps there.

I'll try to prepare a usefully bug-report for GCC although I've no idea how
I could cut this down to a small and reproducible test case.


On Tue, Apr 22, 2014 at 9:11 PM, Volker Simonis <volker.simonis at gmail.com>wrote:

> Hi Omair,
> I've downloaded and compiled a brand new version of gcc 4.9.0 (it was
> actually released today!)
> I could reproduce your problem on x86. On x86_64 there's a problem as
> well - the VM hangs during the initialization. Funny enough I had no
> problems on ppc64. A freshly compiled VM from jdk9/hs-comp ran without
> any problems. So this seems to be x86 related.
> I've further analyzed the problem you described and the good news is
> that it is not related to calling encoding() on a zero pointer - so
> hopefully no need to massively change HotSpot code. The assembly code
> of "Assembler::push(RegisterImpl* src)" looks as follows:
> 0xf6959174 <_ZN9Assembler4pushEP12RegisterImpl+20>:    mov
>  0xc(%ebp),%esi
> 0xf6959177 <_ZN9Assembler4pushEP12RegisterImpl+23>:    mov
>  0x8(%ebp),%edi
> 0xf695917a <_ZN9Assembler4pushEP12RegisterImpl+26>:    cmp    $0x7,%esi
> 0xf695917d <_ZN9Assembler4pushEP12RegisterImpl+29>:    jbe
> 0xf69591a3 <_ZN9Assembler4pushEP12RegisterImpl+67>
> 0xf695917f <_ZN9Assembler4pushEP12RegisterImpl+31>:    lea
> -0x3a0c4e(%ebx),%eax
> 0xf6959185 <_ZN9Assembler4pushEP12RegisterImpl+37>:    push   %eax
> 0xf6959186 <_ZN9Assembler4pushEP12RegisterImpl+38>:    lea
> -0x3a0c3d(%ebx),%eax
> 0xf695918c <_ZN9Assembler4pushEP12RegisterImpl+44>:    push   %eax
> 0xf695918d <_ZN9Assembler4pushEP12RegisterImpl+45>:    lea
> -0x39be48(%ebx),%eax
> 0xf6959193 <_ZN9Assembler4pushEP12RegisterImpl+51>:    push   $0x41
> 0xf6959195 <_ZN9Assembler4pushEP12RegisterImpl+53>:    push   %eax
> 0xf6959196 <_ZN9Assembler4pushEP12RegisterImpl+54>:    call
> 0xf6c9b730 <_Z15report_vm_errorPKciS0_S0_>
> 0xf695919b <_ZN9Assembler4pushEP12RegisterImpl+59>:    call
> 0xf72bc1c0 <breakpoint>
> 0xf69591a0 <_ZN9Assembler4pushEP12RegisterImpl+64>:    add    $0x10,%esp
> 0xf69591a3 <_ZN9Assembler4pushEP12RegisterImpl+67>:    mov
>  0xc(%edi),%edx
> 0xf69591a6 <_ZN9Assembler4pushEP12RegisterImpl+70>:    mov    %esi,%eax
> 0xf69591a8 <_ZN9Assembler4pushEP12RegisterImpl+72>:    or     $0x50,%eax
> 0xf69591ab <_ZN9Assembler4pushEP12RegisterImpl+75>:    mov
>  0x8(%edx),%ecx
> 0xf69591ae <_ZN9Assembler4pushEP12RegisterImpl+78>:    mov    %al,(%ecx)
> 'this' is loaded into register 'edi' and 'src' is loaded into register
> 'esi'. 'src' (i.e. 'esi') is compared against '0x7' which comes from
> the call to 'is_valid()'. If everything is fine, we proceed at address
> '0xf69591a3'. There we load the '_code_section' field with offest 0xc
> from the AbstractAssembler which is in register 'edi' and stroe it in
> register 'edx'. We OR the register encoding with '0x50' which succeeds
> altough the actual RegisterImpl pointer (i.e. 'src') is NULL.
> We then load the '_end' field with offest '0x8' from the CodeSection
> in register 'edx' into 'ecx'. But this gives us a wrong address (i.e.
> 0xcccccccc) and that's the reason why we crash when we try to write
> (i.e. emit_int8()) the register encoding from register 'al' to that
> address in 'ecx'.
> So the problem is that for some reason the CodeSection of the
> AbstractAssembler isn't initialized correctly. This may be related to
> the "codeBuffer.hpp:181:51: runtime error: member access within null
> pointer of type 'struct CodeSection'" - problem when compiling with
> "-fsanitize=undefined" as reported by you. I'll take a look at that
> tomorrow.
> Regards,
> Volker
> On Tue, Apr 22, 2014 at 6:34 PM, Omair Majid <omajid at redhat.com> wrote:
> > * Volker Simonis <volker.simonis at gmail.com> [2014-04-22 04:41]:
> >> I think the simplest and safest fix would be to make encoding() (and
> >> all the corresponding functions) static functions which take a
> >> Register as argument like:
> >>
> >> static int encoding(const RegisterImpl* r) { assert(is_valid(r),
> >> "invalid register"); return (intptr_t)r; }
> >>
> >> This wouldn't waste any more memory and it would be fully C++
> >> compliant at the price of a slightly more verbose usage:
> >>
> >> 2577    void Assembler::push(Register src) {
> >> 2578      int encode = prefix_and_encode(RegisterImpl::encoding(src));
> >> 2579
> >>
> >> And of course this would work with any compiler.
> >
> > Are you sure about this?
> >
> > It seems to me that a pointer created from an int does not satisfy the
> > conditions for a "safely-derived pointer". It has implementation defined
> > behaviour [1].
> >
> >> What do you think?
> >
> > That certainly works for me, but not sure for others.
> >
> > For the sake of safety, I would rather use a typedef for registers and a
> > separate class with static methods to operate on the typedef.
> >
> > Thanks,
> > Omair
> >
> > [1] http://cpp0x.centaur.ath.cx/basic.stc.dynamic.safety.html
> >
> > --
> > PGP Key: 66484681 (http://pgp.mit.edu/)
> > Fingerprint = F072 555B 0A17 3957 4E95  0056 F286 F14F 6648 4681

More information about the hotspot-dev mailing list