SA and disassemblers

Tom Rodriguez Thomas.Rodriguez at Sun.COM
Mon Sep 14 08:28:27 PDT 2009

On Sep 11, 2009, at 5:59 AM, Volker Simonis wrote:

> Hi Tom,
> I like the extension of the decode_instructions interface, especially
> because this is something we are already using anyway. We have
> extended the disassembler in the VM to decode instructions which had
> been dumped by the VM earlier when no disassembler library was
> available. We realized this with an 'AbstractDisassaembler' which in
> our case is the base class of 'Disassembler'. If the native
> disassembler can not be loaded, we automatically fall back to the
> abstract disassembler which prints a kind of hex dump, but with all
> the usual annotations. Later on, we can take such an output produced
> by the abstract disassembler (for example from an hs_err file) and
> decode it to real assembly code. Perhaps this is an extension you also
> want to consider? I could contribute some code if you'd like.

That's interesting.  We have a separate tool that we use like this for  
decoding the disassembly in an hs_err.  It's different than our hsdis  
library since it allows disassembly any supported CPU on any supported  
CPU, so we can process an x86 hs_err on sparc or vice versa.  If you  
are motivated to contribute your changes they sound interesting so  
send them out.

> Besides this I have some remarks/suggestions regarding hsdis:
> - I don't really think it is necessary for hsdis.c to export two
> functions: decode_instructions() and decode_instructions_virtual(). I
> think the more general one decode_instructions_virtual() would be
> enough together with a small change in disassembler.{cpp,hpp}:

I'd updated disassembler.cpp to use the new interface but I want to  
maintain the old interface so the hsdis library itself will continue  
to work with old JVMs.

> address decode_env::decode_instructions(address start, address end,
> address virtual_start /* = 0*/) {
>  if (virtual_start == 0) virtual_start = start;
> This would unify the hsdis interface used by the VM and the SA and
> allow one to easily implement the decoding of "abstract disassembly"
> as described above.

I wasn't planning on exposing that argument outside of  
disassembler.cpp but there's no reason it couldn't be later if it's  

> - in hsdis.c in hsdis_read_memory_func you use a C++ style comment
> which I think should be better changed into a C style comment and
> 'membar' should probably read 'mamaddr':

Agreed.  I also fixed a few other instances of // style comments in  
that file.

> ...
> - The Readme file should contain the hint:
> "The binutils should be configured with the '--disable-nls' flag to
> disable Native Language Support, otherwise you might get an "undefined
> reference to `libintl_gettext'" if you try to load on systems
> which don't have NLS by default."

It's used automatically in the makefile when it configures binutils  
but it's true there's no explanation of why.  I added this comment:

Binutils should be configured with the '--disable-nls' flag to disable
Native Language Support, otherwise you might get an "undefined
reference to `libintl_gettext'" if you try to load on systems
which don't have NLS by default.  It also avoids build problems on
other configurations that don't include the full NLS support.

> - There's a typo in the README file of the hsdis plugin:
> line 70: 'hsdid-$' should read 'hsdis-$'

I don't see this.  The only reference I see in the README is as you say.

> - On IA64 and PPC, function pointers are pointers to function
> descriptors, so hsdis-demo.c should be changed accordingly:

I've added those.

> - In the Makefile I need to link hsdis-demo against 'ld'. So I suggest
> adding (at least for Linux):
> LDFLAGS		+= -ldl

The current Makefile already includes that.  I think you are looking  
at an old version of these files. I updated hsdis quite a bit in 
  and fixed the hsdid typo, the binutils setup and the ldflags.  It  
should just work out of the box as of that version.

> - I don't understand why 'decode_instructions' in hsdis.c needs to
> print a newline after every instruction. I think it would be more
> consistent if this newline would be printed in the appropriate
> callback in the disassembler class. I would therefore suggest to
> delete the following lines from decode_instructions in hsdis.c:
>     /* follow each complete insn by a nice newline */
>     /*(*printf_callback)(printf_stream, "\n");*/

I agree is seems like the wrong place to do that for an disassembly  
API.  I could make it default to not printing the newline when called  
as decode_instructions_virtual but keep the old behaviour for  
compatibility.  What do you think?

> - The previous change would also make it easier to use
> decode_instructions() to decode a single instruction. But it would be
> additionally necessary to change the line:
>    while (p < end && !app_data.losing) {
> to:
>    while (p <= end && !app_data.losing) {
> to make that possible. Additionally, "out of bound check" in
> hsdis_read_memory_func() will have to be removed, if we want to decode
> single instructions:
> if (memaddr_p + length > app_data->end) {
>  /* read is out of bounds */
>  return EIO;
> } else {
> I don't think that this does any harm, but on the other side it will
> make the hsdis plugin more generally usable.

Except that you could crash by reading further than was actually  
provided.  Alternatively we could add end_va to  
decode_instructions_virtual to make it complete.  How about this for  
the signature?

void* decode_instructions_virtual(uintptr_t start_va, uintptr_t end_va,
                                   unsigned char* buffer, int length,
                                   void* (*event_callback)(void*,  
const char*, void*),
                                   void* event_stream,
                                   int (*printf_callback)(void*, const  
char*, ...),
                                   void* printf_stream,
                                   const char* options);

> Finally I would like to suggest moving the printing of nmethods and
> code blobs into the corresponding print methods of the CodeBlob and
> nmethod classes.This could be easily achieved with the new
> functionality which allows the decoding of a single assembler
> instruction. I don't see a reason why the disassembler class needs to
> know the internals of nmethod and CodeBlob. In my opinion it would be
> a cleaner design if the disassembler class would export a clear
> interface for decoding assembler instructions and leave all the rest
> to the class like nmethod or CodeBlob which use it. Again this would
> simplify the implementation of an "AbstractDisassembler" class as
> described above because the code which prints all the various
> annotations around the instructions would not have to be duplicated in
> the "AbstractDisassembler" class.

I'm not against that but I'm don't want to tackle that as part of this  

Thanks for you feedback.


> Thank you and best regards,
> Volker
> On 8/27/09, Tom Rodriguez <Thomas.Rodriguez at> wrote:
>> One of the limitations of the current SA is that it doesn't have a
>> disassembler for amd64 which makes looking at 64 bit cores somewhat
>> challenging.  Instead of trying to implement a full disassembler  
>> for x64 in
>> Java I think we should switch the SA to using hsdis-arch for the  
>> decoding.
>> It requires a minor extension to the decode_instructions interface  
>> so that
>> we can disassemble a buffer of code that's not physically located  
>> where the
>> code came from.  I've got a working version of this that also fixes  
>> the
>> printing to match up better with what the JVM itself prints.  I was  
>> planning
>> on blowing away the other disassemblers completely because they  
>> don't really
>> fit into the new model.  I thought I'd float this before finishing  
>> it to see
>> if there are any concerns.  The initial webrev is at
>>  This includes
>> building the SA with source 1.5 to allow use of printf which will  
>> probably
>> be pushed separately.
>> tom

More information about the hotspot-dev mailing list