Hello, and other things
Kenneth.Russell at Sun.COM
Fri Mar 14 16:53:36 PDT 2008
Quick pointer to a project a co-worker told me about a while back:
John Rose wrote:
> On Feb 29, 2008, at 4:53 PM, Jason Fordham wrote:
>> I started thinking about targeting GCC for the JVM last week.
> That's a neat project!
> I have heard of JVMs being used to simulate very small assembly-level
> on the order of 16-bit computers. The challenges with this come from
> in a second level of virtualization. The execution of the simulated
> CPU is hard to integrate with the JVM's libraries.
>> It quickly became clear that the JVM instruction set is designed to make
>> the C programming model difficult: the separation of bytecodes,
>> frames, and object space, and the generally unconvertible addressType
>> quickly led me to a model where the JVM stacks are ignored except for
>> primitive operations, while memory - for data, bss and heap - is held
>> in a large array. In order to model C's function calls by pointer, I
>> figured a handle pair, class and method, hashing the strings, with a
>> linking stage after compilation to perform fixup - much as I imagine
>> slide 17 in the LangNet presentation implies.
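A rough sketch of the model described above, under stated assumptions: one
large array backs data, bss and heap, and a C function pointer is an integer
index into a table filled in by a link step. All class and method names here
(FlatMemory, FunctionTable, linkIntInt, Compiled.twice) are hypothetical, and
the table uses java.lang.invoke.MethodHandle as it later shipped in the JDK,
not the proposal discussed in this thread.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

// Hypothetical: one array holds data, bss and heap; a C address is an index.
final class FlatMemory {
    private final ByteBuffer mem;

    FlatMemory(int sizeInBytes) {
        mem = ByteBuffer.allocate(sizeInBytes).order(ByteOrder.LITTLE_ENDIAN);
    }

    int loadInt(int addr) { return mem.getInt(addr); }

    void storeInt(int addr, int value) { mem.putInt(addr, value); }
}

// Hypothetical: a C function pointer is an index into this table.
final class FunctionTable {
    private final List<MethodHandle> table = new ArrayList<>();

    // The fixup step: resolve a (class, method) pair for an int(int) function
    // and return the index that stands in for the function pointer.
    int linkIntInt(Class<?> owner, String name) {
        try {
            MethodType type = MethodType.methodType(int.class, int.class);
            table.add(MethodHandles.lookup().findStatic(owner, name, type));
            return table.size() - 1;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("unresolved symbol: " + name, e);
        }
    }

    // Call through the pointer.
    int callIntInt(int fnPtr, int arg) {
        try {
            return (int) table.get(fnPtr).invokeExact(arg);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }
}

// Stands in for a compiled C function: int twice(int x) { return x + x; }
final class Compiled {
    static int twice(int x) { return x + x; }
}
```

The function pointer itself is a plain int, so it can be stored in the flat
memory array like any other C value.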
> I agree that method handles will help with this sort of thing.
> The hard part, though, is the essentially untyped nature of C memory.
> I've seen C implementations that run over typed heaps, but they
> are artful compromises, rather than simple ports to a new backend.
> Centerline C and Zeta-C come to mind. (Both are old projects that
> may predate the Google cache. I don't have references handy.)
> The latter was a C compiler for the Symbolics Lisp Machine which
> used ordered pairs (cons cells) for all C pointers, to represent the
> combination of a base address and an arbitrary offset.
> A similar product was Bounds-Check C, which widened
> pointers into little 3-tuples (min, max, cur). The idea is
> that a tuple-based pointer will never be allowed to "reach
> beyond" the heap object it was created for; such operations
> are always indeterminate, since there is no guaranteed
> distance (or ordering) of heap objects, from one instruction
> to the next, in a system like the Symbolics with a powerful GC.
> That would work very nicely on the JVM also. You could use
> the sun.misc.Unsafe API (with great care!) to handle punning
> among memory-resident primitive types. You must avoid
> using Unsafe to pun between primitives and references, because
> there is absolutely no way to control when the GC might want
> to move things around underneath your code.
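A minimal sketch of the widened-pointer idea, with assumptions labeled: the
class FatPointer and its methods are hypothetical, and java.nio.ByteBuffer is
swapped in for sun.misc.Unsafe so that the example stays within supported
APIs. The pointer carries the object it was created for plus a current
offset; every access is checked against that object's bounds, and punning is
only between primitive types.

```java
import java.nio.ByteBuffer;

// Hypothetical: a C pointer widened to (heap object, current offset).
// The object's capacity supplies the min (0) and max bounds.
final class FatPointer {
    private final ByteBuffer object; // the heap object this pointer belongs to
    private final int cur;           // byte offset within that object

    private FatPointer(ByteBuffer object, int cur) {
        this.object = object;
        this.cur = cur;
    }

    // Allocate a fresh heap object and return a pointer to its first byte.
    static FatPointer malloc(int size) {
        return new FatPointer(ByteBuffer.allocate(size), 0);
    }

    // Pointer arithmetic moves the offset but never changes the object.
    FatPointer plus(int delta) {
        return new FatPointer(object, cur + delta);
    }

    // Reject any access that would reach beyond the object.
    private void check(int width) {
        if (cur < 0 || cur > object.capacity() - width) {
            throw new IndexOutOfBoundsException("offset " + cur + ", width " + width);
        }
    }

    void storeFloat(float v) { check(4); object.putFloat(cur, v); }

    // Reading the same bytes as an int puns float to int, primitive to primitive.
    int loadInt() { check(4); return object.getInt(cur); }
}
```

Because the offset is relative to the owning object, nothing here depends on
where the GC places or moves that object.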
>> The key obstacles I see are that the instruction set makes
>> a C-like stack expensive: there are no neat push and pop operations in
>> this memory model, it feels like microcoding. Though I understand the
>> motivation, which is to protect the bytecodes from malicious or
>> lazy use of buffer overflows, and other mechanisms for executing data.
> The stack is really just a shorthand for operand renaming.
> Feel free to generate code to a register-to-register machine,
> and map your virtual registers to JVM locals.
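To illustrate the register-to-register suggestion, here is a small sketch
written as Java source rather than bytecode. The three-address input in the
comment and the names RegisterStyle, r0, r1 and r2 are made up for the
example; the point is only that each virtual register becomes one local
variable, and the operand stack is used just for the individual operations.

```java
final class RegisterStyle {
    // Hypothetical three-address input for a C gcd(a, b):
    //   loop: if r1 == 0 goto done
    //         r2 = r0 % r1
    //         r0 = r1
    //         r1 = r2
    //         goto loop
    //   done: return r0
    static int gcd(int a, int b) {
        int r0 = a; // virtual register r0, held in a local
        int r1 = b; // virtual register r1
        int r2;     // virtual register r2
        while (r1 != 0) {
            r2 = r0 % r1;
            r0 = r1;
            r1 = r2;
        }
        return r0;
    }
}
```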
>> I like the method handle mechanism, for a variety of reasons, and I
>> would like to see some easing up on where the stack is located so
>> operations which index into the stack are more flexible, and fast. Is
>> this possible?
> If you need a memory-resident stack, you can just build an array
> to hold it, can't you? I'm not sure where the pain point is here, yet.
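A minimal sketch of an array-backed stack, assuming int-sized slots and a
stack that grows upward. MemoryStack and its methods are hypothetical names.
It gives push and pop, plus indexed access into the stack relative to the
top, which is the operation asked about above.

```java
// Hypothetical: a memory-resident stack held in a plain array.
final class MemoryStack {
    private final int[] slots;
    private int sp; // index of the next free slot

    MemoryStack(int capacity) {
        slots = new int[capacity];
    }

    void push(int value) { slots[sp++] = value; }

    int pop() { return slots[--sp]; }

    // Indexed access: depth 0 is the most recently pushed slot.
    int peek(int depth) { return slots[sp - 1 - depth]; }

    void poke(int depth, int value) { slots[sp - 1 - depth] = value; }
}
```

Overflow and underflow surface as ArrayIndexOutOfBoundsException from the
array itself, so no extra checks are written here.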
> Best wishes,
> -- John