RFR(S) 8241071 Generation of classes.jsa is not deterministic
thomas.stuefe at gmail.com
Mon Apr 27 09:14:50 UTC 2020
Rethinking this a bit more, I realize that what you need is not addresses growing
monotonically but deterministic allocation: given a sequence of Metaspace
allocation operations (Metaspace::allocate(), Metaspace::deallocate(), and
collection of class loaders), the pointers returned
by Metaspace::allocate() should come in the same order each time that
sequence is repeated in a new VM. This invalidates some of my arguments in
my last mail, but not all.
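To make the distinction concrete, here is a minimal sketch with a toy allocator.
It is purely hypothetical (ToyArena and replay_fixed_sequence are names invented
for illustration, and this is not how Metaspace is implemented). It reuses freed
blocks, so the returned offsets are not monotonic, yet replaying the same
sequence of operations yields the same offsets every time:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy allocator, NOT the HotSpot Metaspace implementation.
// It hands out offsets into an imaginary region and reuses a freed block
// of exactly the requested size before bumping the top.
class ToyArena {
public:
  std::size_t allocate(std::size_t size) {
    // Prefer the most recently freed block of the same size, if any.
    for (std::size_t i = free_.size(); i-- > 0;) {
      if (free_[i].size == size) {
        std::size_t off = free_[i].offset;
        free_.erase(free_.begin() + static_cast<std::ptrdiff_t>(i));
        return off;
      }
    }
    std::size_t off = top_;
    top_ += size;
    return off;
  }

  void deallocate(std::size_t offset, std::size_t size) {
    free_.push_back({offset, size});
  }

private:
  struct Block { std::size_t offset; std::size_t size; };
  std::vector<Block> free_;
  std::size_t top_ = 0;
};

// Replays one fixed sequence of operations and records the returned offsets.
std::vector<std::size_t> replay_fixed_sequence() {
  ToyArena arena;
  std::vector<std::size_t> out;
  std::size_t a = arena.allocate(16);
  out.push_back(a);
  out.push_back(arena.allocate(32));
  arena.deallocate(a, 16);            // premature deallocation
  out.push_back(arena.allocate(64));  // no fitting free block, bumps the top
  out.push_back(arena.allocate(16));  // reuses the block freed above
  return out;
}

// True if every offset is strictly larger than the one before it.
bool is_monotonic(const std::vector<std::size_t>& v) {
  for (std::size_t i = 1; i < v.size(); ++i) {
    if (v[i] <= v[i - 1]) return false;
  }
  return true;
}
```

Two calls to replay_fixed_sequence() return identical vectors, while
is_monotonic() is false for them. Determinism in this sense only holds as long
as the sequence of operations itself is identical between runs.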
I also thought about restrictions this places on the callers.
- Class loader collections are triggered by GCs. Can we be sure that this
happens at exactly the same point in each run? Some GCs do class unloading
concurrently, which adds a nondeterministic timing factor.
- Classes may be loaded concurrently by multiple threads, adding a timing
factor.
- You may have classes which are created implicitly, like hidden classes for
lambdas or reflection glue classes. Their creation may not be
deterministic. Even though they are not put into the archive, they live in
class space too, and their allocation mixes things up.
Also, requiring Metaspace allocation to be deterministic requires each part
of it to be deterministic (e.g. the deallocation block management). For
example, we could never base any decision on the numerical value of an
address, which is location dependent and can vary between VM runs.
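A small sketch of what "basing a decision on the numerical value of an
address" can look like. All names here (pointers_to, order_by_address,
order_by_content) are hypothetical and only serve the illustration; the two
vectors in the usage note stand in for two VM runs that placed the same
symbols at different locations:

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Collects the addresses of the elements, standing in for Symbol pointers.
std::vector<const std::string*> pointers_to(const std::vector<std::string>& storage) {
  std::vector<const std::string*> ptrs;
  for (const std::string& s : storage) ptrs.push_back(&s);
  return ptrs;
}

// Location dependent: the result follows wherever the symbols happen to live.
std::vector<std::string> order_by_address(std::vector<const std::string*> syms) {
  std::sort(syms.begin(), syms.end(), std::less<const std::string*>());
  std::vector<std::string> out;
  for (const std::string* s : syms) out.push_back(*s);
  return out;
}

// Location independent: the result depends only on the symbol contents.
std::vector<std::string> order_by_content(std::vector<const std::string*> syms) {
  std::sort(syms.begin(), syms.end(),
            [](const std::string* a, const std::string* b) { return *a < *b; });
  std::vector<std::string> out;
  for (const std::string* s : syms) out.push_back(*s);
  return out;
}
```

With run1 = {"foo", "bar", "baz"} and run2 = {"baz", "foo", "bar"} as the two
placements, order_by_address() gives different results for the two runs, while
order_by_content() gives the same result for both.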
I really think reproducible builds are valuable, but my fear is that
relying on Metaspace for deterministic allocation would be too fragile.
Thanks again, Thomas
On Mon, Apr 27, 2020 at 9:58 AM Thomas Stüfe <thomas.stuefe at gmail.com> wrote:
> Hi Ioi,
> Please don't do this :)
> First off, how would this work when dumping with
> UseCompressedClassPointers off? In that case allocation would be relegated
> to non-class metaspace which cannot guarantee that kind of address
> Even in class space, I do not think you can guarantee addresses growing
> monotonically. Class unloading could happen during dump time, so space may
> be returned to the class freelist and later reused. Metadata can be prematurely
> deallocated, e.g. if a class load error occurs or byte code is rewritten by
> some agent. Remainders of Metachunks are used up in a delayed fashion. All
> these cases will present you with pointers which are not growing
> monotonically.
> I also believe this problem of non-deterministic placement is not limited
> to Symbols, but that you should see it for Klass structures too, albeit
> rarely. I believe the fact that you do not see this is an accident, or we
> are not looking that closely. E.g. if we were to change the frequency at
> which we retrieve MetaBlocks from the freeblocklist (see
> you would get more reuse of deallocated blocks and would certainly see more
> volatility in the addresses.
> This is all true with the current implementation; the upcoming new one
> uses a buddy style allocator behind the scenes where it is by no means
> guaranteed that the first chunks get used first. I think this is what still
> happens, by sheer accident, but I am hesitant to promise such a behavior in
> the future. It removes freedom from the implementation in a lot of ways.
> Small examples: we might want to shepherd certain allocations into
> separate parts of class space (e.g. Klass structures from hidden classes)
> to minimize fragmentation. Or add a mode, for testing, where we would
> allocate Klass at the very end of ccs, or at certain "round" addresses, to
> shake loose errors in the calling layers which rely too much on what Klass
> pointers look like.
> Bottom line: I think the assumption that ccs allocates in
> monotonically ascending order is not even correct today, and may break very
> easily, and we should not rely on this.
> I think instead of misusing ccs for this, it would be cleaner to just
> allocate a large C heap area as backing storage for the symbols? How much
> space are we talking about? If memory is a concern, we could just reserve a
> range and commit it manually as we go.
> Or could we not order the placement of Klass and Symbol at dump time? Dump
> time is not that time critical, no?
> Thanks & Sorry,
> On Mon, Apr 27, 2020 at 7:31 AM Ioi Lam <ioi.lam at oracle.com> wrote:
>> The goal is for "java -Xshare:dump" to produce deterministic contents in
>> the CDS archive that depend only on the contents of
>>  Symbols in the CDS archive may have non-deterministic order because
>> Arena allocation is non-deterministic.
>>  The contents of the CDS shared heap region may be randomized due to
>>  With -Xshare:dump, allocate Symbols from the class space (64-bit
>> See changes in symbol.cpp for details.
>>  When running the VM with -Xshare:dump, ImmutableCollections.SALT32L is
>> initialized with a deterministic seed. NOTE: this has an effect ONLY when the
>> VM is running with the special flag -Xshare:dump to dump the CDS
>> It does NOT affect normal execution of Java programs.
>> I also cleaned up the -Xlog:cds output and print out the CRC of each
>> CDS region, to help diagnose why two CDS archives may have different
>> - Ioi