RFR(S): 8223444: Improve CodeHeap Free Space Management

Schmidt, Lutz lutz.schmidt at sap.com
Wed May 15 12:53:59 UTC 2019

Hi Erik,

I had not heard that. Just to make sure I understand your statement correctly: on SPARC, memset does something fancy which causes random byte values to become visible to concurrent readers. Correct?

As you state, sufficiently capable compilers will recognize explicit initialization loops and convert them to memset calls. Replacing the loop by a memset call therefore does not introduce a new risk. If at all, it changes probabilities.

Furthermore, I do not see concurrent readers. Segment map initialization is only called when the CodeHeap is expanded or when it is cleared in its entirety.

If anybody has a remaining concern, I’m ready to revert from memset to the explicit loop.

Thanks for the heads-up!

From: Erik Osterlund <erik.osterlund at oracle.com>
Date: Wednesday, 15. May 2019 at 09:22
To: Lutz Schmidt <lutz.schmidt at sap.com>
Cc: Vladimir Kozlov <vladimir.kozlov at oracle.com>, "hotspot-compiler-dev at openjdk.java.net" <hotspot-compiler-dev at openjdk.java.net>
Subject: Re: RFR(S): 8223444: Improve CodeHeap Free Space Management


Didn’t read the full conversation, only the part where it was claimed we have byte atomicity on all platforms with memset.
That’s not true. On SPARC, the default is to use block initializing stores (BIS), which exposes concurrent readers to out-of-thin-air values that were never in the original memory state or the new memory state.

Moreover, the compiler is allowed to, and in practice will, transform a loop that performs byte-wise initialization into a memset call, as long as the memory being written is non-volatile.

These problems were observed in card tables and led to the introduction of
memset_with_concurrent_readers() to force the memset not to use BIS.

Just an FYI.


On 15 May 2019, at 07:53, Schmidt, Lutz <lutz.schmidt at sap.com<mailto:lutz.schmidt at sap.com>> wrote:
Hi Vladimir,

thank you for your comments. About filling CodeHeap with bad values after split_block:
- in deallocate_tail, the leading part must remain intact. It contains valid code.
- in search_freelist, one free block is split into two. There I could invalidate the contents of both parts.
- If you want added safety, wouldn't it then be better to invalidate the block contents during add_to_freelist()? You could then be sure there is no executable code in a free block.


On 14.05.19, 23:00, "Vladimir Kozlov" <vladimir.kozlov at oracle.com<mailto:vladimir.kozlov at oracle.com>> wrote:

   On 5/14/19 1:09 PM, Schmidt, Lutz wrote:

Hi Vladimir,

I had the same thought re atomicity. memset() is not consistent even on one platform. But I believe it's not a factor here. The original code was a byte-by-byte loop. And we have byte atomicity on all supported platforms, even with memset().

It's a different thing with sequence of initialization. Do we really depend on byte(i) being initialized before byte(i+1)? If so, we would have a problem even with the explicit byte loop. Not on x86, but on ppc with its weak memory ordering.

   Okay, if it is a byte copy I am fine with it.

About segment map marking:
There is a short description how the segment map works in heap.cpp, right before CodeHeap::find_start().
In short: each segment map element contains an (unsigned) distance which, when subtracted from the element's own index, yields the segment map element where the heap block starts. Thus, when you re-initialize the tail part of a heap block range to describe a newly formed heap block, the leading part remains valid.

Segmap  before             after
Index    split             split
  I       0 <- block start   0 <- block start (now shorter)
  I+1     1                  1    each index 0..9 still points
  I+2     2                  2    back to the block start
  I+3     3                  3
  I+4     4                  4
  I+5     5                  5
  I+6     6                  6
  I+7     7                  7
  I+8     8                  8
  I+9     9                  9
  I+10    10                 0 <- new block start
  I+11    11                 1
  I+12    12                 2
  I+13    13                 3
  I+14    14                 4
  I+15    0 <- block start   0 <- block start
  I+16    1                  1
  I+17    2                  2
  I+18    3                  3
  I+19    4                  4

There is a (very short) description about what's happening at the very end of search_freelist(). split_block() is called there as well. Would you like to see a similar comment in deallocate_tail()?

   Thank you, I forgot that the first block's mapping is still valid.

   What about storing the bad value (in debug mode) only in the second part and not in both parts?

Once I have your response, I will create a new webrev reflecting your input. I need to do that anyway because the assert in heap.cpp:200 has to go away. It fires spuriously. The checks can't be done at that place. In addition, I will add one line of comment and rename a local variable. That's it.




On 14.05.19, 20:53, "hotspot-compiler-dev on behalf of Vladimir Kozlov" <hotspot-compiler-dev-bounces at openjdk.java.net<mailto:hotspot-compiler-dev-bounces at openjdk.java.net> on behalf of vladimir.kozlov at oracle.com<mailto:vladimir.kozlov at oracle.com>> wrote:


    Do we need to be concerned about atomicity of the marking? We know that memset() is not atomic (maybe I am wrong here).
    Another thing: I did not get the logic in deallocate_tail(). split_block() marks only the second half of the split segments as
    used and (after the call) stores bad values in it. What about the first part? Maybe add a comment.


    On 5/14/19 3:47 AM, Schmidt, Lutz wrote:
Dear all,

May I please request reviews for my change?
Bug:    https://bugs.openjdk.java.net/browse/JDK-8223444
Webrev: https://cr.openjdk.java.net/~lucy/webrevs/8223444.00/

What this change is all about:
While working on another topic, I came across the code in share/memory/heap.cpp. I applied some small changes which I would call improvements.

Furthermore, and in particular with these changes, the platform-specific parameter CodeCacheMinBlockLength should be fine-tuned to minimize the number of residual small free blocks. Heap block allocation does not create free blocks smaller than CodeCacheMinBlockLength. This parameter value should match the minimal requested heap block size. If it is too small, such free blocks will never be re-allocated. The only chance for them to vanish is when a block next to them gets freed. Otherwise, they linger around (mostly at the beginning of) the free list, slowing down the free block search.

The following free block counts have been found after running JVM98 with different CodeCacheMinBlockLength values. I have used -XX:+PrintCodeHeapAnalytics to see the CodeHeap state at VM shutdown.

JDK-8223444 not applied

        Segment  |  free blocks with CodeCacheMinBlockLength=
          Size   |       1      2      3      4      6      8
aarch      128   |       0    153     75     30     38      2
ppc        128   |       0    149     98     59     14      2
ppcle      128   |       0    219    161    110     69     34
s390       256   |       0    142     93     59     30     10
x86        128   |       0    215    157    118     42     11

JDK-8223444 applied

        Segment  |  free blocks with CodeCacheMinBlockLength=  |  suggested
          Size   |       1      2      3      4      6      8  |   setting
aarch      128   |     221    115     80     36      7      1  |     6
ppc        128   |     245    152    101     54     14      4  |     6
ppcle      128   |     243    144     89     72     20      5  |     6
s390       256   |     168     60     67      8      6      2  |     4
x86        128   |     223    139     83     50     11      2  |     6

Thank you for your time and opinion!
