Understanding ZGC details
gil at azul.com
Mon Jul 16 16:07:44 UTC 2018
On Jul 16, 2018, at 7:08 AM, Simone Bordet <simone.bordet at gmail.com> wrote:
keeping this between us since it's not really relevant for ZGC :)
On Fri, Jul 13, 2018 at 7:52 PM Gil Tene <gil at azul.com> wrote:
Speaking about the C4 mechanism: while C4 does perform various mapping manipulations
for various reasons, LVB detection does not use SEGVs and does not rely on memory
protection. Instead, the C4 LVB fast path uses a test & jmp (a single u-op on x86) to verify
that the loaded reference is not in a "currently invalid" phase, with a slow path that fixes the
reference to be in a valid phase (doing whatever is needed to achieve that: ensure marking,
fixup to point to relocated object location, or perform actual object relocation if needed).
Ok. So I'm reading the GC Handbook from Jones et al. 2nd edition, and
they say that Pauseless and C4 protect from-pages.
If I read correctly, from-pages are protected so that if a mutator
dereferences a pointer, it will trap, and the trap will "fix" the
pointer, either by copying it to to-space (so the mutator does the
job), or by looking at side metadata for the new address.
Look at the C4 ISMM paper (https://www.azul.com/files/c4_paper_acm2.pdf) for the actual terminology and use. The paper defines (in section 2) what an LVB (Loaded Value Barrier) is and the invariants it maintains, and explains the various ways to ensure barriered references match an expected state. It also discusses implementation variants (section 4), where it explains how LVB fast path test implementations could vary dramatically (e.g. between interpreted, JIT’ed, and runtime C++ code, or on different hardware architectures, and obviously by specific implementation choices) without changing semantics, and provides pseudo code (appendix A) for the logical coverage.
The TLDR summary in the context of your above question would be: “Protected” means “logically protected (by the algorithm) from mutator access”, “Trap” or “trigger” means “do something other than the fast path”, and “Page” means “a power-of-2 aligned range of addresses”. Virtual memory protection and privilege-changing trapping is one possible (but not very efficient or practical for the common case) way to implement an LVB, but the vast majority of LVB tests are implemented via user-mode testing and conditional branching.
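To make the "user-mode testing and conditional branching" point concrete, here is a minimal sketch of a barrier in that style. It is not the C4 implementation: the phase-tag layout (two tag bits at the top of a 64-bit reference), the Collector class, the forwarding table, and the write-back of the fixed reference are all assumptions made for illustration.

```python
# Hypothetical reference layout: the top two bits of a 64-bit reference
# carry a phase tag, the remaining bits carry the address.
PHASE_SHIFT = 62
PHASE_MASK = 0b11 << PHASE_SHIFT
ADDRESS_MASK = (1 << PHASE_SHIFT) - 1


class Collector:
    """Toy collector state; every name here is illustrative only."""

    def __init__(self, expected_phase):
        self.expected_phase = expected_phase  # phase tag currently considered valid
        self.forwarding = {}                  # old address -> relocated address
        self.slow_path_hits = 0

    def tag(self, address):
        """Build a reference to `address` carrying the currently valid phase tag."""
        return (self.expected_phase << PHASE_SHIFT) | address

    def fix(self, ref):
        """Slow path: return an equivalent reference in the valid phase."""
        self.slow_path_hits += 1
        address = ref & ADDRESS_MASK
        address = self.forwarding.get(address, address)  # follow a relocation, if any
        return self.tag(address)


def load_reference(collector, slots, index):
    """Load slots[index] through the barrier: test the loaded value, branch if invalid."""
    ref = slots[index]
    if (ref & PHASE_MASK) != (collector.expected_phase << PHASE_SHIFT):
        ref = collector.fix(ref)  # slow path, taken only for stale references
        slots[index] = ref        # assumed step: store the fixed reference back
    return ref
```

No memory protection and no trap is involved: the "protected" state is just a bit pattern the fast path compares against, and the slow path is an ordinary call. A real slow path would also cover marking and on-demand relocation, which this sketch leaves out.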
Adhering to actual virtual memory page boundaries, and actually matching collector protection with virtual memory protection, has some bonus logical benefits, including allowing multiple implementations of the same LVB semantics enforcement in the same code base, as well as enhanced error detection that goes beyond the LVB semantics requirements (but helps a lot in stabilizing an actual collector implementation, which, like all software, may include bugs at some point).
The fast path of an LVB test can be optimized to a test-and-jump, and that’s how it’s been implemented for roughly a decade for reasonable (currently up to 2TB) heap sizes. The size of the mask being tested and the amount of pre-shifting involved (if any) determine the possible granularity of protected page set selection for a relocation phase flip. E.g. selecting the entire (up to 2TB on x86) heap or an entire generation at the same time allows for a test-and-jump with no shifting at all (mask in place). Selecting sub-phases (e.g. relocating 1/8th or 1/16th of the heap at a time) can fit with 64 bit register masks using an extra shift. Higher granularity can be achieved with vector registers, and arbitrary granularity (down to a single page if wished for) could be achieved by testing against a memory (rather than a register) bitmask.
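As a rough illustration of the granularity trade-off, the sketch below shows one plausible reading of the shifted register-mask test and of the memory-bitmask test. The constants (a 2TB address range split into 16 regions, 2MB pages) and the function names are assumptions chosen for the example, not values taken from any actual implementation.

```python
HEAP_BITS = 41                 # 2 TB address range, the figure mentioned above
REGION_SHIFT = HEAP_BITS - 4   # 16 regions: address >> REGION_SHIFT is 0..15
PAGE_SHIFT = 21                # hypothetical 2 MB pages


def in_protected_region(address, region_mask):
    """Register-style test: one extra shift, then a bit test against a small mask."""
    return ((region_mask >> (address >> REGION_SHIFT)) & 1) == 1


def new_page_bitmap():
    """One bit per page; a bytearray stands in for the in-memory bitmask."""
    return bytearray(1 << (HEAP_BITS - PAGE_SHIFT - 3))


def protect_page(address, page_bitmap):
    """Mark the page containing `address` as protected."""
    page = address >> PAGE_SHIFT
    page_bitmap[page >> 3] |= 1 << (page & 7)


def in_protected_page(address, page_bitmap):
    """Memory-style test: index the bitmap by page number and test one bit."""
    page = address >> PAGE_SHIFT
    return ((page_bitmap[page >> 3] >> (page & 7)) & 1) == 1
```

For example, with region_mask = 1 << 3, any address whose top four heap bits equal 3 tests as protected with a shift and a register test, while the bitmap variant can single out one page at the cost of a memory load per test.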
I understand that if there is a JIT-injected load barrier always
present, this work of "fixing" the pointer can be done by the slow
path of the load barrier, but then (for what pertains to pointers)
protecting pages is not necessary; but then, why the GC handbook
I believe that the current edition of the handbook was finalized before they had access to the published C4 ISMM paper... The fact that some C4 algorithm implementations (like the ones we actually use in our products) *also* use virtual page mapping and virtual page protection for unrelated (as in not algorithmically necessary but otherwise useful) purposes, like error detection, additional optimization opportunities, and code stabilization, may have led to the C4 algorithm description being confused with some implementation concerns.
Finally, no matter how good the architecture and design are,
to deliver bug-free software with optimal performance and reliability,
the implementation technique must be flawless. Victoria Livschitz