RFR(L) 8244778 Archive full module graph in CDS
lois.foltan at oracle.com
Fri Aug 7 19:06:24 UTC 2020
Overall looks promising. I have some review comments below, but not a
complete review. I concentrated on the JVM side primarily how the
archived module graph is read in, how the ModuleEntry and PackageEntry
tables are created from the archive for the 3 builtin loaders and
potential timing issues during start up.
On 7/22/2020 3:36 PM, Ioi Lam wrote:
> Please review this patch that stores the full module graph in the CDS
> archive heap. This reduces the initialization time of the basic JVM by
> about 22%:
> $ perf stat -r 100 bin/java -version
> before: 98,219,329 instructions 0.03971 secs elapsed (+- 0.44%)
> after: 55,835,815 instructions 0.03109 secs elapsed (+- 0.65%)
>  Start with ModuleBootstrap.java. The current implementation is
> quite restrictive: the archived module graph is used only when no
> module options are specified.
> See ModuleBootstrap.mayUseArchivedBootLayer().
> We can probably support options such as main module and module path
> in a future RFE.
Yes, I noticed the restrictions. The JBS issue discusses the
possibility of using the dumped module graph in the event that the value
of the options are the same. I think for this implementation to be
viable and have a positive impact you should consider comparing at least
the options --add-modules, --add-exports and --add-reads.
>  In the current JDK implementation, there is no single object
> that represents "the module graph". Most of the information
> is stored in the archive bootLayer object, but a few additional
> restoration operations need to be performed:
> + See ModuleBootstrap.getArchivedBootLayer()
> + Some static fields need to be archived/restored in
> Module.java, BuiltinClassLoader.java, ClassLoaders.java
> and BootLoader.java
>  I ran into a complication with two loader instances of
> PlatformClassLoader and AppClassLoader. They are stored in
> multiple tables inside the module graph (e.g.,
> BuiltinClassLoader$LoadedModule) so I cannot easily recreate
> them at runtime.
> However, these two loaders contain information specific to the
> dump time VM lifecycle (such as the classes that were loaded
> during CDS dumping) that need to be scrubbed. I couldn't find an
> elegant way of doing this, so I added a private "resetArchivedStates"
> method to a few classes. They are called inside
>  Related native data structures (PackageEntry and ModuleEntry)
> are also archived. Start with classLoaderData.cpp
> Passes mach5 tiers 1-4. I will test with additional tiers.
> - Ioi
- line #1644 pulling the javabase_moduleEntry() check outside of the
Module_lock maybe problematic if after you check it another
thread initializes in the time between the check and the obtaining of
the Module_lock on line #1645.
- somewhere in ArchivedClassLoaderData there should be an assert to make
sure that the CLD, whose package entries and module entries are
being archived is not a "_has_class_mirror_holder" CLD for a weakly
defined hidden class. Those dedicated CLDs should never have
package entries or module entries.
- line #400, typo "conver" --> "convert"
- line #500, maybe sort if n is greater than 1 instead of 0 (same
comment for packageEntry.cpp line #270)
- It looks like the PackageEntry and ModuleEntry tables for the system
and platform class loaders are added within
SystemDictionary::compute_java_loaders() which is called from
Thread::create_vm() but after the call in Thread::create_vm() to
call_initPhase2 which is when Universe::_module_initialized is set to
true. Did I read this correctly? If yes, then I think you have to be
sure that Universe::_module_initialized is not set until the module
graph for the 3 builtin loaders is loaded from the archive.
- line #237 typo "modue" --> "module"
- line #2545, comment "TO DO -- point it to the archived PackageEntry
are you thinking that since the module graph is read in ahead of time
that it can be set when an InstanceKlass is created? There is a point
before java.base is defined that InstanceKlass::set_package takes into
account that could be a timing issue.
- Since the CDS support for capturing the module graph does not archive
unnamed modules, I assume Modules::set_bootloader_unnamed_module() is
still called. Is this true or does the unnamed module for the boot
loader get archived?
- There are some checks in modules.cpp that are valuable to have during
bootstrapping. For example, packages that are encountered during
bootstrapping before java.base is defined are put into the java.base
module but then verified after java.base is defined via
verify_javabase_packages. So at what point in the bootstrapping process
does the boot loader's archived module's become known? Could classes
have been defined to the VM before this point? Then their packages will
have to be verified to be sure that they are indeed packages within
java.base. Consider looking at other checks that occur within
classfile/modules.cpp as well.
I may have more review comments as I continue to look at this!
More information about the core-libs-dev