RFR - 8132734: java.util.jar.* changes to support multi-release jar files
steve.drach at oracle.com
Wed Feb 10 01:04:01 UTC 2016
Yet another webrev, http://cr.openjdk.java.net/~sdrach/8132734/webrev.06/ <http://cr.openjdk.java.net/~sdrach/8132734/webrev.06/>, with a change to JarEntryIterator to fix a problem discovered by performance tests — calling hasNext() twice as often as needed. I also removed the @since 9 tags from the methods entries() and stream(), and added an additional sentence to the spec for each of those methods that clarifies what a base entry is (actually is not).
> I think having stream and entries do this is right although I would like to see some performance data if possible.
See http://cr.openjdk.java.net/~sdrach/8132734/JarFile%20Performance.pdf <http://cr.openjdk.java.net/~sdrach/8132734/JarFile%20Performance.pdf>
I used JMH to run the benchmark. See http://cr.openjdk.java.net/~sdrach/8132734/MyBenchmark.java <http://cr.openjdk.java.net/~sdrach/8132734/MyBenchmark.java>. I used two jar files, the rt.jar file from JDK 8 that has 20653 entries and the multi-release.jar found in the test directory with 14 entries. Obviously rt.jar is not a multi-release jar file.
The first two tables (1 and 2) are comparable and the second two tables are somewhat comparable (3 and 4).
Tables 1 and 2 have 4 sections that show the results of tests on the two jar files in 4 configurations of JarFile. The tests were done with a JarFile object constructed without the Release object argument, essentially the legacy constructor. The section labeled "JDK 8 JarFile” was done with JDK 8u66. The section labeled “JDK 9 JarFile” was done with the latest build of openjdk/jdk9/dev without any changes in my 8132734 changeset. I chose this section as the reference, so the last column shows the values normalized to 1 micro/milli second per operation (rt.jar times are in milliseconds and multi-release.jar times are in microseconds). It should be obvious that JDK 9 is much faster than JDK 8, somewhere on the order of 5-6 times faster. I think that is because ZipFile doesn’t use JNI in JDK 9.
Of the two remaining sections in Tables 1 and 2, the section labeled “MultiRelease JarFile” differs from the section labeled “MultiRelease JarFile, new iterator” only in the JarEntryIterator class. The first one uses the original iterator in JarFile.java that can be seen starting with line 551 of webrev.04 <http://cr.openjdk.java.net/~sdrach/8132734/webrev.04/>, and the new iterator starts with line 553 of webrev.06 <http://cr.openjdk.java.net/~sdrach/8132734/webrev.06/>. The results are strange. The new, more complex, iterator appears to be faster than the old, simpler, iterator. I double, and triple, checked it, but it was always faster. I used jitwatch to look at the hotspot logs generated during compilation and neither method was compiled. I suppose I could dig into it further but decided not to. Consider it good news. The results do show that the multi-release enhancement slows JarFile entries/stream down by 2-18% depending on the size of the jar file. But they are still much better than the JDK 8 values.
> Also I would expect that if a JAR file is not mult-release but the library opens it with Release.RUNTIME to perform the same as opening the JAR file with the Release-less constructors.
The results in Table 3 attempts to answer this question since rt.jar is not a multi-release jar file. This tells me that if one opens the JarFile with Release.RUNTIME, that there is a performance penalty of 2-6% on this very large jar file.
Finally, the results in Table 4 tell me that processing a true multi-release jar file takes about 80% more time per entries() or stream() operation. I’ve looked at this in a profiler and there is no particular area that stands out to me, it’s just more complicated to process a multi-release jar file, as would be expected.
> I think the javadoc will need to also need to make it clear whether entries with names starting with META-INF/versions/ are returned.
> I see you've added @since 9 to the existing methods, I assume you didn't mean to do this.
> At some point then we need to discuss how RUNTIME_VERSION is computed. Iris (via Mandy) has pushed jdk.Version to jdk9/dev but having it exported by java.base conflicts with the design principles in JEP 200. Moving it to another module means that code in java.base cannot use it and thus the JAR file can't use it.
Left as is.
More information about the core-libs-dev