Looking ahead: proposed Hg forest consolidation for JDK 10
joe.darcy at oracle.com
Tue Oct 11 17:30:47 UTC 2016
On 10/11/2016 2:30 AM, Lindenmaier, Goetz wrote:
> I see several problems with this approach.
> 1.) Mercurial already has problems scaling with the current repositories.
> This will get worse with bigger repos. E.g. 'hg diff' takes
> 14 secs on jdk, but only 2 secs on jaxp:
> jdk: ~90000 files, 15000 changes, hg diff takes 14 secs
> jaxp: ~12000 files, 1000 changes, hg diff takes 2 secs
By its nature, hg diff needs to walk the directory tree, so a bigger tree
will generally be slower. Doing a diff on a particular subdirectory, say
for hotspot, should have performance comparable to today's.
The fsmonitor extension,
https://www.mercurial-scm.org/wiki/FsMonitorExtension, could help in
this case too.
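
As a rough illustration of why restricting the walk to a subdirectory helps, here is a minimal Python sketch that counts the entries a tree walk has to visit. It is not how Mercurial implements status or diff; the function name count_files and the directory names in the usage note are assumptions made for the example.

```python
import os


def count_files(root):
    """Count non-directory entries under root, skipping any .hg directory."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune .hg in place so os.walk does not descend into repo metadata.
        dirnames[:] = [d for d in dirnames if d != ".hg"]
        total += len(filenames)
    return total
```

Calling count_files on the root of a consolidated forest and then on its hotspot subdirectory would show how much smaller the walk becomes when a command is limited to one subtree.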
> 2.) Cloning the repo does not scale.
> Cloning the root repo and calling get_source.sh takes 20 min.
> I usually only clone the root repo and hotspot. This only
> takes 3 min.
> I don't think merging the repos might improve the 20 mins.
> On the contrary, as cloning the jdk repo takes most of the time,
> and the others run in parallel, cloning an even bigger repo
> will be slower.
> Alternatively, one could hold a 'master' repo and replicate that
> by local copy. But this shows similar timings (1:40 vs. 9min).
We've discussed this kind of use-case internally as well. The
recommendation is to have a designated local master and then do local
clones of that. On a Unix system, if the local clones are on the same
filesystem, hard links are used with a copy-on-write policy, so the
clones are space-efficient and time-efficient to create. The local clone
times we've seen are about 30 seconds in that case.
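
The filesystem mechanism behind this can be sketched in a few lines of Python. This is only an illustration of hard links plus a copy-on-write style update, assuming a Unix filesystem; it is not Mercurial's actual clone code, and the helper names are hypothetical.

```python
import os


def hardlink_clone(src_file, dst_file):
    """Create dst_file as a second name for the data in src_file (no copy)."""
    os.link(src_file, dst_file)


def shares_storage(a, b):
    """Return True if both paths refer to the same underlying file."""
    return os.path.samefile(a, b)


def write_with_cow(path, data):
    """Update path without touching other links to its old contents.

    The new contents go to a temporary file, which then replaces the
    directory entry for path. Other hard links keep the old data.
    """
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        f.write(data)
    os.replace(tmp, path)
```

Until one side is modified, the two names cost no extra data blocks, which is why a local clone on the same filesystem can be both fast and small.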
> 3.) Having to clone the full repos will require considerably more
> disk space.
> I'm working on various issues in hotspot and keep them separated
> by doing this in individual repositories that only contain hotspot.
> These repos will require considerably more space.
If disk space is a concern, you can use mq or bookmarks against a single
> 4.) There will be additional merges because changes that are now done
> in two repos will then be done in a single repo. If I then sync back
> a few hotspot changes, a lot of files in the other subdirectories
> will get touched. This slows down sync and causes rebuilds.
> Sure this might just be what is intended, but currently I don't
> need to rebuild jdk etc. very often.
While hotspot and the rest of the JDK can often be treated as
approximately independent, they are not truly independent.
> 5.) It will get harder to monitor submitted changes that are relevant
> for a specific area. E.g., I might only want to see changes in hotspot.
> In the web frontend, you can not browse changes on subdirectory basis.
> Maybe this can be solved, as the commandline 'hg log' etc. already support
We don't have plans to change the Hg web UI so I think a command line
solution would be appropriate here.
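
One possible shape for such a command line solution, as a hedged Python sketch: hg log accepts a file or directory argument and a --limit option, so a small wrapper can list recent changesets touching one subdirectory. The function names, the default limit, and the "hotspot" path in the usage note are assumptions for illustration only.

```python
import subprocess


def hg_log_command(subdir, limit=10):
    """Build the argv for listing recent changesets that touch subdir."""
    return ["hg", "log", "--limit", str(limit), subdir]


def recent_changes(repo_root, subdir, limit=10):
    """Run hg log in repo_root and return its text output.

    Requires hg on the PATH and repo_root to be a Mercurial working copy.
    """
    result = subprocess.run(
        hg_log_command(subdir, limit),
        cwd=repo_root,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

For example, recent_changes("/path/to/clone", "hotspot", limit=5) would return the log text for the five most recent changesets that touched files under hotspot in that clone.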
> 6.) A single repo will simplify making combined changes. So there will be
> more of these. But combined changes complicate handling of our licensed
> In our activities as licensee, we are consuming hotspot change-wise.
> This is because we modified a lot in hotspot, and merging hotspot
> changes step by step simplifies the merging.
> On the other side, we consume the changes to jdk etc. as chunks.
> This is because we changed much less in these directories so
> that merging causes less problems. Also, there are much more
> changes and we don't have the manpower to consume them change-wise.
> Having combined changes requires more synchronization between
> the two merging tasks. It's already an increasing effort in
> Also, to follow these two different merging approaches for hotspot
> and the rest, we would have to first split the single repo into
> two parts.
> Comments to the JEP:
> I appreciate that the change history is kept as it makes research
> in old changes more easy. On the other side, dropping the history
> might speed up handling of the new repo.
We are aware that Facebook has developed Hg plugins to allow shallow
clones, i.e. clones without all the history, but we haven't investigated
using them yet.
> I also appreciate the changes in directory layout. If the
> repos are merged, this should be done this way.
> We find it difficult to keep the jtreg runner in sync with our
> current version of jdk9, especially as we have two of them (We
> test openJdk and SAP JVM 9, and within SAP JVM 9 hotspot and
> jdk often differ in a few builds.)
> I would appreciate if the runner could be included in the
> root/test directory.
I'm not quite sure what you are referring to by the jtreg runner.