As a customer, I can say: please, no training runs! In my opinion,
they would make an already complicated system even more complicated,
for what I expect would be a marginal gain.

Something you may want to consider is allowing applications an
interface into the GC subsystem (I've brought this up on this list
before as well). In addition to the interface methods I suggested
previously, you could add:

  public void startingApplicationPreloadingPhase();

  public void startingApplicationSteadyStatePhase();

Applications with distinct preloading and steady-state phases can then
tell the GC subsystem which phase they are in, which allows you to
start "training" the GC only once steady state is reached, without
any guesswork on the part of the GC subsystem.
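
To make the idea concrete, here is a rough sketch of how an application
might use such hooks. Only the two method names come from the proposal
above. The interface name GcPhaseHints, the RecordingGcHints stand-in,
and the CacheServer class are all hypothetical, invented purely for
illustration; no such API exists in the JDK today.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical interface; only the two method names are from the proposal.
interface GcPhaseHints {
    void startingApplicationPreloadingPhase();
    void startingApplicationSteadyStatePhase();
}

// Stand-in for the GC side: it only records the last phase reported.
// A real collector would use this to decide when to begin "training".
final class RecordingGcHints implements GcPhaseHints {
    enum Phase { UNKNOWN, PRELOADING, STEADY_STATE }

    private final AtomicReference<Phase> phase =
            new AtomicReference<>(Phase.UNKNOWN);

    @Override
    public void startingApplicationPreloadingPhase() {
        phase.set(Phase.PRELOADING);
    }

    @Override
    public void startingApplicationSteadyStatePhase() {
        phase.set(Phase.STEADY_STATE);
    }

    Phase currentPhase() {
        return phase.get();
    }

    // Train only after the application has said it is at steady state.
    boolean shouldTrain() {
        return phase.get() == Phase.STEADY_STATE;
    }
}

// Hypothetical application with a preload step followed by normal service.
final class CacheServer {
    private final GcPhaseHints gc;
    private final Map<Integer, String> cache = new HashMap<>();

    CacheServer(GcPhaseHints gc) {
        this.gc = gc;
    }

    void start(int entriesToPreload) {
        gc.startingApplicationPreloadingPhase();
        for (int i = 0; i < entriesToPreload; i++) {
            cache.put(i, "entry-" + i); // bulk allocation, atypical of steady state
        }
        gc.startingApplicationSteadyStatePhase();
        // ... normal request handling would begin here ...
    }

    int cachedEntries() {
        return cache.size();
    }
}
```

In this sketch an application that never calls either method leaves
the phase at UNKNOWN, so the stand-in never asks for training.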

Any applications not using those functions will simply run as they do

