RFC: 8200559: Java agents doing instrumentation need a means to define auxiliary classes

Rafael Winterhalter rafael.wth at gmail.com
Sun Apr 18 20:28:23 UTC 2021

'legitimate' is probably a poor choice of word. What I mean to express is
that libraries, other than agents, are able to address class loaders
explicitly in their APIs and can often expect to share a hierarchy with the
code they interact with. Using method handles and the defineClass method,
you can also require appropriate arguments or that modules are opened to
the library. I would not know of a use case that I was not able to cover
with the existing API, despite developing some of the more 'low level'
libraries for the JVM.

With agents on the other hand, there is no API that your targeted
application can be made aware of. You cannot require a lookup to be
provided to you, as an agent is supplementary and is not known to the
application. Furthermore, an agent cannot make assumptions, for example
about class loaders and it cannot require much from the application, as
they are normally developed separately and users often want to apply them
to a range of applications without rewriting their code. Anything else
would defeat their purpose. This often requires to establish some sort of
dispatcher logic across class loaders, which again requires injection. I do
not see this to be the case for a regular library. I do however believe
that this is already understood, I assume that is why the Instrumentation
API offers a way of injecting code into the boot and system loaders. My
point is that these APIs are dated as they cannot accept a byte array but
require writing to the file system and since they only allow targeting two
well-known class loaders.

As by 'we can do this anyway', I explicitly mean that it is possible to
emulate the API that I propose already today, without using unofficial
APIs. I have previously prototyped emulations, just to measure performance
impact which is normally inacceptable. If you wanted to define a class, you
can for example retransform a carrier class and inject classes from a known
method which you add to the constant pool as a base64 encoding:

class Carrier {
  void m() { }

can for example be retransformed into:

class Carrier {
  void m() {
    if (someGuard()) // some condition and maybe synchronization
      MethodHandles.lookup().defineClass("<base 64 class file>".getBytes());

This would define the class in question anytime a "trigger method" is
invoked within a well-known pattern. Also, I already showed how you could
emulate general injection by offering a package-local injection using the
API that was proposed as an alternative in the previous mail. The real
difference is in the end of course convenience and performance, but I do
not find it intuitive to not offer a performant, reliable API if there is a
well founded requirement for it. As someone who is working with Java in the
enterprise as a day job, I cannot stress out enough how important the
tooling is for Java's use in the enterprise. Static and dynamic Java
agents, APM and security monitoring, are deployed in every production
environment I come across. And this tooling needs a way of class injection
as I outlined it.

Am So., 18. Apr. 2021 um 12:30 Uhr schrieb Ron Pressler <
ron.pressler at oracle.com>:

> Hi Rafael!
> The use cases you mention might be legitimate, but why are they legitimate
> *for
> agents*? For example, why can’t the Replacement class be defined in a
> regular library,
> not injected by the agent?
> As to the existence of Unsafe hacks and others, those are going away, so
> “we can do
> this anyway,” is not an excuse. I think library authors need to accept
> that things
> they’re able to achieve with various hacks might not be possible to do as
> conveniently soon. The security and ecosystem maintenance philosophy is
> that certain
> things are allowed only when the *application* explicitly gives its
> consent by signifying
> it somehow on the command line. Libraries will not be able to take such
> permissions for
> themselves, and “the application gives its consent by choosing my library”
> is insufficient.
> Yes, it will be less convenient, but convenience is not the only important
> concern.
> At this point, that capabilities that might compromise security or
> upgradability (i.e
> reliance on non-API elements) might require explicit consent on the
> command line, even
> when this might be less convenient, should be taken as an axiom for the
> discussion.
> — Ron
> > On 16 Apr 2021, at 21:18, Rafael Winterhalter <rafael.wth at gmail.com>
> wrote:
> >
> > I am trying to revive issue 8200559 (
> > https://bugs.openjdk.java.net/browse/JDK-8200559) which was briefly
> > discussed on this mailing list over three years ago (
> > http://mail.openjdk.java.net/pipermail/jdk-dev/2018-January/000405.html).
> I
> > am hopeful that a solution could be reached prior to releasing Java 17.
> > With the LTS label, JVMs of that version are likely to be seen for quite
> > some time and without a fix, many Java agents will still rely on unsafe
> > to support that version. With the issue resolved, agents that aim to
> > support Java 17 as a baseline could be implemented using only official
> > which I consider a desirable milestone. This would also allow me to
> remove
> > use of unsafe API from a range of popular open source projects that I
> > maintain once Java 17 becomes their baseline. I have proactively added a
> > pull request (https://github.com/openjdk/jdk/pull/3546) with what I
> > consider to be the most complete and satisfactory solution for agent
> > developers. I implemented this patch to test potential migration for my
> own
> > and customer projects and can confirm that this would indeed make any
> > unsafe API obsolete.
> >
> > I had already lined out why APIs such as
> MethodHandles.Lookup::defineClass
> > do not work for ClassFileTransformers in my original posting. I will try
> to
> > avoid reiterating on my original argument. As a result of this
> discussion,
> > an API was proposed where the ClassFileTransformer would be overloaded
> with
> > an additional parameter that receives an instance of ClassDefiner as its
> > argument. The instance would allow you to define a class but assert that
> > the auxiliary class is located in the same package as the class under
> > transformation. For example, if a class:
> >
> > package p;
> > class C {
> >  void m(A a) {
> >    a.callback(new Value());
> >  }
> > }
> >
> > would be instrumented to become:
> >
> > package p;
> > class C {
> >  void m(A a) {
> >    a.callback(new Replacement()); // changed
> >  }
> > }
> >
> > then ClassDefiner would assert that the auxiliary class Replacement is
> also
> > defined in package 'p'. In my eyes, this is not a very satisfying
> solution.
> > In some cases, a Java agent is for example using existing classes to
> carry
> > additional state between two instrumented classes. For instance, class
> > Replacement might be used as a carrier of additional state to pass to
> class
> > A which is itself instrumented as follows:
> >
> > package p2;
> > class A {
> >  void callback(Value v) {
> >    if (v instanceof Replacement) {
> >      // read data of v and process
> >    }
> >  }
> > }
> >
> > This example is far from constructed, but a common and effective strategy
> > to implement tracing, security monitoring or code intelligence.
> >
> > It is of course possible to implement this pattern by using the
> > package-restricted ClassDefiner. It is however rather costly and
> > complicated to implement. A ClassFileTransformer is a simple callback
> that
> > is not in control of the class loading order but is only invoked
> whenever a
> > class is loaded for the first time. The first class to be loaded might of
> > course be either C or A, it might not even be deterministic for the same
> > application. If the class Replacement was required to live in the package
> > of the class that is currently instrumented, Replacement would need to be
> > defined in either p or p2, depending on the materialized class loading
> > order. This would require some form of state management in the class file
> > transformer to keep track of the package that one would need to use when
> > instrumenting the second class. Due to parallel class loading and with
> > constellations that involve even more classes, this can get quite
> > complicated. Using a stable package name is however not only a
> convenience.
> > The name of a package can affect reflection-heavy frameworks, debugging
> > opportunities, logs, stacktraces and more.
> >
> > Furthermore, one should consider the costs of migrating an existing agent
> > in the light that an agent can easily open jdk.internal.misc.Unsafe by
> > using the instrumentation API or use a class file transformer to erase
> the
> > explicit package assertion of ClassDefiner. Current agents often use
> > sun.misc.Unsafe or jdk.internal.misc.Unsafe to invoke the defineClass
> > method of these types. This API is very performant and easy to use. To
> some
> > degree, it is even possible to emulate the Unsafe API using ClassDefiner
> by
> > strategically retransforming classes in the right package. The following
> > code allows to define a class using a witness class by using a
> > non-operational class file transformer:
> >
> > static Class<?> defineClass(Instrumentation inst, final Class<?> witness,
> > final byte[] classFile) throws Exception {
> >  AtomicReference<Class<?>> ref = new AtomicReference<>();
> >  ClassFileTransformer definer = new ClassFileTransformer() {
> >    @Override
> >    public byte[] transform(ClassDefiner classDefiner,
> >                            Module module,
> >                            ClassLoader loader,
> >                            String className,
> >                            Class<?> classBeingRedefined,
> >                            ProtectionDomain protectionDomain,
> >                            byte[] classfileBuffer) {
> >      if (classBeingRedefined == witness) {
> >        ref.set(classDefiner.define(classFile));
> >      }
> >      return null;
> >    }
> >  };
> >  inst.addTransformer(definer, true);
> >  try {
> >    inst.retransformClasses(witness);
> >  } finally {
> >    inst.removeTransformer(definer);
> >  }
> >  return ref.get();
> > }
> >
> > In this sense, Java agents already have the possibility to define new
> > classes within any package. The restriction that ClassDefiner introduces
> is
> > therefore merely cosmetic and does not really reduce the actual
> > capabilities of any owner of an instance of Instrumentation. Beyond that,
> > JNI already offers a similar API for defining a class which is also used
> as
> > there are wrappers that compile the method for any common platform and
> > architecture.
> >
> > Finally, many Java agents need to define general infrastructure classes
> to
> > function. A tracing framework might for example want to be notified of
> all
> > HTTP calls of known frameworks. To accomplish this, some form of common
> > callback is injected into a generally known class loader, often the
> > globally visible bootstrap loader:
> >
> > package com.acme.tracer;
> > public class AgentCallback {
> >  public static void callback(String framework) { ... }
> > }
> >
> > This is typically done before registering a first ClassFileTransformer
> > which is later instrumenting framework classes to invoke this callback,
> > none of which is defined by the boot loader which is only used to assure
> > that the counter can be seen from all application classes and the agent
> > framework alike. At the moment,
> > Instrumentation::appendToBootstrapClassLoaderSearch is the most feasible
> > approach to this. However, the API also has significant downsides. First
> of
> > all, it is of course restricted to the boot or system loader by the way
> the
> > API is set up. In some scenarios, another class loader would however be
> > more appropriate, for example if a JVM is segmented as in a JEE container
> > where the callback should only apply to a specific deployment area and
> > might therefore better be placed in a different class loader that is root
> > to the deployment. Additionally, it is rather inefficient to serialize
> > these auxiliary classes to the file system, only to pass it to the API
> > which then reads them back into memory to load the contained classes. In
> > some environments, a file system for such purposes is not even available.
> >
> > For all these reasons, I still believe that adding
> > Instrumentation::defineClass is the best solution to address all of these
> > legitimate needs. Java agents have created excellent tooling for the JVM
> > which make the platform successful in the enterprise. In my eyes it is
> > overdue that these legitimate use cases can start relying on a stable
> API.
> > Going this route would offer an easy path for migration and agents that
> > supported older versions than Java 17 could easily replace their usage of
> > Unsafe by this similar API which promises a quick adoption.
> Alternatively,
> > I am afraid many will find ways around, even by emulating unsafe using
> > official API as I demonstrated, the cost being decreased performance and
> > convenience.
> >
> > Thank you for your comments. I really hope this can be addressed before
> 17
> > is released, I am hopeful that this would make the use of unsafe API in
> > agents disappear sooner rather than later.
> >
> > Best regards, Rafael

More information about the core-libs-dev mailing list