RFR 6913047: SunPKCS11 memory leak

Martin Balao mbalao at redhat.com
Wed Oct 11 13:31:30 UTC 2017


Hi,

I'd like to propose a fix for bug JDK-6913047: "Long term memory leak when
using PKCS11 and JCE exceeds 32 bit process address space" [1]. This fix
does not contain changes in the GC and is SunPKCS11 internal only.

PROBLEM
........................................................

When using the SunPKCS11 crypto provider (for cipher, signature, mac, key
generation or any other operation), multiple key objects may be created.
I.e.: every time a TLS session is established, a unique master key (derived
from the pre-master-key) has to be created and then used for encryption and
decryption operations. This is a legitimate use case in which key caching
does not make sense as each key is unique per session. These keys are of
P11Key type and have a corresponding native key object created. In the case
of NSS SunPKCS11 backend (PKCS11 software token), this native key object is
temporarily stored in the process native heap. The interface is simple: a
JNI call is done to create a native key object (C_CreateObject,
C_CopyObject, C_DeriveKey, C_GenerateKeys, etc., according to the PKCS11
interface) and an integer handler is kept in the Java side (P11Key). When
the P11Key object is destroyed, a finalizer code is executed to free the
native key object (through C_DestroyObject JNI call). The problem is that
finalizer code execution happens only if the JVM garbage-collector cleans
up the P11Key object. That may be delayed or not done at all, depending on
different GC algorithms, parameters and environment conditions. As a
result, the native heap may be exhausted with not freed native key objects,
and the JVM will then crash -this is particularly true for 32 bits VMs
where the virtual address space can be exhausted-.


SCOPE
........................................................

The fix is proposed for SunPKCS11 with NSS backend only. Other PKCS11
backends are not currently under scope. It's likely that hardware PKCS11
backends store native key objects in their own memory, preventing a native
heap exhaustion and a JVM crash. However, it might be possible to cause an
exhaustion on their own memory blocking key objects creation at some point.
In any case, this is speculative as no tests were done on our side with
real hardware.


SOLUTION
........................................................

Assuming that native keys are extractable, the idea is to hold native key
data in the Java heap while keys are not in use. When a P11Key is created,
every CK_ATTRIBUTE (PKCS11) value for the native key is queried, data
stored in an opaque Java byte[] (inside the P11Key object) and native key
destroyed. Every time the P11Key is about to be used, the native key is
created with the stored data. After usage, the native key is again
destroyed. Thus, it's not necessary to wait for a finalizer execution to
cleanup native resources: cleanup is done at deterministic and
previously-known points. This comes with a resposibility for key users
-which are all SunPKCS11 internal services like P11Signature, P11Cipher,
P11KeyGenerator, etc.-: create and destroy native keys through a reference
counting scheme exposed by P11Key class. There are two kind of usages:

 1) stateless: the native key is "atomically" created, used and destroyed.
I.e.: MAC calculation, getEncodedInternal operation (on P11Key objects),
signature operations, TLS key derivation, etc.

 2) statefull: the native key is created, one or multiple intermediate
actions are performed by the key user, a final action is performed and
finally the native key is destroyed. I.e.: cipher operations.

For keys that are extractable but sensitive (CKA_SENSITIVE attribute is
true), as the case when operating in FIPS mode, wrapping/unwrapping is used
as a workaround to extract session keys. Wrapper key is global and lives
forever.

There are no interface changes for SunPKCS11 external users.

If keys are not extractable or the feature cannot be enabled for any other
reason, the previous finalizer scheme is used as a fallback.


ADDITIONAL IMPLEMENTATION NOTES
........................................................

When a P11Key is created, a constructor parameter exists to indicate if the
feature is enabled for that specific key. For this feature to be enabled, 2
additional conditions apply: 1) SunPKCS11 backend has to be NSS, and 2) key
has to be extractable. If the feature is not enabled for a key, behavior is
as previous to this patch (native key destruction relies on finalizer
execution).

The only P11Key user that explicitly does not use this feature is
P11KeyStore. This is because these keys (token keys) are managed by alias
names and makes no sense to remove them from the key store (they need to be
accessible by an alias at any time).

Because P11Key objects can be used by multiple threads at a time, there is
a thread-safe reference counting scheme in order to decide when a native
key object has to be created or destroyed. The SunPKCS11 internal API to
use a P11Key is as follows: 1) increment the reference counter (which will
eventually create the native key object if it doesn't exist), 2) use the
key and 3) decrement the reference counter (which will eventually destroy
the native key if there it's not being used by anyone else).

The reason why an opaque byte[] is used in P11Key objects to store native
keys data (instead of a CK_ATTRIBUTE[] Java objects, queried by Java's
C_GetAttributeValue function) is performance. My prototypes show a
difference of 4x in speed. 2 functions were added to libj2pkcs11 library:
getNativeKeyInfo (to extract the opaque byte[] from a native key object)
and createNativeKey (to create a native key object from an opaque byte[]).


CHANGESET
........................................................

This changeset is JDK-10 (at jdk c8796a577885 rev) based:

 *
http://people.redhat.com/mbalaoal/webrevs/jdk_6913047_sunpkcs11_nss_memory_leak/2017_10_06/6913047.webrev.04/
(browse online)
 *
http://people.redhat.com/mbalaoal/webrevs/jdk_6913047_sunpkcs11_nss_memory_leak/2017_10_06/6913047.webrev.04.zip
(download)


TESTING
........................................................

Test suite for 32 bits JVMs only:
http://people.redhat.com/mbalaoal/webrevs/jdk_6913047_sunpkcs11_nss_memory_leak/2017_10_06/Bug6913047.java

 * Suite (Bug6913047.java)
  * Tests JVM memory exhaustion while using keys for different services:
P11Cipher, P11Signature, P11KeyAgreement, P11Mac, P11Digest,
P11KeyGenerator, P11KeyFactory, etc.
  * Tests functional regression.
   * Including Key Stores (P11KeyStore)

Parameters to run the reproducer (on JDK-10):
 * javac: --add-modules jdk.crypto.cryptoki --add-exports
java.base/sun.security.internal.spec=ALL-UNNAMED
 * java: -XX:+UseParallelGC -Xmx3500m --add-modules jdk.crypto.cryptoki
--add-opens java.base/javax.crypto=ALL-UNNAMED --add-opens
jdk.crypto.cryptoki/sun.security.pkcs11=ALL-UNNAMED

You can also use jtreg.


PERFORMANCE
........................................................

For a quick reproducer previously developed (which looped 100000 times
creating P11Cipher and P11Key objects to encrypt a plaintext), these are
the figures I got:

 * real 1m11.328s (without fix)
 * real 1m12.795s (with fix)

Performance penalty seems to be low in current state.

OTHER
.......................................................

My employer has an OCA agreement with Oracle and this work has been done in
that context.

Look forward to your comments.

Kind regards,
Martin.-

--
[1] - https://bugs.openjdk.java.net/browse/JDK-6913047
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/security-dev/attachments/20171011/bc9af9a8/attachment.html>


More information about the security-dev mailing list