RFR JDK-8185582, Update Zip implementation to use Cleaner, not finalizers

Peter Levart peter.levart at gmail.com
Wed Nov 1 12:04:31 UTC 2017

Hi Sherman,

On 11/01/17 00:25, Xueming Shen wrote:
> Hi Peter,
> After tried couple implementations it seems the most reasonable 
> approach is to
> use the coding pattern you suggested to move all pieces into ZSStream 
> Ref. Given
> we are already using the internal API CleanerFactory it is attractive 
> to go a little
> further to subclass PhantomCleanable directly (in which I would assume 
> we can
> save one extra object), but I think I'd better to follow the 
> "suggested" clean usage
> (to register a runnable into the cleaner) for now.
>   39 class ZStreamRef implements Runnable {
>   40
>   41     private LongConsumer end;
>   42     private volatile long address;
>   43     private final Cleanable cleanable;
>   44
>   45     ZStreamRef (Object owner, LongSupplier addr, LongConsumer end) {
>   46         this.cleanable = CleanerFactory.cleaner().register(owner, 
> this);
>   47         this.end = end;
>   48         this.address = addr.getAsLong();
>   49     }
>   50
> Similar change has been made for the ZipFile cleaner to follow the 
> same coding
> pattern. The "cleaner" is now renamed from Releaser to CleanableResource.
> http://cr.openjdk.java.net/~sherman/8185582/webrev/
> Thanks,
> Sherman

This looks mostly fine. One minor nit is that ZStreamRef.address field 
does not have to be volatile. I checked all usages of 
ZStreamRef.address() and all of them invoke it while holding a lock on 
the ZStreamRef instance. The ZStreamRef.run() which modifies the address 
is also synchronized. The other minor nit is that the following ZipFile 
imports are not needed:

import java.nio.file.Path;
import java.util.Map;
import jdk.internal.misc.JavaIORandomAccessFileAccess;
import static java.util.zip.ZipConstants.*;

(at least my IDEA colors them unused)

Cleaner is modified in this webrev (just one empty line deleted) - 
better not touch this file in this changeset

Additional nits in ZipFile:

- in lines 355, 356 you pull out two res fields into locals, but then 
not use them consistently (lines 372, 389)
- line 403 has a TAB character (is this OK?) and shows incorrectly 
indented in my editor (should I set tab stops differently?)
- line 457 - while should actually be if ?
- in ensureOpen() the check for res.zsrc == null can never succeed 
(CleanableResource.zsrc is a final field. If CleanableResource 
constructor fails, there's no res object and there's no ZipFile object 
either as ZipFile constructor does not do anything for this to escape 
- why don't you let IOExceptions thrown from CleanableResource.run() 
propagate out of ZipFile.close() ?

I would also rename static method Source.close(Source) to 
Source.release(Source) so it would not clash with instance method 
Source.close() which makes it ambiguous when one wants to use 
Source::close method reference (both methods apply). I would also make 
static methods Source.get() and Source.release(Source) package-private 
(currently one is public and the other is private, which needs compiler 
bridges to be invoked) and both are in a private nested class.

Inflater/Deflater/ZipFile now follow the coding pattern as suggested. 
But ZipFileInflaterInputStream still does not. It's not critical since 
failing to register cleanup which releases the inflater back into cache 
would simply mean that Inflater employs its own cleanup and ends itself.

And now another thing I would like to discuss. Why an initiative for 
using Cleaner instead of finalization()? Among drawbacks finalization 
has one of the more troubling is that the tracked referent survives the 
GC cycle that initiates its finalization reference processing pipeline, 
so the GC may reclaim the object (and referenced objects) only after the 
finalize() has finished in yet another GC round. Cleaner API separates 
the tracked object from the cleanup function and state needed to perform 
it into distinct instances. The tracked object can be reclaimed and the 
cleanup reference processing pipeline initiated in the same GC cycle. 
More heap may be reclaimed earlier.

Unless we are careful and create a cleaning function for one tracked 
object which captures (directly or indirectly) another object which 
registers its own cleaning function but we don't deal with explicit 
cleaning of the 2nd object in the 1st cleaning function.

Take for example the ZipFileInflaterInputStream's cleaning function. It 
captures the ZipFile in order to invoke ZipFile.releaseInflater instance 
method. What this means is that ZipFile will be kept reachable until all 
ZipFileInflaterInputStream's cleaning functions have fired. So we are 
back to the finalization drawback which needs at least 2 GC cycles to 
collect and clean-up what might be done in one go.

I suggest moving the getInflater() and releaseInflater() from ZipFile 
into the CleanableResource so that ZipFileInflaterInputStream's cleaning 
function captures just the CleanableResource and not the ZipFile. 
ZipFile therefore may become eligible for cleanup as soon as all opened 
input streams become eligible (but their cleaning functions need not 
have fired yet). CleanableResource.run() (the ZipFile cleaning function) 
and CleanableResource.releaseInflater() (the 
ZipFileInflaterInputStream's cleaning function) may therefore be invoked 
in arbitrary order (maybe even concurrently if one of them is explicit 
cleanup and the other is automatic), so code must be prepared for that.

I have tried to capture all above in a modified webrev (taking your last 
webrev as a base):


Regards, Peter

More information about the core-libs-dev mailing list