ZipFileSystem performance regression
xueming.shen at gmail.com
Tue Apr 16 19:44:01 UTC 2019
One of the motivations back then was to speed up access to those
entries: you don't have to deflate/inflate the new/updated entries
during the lifetime of that zipfilesystem. Those updated entries only
get compressed when they go to storage. So the regression is more like
a trade-off between different usages. (It also simplifies the logic for
handling different types of entries ...)
One idea I experimented with a long time ago for jartool was to compress
entries concurrently when they need compression ... it does gain some
performance on multi-core machines, but not a lot, as it ends up coming
back to the main thread to write out to the underlying filesystem.
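A minimal sketch of that idea, not the actual jartool experiment: entries are deflated on a thread pool, and the calling thread then writes the compressed bytes out one after another. The class and method names (ConcurrentDeflateSketch, deflate, writeAll) are made up for illustration, and a real zip writer would also emit local headers and the central directory.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.zip.Deflater;

public class ConcurrentDeflateSketch {

    // Deflate one entry's bytes into a raw deflate stream (nowrap = true,
    // the form used for entry data inside a zip file).
    static byte[] deflate(byte[] input) {
        Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION, true);
        try {
            deflater.setInput(input);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            while (!deflater.finished()) {
                int n = deflater.deflate(buf);
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // release native zlib memory
        }
    }

    // Compress all entries on a pool, then write the results in order
    // on the calling thread (hypothetical helper, not a JDK API).
    static void writeAll(List<byte[]> entries, OutputStream sink, int threads)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<byte[]>> pending = new ArrayList<>();
            for (byte[] entry : entries) {
                Callable<byte[]> task = () -> deflate(entry);
                pending.add(pool.submit(task));
            }
            // The serial step: each result is written by this one thread.
            for (Future<byte[]> f : pending) {
                sink.write(f.get());
            }
        } finally {
            pool.shutdown();
        }
    }
}
```

The second loop is where the gain gets capped: however many cores do the deflating, the output still goes through a single writer.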
On 4/16/19 5:21 AM, Claes Redestad wrote:
> Both before and after this regression, it seems the default behavior is
> not to use a temporary file (until ZFS.sync(), which writes to a temp
> file and then moves it in place, but that's different from what happens
> with the useTempFile option enabled). Instead entries (and the backing
> zip file system) are kept in-memory.
> The cause of the issue here is instead that no deflation happens until
> sync(), even when writing to entries in-memory. Previously, the
> deflation happened eagerly, then the result of that was copied into
> the zip file during sync().
> I've written a proof-of-concept patch that restores the behavior of
> eagerly compressing entries when the method is METHOD_DEFLATED and the
> target is to store bytes in-memory (the default scenario):
> This restores performance of parallel zip to that of 11.0.2 for the
> default case. It still has a similar regression for the case where
> useTempFile is enabled, but that should be easily addressed if this
> looks like a way forward?
> (I've not yet created a bug as I got too caught up in trying to figure
> out what was going on here...)
> On 2019-04-16 09:29, Alan Bateman wrote:
>> On 15/04/2019 14:32, Lennart Börjeson wrote:
>>> Previously, the deflation was done in the call to Files.copy,
>>> thus executed in parallel, and the final ZipFileSystem.close()
>>> didn't do anything much.
>> Can you submit a bug? When creating/updating a zip file with zipfs,
>> closing the file system creates the zip file. Someone needs
>> to check but it may have been that the temporary files (on the file
>> system hosting the zip file) were deflated when writing (which is
>> surprising but may have been the case).
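For reference, the usage pattern under discussion in the quoted thread (copying files into a zip file system in parallel, then closing it) might look roughly like the sketch below. The class and method names (ParallelZipSketch, zip) are hypothetical, and it assumes that concurrent Files.copy calls into one zip file system are safe, as the reported use case implies.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Map;

public class ParallelZipSketch {

    // Copy each source file into the root of a new zip file.
    // zipFile is assumed not to exist yet ("create" = "true").
    static void zip(List<Path> sources, Path zipFile) throws IOException {
        URI uri = URI.create("jar:" + zipFile.toUri());
        try (FileSystem zipfs =
                 FileSystems.newFileSystem(uri, Map.of("create", "true"))) {
            // Copies run in parallel; whether deflation also happens here
            // or only at close() is exactly what the thread is about.
            sources.parallelStream().forEach(src -> {
                try {
                    Files.copy(src, zipfs.getPath("/" + src.getFileName()));
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } // close() writes the zip file to the underlying file system
    }
}
```

If deflation is deferred until close(), the parallel stream above only parallelizes the reads and in-memory copies, and the compression runs on the single thread that closes the file system.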