[OpenJDK 2D-Dev] sun.java2D.Pisces renderer Performance and Memory enhancements
bourges.laurent at gmail.com
Wed Apr 24 08:59:58 UTC 2013
First, here are both updated webrev and benchmark results:
- results: http://jmmc.fr/~bourgesl/share/java2d-pisces/patch_opt_night.log
- webrev: http://jmmc.fr/~bourgesl/share/java2d-pisces/webrev-2/
Note: the webrev is only partially cleaned up; work is still in progress! Changes:
- optimized cleanup of alpha / edges arrays
- TileState HARD reference stored in SunGraphics2D to avoid repeated
ThreadLocal or ConcurrentQueue accesses
- TileState propagated in RenderingEngine to PiscesRenderingEngine:
warning: interface compatibility issues
- minor tuning.
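
To illustrate the "HARD reference" item above, here is a minimal sketch of the idea, not the actual patch: the graphics object keeps a plain field pointing at the per-thread state, so the ThreadLocal lookup happens only once. All names (TileStateSketch, GraphicsSketch, lookups) are hypothetical stand-ins and do not match the webrev.

```java
/** Hypothetical stand-in for the per-thread renderer state (not the real TileState). */
final class TileStateSketch {
    final int[] edges = new int[16 * 1024]; // placeholder for the large reusable arrays
}

/** Hypothetical stand-in for the graphics object that caches the state. */
final class GraphicsSketch {
    private static final ThreadLocal<TileStateSketch> PER_THREAD =
            ThreadLocal.withInitial(TileStateSketch::new);

    int lookups;                       // counts ThreadLocal accesses, for illustration only
    private TileStateSketch tileState; // hard reference, filled on first use

    TileStateSketch getTileState() {
        TileStateSketch ts = tileState;
        if (ts == null) {
            lookups++;
            ts = PER_THREAD.get();     // slow path: taken once per graphics object
            tileState = ts;
        }
        return ts;                     // fast path: plain field read
    }

    public static void main(String[] args) {
        GraphicsSketch g = new GraphicsSketch();
        g.getTileState();
        g.getTileState();
        System.out.println("ThreadLocal lookups: " + g.lookups);
    }
}
```

One caveat of this sketch: the cached reference is only safe while the graphics object stays on the thread that first fetched the state.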
The ArrayCache classes (IntArrayCache, Dirty... and FloatArrayCache) are now
effectively unused during MapBench tests, because the RendererContext itself
stores large arrays (16K int or float arrays) + rowAARLE (2Mb).
However, I keep the array caching for very high workloads ... to be
Comparison (open office format):
Patch2 vs ductus:

threads   gain
1         102,11%
2         144,49%
4         263,13%

On average, patch2 is equal to or better than ductus: 44% faster with 2 threads
and 2.6 times faster with 4 threads!
In the following table, you can see how the gain varies with the test
(workload): my patch performs better than ductus on complex test cases

test               threads   Tavg      Tmed      Med+Stddev
boulder_17         1         82,54%    77,68%    76,99%
boulder_17         2         119,57%   120,24%   128,56%
boulder_17         4         149,95%   150,39%   161,98%
shp_alllayers_47   1         107,26%   107,18%   107,02%
shp_alllayers_47   2         144,24%   144,18%   147,00%
shp_alllayers_47   4         288,05%   289,10%   286,04%

Secondly, here are my comments:
2013/4/24 Jim Graham <james.graham at oracle.com>
> Originally the version that was used in embedded used RLE because it
> stored the results in the shape itself. On desktop I never found that to
> be a necessary optimization especially because it actually wastes memory
> for no gain during animations, but that was why they used RLE as a storage
> format. Would it speed up the code to use a different storage format?
That could be a very good idea: compressing the alpha array to RLE and then
decompressing it to fill the byte tile array seems wasteful. However,
keeping the RLE encoding may help keep the arrays small enough to store a
complete tile line, as I want: width = 4096 (or more) x height = 32.
As memory is cheap nowadays, I could try using one large 1D array to store
the alpha values for a complete tile line: only 512K!
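
To make the trade-off concrete, here is a rough sketch (not the Pisces code) of the two storage options: a flat array for one tile line versus a simple run-length encoding of a row. The class and method names are hypothetical, and the (value, runLength) pair layout is an assumption of mine, not the actual rowAARLE format. Note that 4096 x 32 entries only comes to 512K if each entry takes 4 bytes; with 1 byte per alpha value it would be 128K.

```java
import java.util.Arrays;

/** Hypothetical sketch comparing flat and RLE storage of alpha rows. */
final class AlphaStorageSketch {
    static final int TILE_LINE_WIDTH = 4096; // from the mail: width = 4096 (or more)
    static final int TILE_HEIGHT = 32;       // from the mail: height = 32

    /** Size in bytes of one flat array covering a whole tile line. */
    static int flatArrayBytes(int bytesPerEntry) {
        return TILE_LINE_WIDTH * TILE_HEIGHT * bytesPerEntry;
    }

    /** Encodes one row as (value, runLength) pairs; assumed layout, not rowAARLE's. */
    static int[] rleEncode(byte[] row) {
        int[] out = new int[2 * row.length];
        int n = 0;
        int i = 0;
        while (i < row.length) {
            int start = i;
            byte v = row[i];
            while (i < row.length && row[i] == v) {
                i++;
            }
            out[n++] = v & 0xFF;   // alpha value as unsigned 0..255
            out[n++] = i - start;  // length of the run
        }
        return Arrays.copyOf(out, n);
    }

    /** Expands (value, runLength) pairs back into a row of the given width. */
    static byte[] rleDecode(int[] runs, int width) {
        byte[] row = new byte[width];
        int x = 0;
        for (int k = 0; k < runs.length; k += 2) {
            Arrays.fill(row, x, x + runs[k + 1], (byte) runs[k]);
            x += runs[k + 1];
        }
        return row;
    }

    public static void main(String[] args) {
        System.out.println("flat, 4 bytes/entry: " + flatArrayBytes(4) + " bytes");
        System.out.println("flat, 1 byte/entry:  " + flatArrayBytes(1) + " bytes");
    }
}
```

The flat array skips the encode/decode round trip entirely, at the price of a fixed footprint even for rows that are mostly empty.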
> Also, in the version we use in JavaFX we removed the tiling altogether and
> return one alpha array for the entire rasterization. We might consider
> doing that for this code as well if it allows us to get rid of Ductus - it
> was a Ductus design constraint that forced the tiling (it was again based
> on the expected size of the hardware AA engine)...
I think tiling is still worthwhile, as such small arrays stay in the CPU
cache! However, I could try tuning the tile size to be wider (256x32
instead of 32x32 tiles) ...
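
As a quick back-of-the-envelope check of the two tile sizes, assuming one byte per alpha value and a 4096-pixel-wide tile line (the class and method names are hypothetical):

```java
/** Hypothetical helper comparing tile footprints; assumes 1 byte per alpha value. */
final class TileSizeSketch {
    /** Bytes needed for one tile of the given dimensions. */
    static int tileBytes(int tileWidth, int tileHeight) {
        return tileWidth * tileHeight;
    }

    /** Number of tiles needed to cover one line of the given width (rounded up). */
    static int tilesPerLine(int lineWidth, int tileWidth) {
        return (lineWidth + tileWidth - 1) / tileWidth;
    }

    public static void main(String[] args) {
        int lineWidth = 4096;
        for (int w : new int[] {32, 256}) {
            System.out.println(w + "x32: " + tileBytes(w, 32) + " bytes per tile, "
                    + tilesPerLine(lineWidth, w) + " tiles per line");
        }
    }
}
```

Under these assumptions a 32x32 tile takes 1 KB and a 256x32 tile takes 8 KB, so the wider tile cuts the number of tiles per line by a factor of 8 while remaining small.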
Who could help me work on Pisces? Could we form a tiger team?
Or at least, could Denis and you find some time to help me?