[Rev 03] RFR: 8088198: Exception thrown from snapshot if dimensions are larger than max texture size
github.com+7450507+fthevenet at openjdk.java.net
Fri Jan 24 17:16:33 UTC 2020
On Fri, 24 Jan 2020 16:55:47 GMT, Frederic Thevenet <github.com+7450507+fthevenet at openjdk.org> wrote:
>>> Here are the results when running JavaFX 14-ea+7.
>>> The columns of the table correspond to the width of the target snapshot, the rows correspond to the height, and the content of each cell is the average time* spent (in ms) in `Node::snapshot`.
>>> (*) each test is run 10 times and the elapsed time is averaged after pruning outliers, using Grubbs' test.
>>>             1024        2048        3072        4096        5120        6144        7168        8192
>>> 1024    6.304272   10.303935   15.052336   35.929304   23.860095   28.828812   35.315288   27.867205
>>> 2048   11.544367   21.156326   28.368750   28.887164   47.134738   54.354708   55.480251   56.722649
>>> 3072   15.503187   30.215269   41.304645   39.789648   82.255091   82.576379   96.618722  106.586547
>>> 4096   20.928336   38.768648   64.255423   52.608217  101.797347  132.516816  158.525192  166.872889
>>> 5120   28.693431   67.275306   68.090280   76.208412  133.974510  157.120373  182.329784  210.069066
>>> 6144   29.972591   54.751002   88.171906  104.489291  147.788597  185.185643  213.562819  228.643761
>>> 7168   33.668398   63.088490   98.756212  130.502678  196.367121  225.166481  239.328794  260.162501
>>> 8192   40.961901   87.067460  128.230351  178.127225  198.479068  225.806211  266.170239  325.967840
>> Any idea why 4096x1024 and 1024x4096 are so different? Same for 8192x1024 and 1024x8192.
> I don't, to be honest.
> The results for some dimensions (not always the same) can vary pretty widely from one run to another, despite all my effort to repeat results and remove outliers.
> Out of curiosity, I also tried to eliminate the GC as a possible culprit by running it with Epsilon, but it seems to make a significant difference.
> I ran that test on a laptop with integrated Intel graphics and no dedicated VRAM (Intel UHD Graphics 620), though, so this might be why.
> Maybe someone could try and run the bench on hardware with a discrete GPU?
With regard to why the tiling version is significantly slower, though, I do have a pretty good idea; as Kevin hinted, the pixel copy into a temporary buffer before copying into the final image is where most of the extra time is spent.
The reason why it is so much slower is a bit of a pity, though: profiling a run of the benchmark shows that a lot of time is spent in `IntTo4ByteSameConverter::doConvert`. This turns out to be because, under Windows and the D3D pipeline anyway, the `WritableImage` used to collate the tiles and the tiles returned from the RTTexture have different pixel formats (IntARGB for the tile and byteBGRA for the `WritableImage`).
So if we could use a `WritableImage` with an IntARGB pixel format as the recipient for the snapshot (at least as long as no image was provided by the caller), I suspect that the copy would be much faster.
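To illustrate why the format mismatch matters, here is a minimal sketch in plain Java. It is not the actual Prism code: the class and method names are made up for illustration, and it ignores alpha premultiplication. Copying an IntARGB tile into a byteBGRA destination has to touch every pixel and split it into four bytes, whereas copying into a destination with the same int format can be done with one bulk copy per row.

```java
// Hypothetical illustration only; names do not come from the JavaFX code base.
class TileCopySketch {

    /** Copies an int ARGB tile into a byte BGRA image: per-pixel conversion. */
    public static void copyIntArgbToByteBgra(int[] tile, int tileW, int tileH,
                                             byte[] dst, int dstW, int dstX, int dstY) {
        for (int y = 0; y < tileH; y++) {
            int s = y * tileW;                        // start of the source row
            int d = ((dstY + y) * dstW + dstX) * 4;   // start of the destination row, in bytes
            for (int x = 0; x < tileW; x++) {
                int argb = tile[s + x];
                dst[d++] = (byte) argb;               // blue
                dst[d++] = (byte) (argb >> 8);        // green
                dst[d++] = (byte) (argb >> 16);       // red
                dst[d++] = (byte) (argb >>> 24);      // alpha
            }
        }
    }

    /** Copies an int ARGB tile into an int ARGB image: one bulk copy per row. */
    public static void copyIntArgbToIntArgb(int[] tile, int tileW, int tileH,
                                            int[] dst, int dstW, int dstX, int dstY) {
        for (int y = 0; y < tileH; y++) {
            System.arraycopy(tile, y * tileW, dst, (dstY + y) * dstW + dstX, tileW);
        }
    }

    public static void main(String[] args) {
        int[] tile = {0xFF112233, 0x80445566};        // a 2x1 tile

        byte[] bgra = new byte[2 * 1 * 4];
        copyIntArgbToByteBgra(tile, 2, 1, bgra, 2, 0, 0);
        System.out.printf("first pixel as BGRA bytes: %02X %02X %02X %02X%n",
                bgra[0] & 0xFF, bgra[1] & 0xFF, bgra[2] & 0xFF, bgra[3] & 0xFF);

        int[] argb = new int[4 * 2];                  // a 4x2 destination image
        copyIntArgbToIntArgb(tile, 2, 1, argb, 4, 1, 1);
        System.out.printf("pixel at (1,1): %08X%n", argb[5]);
    }
}
```

The second method is roughly what a same-format copy could look like if the destination image used an int-based format, which is why I would expect it to be noticeably cheaper.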
Unfortunately, it seems the only way to choose the pixel format for a `WritableImage` is to initialize it with a `PixelBuffer`, but then one can no longer use a `PixelWriter` to update it, and it doesn't seem to me that there is a way to safely access the `PixelBuffer` from an image's reference alone.
I'm pretty new to this code base though (which is quite large; I haven't read it all quite yet... ;-), so hopefully there's a way to do that that has eluded me so far.