Blur effect on live scene?
james.graham at oracle.com
Thu Aug 13 17:06:35 UTC 2015
On 8/13/2015 1:29 AM, Matthias Hänel wrote:
>>>> I'd argue that we sort of do have something like that - it is the cache flag. If a Node is cached then we do have a copy of it in a texture and that can help make the Blur Effect work more efficiently, but there may be some additional copies between textures if everything isn't set up right. Still, that is an avenue for someone to check to see if there isn't a better way to achieve this effect in the short term...
>>> I am not quite sure what cache does. Probably some hashmap that holds objects so that they are not instantly destroyed in graphics RAM?
>> It is not a hashmap.
>> It is a hint to save the rendering of that node in a buffer:
>> I don't like the way that this doc comment is worded, as it implies that using it on a node that is blurred is unwise. But if the node is animated over, even GPU-accelerated rendering and blurring operations are going to have some cost that it could save.
> That documentation says nodes are cached as Bitmaps. In GPU or in CPU space? It is not clear here. I suspect it is in CPU RAM space.
> Won't that just lead to more copy tasks from CPU to GPU? If optimized, there is no benefit at all, as stated in the documentation: "note that on some platforms such as GPU accelerated platforms there is little benefit".
As I said, the doc comments there are confusing. "bitmap" here, despite
any capitalization, is a general term for "some type of pixel store".
On GPU accelerated platforms, this is a vram texture/FBO.
> Actually, I would expect that rendered Nodes are textures in GPU-VRAM to get the most performance out of it.
Yes, nodes with the cache hint set are rendered into VRAM textures and
reused for subsequent frames, subject to the policies of the cacheHint
property if any properties of the node (mainly its transform) are
changed in the meantime.
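
As a minimal sketch of the cache hint discussed here: the class and
method names below are made up for illustration, and only setCache,
setCacheHint and CacheHint are actual JavaFX API. CacheHint.SPEED is
assumed to be the policy one would want for a node that is being
animated, since it favors reusing the cached buffer over re-rendering.

```java
import javafx.scene.CacheHint;
import javafx.scene.Node;

public final class CacheHintSketch {
    private CacheHintSketch() {}

    // Hypothetical helper: asks JavaFX to keep the rendered node in a
    // buffer and to prefer reusing that buffer while the node is
    // transformed. Both calls are hints; the platform may ignore them.
    public static <T extends Node> T cacheForAnimation(T node) {
        node.setCache(true);
        node.setCacheHint(CacheHint.SPEED);
        return node;
    }
}
```

Something like cacheForAnimation(backgroundGroup) would then be called
once while building the scene, not on every frame.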
>>> From my current point the major problem with JavaFX is still the same.
>>> 1. Has a good API
>>> 2. renders most of its stuff in software, hence does not perform well
>>> 3. Has good approaches, but the overall sight on the technology is broken somewhere.
>> I am not sure how you come to the conclusion that it renders most of its stuff in software. It renders quite a lot in hardware. Even the example here of using snapshot to optimize a blurred background - the rendering of the scene is done in hw. It is only copied to main memory because the API requires a persistent image. If you render that image to the screen it is copied back into a texture and reused from that texture unless we run low on vram. There is no rendering in software there, only use of a heap buffer for persistent storage...
> Your point is that it uses hardware to render and it just uses snapshot to satisfy the API?
What is "it"?
JavaFX doesn't "use" snapshot anywhere. It provides the snapshot API so
the developer can use snapshot where the developer wants to, but JavaFX
doesn't invoke the snapshot API on its own.
In particular, the node cache hint does not use the snapshot API under
the hood.
> My current understanding is ... what happens in our blurred-effect case?
To be clear, you are describing what happens in a particular
implementation of the blurred-effect case: the implementation that uses
snapshot, which may be the popular technique in use at this time.
> 1. the application constructs a JavaFX node tree.
> 2. the node tree is rendered mostly in hardware (shader effects and so on on top) to the main framebuffer
> 3. Snapshot calls ReadPixels (or whatever it is called on the particular platform)
> 4. JavaFX encapsulates this "new" image with Object
Technically, snapshot does its own (hardware accelerated) rendering into
a separate GPU texture/FBO and does a read pixels on that texture. It
then encapsulates those pixels into an Image object.
This snapshot operation need only be done once if the underlying node
tree is static.
> 5. We draw the "new" image with effects to the OGL context with the same node-tree API as we did before on top of the first node-tree.
Correct. Note that when you render that Image object its pixels are
cached in vram and that vram copy is reused from frame to frame. So, if
you reuse the one snapshot then there was only one trip from vram to
memory and back to vram on the first frame you did this and then
everything should be done in vram for subsequent frames reusing the same
Image.
> That works, but there is too much CPU and memcpy involved for my taste. Furthermore ReadPixels takes forever from
> an OpenGL perspective.
If the underlying tree is static, though, that readpixels operation only
happens once at the start of the operation, but you are correct that it
would be much better if it didn't need to happen at all.
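
A rough sketch of the snapshot-once technique for a static tree follows.
The class and method names are hypothetical; Node.snapshot,
SnapshotParameters, ImageView and GaussianBlur are existing JavaFX API.
It assumes snapshotOnceAndBlur is called on the JavaFX Application
Thread and that the returned view is kept and reused across frames, so
the read-back from vram happens only once.

```java
import javafx.scene.Node;
import javafx.scene.SnapshotParameters;
import javafx.scene.effect.GaussianBlur;
import javafx.scene.image.Image;
import javafx.scene.image.ImageView;
import javafx.scene.image.WritableImage;

public final class SnapshotBlurSketch {
    private SnapshotBlurSketch() {}

    // Wraps an already captured image in a view with a blur effect.
    public static ImageView blurredViewOf(Image image, double radius) {
        ImageView view = new ImageView(image);
        view.setEffect(new GaussianBlur(radius));
        return view;
    }

    // Takes ONE synchronous snapshot of the (assumed static) tree and
    // returns a blurred view of it. Must run on the JavaFX Application
    // Thread. Do not call this per frame; reuse the returned view.
    public static ImageView snapshotOnceAndBlur(Node staticTree, double radius) {
        WritableImage image = staticTree.snapshot(new SnapshotParameters(), null);
        return blurredViewOf(image, radius);
    }
}
```

If the underlying tree changes, the snapshot has to be retaken, which
brings the readpixels cost back for that frame.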
> Since this is just one very simple effect, it is actually not good to spend more than approx. 20% CPU (i7) load on it.
> I expect 0% (not noticeable) for this blurry effect.
That would be ideal. We may need new API to get there, but there are
also options to consider that may get us there in the short term. In
particular, cached nodes - which are already present in the API - may
get us closer to that goal.
> The ideal implementation from my perspective would be:
> 1. the application constructs a JavaFX node tree.
> 2. the node tree is rendered mostly in hardware (shader effects and so on on top) to a virtual framebuffer in the GPU space
> 3. The virtual framebuffer is drawn by a simple drawVert-call for the background
> 4. The virtual framebuffer is drawn once again shaped and shaded (blurry filter) by another drawVert-call
If you set the cache hint to true on the underlying tree then this may
be approximately what happens in our current implementation. The part I
would still need to investigate would be how well the effects machinery
(called Decora) can reuse the cached version of the nodes. If it
doesn't attempt to reuse the cached version of the nodes then it might
end up re-rendering the tree.
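
For anyone who wants to try this, here is a minimal sketch of the setup,
with hypothetical class and method names. It only arranges the nodes:
the underlying tree gets the cache hint and a wrapper Group carries the
blur. Whether Decora actually reuses the cached texture for the effect
is exactly the open question above, so this is something to measure and
not a known-good recipe.

```java
import javafx.scene.Group;
import javafx.scene.Node;
import javafx.scene.effect.GaussianBlur;

public final class CachedBlurSketch {
    private CachedBlurSketch() {}

    // Marks the underlying tree as cached and applies the blur on a
    // wrapper Group, so that the effect input could (in principle) be
    // served from the cached texture instead of a re-render.
    public static Group blurOverCachedTree(Node underlyingTree, double radius) {
        underlyingTree.setCache(true);
        Group wrapper = new Group(underlyingTree);
        wrapper.setEffect(new GaussianBlur(radius));
        return wrapper;
    }
}
```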
> I have to correct my assumption that JavaFX renders most of its stuff in software a bit. The wording "rendering" was not correct.
> This feeling probably comes from the massive use of Snapshot. I have not understood the entire core of Prism until now.
> In some implementations of snapshot it seems to do the rendering entirely in software. That might only be the fallback, but the
> ReadPixels-Stuff is still valid.
To be clear, snapshot was mainly created for its namesake - to produce
static copies of the scene data to be saved as images for putting into
documentation. Alternate uses that were considered were for generating
thumbnails of scenes for an application that could open/close various
scene graph panes. One wouldn't use the Windows "print screen" API in a
performance intensive part of one's application either. It is great
that it can be used for this particular result, but it was not designed
to be performant in that respect. Note that the primary snapshot API is
the asynchronous version with the callback to deliver the data. The
non-asynchronous version is mostly just a helper around that, but it
stalls the rendering pipeline to complete its work. None of that was
designed for insertion into a running animation technique.
> Slightly off-topic: I already mentioned on this list that JFXPanel (Swing) falls well short of expectations. The main reason there is also the use
> of the snapshot function instead of letting JavaFX render in its own heavyweight window. In our small test case a simple List
> displayed via JFXPanel was not even able to render more than 1 fps and it slowed down the entire Swing UI. Since I know JOGL and its canvas
> implementation, would it be good to have a similar heavyweight JFX canvas in Swing? This would let JFX use its full hardware-rendered power.
It doesn't use snapshot per se, but it does do something similar.
Sharing contexts and GPU resources with AWT/Java2D is on a wish list,
but architecturally we aren't there. Kevin would know more about our
challenges on that front than me...