Krita performance.

Boudewijn Rempt boud at
Wed May 22 09:19:57 UTC 2013

On Tuesday 21 May 2013 21:55:25 Sven Langkamp wrote:

> Newer nvidia cards have up to 6 GB of graphics memory, so it should fit.

But only two or three layers... With images of a more ordinary size, it would be possible to keep the whole layer stack in gpu memory and composite everything there. It should actually be possible to do that for bigger images as well, since the gpu will do the mipmapping of the textures for us and only keep the most relevant levels in memory. Then on zooming out, it will swap in the visible areas of the next mipmap levels, as far as I can tell.
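For a sense of scale, that mipmap memory cost can be estimated with a few lines of C++ (a back-of-the-envelope sketch; the function name and flat RGBA layout are illustrative, not Krita's actual tile format):

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Estimate GPU memory for a texture plus its full mipmap chain,
// halving each dimension (rounded down, clamped to 1) per level.
std::uint64_t mipChainBytes(std::uint64_t w, std::uint64_t h,
                            std::uint64_t bytesPerPixel)
{
    std::uint64_t total = 0;
    for (;;) {
        total += w * h * bytesPerPixel;
        if (w == 1 && h == 1) break;
        w = std::max<std::uint64_t>(1, w / 2);
        h = std::max<std::uint64_t>(1, h / 2);
    }
    return total;
}
```

An a4, 300dpi layer (2480x3508 pixels) at 4 bytes per pixel is about 33 MiB for the base level and about 44 MiB with the whole chain, so the full mipmap pyramid only adds roughly a third on top.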

> > I've been experimenting a bit, and an ordinary layerstack (a4, 300dpi)
> > with a dozen layers works fine on my intel gpu and a little bit of graphics
> > memory, though.
> >
> > I want to experiment some more here -- but the trick is arriving at
> >
> > a) a good, extensible technology choice (glsl? cuda? opencl? whatever?)
> > b) a good, extensible design
> > c) something that krita can grow into, because we don't want to do full
> > rewrites (I hope...)
> a) I would go with OpenCL. CUDA is nvidia only and glsl is a bit ugly for
> this purpose.

But OpenCL is not really well supported everywhere, is it? Then again, I haven't dug into it much yet. There is plenty of appropriately licensed code for doing pretty much all the filters and composite ops Krita supports in glsl, but there might be even more appropriate code to nick for opencl.
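To show what kind of code that is, here is the per-pixel math of a plain source-over composite as a CPU reference in C++ (my own sketch; the struct and function names are made up, not Krita's actual composite op API):

```cpp
#include <cassert>

// RGBA with straight (non-premultiplied) alpha in [0, 1].
struct Rgba { float r, g, b, a; };

// Source-over compositing: the same arithmetic a glsl fragment shader or
// opencl kernel would run per pixel, which is what makes these ops so
// easy to port to the gpu.
Rgba compositeOver(const Rgba &src, const Rgba &dst)
{
    float a = src.a + dst.a * (1.0f - src.a);
    if (a <= 0.0f) return {0.0f, 0.0f, 0.0f, 0.0f};
    auto blend = [&](float s, float d) {
        return (s * src.a + d * dst.a * (1.0f - src.a)) / a;
    };
    return {blend(src.r, dst.r), blend(src.g, dst.g),
            blend(src.b, dst.b), a};
}
```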

> b) and c) are very tricky. The simplest way is where you write the
> content of the layer into a buffer, push that to the graphics card,
> process it and get it back. This can be done without many rewrites, but you
> get a memory transfer bottleneck. That would give us a speedup on very
> computationally intensive calculations, but little or even a negative one
> on memory-bound ones.

Yeah... I want to start with moving all pixel storage to the gpu, actually, and in a fixed format (rgba, with the channel depth the gpu supports that is closest to the actual image: 8, 16, 16f or 32f).
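A sketch of that format selection, assuming we map the image's channel depth onto the usual sized OpenGL texture formats (the enum and function are hypothetical; real code would also have to query what the driver actually supports):

```cpp
#include <cassert>
#include <string>

enum class ChannelDepth { U8, U16, F16, F32 };

// Pick the GL internal format matching the image's channel depth.
const char *gpuFormatFor(ChannelDepth d)
{
    switch (d) {
    case ChannelDepth::U8:  return "GL_RGBA8";
    case ChannelDepth::U16: return "GL_RGBA16";
    case ChannelDepth::F16: return "GL_RGBA16F";
    case ChannelDepth::F32: return "GL_RGBA32F";
    }
    return "GL_RGBA8"; // unreachable fallback
}
```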

> The other option is to have all the tiles on the GPU, but that is sort of
> difficult to integrate with the current codebase.

I'm not so sure. We've got a nice and flexible design where we can actually rip out the compositor and replace it completely, and we could make our tiles a bit bigger and store them on the gpu. When I finally get the opengl canvas to work on windows, I really want to do some experiments there.
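How much bigger tiles would help the bookkeeping is easy to estimate (a sketch; 64 is Krita's historical tile edge, 256 a hypothetical gpu-friendlier one):

```cpp
#include <cassert>
#include <cstdint>

// Number of square tiles needed to cover a w x h image.
std::uint64_t tileCount(std::uint64_t w, std::uint64_t h, std::uint64_t tile)
{
    return ((w + tile - 1) / tile) * ((h + tile - 1) / tile);
}
```

For an a4, 300dpi layer (2480x3508), 64-pixel tiles mean 2145 textures to track per layer, while 256-pixel tiles bring that down to 140.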

Boudewijn Rempt

More information about the kimageshop mailing list