<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    <font size="-1">Hi, Andrew!<br>

      <br>

      Welcome back to Krita! :)</font>

    <blockquote type="cite"

cite="mid:CAL53kkKg3eVhWrsr4fDNejNY+Vew7YW+-7XOnzQej+_k=Mhzag@mail.gmail.com">

      <pre wrap="">I'm working on a prototype using OpenCL, and I was able to get a

*functionally* correct implementation of 'composite' algorithm

(KoCompositeOpAlphaBase.h), which is one of the performance bottlenecks

that I spotted. OpenCL version of 'composite' is currently sub-optimal,

but it passes a dozen of existing unit tests.</pre>

    </blockquote>

    Instead of trying to optimize the general-case
    KoCompositeOpAlphaBase composite op, you could optimize the specific
    composite ops, such as KoOptimizedCompositeOpOver{32,128} and
    KoOptimizedCompositeOpAlphaDarken{32,128}. They are used 90% of
    the time.<br>

    <br>
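    To make the "over" case concrete, here is a rough per-pixel sketch of
    the standard source-over formula. It is only an illustration, not
    Krita's code: the function names are made up, and it assumes
    non-premultiplied 8-bit RGBA pixels with a floating-point
    intermediate. The real classes are C++ and differ in detail.<br>
    <pre>
```python
def composite_over(dst, src, opacity=1.0):
    """Blend one non-premultiplied RGBA pixel `src` over `dst`.

    Both pixels are 4-tuples of 8-bit integers (0..255).
    Returns the blended pixel as a new 4-tuple.
    """
    src_a = src[3] / 255.0 * opacity
    dst_a = dst[3] / 255.0
    # Resulting coverage: source plus whatever of the destination shows through.
    out_a = src_a + dst_a * (1.0 - src_a)
    if out_a == 0.0:
        # Both pixels are fully transparent, so the colour is undefined.
        return (0, 0, 0, 0)
    color = tuple(
        round((s * src_a + d * dst_a * (1.0 - src_a)) / out_a)
        for s, d in zip(src[:3], dst[:3])
    )
    return color + (round(out_a * 255.0),)


def composite_over_row(dst_row, src_row, opacity=1.0):
    """Apply composite_over to two equally long rows of pixels."""
    return [composite_over(d, s, opacity) for d, s in zip(dst_row, src_row)]
```
</pre>
    Every pixel is computed independently of its neighbours, which is why
    this kind of op maps naturally onto one GPU work-item per pixel.<br>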

    <blockquote type="cite"

cite="mid:CAL53kkKg3eVhWrsr4fDNejNY+Vew7YW+-7XOnzQej+_k=Mhzag@mail.gmail.com">

      <pre wrap="">So far the only major issue was heavy use of C++ templates and type

traits in Krita core code. OpenCL mainly supports C language, so I had

to write a lot of C adapters and use a preprocessor meta-magic to lower

everything from C++ (host side) to C (device side).</pre>

    </blockquote>

    Have you decided how you are going to optimize data transfers
    between the CPU and the GPU? Perhaps some proxy object for KisPainter
    and KisPaintDevice, so that the data does not leave the GPU?
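    <br>
    Roughly what I mean by a proxy, as a toy sketch: the class and method
    names are hypothetical, and the "device" buffer is just a Python list
    standing in for GPU memory. The only point is that the pixels are
    uploaded once and copied back only when the CPU actually asks for
    them.<br>
    <pre>
```python
class DeviceResidentBuffer:
    """Hypothetical proxy that tracks which copy of the pixels is current."""

    def __init__(self, pixels):
        self._host = list(pixels)
        self._device = None            # stand-in for a GPU buffer
        self._device_is_newer = False
        self.uploads = 0               # simulated host-to-device copies
        self.downloads = 0             # simulated device-to-host copies

    def run_on_device(self, kernel):
        """Apply `kernel` to every pixel without leaving the device."""
        if self._device is None:
            self._device = list(self._host)
            self.uploads += 1
        self._device = [kernel(p) for p in self._device]
        self._device_is_newer = True

    def read_back(self):
        """Return the pixels on the host, copying only if they are stale."""
        if self._device_is_newer:
            self._host = list(self._device)
            self.downloads += 1
            self._device_is_newer = False
        return list(self._host)
```
</pre>
    Chaining several operations costs one upload and one download in
    total. Writes from the CPU side would also have to invalidate the
    device copy; the sketch leaves that out.<br>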

    <blockquote type="cite"

cite="mid:CAL53kkKg3eVhWrsr4fDNejNY+Vew7YW+-7XOnzQej+_k=Mhzag@mail.gmail.com">

      <pre wrap="">I'll focus on getting sane performance numbers out of this

implementation, and it may change the current design quite

significantly. Then the code will probably be good enough to be shared.</pre>

    </blockquote>

    Yeah, numbers are a good thing :)<br>

    <br>

    <blockquote type="cite"

cite="mid:CAL53kkKg3eVhWrsr4fDNejNY+Vew7YW+-7XOnzQej+_k=Mhzag@mail.gmail.com">

      <pre wrap="">Andrew

On Thu, Apr 6, 2017 at 5:51 PM, Boudewijn Rempt <a class="moz-txt-link-rfc2396E" href="mailto:boud@valdyas.org">&lt;boud@valdyas.org&gt;</a> wrote:

</pre>

      <blockquote type="cite">

        <pre wrap="">On Thu, 6 Apr 2017, Andrew Savonichev wrote:

</pre>

        <blockquote type="cite">

          <pre wrap="">Hi Boudewijn,

Okay, I understand.

I would take a couple of weeks for PoC work to have better understanding

of changes need to be made and issues we may face.

I will get back to you when I have more to discuss.

</pre>

        </blockquote>

        <pre wrap="">

Awesome! I'm looking forward to that!

</pre>

        <blockquote type="cite">

          <pre wrap="">

    - Andrew

On Thu, Apr 6, 2017 at 12:01 PM, Boudewijn Rempt <a class="moz-txt-link-rfc2396E" href="mailto:boud@valdyas.org">&lt;boud@valdyas.org&gt;</a> wrote:

</pre>

          <blockquote type="cite">

            <pre wrap="">On Thu, 6 Apr 2017, Andrew Savonichev wrote:

</pre>

            <blockquote type="cite">

              <pre wrap="">Hello,

I'd like to know what is the status of GPU usage in Krita. I know

Krita can use OpenGL

for rendering, but I'm thinking about offloading image processing

algorithms to GPU.

From my understanding, many algorithms in Krita are data parallel and

operate on entire image, what

makes them good candidates for offloading.

Is there any directions, discussions or maybe some existing work on

enabling GPU acceleration?

What do you think about adding this to Krita?

</pre>

            </blockquote>

            <pre wrap="">

Right now, only the canvas uses the gpu. We have had two attempts to

use the GPU for implementing filters, once using opencl, once with

glsl. That code is so old and was so unripe, it's probably not even

useful to look at.

Apart from filters, recomputing the layer stack on the gpu could be

worth-while, and writing brush engines that run on the gpu could be

worth-while. In both cases, the problem is getting the pixel data to

the gpu and back in the main memory.

There are two possible approaches: store everything in main memory and

copy to the gpu memory when needed, and then when done, back, but that

is slow. The other approach would be to keep the entire layer stack in

gpu memory, but that would limit the size of images -- and people do

work on images that take gigabytes of memory.

But, I am _very_ eager to see progress in this area and would

love to work with you to see this happen.

--

Boudewijn Rempt | <a class="moz-txt-link-freetext" href="http://www.krita.org">http://www.krita.org</a>, <a class="moz-txt-link-freetext" href="http://www.valdyas.org">http://www.valdyas.org</a>

</pre>

          </blockquote>

          <pre wrap="">

</pre>

        </blockquote>

        <pre wrap="">

--

Boudewijn Rempt | <a class="moz-txt-link-freetext" href="http://www.krita.org">http://www.krita.org</a>, <a class="moz-txt-link-freetext" href="http://www.valdyas.org">http://www.valdyas.org</a>

</pre>

      </blockquote>

    </blockquote>

    <br>

    <pre class="moz-signature" cols="72">-- 

Dmitry Kazakov</pre>

  </body>

</html>