Vc branch ready for testing

Mon Sep 10 14:35:55 UTC 2012

On Monday 10 September 2012, Sven Langkamp wrote:
> 
> Branch has been tested on half a dozen systems now. Results were from twice
> as fast to very slight improvement/no change noticeable. Not sure why there
> is such a difference between systems. Dual-core systems seem to have a
> bigger improvement. Might be that mask processing was already quite fast
> on quad-core CPUs before.

Weirdly, though, I did see a big change on my desktop machine.

> Branch is almost feature complete, just some improvements for detecting
> cmake files needed. Also will need some ifdefs if vc should stay an
> optional dependency.

I think it should, at least until it gets more widespread and until I've fixed the Windows port :-)
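
For what it's worth, keeping Vc optional could look roughly like the sketch below. HAVE_VC is a hypothetical macro that the build system would define when Vc is found, and scaleMask is a made-up function, not anything from the branch. Without the macro only the scalar loop is compiled.

```cpp
#include <cstddef>

#ifdef HAVE_VC            // hypothetical macro, defined by the build system when Vc is found
#include <Vc/Vc>
#endif

// Hypothetical helper: multiply every mask value by a constant factor.
void scaleMask(float *mask, std::size_t count, float factor)
{
    std::size_t i = 0;
#ifdef HAVE_VC
    // Vectorized path: handle Vc::float_v::Size floats per iteration.
    const std::size_t step = Vc::float_v::Size;
    const Vc::float_v f(factor);
    for (; i + step <= count; i += step) {
        Vc::float_v v;
        v.load(mask + i, Vc::Unaligned);
        v *= f;
        v.store(mask + i, Vc::Unaligned);
    }
#endif
    // Scalar path: everything when Vc is absent, only the tail when it is present.
    for (; i < count; ++i) {
        mask[i] *= factor;
    }
}
```

The point of the layout is that the scalar loop is never ifdef'ed out, so the same function builds and gives the same results on systems without Vc.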

> I did some further profiling with callgrind on some 1000px 0.04 spacing.
> Callgrind file can be found here: http://depot.tu-dortmund.de/get/ybukq
> 
> It shows that the composite op is now the most expensive operation in the
> KisStrokeBenchmark. Which is probably also the reason that we don't see
> bigger improvements from the mask processing. Pentalis wants to look at the
> composite ops and see what can be done there.

It should be a prime candidate for vectorization -- but it might be a big job, since I'm beginning to suspect it'd mean taking the alpha channel out of band.
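
To make "out of band" concrete: with interleaved pixels the alpha bytes sit at a stride of four, which is awkward for SIMD loads, while a separate contiguous alpha plane is not. A minimal sketch, assuming 4 bytes per pixel with alpha as the last byte; splitAlpha and the plain std::vector buffers are illustrative only and are not Krita's actual pixel storage.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative only: copy the alpha byte of each interleaved 4-byte pixel
// into its own contiguous buffer.
std::vector<std::uint8_t> splitAlpha(const std::vector<std::uint8_t> &pixels)
{
    const std::size_t count = pixels.size() / 4;
    std::vector<std::uint8_t> alpha(count);
    for (std::size_t i = 0; i < count; ++i) {
        alpha[i] = pixels[i * 4 + 3];   // alpha is the 4th byte of each pixel
    }
    return alpha;
}
```

Doing this conversion on every composite would eat the gain, which is why it would probably have to be the storage layout itself, and hence a big job.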

> I'm considering parallelizing the fixedBlt with QtConcurrent like we
> already have for the brush mask.

That should work fine as well.

> 
> Besides that, callgrind shows some other smaller bottlenecks. One is
> QVector::fill, which is used by the initialization of the fixed paint
> device. It might be possible to avoid that by just resizing and leaving
> the values uninitialized.
> 
> Another smaller bottleneck appears to be the memcpy we do to set the color
> of the dab. When we use a plain color that never changes, it might be
> possible to avoid that as we only change the alpha values.
> 
> Unfortunately the benchmarks don't show the other operations done while
> painting in Krita. So I can't say how much effect e.g. update of the
> projection/canvas has.
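
On the memcpy point: if the color really never changes, the dab's color bytes could be written once and only the alpha byte touched for each new dab. A rough sketch, again assuming 4 bytes per pixel with alpha last; applyMaskAlpha is a hypothetical name and the buffers are plain std::vector, not the fixed paint device.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative only: the dab already holds the constant color, so only the
// alpha byte of each pixel is overwritten from the mask.
void applyMaskAlpha(std::vector<std::uint8_t> &dab,
                    const std::vector<std::uint8_t> &mask)
{
    const std::size_t count = dab.size() / 4;
    for (std::size_t i = 0; i < count && i < mask.size(); ++i) {
        dab[i * 4 + 3] = mask[i];
    }
}
```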

There's something to be gained here -- for both opengl and qpainter canvas, there's still a bottleneck where pixels go through the gui thread. For the opengl canvas, that's fixable, and I was hoping the kritasketch opengl canvas could become a proper replacement for both the opengl1 and qpainter canvas, but it's not there yet.

-- 
Boudewijn Rempt
http://www.valdyas.org, http://www.krita.org, http://www.boudewijnrempt.nl