Vc branch ready for testing
Boudewijn Rempt
boud at valdyas.org
Mon Sep 10 14:35:55 UTC 2012
On Monday 10 September 2012, Sven Langkamp wrote:
>
> Branch has been tested on half a dozen systems now. Results were from twice
> as fast to very slight improvement/no change noticeable. Not sure why there
> is such a difference between systems. Dual-core systems seem to have a
> bigger improvement. Might be that mask processing was already quite
> fast on quad-core cpus before.
Weirdly, though, I did see a big change on my desktop machine.
> Branch is almost feature complete; it just needs some improvements for
> detecting the cmake files. It will also need some ifdefs if Vc should stay an
> optional dependency.
I think it should, at least until it gets more widespread and until I've fixed the Windows port :-)
> I did some further profiling with callgrind on some 1000px 0.04 spacing.
> Callgrind file can be found here: http://depot.tu-dortmund.de/get/ybukq
>
> It shows that the composite op is now the most expensive operation in the
> KisStrokeBenchmark. Which is probably also the reason that we don't see
> bigger improvements from the mask processing. Pentalis wants to look at the
> composite ops and see what can be done there.
It should be a prime candidate for vectorization -- but it might mean a big operation since I'm beginning to suspect it'd mean taking the alpha-channel out of band.
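To illustrate what "out of band" alpha could look like, here is a minimal sketch, not Krita's actual composite op code. It assumes a hypothetical planar layout in which one color channel and the alpha values live in separate contiguous arrays, and it only handles a plain "over" blend onto an opaque destination. The name blendPlane and its parameters are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: blend one color plane of the source onto the same
// plane of the destination, with the alpha values stored in their own array
// instead of being interleaved with the color channels.
// Assumes an opaque destination, so the blend reduces to a linear
// interpolation.
void blendPlane(const std::vector<float>& src,
                const std::vector<float>& alpha,
                std::vector<float>& dst,
                float opacity)
{
    assert(src.size() == alpha.size() && src.size() == dst.size());

    const float* s = src.data();
    const float* a = alpha.data();
    float* d = dst.data();
    const std::size_t n = src.size();

    // Every iteration reads and writes contiguous floats and is independent
    // of the others, which is the shape of loop that maps onto SIMD lanes.
    for (std::size_t i = 0; i < n; ++i) {
        const float effectiveAlpha = a[i] * opacity;
        d[i] = s[i] * effectiveAlpha + d[i] * (1.0f - effectiveAlpha);
    }
}
```

With interleaved pixels the alpha value has to be picked out of every pixel before it can be applied to the color channels; with a separate alpha plane the same loop body runs once per color plane over plain arrays.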
> I'm considering parallelizing
> the fixedBlt with QtConcurrent, like we already do for the
> brush mask.
That should work fine as well.
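Roughly, the idea is to split the rows of the blit into bands and hand each band to a worker. The sketch below uses std::thread rather than QtConcurrent so that it builds without Qt; it is not the fixedBlt code. forEachRowBand and processRows are hypothetical names, and the sketch assumes that bands do not overlap, so workers never write to the same row.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper: split [0, rows) into at most `workers` contiguous
// bands and run processRows(begin, end) for each band on its own thread.
void forEachRowBand(std::size_t rows,
                    unsigned workers,
                    const std::function<void(std::size_t, std::size_t)>& processRows)
{
    assert(workers > 0);

    // Round up so that all rows are covered even if rows % workers != 0.
    const std::size_t band = (rows + workers - 1) / workers;

    std::vector<std::thread> pool;
    for (std::size_t begin = 0; begin < rows; begin += band) {
        const std::size_t end = std::min(rows, begin + band);
        pool.emplace_back([&processRows, begin, end] { processRows(begin, end); });
    }

    // Wait for every band before returning, so the caller sees a finished blit.
    for (std::thread& t : pool) {
        t.join();
    }
}
```

Whether this pays off depends on how expensive a row is compared to starting the workers; for small dabs the overhead may well dominate.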
>
> Beside that callgrind shows some other smaller bottlenecks. One is
> QVector::fill which is used by the initialize of the fixed paintdevice.
> Might be possible to save that by using uninitialized values and just
> resize.
>
> Another smaller bottleneck appears to be the memcpy we do to set the color
> of the dab. When we use a plain color that never changes, it might be
> possible to avoid that as we only change the alpha values.
>
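A minimal sketch of that idea, assuming a hypothetical 8-bit RGBA dab buffer with four bytes per pixel (this is not the real fixed paint device API, and fillColor and setAlpha are made-up names): the color is written once, and every later dab only touches the alpha byte of each pixel.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed layout for this sketch: R, G, B, A bytes interleaved per pixel.
constexpr std::size_t kPixelSize = 4;
constexpr std::size_t kAlphaOffset = 3;

// Done once, as long as the paint color stays the same.
void fillColor(std::vector<std::uint8_t>& pixels,
               const std::array<std::uint8_t, 3>& rgb)
{
    assert(pixels.size() % kPixelSize == 0);
    for (std::size_t p = 0; p < pixels.size(); p += kPixelSize) {
        pixels[p + 0] = rgb[0];
        pixels[p + 1] = rgb[1];
        pixels[p + 2] = rgb[2];
    }
}

// Done for every dab: only the alpha byte of each pixel is rewritten,
// the color bytes are left as they are.
void setAlpha(std::vector<std::uint8_t>& pixels,
              const std::vector<std::uint8_t>& mask)
{
    assert(pixels.size() == mask.size() * kPixelSize);
    for (std::size_t i = 0; i < mask.size(); ++i) {
        pixels[i * kPixelSize + kAlphaOffset] = mask[i];
    }
}
```

This only helps while the color and the dab size stay constant; as soon as either changes, the color has to be written again.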
> Unfortunately the benchmarks don't show the other operations done while
> painting in Krita. So I can't say how much effect e.g. update of the
> projection/canvas has.
There's something to be gained here -- for both opengl and qpainter canvas, there's still a bottleneck where pixels go through the gui thread. For the opengl canvas, that's fixable, and I was hoping the kritasketch opengl canvas could become a proper replacement for both the opengl1 and qpainter canvas, but it's not there yet.
--
Boudewijn Rempt
http://www.valdyas.org, http://www.krita.org, http://www.boudewijnrempt.nl