Vc branch ready for testing

Sven Langkamp sven.langkamp at gmail.com
Tue Sep 11 02:11:59 UTC 2012


On Mon, Sep 10, 2012 at 4:35 PM, Boudewijn Rempt <boud at valdyas.org> wrote:

> On Monday 10 September 2012 Sep, Sven Langkamp wrote:
> >
> > Branch has been tested on half a dozen systems now. Results ranged from
> > twice as fast to a very slight improvement or no noticeable change. Not
> > sure why there is such a difference between systems. Dual-core systems
> > seem to have a bigger improvement. Might be that mask processing was
> > already quite fast on quad-core CPUs before.
>
> Weirdly, though, I did see a big change on my desktop machine.
>
> > Branch is almost feature complete, just some improvements for detecting
> > cmake files needed. Also will need some ifdefs if vc should stay an
> > optional dependency.
>
> I think it should, at least until it gets more widespread and until I've
> fixed the Windows port :-)
>
> > I did some further profiling with callgrind on some 1000px 0.04 spacing.
> > Callgrind file can be found here: http://depot.tu-dortmund.de/get/ybukq
> >
> > It shows that the composite op is now the most expensive operation in the
> > KisStrokeBenchmark, which is probably also the reason that we don't see
> > bigger improvements from the mask processing. Pentalis wants to look at
> > the composite ops and see what can be done there.
>
> It should be a prime candidate for vectorization -- but it might mean a
> big operation since I'm beginning to suspect it'd mean taking the
> alpha-channel out of band.
>
> > I'm considering parallelizing the fixedBlt with QtConcurrent, like we
> > already have for the brush mask.
>
> That should work fine as well.


I have done a quick experiment to test that in
branch krita-multithreadedfixedbitblt-langkamp. The stroke benchmark is
slightly faster, and I measured that the time spent in fixedBitBlt went down
(I haven't done detailed testing, but it looks like a speedup of about 1.6).
I didn't notice any improvement while painting, though. It would be
interesting to see if it gives bigger improvements on a quad-core (no extra
libs required).
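
For illustration, the general idea of a multithreaded blit can be sketched as
below. This is not the code in the branch: it uses std::async from the
standard library instead of QtConcurrent so that it compiles without Qt, and
the function name parallelBlit and all of its parameters are hypothetical.
The rows of the rectangle are split into bands and each band is copied by its
own task.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <future>
#include <vector>

// Hypothetical helper, not Krita API: copies a width x height block of
// pixels (pixelSize bytes each) from src to dst. Strides are in bytes.
// The rows are split into at most numBands bands that are copied concurrently.
void parallelBlit(std::uint8_t* dst, int dstStride,
                  const std::uint8_t* src, int srcStride,
                  int width, int height, int pixelSize, int numBands)
{
    if (width <= 0 || height <= 0) return;

    numBands = std::max(1, std::min(numBands, height));
    const int rowsPerBand = (height + numBands - 1) / numBands;  // round up
    const std::size_t rowBytes = static_cast<std::size_t>(width) * pixelSize;

    std::vector<std::future<void>> jobs;
    for (int y0 = 0; y0 < height; y0 += rowsPerBand) {
        const int y1 = std::min(y0 + rowsPerBand, height);
        // Each task touches a disjoint range of rows, so no locking is needed.
        jobs.push_back(std::async(std::launch::async, [=] {
            for (int y = y0; y < y1; ++y) {
                std::memcpy(dst + static_cast<std::size_t>(y) * dstStride,
                            src + static_cast<std::size_t>(y) * srcStride,
                            rowBytes);
            }
        }));
    }
    // Wait for all bands before returning to the caller.
    for (auto& job : jobs) job.get();
}
```

A plain copy like this is mostly limited by memory bandwidth, so the gain
from more threads may be smaller than for a blit that also does per-pixel
work.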

I'm more and more wondering where all the performance goes.