Some notes about the directions of optimization of image merging in Krita
Dmitry Kazakov
dimula73 at gmail.com
Fri Apr 12 16:58:29 UTC 2013
Hi, all!
Recently I've been experimenting with optimizing how Krita works with
huge multilayer images (including some more aggressive
multithreading), so I'd like to share some ideas I got from it.
There are three general approaches we can adopt here:
1) Multithreading at the level of the KisUpdateScheduler (we do
already have it). The general idea is that huge update regions are
split into smaller rectangles and each rect is merged separately in
its own thread.
2) Multithreading at the level of KisPainter. Each bitBlt (or
bitBltFixed) operation can split its work region into smaller rects and
process each rect in a separate thread. I believe Sven did some
experiments on this topic some time ago, but I don't know how they
turned out.
3) Skip the bitBlt of empty tiles (tiles filled with the default,
transparent pixel).
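
Approaches 1) and 2) share the same core pattern: cut a region into
smaller rects and hand them to a few worker threads. Below is a minimal
sketch of that pattern in standard C++ rather than Qt. Rect, splitRect
and processInParallel are hypothetical names made up for illustration,
not Krita classes, and the real scheduler and painter are certainly
more involved.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in for QRect: top-left corner plus size.
struct Rect { int x, y, w, h; };

// Split `area` into patches of at most patchSize x patchSize pixels.
std::vector<Rect> splitRect(const Rect &area, int patchSize) {
    assert(patchSize > 0);
    std::vector<Rect> patches;
    for (int y = area.y; y < area.y + area.h; y += patchSize) {
        for (int x = area.x; x < area.x + area.w; x += patchSize) {
            patches.push_back({x, y,
                               std::min(patchSize, area.x + area.w - x),
                               std::min(patchSize, area.y + area.h - y)});
        }
    }
    return patches;
}

// Run `job` on every patch using at most `threadCount` worker threads.
// Workers claim patches through a shared atomic counter, so no mutex
// is needed to distribute the work itself.
void processInParallel(const std::vector<Rect> &patches, int threadCount,
                       const std::function<void(const Rect &)> &job) {
    std::atomic<std::size_t> next{0};
    auto worker = [&]() {
        for (std::size_t i = next.fetch_add(1); i < patches.size();
             i = next.fetch_add(1)) {
            job(patches[i]);
        }
    };
    std::vector<std::thread> workers;
    for (int i = 0; i < std::max(1, threadCount); ++i) {
        workers.emplace_back(worker);
    }
    for (auto &t : workers) {
        t.join();
    }
}
```

The job passed in must be safe to call from several threads at once;
in the merging case each patch writes to its own part of the image, but
any shared state (tile tables, caches) would still need protection.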
Results:
1,2) [common things] While doing the measurements I found a very
interesting thing. It looks like the implementation of QMutex and,
therefore, QReadWriteLock in Qt <= 4.7 is really flawed. The mutex there
does not scale at all. It works well *only* with a thread count <=
2(!). Raising the number of threads above 2 degrades performance to
a level worse than what we get with a single thread.
Qt >= 4.8 does not have this flaw. Its implementation scales with the
number of threads and the speed gets a bit higher.
As a result, I tried to implement my own read-write lock, which
would not rely on QMutex that much. The tests showed that my
implementation partly solved the problem in Qt 4.7, but the new
mutex implementation in Qt 4.8 outperforms it by about 15-20% [0]. So
I think I should drop that idea and just limit the number of threads
used when Krita is running on Qt <= 4.7.
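
Such a limit could look roughly like the sketch below. It is only an
illustration: threadLimit is a hypothetical helper, not existing Krita
code, and it is written in plain C++ so that it stands alone. In a Qt
program the version string could come from qVersion() and the ideal
count from QThread::idealThreadCount(). The cap of 2 is the thread
count at which the old mutex still behaved in my measurements.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: choose how many worker threads to use, given
// the runtime Qt version string (e.g. "4.7.4") and the ideal count.
int threadLimit(const std::string &qtVersion, int idealThreads) {
    const int ideal = std::max(1, idealThreads);
    const int cappedForOldMutex = std::min(ideal, 2);

    int major = 0;
    int minor = 0;
    if (std::sscanf(qtVersion.c_str(), "%d.%d", &major, &minor) != 2) {
        // Unknown version format: assume the old, non-scalable mutex.
        return cappedForOldMutex;
    }
    const bool scalableMutex = major > 4 || (major == 4 && minor >= 8);
    return scalableMutex ? ideal : cappedForOldMutex;
}
```

Checking the version at runtime rather than with QT_VERSION at compile
time matters when the binary is built against one Qt and run against
another.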
1,2) [differences] In general, both types of multithreading are
useful in Krita, for different use cases. The threading at level 1)
(scheduler) is useful when we have some filters or masks in the
stack. In this case the code of the filters and filter masks will also
benefit from the threads. It can also give some benefit with usual
layers, but the update area has to be quite large (512+ px wide) so
that the scheduler can split it into smaller rectangles. That is
exactly the case when we do a full refresh of the image (e.g. when
changing the visibility of a layer).
The threading at the level 2) (that is KisPainter) will not give any
benefit for the filters, but it can give some bonuses when painting
on usual layers, when the update rects are smaller than 512 pixels.
In my benchmarks I measured the speed of a full refresh of a huge
image (about 4000x6000 px) containing about 20 layers. It turned out
that the speed of the refresh grows (non-linearly) with the
total *sum* of the threads currently present, that is, both kinds of
threading affect the speed, although the scheduler is a bit more
efficient in this [full refresh] test case. And the good thing is that
there is not much overhead created: the refresh with 6+6
threads is only 5% slower than the refresh with 6 scheduler threads
(of course this is applicable to Qt 4.8 only).
As a result, I think adding threads at the level of
KisPainter is a good idea, because the scheduler cannot cover all the
use cases.
But still, the threading code leaves something to be desired.
Using 4-8 threads gives a speed boost of only about 2 times, although
the portion of parallel code in this test case is almost 100%. I cannot
profile what is happening, because I have Qt 4.8 only on a virtual
machine and I cannot run VTune there.
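
For reference, Amdahl's law gives a rough idea of how far a 2 times
boost is from the ideal. This is a back-of-the-envelope estimate under
the law's idealized model (a fixed parallel fraction p and n equally
fast threads), not a measurement:

```latex
\[
S(n) = \frac{1}{(1 - p) + p/n}
\quad\Longrightarrow\quad
p = \frac{1 - 1/S}{1 - 1/n}
\]
\[
S = 2,\ n = 4:\quad p = \frac{0.5}{0.75} \approx 0.67
\qquad
S = 2,\ n = 8:\quad p = \frac{0.5}{0.875} \approx 0.57
\]
```

So the observed numbers behave as if only about 57-67% of the work ran
in parallel, which would suggest that something outside the merging
code itself (locking or memory bandwidth, for example) still
serializes the threads.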
3) Ok, now about avoiding empty tiles. Everything is simple here. I
tested this approach and it gives about a 2 times speed boost almost
for free! I just need to implement it properly: extend the
interface of the data manager a bit and add some general iteration
classes to KisPainter.
[0] - http://wstaw.org/m/2013/04/12/plasma-desktopSV2476.jpg
---
Dmitry Kazakov