<html>
<body>
<div style="font-family: Verdana, Arial, Helvetica, Sans-Serif;">
<table bgcolor="#f9f3c9" width="100%" cellpadding="8" style="border: 1px #c9c399 solid;">
<tr>
<td>
This is an automatically generated e-mail. To reply, visit:
<a href="http://git.reviewboard.kde.org/r/104052/">http://git.reviewboard.kde.org/r/104052/</a>
</td>
</tr>
</table>
<br />
<table bgcolor="#fefadf" width="100%" cellspacing="0" cellpadding="8" style="background-image: url('http://git.reviewboard.kde.org/media/rb/images/review_request_box_top_bg.png'); background-position: left top; background-repeat: repeat-x; border: 1px black solid;">
<tr>
<td>
<div>Review request for kdelibs, David Faure and Michael Pyne.</div>
<div>By Mark Gaiser.</div>
<p style="color: grey;"><i>Updated Feb. 25, 2012, 8:45 p.m.</i></p>
<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Changes</h1>
<table width="100%" bgcolor="#ffffff" cellspacing="0" cellpadding="10" style="border: 1px solid #b8b5a0">
<tr>
<td>
<pre style="margin: 0; padding: 0; white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">Greatly changed the diff.
Added:
- gzip compression using KFilterDev
- prevent image insertion when the current image from the cache is exactly the same as the one we want to insert.
- heavily improved speed
(changed main description for this)</pre>
</td>
</tr>
</table>
<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Description (updated)</h1>
<table width="100%" bgcolor="#ffffff" cellspacing="0" cellpadding="10" style="border: 1px solid #b8b5a0">
<tr>
<td>
<pre style="margin: 0; padding: 0; white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">I was running KWin through callgrind to see where possible bottlenecks are. I wasn't expecting much since it improved greatly during the 4.8 dev cycle, however one stood out. The saving of PNG images was taking about 1/5th of the time in KWin that i could see directly. That looked like something i might be able to optimize.
What this patch is doing is storing the actual image bits to prevent saving a PNG image to the mmapped cache. That was a hot code path in time (cycles), not even in calls. I've also reduced the amount of memory copies to a bare minimum by adding a rawFind function to KSharedDataCache which fills a QByteArray::fromRawData thus preventing a expensive memory copy. The rawFind is used for looking up an image and fetching it's data without copying it. That is done because QImage seems to make a copy itself internally. I don't have any performance measurements, however, prior to this patch my kwin test was using up ~5.000.000.000 cycles. After this patch it's using up 1.370.000.000. I don't have raw performance numbers to see if the cache itself is actually faster, it certainly has become a lot cheaper to use the cache. Logic wise i would say creating a QImage from the cached data should be way faster now since there is no step involved anymore in decoding the image. Storing is certainly an order of magnitude faster.
-- update --
After spending a lot more time trying to get compression in the mix "someone" came with the suggestion to use KFilterDev. Never heard of it, but i gave it a shot anyway. It turned out to be the bull-eye in this case. Data is now greatly compressed using bzip (any other compression available in KFilterDev makes it a lot slower) with no speed loss compared to my previous benchmark results, the opposite is true, speedups!
New benchmarks compared to the stock 4.8.0 KImageCache:
Read: ~8x faster (was 5x in my previous patch)
Write: ~5x faster (equally fast compared to stock, no speedup here in my previous patch)
There are still a few major speedups that can be taken here, but are a lot more complicated to do.
- use a faster compression method, that will probably greatly speedup insertion and retrieval! (i keep saying it: LZ4!)
- somehow store the bits separate so that only the bits can be fetched for insertion checking. I actually did do this already, but the added overhead of more QIODevice objects and more QBuffers is causing it to not be beneficial at all. It's even a little slower.
As for the issues marked below for race conditions. I need help in that area. I can't do what's suggested (i lack the knowledge for it).
Special thanks go to David Faure for helping me a great deal with this.</pre>
</td>
</tr>
</table>
<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Testing </h1>
<table width="100%" bgcolor="#ffffff" cellspacing="0" cellpadding="10" style="border: 1px solid #b8b5a0">
<tr>
<td>
<pre style="margin: 0; padding: 0; white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">I've also written a bunch of test cases (greatly improved by David Faure) to see if i didn't break anything. According to the test (which is also comparing the actual image bits) it's all passing just fine.</pre>
</td>
</tr>
</table>
<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Diffs</b> (updated)</h1>
<ul style="margin-left: 3em; padding-left: 0;">
<li>kdecore/util/kshareddatacache.h <span style="color: grey">(339cecc)</span></li>
<li>kdecore/util/kshareddatacache.cpp <span style="color: grey">(9fe3995)</span></li>
<li>kdeui/tests/CMakeLists.txt <span style="color: grey">(c8b8c85)</span></li>
<li>kdeui/tests/kimagecachetests.h <span style="color: grey">(PRE-CREATION)</span></li>
<li>kdeui/tests/kimagecachetests.cpp <span style="color: grey">(PRE-CREATION)</span></li>
<li>kdeui/util/kimagecache.cpp <span style="color: grey">(a5bbbe1)</span></li>
</ul>
<p><a href="http://git.reviewboard.kde.org/r/104052/diff/" style="margin-left: 3em;">View Diff</a></p>
</td>
</tr>
</table>
</div>
</body>
</html>