<html>
<body>
<div style="font-family: Verdana, Arial, Helvetica, Sans-Serif;">
<table bgcolor="#f9f3c9" width="100%" cellpadding="8" style="border: 1px #c9c399 solid;">
<tr>
<td>
This is an automatically generated e-mail. To reply, visit:
<a href="http://git.reviewboard.kde.org/r/100730/">http://git.reviewboard.kde.org/r/100730/</a>
</td>
</tr>
</table>
<br />
<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: 10px;">
<p style="margin-top: 0;">On February 26th, 2011, 12:14 p.m., <b>David Nolden</b> wrote:</p>
<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: 10px;">
<pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">The improvements you report from duchainify are far too huge, allociation has never played such a huge role during parsing. The problem is probably, that during the first (slow) run the duchain is built, then it is stored, and during the next run, it's only updated.
Please make sure that:
A) You call duchainify twice for each case, and use the second timing
B) You always use the "--force-update-recursive" parameter, which should eliminate the duchain caching issues
This issue also explains why you have observed the 50x difference in the number of calls to "new". From what I understand, you've changed the implementation of "new", so it shouldn't affect that number, or did I misunderstand that?</pre>
</blockquote>
<p>On March 4th, 2011, 4:58 p.m., <b>Floris Ruijter</b> wrote:</p>
<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: 10px;">
<pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">I reran the test with this script:
=====
#!/bin/bash
echo 'starting the test. first the old version.'
date
~/src/kdevplatform/build/util/duchainify/duchainify --force-update-recursive ~/src/kdevelop/languages/cpp/ > /dev/null
time ~/src/kdevplatform/build/util/duchainify/duchainify --force-update-recursive ~/src/kdevelop/languages/cpp/ > /dev/null
echo 'building the new version.'
cd ~/src/kdevelop/
git apply build/0001-rewrite-rxx_allocator-and-move-includes-from-the-hea.patch
cd build
make install > /dev/null
cd
echo 'new version:'
date
~/src/kdevplatform/build/util/duchainify/duchainify --force-update-recursive ~/src/kdevelop/languages/cpp/ > /dev/null
time ~/src/kdevplatform/build/util/duchainify/duchainify --force-update-recursive ~/src/kdevelop/languages/cpp/ > /dev/null
=====
the new version was about a minut faster(6:14 vs 5:18), i also used valgrind with the iostream including file again(this time running duchainify one time solo first) and here too i could see a very significant improvement in libkdev4cppparser.so .
the patch does NOT reimplement operator new. it just changes the way rxx_allocator works. rxx_allocator tries to get blocks from a static linked list first(which is tryLocked with a QMutex) if the chain is already locked or is empty it will instead create a new block and add that to it's own linked list of blocks. on destruction of a rxx_allocator, it will lock the static chain of unallocated blocks and add it's own chain to the chain of unallocated blocks.
</pre>
</blockquote>
</blockquote>
<pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">i wrote the comment above a week ago i somehow just forgot to "publish changes".
if noone objects, i will push the changes that memorypool.h is only included in .cpps in a few days, but I'd like to hear if the changes to rxx_allocator are to be commited too. </pre>
<br />
<p>- Floris</p>
<br />
<p>On February 24th, 2011, 1:47 a.m., Floris Ruijter wrote:</p>
<table bgcolor="#fefadf" width="100%" cellspacing="0" cellpadding="8" style="background-image: url('http://git.reviewboard.kde.orgrb/images/review_request_box_top_bg.png'); background-position: left top; background-repeat: repeat-x; border: 1px black solid;">
<tr>
<td>
<div>Review request for KDevelop.</div>
<div>By Floris Ruijter.</div>
<p style="color: grey;"><i>Updated Feb. 24, 2011, 1:47 a.m.</i></p>
<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Description </h1>
<table width="100%" bgcolor="#ffffff" cellspacing="0" cellpadding="10" style="border: 1px solid #b8b5a0">
<tr>
<td>
<pre style="margin: 0; padding: 0; white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">rxx_allocator was according to my measurements done with kcachegrind, valgrind, duchainify and iostream. The allocator had three basic defects:
1) all allocated memory was deallocated whilst we need a lot of rxx_allocators (1 per file i presume?), so these blocks can be reused
2) it cleared the memory on a per block basis, but if not all of the block is used, then that is a waste of effort
3) it used realloc to manage the list of blocks, this isn't too bad but could cause a move of the list which is totaly unnecessary
i solved the problems mostly by making the blocks act as linked list nodes: a next pointer + a really long char array. deallocated blocks are kept in a static linked list, whilst actual rxx_allocators have their own(personal some would say)linked list of blocks. access to the deallocated blocks list is synchronized through a static QMutex.
the access could be threadsafe by using a thread local linked list of deallocated items too, but i don't think that'd be practical, the global static list is probably more effective (eventhough it requires locking) </pre>
</td>
</tr>
</table>
<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Testing </h1>
<table width="100%" bgcolor="#ffffff" cellspacing="0" cellpadding="10" style="border: 1px solid #b8b5a0">
<tr>
<td>
<pre style="margin: 0; padding: 0; white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">as mentioned i ran a file which only included iostream through duchainify which i callgrinded.
old: new:
pool::allocate ~450 000 000 ~7 000 000
all time spend in libkdev4cppparser:
~585 000 000 ~140 000 000
the pool::allocate numbers are both the 'inclusive' numbers
looking at the data for the amount of "operator new" calls I can see that the cost per call are pretty much the same but that the old implementation called it about 50x more.</pre>
</td>
</tr>
</table>
<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Diffs</b> </h1>
<ul style="margin-left: 3em; padding-left: 0;">
<li>languages/cpp/codecompletion/item.cpp <span style="color: grey">(b25d1ae)</span></li>
<li>languages/cpp/cppparsejob.cpp <span style="color: grey">(f4819f2)</span></li>
<li>languages/cpp/parser/ast.h <span style="color: grey">(0281c6b)</span></li>
<li>languages/cpp/parser/control.h <span style="color: grey">(0b06248)</span></li>
<li>languages/cpp/parser/listnode.h <span style="color: grey">(d1eda36)</span></li>
<li>languages/cpp/parser/parser.cpp <span style="color: grey">(281ad8d)</span></li>
<li>languages/cpp/parser/rxx_allocator.h <span style="color: grey">(f0159e9)</span></li>
<li>languages/cpp/parser/tests/test_parser.cpp <span style="color: grey">(de5f804)</span></li>
</ul>
<p><a href="http://git.reviewboard.kde.org/r/100730/diff/" style="margin-left: 3em;">View Diff</a></p>
</td>
</tr>
</table>
</div>
</body>
</html>