Language Plugin Parsers / KDevelop-PG-Qt Performance

Sun Oct 25 08:58:28 UTC 2009

Hi Milian,

Milian Wolff wrote:
> Hey guys,
>
> after waiting quite a bit for my benchmark to finish in callgrind I just 
> aborted it and looked at the data it produced in the meantime. And boy was I 
> surprised!
>
> 75.67% of the time was spent in KDevPG::LocationTable::positionAt() !
>
> I think the problem is that my test file is very large, it has 79929 lines, and 
> the function above goes everytime from 0 to X where X is the line that has a 
> proper index...
>
> I have several ideas how to improve the situation:
>
> 1) store last looked up line + index and make the search relative to that:
> assuming that, during declaration building, the search is always incremental, 
> or at least we lookup stuff near to each other, this should greatly increase 
> performance.
>
> 2) dunno, if the following algorithm has a name (I bet it does), but it would 
> work like this:
>   
Bisection search.

> - look at index of line in the middle of the list
> - if index is good, finish
> - if index is too high, use lower half of remaining list, continue from 
> beginning
> - if index is too low, use upper half of remaining list, continue from 
> beginning
> This should at least reduce the number of iterations required. If I'm not 
> mistaken it would take log_2(total number of lines) to find the correct index, 
> in my case a maximum of 17. Of course the code would become much more 
> complicated, as we don't want to copy lists, but always operate on the base 
> list...
>
> I think I'll implement both tomorrow and see what I get. Both somehow interest 
> me :) Do you have any other ideas? I bet you "real" computer science guys have 
> learned such stuff in school and can solve it easily? Is there maybe something 
> already available in algorithm.h or the Qt equivalent?
>   
You do not even have to choose which algorithm to use at build time. 
When you perform a search for a particular index, store this. When you 
next come to do a search compare it to the first and see if they are 
correlated (within some specific distance of each other). If they are 
correlated then use this information by searching outwards from the 
previous result. If they are not correlated do a fresh bisection search 
that does not use the extra information available to us.

Attached is some sample code that does this. I use this as an abstract 
base class for various concrete interpolation sub-classes (linear, 
exponential, polynomial etc). The two algorithms you need for searching 
are in the locate() and hunt() functions respectively. The description 
of these algorithms came from Numerical Recipes in C++ but the class is 
mine.

It should be relatively simple for you to port these functions and the 
correlation stuff to your use case here I think.

HTH,

Sean

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: abstractinterp.h
URL: <http://mail.kde.org/pipermail/kdevelop-devel/attachments/20091025/8e9aa5a3/attachment.h>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: abstractinterp.cpp
URL: <http://mail.kde.org/pipermail/kdevelop-devel/attachments/20091025/8e9aa5a3/attachment.ksh>