Language Plugin Parsers / KDevelop-PG-Qt Performance

Sun Oct 25 20:47:39 UTC 2009

On Sunday 25 October 2009 09:58:28 Sean Harmer wrote:
> Hi Milian,
> 
> Milian Wolff wrote:
> > Hey guys,
> >
> > after waiting quite a bit for my benchmark to finish in callgrind I just
> > aborted it and looked at the data it produced in the meantime. And boy
> > was I surprised!
> >
> > 75.67% of the time was spent in KDevPG::LocationTable::positionAt() !
> >
> > I think the problem is that my test file is very large, it has 79929
> > lines, and the function above goes everytime from 0 to X where X is the
> > line that has a proper index...
> >
> > I have several ideas how to improve the situation:
> >
> > 1) store last looked up line + index and make the search relative to
> > that: assuming that, during declaration building, the search is always
> > incremental, or at least we lookup stuff near to each other, this should
> > greatly increase performance.
> >
> > 2) dunno, if the following algorithm has a name (I bet it does), but it
> > would work like this:
> 
> Bisection search.
> 
> > - look at index of line in the middle of the list
> > - if index is good, finish
> > - if index is too high, use lower half of remaining list, continue from
> > beginning
> > - if index is too low, use upper half of remaining list, continue from
> > beginning
> > This should at least reduce the number of iterations required. If I'm not
> > mistaken it would take log_2(total number of lines) to find the correct
> > index, in my case a maximum of 17. Of course the code would become much
> > more complicated, as we don't want to copy lists, but always operate on
> > the base list...
> >
> > I think I'll implement both tomorrow and see what I get. Both somehow
> > interest me :) Do you have any other ideas? I bet you "real" computer
> > science guys have learned such stuff in school and can solve it easily?
> > Is there maybe something already available in algorithm.h or the Qt
> > equivalent?
> 
> You do not even have to choose which algorithm to use at build time.
> When you perform a search for a particular index, store this. When you
> next come to do a search compare it to the first and see if they are
> correlated (within some specific distance of each other). If they are
> correlated then use this information by searching outwards from the
> previous result. If they are not correlated do a fresh bisection search
> that does not use the extra information available to us.
> 
> Attached is some sample code that does this. I use this as an abstract
> base class for various concrete interpolation sub-classes (linear,
> exponential, polynomial etc). The two algorithms you need for searching
> are in the locate() and hunt() functions respectively. The description
> of these algorithms came from Numerical Recipes in C++ but the class is
> mine.
> 
> It should be relatively simple for you to port these functions and the
> correlation stuff to your use case here I think.

Thanks very much, but I think either case is better than the current 
implementation, and enough so. Also my code seems to be a bit simpler than 
your case, i.e. the binary search is only like 10 lines long or something.

It performs quite well.
-- 
Milian Wolff
mail at milianw.de
http://milianw.de
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 198 bytes
Desc: This is a digitally signed message part.
URL: <http://mail.kde.org/pipermail/kdevelop-devel/attachments/20091025/8d75bbc9/attachment.sig>