[Nepomuk] Review Request 109991: Regexp cache optimization in Nepomuk fileindexer.

Simeon Bird bladud at gmail.com
Mon Apr 29 00:00:58 UTC 2013



> On April 20, 2013, 4:14 a.m., Simeon Bird wrote:
> > common/regexpcache.cpp, line 108
> > <http://git.reviewboard.kde.org/r/109991/diff/2/?file=139540#file139540line108>
> >
> >     Do you really need to initialise this?
> 
> Lukasz Olender wrote:
>     I'm not sure. Is size_t always initialized to 0 by default?

I'm not sure, so just leave it alone.
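
On the question above: in standard C++ a size_t is not zeroed automatically in every context. A size_t local variable or class member that is default-initialized (automatic or dynamic storage, no initializer) holds an indeterminate value; it is zero only when initialized explicitly or value-initialized. A minimal sketch follows, assuming a hypothetical HitCounter class that is not taken from the patch:

```cpp
#include <cstddef>

// Hypothetical stand-in for a cache-like class; NOT the class from the patch.
class HitCounter {
public:
    // Explicit initialization. Without the ": m_hits(0)" part, a
    // default-initialized HitCounter with automatic or dynamic storage
    // would have an indeterminate m_hits.
    HitCounter() : m_hits(0) {}

    void record() { ++m_hits; }
    std::size_t hits() const { return m_hits; }

private:
    std::size_t m_hits;
};

// Value-initialization with empty braces zero-initializes a scalar.
inline std::size_t zeroed() {
    std::size_t n{};
    return n;
}
```

Under that reading, keeping the initializer is the safe choice.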


On April 20, 2013, 4:14 a.m., Lukasz Olender wrote:
> > Could you also add to the commit message a comment on the sort of performance gains this patch produces? I.e., is it O(10%), or an order of magnitude? Also, how much of the improvement is due to the combining of filters in createPattern, and how much just to rolling the RegEx into one long (||||) one?
> 
> Lukasz Olender wrote:
>     I've added it. With Nepomuk's standard filters it is about 5 times faster on my machine. Combining the filters, rather than just joining them with "|", makes all of the difference: if I simply join all the filters with "|" and let QRegExp do the work, my solution is unfortunately about two times slower (this is easy to check by setting the minOccur variable in the createPattern method to a value bigger than the number of filters). I've added this information to the comment on the createPattern method (now named groupPatterns). It will probably be slower if the filters have no common patterns (for example, if the user has a number of ignored files and no common filters at all). In that situation we could switch to the old behavior, but it is hard to tell when switching is worthwhile, so I'm leaving it as it is. I realize those cases might also need to be handled.

Ok, good! 
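
To illustrate the two strategies compared above, here is a rough sketch. It is not the groupPatterns implementation from the patch, whose grouping rules are not shown in this thread. It assumes one simple form of grouping (factoring the shared "*." prefix out of plain extension filters), uses std::regex as a stand-in for QRegExp, and the names globToRegex, joinPatterns, groupBySuffix and matches are all hypothetical:

```cpp
#include <regex>
#include <string>
#include <vector>

// Translate a glob ('*' and '?') into a regex, escaping metacharacters.
std::string globToRegex(const std::string& glob) {
    std::string out;
    for (char c : glob) {
        switch (c) {
        case '*': out += ".*"; break;
        case '?': out += '.'; break;
        case '.': case '+': case '(': case ')': case '[': case ']':
        case '{': case '}': case '^': case '$': case '|': case '\\':
            out += '\\';
            out += c;
            break;
        default:
            out += c;
        }
    }
    return out;
}

// Baseline: one alternative per filter, joined with '|'.
std::string joinPatterns(const std::vector<std::string>& globs) {
    std::string out;
    for (const std::string& g : globs) {
        if (!out.empty()) out += '|';
        out += globToRegex(g);
    }
    return out;
}

// Assumed grouping: filters of the plain form "*.ext" share one ".*\."
// prefix; every other filter is appended as its own alternative.
std::string groupBySuffix(const std::vector<std::string>& globs) {
    std::string exts, rest;
    for (const std::string& g : globs) {
        const bool plainExt = g.size() > 2 && g.compare(0, 2, "*.") == 0 &&
                              g.find_first_of("*?", 2) == std::string::npos;
        std::string& target = plainExt ? exts : rest;
        if (!target.empty()) target += '|';
        target += globToRegex(plainExt ? g.substr(2) : g);
    }
    std::string out;
    if (!exts.empty()) out = ".*\\.(?:" + exts + ")";
    if (!rest.empty()) {
        if (!out.empty()) out += '|';
        out += rest;
    }
    return out;
}

// True if the whole file name matches the combined pattern.
bool matches(const std::string& pattern, const std::string& name) {
    return std::regex_match(name, std::regex(pattern));
}
```

Both functions build a pattern that accepts the same file names for a given filter list. Whether the grouped form is faster depends on the regex engine and on how much the filters have in common, which is the caveat raised in the reply above.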


- Simeon


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://git.reviewboard.kde.org/r/109991/#review31314
-----------------------------------------------------------


On April 24, 2013, 6:45 a.m., Lukasz Olender wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://git.reviewboard.kde.org/r/109991/
> -----------------------------------------------------------
> 
> (Updated April 24, 2013, 6:45 a.m.)
> 
> 
> Review request for Nepomuk and Vishesh Handa.
> 
> 
> Description
> -------
> 
> It's related to https://bugs.kde.org/show_bug.cgi?id=303654.
> P.S. I accidentally deleted the author and license info in the patch. Isolated performance tests are also uploaded to http://www.sendspace.com/file/mkihdp (the previous link does not always work). It's my first patch.
> 
> 
> This addresses bug 303654.
>     http://bugs.kde.org/show_bug.cgi?id=303654
> 
> 
> Diffs
> -----
> 
>   common/regexpcache.h d89f968 
>   common/regexpcache.cpp df45277 
> 
> Diff: http://git.reviewboard.kde.org/r/109991/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Lukasz Olender
> 
>
