[Kde-pim] Review Request: Optimize Contact Search Job queries

Laurent Montel montel at kde.org
Wed Oct 31 17:16:54 GMT 2012



> On Oct. 28, 2012, 11:21 p.m., Laurent Montel wrote:
> > I tested it.
> > And it works.
> > I can't say whether the search is actually faster, but it finds contacts :)
> > So for me it works.
> > 
> > For me I can say "ship it" :)
> >

Nobody wants to review? :)
Otherwise, ship it ;)


- Laurent


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://git.reviewboard.kde.org/r/107065/#review21075
-----------------------------------------------------------


On Oct. 26, 2012, 8:32 p.m., Vishesh Handa wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://git.reviewboard.kde.org/r/107065/
> -----------------------------------------------------------
> 
> (Updated Oct. 26, 2012, 8:32 p.m.)
> 
> 
> Review request for KDEPIM-Libraries, Tobias Koenig, David Faure, and Sebastian Trueg.
> 
> 
> Description
> -------
> 
> Some of the contact search job queries are pathological and result in Virtuoso consuming a lot of CPU for long periods of time. I've optimized the queries by doing the following:
> 
> * Not adding a "?r a nco:Contact" term. It is unnecessary and results in an extra property being matched. Since Nepomuk data almost always follows the ontologies (the exception is legacy data), using only properties which have a domain of nco:Contact should guarantee the correct results.
> 
> * Avoiding unions - Virtuoso cannot optimize unions that well. We use a FILTER(?p in (..)) instead.
> 
> * Avoiding regex-based search - Regex-based search will always be terribly slow. It literally applies the regular expression to each candidate in order to filter them out. It's a lot better to use the full-text index. This is done using 'bif:contains'. We do lose a little bit of accuracy, and we cannot match word boundaries. But I think having a good user experience trumps a little bit of accuracy. (If you agree, then I'll use bif:contains everywhere.)
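
For illustration, the three points above could be combined roughly as follows. This is only a sketch, not the code from the patch: the helper names (joinProperties, buildContactQuery) and the exact query layout are assumptions, prefix declarations are left out, and the search term is not escaped. It uses plain std::string rather than QString so that it compiles on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: joins property names into the comma-separated
// list that goes inside FILTER(?p in (...)).
std::string joinProperties(const std::vector<std::string>& props) {
    std::string out;
    for (std::size_t i = 0; i < props.size(); ++i) {
        if (i > 0) {
            out += ", ";
        }
        out += props[i];
    }
    return out;
}

// Hypothetical query builder following the description:
//  - no "?r a nco:Contact" term,
//  - one FILTER(?p in (...)) instead of a UNION per property,
//  - bif:contains (full-text index) instead of a regex filter.
// Assumption: 'term' is already safe to embed; no escaping is done here.
std::string buildContactQuery(const std::vector<std::string>& props,
                              const std::string& term) {
    return "select distinct ?r where { "
           "?r ?p ?v . "
           "FILTER(?p in (" + joinProperties(props) + ")) . "
           "?v bif:contains \"'" + term + "'\" . }";
}
```

Calling buildContactQuery with nco:fullname and nco:nameGiven and the term john yields a single triple pattern with one property filter and one full-text match, where the earlier approach would have needed one union branch and one regex filter per property.
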
>  
> 
> 
> Diffs
> -----
> 
>   akonadi/contact/contactsearchjob.cpp 5df3bfd 
> 
> Diff: http://git.reviewboard.kde.org/r/107065/diff/
> 
> 
> Testing
> -------
> 
> Not tested at all. I've just run some of the queries in Virtuoso and looked at the corresponding SQL, which I used as the basis for the optimization. I would like someone to test this out, preferably people with large quantities of indexed data.
> 
> 
> Thanks,
> 
> Vishesh Handa
> 
>

_______________________________________________
KDE PIM mailing list kde-pim at kde.org
https://mail.kde.org/mailman/listinfo/kde-pim
KDE PIM home page at http://pim.kde.org/


