[Kde-pim] Review Request: Add indexing throttling and fixed endless indexing problems

Sebastian Trueg strueg at mandriva.com
Fri Mar 9 15:38:39 GMT 2012


On 03/09/2012 04:24 PM, Volker Krause wrote:
> On Thursday 08 March 2012 20:24:16 Sebastian Trueg wrote:
>> On 03/08/2012 05:33 PM, Volker Krause wrote:
>>> On Tuesday 06 March 2012 07:49:55 Sebastian Trueg wrote:
>>>> On 03/05/2012 04:49 PM, Volker Krause wrote:
>>>>> On Sunday 26 February 2012 12:33:49 Sebastian Trueg wrote:
>>>>>> Since I am unable to reproduce this slow query - it seems that I am
>>>>>> missing data which results in such a query - could you please:
>>>>>>
>>>>>> * apply the attached patch to kde-runtime/nepomuk/services/backupsync
>>>>>> (-p3)
>>>>>> * shut down akonadi and nepomuk, remove the nepomuk db (or move it to a
>>>>>> backup location)
>>>>>> * start the storage service manually:
>>>>>> "nepomukservicestub nepomukstorage 2> /tmp/mytmpfile"
>>>>>> * start akonadi, let it finish its indexing, even if Virtuoso goes wild
>>>>>> * finally get the hardest query via: "grep XXXXX /tmp/mytmpfile|sed
>>>>>> "s/.*XXXXX //g"|sort -n|uniq|tail -20".
>>>>>>
>>>>>> Thanks for the help.
>>>>> 20420633 sparql select distinct ?r count(?p) as ?cnt where { ?r ?p ?o.
>>>>> filter( ?p in (<ht
>>>>> 11627450 sparql select distinct ?r count(?p) as ?cnt where { ?r ?p ?o.
>>>>> filter( ?p in (<ht
>>>>> 2 status()
>>>>> 4850172 sparql select distinct ?r count(?p) as ?cnt where { ?r ?p ?o.
>>>>> filter( ?p in (<ht
>>>>> 5972876 sparql select distinct ?r count(?p) as ?cnt where { ?r ?p ?o.
>>>>> filter( ?p in (<ht
>>>>>
>>>>> doesn't look like this is working :-(
>>>>>
>>>>> I ended up with 4 threads with infinite queries now, but no output, as
>>>>> they never finish (I see the output for the fast queries, so it's
>>>>> working in general).
>>>> I see. If the problem persists even with the patch I gave you yesterday
>>>> evening, could you please put the debug statement before the query
>>>> execution instead?
>>>> If my patch works, could you please collect the output so I have some
>>>> statistics?
>>> looks very good :)
>>>
>>> For the last two days I did not end up with a hanging Virtuoso anymore,
>>> and it looks like the Akonadi indexing actually managed to finish this
>>> time. So, definitely something we should commit :)
>> great. thanks for testing.
>>
>>> I'll investigate the resulting data more closely during the weekend, but
>>> I've already spotted some things that still need fixing (but that's
>>> largely unrelated to the query performance problem, it'll likely require
>>> another full re-indexing though, which should finally confirm that this
>>> is fixed). Besides the problems with nco:Contact merging already
>>> described by Will, I noticed that some of those nco:Contacts also have a
>>> large number of hasEmailAddress properties, so we seem to only merge the
>>> contact objects, but not the objects they refer to. My attempts at
>>> indexing attachment content (still commented out) also didn't generate
>>> the desired results yet (actually, no results at all, I can't find the
>>> content objects anywhere, only the meta-information we add in the
>>> feeder).
>> Is there a branch which I can have a look at?
> It's in master, agents/nepomukfeeder/plugins/nepomukmailfeeder.cpp:162 in 
> kdepim-runtime. The actual code of indexData() has been recovered from the 
> pre-DMS feeder, I'd assume something is wrong there (it uses the 
> nepomukindexer tool and feeds the attachment via stdin). When doing that 
> manually I fail to find the result in Nepomuk as well, while it works when 
> pointing it to a file. The attachment resource we create before indexing the
> content (the part before line 162 that is not commented out) works fine.
>
> regards,
> Volker
I think that is because nepomukindexer simply cannot handle stdin
properly anymore. Vishesh?
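
For reference, the log-filtering step quoted above (grep, sed, sort, uniq,
tail) can be tried on a small sample first. This is only a sketch: the sample
lines are made up, and the "XXXXX <number> <query>" layout is an assumption
based on the output quoted in this thread, not on the patch itself.

```shell
# Hypothetical sample of the storage service's stderr. The line layout
# "... XXXXX <number> <query>" is assumed, not taken from the real patch.
log=$(mktemp)
cat > "$log" <<'EOF'
nepomukstorage(1234) XXXXX 2 status()
nepomukstorage(1234) some unrelated debug line
nepomukstorage(1234) XXXXX 4850172 sparql select distinct ?r ...
nepomukstorage(1234) XXXXX 20420633 sparql select distinct ?r ...
nepomukstorage(1234) XXXXX 2 status()
EOF

# Keep only the marked lines, strip everything up to and including the
# marker, sort numerically on the leading number, drop duplicate lines,
# and keep the 20 entries with the largest numbers.
result=$(grep XXXXX "$log" | sed 's/.*XXXXX //' | sort -n | uniq | tail -20)
printf '%s\n' "$result"
rm -f "$log"
```

Because the list is sorted in ascending order, the entry with the largest
number ends up on the last line.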