[Kde-pim] Review Request: Add indexing throttling and fixed endless indexing problems

Sebastian Trüg sebastian at trueg.de
Mon Feb 20 16:34:27 GMT 2012


Please reproduce the high load, then start
isql localhost:1111 dba dba
and run "status();"
and tell me the running queries.

Thanks,
Sebastian

On 02/20/2012 11:44 AM, Christian Mollekopf wrote:
> On Friday 17 February 2012 19.53:42 Volker Krause wrote:
>> On Friday 17 February 2012 09:00:19 Sebastian Trueg wrote:
>>>> On Feb. 16, 2012, 7:45 p.m., Christian Mollekopf wrote:
>>>>> The HighPrio Queue shouldn't ever be throttled ideally, but in view of
>>>>> the current problems it's definitely a reasonable approach. I didn't
>>>>> give a close look yet, but you can ship it from my side. Thanks for
>>>>> the patch.
>>>
>>> Currently there is no way around throttling the high prio queue. As stated
>>> above (and as you confirmed in private email) adding a new email account
>>> will result in newItem events for all the emails. That in turn will put
>>> them into the high prio queue.
>>
>> I'm currently testing this, and it indeed seems to improve indexing
>> considerably. Without throttling in effect my system is now reliably
>> indexing hundreds of mails per minute, without getting stuck with Virtuoso
>> going crazy.
>>
> 
> It also runs a lot faster for me; it seems the merging was not the
> bottleneck after all. I'm not sure why it is so much faster now, though.
> 
>> I (locally) reduced the idle time limit a bit though, with the new two
>> minute setting it rarely switches to full speed here, maybe something we
>> still want to tweak.
>>
> 
> Feel free to do so once that is committed.
> 
>> I get a lot of these errors now though: "nepomukservicestub(24734)" Soprano:
>> "Invalid argument (1)":
>> "http://www.semanticdesktop.org/ontologies/2007/03/22/nmo#messageHeader has
>> a max cardinality of 1. Provided 2 values -
>> <nepomuk:/res/dfc71807-249b-47e4-91c1-90e3bd940f4d>,
>> <nepomuk:/res/93266175-f423-481b-a371-2b6ed28c5dbb>. Existing - "
>>
>> This seems to be caused by emails with more than one extra header we index
>> (such as List-Id), and thus triggers on basically everything in my
>> mailing list folders. Affected emails are skipped and re-indexed at an agent
>> restart (which of course fails again). Is nmo:messageHeader the right
>> property for these headers, and if yes, why does it have cardinality one?
> 
> I'm running into those too; I'll have a look at it. This might be related to a
> bug in the merger (I will have to check again). I reduced the batch size to 1
> in order to minimize this problem. Once that is working we probably want to
> increase the size again for performance reasons, which will of course affect
> the system load again (Nepomuk will then get e.g. 100 items at a time instead
> of just one).
> 
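To make the cardinality error above concrete, here is a minimal sketch in Python of the kind of check that produces it: group the values to be stored by (resource, property) and flag any group that exceeds the property's maximum cardinality. This is not the Nepomuk or Soprano implementation; the function name, the MAX_CARDINALITY table, and the resource URIs in the usage note are assumptions made up for illustration. Only the nmo:messageHeader URI and its limit of 1 come from the error message.

```python
from collections import defaultdict

# Property URI taken from the error message quoted in this thread.
NMO_MESSAGE_HEADER = (
    "http://www.semanticdesktop.org/ontologies/2007/03/22/nmo#messageHeader"
)

# Assumption for illustration: a plain lookup table of max cardinalities.
# The real store reads these limits from the ontology.
MAX_CARDINALITY = {NMO_MESSAGE_HEADER: 1}


def find_cardinality_violations(triples, max_cardinality=MAX_CARDINALITY):
    """Return (subject, property, value_count, limit) for each violation.

    `triples` is an iterable of (subject, property, value) tuples.
    """
    values = defaultdict(set)
    for subject, prop, value in triples:
        values[(subject, prop)].add(value)

    violations = []
    for (subject, prop), vals in values.items():
        limit = max_cardinality.get(prop)
        # Properties without a declared limit are never flagged.
        if limit is not None and len(vals) > limit:
            violations.append((subject, prop, len(vals), limit))
    return violations
```

Under these assumptions, an email resource with two header resources attached through nmo:messageHeader (for example one for List-Id and one for another extra header) is reported with a value count of 2 against a limit of 1, which mirrors the "Provided 2 values" message.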
> Otherwise I think we can commit that to master. I can backport it later when I 
> have my other fixes in.
> 
> Sebastian, can you commit this to master? Or do you want me to do it?
> 
> I think I still have to store the already indexed collections in the config, 
> because right now, if the initial indexing hasn't finished and the computer is 
> rebooted it will still start over, meaning it will query for the existence of 
> all items again. If that is fast enough that won't be a problem of course.
> 
> The removal of items seems to be broken right now too, that's not related to 
> the patch though.
> 
> Also, I'm still not sure that this is the actual cause of the "Virtuoso going
> crazy" bug, but in either case it is a very good improvement.
> 
> Cheers,
> Christian
> 
>>
>> regards,
>> Volker
> 
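For readers following the throttling discussion quoted above, here is a minimal Python sketch of the idea as described in this thread: two queues, with the high priority one served first, and a delay between items that drops to zero once the user has been idle for the idle time limit. It is not the code from the patch under review. The class and method names and the 0.5 second throttled delay are assumptions for illustration; only the two minute idle limit is mentioned in the thread.

```python
from collections import deque

IDLE_LIMIT_SECONDS = 120       # the "two minute setting" mentioned in the thread
THROTTLED_DELAY_SECONDS = 0.5  # assumed value, not taken from the patch


class ThrottledIndexQueue:
    """Hypothetical two-queue scheduler; both queues are throttled."""

    def __init__(self, idle_limit=IDLE_LIMIT_SECONDS,
                 throttled_delay=THROTTLED_DELAY_SECONDS):
        self.idle_limit = idle_limit
        self.throttled_delay = throttled_delay
        self.high = deque()  # e.g. items from newItem events
        self.low = deque()   # e.g. items from the initial indexing run

    def add(self, item, high_priority=False):
        (self.high if high_priority else self.low).append(item)

    def delay_before_next(self, idle_seconds):
        """Seconds to wait before indexing the next item."""
        # Full speed once the user has been idle long enough.
        if idle_seconds >= self.idle_limit:
            return 0.0
        return self.throttled_delay

    def next_item(self):
        """Pop the next item, high priority first; None if both are empty."""
        if self.high:
            return self.high.popleft()
        if self.low:
            return self.low.popleft()
        return None
```

A caller would sleep for delay_before_next(idle_seconds) between calls to next_item(). Because the delay applies regardless of which queue the item came from, a newly added account that floods the high priority queue with newItem events is still throttled while the user is active.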
_______________________________________________
KDE PIM mailing list kde-pim at kde.org
https://mail.kde.org/mailman/listinfo/kde-pim
KDE PIM home page at http://pim.kde.org/


