[Kde-pim] Review Request: Add indexing throttling and fixed endless indexing problems

Christian Mollekopf chrigi_1 at fastmail.fm
Mon Feb 20 10:44:56 GMT 2012


On Friday 17 February 2012 19.53:42 Volker Krause wrote:
> On Friday 17 February 2012 09:00:19 Sebastian Trueg wrote:
> > > On Feb. 16, 2012, 7:45 p.m., Christian Mollekopf wrote:
> > > > The HighPrio Queue shouldn't ever be throttled ideally, but in view of
> > > > the current problems it's definitely a reasonable approach. I didn't
> > > > give a close look yet, but you can ship it from my side. Thanks for
> > > > the patch.
> > 
> > Currently there is no way around throttling the high prio queue. As stated
> > above (and as you confirmed in private email) adding a new email account
> > will result in newItem events for all the emails. That in turn will put
> > them into the high prio queue.
> 
> I'm currently testing this, and it indeed seems to improve indexing
> considerably. Without throttling in effect my system is now reliably
> indexing hundreds of mails per minute, without getting stuck with Virtuoso
> going crazy.
> 

It's also a lot faster for me, so it seems the merging wasn't the bottleneck
after all. I'm not sure why it is so much faster now, though.

> I (locally) reduced the idle time limit a bit though, with the new two
> minute setting it rarely switches to full speed here, maybe something we
> still want to tweak.
> 

Feel free to do so once that is committed.
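
As a rough illustration of the idle-time throttling discussed above, here is a
minimal sketch. It is not the code from the patch: the function name and the
0.5 second pause are assumptions made up for the example, and only the
two-minute idle limit comes from this thread.

```python
IDLE_THRESHOLD_S = 120   # the two-minute idle limit mentioned in the thread
THROTTLED_DELAY_S = 0.5  # hypothetical pause between items while the user is active


def delay_between_items(idle_seconds,
                        idle_threshold=IDLE_THRESHOLD_S,
                        throttled_delay=THROTTLED_DELAY_S):
    """Return how long to wait before indexing the next item.

    Once the user has been idle for at least `idle_threshold` seconds the
    indexer runs at full speed (no delay); otherwise it is throttled.
    """
    if idle_seconds >= idle_threshold:
        return 0.0
    return throttled_delay
```

With a lower threshold, for example delay_between_items(90, idle_threshold=60),
the indexer would switch to full speed sooner, which is the kind of tweak
Volker describes.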

> I get a lot of these errors now though: "nepomukservicestub(24734)" Soprano:
> "Invalid argument (1)":
> "http://www.semanticdesktop.org/ontologies/2007/03/22/nmo#messageHeader has
> a max cardinality of 1. Provided 2 values -
> <nepomuk:/res/dfc71807-249b-47e4-91c1-90e3bd940f4d>,
> <nepomuk:/res/93266175-f423-481b-a371-2b6ed28c5dbb>. Existing - "
> 
> This seems to be caused by emails with more than one extra header we index
> (such as List-Id), and thus triggers on basically everything in my
> mailinglist folders. Affected emails are skipped and re-indexed at an agent
> restart (which of course fails again). Is nmo:messageHeader the right
> property for these headers, and if yes, why does it have cardinality one?

I'm running into those too; I'll have a look at it. This might be related to a
bug in the merger (I will have to check again). I reduced the batch size to 1
in order to minimize this problem. Once that is working we probably want to
increase the size again for performance reasons, which will of course affect
the system load again (Nepomuk will then get e.g. 100 items at a time instead
of just one).
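
To make the batch size trade-off concrete, here is a minimal sketch of how a
queue of items could be cut into batches before being handed to the store. The
helper name `batches` is hypothetical and this is not the feeder's actual code;
it only shows that a batch size of 1 means one store call per item, while 100
means far fewer but heavier calls.

```python
def batches(items, batch_size):
    """Yield consecutive lists of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Hand over the remaining items that did not fill a whole batch.
        yield batch
```

For 250 queued items, a batch size of 1 gives 250 store calls and a batch size
of 100 gives three (100, 100 and 50 items).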

Otherwise I think we can commit that to master. I can backport it later when I 
have my other fixes in.

Sebastian, can you commit this to master? Or do you want me to do it?

I think I still have to store the already indexed collections in the config,
because right now, if the initial indexing hasn't finished and the computer is
rebooted, it will start over, meaning it will query for the existence of all
items again. If that is fast enough, that won't be a problem of course.
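
A minimal sketch of that idea, assuming a plain JSON file stands in for the
agent's config (the real agent would use its own config mechanism). The file
format and all function names are hypothetical and only show how finished
collections could be skipped after a restart.

```python
import json
import os


def load_indexed(path):
    """Return the set of collection ids recorded as fully indexed."""
    if not os.path.exists(path):
        return set()
    with open(path) as f:
        return set(json.load(f))


def mark_indexed(path, collection_id):
    """Record that a collection has been completely indexed."""
    done = load_indexed(path)
    done.add(collection_id)
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(sorted(done), f)
    # Replace the old file in one step so a crash does not leave it half-written.
    os.replace(tmp, path)


def collections_to_index(path, all_collections):
    """Return the collections that still need initial indexing."""
    done = load_indexed(path)
    return [c for c in all_collections if c not in done]
```

After a reboot, only the collections not yet recorded would be queried again.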

The removal of items seems to be broken right now too, but that's not related
to the patch.

Also, I'm still not sure that this is the actual cause of the "Virtuoso going
crazy" bug, but in either case it's a very good improvement.

Cheers,
Christian

> 
> regards,
> Volker
_______________________________________________
KDE PIM mailing list kde-pim at kde.org
https://mail.kde.org/mailman/listinfo/kde-pim
KDE PIM home page at http://pim.kde.org/


