[Kde-pim] More Nepomuk Feeder Improvements

Vishesh Handa me at vhanda.in
Wed May 29 13:55:16 BST 2013


Hey Christian


On Wed, May 29, 2013 at 4:45 PM, Christian Mollekopf
<chrigi_1 at fastmail.fm> wrote:

> On Wednesday 29 May 2013 13.14:04 Christian Mollekopf wrote:
> > On Wednesday 29 May 2013 14.47:07 Vishesh Handa wrote:
> > > Hey Christian
> >
> > Hey Vishesh,
> >
> > > I've made some more improvements to the nepomuk feeder. Most of it is
> > > simple stuff like scheduling the operations better, and reacting to
> > > config changes. Others are just simple cleanups.
> > >
> > > The biggest change is probably that I have removed the half-hour and
> > > hourly checks for emails.
> > > Also I've removed the whole concept of batch indexing.
> >
> > Not a good idea IMO
> >
> > That code has been there to detect mass insertions of new items.
> >
> > Let's say you get notifications for 300'000 items within a minute or so
> > (I just added my email account), and then get another 300'000
> > notifications because I just removed my other account.
> >
> > That leads to:
> > * huge queues (shouldn't be a big deal, but maybe only store the item
> >   ids instead of the full items for the ones to remove)
> > * all items end up in the high prio queue
> >
> > => The feeder is utterly useless until you restart because it will be
> > busy doing stuff that it should do in background processing (and that
> > is not really relevant atm.)
> >
> > The code before prevented that by simply skipping the batch, so we don't
> > mistake mass changes for actually relevant stuff (because I don't care
> > if my email account takes a day to index, but I want the note I just
> > added to be immediately available in search).
> >
> > The regular FindUnindexedJob, which was only executed if such a batch was
> > actually added and items were skipped, so not during normal operation,
> > would then retrieve the skipped items again, properly scheduling them as
> > background processing work.
> >
> > I'd rather not lose that.
> >
>
> But the scheduler is probably a better place to implement it...
>
>
Right. That's what I thought as well.
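
As a rough illustration of what handling mass changes inside the scheduler
could look like: the sketch below counts notifications in a sliding time
window and, once a threshold is exceeded, routes further items to the low
priority queue instead of skipping them. This is not the feeder's actual
code. The class name, the threshold and the window length are all made up
for the example, and only item ids are queued, as suggested above.

```python
from collections import deque


class BurstAwareScheduler:
    """Hypothetical sketch of burst detection, not the real feeder scheduler."""

    def __init__(self, burst_threshold=100, window_secs=60.0):
        # Both values are assumptions picked for illustration only.
        self.burst_threshold = burst_threshold
        self.window_secs = window_secs
        self.high_prio = deque()  # items the user probably just touched
        self.low_prio = deque()   # background work, e.g. a newly added account
        self._recent = deque()    # timestamps of recent notifications

    def notify(self, item_id, now):
        """Queue an item id; `now` is the notification time in seconds."""
        # Forget notifications that have fallen out of the time window.
        while self._recent and now - self._recent[0] > self.window_secs:
            self._recent.popleft()
        self._recent.append(now)

        # More notifications than the threshold within the window:
        # treat this as a mass change and schedule it as background work.
        if len(self._recent) > self.burst_threshold:
            self.low_prio.append(item_id)
        else:
            self.high_prio.append(item_id)

    def next_item(self):
        """Return the next item id to index, high priority first."""
        if self.high_prio:
            return self.high_prio.popleft()
        if self.low_prio:
            return self.low_prio.popleft()
        return None
```

Note that the first few items of a burst still land in the high priority
queue, because the burst is only recognised once the threshold is crossed.
A single note added after the burst has died down goes to the high priority
queue and is indexed before the remaining background items.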

Plus, the FindUnindexedItems job is very expensive (kind of), so I would
like to avoid calling it. Also, calling it after 30 minutes doesn't make much
sense, because the user loses context. If I add a new email account and
emails are being indexed, it makes sense to me, but half an hour later?
I'll just be wondering what is going on.

I can improve the scheduler to index more recent emails first, and give
other things such as notes, and contacts a higher precedence.
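
To make the ordering concrete, here is a minimal sketch of such a queue,
assuming each item carries a type and a timestamp. It is only an
illustration, not the scheduler's real implementation: the IndexQueue
class, the TYPE_RANK table and the type names are hypothetical.

```python
import heapq

# Hypothetical precedence table: a lower rank is indexed first.
TYPE_RANK = {"note": 0, "contact": 0, "email": 1}


class IndexQueue:
    """Hypothetical sketch: notes and contacts first, then newest emails."""

    def __init__(self):
        self._heap = []
        self._counter = 0  # tie-breaker that keeps insertion order stable

    def add(self, item_id, item_type, timestamp):
        # Unknown types are treated like emails in this sketch.
        rank = TYPE_RANK.get(item_type, 1)
        # Negate the timestamp so that newer items sort first within a rank.
        heapq.heappush(self._heap, (rank, -timestamp, self._counter, item_id))
        self._counter += 1

    def pop(self):
        """Return the id of the next item to index, or None if empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[3]
```

With this ordering, a note is popped before any email regardless of its
age, and among emails the most recent one comes out first.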


-- 
Vishesh Handa
_______________________________________________
KDE PIM mailing list kde-pim at kde.org
https://mail.kde.org/mailman/listinfo/kde-pim
KDE PIM home page at http://pim.kde.org/


