[Kde-pim] [Nepomuk] Problems with akonadi_nepomuk_feeder/virtuoso

Sun Nov 27 21:11:51 GMT 2011

On Saturday 26 November 2011 01.01:25 Christian Mollekopf wrote:
> Hey,
> 
> I'm currently trying to debug a problem in the akonadi_nepomuk_feeder -
> virtuoso stack. It's pretty difficult to debug however and I'm not sure
> what's going on yet.
> 
> What I can see is that virtuoso and nepomuk_storage go berserk (fully using
> my CPU), and the feeder suddenly uses lots of RAM. I think that
> nepomuk_storage using the CPU while indexing is normal (it's doing all the
> heavy work), but the virtuoso process doesn't even recover after stopping
> the feeder.
> In my testing so far I ensured that I can make a full initial indexing
> without any of those effects (nepomuk uses of course cpu but that's normal,
> the feeder uses maybe ~10'000k ram depending on the loaded modules). I
> indexed this way ~100k mails without a problem.
> 
> I think the root of the problem is the virtuoso process going berserk, which
> results in job timeouts in the nepomuk_feeder when I start a new
> Nepomuk::storeResources job, presumably because nepomuk_storage cannot do
> its job due to virtuoso being busy with itself.
> I'm not sure, however, how the feeder once managed to use over 800'000k of
> RAM (possible scenarios are the Item queue getting huge because a
> collection was fetched with loads of big items, including their payload, or
> the ChangeRecorder just recording too many changes).
> 

By now I'm pretty sure that it is just the queue that is getting huge: I
managed to produce a queue of ~40'000 items, which resulted in the feeder
using 600 MB of RAM. As a workaround I could limit the queue to a certain
size, but then all items dropped from the queue would not be indexed (until
they are changed again).
A proper fix would probably involve queuing the items by id only and then
fetching them in batches of only 100 items or so, but I suppose I'm a bit
late for this (it would need a bit of refactoring code-wise).
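
To make the id-only idea concrete, here is a rough sketch in plain C++. It
does not use any Akonadi or Nepomuk classes: ItemId and IdQueue are
hypothetical names made up for illustration, and the real feeder would have
to fetch the payloads for each batch through Akonadi before indexing.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Hypothetical stand-in for an item id; not the real Akonadi type.
using ItemId = std::int64_t;

// Queue that stores ids only, so each pending item costs a few bytes
// instead of a full payload.
class IdQueue {
public:
    explicit IdQueue(std::size_t batchSize) : m_batchSize(batchSize) {
        assert(batchSize > 0);
    }

    void enqueue(ItemId id) { m_ids.push_back(id); }

    bool isEmpty() const { return m_ids.empty(); }
    std::size_t size() const { return m_ids.size(); }

    // Removes and returns up to batchSize ids, oldest first. The caller
    // would then fetch the payloads for just these ids and index them.
    std::vector<ItemId> takeBatch() {
        const std::size_t n = std::min(m_batchSize, m_ids.size());
        const auto end = m_ids.begin() + static_cast<std::ptrdiff_t>(n);
        std::vector<ItemId> batch(m_ids.begin(), end);
        m_ids.erase(m_ids.begin(), end);
        return batch;
    }

private:
    std::size_t m_batchSize;
    std::deque<ItemId> m_ids;
};
```

With a batch size of 100, at most 100 payloads would be held in memory at
any time, no matter how many ids are waiting in the queue.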

Should I limit the queue to a certain size?
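
For comparison, the size-limit workaround could look roughly like the sketch
below. Again this is plain C++ with a made-up class name (BoundedQueue), not
the feeder's actual ItemQueue. It rejects new items once the limit is reached
and counts how many were dropped, so that the loss is at least visible.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <utility>

// Hypothetical bounded queue: new items are dropped once maxSize is reached.
template <typename T>
class BoundedQueue {
public:
    explicit BoundedQueue(std::size_t maxSize) : m_maxSize(maxSize) {
        assert(maxSize > 0);
    }

    // Returns false (and drops the item) if the queue is already full.
    bool enqueue(T item) {
        if (m_items.size() >= m_maxSize) {
            ++m_dropped;
            return false;
        }
        m_items.push_back(std::move(item));
        return true;
    }

    // Removes and returns the oldest item; the queue must not be empty.
    T dequeue() {
        assert(!m_items.empty());
        T item = std::move(m_items.front());
        m_items.pop_front();
        return item;
    }

    bool isEmpty() const { return m_items.empty(); }
    std::size_t size() const { return m_items.size(); }

    // Number of items rejected so far because the queue was full.
    std::size_t dropped() const { return m_dropped; }

private:
    std::size_t m_maxSize;
    std::size_t m_dropped = 0;
    std::deque<T> m_items;
};
```

The dropped items are exactly the ones that would stay unindexed until they
are changed again, which is the downside of this approach.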

Any input appreciated.

> Job timeout manifests in this error:
> 
> akonadi_nepomuk_feeder(19233) ItemQueue::jobResult: "Did not receive a
> reply. Possible causes include: the remote application did not send a
> reply, the message bus security policy blocked the reply, the reply timeout
> expired, or the network connection was broken."
> 
> 
> What I would be interested in:
> 
> - Is it possible that the ChangeRecorder gets this big (given that it is
> configured to fetch the full payload)?
> - How can I debug virtuoso?
> 
> I don't have the isql-vt command (need to compile it myself), but
> nepomukserver just repeatedly prints what you can find in the attached
> nepomukoutput.txt if that gives a hint.
> 
> I have no idea where this query is coming from, nor what it is doing, but it
> seems this is what's keeping virtuoso busy (or not?).
> 
> Right now I throw new data at nepomuk as soon as the datastore job has
> failed (because I just wait until the job has finished), which probably also
> doesn't help nepomuk to recover. So we might need a solution for jobs which
> just take longer than the dbus job timeout. I don't know, however, if it is
> a valid scenario that a store job can take longer than the dbus job
> timeout, or if this is just caused by virtuoso going berserk on some query.
> 
> Anyways, I just wanted to let you know what's going on, and if you know
> something or want to help, any help/input is greatly appreciated.
> 
> Cheers,
> Christian
_______________________________________________
KDE PIM mailing list kde-pim at kde.org
https://mail.kde.org/mailman/listinfo/kde-pim
KDE PIM home page at http://pim.kde.org/