[Kde-pim] Review Request: Add indexing throttling and fixed endless indexing problems

Vishesh Handa handa.vish at gmail.com
Fri Mar 9 18:38:36 GMT 2012


On Fri, Mar 9, 2012 at 9:08 PM, Sebastian Trueg <strueg at mandriva.com> wrote:

> On 03/09/2012 04:24 PM, Volker Krause wrote:
> > On Thursday 08 March 2012 20:24:16 Sebastian Trueg wrote:
> >> On 03/08/2012 05:33 PM, Volker Krause wrote:
> >>> On Tuesday 06 March 2012 07:49:55 Sebastian Trueg wrote:
> >>>> On 03/05/2012 04:49 PM, Volker Krause wrote:
> >>>>> On Sunday 26 February 2012 12:33:49 Sebastian Trueg wrote:
> >>>>>> Since I am unable to reproduce this slow query - it seems that I am
> >>>>>> missing data which results in such a query - could you please:
> >>>>>>
> >>>>>> * apply the attached patch to kde-runtime/nepomuk/services/backupsync
> >>>>>>   (-p3)
> >>>>>> * shutdown akonadi and nepomuk, remove the nepomuk db (or move it to
> >>>>>>   a backup location)
> >>>>>> * start the storage service manually: "nepomukservicestub
> >>>>>>   nepomukstorage 2> /tmp/mytmpfile"
> >>>>>> * start akonadi, let it finish its indexing, even if Virtuoso goes wild
> >>>>>> * finally get the hardest query via: "grep XXXXX /tmp/mytmpfile|sed
> >>>>>>   "s/.*XXXXX //g"|sort -n|uniq|tail -20".
> >>>>>>
> >>>>>> Thanks for the help.
> >>>>> 20420633 sparql select distinct ?r count(?p) as ?cnt where { ?r ?p ?o. filter( ?p in (<ht
> >>>>> 11627450 sparql select distinct ?r count(?p) as ?cnt where { ?r ?p ?o. filter( ?p in (<ht
> >>>>> 2 status()
> >>>>> 4850172 sparql select distinct ?r count(?p) as ?cnt where { ?r ?p ?o. filter( ?p in (<ht
> >>>>> 5972876 sparql select distinct ?r count(?p) as ?cnt where { ?r ?p ?o. filter( ?p in (<ht
> >>>>>
> >>>>> doesn't look like this is working :-(
> >>>>>
> >>>>> I ended up with 4 threads with infinite queries now, but no output, as
> >>>>> they never finish (I see the output for the fast queries, so it's
> >>>>> working in general).
> >>>> I see. If the problem persists even with the patch I gave you yesterday
> >>>> evening, could you please put the debug statement before the query
> >>>> execution instead?
> >>>> If my patch works, could you please collect the output so I have some
> >>>> statistics?
> >>> looks very good :)
> >>>
> >>> For the last two days I did not end up with a hanging Virtuoso anymore,
> >>> and it looks like the Akonadi indexing actually managed to finish this
> >>> time. So, definitely something we should commit :)
> >> great. thanks for testing.
> >>
> >>> I'll investigate the resulting data more closely during the weekend, but
> >>> I've already spotted some things that still need fixing (but that's
> >>> largely unrelated to the query performance problem, it'll likely require
> >>> another full re-indexing though, which should finally confirm that this
> >>> is fixed). Besides the problems with nco:Contact merging already
> >>> described by Will, I noticed that some of those nco:Contacts also have a
> >>> large amount of hasEmailAddress properties, so we seem to only merge the
> >>> contact objects, but not the objects they refer to. My attempts on
> >>> indexing attachment content (still commented out) also didn't generate
> >>> the desired results yet (actually, no results at all, I can't find the
> >>> content objects anywhere, only the meta-information we add in the
> >>> feeder).
> >> Is there a branch which I can have a look at?
> > It's in master, agents/nepomukfeeder/plugins/nepomukmailfeeder.cpp:162 in
> > kdepim-runtime. The actual code of indexData() has been recovered from the
> > pre-DMS feeder, I'd assume something is wrong there (it uses the
> > nepomukindexer tool and feeds the attachment via stdin). When doing that
> > manually I fail to find the result in Nepomuk as well, while it works when
> > pointing it to a file. The attachment resource we create before indexing the
> > content (the not commented out part before line 162) works fine.
> >
> > regards,
> > Volker
> I think that is because nepomukindexer simply cannot handle stdin
> properly anymore. Vishesh?
>

eh? Works for me.

cat someFile | nepomukindexer --uri '/home/vishesh/someFile'

In the case of PIM, the uri will have to be the actual uri in the form of
nepomuk:/res/some-uuid.
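
For the PIM case, a minimal sketch of the same stdin invocation wrapped in a
helper might look like the following. The function name index_stdin, the
NEPOMUK_INDEXER override variable and the dry-run fallback are assumptions
made for illustration only; the one thing taken from the command above is
the --uri option, and nepomuk:/res/some-uuid remains a placeholder for the
real resource URI.

```shell
#!/bin/sh
# Sketch only: feed data to nepomukindexer on stdin and name the target
# resource with --uri, as in the command above.

# index_stdin URI
# Hypothetical helper. NEPOMUK_INDEXER is an assumed override so the
# binary can be swapped out; it defaults to "nepomukindexer".
index_stdin() {
    uri=$1
    indexer=${NEPOMUK_INDEXER:-nepomukindexer}
    if command -v "$indexer" >/dev/null 2>&1; then
        # Real run: the indexer reads the content from our stdin.
        "$indexer" --uri "$uri"
    else
        # Dry run: consume stdin and report what would have been indexed.
        bytes=$(wc -c | tr -d ' ')
        echo "dry run: $bytes bytes for $uri"
    fi
}
```

Usage would be along the lines of
index_stdin 'nepomuk:/res/some-uuid' < attachmentFile, with the URI replaced
by the one of the attachment resource created before indexing.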
_______________________________________________
KDE PIM mailing list kde-pim at kde.org
https://mail.kde.org/mailman/listinfo/kde-pim
KDE PIM home page at http://pim.kde.org/


