[Kde-pim] Review Request: Add indexing throttling and fixed endless indexing problems
Sebastian Trueg
strueg at mandriva.com
Fri Mar 9 18:51:05 GMT 2012
On 03/09/2012 07:38 PM, Vishesh Handa wrote:
>
>
> On Fri, Mar 9, 2012 at 9:08 PM, Sebastian Trueg <strueg at mandriva.com> wrote:
>
> On 03/09/2012 04:24 PM, Volker Krause wrote:
> > On Thursday 08 March 2012 20:24:16 Sebastian Trueg wrote:
> >> On 03/08/2012 05:33 PM, Volker Krause wrote:
> >>> On Tuesday 06 March 2012 07:49:55 Sebastian Trueg wrote:
> >>>> On 03/05/2012 04:49 PM, Volker Krause wrote:
> >>>>> On Sunday 26 February 2012 12:33:49 Sebastian Trueg wrote:
> >>>>>> Since I am unable to reproduce this slow query - it seems that I am
> >>>>>> missing data which results in such a query - could you please:
> >>>>>>
> >>>>>> * apply the attached patch to kde-runtime/nepomuk/services/backupsync
> >>>>>> (-p3)
> >>>>>> * shutdown akonadi and nepomuk, remove the nepomuk db (or move it to a
> >>>>>> backup location)
> >>>>>> * start the storage service manually: "nepomukservicestub
> >>>>>> nepomukstorage 2> /tmp/mytmpfile"
> >>>>>> * start akonadi, let it finish its indexing, even if Virtuoso goes wild
> >>>>>> * finally get the hardest query via: "grep XXXXX /tmp/mytmpfile|sed
> >>>>>> "s/.*XXXXX //g"|sort -n|uniq|tail -20".
> >>>>>>
> >>>>>> Thanks for the help.
> >>>>> 20420633 sparql select distinct ?r count(?p) as ?cnt where { ?r ?p ?o. filter( ?p in (<ht
> >>>>> 11627450 sparql select distinct ?r count(?p) as ?cnt where { ?r ?p ?o. filter( ?p in (<ht
> >>>>> 2 status()
> >>>>> 4850172 sparql select distinct ?r count(?p) as ?cnt where { ?r ?p ?o. filter( ?p in (<ht
> >>>>> 5972876 sparql select distinct ?r count(?p) as ?cnt where { ?r ?p ?o. filter( ?p in (<ht
> >>>>>
> >>>>> doesn't look like this is working :-(
> >>>>>
> >>>>> I ended up with 4 threads with infinite queries now, but no output, as
> >>>>> they never finish (I see the output for the fast queries, so it's
> >>>>> working in general).
> >>>> I see. If the problem persists even with the patch I gave you yesterday
> >>>> evening, could you please put the debug statement before the query
> >>>> execution instead?
> >>>> If my patch works, could you please collect the output so I have some
> >>>> statistics?
> >>> looks very good :)
> >>>
> >>> For the last two days I did not end up with a hanging Virtuoso anymore,
> >>> and it looks like the Akonadi indexing actually managed to finish this
> >>> time. So, definitely something we should commit :)
> >> great. thanks for testing.
> >>
> >>> I'll investigate the resulting data more closely during the weekend, but
> >>> I've already spotted some things that still need fixing (but that's
> >>> largely unrelated to the query performance problem, it'll likely require
> >>> another full re-indexing though, which should finally confirm that this
> >>> is fixed). Besides the problems with nco:Contact merging already
> >>> described by Will, I noticed that some of those nco:Contacts also have a
> >>> large amount of hasEmailAddress properties, so we seem to only merge the
> >>> contact objects, but not the objects they refer to. My attempts on
> >>> indexing attachment content (still commented out) also didn't generate
> >>> the desired results yet (actually, no results at all, I can't find the
> >>> content objects anywhere, only the meta-information we add in the
> >>> feeder).
> >> Is there a branch which I can have a look at?
> > It's in master, agents/nepomukfeeder/plugins/nepomukmailfeeder.cpp:162 in
> > kdepim-runtime. The actual code of indexData() has been recovered from the
> > pre-DMS feeder, I'd assume something is wrong there (it uses the
> > nepomukindexer tool and feeds the attachment via stdin). When doing that
> > manually I fail to find the result in Nepomuk as well, while it works when
> > pointing it to a file. The attachment resource we create before indexing the
> > content (the not commented out part before line 162) works fine.
> >
> > regards,
> > Volker
> I think that is because nepomukindexer simply cannot handle stdin
> properly anymore. Vishesh?
>
>
> eh? Works for me.
>
> cat someFile | nepomukindexer --uri '/home/vishesh/someFile'
>
> In the case of PIM, the uri will have to be the actual uri in the form
> of nepomuk:/res/some-uuid.
>
Then that is probably the problem. Volker, do you specify the actual
Nepomuk URI or do you provide the akonadi URL?