[Kde-pim] Review Request: Add indexing throttling and fix endless indexing problems

Volker Krause vkrause at kde.org
Thu Mar 8 16:33:29 GMT 2012


On Tuesday 06 March 2012 07:49:55 Sebastian Trueg wrote:
> On 03/05/2012 04:49 PM, Volker Krause wrote:
> > On Sunday 26 February 2012 12:33:49 Sebastian Trueg wrote:
> >> Since I am unable to reproduce this slow query - it seems that I am
> >> missing the data which results in such a query - could you please:
> >> 
> >> * apply the attached patch to kde-runtime/nepomuk/services/backupsync (-p3)
> >> * shut down akonadi and nepomuk, remove the nepomuk db (or move it to a
> >>   backup location)
> >> * start the storage service manually:
> >>     nepomukservicestub nepomukstorage 2> /tmp/mytmpfile
> >> * start akonadi, let it finish its indexing, even if Virtuoso goes wild
> >> * finally get the hardest queries via:
> >>     grep XXXXX /tmp/mytmpfile | sed 's/.*XXXXX //g' | sort -n | uniq | tail -20
> >> 
> >> Thanks for the help.
> > 
> > 20420633 sparql select distinct ?r count(?p) as ?cnt where { ?r ?p ?o. filter( ?p in (<ht
> > 11627450 sparql select distinct ?r count(?p) as ?cnt where { ?r ?p ?o. filter( ?p in (<ht
> > 2 status()
> > 4850172 sparql select distinct ?r count(?p) as ?cnt where { ?r ?p ?o. filter( ?p in (<ht
> > 5972876 sparql select distinct ?r count(?p) as ?cnt where { ?r ?p ?o. filter( ?p in (<ht
> > 
> > doesn't look like this is working :-(
> > 
> > I ended up with 4 threads with infinite queries now, but no output, as they
> > never finish (I see the output for the fast queries, so it's working in
> > general).
> 
> I see. If the problem persists even with the patch I gave you yesterday
> evening, could you please put the debug statement before the query
> execution instead?
> If my patch works, could you please collect the output so I have some
> statistics?

looks very good :)

For the last two days I have not ended up with a hanging Virtuoso anymore, and
it looks like the Akonadi indexing actually managed to finish this time. So,
definitely something we should commit :)
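
For reference, the kind of debug instrumentation discussed above might look
roughly like this; a minimal sketch with invented names (QueryBackend stands
in for the real Soprano model), not Sebastian's actual patch:

  // Log the query *before* executing it, so a hung query still leaves a
  // trace in the log, then log the elapsed time in the grep-able
  // "XXXXX" format used elsewhere in this thread.
  #include <QDebug>
  #include <QElapsedTimer>
  #include <QString>

  struct QueryBackend {
      virtual void executeQuery(const QString &query) = 0;
      virtual ~QueryBackend() {}
  };

  void runTimedQuery(QueryBackend *backend, const QString &query)
  {
      qDebug() << "XXXXX starting:" << query.left(80);

      QElapsedTimer timer;
      timer.start();
      backend->executeQuery(query);

      // "XXXXX <msec> <query>" survives the grep|sed|sort pipeline
      // quoted above, which strips everything up to "XXXXX ".
      qDebug().nospace() << "XXXXX " << timer.elapsed() << " "
                         << qPrintable(query.left(80));
  }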

I'll investigate the resulting data more closely over the weekend, but I've 
already spotted some things that still need fixing. That's largely unrelated 
to the query performance problem, although it will likely require another full 
re-indexing, which should finally confirm that this is fixed. Besides the 
problems with nco:Contact merging already described by Will, I noticed that 
some of those nco:Contacts also have a large number of hasEmailAddress 
properties, so we seem to merge only the contact objects, but not the objects 
they refer to. My attempts at indexing attachment content (still commented 
out) also didn't produce the desired results yet (actually, no results at all; 
I can't find the content objects anywhere, only the meta-information we add in 
the feeder).
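
To get an idea how widespread that hasEmailAddress duplication is, a
Virtuoso-flavoured probe in the same implicit-aggregation style as the queries
quoted below should do; the query is only an illustration, not something that
exists in the tree:

  // Counts email address objects per contact; paste the string (minus
  // the C++ quoting) into the isql shell mentioned further down.
  static const char *duplicateEmailQuery =
      "sparql select ?contact count(?email) as ?cnt where { "
      "?contact a <http://www.semanticdesktop.org/ontologies/2007/03/22/nco#Contact> . "
      "?contact <http://www.semanticdesktop.org/ontologies/2007/03/22/nco#hasEmailAddress> ?email . "
      "} order by desc(?cnt)";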

regards,
Volker


> > regards,
> > Volker
> > 
> >> On 02/22/2012 05:48 PM, Volker Krause wrote:
> >>> Here we go:
> >>> 
> >>> Connected to OpenLink Virtuoso
> >>> Driver: 06.01.3127 OpenLink Virtuoso ODBC Driver
> >>> OpenLink Interactive SQL (Virtuoso), version 0.9849b.
> >>> Type HELP; for help and EXIT; to exit.
> >>> SQL> status();
> >>> REPORT
> >>> VARCHAR
> >>> _______________________________________________________________________________
> >>> 
> >>> OpenLink Virtuoso  Server
> >>> Version 06.01.3127-pthreads for Linux as of ??? ?? ????
> >>> Started on: 2012/02/20 14:34 GMT+60
> >>> Lite Mode
> >>> 
> >>> Database Status:
> >>>   File size 1344274432, 164096 pages, 53194 free.
> >>>   400 buffers, 400 used, 1 dirty 1 wired down, repl age 125 0 w. io 1 w/crsr.
> >>>   Disk Usage: 1114387496 reads avg 0 msec, 0% r 0% w last 0 s, 8976288 writes,
> >>>     537880 read ahead, batch = 6.  Autocompact 64288 in 40567 out, 36% saved.
> >>> 
> >>> Gate:  770368 2nd in reads, 0 gate write waits, 0 in while read 0 busy scrap.
> >>> Log = /home/vkrause/.kde/share/apps/nepomuk/repository/main/data/virtuosobackend/soprano-virtuoso.trx, 87 bytes
> >>> 110879 pages have been changed since last backup (in checkpoint state)
> >>> Current backup timestamp: 0x0000-0x00-0x00
> >>> Last backup date: unknown
> >>> Clients: 610 connects, max 38 concurrent
> >>> RPC: 23159320 calls, -630 pending, 4 max until now, 0 queued, 2790 burst
> >>> reads (0%), 15 second brk=18624512
> >>> Checkpoint Remap 0 pages, 0 mapped back. 2383 s atomic time.
> >>> 
> >>>    DB master 164096 total 53194 free 0 remap 0 mapped back
> >>>    temp  256 total 249 free
> >>> 
> >>> Lock Status: 0 deadlocks of which 0 2r1w, 71 waits,
> >>>    Currently 2 threads running 0 threads waiting 0 threads in vdb.
> >>> 
> >>> Pending:
> >>> 
> >>> Client 1111:1:-48:  Account: dba, 415702328 bytes in, 240297225 bytes out, 0 stmts.
> >>> PID: 24734, OS: unix, Application: unknown, IP#: 127.0.0.1
> >>> Transaction status: PENDING, 0 threads.
> >>> Locks:
> >>> 
> >>> Client 1111:1:-43:  Account: dba, 296436892 bytes in, 173661376 bytes out, 1 stmts.
> >>> PID: 24734, OS: unix, Application: unknown, IP#: 127.0.0.1
> >>> Transaction status: PENDING, 1 threads.
> >>> Locks:
> >>> 
> >>> Client 1111:1:-46:  Account: dba, 292329482 bytes in, 179137713 bytes out, 0 stmts.
> >>> PID: 24734, OS: unix, Application: unknown, IP#: 127.0.0.1
> >>> Transaction status: PENDING, 0 threads.
> >>> Locks:
> >>> 
> >>> Client 1111:1:-17:  Account: dba, 335589080 bytes in, 192090303 bytes out, 0 stmts.
> >>> PID: 24734, OS: unix, Application: unknown, IP#: 127.0.0.1
> >>> Transaction status: PENDING, 0 threads.
> >>> Locks:
> >>> 
> >>> Client 1111:1:-240:  Account: dba, 10656 bytes in, 189180 bytes out, 0 stmts.
> >>> PID: 24734, OS: unix, Application: unknown, IP#: 127.0.0.1
> >>> Transaction status: PENDING, 0 threads.
> >>> Locks:
> >>> 
> >>> Client 1111:1:-41:  Account: dba, 1552 bytes in, 938 bytes out, 0 stmts.
> >>> PID: 24734, OS: unix, Application: unknown, IP#: 127.0.0.1
> >>> Transaction status: PENDING, 0 threads.
> >>> Locks:
> >>> 
> >>> Client 1111:1:-579:  Account: dba, 120819 bytes in, 273415 bytes out, 0 stmts.
> >>> PID: 24734, OS: unix, Application: unknown, IP#: 127.0.0.1
> >>> Transaction status: PENDING, 0 threads.
> >>> Locks:
> >>> 
> >>> Client 1111:1:-13:  Account: dba, 419207763 bytes in, 243086156 bytes out, 0 stmts.
> >>> PID: 24734, OS: unix, Application: unknown, IP#: 127.0.0.1
> >>> Transaction status: PENDING, 0 threads.
> >>> Locks:
> >>> 
> >>> Client 1111:1:-2:  Account: dba, 78053 bytes in, 3394883 bytes out, 0 stmts.
> >>> PID: 24734, OS: unix, Application: unknown, IP#: 127.0.0.1
> >>> Transaction status: PENDING, 0 threads.
> >>> Locks:
> >>> 
> >>> Client 1111:1:-69:  Account: dba, 2564 bytes in, 80545 bytes out, 0 stmts.
> >>> PID: 24734, OS: unix, Application: unknown, IP#: 127.0.0.1
> >>> Transaction status: PENDING, 0 threads.
> >>> Locks:
> >>> 
> >>> Client 1111:610:  Account: dba, 204 bytes in, 255 bytes out, 1 stmts.
> >>> PID: 14404, OS: unix, Application: unknown, IP#: 127.0.0.1
> >>> Transaction status: PENDING, 1 threads.
> >>> Locks:
> >>> 
> >>> Client 1111:1:-594:  Account: dba, 339712 bytes in, 584087 bytes out, 0 stmts.
> >>> PID: 24734, OS: unix, Application: unknown, IP#: 127.0.0.1
> >>> Transaction status: PENDING, 0 threads.
> >>> Locks:
> >>> 
> >>> Client 1111:1:-44:  Account: dba, 424311940 bytes in, 250861715 bytes out, 0 stmts.
> >>> PID: 24734, OS: unix, Application: unknown, IP#: 127.0.0.1
> >>> Transaction status: PENDING, 0 threads.
> >>> Locks:
> >>> 
> >>> Client 1111:1:-65:  Account: dba, 5613 bytes in, 687096 bytes out, 0 stmts.
> >>> PID: 24734, OS: unix, Application: unknown, IP#: 127.0.0.1
> >>> Transaction status: PENDING, 0 threads.
> >>> Locks:
> >>> 
> >>> Client 1111:1:-29:  Account: dba, 7329 bytes in, 108643 bytes out, 0 stmts.
> >>> PID: 24734, OS: unix, Application: unknown, IP#: 127.0.0.1
> >>> Transaction status: PENDING, 0 threads.
> >>> Locks:
> >>> 
> >>> Client 1111:1:-22:  Account: dba, 28081 bytes in, 9393 bytes out, 0 stmts.
> >>> PID: 24734, OS: unix, Application: unknown, IP#: 127.0.0.1
> >>> Transaction status: PENDING, 0 threads.
> >>> Locks:
> >>> 
> >>> Client 1111:1:-470:  Account: dba, 3257767 bytes in, 6410521 bytes out, 0 stmts.
> >>> PID: 24734, OS: unix, Application: unknown, IP#: 127.0.0.1
> >>> Transaction status: PENDING, 0 threads.
> >>> Locks:
> >>> 
> >>> Client 1111:1:-12:  Account: dba, 323006521 bytes in, 181981840 bytes out, 0 stmts.
> >>> PID: 24734, OS: unix, Application: unknown, IP#: 127.0.0.1
> >>> Transaction status: PENDING, 0 threads.
> >>> Locks:
> >>> 
> >>> Client 1111:1:-81:  Account: dba, 30386 bytes in, 401979 bytes out, 0 stmts.
> >>> PID: 24734, OS: unix, Application: unknown, IP#: 127.0.0.1
> >>> Transaction status: PENDING, 0 threads.
> >>> Locks:
> >>> 
> >>> Client 1111:1:-47:  Account: dba, 398031874 bytes in, 233688412 bytes out, 0 stmts.
> >>> PID: 24734, OS: unix, Application: unknown, IP#: 127.0.0.1
> >>> Transaction status: PENDING, 0 threads.
> >>> Locks:
> >>> 
> >>> Client 1111:1:-45:  Account: dba, 254603133 bytes in, 144263381 bytes out, 0 stmts.
> >>> PID: 24734, OS: unix, Application: unknown, IP#: 127.0.0.1
> >>> Transaction status: PENDING, 0 threads.
> >>> Locks:
> >>> 
> >>> Client 1111:1:-23:  Account: dba, 372153873 bytes in, 224998663 bytes out, 0 stmts.
> >>> PID: 24734, OS: unix, Application: unknown, IP#: 127.0.0.1
> >>> Transaction status: PENDING, 0 threads.
> >>> Locks:
> >>> 
> >>> Client 1111:1:-68:  Account: dba, 10462 bytes in, 149481 bytes out, 0 stmts.
> >>> PID: 24734, OS: unix, Application: unknown, IP#: 127.0.0.1
> >>> Transaction status: PENDING, 0 threads.
> >>> Locks:
> >>> 
> >>> Client 1111:1:-253:  Account: dba, 14802 bytes in, 1564724 bytes out, 0 stmts.
> >>> PID: 24734, OS: unix, Application: unknown, IP#: 127.0.0.1
> >>> Transaction status: PENDING, 0 threads.
> >>> Locks:
> >>> 
> >>> Client 1111:1:-67:  Account: dba, 10881 bytes in, 169263 bytes out, 0 stmts.
> >>> PID: 24734, OS: unix, Application: unknown, IP#: 127.0.0.1
> >>> Transaction status: PENDING, 0 threads.
> >>> Locks:
> >>> 
> >>> Running Statements:
> >>>  Time (msec) Text
> >>>           96 status()
> >>>      2177514 sparql select distinct ?r count(?p) as ?cnt where { ?r ?p ?o. filter( ?p in (<ht
> >>> 
> >>> 
> >>> Hash indexes
> >>> 
> >>> 
> >>> 159 Rows. -- 97 msec.
> >>> 
> >>> On Monday 20 February 2012 17:34:27 Sebastian Trüg wrote:
> >>>> Please reproduce the high load, then start
> >>>> isql localhost:1111 dba dba
> >>>> and run "status();"
> >>>> and tell me the running queries.
> >>>> 
> >>>> Thanks,
> >>>> Sebastian
> >>>> 
> >>>> On 02/20/2012 11:44 AM, Christian Mollekopf wrote:
> >>>>> On Friday 17 February 2012 19.53:42 Volker Krause wrote:
> >>>>>> On Friday 17 February 2012 09:00:19 Sebastian Trueg wrote:
> >>>>>>>> On Feb. 16, 2012, 7:45 p.m., Christian Mollekopf wrote:
> >>>>>>>>> The HighPrio Queue shouldn't ever be throttled ideally, but in view
> >>>>>>>>> of the current problems it's definitely a reasonable approach. I
> >>>>>>>>> didn't take a close look yet, but you can ship it from my side.
> >>>>>>>>> Thanks for the patch.
> >>>>>>> 
> >>>>>>> Currently there is no way around throttling the high prio queue. As
> >>>>>>> stated above (and as you confirmed in private email), adding a new
> >>>>>>> email account will result in newItem events for all the emails. That
> >>>>>>> in turn will put them into the high prio queue.
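
For illustration, the throttling being discussed boils down to something like
the sketch below: a queue that hands at most a batch of items to the indexer
per timer tick instead of draining itself as fast as possible. All class and
member names here are invented; the real feeder code is structured
differently.

  // Sketch of a throttled feeder queue (invented names, Qt 4 style).
  #include <QObject>
  #include <QQueue>
  #include <QTimer>

  class ThrottledQueue : public QObject
  {
      Q_OBJECT
  public:
      explicit ThrottledQueue(QObject *parent = 0)
          : QObject(parent), m_batchSize(1), m_interval(500)
      {
          connect(&m_timer, SIGNAL(timeout()), this, SLOT(processBatch()));
      }

      void enqueue(qint64 itemId)
      {
          m_queue.enqueue(itemId);
          if (!m_timer.isActive())
              m_timer.start(m_interval);
      }

      // Switch between throttled and full-speed processing.
      void setFullSpeed(bool fullSpeed)
      {
          m_interval = fullSpeed ? 0 : 500;
      }

  private slots:
      void processBatch()
      {
          // Hand at most m_batchSize items to the indexer per tick.
          for (int i = 0; i < m_batchSize && !m_queue.isEmpty(); ++i)
              indexItem(m_queue.dequeue());
          if (m_queue.isEmpty())
              m_timer.stop();
          else
              m_timer.start(m_interval);
      }

  private:
      void indexItem(qint64 /*itemId*/) { /* hand the item to Nepomuk */ }

      QQueue<qint64> m_queue;
      QTimer m_timer;
      int m_batchSize;
      int m_interval;
  };

Adding a new account then only fills the queue quickly; the actual indexing
work is spread out over time, which is why even the high prio queue has to go
through this.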
> >>>>>> 
> >>>>>> I'm currently testing this, and it indeed seems to improve indexing
> >>>>>> considerably. Without throttling in effect my system is now reliably
> >>>>>> indexing hundreds of mails per minute, without getting stuck with
> >>>>>> Virtuoso going crazy.
> >>>>> 
> >>>>> It also goes a lot faster for me; it seems it's not the merging that
> >>>>> was the bottleneck after all. Not sure why it goes so much faster now
> >>>>> though.
> >>>>> 
> >>>>>> I (locally) reduced the idle time limit a bit though; with the new
> >>>>>> two-minute setting it rarely switches to full speed here, maybe
> >>>>>> something we still want to tweak.
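
The full-speed switch could be driven by KIdleTime, roughly as below. Wiring
it to the ThrottledQueue sketch from further up is my assumption about the
structure, not the actual feeder code; the two-minute value mirrors the
setting discussed here.

  // Sketch: go full speed after two minutes of user inactivity,
  // throttle again on resume (uses the ThrottledQueue sketch above).
  #include <KIdleTime>
  #include <QObject>

  class IdleSpeedSwitch : public QObject
  {
      Q_OBJECT
  public:
      explicit IdleSpeedSwitch(ThrottledQueue *queue, QObject *parent = 0)
          : QObject(parent), m_queue(queue)
      {
          KIdleTime::instance()->addIdleTimeout(2 * 60 * 1000);
          connect(KIdleTime::instance(), SIGNAL(timeoutReached(int)),
                  this, SLOT(goFullSpeed()));
          connect(KIdleTime::instance(), SIGNAL(resumingFromIdle()),
                  this, SLOT(goThrottled()));
      }

  private slots:
      void goFullSpeed()
      {
          m_queue->setFullSpeed(true);
          // Ask KIdleTime to notify us when the user comes back.
          KIdleTime::instance()->catchNextResumeEvent();
      }

      void goThrottled()
      {
          m_queue->setFullSpeed(false);
      }

  private:
      ThrottledQueue *m_queue;
  };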
> >>>>> 
> >>>>> Feel free to do so once that is committed.
> >>>>> 
> >>>>>> I get a lot of these errors now though: "nepomukservicestub(24734)"
> >>>>>> Soprano: "Invalid argument (1)":
> >>>>>> "http://www.semanticdesktop.org/ontologies/2007/03/22/nmo#messageHeader
> >>>>>> has a max cardinality of 1. Provided 2 values -
> >>>>>> <nepomuk:/res/dfc71807-249b-47e4-91c1-90e3bd940f4d>,
> >>>>>> <nepomuk:/res/93266175-f423-481b-a371-2b6ed28c5dbb>. Existing - "
> >>>>>> 
> >>>>>> This seems to be caused by emails with more than one extra header we
> >>>>>> index (such as List-Id), and thus triggers on basically everything in
> >>>>>> my mailing list folders. Affected emails are skipped and re-indexed at
> >>>>>> an agent restart (which of course fails again). Is nmo:messageHeader
> >>>>>> the right property for these headers, and if so, why does it have a
> >>>>>> cardinality of one?
> >>>>> 
> >>>>> I'm running into those too; I'll have a look at it. This might be
> >>>>> related to a bug in the merger (will have to check again). I reduced
> >>>>> the batch size to 1 in order to minimize this problem. Once that is
> >>>>> working we probably want to increase the size again for performance
> >>>>> reasons, which will of course affect the system load again (Nepomuk
> >>>>> will e.g. get 100 items at a time instead of just one).
> >>>>> 
> >>>>> Otherwise I think we can commit that to master. I can backport it
> >>>>> later when I have my other fixes in.
> >>>>> 
> >>>>> Sebastian, can you commit this to master? Or do you want me to do it?
> >>>>> 
> >>>>> I think I still have to store the already indexed collections in the
> >>>>> config, because right now, if the initial indexing hasn't finished and
> >>>>> the computer is rebooted, it will still start over, meaning it will
> >>>>> query for the existence of all items again. If that is fast enough it
> >>>>> won't be a problem, of course.
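
Persisting that could be as simple as a KConfig entry, sketched below; the rc
file, group, and key names are invented for illustration:

  // Remember which collections finished their initial indexing, so a
  // reboot can skip them instead of re-checking every item.
  #include <KConfig>
  #include <KConfigGroup>
  #include <QString>
  #include <QStringList>

  void markCollectionIndexed(qint64 collectionId)
  {
      KConfig config("akonadi_nepomuk_feederrc");
      KConfigGroup group(&config, "InitialIndexing");

      QStringList done = group.readEntry("IndexedCollections", QStringList());
      const QString id = QString::number(collectionId);
      if (!done.contains(id)) {
          done.append(id);
          group.writeEntry("IndexedCollections", done);
          config.sync();
      }
  }

  bool isCollectionIndexed(qint64 collectionId)
  {
      KConfig config("akonadi_nepomuk_feederrc");
      KConfigGroup group(&config, "InitialIndexing");
      return group.readEntry("IndexedCollections", QStringList())
          .contains(QString::number(collectionId));
  }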
> >>>>> 
> >>>>> The removal of items seems to be broken right now too; that's not
> >>>>> related to the patch though.
> >>>>> 
> >>>>> Also I'm still not sure that this is the actual cause of the
> >>>>> Virtuoso-going-crazy bug, but in either case it's a very good
> >>>>> improvement.