[Kde-pim] Nepomukfeeder Caching

Christian Mollekopf chrigi_1 at fastmail.fm
Wed Dec 5 11:50:10 GMT 2012


Used the wrong reply-to button...

----- Original message -----
From: Christian Mollekopf <chrigi_1 at fastmail.fm>
To: Milian Wolff <mail at milianw.de>
Subject: Re: [Kde-pim] Nepomukfeeder Caching
Date: Wed, 05 Dec 2012 12:48:48 +0100



On Tue, Dec 4, 2012, at 10:38 PM, Milian Wolff wrote:
> On Tuesday 04 December 2012 21:14:34 Christian Mollekopf wrote:
> > On Tue, Dec 4, 2012, at 08:15 PM, Milian Wolff wrote:

...

> > The unit tests aren't really unit tests, as they work on the normal
> > akonadi/nepomuk databases and involve all parts of the system. I have
> > failed to properly unit-test this code so far, as it is pretty difficult
> > to decouple the functionality from akonadi, and the alternative of
> > running separate akonadi/nepomuk instances isn't really attractive for
> > all tests either.
> 
> You most certainly don't want to work on your "normal" databases, see:
> 
> http://techbase.kde.org/Projects/PIM/Akonadi/Testing
> 

Yeah, I know, I was just too lazy ;-)
For most unit tests I don't want to use any database at all, but for that
I would need some unit-test facilities in the jobs, i.e. a mock session
delivering prepared items. For certain tests it makes sense to involve a
full-blown database (or two, with nepomuk), but for most tests there
shouldn't be a need for that.
So I'd really prefer looking into that first, before writing a lot of
tests for that code.

> Too bad there is nothing about this "Akonadi Benchmarker" in there -
> would be potentially interesting. Maybe someone from the PIM team has
> more information on this?
> 
> > For the benchmark we'd have similar problems, and when talking about the
> > cache only, it would just be comparing the caches performance compared
> > to no caching, as the cost of not having the cache ends up on the
> > nepomuk side.
> > 
> > A benchmark which just benchmarks the roundtrip time of indexing a
> > couple of items repeatedly would be possible, I was just too lazy to
> > write it so far / didn't see the need for it yet. It would essentially
> > automate the manual steps above.
> > That wouldn't give us any information about how the load between nepomuk
> > and the feeder has changed though.
> 
> If the load is less, the roundtrip should be done in less time, no? 

Absolutely, but it doesn't tell us anything about where the CPU cycles
are used (nepomuk or the feeder), and that the caching brings a pretty
good performance improvement is not in question for me.

> I personally think it would be a very good idea to have such a benchmark. 
> Especially considering this:
> 
> > I quickly ran the pimindexerutility through vtune with the following
> > result (I zoomed in on the indexing):
> > http://tinypic.com/r/o9pid4/6
> 
> You should *filter* in on the indexing, e.g. mark one/both "CPU Usage"
> peaks and look at them separately. But I think these spikes are far too
> small to get useful output from a sampling-based profiler like VTune
> (the spikes look to be ~200ms wide, so that means ~20 samples).
> 

Ah, thanks, filtering indeed changes the results a bit.
But the spikes really have nothing to do with the hashing.

> Anyhow, that just shows that it's probably not worth spending time on
> optimizing the hash function right now.
> 
> And as I said above, with a proper benchmark you could increase the
> size of the problem until you get a reasonable timeframe which can be
> used to get meaningful results from the sampling statistics.
> 

We can only increase the size of the problem by increasing the size of
the stored data, at which point nepomuk will start to go crazy, but I
get your point ;-)

> > As you can see, the akonadi framework is more of a performance problem
> > (due to the fetching of the items) than the hashing, so I think it's
> > safe to assume that the solution works well enough for the time being.
> > Note also that the spikes you see are akonadi-related and have nothing
> > to do with the hashing.
> 
> Where do you see akonadi in the above? The 999ms spent in main are
> probably the event loop (also compare to the CPU Usage below). I don't
> know what exactly happens, but it could quite probably indicate that
> the time is just spent on waiting for a result from nepomuk/whatever,
> and not really akonadi (since there is no CPU activity).

I was referring to the QBasicAtomicInt::ref usages, which are called by
assignEntityPrivate. I was just surprised that the cost of copying
akonadi entities shows up in the profile at all (I've seen it before in
some zanshin tests I did when populating the model).
To analyze this we would definitely need to scale up the problem though,
so if you're interested in looking into this, I can try to make some
larger-scale benchmark for fetching items.

Note that QBasicAtomicInt::ref doesn't show up anymore after correctly
*filtering* in on the timeframe (instead of *zooming*), so I think this
comes from the model population of pimindexerutility.

Cheers,
Christian

> 
> > There are still many other performance improvements in other places to
> > do ;-)
> 
> I bet that's true :)
> 
> Cheers
> 
> -- 
> Milian Wolff
> mail at milianw.de
> http://milianw.de
_______________________________________________
KDE PIM mailing list kde-pim at kde.org
https://mail.kde.org/mailman/listinfo/kde-pim
KDE PIM home page at http://pim.kde.org/


