[RFC] [kservice] KPluginMetadata indexing

Mark Gaiser markg85 at gmail.com
Thu Nov 6 09:09:51 UTC 2014


On Thu, Nov 6, 2014 at 3:44 AM, Sebastian Kügler <sebas at kde.org> wrote:
> Hi all, especially Alex and David,
>
> tl;dr:
> I've done a proof-of-concept implementation of a metadata index for
> KPluginTrader::query(), the main entry point when it comes to finding binary
> plugins. This index considerably speeds up all current use cases, but comes at
> the cost of having to maintain the index. The code is in
> kservice[sebas/kpluginindex] and speeds up plugin querying several times over.
>
> The Slightly Longer Story...
>
> During Akademy's frameworks and plasma bofs, we talked about indexing plugins
> for faster lookups. One of the things we wanted to try in Plasma is to index
> packages, thereby speeding up package metadata lookups and plugin queries.
>
> I have done a naive implementation of such an indexing mechanism, and have
> implemented this as a proof of concept in KService, specifically in
> KPluginTrader::query(). This is using Alex Richardson's recent work on
> KPluginMetaData, which I found very useful (
> https://git.reviewboard.kde.org/r/120198/ and
> https://git.reviewboard.kde.org/r/120199/ ). I've put these patches in my
> branch kservice[sebas/kpluginindex].
>
> Basic Mechanism
>
> - a small tool called kplugin-update-index collects the json metadata from the
> plugins, and puts the list of plugins in a given plugin directory into a
> QJsonArray, and dumps that in Qt's json binary format to disk
> - KPluginTrader::query checks if an index file exists in a given plugin
> directory
> -- if the index file exists, it reads it and creates a list of KPluginMetaData
> objects from it
> -- if the index file doesn't exist, it walks over each plugin to read its
> metadata; that is, it falls back to the old code path
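
For readers following along, the index-or-fallback lookup described above can be sketched roughly as follows. This is only an illustration in Python, not the code in the branch: the real implementation is C++ and stores a QJsonArray in Qt's binary json format, whereas this sketch uses plain text JSON. The file name `plugins.index`, the `*.plugin.json` naming and all function names are assumptions made up for the example.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical index file name; the original post does not name the file.
INDEX_NAME = "plugins.index"
# Hypothetical stand-in for "a plugin": one JSON metadata file per plugin.
PLUGIN_GLOB = "*.plugin.json"


def read_plugin_metadata(plugin_file: Path) -> dict:
    """Read the metadata of a single plugin (the slow, one-file-per-plugin path)."""
    return json.loads(plugin_file.read_text(encoding="utf-8"))


def update_index(plugin_dir: Path) -> Path:
    """Collect the metadata of all plugins in plugin_dir into one array on disk."""
    entries = [read_plugin_metadata(p) for p in sorted(plugin_dir.glob(PLUGIN_GLOB))]
    index_file = plugin_dir / INDEX_NAME
    index_file.write_text(json.dumps(entries), encoding="utf-8")
    return index_file


def query(plugin_dir: Path) -> list:
    """Return the metadata of all plugins, using the index if one exists."""
    index_file = plugin_dir / INDEX_NAME
    if index_file.exists():
        # Fast path: a single file read instead of one read per plugin.
        return json.loads(index_file.read_text(encoding="utf-8"))
    # Fallback: no index, so walk over every plugin as before.
    return [read_plugin_metadata(p) for p in sorted(plugin_dir.glob(PLUGIN_GLOB))]


if __name__ == "__main__":
    plugin_dir = Path(tempfile.mkdtemp())
    for name in ("a", "b"):
        (plugin_dir / f"{name}.plugin.json").write_text(
            json.dumps({"Id": name}), encoding="utf-8"
        )
    unindexed = query(plugin_dir)   # takes the fallback path
    update_index(plugin_dir)
    indexed = query(plugin_dir)     # takes the index path
    print(unindexed == indexed)
```

Note that `query()` in this sketch trusts the index whenever it exists, so a stale index silently hides newly installed plugins, which is the maintenance cost discussed further down.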
>
> Performance Measurement Method
>
> I've created a new autotest, kpluginmetadatatest, which runs two successive
> queries and measures the time it takes to return the results. I've instrumented
> the code in kplugintrader.cpp with QElapsedTimers. The autotest ran on
> rotating metal (a hard disk) and on an SSD in separate test runs. Before the
> cold-cache tests, I dropped the page cache, dentries and inodes from memory using
> echo 3 > /proc/sys/vm/drop_caches
> The tests ran on Qt's 5.4 branch; the results are fairly consistent with what
> I've seen on Qt 5.3.
>
> Performance Improvements
>
> Performance tests are promising:
> http://vizzzion.org/blog/wp-content/uploads/2014/11/performance-comparison-charts.png
> (note that the metal's left-most bar is scaled down by a factor of 10 in the
> picture).
>
> In short, the indexed queries are roughly:
> * 60 times faster on a rotational medium with cold caches
> * 3 times faster on an SSD with cold caches
> * 7 times faster on a rotational disk with warm caches
> * 5 times faster on an SSD with warm caches
>
> More Observations
> - on SSDs, we save most of the time in directory traversal and (de)serializing
> the json metadata
> - the index lookup spends almost all of its time in disk reads; deserializing
> the binary metadata is almost free (Qt's json binary representation is really
> fast to read)
> - I haven't seen any tests in which the indexed queries have been slower.
>
> These results can be explained as follows:
> - the bottleneck is reading the files from disk
> - on rotational media, as expected, we pay a huge performance penalty for every
> seek we cause; the more files we read, the more disastrous lookup times get.
> - as expected, warm page caches help a lot in all cases
>
> Cost: Maintaining the Cache
>
> These speedups do come at a cost, of course: the added complexity of
> maintaining the caches. The idea from the bof sessions was to update the
> caches at install time, which is essentially what kplugin-update-index can do
> (it needs some added logic to give the index files sensible permissions when
> run as root). That means that packagers will have to run the index updater in
> their postinstall routine. Not doing this at all means slower queries (or
> rather, no speedier queries). Worse is if they forget to update once in a
> while, in which case newly installed plugins might be missing from the index
> files and removed plugins might be left dangling in them. This will need at
> least some packaging discipline.
>
> Index File Location
>
> The indexer creates the index files in the plugin directories themselves, not in
> $CACHE or $TMP. This seems the most straightforward way to do it, since if a
> plugin is installed into a specific directory, the "installer" will have write
> permission there to update the index as well. One might consider putting these
> index files in the cache directory, like ksycoca does, but in that case, we
> need to be smarter to actually update the index files correctly, since at that
> point, it depends on the environment of the user and the plugin paths (which
> means, it can't sensibly be done at install-time).
>
> KServiceTypeTrader Comparison
>
> First off, for the current situation, the comparison to KServiceTypeTrader is
> not of much use, since it's orthogonal to KPluginTrader.
> That aside, I've run the same queries through KServiceTypeTrader (with
> different results, of course, and just on an ssd).
> With cold caches KServiceTypeTrader is 40 times faster than unindexed queries
> (current status quo), and still  times faster with indices.
> Successive queries are about 100 times faster than indexed queries.
> KServiceTypeTrader is still a lot faster, presumably since we're reading one
> larger file, instead of multiple ones. It may make sense to cache the index
> files read from disk, which should get us in the ballpark of
> KServiceTypeTrader again.
>
> Feedback, please!
>
> This code is still at a draft stage, so I'd very much welcome feedback
> about the approach, and of course the code itself. It can be found in
> kservice[sebas/kpluginindex]. The kpluginmetadata autotest gives a useful
> testing target. I haven't submitted it to reviewboard yet, because I want to
> settle the further direction first and provide a basis for discussion.
>
> Cheers,
> --
> sebas

Hi Sebas,

I'm curious about one thing. Have you done some profiling on the
current KPluginMetaData to see where the actual hot spot is?
In case you don't know how to do that, here are some tips:
1. Recompile Qt with debug symbols (not debug mode, just with the debug symbols)
2. Run a benchmark application via valgrind like so:
valgrind --tool=callgrind <your_benchmark_app>
3. Open the output file of the line above in KCacheGrind and hunt for
those pesky hot spots.

Perhaps there is nothing to optimize, in which case having an index (and
the cost of maintaining it) is worth it, but it would be best to first
determine whether the current code path can be optimized.

Cheers,
Mark

