[Nepomuk] [RFC] Avoid communicating through the Nepomuk Storage

Volker Krause vkrause at kde.org
Sun May 26 12:49:25 UTC 2013


On Sunday 26 May 2013 13:23:14 Christian Mollekopf wrote:
> On Sunday 26 May 2013 04.28:01 Vishesh Handa wrote:
> > Hey guys
> > 
> > I have made a very important discovery - The Storage service is a big
> > bottleneck!
> > 
> > Running a query such as 'select * where { graph ?g { ?r ?p ?o. } } LIMIT
> > 50000' by connecting directly to Virtuoso via ODBC takes about 2.65
> > seconds. In contrast, running the same query through the Nepomuk
> > ResourceManager's main model takes about 19.5 seconds.
> > 
> > Nepomuk internally uses the Soprano::LocalSocketClient to connect to the
> > storage service which runs a Soprano::LocalServer.
> > 
> > I've been trying to optimize this Soprano code for some time now, and since
> > 4.9 we have a good 200% performance increase. But we can increase it a LOT
> > more by communicating with Virtuoso directly.
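> >
> > To make the comparison concrete, here is roughly what the two paths look
> > like in code. This is only a sketch: the ODBC DSN name and credentials are
> > made up, and the timing is just a QElapsedTimer around the result loop.
> >
> > #include <QSqlDatabase>
> > #include <QSqlQuery>
> > #include <QElapsedTimer>
> > #include <QString>
> > #include <QDebug>
> > #include <Nepomuk2/ResourceManager>
> > #include <Soprano/Model>
> > #include <Soprano/QueryResultIterator>
> >
> > void compareQueryPaths()
> > {
> >     const QString sparql = QString::fromLatin1(
> >         "select * where { graph ?g { ?r ?p ?o. } } LIMIT 50000");
> >
> >     // Path 1: straight through Virtuoso's ODBC driver. Virtuoso runs
> >     // SPARQL over its SQL interfaces when the statement is prefixed
> >     // with "SPARQL".
> >     QSqlDatabase db = QSqlDatabase::addDatabase("QODBC");
> >     db.setDatabaseName("VirtuosoDSN");   // hypothetical DSN
> >     db.setUserName("dba");
> >     db.setPassword("dba");
> >     if (!db.open())
> >         return;
> >
> >     QElapsedTimer timer;
> >     timer.start();
> >     QSqlQuery q(db);
> >     q.exec("SPARQL " + sparql);
> >     int rows = 0;
> >     while (q.next())
> >         ++rows;
> >     qDebug() << "ODBC:" << rows << "rows in" << timer.elapsed() << "ms";
> >
> >     // Path 2: through the storage service, i.e. the local socket plus the
> >     // extra serialization round trip every Nepomuk client takes today.
> >     Soprano::Model *model = Nepomuk2::ResourceManager::instance()->mainModel();
> >     timer.restart();
> >     Soprano::QueryResultIterator it = model->executeQuery(
> >         sparql, Soprano::Query::QueryLanguageSparql);
> >     rows = 0;
> >     while (it.next())
> >         ++rows;
> >     qDebug() << "Soprano client:" << rows << "rows in" << timer.elapsed() << "ms";
> > }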
> > 
> > Pros -
> > * 6-8x performance increase
> > * The storage service no longer consumes so much CPU when reading
> > * Accurate reporting - suppose app 'x' runs a costly query that returns a
> > large number of results; then 'x' will show high CPU consumption. Currently
> > both NepomukStorage and 'x' show very high CPU consumption.
> > 
> > Cons -
> > * Less control - by having all queries go through the Nepomuk Storage we
> > could theoretically build amazing tools to tell us which query is executing
> > and how long it is taking. However, no such tool has ever been written, so
> > we won't be losing anything.
> > 
> > Before 4.10 this could never have been done, because we used to have a lot
> > of code in the storage service that handled removable media and other
> > devices. That code would often rewrite the SPARQL queries and modify the
> > results. With 4.10, I threw away all that code.
> > 
> > Comments?
> 
> Hey Vishesh,
> 
> Akonadi has a similar design (database <-> Akonadi server <-> Akonadi
> session, where the Akonadi session sits in the user process), and I've been
> pondering the same thing there as well.
> The server process through which all queries go is a design decision
> I don't fully understand in either system.
> 
> It would seem to me much more efficient to move all the work that has to be
> done into the user process, to avoid the extra communication between
> application and server process (which in Akonadi is an IMAP-like protocol,
> and in Nepomuk even results in an extra serialization/deserialization). By
> moving all the required logic for the data binding etc. into a library, which
> then does its work in the user process, each application could talk directly
> to the database. I would expect that to always be more efficient than the
> extra process, as databases typically handle concurrent access well (except
> SQLite, AFAIK).
> 
> I therefore don't really see how the cons apply, or how this shouldn't have
> been possible before 4.10. All the necessary work can just as well be done
> in the process of each application (by using a library).
> 
> Even for write access, any DB already implements the serialization mechanisms
> required for concurrent access. If you have a server process, the server just
> has to implement that serialization again.
> 
> To me, getting rid of those server processes seems to have only advantages:
> * Uses the strengths of databases in concurrency and their ACID properties
> * Fewer context switches and other overhead
> * Simpler design
> * Less mutual interference between processes
> 
> If you have all the necessary abstraction layers, you can do this fully
> transparently to the user of the library. So I'd say go for it ;-)
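>
> A rough sketch of what I mean by doing it transparently behind the library;
> none of these classes exist in either codebase, the names are made up just
> to show the shape of the abstraction:
>
> #include <QList>
> #include <QString>
> #include <QVariant>
>
> // The interface the application codes against. Where the query actually
> // runs is an implementation detail of the library.
> class QueryBackend {
> public:
>     virtual ~QueryBackend() {}
>     virtual QList<QVariantMap> query(const QString &sparql) = 0;
> };
>
> // Today's behaviour: forward the query over the storage service's socket.
> class SocketBackend : public QueryBackend {
> public:
>     QList<QVariantMap> query(const QString &sparql) {
>         // ... serialize, send over the local socket, deserialize the reply ...
>         return QList<QVariantMap>();
>     }
> };
>
> // Proposed behaviour: a per-process database connection, no extra server.
> class DirectBackend : public QueryBackend {
> public:
>     QList<QVariantMap> query(const QString &sparql) {
>         // ... execute directly over an ODBC connection owned by this process ...
>         return QList<QVariantMap>();
>     }
> };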
> 
> CC'ing Volker, because he might know what he's been doing in Akonadi ;-)

There are some general reasons for the extra server process:
- abstracting the database backend: hardly any backends are actually compatible 
at the SQL level, and even if they are, you might still want backend-specific 
optimizations
- generating change notifications (mainly relevant for write operations; 
sketched below)
- allowing backward compatibility when changing/optimizing the DB structure

For Akonadi there's an additional reason:
- not everything you read is actually in the database, or even locally 
available yet.
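
To illustrate the change-notification point: because every write funnels 
through the one server process, that process is the natural place to broadcast 
notifications to all other clients. The snippet below is generic Qt D-Bus code 
with made-up service and interface names, not the actual Akonadi or Nepomuk 
implementation; clients writing straight to the database would each have to 
reinvent something like it.

#include <QDBusConnection>
#include <QDBusMessage>
#include <QString>

// Broadcast "item <id> changed" to every client on the session bus after a
// write has been committed. Object path and interface name are made up.
void notifyItemChanged(qint64 itemId)
{
    QDBusMessage signal = QDBusMessage::createSignal(
        QLatin1String("/notifications"),
        QLatin1String("org.example.Storage"),
        QLatin1String("itemChanged"));
    signal << itemId;
    QDBusConnection::sessionBus().send(signal);
}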

Yes, there is an overhead to this design, but 6-8x is excessive. I can't give 
you a general number for Akonadi, since this has rarely been the bottleneck so 
far. IIRC it's <20% on FETCH, and mostly due to the use of a toolkit-neutral, 
text-based protocol (avoiding string conversions with a binary, Qt-specific 
protocol should help).
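
To make that last point concrete, a toy comparison (this is not the real 
Akonadi protocol, just the same item metadata serialized both ways):

#include <QByteArray>
#include <QDataStream>
#include <QIODevice>

// Text framing forces number<->string conversions on both ends of the wire.
QByteArray asText(qint64 uid, qint64 size, const QByteArray &mimeType)
{
    return "* " + QByteArray::number(uid) + " FETCH (SIZE "
           + QByteArray::number(size) + " MIMETYPE " + mimeType + ")\r\n";
}

// A binary, Qt-native payload writes the integers as-is.
QByteArray asBinary(qint64 uid, qint64 size, const QByteArray &mimeType)
{
    QByteArray payload;
    QDataStream stream(&payload, QIODevice::WriteOnly);
    stream << uid << size << mimeType;
    return payload;
}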

regards,
Volker