[Nepomuk] [RFC] Avoid communicating through the Nepomuk Storage

Christian Mollekopf chrigi_1 at fastmail.fm
Sun May 26 14:00:39 UTC 2013


On Sunday 26 May 2013 14.49:25 Volker Krause wrote:
> On Sunday 26 May 2013 13:23:14 Christian Mollekopf wrote:
> > On Sunday 26 May 2013 04.28:01 Vishesh Handa wrote:
> > > Hey guys
> > > 
> > > I have made a very important discovery - The Storage service is a big
> > > bottleneck!
> > > 
> > > Running a query such as 'select * where { graph ?g { ?r ?p ?o. } }
> > > LIMIT 50000' by connecting directly to virtuoso via ODBC takes about
> > > 2.65 seconds. In contrast, running the same query through the Nepomuk
> > > ResourceManager's main model takes about 19.5 seconds.
> > > 
> > > Nepomuk internally uses the Soprano::LocalSocketClient to connect to the
> > > storage service which runs a Soprano::LocalServer.
> > > 
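For concreteness, a rough sketch of the two access paths being compared,
assuming the KDE 4.x Nepomuk/Soprano client APIs; the Virtuoso connection
settings shown are illustrative guesses, not the storage service's actual
configuration:

    #include <Nepomuk/ResourceManager>
    #include <Soprano/Model>
    #include <Soprano/QueryResultIterator>
    #include <Soprano/PluginManager>
    #include <Soprano/Backend>

    void queryViaStorageService()
    {
        // Path 1: goes through the Nepomuk storage service over a local
        // socket (Soprano::LocalSocketClient under the hood).
        Soprano::Model *model = Nepomuk::ResourceManager::instance()->mainModel();
        Soprano::QueryResultIterator it = model->executeQuery(
            QLatin1String("select * where { graph ?g { ?r ?p ?o. } } LIMIT 50000"),
            Soprano::Query::QueryLanguageSparql);
        while (it.next()) {
            // consume bindings; every row was serialized by the storage
            // service and deserialized again in this process
        }
    }

    void queryVirtuosoDirectly()
    {
        // Path 2: talk to Virtuoso directly through the Soprano virtuoso
        // backend (ODBC); host/port/credentials below are assumptions.
        const Soprano::Backend *backend =
            Soprano::PluginManager::instance()->discoverBackendByName(
                QLatin1String("virtuosobackend"));
        Soprano::BackendSettings settings;
        settings << Soprano::BackendSetting(Soprano::BackendOptionHost, QLatin1String("localhost"))
                 << Soprano::BackendSetting(Soprano::BackendOptionPort, 1111)
                 << Soprano::BackendSetting(Soprano::BackendOptionUsername, QLatin1String("dba"))
                 << Soprano::BackendSetting(Soprano::BackendOptionPassword, QLatin1String("dba"));
        Soprano::Model *model = backend->createModel(settings);
        // same executeQuery() call as above, minus the socket round-trip
    }
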
> > > I've been trying to optimize this Soprano code for some time now, and
> > > since 4.9 we have a good 200% performance increase. But we can gain a
> > > LOT more by just communicating directly with virtuoso.
> > > 
> > > Pros -
> > > * 6-8x performance improvement
> > > * The storage service doesn't burn as much CPU when reading
> > > * Accurate reporting - suppose app 'x' runs a costly query which
> > > returns a large number of results; then 'x' alone will show high CPU
> > > consumption. Currently both NepomukStorage and 'x' show very high CPU
> > > consumption.
> > > 
> > > Cons -
> > > * Less control - with all queries going through the Nepomuk Storage we
> > > could theoretically build amazing tools to tell us which query is
> > > executing and how long it is taking. However, no such tool has ever
> > > been written, so we won't be losing anything.
> > > 
> > > Before 4.10 this could never have been done because we used to have a
> > > lot of code in the storage service which handled removable media and
> > > other devices. This code would often modify the SPARQL queries and the
> > > results. With 4.10, I threw away all that code.
> > > 
> > > Comments?
> > 
> > Hey Vishesh,
> > 
> > Akonadi has a similar design (database <-> akonadi server <-> akonadi
> > session, where the akonadi session sits in the user process), and I've
> > been pondering the same thing there as well. The server process through
> > which all queries go is a design decision I don't fully understand in
> > either system.
> > 
> > It would seem much more efficient to me to move all the work that has
> > to be done into the user process, to avoid the extra communication
> > between application and server process (which in akonadi is an
> > IMAP-like protocol, and in nepomuk even results in an extra
> > serialization/deserialization). By moving all the required logic for
> > the databinding etc. into a library, which then does its work in the
> > user process, each application could talk directly to the database. I
> > would expect that to always be more efficient than the extra process,
> > as databases typically handle concurrent access well (except SQLite,
> > AFAIK).
> > 
> > I therefore don't really see how the cons apply, or why this wouldn't
> > have been possible before 4.10. All the necessary work can just as well
> > be done in the process of each application (by using a library).
> > 
> > Even for write access, any db already implements the serialization
> > mechanisms required for concurrent access. If you have a server
> > process, the server just needs to implement that serialization again.
> > 
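To illustrate that point about write serialization, a minimal sketch
using plain QtSQL; the driver choice, database name, and table are made
up for the example:

    #include <QSqlDatabase>
    #include <QSqlQuery>
    #include <QString>
    #include <QVariant>

    // Hypothetical: each application process opens its own connection and
    // wraps writes in a transaction; the database serializes concurrent
    // writers, so no user-space server has to reimplement this.
    bool storeItem(const QString &payload)
    {
        QSqlDatabase db = QSqlDatabase::addDatabase(QLatin1String("QMYSQL"));
        db.setDatabaseName(QLatin1String("exampledb")); // made-up name
        if (!db.open())
            return false;

        db.transaction();
        QSqlQuery query(db);
        query.prepare(QLatin1String("INSERT INTO items (payload) VALUES (?)"));
        query.addBindValue(payload);
        if (!query.exec()) {
            db.rollback(); // the DB undoes the partial write atomically
            return false;
        }
        return db.commit(); // ACID guarantees hold across processes
    }
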
> > To me, getting rid of those server processes seems to have only
> > advantages:
> > * Uses the strengths of databases in concurrency and their ACID
> > properties
> > * Fewer context switches and other overhead
> > * Simpler design
> > * Less mutual interference between processes
> > 
> > If you have all the necessary abstraction layers, you can do this fully
> > transparently to the user of the library. So I'd say go for it ;-)
> > 
> > CC'ing Volker, because he might know what he's been doing in akonadi ;-)
> 
> There are some general reasons for the extra server process:
> - abstracting the database backend; hardly any of them are actually
> compatible on the SQL level, and even if they are you might still want
> backend-specific optimizations
> - generating change notifications (mainly relevant for write operations;
> see the sketch below)
> - allowing backward compatibility when changing/optimizing the db
> structure
> 
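A quick sketch of what the change-notification point means in practice:
only a central process sees every write, so it can broadcast one
authoritative signal, e.g. over D-Bus. The object path, interface, and
signal name below are invented for illustration:

    #include <QDBusConnection>
    #include <QDBusMessage>
    #include <QString>

    // Hypothetical: the server process is the single point that observes
    // every write, so it can emit one authoritative change signal. With
    // direct DB access, each client would have to broadcast its own
    // changes (or the DB would need trigger-based notifications).
    void notifyResourceChanged(const QString &resourceUri)
    {
        QDBusMessage signal = QDBusMessage::createSignal(
            QLatin1String("/changes"),                    // invented path
            QLatin1String("org.example.Storage.Changes"), // invented interface
            QLatin1String("resourceChanged"));
        signal << resourceUri;
        QDBusConnection::sessionBus().send(signal);
    }
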
> For Akonadi there's an additional reason:
> - not everything you read is actually in the database, or even locally
> available yet.
> 
> Yes, there is an overhead for this design, but 6-8x is excessive. I can't
> give you a general number for Akonadi, since this has rarely been the
> bottleneck so far. IIRC it's <20% on FETCH, and mostly due to the use of
> a toolkit-neutral text-based protocol (avoiding string conversions with a
> binary Qt-specific protocol should help).
> 
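To make the protocol point concrete, a hypothetical comparison of a
text-based encoding with a binary Qt-specific one using QDataStream; the
Item struct and message format are illustrative, not Akonadi's actual
protocol:

    #include <QByteArray>
    #include <QDataStream>
    #include <QIODevice>
    #include <QString>

    struct Item { qint64 id; QString subject; }; // illustrative payload

    // Text protocol (IMAP-like): every field is converted to a string
    // and parsed back on the other side.
    QByteArray encodeText(const Item &item)
    {
        return QString::fromLatin1("FETCH %1 (SUBJECT \"%2\")")
            .arg(item.id).arg(item.subject).toUtf8();
    }

    // Binary Qt protocol: QDataStream writes the native representation,
    // skipping the number<->string round-trips on both ends.
    QByteArray encodeBinary(const Item &item)
    {
        QByteArray buffer;
        QDataStream out(&buffer, QIODevice::WriteOnly);
        out << item.id << item.subject;
        return buffer;
    }
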

Ok, sounds reasonable =). IMO all those problems could also be addressed 
otherwise, which doesn't mean it would necessarily be superior for 
akonadi, but for nepomuk I would see this as a viable alternative which 
would at least be worth investigating.

Cheers,
Christian

