Nepomuk in 4.13 and beyond

Aaron J. Seigo aseigo at kde.org
Thu Dec 12 20:23:51 GMT 2013


On Thursday, December 12, 2013 20:10:27 Vishesh Handa wrote:
> On Thursday 12 Dec 2013 19:40:11 Ivan Čukić wrote:
> > > If we all decide to store stuff in sqlite, then it doesn't matter if
> > > they
> > > are separate database files or the same one.
> > 
> > I might be missing a few things here, but asking questions is the road to
> > enlightenment :)
> > 
> > - There is no way to query across different stores, which was the main
> > appeal of nepomuk? (I concluded this from the last mail)
> 
> There isn't one. Not right now. I'm open to ideas on how to do something
> like that if it is required. I'm slightly skeptical if it actually is required.

for activities it’s pretty much a requirement: we have an activity and we want 
to know all resources (files, contacts, bookmarks, applications, windows ..) 
associated with it. so for activities we’ll either end up querying each store 
separately or Baloo will need to provide a way to query multiple stores.

for the Plasma Active shell as it currently is, single-store querying might be 
workable as we tend to keep most of the different resources separated in the UI 
(though that’s one thing i want to change in future releases, so you can group 
a set of bookmarks with a given file, e.g.)

it would be a big problem if the tags are per-store as well; we need cross-
store tags (though from glancing at the API tonight it looks like that is 
already there?)

this may be a question of API, of course. with different stores, collation will
need to happen somewhere. whether it should happen on the client side or the
server side is, i suppose, the big question.

i would suggest server side for a simple reason: if multiple stores all share 
the same physical storage system, it would be really nice to be able to 
optimize queries to hit that storage system as little as possible. example:

Stores: S0, S1, S2

S0 -> xapian
S1 -> xapian
S2 -> mysql

when fetching items from S0 and S1 that match tag T0, it would be very nice if 
the backends could cooperate to merge their queries into one so that one 
xapian query is done rather than 2 with post-query collation of the results.

for obvious reasons this can only be done in the server where the stores can 
cooperate.
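
The merging idea above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Store` struct and the `planQueries` function are hypothetical names, not part of Baloo or Nepomuk, and a real server would merge actual backend queries rather than just group store names.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical description of a store: its name and the storage backend
// it sits on (e.g. "xapian" or "mysql").
struct Store {
    std::string name;
    std::string backend;
};

// Group the requested stores by backend, so the server can issue a single
// merged query per backend instead of one query per store.
std::map<std::string, std::vector<std::string>>
planQueries(const std::vector<Store> &stores)
{
    std::map<std::string, std::vector<std::string>> plan;
    for (const Store &store : stores) {
        plan[store.backend].push_back(store.name);
    }
    return plan;
}
```

With the three stores from the example (S0 and S1 on xapian, S2 on mysql), the plan has two entries, so a tag lookup touches each storage system once.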

a concrete use case:

S0 = files
S1 = bookmarks
S2 = applications

application = Plasma Active shell

if adding stores is easy enough, i expect we’ll end up with stores for things 
like geolocation, so this could balloon further.

> > - When querying, how do I get the properties of the results?
> 
> You don't. You just get the identifier and some text. You can do a
> subsequent fetch job to get additional data.

more roundtrips don’t sound great for performance. if a result set has 1000
items and you then want to get properties on them (e.g. for listing and
sorting) then one needs to either send all 1000 UIDs back for further
processing or, in the worst case, make 1000 individual requests.
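
To put numbers on the roundtrip concern, here is a small worked calculation. It assumes a hypothetical fetch request that accepts up to `batchSize` UIDs at once; nothing in this thread says the fetch job works that way, so treat the function name and the batching model as assumptions.

```cpp
#include <cstddef>

// Number of requests needed to run one query and then fetch properties for
// `results` items, sending at most `batchSize` UIDs per fetch request.
// batchSize must be at least 1.
std::size_t requestsNeeded(std::size_t results, std::size_t batchSize)
{
    // Ceiling division: a partially filled last batch still costs a request.
    const std::size_t fetches = (results + batchSize - 1) / batchSize;
    return 1 + fetches; // the initial query plus the property fetches
}
```

Under this model, 1000 results fetched one UID at a time cost 1001 requests, while sending all 1000 UIDs back in one fetch costs 2.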

this will be an issue for several things in Plasma Active, such as the file 
manager. unlike Dolphin which just shows metadata for a given file, the Active 
Files app relies on Nepomuk rather than the filesystem for these things and 
allows filtering by ratings, tags, etc.

> > - We talked about asynchronous querying. Is it going to happen?
> 
> There is a QueryRunnable class which can be used to run queries in another
> thread. Most backends do not seem to allow asynchronous queries, so there
> wasn't a way to run queries asynchronously by default.

could those backends be run in a thread? iow, make async/threading a first
class feature that the backends must implement, even if it means having a
background thread for execution and queueing requests.

making every user handle the threading sounds like we’ll have lots of code 
that doesn’t ;)
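
A minimal sketch of what "a background thread plus a request queue" could look like, using only the C++ standard library. `AsyncBackend` and `SyncQuery` are hypothetical names for illustration, not the QueryRunnable class mentioned above or any Baloo API; a Qt-based version would more likely use signals and the event loop than `std::future`.

```cpp
#include <condition_variable>
#include <functional>
#include <future>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical synchronous backend call: blocks until results are available.
using SyncQuery = std::function<std::vector<std::string>(const std::string &)>;

// Wraps a blocking backend so callers never have to manage a thread themselves.
class AsyncBackend {
public:
    explicit AsyncBackend(SyncQuery query)
        : m_query(std::move(query)), m_worker([this] { run(); }) {}

    ~AsyncBackend() {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_stopping = true;
        }
        m_wakeup.notify_one();
        m_worker.join(); // the worker drains the queue before it exits
    }

    // Queue a request and return immediately; the future delivers the results.
    std::future<std::vector<std::string>> submit(std::string text) {
        std::packaged_task<std::vector<std::string>()> task(
            [this, text = std::move(text)] { return m_query(text); });
        auto result = task.get_future();
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_pending.push(std::move(task));
        }
        m_wakeup.notify_one();
        return result;
    }

private:
    void run() {
        for (;;) {
            std::packaged_task<std::vector<std::string>()> task;
            {
                std::unique_lock<std::mutex> lock(m_mutex);
                m_wakeup.wait(lock, [this] { return m_stopping || !m_pending.empty(); });
                if (m_pending.empty()) {
                    return; // stopping and nothing left to do
                }
                task = std::move(m_pending.front());
                m_pending.pop();
            }
            task(); // run the blocking query outside the lock
        }
    }

    SyncQuery m_query;
    std::mutex m_mutex;
    std::condition_variable m_wakeup;
    std::queue<std::packaged_task<std::vector<std::string>()>> m_pending;
    bool m_stopping = false;
    std::thread m_worker; // declared last so the other members exist before it starts
};
```

The point of the design is that the threading lives in one place: a caller does `backend.submit("tag:T0")` and collects the results from the returned future whenever it is ready.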

> > From my POV, it would be much nicer if you forced a single db (as an
> > actual
> > store, not as a cache like nepomuk is for akonadi) on the people, with the
> > option to have a few things runtime defined. It would ease the development
> > and would allow more fun queries which would be optimized unlike the
> > manual
> > client-side joining of different query results.
> 
> But what if one doesn't use SQL for storing data? IMO Xapian is much better
> suited than sqlite's FTS support (or mysql).

hopefully there would be a query object and people would not be hand coding
queries in strings that are passed along to be parsed. that would make the
“what is the query language” thing moot; the sparql queries in C++ are one
thing i never really got comfortable with in nepomuk.
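
As a rough illustration of the query-object idea: the caller describes what it wants, and each backend translates that description into its own native form. Everything here is hypothetical (the `Query` class, `SqlStatement`, `toSql`, and the `items`/`tags` table names); it is a sketch of the shape such an API could take, not a proposal for Baloo's actual classes.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical query object: no query language is exposed to the caller.
class Query {
public:
    Query &withTag(std::string tag) {
        m_tags.push_back(std::move(tag));
        return *this;
    }
    Query &withMinimumRating(int rating) {
        m_minRating = rating;
        return *this;
    }

    const std::vector<std::string> &tags() const { return m_tags; }
    int minimumRating() const { return m_minRating; }

private:
    std::vector<std::string> m_tags;
    int m_minRating = 0;
};

// What an SQL-based backend might produce from a Query.
struct SqlStatement {
    std::string text;                // statement with '?' placeholders
    std::vector<std::string> params; // values to bind, in order
};

// One possible translation, against made-up table names. A Xapian-based
// backend would build its own native query from the same Query object.
SqlStatement toSql(const Query &query)
{
    SqlStatement stmt;
    stmt.text = "SELECT id FROM items WHERE rating >= ?";
    stmt.params.push_back(std::to_string(query.minimumRating()));
    for (const std::string &tag : query.tags()) {
        stmt.text += " AND id IN (SELECT item FROM tags WHERE tag = ?)";
        stmt.params.push_back(tag);
    }
    return stmt;
}
```

Client code would then read something like `Query().withTag("T0").withMinimumRating(3)`, with no SPARQL or SQL string in sight.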

> When planning Baloo, I've mostly taken a look at PIM, Dolphin, KRunner (and
> Milou), PMC, and KPeople. Perhaps something was missed?

usage in activities and Plasma Active are key use cases from my POV.

-- 
Aaron J. Seigo



