[Nepomuk] heavy duty query visualization

Fri Oct 14 20:13:21 UTC 2011

Hi all,

In Plasma Active One, things like the activity screen that shows all the 
Nepomuk resources, the UI to search for and add a new one, and the image 
viewer (which lists all objects of type image) use the Plasma metadata 
dataengine.
It is basically just a wrapper around the Nepomuk QueryClient.
Due to the architecture of Plasma dataengines this is not very efficient, 
since a lot of data copying is involved.

In the next releases we want to do fancier stuff, like a central kick-ass 
document browser that lets you browse stuff by basically anything: type, 
date, rating, tags, activities, whatever, and any combination of these.

What I'm trying to do now is a QAbstractItemModel (working prototype in the 
plasma-mobile repo, mart/nepomukmodel branch) written expressly to show the 
resources from a query and meant to be instantiated from QML. Some questions 
came up, related both to performance and to Nepomuk usage.

* Lazy loading: if I run a query with really a lot of results, would it be 
possible to limit it like in a relational database, i.e. to know how many 
results there are, then ask for the first 10, then 11 to 20 and so on? (If a 
query has something like 2000 results that would be handy ;)

* Sorting: is it possible to sort the results directly by one of the 
properties? In that case I wouldn't have to load the whole model into memory 
from the beginning.
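
A rough sketch of what the lazy loading and sorting points are asking for, 
at the SPARQL level only. SPARQL itself has ORDER BY, LIMIT and OFFSET; 
whether and how the Nepomuk query service exposes them is exactly the open 
question here. pagedQuery() and its parameters are hypothetical names made 
up for illustration, not part of any existing API.

```cpp
#include <string>

// Hypothetical helper, not a Nepomuk API: appends ORDER BY / LIMIT / OFFSET
// to an already complete SPARQL SELECT string.
// 'orderVar' is a variable such as "?rating"; empty means no explicit order.
// 'page' is zero-based, so page 1 with pageSize 10 covers results 11 to 20.
std::string pagedQuery(const std::string &baseSelect,
                       const std::string &orderVar,
                       int page, int pageSize)
{
    std::string q = baseSelect;
    if (!orderVar.empty()) {
        // Descending order is an arbitrary choice for this sketch.
        q += " ORDER BY DESC(" + orderVar + ")";
    }
    q += " LIMIT " + std::to_string(pageSize);
    q += " OFFSET " + std::to_string(page * pageSize);
    return q;
}
```

Paging is only stable if the store returns rows in a defined order, so in 
practice the two points probably go together. The total number of results 
would need a separate counting query, which is not shown.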

* Property watching: if I understood correctly, to dynamically update 
property values I have to use Nepomuk::ResourceWatcher, which lives in 
nepomukdatamanagement, still semi-private in kde-runtime, right?

* Query construction: since this will have to be used from QML/JavaScript, 
unfortunately the C++ query API can't be used directly, and full JavaScript 
bindings for it don't sound like fun ;) nor does brutally exposing SPARQL to 
QML. What I'm thinking of doing here is:
 * pass queries as strings, i.e. with the more limited desktop query language.
 * expose to JS some functions to restrict the results, like limitByType(), 
limitByActivity(), limitByTag() etc. These will use the C++ API to add the 
needed parameters to a query previously created from a string.
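
To make the shape of that idea concrete, here is a minimal sketch in plain 
C++. The class name RestrictedQuery, its accessors and the "type", 
"activity" and "tag" keys are all hypothetical; only the limitBy*() names 
come from the proposal above. It just records the restrictions. Turning 
them into real query terms with the C++ query API is deliberately left out.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch: none of this exists in Nepomuk or Plasma.
// Holds a query string plus restrictions added later from JS.
class RestrictedQuery
{
public:
    using Restriction = std::pair<std::string, std::string>; // (kind, value)

    explicit RestrictedQuery(std::string desktopQuery)
        : m_desktopQuery(std::move(desktopQuery)) {}

    // Each limitBy*() records one restriction and returns *this,
    // so calls can be chained.
    RestrictedQuery &limitByType(const std::string &type) { return add("type", type); }
    RestrictedQuery &limitByActivity(const std::string &id) { return add("activity", id); }
    RestrictedQuery &limitByTag(const std::string &tag) { return add("tag", tag); }

    const std::string &desktopQuery() const { return m_desktopQuery; }
    const std::vector<Restriction> &restrictions() const { return m_restrictions; }

private:
    RestrictedQuery &add(const std::string &kind, const std::string &value)
    {
        m_restrictions.emplace_back(kind, value);
        return *this;
    }

    std::string m_desktopQuery;
    std::vector<Restriction> m_restrictions;
};
```

For example, RestrictedQuery("holiday").limitByType("Image").limitByTag("beach") 
ends up holding the original string and two restrictions, which the model 
could then combine into one query.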

* What queries are possible, and how? I think many of the queries we will 
need will be for building tagcloud-ish things. So besides a proper tag cloud, 
which has been possible in Nepomuk for a while, other things we need could be 
series of results like:
  * resource type, number of occurrences
  * rating, number of occurrences
  * date interval (like month), number of occurrences
and so on. Is this easily feasible? (This will probably go in the dataengine, 
not in the model.)
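
For illustration, the "value, number of occurrences" series above can be 
computed on the client side as in the sketch below, assuming the values 
(types, ratings, dates) have already been fetched as strings. 
countOccurrences() and monthKey() are hypothetical helpers. Doing the 
grouping inside the store would avoid fetching every result, if the backend 
supports it.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: counts how often each value appears.
// Works for any series where the value is already a string,
// e.g. resource types or ratings.
std::map<std::string, int> countOccurrences(const std::vector<std::string> &values)
{
    std::map<std::string, int> counts;
    for (const std::string &v : values) {
        ++counts[v]; // operator[] starts a missing key at 0
    }
    return counts;
}

// Hypothetical helper: reduces an ISO 8601 date such as "2011-10-14"
// to its month bucket "2011-10" (assumes the input really is in that form).
std::string monthKey(const std::string &isoDate)
{
    return isoDate.substr(0, 7);
}
```

Mapping each date through monthKey() before calling countOccurrences() gives 
the per-month series.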

* For the write part, like assigning a score or a tag, or connecting to an 
activity: this will be done exactly like now with a Plasma service, in this 
case created not by the dataengine but by the model.

Comments? Ideas? Opinions?

Cheers,
Marco Martin