[Nepomuk] Resource scoring and activity relation

Sebastian Trüg trueg at kde.org
Tue Mar 29 16:17:51 CEST 2011


Hi list,

For some time now the idea of scoring resources based on usage
statistics and other criteria has been floating around.

1. RESOURCE SCORE

In the beginning the idea was simple: calculate one score for each
resource and remember it as nao:score. This score would be maintained
by some scoring service and updated whenever the user accesses the
resource or the resource changes. Ivan and I came up with a nice
scoring function which allows updating the usage event part of the
score at query time, and all seemed easy and nice.

But then activities and app separation came into the mix and the problem
got bigger. Now we want to be able to score by activity and application
as well. This means we want to know which resources are of interest in
one specific activity and with one specific application.

Obviously using a single property like nao:score is not enough anymore.
The go-to approach was to create something like a ResourceScore type
which has relations to the resource being scored, the activity, the app,
and the score itself:

<RS> a nao:ResourceScore ;
    nao:xxx <res> ;
    nao:isRelated <A> ;
    nao:isRelated <App> ;
    nao:score "42"^^xsd:int .

where <App> is an application resource (nao:Agent) and <A> is an
activity (kext:Activity).

This allows us to remember scores relative to app and activity. (The
downside may be that we end up with lots of ResourceScore instances, but
I see no way around that except a complicated system of named graphs
with one graph per app and activity, and that seems hacky to me.) If
we want a general score we simply sum over all of them:

select sum(?score) where {
   ?rs a nao:ResourceScore .
   ?rs nao:xxx <res> .
   ?rs nao:score ?score .
}
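
Conversely, the score for one specific activity and application could
be looked up by additionally matching the two nao:isRelated relations.
This is only a sketch under the same assumptions as above: nao:xxx is
still the placeholder for the not-yet-named property, and <A> and <App>
stand for a concrete activity and application resource:

select ?score where {
   ?rs a nao:ResourceScore .
   ?rs nao:xxx <res> .
   ?rs nao:isRelated <A> .
   ?rs nao:isRelated <App> .
   ?rs nao:score ?score .
}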


2. ACTIVITIES

In Nepomuk we will soon start to always remember the application that
maintains certain information, i.e. for each piece of information stored
in Nepomuk you will know which application created it.

My initial idea was to do the same with activities: remember in which
activity a certain piece of information was created. IMHO this is enough
and provides all the information we need.


3. PUTTING IT ALL TOGETHER

So far so good. But now for the real issue:
How do we maintain the scoring and the activity relations?

I personally would love to see all of that handled transparently within
Nepomuk. No client should ever even be allowed to set or change a score.
Instead they (the clients) should simply be able to query the scores.
There are two alternatives: we could provide an API to update scores, or
we could allow clients to handle resource scores like any other resource
and change them however they see fit.
The latter is certainly the simplest solution, but I feel that scoring
should be separated from the rest of the data as it is essentially a
cache which can always be recreated from the other information in the
database. In addition: the more Nepomuk controls itself, the less buggy
clients can break things.

Please let me know your opinions on the matter.

Cheers,
Sebastian

