[Nepomuk] Re: newbie questions about KDE Nepomuk

Tue Jan 4 10:03:23 CET 2011

Hi Darren,

On 01/03/2011 02:17 PM, Darren Cruse wrote:
> Thanks for both your replies - boiled my questions down to the following:
> 
> In my mind I was imagining Nepomuk/Strigi indexing and feeding RDF into
> Virtuoso using the Soprano C++ API on one side, and me hitting
> Virtuoso's SPARQL endpoint (or manipulating triples in Virtuoso using
> their Java API) from the other side.

This is not the way it is intended. While you currently have access to the
full SPARQL API through C++ and DBus, the idea is that Nepomuk filters all
data modifications in order to perform additional checks and add
additional data.
Currently most of this filtering lives in the C++ APIs, although it should
be in the service.

> I'm certainly not tied to this approach (I'm too green to be tied to
> anything at this point! :), but maybe I should clarify what led me to
> think that way:
> 
> a.  For this project, the metadata extracted from the HTML assets
> needs to be accessible remotely.  So using the books example, there
> needs to be a single repository of metadata for a library of books
> (Mandriva's RDF store), and multiple editorial people
> accessing/querying that metadata - most likely using a web interface
> since they run Windows not Linux.
> 
> (so here the motivation of using the SPARQL endpoint is the remote
> access to the metadata)

Well, this could be achieved by creating a simple bridge between the
Nepomuk API and a web service.
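
To make the bridge idea a bit more concrete, here is a minimal sketch in
Python using only the standard library. It is an illustration under
assumptions, not an existing Nepomuk component: `run_query` is a
hypothetical placeholder for whatever call into the Nepomuk API (C++ or
DBus) a real bridge would make, and the `query` parameter name, the JSON
response shape and port 8080 are arbitrary choices.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs


def run_query(sparql):
    """Hypothetical placeholder: a real bridge would forward the query
    through the Nepomuk API and return the result rows."""
    return [{"query": sparql, "note": "stub result, no Nepomuk call made"}]


def handle_request(path, backend=run_query):
    """Map a request path such as '/?query=...' to (status, JSON text)."""
    params = parse_qs(urlparse(path).query)
    if "query" not in params:
        return 400, json.dumps({"error": "missing 'query' parameter"})
    return 200, json.dumps(backend(params["query"][0]))


class BridgeHandler(BaseHTTPRequestHandler):
    """Thin HTTP layer; all decisions are made in handle_request()."""

    def do_GET(self):
        status, text = handle_request(self.path)
        body = text.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


if __name__ == "__main__":
    # Listen on localhost only; the port number is an assumption.
    HTTPServer(("127.0.0.1", 8080), BridgeHandler).serve_forever()
```

Keeping the request handling in a plain function means the web side can
be tested without Nepomuk running, and the stub can later be swapped for
the real call without touching the HTTP code.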

> b.  The initial project has built that web interface for
> browsing/searching the RDF (meta)data, currently using XQuery to
> transform RDF/XML into HTML.  To use Nepomuk I imagined changing that
> code to simply hit a SPARQL endpoint, which returns XML from what I
> understand, and just change the XQuery so it transforms the SPARQL
> response XML to HTML.  I'd googled and known that Virtuoso does have a
> SPARQL endpoint btw (wasn't sure whether Nepomuk exposed a SPARQL
> endpoint).

We do not expose a SPARQL endpoint.
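
That said, the transformation step you describe is small wherever the
results come from, as long as they arrive in the standard W3C SPARQL
Query Results XML Format. A rough Python sketch of turning such a
response into an HTML table follows. The function name and the sample
data (the example.org URI and the title) are made up for illustration,
and you would of course keep XQuery if that suits your setup better.

```python
import xml.etree.ElementTree as ET
from html import escape

# Namespace defined by the W3C SPARQL Query Results XML Format.
NS = {"sr": "http://www.w3.org/2005/sparql-results#"}

# Made-up sample response, used only to exercise the function.
SAMPLE = """<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head><variable name="book"/><variable name="title"/></head>
  <results>
    <result>
      <binding name="book"><uri>http://example.org/book1</uri></binding>
      <binding name="title"><literal>An Example &amp; Title</literal></binding>
    </result>
  </results>
</sparql>"""


def results_to_html(xml_text):
    """Render a SPARQL SELECT result document as a simple HTML table."""
    root = ET.fromstring(xml_text)
    variables = [v.get("name") for v in root.findall("sr:head/sr:variable", NS)]
    rows = []
    for result in root.findall("sr:results/sr:result", NS):
        cells = {}
        for binding in result.findall("sr:binding", NS):
            # The single child element is <uri>, <literal> or <bnode>.
            cells[binding.get("name")] = binding[0].text or ""
        rows.append(cells)
    header = "".join("<th>%s</th>" % escape(v) for v in variables)
    body = "".join(
        "<tr>%s</tr>"
        % "".join("<td>%s</td>" % escape(row.get(v, "")) for v in variables)
        for row in rows
    )
    return "<table><tr>%s</tr>%s</table>" % (header, body)


if __name__ == "__main__":
    print(results_to_html(SAMPLE))
```

Unbound variables simply produce an empty cell, since a result may omit
bindings for some of the variables listed in the head.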

> (and here the motivation of using the SPARQL endpoint is reducing my work :)
> 
> c.  I guess the other reason I was thinking that way was to work
> around the question of C++.  Right now I'm a one person team for the
> project and I have done C++ in the past, but honestly it's been a while
> for me.  And more importantly, code I write will be maintained by
> people who are most likely Java developers who don't know C++.
> 
> But regarding this last point, your comments made me think maybe if
> e.g. the "assets linked by an html file" requirement wound up as C++
> as part of what the Strigi indexing does, maybe you guys were saying
> you'd be interested in that being contributed as open source?  I'd
> need to read about Strigi to understand how it works - does it have
> some kind of plugin idea?   Or would incorporating this truly mean
> code I write winds up linked in with the indexer executable?

It would mean enhancing the HTML indexer plugin for Strigi or writing an
additional plugin that can be used optionally in Strigi to get
additional information about HTML files.
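
Independently of how it is packaged, the extraction logic for "assets
linked by an html file" is fairly small. The Python sketch below only
illustrates that logic. It does not use the Strigi plugin interface
(which is C++), and the set of tags and attributes treated as links is an
assumption that a real plugin would likely extend.

```python
from html.parser import HTMLParser

# Tag -> attribute that commonly references another file. Not exhaustive.
LINK_ATTRS = {"a": "href", "link": "href", "img": "src", "script": "src"}


class LinkedAssetParser(HTMLParser):
    """Collect (tag, target) pairs for every link-like attribute seen."""

    def __init__(self):
        super().__init__()
        self.assets = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <img ... />.
        wanted = LINK_ATTRS.get(tag)
        for name, value in attrs:
            if wanted is not None and name == wanted and value:
                self.assets.append((tag, value))


def linked_assets(html_text):
    """Return the linked assets of an HTML document in document order."""
    parser = LinkedAssetParser()
    parser.feed(html_text)
    parser.close()
    return parser.assets


if __name__ == "__main__":
    page = '<a href="chapter1.html">One</a><img src="cover.png"/>'
    for tag, target in linked_assets(page):
        print(tag, target)
```

Each pair could then become one statement linking the HTML file to the
asset it references, whatever vocabulary ends up being used for that.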

> Maybe I'm encouraged enough to at least investigate some more...  I
> did create a Mandriva Live CD to play with today.
> 
> Do you also have a link handy with instructions on getting/building
> the playground version of Mandriva? :)

Actually, for what you are trying to do, there is no need to do that at
this point.

Cheers,
Sebastian