XMP, Krita, KFileMetaInfo and Strigi

Richard Dale richard_dale at tipitina.demon.co.uk
Sun Jun 24 17:47:20 BST 2007


On Saturday 23 June 2007, Evgeny Egorochkin wrote:
> On Saturday 23 June 2007 22:46:06 Richard Dale wrote:
> > On Saturday 23 June 2007, Evgeny Egorochkin wrote:
> > > On Saturday 23 June 2007 20:08:50 Cyrille Berger wrote:
> > > > > That is why I thought Nepomuk could not
> > > > > solve the problem right away. What would be of interest however,
> > > > > would be to also store the data in Nepomuk to make it searchable
> > > > > and linkable. Maybe for this we need a bridging ontology that links
> > > > > XMP data with data specified in Nepomuk ontologies.
> > > >
> > > > Yes but that's really for the future :)
> > >
> > > Not so distant. Many XMP features can be mapped to the existing Nepomuk
> > > ontology, and I'll consider the missing XMP features when coming up with
> > > suggestions for the next Nepomuk ontology draft.
> > >
> > > The problem is that you can't do a 1:1 nepo-xmp mapping :(
> >
> > Does that mean that Nepomuk has its own ontology, and we can't use
> > existing RDF ones?
>
> Reusing existing ontologies is not as straightforward as you might think.
> The devil is in the details.
>
> The problem is that one ontology might have a Name property for a person,
> while another has FirstName and LastName properties.
Yes, but that's why RDF has namespaces. You can always write a bridge
between ontologies, so that when, say, a FOAF-based data set of personal
details is stored in the triple store, you can infer the VCARD (or
Nepomuk) equivalents and add them too.
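
As a sketch of the kind of bridging rule I mean, in Ruby - the
array-of-triples "store" here is a toy, and the choice of mapping
foaf:name onto vcard:FN is just an illustration, not a real Nepomuk or
Soprano API:

# Toy in-memory triple store: an array of [subject, predicate, object]
# triples. Purely illustrative - not the Soprano/Nepomuk API.
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"
VCARD_FN  = "http://www.w3.org/2001/vcard-rdf/3.0#FN"

store = [
  ["http://example.org/people#rdale", FOAF_NAME, "Richard Dale"]
]

# Bridging rule: whenever a resource has a foaf:name, assert the
# equivalent vcard:FN triple alongside it.
inferred = store.select { |s, p, o| p == FOAF_NAME }.
                 map    { |s, p, o| [s, VCARD_FN, o] }
store.concat(inferred)

store.each { |t| puts t.inspect }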

> One ontology might express a rating as an integer in the 0-100 range,
> another uses a float in the 0-1 range, etc.
Yes, but you can map one onto the other if your program (the Nepomuk
extractor) knows about both definitions.
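
To be concrete, something like this in the extractor (the value and the
assumption that the mapping is a straight linear rescale are made up for
the example):

# Map an XMP-style rating (integer, 0-100) onto a 0.0-1.0 float.
# The value and the linear rescaling are invented for illustration.
xmp_rating  = 85
nepo_rating = xmp_rating / 100.0
puts nepo_rating   # => 0.85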

> When ontologies use essentially different structures/ideologies to
> represent data, it becomes a real headache.
No, I disagree - one of the main features of RDF is that it allows
ontologies to be combined. That might not always be easy. If the Dewey
decimal library classification system doesn't have as many classification
types for Islam as it does for Christianity, that doesn't mean a clever
person can't derive a mapping from the Dewey religion categories onto an
Islamic equivalent, and the other way round too.

> Often you have several ontologies trying to represent the same data (a
> good many media ontologies are an example of this).
>
> This is further complicated by outdated ontologies and ontologies with
> questionable practices (I had pointed out some issues with XMP which go
> against today's semantic approaches, but they were fine when the standard
> was drafted).
So only the Nepomuk team can design ontologies correctly, and we should
just discard XMP data and not add it to the triple store? That doesn't
ring true to me. If you want to design an extra ontology for KDE, like
SCOT does for tagging data, that is fine (I assume you can tag XMP via
SCOT), but I don't think you should just discard XMP data and not write it
to the store. I really think it is important that a SPARQL endpoint on a
KDE desktop should be exactly the same as a SPARQL endpoint on the web.
XMP stored locally and combined with XMP data queried from the internet is
more useful than Nepomuk data which only works locally.
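
For example, if XMP's Dublin Core properties go into the store unchanged,
the very same query should work against the desktop endpoint and against
any endpoint on the web - the dc:subject value is a made-up example, and
printing the query stands in for however it actually gets sent:

# The same SPARQL, usable unchanged against a local or a remote
# endpoint, because dc:subject comes straight from the XMP Dublin
# Core schema.
query = <<SPARQL
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?doc WHERE { ?doc dc:subject "holiday" }
SPARQL

puts query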

> Thus, making an ontology as generic as Nepomuk is intended to be involves
> compromises.
>
> Also, if it becomes clear you can't be compatible in some aspect with all
> existing ontologies, it makes sense to adopt the most sensible approach
> according to today's knowledge, and not yesterday's assumptions or
> outright mistakes.
>
> That said, proper ontology design and a reasonable coding effort can map
> 90-95% of other ontologies' features, even in cases of complex and
> troublesome ontologies.
>
> Compare this to typical usage cases like ID3v2.4. Hardly anyone uses even
> 10 of the tags provided by the standard, when in fact there are 10x more.
>
> The concern about a 100% 1:1 mapping is only for certain production apps
> that must have access to all the obscure features of a particular
> standard.
>
> > Is it ok to read from the Redland triple store directly, or should it
> > always be done via a Nepomuk service? I ask because Ruby has some nice
> > software for reading from and writing to RDF stores (ActiveRDF), and I
> > would like to be able to use those APIs for kde/ruby programming.
> > QtRuby/Korundum can use QtDBus too. Should C++ programs access Nepomuk
> > via Soprano (or ActiveRDF for Ruby), and hence the triple store
> > directly, or only go via a D-Bus API?
>
> While at this moment it might not matter, eventually you may expect some
> performance tweaks, and data stored quite differently from how it is
> presented via the KMetaData API. Also, I think in many cases the native
> constructs of OO languages are much easier to use than raw RDF.
I don't care much about performance tweaks - I was thinking more about
whether there will be some sort of inference engine in Nepomuk that would
only be available via the C++ API. If so, querying the triple store from
Ruby via ActiveRDF would miss out on those inferences, which are derived
from the knowledge that the triple store is part of a KDE desktop.
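
A contrived illustration of the worry, with invented property names: the
raw store only holds the asserted triples, so a query for anything the
inference engine would have derived comes back empty:

# The raw store holds only asserted triples; anything an inference
# engine derives exists only behind the service API. "ex:rating" and
# "ex:score" are invented names for the example.
asserted = [["file:///home/rdale/pic.png", "ex:rating", "0.85"]]

def query_raw_store(triples, predicate)
  triples.select { |s, p, o| p == predicate }
end

puts query_raw_store(asserted, "ex:score").inspect   # => []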

-- Richard
