[Nepomuk] Fwd: sequences

Bèrto ëd Sèra berto.d.sera at gmail.com
Thu Sep 23 16:09:20 CEST 2010


Hi!


> > In these terms you can think of an entire system of annotations as uuids
> > used to tag other uuids. The way you tag them (and what you use to tag
> > them) makes the most of the ontology, while actual content always steps
> > in at "profile" base level.
>
> Explain again please why you can't just use plain literal properties
> with different translations?
>

Because we don't always have them. A few use cases we need to cover:
1) two sources map the same profile: http://de.wiktionary.org/wiki/cat and
http://en.wiktionary.org/wiki/cat. A user may have linked just one of them,
and there can be many more than two "regions", so we never know where we
"emerge" the profile from. These are NOT translations, as oftentimes none of
the sources checked the others for consistency. They are "alternate
versions": they are equivalent, since they are grouped by the same profile,
but never fully interchangeable;
2) some entries are not textual at all. I can have this:
http://commons.wikimedia.org/wiki/File:Mona_Lisa.jpg as an expression, and a
textual description for it, or maybe a video in sign language. Again, this
may vary very heavily, depending on what "regions" and languages a user
links.
3) some entries are too long. Say we have a full text from here:
http://www.gutenberg.org/ebooks/2199 and break it into an ordered sequence.
For ancient texts we often have different versions, based on different
manuscripts. All of them are legitimate, none of them is fully interchangeable.
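
To make the three cases concrete, here is a rough Python sketch of a profile that groups alternate versions. It is only an illustration under my own assumptions: the names Expression, Profile, region, media_type and parts are hypothetical and are not taken from any existing Nepomuk code.

```python
from dataclasses import dataclass, field
from typing import List, Set
from uuid import UUID, uuid4


@dataclass
class Expression:
    """One concrete rendering of a profile (hypothetical structure)."""
    region: str        # assumed region label, e.g. "en.wiktionary"
    media_type: str    # "text", "image", "video", ...
    parts: List[str]   # ordered URLs or chunks; a single item if not split
    uid: UUID = field(default_factory=uuid4)


@dataclass
class Profile:
    """Groups alternate versions: equivalent, but never interchangeable."""
    uid: UUID = field(default_factory=uuid4)
    expressions: List[Expression] = field(default_factory=list)

    def emerge(self, linked_regions: Set[str]) -> List[Expression]:
        """Return only the expressions from regions this user has linked."""
        return [e for e in self.expressions if e.region in linked_regions]


cat = Profile()
cat.expressions.append(
    Expression("de.wiktionary", "text", ["http://de.wiktionary.org/wiki/cat"]))
cat.expressions.append(
    Expression("en.wiktionary", "text", ["http://en.wiktionary.org/wiki/cat"]))

# A user who linked only the English region sees only that version.
visible = cat.emerge({"en.wiktionary"})
```

Nothing in the profile itself is text: content only steps in through whichever expressions the user's linked regions provide.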

Since we have all of this polymorphism, we decided to implement
full polymorphism bottom up. If the property names were to escape this
mechanism, they would need a separate translation process, and we are well
aware that volunteers are extremely scarce, especially since
the minimal requirement is a high degree of bilingualism. So if we base our
localisation on work that is readily available, we gain
immediate usability.

We DO have translations, however, but a "translation" is the result of
a traceable process. It is made in community regions, and it has an authorship
attribution and a Quality Assessment by the admin of the region. These are
proper translations, as they are made by identifiable authors based on
identifiable sources. Everything else is simply the result of a bot
believing an internal wikilink between two editions (or similar) or an admin
saying "Hey, these two profiles are actually one!". But this is not a
translation, it's rather a semantic annotation.

There is a price for this, in the following terms:
1) We will have to adapt the existing Nepomuk monitor so that it is
able to "emerge" tags, otherwise debugging becomes pretty close to reading
machine code.
2) For small languages there may NOT be a ready equivalent/translation for a
given property. This is solved by letting the users choose a fallback
strategy, and having a "panic" language (most probably English) to address
the case when even the fallback strategy fails.
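
The fallback in point 2 could look roughly like this in Python. This is a minimal sketch, not existing code: resolve_label, PANIC_LANGUAGE and the language codes used below are assumptions made for illustration.

```python
from typing import Dict, List, Optional

# Assumption: the panic language is English ("most probably English").
PANIC_LANGUAGE = "en"


def resolve_label(labels: Dict[str, str],
                  fallback_chain: List[str],
                  panic: str = PANIC_LANGUAGE) -> Optional[str]:
    """Pick the first available label along the user's fallback chain."""
    for lang in fallback_chain:
        if lang in labels:
            return labels[lang]
    # The user's strategy failed: try the panic language as a last resort.
    return labels.get(panic)


labels = {"it": "nome", "en": "name"}

# The user prefers Piedmontese, then Italian: Italian is found.
first = resolve_label(labels, ["pms", "it"])

# Nothing in the chain matches, so the panic language is used.
second = resolve_label(labels, ["pms", "fr"])
```

If even the panic language has no label the function returns None, and the caller would have to show the bare uuid.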

Bèrto