[Nepomuk] Fwd: sequences

Bèrto ëd Sèra berto.d.sera at gmail.com
Tue Sep 21 18:05:03 CEST 2010


>
> Hi Roman,
>
> > /* A "Region" has a textual description, along with other minor properties,
> > so we want it to inherit all translation capabilities from Profile */
> > foo:Region rdf:type rdfs:Class .
> > foo:Region rdfs:subClassOf foo:Profile .
>
> > /* This is where actual content is */
> > foo:Content rdf:type rdfs:Class .
>
> > foo:means rdf:type rdf:Property .
> > foo:means rdfs:domain foo:Content .
> > foo:means rdfs:range foo:Profile .
> ...
> > foo:belongsTo rdf:type rdf:Property .
> > foo:belongsTo rdfs:domain foo:Content .
> > foo:belongsTo rdfs:range foo:Region .
>
> Since Region is a subclass of Profile, the foo:means property may relate
> Content to Region.
> foo:belongsTo seems to serve the same purpose here, or should those 2
> properties have different semantics?
>

They are two different things. Anything that has to appear to a user as
"language-dependent content" (i.e., it's expressed in a language) is a
subclass of Profile. Even a tag name is, as we want to tag something with
a Japanese tag サッカー and have it read by a Russian user as tag Футбол, by a
German as tag Fußball etc. Nevertheless, a profile is just a dictionary
entry, while its subclasses are specialized things that happen to be
translatable (as expressions) and explainable/translatable (by means of
their definitions) like anything else. This is really just about being
able to manage a multilingual environment.

Since サッカー is already going to be included in our dictionary sources
(basically all words and most common expressions will be, at least for the
10 most common languages), when a user adds a free tag we look for it in the
profile list and generate the tag's "system name" as simply the profile
uuid. We then "emerge" it as language-dependent content according to the
user's desktop settings. Once emerged, profile
1aa75482-4e61-11de-a0cb-e7dd9793c483, tagged as
1aa75482-4e67-11de-a0cb-e7dd9793c483, will appear as "dog" tagged as
"animal", or "cane" tagged as "animale", depending on the user's settings.

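The lookup-then-emerge step could be sketched roughly as follows. This is
only an illustration: the in-memory dictionary layout, the language codes and
the function names (find_profile, emerge, describe) are assumptions, not the
project's actual code. Only the two uuids and the dog/animal, cane/animale
pairs come from the text above.

```python
# Hypothetical in-memory dictionary: profile uuid -> {language code: expression}.
# The two uuids are the ones quoted above; the data layout is an assumption.
PROFILES = {
    "1aa75482-4e61-11de-a0cb-e7dd9793c483": {"en": "dog", "it": "cane"},
    "1aa75482-4e67-11de-a0cb-e7dd9793c483": {"en": "animal", "it": "animale"},
}

# Language-independent annotation: one uuid tagged with another uuid.
TAGS = [
    ("1aa75482-4e61-11de-a0cb-e7dd9793c483",
     "1aa75482-4e67-11de-a0cb-e7dd9793c483"),
]


def find_profile(expression):
    """Return the uuid of the first profile holding this expression in any language."""
    for uuid, expressions in PROFILES.items():
        if expression in expressions.values():
            return uuid
    return None


def emerge(uuid, lang):
    """Render a profile uuid in the given language, falling back to the raw uuid."""
    return PROFILES.get(uuid, {}).get(lang, uuid)


def describe(subject, tag, lang):
    """Render one (subject, tag) annotation for a user whose language is `lang`."""
    return f'"{emerge(subject, lang)}" tagged as "{emerge(tag, lang)}"'


for subject, tag in TAGS:
    print(describe(subject, tag, "en"))
    print(describe(subject, tag, "it"))
```

The stored annotation never changes; only the rendering depends on the
language passed to emerge().
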
This allows easy semantic data exchange among people who absolutely do
NOT understand each other. It also means that less-resourced languages get
the full wealth of what major languages have produced as soon as they
translate the dictionary. It's also important for industrial applications:
think of companies that work on different continents and use a semantic
network to exchange information about a project. The applications are endless.

The "entries", as we said, can be either expressions or definitions, and they
can be either textual or multimedia. There is no limit to the number
of synonyms and alternate or specialized expressions/definitions. So I can
have a conference video and its audio in English as an expression, then
translate the audio into French. Since we are going to serve HTML5 multimedia,
this means we can sync the French audio or superimpose a Sign Language
translation of it (which is technically a video translation of an audio
file) at rendering time. Nobody needs to download stuff they don't
understand, unless they specified they want it. Or I can include a literary
work from Project Gutenberg in the expression (as a synonym of the
title) and translate it. This is why we need to break long texts into shorter
ordered sequences, to simplify the translators' work.
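
One possible way to break a long text into ordered sequences is sketched
below. The sentence-boundary heuristic, the max_chars limit and the function
name split_into_sequences are all assumptions made for illustration; the
actual segmentation rules are not specified here.

```python
import re


def split_into_sequences(text, max_chars=200):
    """Split text into ordered (index, chunk) pairs at sentence boundaries.

    Sentences are packed greedily into chunks of at most max_chars characters.
    A single sentence longer than max_chars is kept whole rather than cut.
    """
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if current and len(current) + 1 + len(sentence) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    # The 1-based index records the order, so translated chunks can be
    # reassembled in the original sequence.
    return list(enumerate(chunks, start=1))


for index, chunk in split_into_sequences("One. Two. Three.", max_chars=9):
    print(index, chunk)
```

Each translator then works on one numbered chunk at a time, and the index is
enough to put the translation back together.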

Getting back to our "region", this is really going to be stuff like
"en.wiktionary.org", "de.wiktionary.org", "AGROVOC", "community X" etc. In
most cases the name is hardly going to be translated at all, as it's a sort
of trademark, but the definition needs to be, so that a Russian user who
knows nothing of, say, AGROVOC may read "AGROVOC - это многоязычный,
структурированный словарь, который охватывает терминологию всех отраслей
сельского хозяйства, лесного и рыбного хозяйства, пищевой и смежных областей
(например, окружающая среда).". So he can see it in his list of possible
"sources" and decide whether such a source is of any interest to him.

"Profile" simply means "this thing can be translated and explained as in a
dictionary" and it's pretty much our base "object". What we derive from it
inherits all of its capabilities as a dictionary entry, while having a
specialized role (a license, for example, as we also store licensing data,
but at the same time care about having a license text users can understand). I
know it's a bit complex, but it gets simpler if you make a basic division
between "language-independent" and "language-dependent" content.
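
In object terms, the inheritance could look roughly like the sketch below.
The class layout, the field names and the "homepage" property are
hypothetical; only the Profile/Region relationship and the AGROVOC example
come from this thread.

```python
from dataclasses import dataclass, field
from uuid import uuid4


@dataclass
class Profile:
    """Base dictionary entry: expressions and definitions keyed by language code."""
    expressions: dict = field(default_factory=dict)
    definitions: dict = field(default_factory=dict)
    uuid: str = field(default_factory=lambda: str(uuid4()))

    def expression(self, lang):
        """Expression in `lang`, else any available expression, else the uuid."""
        if lang in self.expressions:
            return self.expressions[lang]
        return next(iter(self.expressions.values()), self.uuid)

    def definition(self, lang):
        """Definition in `lang`, or None if it has not been translated yet."""
        return self.definitions.get(lang)


@dataclass
class Region(Profile):
    """A content source; inherits the dictionary-entry behaviour unchanged."""
    homepage: str = ""  # hypothetical "minor property"


agrovoc = Region(
    expressions={"en": "AGROVOC"},  # the name is a sort of trademark
    definitions={
        "ru": "AGROVOC - это многоязычный, структурированный словарь, "
              "который охватывает терминологию всех отраслей сельского "
              "хозяйства, лесного и рыбного хозяйства, пищевой и смежных "
              "областей (например, окружающая среда)."
    },
)

print(agrovoc.expression("ru"))  # untranslated name falls back to "AGROVOC"
print(agrovoc.definition("ru"))
```

Region adds nothing to the translation logic; it only gets a specialized role
on top of what Profile already does.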

In these terms you can think of an entire system of annotations as uuids
used to tag other uuids. The way you tag them (and what you use to tag them)
makes up most of the ontology, while actual content always steps in at
the base "profile" level.

Bèrto