[Nepomuk] Fwd: sequences

Roman Evstifeev someuniquename at gmail.com
Tue Sep 21 19:57:08 CEST 2010


2010/9/17 Bèrto ëd Sèra <berto.d.sera at gmail.com>:
> Sorry, I had mistakenly sent this email to Sebastian only...
>
> Hi!
>
> On 16 September 2010 19:46, Sebastian Trueg <trueg at kde.org> wrote:
>>
>> It is not recommended to use RDF containers. They cannot be properly
>> queried via SPARQL, support is not guaranteed, and their semantics are
>> very unclear anyway.
>> Thus, I follow popular opinion in the semantic world and recommend
>> against using them.
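To illustrate the querying problem (a hedged sketch with made-up data; `ex:doc1`, `ex:tags`, and `ex:tag` are hypothetical names): the members of an rdf:Seq hang off generated properties rdf:_1, rdf:_2, ... with no fixed upper bound, so "all members" can only be matched by filtering on the predicate IRI itself, whereas a plain repeated property needs a single triple pattern.

```sparql
PREFIX ex: <http://example.org/ns#>

# Container version: members sit behind rdf:_1, rdf:_2, ...
SELECT ?member WHERE {
  ex:doc1 ex:tags ?seq .
  ?seq ?p ?member .
  FILTER(STRSTARTS(STR(?p),
         "http://www.w3.org/1999/02/22-rdf-syntax-ns#_"))
}
```

With a plain repeated property the same question is just `SELECT ?member WHERE { ex:doc1 ex:tag ?member . }`, which is also far friendlier to indexes.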
>
>  Duly noted. We already have a high risk factor "as is"; we won't be
> looking for additional trouble.
>>
>> If you need to store more information then you need to go the normal RDF
>> way: define the ontology constructs you need. We will be happy to help
>> you with that.
>
> Okay, I'll try to use the example I found in the ontologies, so you can
> correct my misunderstandings as I go. I put it at the end, as some
> explanations may be of help before that.
>>
>> Thus, please think twice before trying to store anything in the
>> graph/context metadata. In most cases a specific class or property might
>> make more sense.
>
>
> Again, whatever does the job better and with less trouble is welcome.
>>
>> Could you maybe elaborate a bit on your project?
>
> It's a simple idea: we collect external data sources, like wiktionary,
> AGROVOC, geographic names, public open source DBs etc and put them into a
> common format that allows a single interface for them all. In our DSL the
> prepared data from a single source is called a "region". We have bots doing
> the collect/update job for a region, and since data is semantically tagged,
> a user may say "I want everything about Tai-chi, vegan food and astronomy
> from the following regions in English and Japanese". His machine downloads
> the prepared normalized data from a list of network-sources to keep
> up-to-date. While these bot-driven regions are read-only, the user can also
> upload stuff back to the same network-sources using a "community region", in
> order to share it. There may be any number of communities, as we expect
> people to develop thematic communities or simply to dislike each other's
> POVs sooner or later, and communities are self-managed. If I don't like a
> community I don't link their stuff, and that's it. Anyway, this is a further
> development, we will have just one community, to begin with.
Is it right that, in the beginning, a "community region" will just
contain some raw data aggregated from the read-only regions? And that
after users' contributions to this community the data will essentially
"fork" from the data aggregated by bots?
>
>>
>> Could you also elaborate on the distributed rep, please. Be aware that
>> Nepomuk does not provide a distributed store and it is very unlikely
>> that it will in the near future - simply because creating a distributed
>> store is very very hard, a lot of work, and requires expertise that we
>> do not have...
>
>
> We have no idea about how to make a "real" distributed store, either. The
> doable thing I can think of is, as I said, a list of network-sources that
> can import and distribute a number of "regions" to end users. But I tend to
> think that since we have to upload stuff back to a "community region" two
> laptops could sync each other using any network connection in place. So, for
> example, if I live in an isolated village in the middle of nowhere (a
> common situation in the third world) and I have but one box in a school,
> anyone coming to visit with a laptop can update me, provided that I told him
> what to download when he could get a normal connection and he has storage
> space enough for it. Or I could remain in the village and be sent a USB key
> or a DVD with updates on it. This would already be a lot, for most "randomly
> connected" situations. Most of Africa and a lot of Asia has little choice
> other than this, and it's especially for them that accessing "thematic
> knowledge" locally is a high value.
> I suppose that since dbpedia has RDF exports there should be RDF imports,
> and we could just use this. Once export files are available, they could be
> broadcast using the p2p features (which I know nothing of, I just know they
> should be there, sooner or later). Since we do multimedia content, this is
> especially important to limit the amount of content one wants to store. To
> remain with our previous example, my subscription could be: "I want
> everything about Tai-chi, vegan food and astronomy from the following
> regions in English and Japanese, excluding video and audio files, pictures
> included". In any case I would get pointers to remote resource uuids, telling
> me there's a video/audio file (and its tags), so that I can know it's there
> and I can order single downloads if I decide some particular material is of
> interest.
> Now let's get to the model. "Profile" is our DSL lingo for "meaning",
> see http://en.wikipedia.org/wiki/Cognitive_semantics#Langacker:_profile_and_base :
> @prefix foo: <http://foo.bar/types#> .
> foo:Profile rdf:type rdfs:Class .
> # A "Region" has a textual description, along with other minor properties,
> # so we want it to inherit all translation capabilities from Profile
> foo:Region rdf:type rdfs:Class .
> foo:Region rdfs:subClassOf foo:Profile .
> # This is where actual content is
> foo:Content rdf:type rdfs:Class .
> foo:Text rdf:type rdfs:Class .
> foo:File rdf:type rdfs:Class .
> foo:means rdf:type rdf:Property .
> foo:means rdfs:domain foo:Content .
> foo:means rdfs:range foo:Profile .
> # Here I'm in trouble, as I need what DBs call an ENUM(expression,
> # definition) that defines the role a Content instance plays in a dictionary
> # expr=def equation. You will excuse my "creative syntax"; probably I should
> # have created a "Role" class with two instances, right?
> foo:hasRole rdf:type rdf:Property .
> foo:hasRole rdfs:domain foo:Content .
> foo:hasRole rdfs:range foo:(expression,definition) .
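For what it's worth, the conventional enumeration pattern is indeed a Role class with two named individuals (a sketch, not a final design; `foo:Role`, `foo:Expression`, and `foo:Definition` are names I'm making up here):

```turtle
@prefix foo:  <http://foo.bar/types#> .
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

foo:Role rdf:type rdfs:Class .
# The two allowed values become named individuals of the class:
foo:Expression rdf:type foo:Role .
foo:Definition rdf:type foo:Role .

foo:hasRole rdf:type rdf:Property .
foo:hasRole rdfs:domain foo:Content .
foo:hasRole rdfs:range  foo:Role .
```

This keeps the range a plain class, so any RDFS-aware tool can validate it, while the "enum" stays closed by convention.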
> # How do we avoid infinite recursions here?
> foo:isTranslationOf rdf:type rdf:Property .
> foo:isTranslationOf rdfs:domain foo:Content .
> foo:isTranslationOf rdfs:range foo:Content .
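One possible answer, assuming OWL vocabulary is acceptable and a reasoner is available (which may not hold in Nepomuk): declare the property symmetric and transitive, so the full translation cluster is inferred from a minimal set of stored links instead of being materialized as recursive chains.

```turtle
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix foo: <http://foo.bar/types#> .

# A reasoner can then infer B isTranslationOf A from A isTranslationOf B,
# and A isTranslationOf C from A->B->C, without explicit cycles in the data.
foo:isTranslationOf rdf:type owl:SymmetricProperty .
foo:isTranslationOf rdf:type owl:TransitiveProperty .
```

Without a reasoner, the same effect needs the closure computed at import time.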
> # Do we have Booleans? Anyway, if an instance of content gets modified, all
> # of its translations are marked "fuzzy" by this flag
> foo:isVerified rdf:type rdf:Property .
> foo:isVerified rdfs:domain foo:Content .
> foo:isVerified rdfs:range foo:Boolean .
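On the Boolean question: RDF itself has no Boolean class, but the XML Schema datatypes include xsd:boolean, which is the usual range for flags like this (a sketch replacing the placeholder foo:Boolean above):

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix foo:  <http://foo.bar/types#> .

foo:isVerified rdfs:range xsd:boolean .

# Example instance data (foo:someContent is a made-up resource):
foo:someContent foo:isVerified "false"^^xsd:boolean .
```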
> # Now these two properties are under a mutex constraint: something is either
> # a text or a file. Not sure whether this distinction is important for
> # Nepomuk; in PostgreSQL we use it to separate things we can set a full-text
> # search on from things that must be searched otherwise. Content is also
> # used as a meta-level, to send out minimal information about remote files
> # that aren't actually present on the system
> foo:hasText rdf:type rdf:Property .
> foo:hasText rdfs:domain foo:Content .
> foo:hasText rdfs:range foo:Text .
> foo:hasFile rdf:type rdf:Property .
> foo:hasFile rdfs:domain foo:Content .
> foo:hasFile rdfs:range foo:File .
> # This assigns content to a Region
> foo:belongsTo rdf:type rdf:Property .
> foo:belongsTo rdfs:domain foo:Content .
> foo:belongsTo rdfs:range foo:Region .
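Putting the pieces together, instance data under this schema might look like the following (a sketch; every identifier here is invented for illustration):

```turtle
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix foo: <http://foo.bar/types#> .

foo:wiktionary rdf:type foo:Region .

foo:entry42 rdf:type foo:Content ;
    foo:means     foo:profileTaiChi ;   # a hypothetical Profile instance
    foo:hasText   foo:text42 ;          # mutex with foo:hasFile
    foo:belongsTo foo:wiktionary .
```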
> Now, before I write too much garbage syntax, is this readable/usable? There
> is much more to come, although I expect changes to be needed for the
> existing model to adapt to this new environment.
> Bèrto
>
>
> --
> ==============================
> Constitution of 24 June 1793 - Article 35. - When the government violates
> the rights of the people, insurrection is, for the people and for each
> portion of the people, the most sacred of rights and the most indispensable
> of duties.
>
> _______________________________________________
> Nepomuk mailing list
> Nepomuk at kde.org
> https://mail.kde.org/mailman/listinfo/nepomuk
>
>
