[Digikam-devel] Second try: classes and properties for describing excerpts (parts of text, parts of images)

Thu Sep 15 16:25:05 BST 2011

Hi list,

This is the second time I am sending this email. Previously it had the
subject "Bookmarking - rebooted", and I fear that this scared away some
people who should actually be interested in this topic. From my point of
view the concepts are the same, so here goes again:

Bookmarking is a rather simple concept, typically used for URLs in web
browsers or file managers. In the Nepomuk ontologies (NFO) we have a
more or less direct mapping of the old bookmarking concept to classes:

nfo:BookmarkFolder contains several nfo:Bookmarks which nfo:bookmarks
some nie:DataObjects.

This is fine for the most basic kind of bookmarking: web URLs and
files/folders. However, the need for finer-grained bookmarking quickly
arose - a position in a text, a stream, and so on. Thus, properties like
nfo:pageNumber were created, which give some information on the position
in the data object.
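
To make the existing model concrete, here is a minimal sketch in Turtle
of a bookmark folder holding one bookmark that points at page 42 of a
document. The ex: namespace and all resource names are made up for
illustration, and nfo:containsBookmark is assumed to be the NFO property
that links a folder to its bookmarks.

```turtle
@prefix nfo: <http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#> .
@prefix nie: <http://www.semanticdesktop.org/ontologies/2007/01/19/nie#> .
@prefix ex:  <http://example.org/> .   # hypothetical namespace, illustration only

ex:readingList a nfo:BookmarkFolder ;
  nfo:containsBookmark ex:bookmark1 .  # assumed folder-to-bookmark property

ex:bookmark1 a nfo:Bookmark ;
  nfo:bookmarks ex:thesisPdf ;         # the bookmarked data object
  nfo:pageNumber 42 .                  # the position is stored on the bookmark itself

ex:thesisPdf a nie:DataObject .
```

Note that the bookmark only records where to jump to; nothing in this
data says what content at that position was considered interesting.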

From my point of view this is not a great solution. For starters I do
not even like the concept of bookmarks. For me a bookmark is nothing
more than a piece of information that has been marked as interesting.
And with semantic search and friends I see no need for the organization
into bookmark folders anymore anyway.

That aside, I also think that we should not try to describe where in some
document our bookmark points to. Instead, we should properly define the
excerpt that we want to remember - a piece of text, part of an image,
and so on. Thus, we need to describe part of a nie:InformationElement.
Part of an nfo:PlainTextDocument, for example, is a piece of text which
starts at a certain character offset and has a certain length. To state
that a person is depicted in an image, we should describe the part of the
image and then simply link that to the person.

To me all this seems to happen on the nie:InformationElement level
rather than on the nie:DataObject level. We are interested in the
information, not the container. Thus, such a part of the document would
be nie:isLogicalPartOf the main information element.
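
As a rough sketch of what this could look like for text, the following
Turtle describes a 58-character excerpt starting at character 120 of a
plain text document. nfo:TextExcerpt comes from the draft at the end of
this mail. The two ex: properties are purely hypothetical placeholders:
the draft does not define any offset or length property for text yet.

```turtle
@prefix nfo: <http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#> .
@prefix nie: <http://www.semanticdesktop.org/ontologies/2007/01/19/nie#> .
@prefix ex:  <http://example.org/> .   # hypothetical namespace, illustration only

ex:report a nfo:PlainTextDocument .

ex:quote1 a nfo:TextExcerpt ;          # class proposed in the draft
  nie:isLogicalPartOf ex:report ;      # the excerpt is part of the information element
  ex:characterOffset 120 ;             # hypothetical property: start of the excerpt
  ex:characterLength 58 .              # hypothetical property: length of the excerpt
```

The excerpt is a resource of its own here, so it can be tagged, rated or
linked like any other information element.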

What I am not sure about, however, is whether part of, say, a
nfo:RasterImage is a nfo:RasterImage again or if we need a dedicated new
type or if we would double-type.

In any case I would like to kick off the discussion of this topic which
is important in many situations by simply throwing some draft at you.
Have a look, comment on it, tell me that it is utter bs or that you like
the approach. Let's discuss.

Cheers,
Sebastian

The draft:
==========================================

nie:Excerpt a rdfs:Class ;
  rdfs:subClassOf nie:InformationElement .

nie:containsExcerpt a rdf:Property ;
  rdfs:subPropertyOf nao:hasSubResource, nie:hasLogicalPart ;
  nrl:inverseProperty nie:isExcerptOf .

nie:isExcerptOf a rdf:Property ;
  rdfs:subPropertyOf nao:hasSuperResource, nie:isLogicalPartOf ;
  nrl:inverseProperty nie:containsExcerpt .

nfo:TextExcerpt a rdfs:Class ;
  rdfs:subClassOf nie:Excerpt .

# can this be a nfo:Visual?
nfo:ImageRegion a rdfs:Class ;
  rdfs:subClassOf nie:Excerpt .

nfo:RectangularImageRegion a rdfs:Class ;
  rdfs:subClassOf nfo:ImageRegion .

nfo:offsetX a rdf:Property ;
  rdfs:domain nfo:RectangularImageRegion ;
  rdfs:range xsd:integer .

nfo:offsetY a rdf:Property ;
  rdfs:domain nfo:RectangularImageRegion ;
  rdfs:range xsd:integer .
==========================================
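
For illustration only, instance data using the draft might look like the
sketch below: a rectangular region of a photo that is linked to a person.
The ex: namespace, the resource names, the size properties and ex:depicts
are all hypothetical. The draft defines only the offsets so far, and which
property should link a region to the thing it shows is still open.

```turtle
@prefix nfo:  <http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#> .
@prefix nie:  <http://www.semanticdesktop.org/ontologies/2007/01/19/nie#> .
@prefix pimo: <http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#> .
@prefix ex:   <http://example.org/> .  # hypothetical namespace, illustration only

ex:holidayPhoto a nfo:RasterImage .

ex:faceRegion a nfo:RectangularImageRegion ;
  nie:isExcerptOf ex:holidayPhoto ;    # draft property, inverse of nie:containsExcerpt
  nfo:offsetX 310 ;                    # draft property: left edge of the region
  nfo:offsetY 95 ;                     # draft property: top edge of the region
  ex:regionWidth 120 ;                 # hypothetical: no size property in the draft yet
  ex:regionHeight 160 ;                # hypothetical: no size property in the draft yet
  ex:depicts ex:alice .                # hypothetical link from the region to the person

ex:alice a pimo:Person .
```

Whether the region should additionally be typed as nfo:RasterImage or
nfo:Visual is exactly the open question raised above.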