[Nepomuk] First time Hello / Conquirere research tool / BibTeX ontology

Sebastian Trüg trueg at kde.org
Sun Sep 18 07:31:49 UTC 2011


Hi Joerg,

welcome. Great to see that you are invested in Nepomuk.

On 09/17/2011 11:48 PM, Jörg Ehrichs wrote:
> First of all I like to say that Nepomuk runs great on my system.
> Furthermore from a developer point of view it is a pleasure to use it,
> most stuff works just as expected.
> The Semantic Desktop is an interesting concept and I really hope that
> over time more and more applications adopt it. Sadly there is not
> much you can do with it as of today, which might be one of the main
> reasons KDE users disable it. Sebastian's save prototype looks great
> though and I hope he can finish it soon.

thanks. :)

> Like I said above I'm looking for a new project and thus I like to add
> something to Nepomuk.
> What I always missed in the last years was some program that lists a
> collection of all files connected to a specific research topic. In my
> case mostly research papers in pdf format, but also images, emails,
> mindmaps, other project files.
> Currently I manage everything in my own folder structure and try to keep
> track of the references via KBibtex, Mendeley, Zotero, Dolphin, Kontact.
> 
> This use case screams literally for a new semantic approach and this is
> what I try to achieve.

agreed.

> The program is called Conquirere and it can be found in a really really
> early version in playground/edu [1]
> Currently it offers a way to create a new project (and a new Nepomuk tag
> with it) and lists your "library" of tagged documents / mails / images
> and such in one program.

I would suggest that you not create a tag but a pimo:Project instead.
Then you can relate things to it via nao:isRelated or even pimo:related.
nao:Tag is actually just a dumb keyword without any semantic
information. As such one should avoid it if possible. I feel more and
more that it was a bad idea to introduce it in the first place...
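
A minimal sketch of the project-instead-of-tag idea, assuming triples are
kept as plain Python tuples rather than in a real Nepomuk store. The "ex:"
URIs, the label and the helper related_to() are made up for illustration;
only pimo:Project and nao:isRelated come from the text above.

```python
# Triples are modelled as plain (subject, predicate, object) tuples.
# All "ex:" URIs are hypothetical and exist only for this illustration.
PROJECT = "ex:project/semantic-search"
PAPER = "ex:file/paper.pdf"
MAIL = "ex:mail/1234"

triples = {
    # The project is a typed resource, not just a keyword.
    (PROJECT, "rdf:type", "pimo:Project"),
    (PROJECT, "nao:prefLabel", "Semantic Search"),
    # Documents and mails are related to the project resource.
    (PAPER, "nao:isRelated", PROJECT),
    (MAIL, "nao:isRelated", PROJECT),
}


def related_to(graph, project):
    """Return every resource linked to the project via nao:isRelated."""
    return sorted(s for (s, p, o) in graph if p == "nao:isRelated" and o == project)


project_members = related_to(triples, PROJECT)
print(project_members)
```

Because the project is a resource of its own, it can later carry further
statements (a description, a start date, members) that a plain tag cannot.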

> It is also possible to see all system wide documents (capped at 500 per
> resource type currently).
> The tagging has to be done via Kontact/Dolphin/DigiKam or by copying the
> files to the project folder.
> 
> This alone is nothing special so far.
> 
> What I have added is a new ontology [2] that maps the BibTeX (and some
> more) information to Nepomuk, allows changing every field easily and
> also exporting all project documents to a BibTeX .bib file.
> 
> I've seen there was a discussion about this ontology/KBibTeX integration
> a while ago. The last message said, the best way to talk about this
> topic is over some actual source code.
> So here we go :)

The ontology is a very good start already. Of course I have a lot of
comments (I always do). And after having looked at it I feel it is best
to start with a general comment:
When designing an ontology you should not try to create a 1-to-1 mapping
of the source format (here the BibTeX fields) but instead try to design
the ontology the way things actually are. Wow,
there is a weird sentence nobody can understand. :P
What I mean is that instead of storing the journal as a string, you
store it as a resource which has a name and can even have an address and
so on.
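
The difference between a string and a resource can be sketched as
follows. This is only an illustration under assumptions: the "ex:" URIs,
the journal names, the lower-case property name nbib:journal and the
helper articles_in() are invented here, and triples are plain tuples.

```python
# Variant 1: 1-to-1 BibTeX mapping, the journal is only a literal string.
flat = {
    ("ex:article/1", "nbib:journal", "Journal of Example Studies"),
    # Same journal, but spelled differently in another .bib entry.
    ("ex:article/2", "nbib:journal", "J. Example Studies"),
}

# Variant 2: the journal is a resource that both articles point to.
JOURNAL = "ex:journal/example-studies"
linked = {
    (JOURNAL, "rdf:type", "nco:Contact"),
    (JOURNAL, "nco:fullname", "Journal of Example Studies"),
    ("ex:article/1", "nbib:journal", JOURNAL),
    ("ex:article/2", "nbib:journal", JOURNAL),
}


def articles_in(graph, journal):
    """Return all articles whose nbib:journal value equals `journal`."""
    return sorted(s for (s, p, o) in graph if p == "nbib:journal" and o == journal)


# The string variant misses the differently spelled entry.
print(articles_in(flat, "Journal of Example Studies"))
# The resource variant finds both articles through one URI.
print(articles_in(linked, JOURNAL))
```

With the resource variant the name is stored once, and further details
such as an address can be attached to the journal itself.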

Anyway, maybe it gets clearer with the detailed comments:

nbib:Website:
IMHO it would make sense to somehow map this to other websites stored in
the system - simply to be able to search and list them the same way.
Actually this is a really problematic topic which we have not solved
entirely thus far. We had nfo:Website and now have nfo:WebDataObject.
But the semantics are not entirely clear yet, at least not to me. Maybe
a first step would be to make it a sub-class to nfo:Website.

nbib:AccessDate:
this is a property. Thus, by convention it should start with a
lower-case letter. Also it is not a sub-property of plainTextContent.
Maybe using nuao:lastUsage would be sufficient here?

nbib:Abstract:
again lower case first letter. And I think plainTextContent again does
not match. plainTextContent is the content of the whole resource. As
this is only an abstract making it a sub-property of nao:description
would be better.

nbib:Address:
Why not make the publisher a full-blown nco:Contact and use
nco:publisher directly?

nbib:Annotate:
I do not understand this one. Can you give an example?

nbib:Author:
case again. And why not use nco:creator instead?

nbib:Booktitle:
Just use nie:title.

nbib:Chapter:
Here we run into a problem. Conceptually there should be the book which
has an author and a publisher and so on. And then there should be the
excerpt from the book which is used as the citation. Or a reference into
the book (but as you read my email regarding excerpts you know that I am
in favor of them). Thus, I would maybe model a chapter as a
nie:InformationElement which is nie:isLogicalPartOf the book. Then a
reference is to the chapter and not to the book. And in fact the
reference could just be a relation to the chapter.
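
A rough sketch of this book/chapter/reference layout, again with triples
as plain tuples. The "ex:" URIs, the titles, the ex:cites relation and
the helpers objects() and book_of() are assumptions made up for the
example; nie:InformationElement and nie:isLogicalPartOf are the terms
named above.

```python
BOOK = "ex:book/1"
CHAPTER = "ex:book/1/chapter/3"
REFERENCE = "ex:reference/1"

triples = {
    # Author and publisher belong to the book, not to the reference.
    (BOOK, "nie:title", "An Example Book"),
    (BOOK, "nco:creator", "ex:contact/author"),
    (BOOK, "nco:publisher", "ex:contact/publisher"),
    # The chapter is its own information element inside the book.
    (CHAPTER, "rdf:type", "nie:InformationElement"),
    (CHAPTER, "nie:title", "Chapter 3"),
    (CHAPTER, "nie:isLogicalPartOf", BOOK),
    # The reference only points to the chapter (ex:cites is hypothetical).
    (REFERENCE, "ex:cites", CHAPTER),
}


def objects(graph, subject, predicate):
    """Return all objects for the given subject and predicate."""
    return [o for (s, p, o) in graph if s == subject and p == predicate]


def book_of(graph, reference):
    """Follow reference -> chapter -> book."""
    chapter = objects(graph, reference, "ex:cites")[0]
    return objects(graph, chapter, "nie:isLogicalPartOf")[0]


book = book_of(triples, REFERENCE)
print(book, objects(triples, book, "nco:publisher"))
```

Bibliographic details such as the publisher are then looked up on the
book by following the relations, instead of being copied onto every
reference.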

nbib:Contents:
again lower case. And again plainTextContent does not match as a
super-prop since it is only part of the content. Just remove the super-prop.

nbib:Copyright:
IMHO not required. Use nie:copyright directly. Actually a sub-class or
sub-property is only required if it adds additional semantic meaning.

nbib:Crossref:
lower case again. How about using nao:prefLabel?

nbib:DOI:
lower case again. And does this relate to the book or the chapter or the
reference?

nbib:Edition:
lower case again. And I think this should be a property of the book and
not the reference - the way I mention in my comment to nbib:Chapter.

nbib:Editor:
lower case again. And IMHO this should be in nco instead seeing that
both nco:creator and nco:publisher exist.

nbib:Eprint:
lower case again. Also I do not understand this one. Is it actually a
property of the reference or of the book?

nbib:HowPublished:
lower case again. And seems like a property of the book.

nbib:Institution:
lower case again. This is a tricky one. IMHO this should be encoded in
the nco:publisher property. I suppose this means that an individual
person did the publishing but while working for or with the institution?
If that is the case we should maybe look into nco roles for this one.

nbib:ISBN:
lower case again. Also this should be a sub-property of nao:identifier.

nbib:ISSN:
same as above.

nbib:Journal:
lower case again. Also a journal should be encoded as a nco:Contact.

nbib:Language:
use nie:language directly instead.

nbib:LCCN:
lower case again. Unique identifier of the book? In that case a
sub-property of nao:identifier.
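
What the sub-property suggestion for ISBN, ISSN and LCCN buys can be
sketched like this. The lower-case property names, the placeholder
identifier values, the "ex:" URIs and both helper functions are
assumptions for illustration only, not part of any existing ontology or
API.

```python
# Hypothetical sub-property declarations, child -> parent.
SUB_PROPERTY_OF = {
    "nbib:isbn": "nao:identifier",
    "nbib:issn": "nao:identifier",
    "nbib:lccn": "nao:identifier",
}

triples = {
    ("ex:book/1", "nbib:isbn", "isbn-placeholder-1"),
    ("ex:journal/1", "nbib:issn", "issn-placeholder-1"),
    ("ex:book/2", "nbib:lccn", "lccn-placeholder-1"),
    ("ex:book/2", "nie:title", "Another Example Book"),
}


def is_sub_property(prop, parent):
    """Walk up the sub-property chain and check whether it reaches `parent`."""
    while prop is not None:
        if prop == parent:
            return True
        prop = SUB_PROPERTY_OF.get(prop)
    return False


def find_by_identifier(graph, value):
    """Find resources carrying `value` under any kind of nao:identifier."""
    return sorted(
        s for (s, p, o) in graph
        if o == value and is_sub_property(p, "nao:identifier")
    )


print(find_by_identifier(triples, "issn-placeholder-1"))
```

A generic "search by identifier" then works without knowing whether the
value is an ISBN, an ISSN or an LCCN.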

nbib:Month:
IMHO there should just be publicationDate with range xsd:dateTime. That
is the only way to include the publication date into date range searches.
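
A small sketch of why a single date value helps, assuming BibTeX-style
year and month strings as input. The function names, the "ex:" URIs, the
sample dates and the choice to default a missing month to January are
all assumptions made for this example.

```python
from datetime import datetime

# Three-letter month abbreviations mapped to month numbers 1..12.
MONTHS = {
    name: number
    for number, name in enumerate(
        ["jan", "feb", "mar", "apr", "may", "jun",
         "jul", "aug", "sep", "oct", "nov", "dec"],
        start=1,
    )
}


def publication_date(year, month=None):
    """Combine BibTeX-style year/month strings into one datetime.

    A missing month falls back to January (an assumption of this sketch).
    """
    month_number = MONTHS[month.strip().lower()[:3]] if month else 1
    return datetime(int(year), month_number, 1)


entries = {
    "ex:paper/a": publication_date("2009", "mar"),
    "ex:paper/b": publication_date("2011", "September"),
    "ex:paper/c": publication_date("2011"),
}


def published_between(items, start, end):
    """Return URIs whose publication date lies in [start, end)."""
    return sorted(uri for uri, date in items.items() if start <= date < end)


recent = published_between(entries, datetime(2011, 1, 1), datetime(2012, 1, 1))
print(recent)
```

With separate year and month strings the same range query would need
string parsing at search time, so the values could not take part in a
normal date range search.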

nbib:MRNumber:
lower case again. I do not understand this one. So I cannot really comment.

nbib:Note:
lower case again. And there can only be one?

nbib:Number:
belongs to the journal/magazine instead of the reference I think. The
question is: would it make sense to model single journals as resources
or just the series with journal numbers?

nbib:Organization:
lower case again. Which conference does this relate to? The one where
the paper was presented?

nbib:Pages:
lower case again. IMHO this should be modeled as an excerpt again.

nbib:Publisher:
use nco:publisher instead.

nbib:School:
lower case again. And IMHO the school should be modeled as a resource,
maybe a nco:Contact is enough. If not, make a sub-class of nco:Contact.

nbib:Series:
lower case again. Also the series should be a resource again. Basically
everything that is a group should be a fully qualified resource.

nbib:Title:
use nie:title instead.

nbib:Type:
lower case again. IMHO this should be expressed with sub-classes to
nbib:Techreport instead.

nbib:Url:
lower case again. See above the discussion of Website.

nbib:Volume:
lower case again. Not sure what this is. The comment is missing.

nbib:Year:
lower case again. I suppose this is the year of publication. In that
case I again think there should only be publicationDate with range
xsd:dateTime.

OK, as promised, a lot of comments. :)
Actually I think your ontology has the potential to become our way of
describing books and papers and journals in general. At least I think we
should make it that. Any additional things required for BibTeX entries
could then be put into a separate ontology or into NFO.

Also how about creating a ticket at http://oscaf.sf.net for the new
ontology? After all it makes perfect sense to be able to describe books
and stuff.

Cheers,
Sebastian


> In the future I like to extend this program to integrate automatic
> fetching of references (via libkbibtex io for example), add some more
> features like automatic recommendation of new documents/webpages based
> on tag clouds extracted from all project documents, add the possibility
> to list also sections of a document (fits nicely to Sebastians last
> mail) and maybe many more.
> 
> I'm not familiar with how new ontologies are normally created, I hope
> this fulfills most requirements for the usual ontologies.
> 
> I'm always glad to get some comments.
> 
> Kind Regards
> Joerg Ehrichs
> 
> [1] https://projects.kde.org/projects/playground/edu/conquirere
> [2] https://projects.kde.org/projects/playground/edu/conquirere/repository/revisions/master/changes/nbib/nbib.trig
> 
> 
> 
> 

