[Nepomuk] ontology related advice request

Evgeny Egorochkin phreedom.stdin at gmail.com
Wed Oct 21 16:59:56 CEST 2009


On Wednesday 21 October 2009 17:17:50, Laura Dragan wrote:
> Evgeny Egorochkin wrote:
> > On Wednesday 21 October 2009 14:58:40, Laura Dragan wrote:
> >> I'm looking at updating the Note class used in SemNotes and I could
> >> use a second opinion from somebody who knows the ontologies and API
> >> better.
> >>
> >> Currently the notes are of type pimo:Note. They have the following
> >> properties (not exclusively):
> >>
> >> - title -> dc:title, also sets the nao:prefLabel to the same value
> >> - creation time -> nao:created
> >> - last write -> nao:lastModified
> >> - tags -> nao:hasTag
> >> - referenced resources -> pimo:isRelated
> >> - content -> semnotes:htmlContent
> >>
> >> Implementation-wise, the Note class is a subclass of
> >> Nepomuk::Resource, but it will be changed to a subclass of
> >> Nepomuk::Thing after all this.
> >>
> >> There are 2 questions:
> >>
> >> 1. What is the best choice to replace the ugly semnotes:htmlContent
> >>  property?
> >>
> >> I would like to replace it with some existing property in an existing
> >> ontology. This would allow me to delete the ontology that comes with
> >> the application. I thought for a while that pimo:wikiText might do the
> >> job, but after some consideration I'm not so sure any more.
> >
> > Formatted text is a big pain. The closest thing we have in NIE is
> > nmo:htmlMessageContent.
> >
> > There's no generic property like this. For HTML we could add it though if
> > it's needed.
> 
> I used HTML links to store in the text the links to the referenced
> resources. I will move away from this and start using the annotation
> plugins to store the references separately from text. However, HTML is
> also useful for formatted text, which is what I actually need (like
> the nmo:htmlMessageContent you mentioned above). So maybe a new
> property that allows that would be good.
> 
> > As to wiki, the only standard thing in wiki formatting is the wiki word.
> > Everything else seems to be done in as different a way as possible. So I
> > don't see the point of having any wiki properties at all.
> >
> > Much easier would be to translate between a subset of html and the wiki
> > syntax your app has to deal with. Easier != easy though :(
> 
> I wouldn't want to get into wiki syntax .. precisely because of the
> unlimited number of standards and subsets of standards. Although, some
> users who are familiar with it commented that it would be a nice
> addition to the tool.

You could provide a GUI instead. There are already standard shortcuts: Ctrl+B 
to make text bold, Ctrl+I to make text italic. That's just as easy as 
surrounding some text with asterisks.

The wiki syntax emerged precisely because there was no way to get equivalent 
functionality in a web page. It doesn't really make sense to deal with ugly 
markup, then press preview and see if it looks good and as you intended. 

Importing data from an already wikified document might nevertheless be useful.
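
As a rough illustration of such an import, here is a minimal Python sketch 
that converts a tiny wiki-like subset to HTML. The syntax it accepts (*bold* 
and _italic_) and the function name wiki_to_html are assumptions made for 
this example, not any particular wiki dialect and not anything SemNotes 
provides.

```python
import html
import re

# Hypothetical mini-syntax, assumed for illustration only:
#   *text* -> bold, _text_ -> italic.
# The marked-up text must start and end with a non-space character,
# so "2 * 3 * 4" is left untouched.
_BOLD = re.compile(r"\*(\S(?:.*?\S)?)\*")
# Underscores inside words (snake_case) are not treated as markup.
_ITALIC = re.compile(r"(?<!\w)_(\S(?:.*?\S)?)_(?!\w)")


def wiki_to_html(text: str) -> str:
    """Convert the assumed wiki subset to an HTML fragment."""
    # Escape <, > and & first so imported text cannot inject markup.
    escaped = html.escape(text, quote=False)
    escaped = _BOLD.sub(r"<b>\1</b>", escaped)
    escaped = _ITALIC.sub(r"<i>\1</i>", escaped)
    return escaped


if __name__ == "__main__":
    print(wiki_to_html("a *bold* and _italic_ word"))
```

Anything outside the two patterns passes through as escaped plain text, which 
keeps the importer predictable when it meets markup it does not know.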

> >> 2. Should I keep storing the notes in the RDF store or should I use
> >> files on disk?
> >>
> >> Currently the note and all its properties are stored in the
> >> repository, including the content. Initially I was expecting that
> >> notes would be small, therefore not really worth storing in
> >> independent files on disk. But after looking at the way that the few
> >> users I know (including myself) take notes, I found that some notes
> >> can be quite long and elaborate. That's why I'm now wondering if it
> >> wouldn't be better to just store them in files and let Strigi index
> >> the files. This way the indexing of note content is not lost.
> >>
> >> This question makes the first one redundant in a way, because if notes
> >> should not be stored in the repository, the semnotes:htmlContent would
> >> be anyway removed and the corresponding file would become
> >> pimo:groundingOccurence for the note instance.
> >
> > You can store notes in the RDF store as long as you do sopranocmd export
> > on a regular basis.
> 
> What do you mean by this? The tool provides a backup utility that
> saves the notes and all the information related to them as RDF, and
> then there is the possibility of restoring notes from such a file.
> There is also the possibility of exporting the notes as files, but
> even when exported they are still stored in the repository.

sopranocmd export is a good command-line tool to back up all user-generated 
content: not only notes, but also tags etc. It's better than backing up the 
RDF store file itself. 

It would be so much better if someone wrote a GUI frontend for this, because 
it's very useful for other data/apps as well.

Export to other formats such as html documents is an entirely different matter 
of course.

> > It doesn't matter if the notes are elaborate as long as you write them
> > yourself instead of using your note-taking app as a dumpster for random
> > content from the web. You are very unlikely to write more than several MB
> > of text during your lifetime ;)
> 
> Some might do note-taking by copy/pasting :) I don't do that, and my
> notes total almost 1M currently, with the first one dating from March.
> So your estimate is very accurate :)
> 
> For now I will keep storing the content in the rdf repo.
> 
> > A rule of thumb is: the fulltext index of a plaintext file is about the
> > same size as the file itself. So if you store the file in the RDF store,
> > you double its DB size.
> >
> > A fail-safe approach would be to introduce a generic htmlContent
> > property, create a nie:InformationElement to hold this property and make
> > it a groundingOccurence of your pimo:Note. This doesn't mean this
> > information element has to be a file on the disk. But it means it can
> > become a file on the disk if necessary without breaking anything.
> 
> So, if I understand correctly, this would be right:
> 
> [ - notes are of type pimo:Note and keep the title, creation time and
> last modif time and tag properties as they are ]
> - note has pimo:groundingOccurence a nie:InformationElement
> - the nie:InformationElement has a property htmlContent

Yes. But we should ask Leo first. He might really hate us for not using PIMO 
stuff for this :)
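
To make the structure concrete, here is a minimal Python sketch that builds 
the statements as plain (subject, predicate, object) tuples. It does not use 
the Nepomuk or Soprano API. The resource names (ex:note1 etc.) and the 
helper functions are hypothetical, nie:htmlContent is the proposed property 
that does not exist yet, and nfo:HtmlDocument is used as the type of the 
information element as one possible choice.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]


def literal(value: str) -> str:
    """Quote a string as a simple Turtle-style literal."""
    escaped = (
        value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    )
    return '"' + escaped + '"'


def note_triples(note: str, content: str, title: str, html: str) -> List[Triple]:
    """Statements for one note; all names are illustrative assumptions."""
    return [
        (note, "rdf:type", "pimo:Note"),
        (note, "nao:prefLabel", literal(title)),
        # Property name spelled as in this thread; check the PIMO
        # ontology for the exact spelling before using it.
        (note, "pimo:groundingOccurence", content),
        # The information element needs no file (DataObject) behind it.
        (content, "rdf:type", "nfo:HtmlDocument"),
        # nie:htmlContent is the *proposed* generic property.
        (content, "nie:htmlContent", literal(html)),
    ]


def serialize(triples: List[Triple]) -> str:
    """Render the triples one per line in a Turtle-like form."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


if __name__ == "__main__":
    print(serialize(note_triples(
        "ex:note1", "ex:note1-content", "Shopping", "<p>Milk</p>")))
```

If the content later moves to a file on disk, only the statements about the 
information element change; the pimo:Note and its title, dates and tags stay 
as they are.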

> New questions:
> 
> 1. Should I use nie:InformationElement directly, or would
> nfo:HtmlDocument make a better type, although there is no actual
> DataObject for it?

Using an InformationElement without a DataObject is perfectly fine. 
nfo:HtmlDocument is a good candidate.

> 2. Should I create a ticket for adding the nie:htmlContent property in
>  Trac? It would be super-property of nmo:htmlMessageContent.

Yes, this makes sense.

> >> Sorry for the lengthy email :)
> >> Thanks for reading,
> >
> > But no thanks for replying? :)
> 
> Now I thank you for replying and hope you will reply some more :)

:)

-- Evgeny

