[Nepomuk] ontology related advice request

Evgeny Egorochkin phreedom.stdin at gmail.com
Wed Oct 21 14:39:46 CEST 2009


On Wednesday 21 October 2009 14:58:40, Laura Dragan wrote:
> I'm looking at updating the Note class used in SemNotes and I could
> use a second opinion from somebody who knows the ontologies and API
> better.
> 
> Currently the notes are of type pimo:Note. They have the following
> properties (not exclusively):
> 
> - title -> dc:title, also sets the nao:prefLabel to the same value
> - creation time -> nao:created
> - last write -> nao:lastModified
> - tags -> nao:hasTag
> - referenced resources -> pimo:isRelated
> - content -> semnotes:htmlContent
> 
> Implementation-wise, the Note class is a subclass of
> Nepomuk::Resource, but it will be changed to a subclass of
> Nepomuk::Thing after all this.
> 
> There are 2 questions:
> 
> 1. What is the best choice to replace the ugly semnotes:htmlContent
>  property?
> 
> I would like to replace it with some existing property in an existing
> ontology. This would allow me to delete the ontology that comes with
> the application. I thought for a while that pimo:wikiText might do the
> job, but after some consideration I'm not so sure any more.

Formatted text is a big pain. The closest thing we have in NIE is 
nmo:htmlMessageContent.

There's no generic property for formatted content. We could add one for HTML 
if it's needed, though.

As for wiki markup, the only standard thing in wiki formatting is the wiki 
word. Everything else seems to be done in as many different ways as possible, 
so I don't see the point of having any wiki properties at all.

It would be much easier to translate between a subset of HTML and the wiki 
syntax your app has to deal with. Easier != easy though :(

> 2. Should I keep storing the notes in the RDF store or should I use
> files on disk?
> 
> Currently the note and all its properties are stored in the
> repository, including the content. Initially I was expecting that
> notes would be small, therefore not really worth storing in
> independent files on disk. But after looking at the way that the few
> users I know (including myself) take notes, I found that some notes
> can be quite long and elaborate. That's why I'm now wondering if it
> wouldn't be better to just store them in files and let Strigi index
> the files. This way the indexing of note content is not lost.
> 
> This question makes the first one redundant in a way, because if notes
> should not be stored in the repository, the semnotes:htmlContent would
> be anyway removed and the corresponding file would become
> pimo:groundingOccurrence for the note instance.

You can store notes in the RDF store as long as you back them up with 
sopranocmd export on a regular basis.

It doesn't matter if the notes are elaborate as long as you write them 
yourself instead of using your note-taking app as a dumpster for random 
content from the web. You are very unlikely to write more than several MB of 
text during your lifetime ;)

A rule of thumb: the fulltext index of a plaintext file is about the same 
size as the file itself. So if you store the text in the RDF store, it takes 
up roughly twice its own size in the DB.
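
To put a number on that rule of thumb, here is a back-of-the-envelope sketch. 
The function name and the 5 MB figure are only illustrative assumptions, and 
the 1:1 index ratio is the rule of thumb above, not a measured value:

```python
def estimated_store_bytes(text_bytes, index_ratio=1.0):
    """Rough DB footprint of text kept in the RDF store.

    index_ratio is the assumed size of the fulltext index relative to
    the text itself (1.0 = "about the same size", a rule of thumb only).
    """
    return text_bytes + int(text_bytes * index_ratio)


# Illustrative input: 5 MB of hand-written notes.
notes_bytes = 5 * 1024 * 1024
total_mb = estimated_store_bytes(notes_bytes) / (1024 * 1024)
print(f"Estimated footprint: {total_mb:.1f} MB")
```

Even at twice the raw size, a few MB of hand-written notes is a small amount 
of data for the store.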

A fail-safe approach would be to introduce a generic htmlContent property, 
create a nie:InformationElement to hold this property, and make it a 
groundingOccurrence of your pimo:Note. This doesn't mean the information 
element has to be a file on disk, but it can become one later if necessary 
without breaking anything.
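
A minimal sketch of what that layout could look like as statements. Plain 
Python tuples stand in for RDF triples here and nothing talks to Nepomuk. 
The ex:htmlContent property and the ex: URIs are hypothetical, since the 
generic property does not exist yet:

```python
# Sketch only: tuples stand in for RDF statements.
# "ex:htmlContent" is a hypothetical property, not part of any ontology.

def note_statements(note_uri, content_uri, title, html):
    """Return (subject, predicate, object) triples for a note whose
    content lives in a separate information element."""
    return [
        # The note itself is the PIMO thing and carries the label.
        (note_uri, "rdf:type", "pimo:Note"),
        (note_uri, "nao:prefLabel", title),
        # The note points at its content instead of holding it directly.
        (note_uri, "pimo:groundingOccurrence", content_uri),
        # The content is an information element; it may or may not be a file.
        (content_uri, "rdf:type", "nie:InformationElement"),
        (content_uri, "ex:htmlContent", html),
    ]


statements = note_statements(
    "ex:note1", "ex:note1-content", "Shopping list", "<p>Milk</p>"
)
for subject, predicate, obj in statements:
    print(subject, predicate, obj)
```

If the content later moves to a file, only the statements about the content 
resource should need to change. The note and its groundingOccurrence link 
stay as they are.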

> Sorry for the lengthy email :)
> Thanks for reading,

But no thanks for replying? :)

-- Evgeny

