[Nepomuk] Re: Handling multiple sources of metadata
Bruce Adams
tortoise_74 at yahoo.co.uk
Wed May 4 16:52:20 CEST 2011
----- Original Message ----
> From: Sebastian Trüg <trueg at kde.org>
> To: Bruce Adams <tortoise_74 at yahoo.co.uk>
> Cc: Nepomuk at kde.org
> Sent: Wed, May 4, 2011 9:53:05 AM
> Subject: Re: [Nepomuk] Re: Handling multiple sources of metadata
>
> Hi Bruce,
>
> On 05/03/2011 07:24 PM, Bruce Adams wrote:
> > That roughly accords with my original intentions anyway.
> > I was thinking in terms of a standalone tool, library & api
> > for managing simple meta data (just tags)
>
> IMHO it does not make sense to start with tags alone. I think it would
> be much simpler to only start with literal properties, i.e. those for
> which there is no need to store additional resources.
> Then the next step would be to also store the additional resources which
> gets much more complicated as it also involves garbage collection when
> the user removes a property.
>
I am coming from a non-nepomuk background with different but overlapping goals.
I agree about starting with literal properties.
I think of tags as being the simplest property possible. You are implying tags
are not this simple.
I think in nepomuk they are associated with an ontology that knows the set of
all possible tags.
This is a more feature rich representation but also a more complex one.
I guess we'll see how the code grows.
The 15 basic elements of the Dublin Core would be another good set to start
with.
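
To make "literal properties" concrete, here is a minimal sketch in Python. It
assumes a file's metadata is just a mapping from property name to a set of
strings. The helper names (set_literal, remove_literal) and the plain "tag"
key are hypothetical, not anything from nepomuk; only the 15 element names come
from the Dublin Core element set.

```python
# The 15 elements of the Dublin Core Metadata Element Set.
DUBLIN_CORE_ELEMENTS = frozenset({
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
})


def set_literal(metadata, key, value):
    """Attach a literal value to a file's metadata (hypothetical helper).

    `metadata` maps a property name to a set of string values. A plain
    "tag" is treated as just one more literal property.
    """
    if key != "tag" and key not in DUBLIN_CORE_ELEMENTS:
        raise KeyError(f"unknown property: {key}")
    metadata.setdefault(key, set()).add(str(value))
    return metadata


def remove_literal(metadata, key, value):
    """Remove a literal value; nothing else needs cleaning up afterwards."""
    values = metadata.get(key, set())
    values.discard(str(value))
    if not values:
        metadata.pop(key, None)
    return metadata
```

Because every value is a literal, removing a property never leaves an orphaned
resource behind, which is the garbage collection problem Sebastian describes
for the later step.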
> > and later growing this to support integration with nepomuk.
> > and incorporate other kinds of metadata.
> >
> > I'm happy to hear suggestions.
> >
> > There are two main design choices to consider.
> > 1. the location of the metadata
> > one per file
> > one metadata area per directory
> > one per filesystem
>
> IMHO there should be one file per file system. The reason is simple:
> that way we only need to store additional resources like the previously
> mentioned project once. If we had one file per dir or file then we would
> have to store (and later merge) these additional resources over and over.
>
Merging is an inevitable requirement when you copy data around.
You can't necessarily have one file per file system if that file system is
multi-user.
It may be sufficient for a USB flashdrive but not when you have say
/home/user1
/home/user2
It brings in all the complexity of multi-users and security.
So I think the best thing is to be flexible.
So if you have an accessible metadata directory at the root of the filesystem
you need no other, but if not, try the next level down
(or rather start at the file and work up the tree until you find one).
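
The "start at the file and work up" lookup could be sketched like this in
Python. The directory name .nepomuk is only a placeholder since the name is
still open in this thread, and find_metadata_dir is a hypothetical helper.
"Accessible" is assumed here to mean readable and writable by the current user.

```python
import os
from pathlib import Path

# Assumption: the directory name is still under discussion in this thread.
METADATA_DIR_NAME = ".nepomuk"


def find_metadata_dir(path, name=METADATA_DIR_NAME):
    """Return the nearest accessible metadata directory at or above `path`.

    Starts in the directory containing `path` and walks towards the root.
    Returns None if no accessible metadata directory is found.
    """
    current = Path(path).resolve()
    if not current.is_dir():
        current = current.parent
    while True:
        candidate = current / name
        # Skip directories that exist but belong to someone else.
        if candidate.is_dir() and os.access(candidate, os.R_OK | os.W_OK):
            return candidate
        if current.parent == current:  # reached the root of the tree
            return None
        current = current.parent
```

This sketch walks all the way to / and does not stop at a mount point, so a
real tool would probably also want to check for a filesystem boundary.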
> > on balance I believe per directory makes most sense.
> > Though it is not that much extra complication to say a metadata area is not
> > required for a sub-directory of a directory which already has one and this
> > would keep the meta-data layout simple.
> >
> > 2. the format of the metadata
> > binary or text
> > if text, TriG, Turtle, TriX or something else.
>
> As I already mentioned I would prefer trig since that allows us to store
> graph metadata which contains information like "when was the data
> created" and "who created the data".
>
> One could compress this file. Sadly there is no pseudo-standard for RDF
> storage yet as there is for SQL (sqlite) so using redland seems weird to me.
>
I'll look into it
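
For reference, a rough sketch of what a TriG file with graph metadata might
look like, generated from Python without any RDF library. The ex: namespace
and ex:tag property are made-up placeholders, not nepomuk ontology terms, and
to_trig / trig_literal are hypothetical helpers. The dc: and dcterms: terms are
the usual Dublin Core ones. The tags go in a named graph; "who" and "when" are
statements about that graph, kept in the default graph.

```python
def trig_literal(text):
    """Quote a Python string as a TriG string literal."""
    escaped = (text.replace("\\", "\\\\").replace('"', '\\"')
                   .replace("\n", "\\n").replace("\r", "\\r"))
    return f'"{escaped}"'


def to_trig(graph_uri, file_uri, tags, creator, created):
    """Serialise a file's tags as one named graph plus graph metadata."""
    lines = [
        "@prefix dc: <http://purl.org/dc/elements/1.1/> .",
        "@prefix dcterms: <http://purl.org/dc/terms/> .",
        "@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .",
        "@prefix ex: <http://example.org/terms#> .",  # placeholder namespace
        "",
        f"<{graph_uri}> {{",
    ]
    for tag in tags:
        lines.append(f"    <{file_uri}> ex:tag {trig_literal(tag)} .")
    lines += [
        "}",
        "",
        "{",  # default graph: statements about the named graph itself
        f"    <{graph_uri}> dc:creator {trig_literal(creator)} ;",
        f"        dcterms:created {trig_literal(created)}^^xsd:dateTime .",
        "}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(to_trig("urn:example:graph1", "file:///home/user1/photo.jpg",
                  ["holiday"], "user1", "2011-05-04T16:52:20+02:00"))
```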
> > there is an advantage to the simplicity of <key>=<value> for just tags
> > but it will not scale well to complex meta data.
> > for binary I would imagine a standard database such as sqlite.
> > The advantage there is compactness.
> >
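
To compare the two storage options above, here is a small Python sketch using
the standard sqlite3 module next to a <key>=<value> parser. The table layout
and all function names are assumptions made for illustration; nothing here is
an agreed format.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a metadata store with one row per literal property."""
    db = sqlite3.connect(path)
    db.execute(
        """CREATE TABLE IF NOT EXISTS property (
               file  TEXT NOT NULL,
               key   TEXT NOT NULL,
               value TEXT NOT NULL,
               PRIMARY KEY (file, key, value)
           )"""
    )
    return db


def add_property(db, file, key, value):
    """Store a property; adding the same triple twice is a no-op."""
    db.execute("INSERT OR IGNORE INTO property VALUES (?, ?, ?)",
               (file, key, value))
    db.commit()


def files_with(db, key, value):
    """Return the files carrying the given property value, sorted by name."""
    rows = db.execute(
        "SELECT file FROM property WHERE key = ? AND value = ? ORDER BY file",
        (key, value),
    )
    return [row[0] for row in rows]


def parse_key_value(text):
    """Parse <key>=<value> lines, skipping blank lines and # comments."""
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a key=value line: {line!r}")
        pairs.append((key.strip(), value.strip()))
    return pairs
```

The text form stays readable and easy to merge by hand, while the database
form gives indexed lookups such as files_with(db, "tag", "holiday").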
> > There is nothing to stop either of these being configurable but it is
> > sensible to start as you mean to go on.
> >
> > I think metadata should live in a .metadata directory except that .metadata
> > is used by Eclipse.
> > This is something that should be adoptable as part of the linux filesystem
> > hierarchy.
> > I don't think it should be .nepomuk as that might alienate gnomes.
> > If all metadata is rdf .rdf might be a good choice.
>
> I would personally go for .nepomuk for now since there will be no
> collaboration with Gnome anyway (well, at least I do not believe in it
> after trying for several years. But maybe you would have more luck ;)
>
> Cheers,
> Sebastian
>
That is one reason why I would rather keep things simple and stand-alone,
rather than starting from the nepomuk or xesam ontologies and trying to force
one on the other.
I would rather stay out of that fight for now and offer an additional means of
interchanging data. :)
Regards,
Bruce.