[Nepomuk] Why store file urls?

Fri Nov 23 10:17:28 UTC 2012

On Fri, Nov 23, 2012 at 3:30 PM, Jörg Ehrichs <Joerg.Ehrichs at gmx.de> wrote:

> 2012/11/23 Marco Martin <notmart at gmail.com>:
> > On Friday 23 November 2012, Vishesh Handa wrote:
> >
> >> <nepomuk:/res/23161f9c-8839-4de3-bba0-affdd6d654ef>
> >>         rdf:type    nmm:MusicPiece
> >>         rdf:type    nfo:FileDataObject
> >>         rdf:type    nfo:Audio
> >>         rdf:type    nie:InformationElement
> >>         nie:url     file:///home/vishesh/Music/where_does_the_good_go.mp3
> >>
> >> Storing this URL makes accessing file resources quite convenient. But I
> >> fear it has been a terrible design decision. By storing the URL we face
> >> the following problems -
> >
> > Hmm, that is probably right; keeping the full file URL consistent is a
> > mess, however...
> >
> > a very common use case in the C++ code is calling
> > Nepomuk2::Resource(file path)
> >
> > needing a fast way to obtain the resource associated with a particular file
> > (like in https://bugs.kde.org/show_bug.cgi?id=310525)
> >
> > otherwise, how could we quickly get the metadata of a file given that we
> > have the file, and the other way around?
>

It would be slightly more expensive, but not too hard. One would have to
retrieve the resource for each component of the path, starting from the root
element. So if you give me something like
Resource("/home/vishesh/kde/src/file.cpp")

I'll have to do either multiple queries -

select ?r where { ?r nfo:fileName "home" ; nie:isPartOf <rootElement> . }
  -> homeRes
select ?r where { ?r nfo:fileName "vishesh" ; nie:isPartOf <homeRes> . }
  -> visheshRes
..
..
or maybe it can be done in one query?
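
The single-query variant could probably be built by chaining one triple
pattern per path component. Below is a rough C++ sketch of that idea. The
names splitPath and buildLookupQuery and the "rootElement" URI are made up
for illustration, not existing Nepomuk API, and file names are not escaped.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Split an absolute path into its non-empty components.
std::vector<std::string> splitPath(const std::string& path) {
    std::vector<std::string> parts;
    std::string current;
    for (char c : path) {
        if (c == '/') {
            if (!current.empty()) parts.push_back(current);
            current.clear();
        } else {
            current += c;
        }
    }
    if (!current.empty()) parts.push_back(current);
    return parts;
}

// Build one query that walks from the root down to the last component.
// rootUri is a placeholder for whatever resource represents "/".
// NOTE: file names are inserted verbatim; real code would have to escape
// quotes and backslashes before putting them into a SPARQL literal.
std::string buildLookupQuery(const std::string& path, const std::string& rootUri) {
    const std::vector<std::string> parts = splitPath(path);
    if (parts.empty()) return std::string();

    std::ostringstream q;
    // The variable bound to the last component is the one we select.
    q << "select ?r" << parts.size() - 1 << " where {\n";
    std::string parent = "<" + rootUri + ">";
    for (std::size_t i = 0; i < parts.size(); ++i) {
        const std::string var = "?r" + std::to_string(i);
        q << "  " << var << " nfo:fileName \"" << parts[i]
          << "\" ; nie:isPartOf " << parent << " .\n";
        parent = var;  // the next component must be part of this one
    }
    q << "}";
    return q.str();
}
```

For "/home/vishesh" this produces a query that selects ?r1, with ?r0 bound
to the "home" folder. Whether the storage backend runs such a chain faster
than separate cached lookups would need measuring.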

You get the gist. All of these could be cached in memory, so it shouldn't be
a big problem. This is actually quite analogous to what the kernel does in
the file system layer, except that it maps file names to inodes. We will be
mapping them to resource URIs.
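
To make the caching idea a bit more concrete, here is a minimal C++ sketch
of a per-component cache, loosely in the spirit of a directory-entry cache.
PathResolver and the Lookup callback are hypothetical names; the callback
stands in for the per-component query and returns an empty string when no
resource is found.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

// Stand-in for the per-component query: given a parent resource URI and a
// file name, return the child's resource URI, or "" if there is none.
using Lookup = std::function<std::string(const std::string&, const std::string&)>;

class PathResolver {
public:
    PathResolver(std::string rootUri, Lookup lookup)
        : m_rootUri(std::move(rootUri)), m_lookup(std::move(lookup)) {}

    // Resolve an absolute path one component at a time, consulting the
    // cache before falling back to the (expensive) lookup.
    // Returns "" if any component is unknown.
    std::string resolve(const std::string& path) {
        std::string parent = m_rootUri;
        std::string name;
        // Append a trailing '/' so the last component is flushed too.
        for (char c : path + "/") {
            if (c != '/') { name += c; continue; }
            if (name.empty()) continue;  // skip leading or repeated slashes
            const auto key = std::make_pair(parent, name);
            auto it = m_cache.find(key);
            if (it == m_cache.end()) {
                const std::string child = m_lookup(parent, name);
                if (child.empty()) return std::string();
                it = m_cache.emplace(key, child).first;
            }
            parent = it->second;
            name.clear();
        }
        return parent;
    }

private:
    std::string m_rootUri;
    Lookup m_lookup;
    // (parent URI, file name) -> child URI
    std::map<std::pair<std::string, std::string>, std::string> m_cache;
};
```

Resolving two files in the same folder then only costs one extra lookup for
the second file. The sketch does not cache misses and never invalidates
entries, so moves and renames would still need handling.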

> I'd say retrieving metadata from a file is a "one-time" job of the
> file-indexer.
> Afterwards, we should rely on the data inside Nepomuk and only get
> more once this fails.
>
> In addition, the nepomuk-core part could offer a convenient method to
> create the file url for the end-user and also cache this information
> for a while to speed up the query. I assume it's faster to check
> QFile::exists() than to create the URL again with every query.
>

Of course. This all should be transparently handled in the resource class.

> Other than that, I like the idea. There seem to be several problems
> with removable media that the current approach doesn't solve.
>

Yeah. I think so as well.

But it's a BIG change. All the previous data will first need to be ported.

> _______________________________________________
> Nepomuk mailing list
> Nepomuk at kde.org
> https://mail.kde.org/mailman/listinfo/nepomuk
>

-- 
Vishesh Handa