[Nepomuk] Why store file urls?

Mon Nov 26 20:23:32 UTC 2012

On 11/23/2012 11:17 AM, Vishesh Handa wrote:
>
>
>
> On Fri, Nov 23, 2012 at 3:30 PM, Jörg Ehrichs <Joerg.Ehrichs at gmx.de> wrote:
>
>     2012/11/23 Marco Martin <notmart at gmail.com>:
>      > On Friday 23 November 2012, Vishesh Handa wrote:
>      >
>      >> <nepomuk:/res/23161f9c-8839-4de3-bba0-affdd6d654ef>
>      >>         rdf:type
>      >> nmm:MusicPiece
>      >>         rdf:type
>      >> nfo:FileDataObject
>      >>         rdf:type
>      >> nfo:Audio
>      >>         rdf:type
>      >> nie:InformationElement
>      >>         nie:url
>      >> file:///home/vishesh/Music/where_does_the_good_go.mp3
>      >>
>      >> Storing this URL makes accessing file resources quite convenient. But I
>      >> fear it has been a terrible design decision. By storing the url we face the
>      >> following problems -
>      >
>      > uhm, probably is right, keeping the full file url consistent is a mess,
>      > however...
>      >
>      > a very common use case is in the c++ code, doing Nepomuk2::Resource(file path)
>      >
>      > needing a fast way to obtain the resource associated to a particular file
>      > (like in https://bugs.kde.org/show_bug.cgi?id=310525)
>      >
>      > otherwise how could be done quickly to have the metadata of a file given we
>      > have the file, and the other way around?
>
>
> It would be slightly more expensive, but not too hard. One would have to
> retrieve the resource for each file resource till the root element. So
> if you give me something like this
> Resource("/home/vishesh/kde/src/file.cpp")
>
> I'll have to do either multiple queries -
>
> select ?r where { ?r nfo:filename "home" ; nie:isPartOf <rootElement> . } -> homeRes
> select ?r where { ?r nfo:filename "vishesh" ; nie:isPartOf <homeRes> . } -> visheshRes
> ..
> ..
> or maybe it can be done in one query?

I think so:

select ?r where { ?r nfo:filename "file.cpp" ; nie:isPartOf [ 
nfo:filename "src" ; nie:isPartOf [ nfo:filename "kde" ... ] ] }

I am, however, not sure which is faster.
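To make the single-query variant concrete, here is a minimal sketch that builds the nested pattern for an arbitrary path. It is only an illustration: the function name and the `<rootElement>` placeholder are made up for this mail, the property names are spelled exactly as in this thread, and nothing here talks to a real Nepomuk or Virtuoso instance.

```python
def escape_literal(text):
    """Escape backslashes and double quotes for a SPARQL string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def nested_lookup_query(path, root="<rootElement>"):
    """Build one SPARQL query that resolves `path` via nested blank nodes.

    `root` is a placeholder for whatever resource represents the top of
    the file system; it is an assumption, not a real Nepomuk URI.
    """
    parts = [p for p in path.split("/") if p]
    if not parts:
        raise ValueError("path has no components")

    # Start at the root and wrap one blank node per directory, so the
    # outermost directory ends up innermost in the final pattern.
    inner = root
    for name in parts[:-1]:
        inner = '[ nfo:filename "{0}" ; nie:isPartOf {1} ]'.format(
            escape_literal(name), inner)

    # The last component is the file itself and binds ?r.
    return 'select ?r where {{ ?r nfo:filename "{0}" ; nie:isPartOf {1} . }}'.format(
        escape_literal(parts[-1]), inner)


if __name__ == "__main__":
    print(nested_lookup_query("/home/vishesh/kde/src/file.cpp"))
```

The multi-query approach would issue one such lookup per path component instead, which is easier to cache per directory but costs more round trips.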

In general I like the idea of getting rid of file URLs, a lot actually. This 
could even mean that you get rid of nie:url altogether. In the end 
there is really no need to use nie:url for http or any other remote 
resource...

As for your (3): that should actually be fairly simple. I wrote the 
code, which feels very hacky (not the code itself, but the need for its 
existence), and it could easily be adapted to only update nfo:filename 
and nie:isPartOf. Much simpler in the end.
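As a rough illustration of why this gets simpler, assuming the point is about keeping metadata consistent when files are moved or renamed: if each resource only knows its own name and its parent, a move touches exactly one resource, and every path below it follows automatically. The `Node` class and helper names are hypothetical and only mimic nfo:filename and nie:isPartOf in memory.

```python
class Node:
    """In-memory stand-in for a file or folder resource."""

    def __init__(self, filename, parent=None):
        self.filename = filename  # plays the role of nfo:filename
        self.parent = parent      # plays the role of nie:isPartOf


def path_of(node):
    """Rebuild the full path by walking parent links up to the top."""
    parts = []
    while node is not None:
        parts.append(node.filename)
        node = node.parent
    return "/" + "/".join(reversed(parts))


def move(node, new_parent, new_filename=None):
    """Move or rename: only this one node is updated, no children."""
    node.parent = new_parent
    if new_filename is not None:
        node.filename = new_filename


if __name__ == "__main__":
    home = Node("home")
    vishesh = Node("vishesh", home)
    music = Node("Music", vishesh)
    song = Node("where_does_the_good_go.mp3", music)
    archive = Node("Archive", vishesh)

    move(music, archive)   # one update on the folder ...
    print(path_of(song))   # ... and the file's path follows
```

With stored file URLs the same move would mean rewriting nie:url on the folder and on everything beneath it.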

All in all: +10 from me if you can get the direct file resource access fast.

Cheers,
Sebastian

>
> You get the gist. These all could be cached in memory so it shouldn't be
> a big problem. This is actually quite analogous to what the kernel does
> in the file system layer, except that it matches inodes to their
> filename. We will be matching resource uris.
>
>     I'd say retrieving metadata from a file is a "one-time" job of the
>     file-indexer.
>     Afterwards, we should rely on the data inside Nepomuk and only get
>     more once this fails.
>
>     In addition, the nepomuk-core part could offer a convenient method to
>     create the file url for the end-user and also cache this information
>     for a while to speed up the query. I assume it's faster to check
>     QFile::exists() than creating the url with every query again.
>
>
> Of course. This all should be transparently handled in the resource class.
>
>     Other than that, I like the idea. It seems there are several problems
>     with removable media, which don't seem to get solved with the
>     current way.
>
>
> Yeah. I think so as well.
>
> But it's a BIG change. All the previous data will first need to be ported.
>
>     _______________________________________________
>     Nepomuk mailing list
>     Nepomuk at kde.org <mailto:Nepomuk at kde.org>
>     https://mail.kde.org/mailman/listinfo/nepomuk
>
>
>
>
> --
> Vishesh Handa
>
>
>
>