[Nepomuk] Why store file urls?

Vishesh Handa me at vhanda.in
Mon Dec 10 13:18:05 UTC 2012


Quick update -

Right now the plan is to implement this for 4.11.


On Tue, Nov 27, 2012 at 1:53 AM, Sebastian Trüg <trueg at kde.org> wrote:

> On 11/23/2012 11:17 AM, Vishesh Handa wrote:
>
>>
>>
>>
>> On Fri, Nov 23, 2012 at 3:30 PM, Jörg Ehrichs <Joerg.Ehrichs at gmx.de> wrote:
>>
>>     2012/11/23 Marco Martin <notmart at gmail.com>:
>>
>>      > On Friday 23 November 2012, Vishesh Handa wrote:
>>      >
>>      >> <nepomuk:/res/23161f9c-8839-4de3-bba0-affdd6d654ef>
>>      >>         rdf:type    nmm:MusicPiece
>>      >>         rdf:type    nfo:FileDataObject
>>      >>         rdf:type    nfo:Audio
>>      >>         rdf:type    nie:InformationElement
>>      >>         nie:url     file:///home/vishesh/Music/where_does_the_good_go.mp3
>>      >>
>>      >> Storing this URL makes accessing file resources quite convenient. But I
>>      >> fear it has been a terrible design decision. By storing the url we face
>>      >> the following problems -
>>      >
>>      > Hmm, that is probably right; keeping the full file URL consistent is a
>>      > mess. However...
>>      >
>>      > A very common use case in the C++ code is calling
>>      > Nepomuk2::Resource(file path), which needs a fast way to obtain the
>>      > resource associated with a particular file (as in
>>      > https://bugs.kde.org/show_bug.cgi?id=310525).
>>      >
>>      > Otherwise, how could we quickly get the metadata of a file given the
>>      > file, and the other way around?
>>
>>
>> It would be slightly more expensive, but not too hard. One would have to
>> retrieve the resource for each path component between the file and the
>> root element. So if you give me something like this
>> Resource("/home/vishesh/kde/src/file.cpp")
>>
>> I'll have to do either multiple queries -
>>
>> select ?r where { ?r nfo:filename "home" ; nie:isPartOf <rootElement> . } -> homeRes
>> select ?r where { ?r nfo:filename "vishesh" ; nie:isPartOf <homeRes> . } -> visheshRes
>> ..
>> ..
>> or maybe it can be done in one query?
>>
>
> I think so:
>
> select ?r where { ?r nfo:filename "file.cpp" ;
>   nie:isPartOf [ nfo:filename "src" ;
>   nie:isPartOf [ nfo:filename "kde" ... ] ] }
>
> I am, however, not sure which is faster.
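
A minimal sketch of how such a nested query could be generated from a path, written in Python for brevity rather than as Nepomuk code. The function name and the default root URI (a stand-in for <rootElement>) are made up for illustration, and the nfo: and nie: prefixes are assumed to be declared elsewhere.

```python
def build_nested_query(path, root_uri="nepomuk:/res/root"):
    """Build one SPARQL query that walks a file path via nie:isPartOf.

    root_uri is a hypothetical placeholder for the <rootElement> resource;
    the nfo: and nie: prefixes are assumed to be declared elsewhere.
    """
    parts = [p for p in path.split("/") if p]
    if not parts:
        raise ValueError("path has no components")

    def quote(name):
        # Escape backslashes and double quotes for a SPARQL string literal.
        return '"' + name.replace("\\", "\\\\").replace('"', '\\"') + '"'

    # Walk from the top-most directory down. Each step wraps the previous
    # pattern, so the deepest directory becomes the outermost blank node.
    parent = "<%s>" % root_uri
    for name in parts[:-1]:
        parent = "[ nfo:filename %s ; nie:isPartOf %s ]" % (quote(name), parent)

    return "select ?r where { ?r nfo:filename %s ; nie:isPartOf %s . }" % (
        quote(parts[-1]), parent)
```

Calling build_nested_query("/home/vishesh/kde/src/file.cpp") produces a single query with one nested blank node per directory. Whether that is faster than one query per component would still have to be measured.
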
>
> In general I like the idea of getting rid of file URLs, a lot actually. This
> could even mean that you get rid of nie:url altogether. In the end there
> is really no need to use nie:url for http or any other remote resource...
>
> As for your (3): that should actually be fairly simple. I wrote the code,
> which feels very hacky (not the code itself, but the need for its
> existence), and it could easily be adapted to only update nfo:filename and
> nie:isPartOf. Much simpler in the end.
>
> All in all: +10 from me if you can get the direct file resource access
> fast.
>
> Cheers,
> Sebastian
>
>
>> You get the gist. These could all be cached in memory, so it shouldn't be
>> a big problem. This is actually quite analogous to what the kernel does
>> in the file system layer, except that it maps file names to inodes. We
>> will be mapping them to resource URIs.
>>
>>     I'd say retrieving metadata from a file is a "one-time" job of the
>>     file-indexer.
>>     Afterwards, we should rely on the data inside Nepomuk and only get
>>     more once this fails.
>>
>>     In addition, the nepomuk-core part could offer a convenience method to
>>     create the file URL for the end user, and also cache this information
>>     for a while to speed up the query. I assume it's faster to check
>>     QFile::exists() than to create the URL again with every query.
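
For the other direction, a convenience method of this kind could rebuild the path by following nie:isPartOf up to the root. A rough Python sketch, where path_for and the describe callback are hypothetical names and describe() stands in for whatever query returns a resource's nfo:filename and nie:isPartOf parent:

```python
def path_for(resource_uri, describe, root_uri):
    """Rebuild a local path by following nie:isPartOf up to the root.

    describe(uri) is a hypothetical callback returning the pair
    (nfo:filename, nie:isPartOf parent) for a resource, or None if the
    resource is unknown.
    """
    names = []
    seen = set()
    current = resource_uri
    while current != root_uri:
        if current in seen:
            # Guard against corrupt data that would otherwise loop forever.
            raise ValueError("cycle in nie:isPartOf chain at %s" % current)
        seen.add(current)
        entry = describe(current)
        if entry is None:
            return None
        name, parent = entry
        names.append(name)
        current = parent
    # Names were collected from the file upwards, so reverse them.
    return "/" + "/".join(reversed(names))
```

The result could be cached per resource and re-validated with a cheap existence check before use, which is where something like QFile::exists() would come in.
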
>>
>>
>> Of course. This all should be transparently handled in the resource class.
>>
>>     Other than that, I like the idea. It seems there are several problems
>>     with removable media which don't seem to get solved with the
>>     current approach.
>>
>>
>> Yeah. I think so as well.
>>
>> But it's a BIG change. All the previous data will first need to be ported.
>>
>>     _______________________________________________
>>     Nepomuk mailing list
>>     Nepomuk at kde.org
>>     https://mail.kde.org/mailman/listinfo/nepomuk
>>
>>
>>
>>
>> --
>> Vishesh Handa
>>
>>
>>
>



-- 
Vishesh Handa