[Nepomuk] [PATCH] Metadata mover

Vishesh Handa handa.vish at gmail.com
Tue May 18 12:09:53 CEST 2010


Hey Sebastian.

On Tue, May 18, 2010 at 3:19 PM, Sebastian Trüg <trueg at kde.org> wrote:

> Hi Vishesh,
>
> the minor patch is of course no problem.
>
> So I understand correctly: the issue is that the same url is returned
> twice, once for the correct resource and once for another one.
> And you are sure this does not happen if you cache the results?
>
It works fine if I cache the results. I checked!

> Also be aware that the number of results can be very high. Thus, caching
> them all at once might not be a good idea. We would have to do it in
> batches.
>
How do I cache some of them? Should I perform the query with a limit
multiple times?
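
One possible shape for such batching, as a minimal sketch only. The store
here is a hypothetical stand-in (a plain vector of URL strings), and
queryBatch and moveInBatches are made-up names, not Soprano or Nepomuk API.
It assumes that every updated URL stops matching the query, so each round
can ask for the next LIMIT rows without an OFFSET, copy them out, and only
then modify the store:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the store: one nie:url string per resource.
using Store = std::vector<std::string>;

// Stand-in for a prefix-filtered select with "LIMIT limit".
// Returns copies (the cached batch), never references into the store.
std::vector<std::string> queryBatch(const Store& store,
                                    const std::string& prefix,
                                    std::size_t limit)
{
    std::vector<std::string> batch;
    for (const std::string& url : store) {
        if (batch.size() == limit)
            break;
        if (url.compare(0, prefix.size(), prefix) == 0)
            batch.push_back(url);
    }
    return batch;
}

// Rewrites every URL under oldPrefix to newPrefix, one cached batch at a
// time. Returns the number of URLs updated.
std::size_t moveInBatches(Store& store,
                          const std::string& oldPrefix,
                          const std::string& newPrefix,
                          std::size_t batchSize)
{
    assert(batchSize > 0);
    // Updated rows must stop matching, otherwise the loop would never end.
    assert(newPrefix.compare(0, oldPrefix.size(), oldPrefix) != 0);

    std::size_t moved = 0;
    for (;;) {
        // The batch is fully copied out before any modification happens.
        const std::vector<std::string> batch =
            queryBatch(store, oldPrefix, batchSize);
        if (batch.empty())
            break;
        for (const std::string& oldUrl : batch) {
            auto it = std::find(store.begin(), store.end(), oldUrl);
            if (it != store.end()) {
                *it = newPrefix + oldUrl.substr(oldPrefix.size());
                ++moved;
            }
        }
    }
    return moved;
}
```

If updated rows could still match the query, this would need an OFFSET or
some other way of marking rows as done.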

> However, normally Virtuoso should have no problem with performing changes
> while iterating results.
>

I know, but this has been happening to me for some time now. I actually ran
the same query twice, once just printing the results, and once running it
through the updateMetadata loop. The results aren't the same.
This is probably because of the modifications being made.
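
The comparison of the two runs can be done mechanically. A minimal sketch,
assuming both runs have been collected into plain lists of URL strings;
missingUrls is a made-up helper name, not part of Nepomuk or Soprano:

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

// Returns the URLs that the plain query run produced (`expected`) but that
// the update loop never visited (`seen`). Both lists are taken by value
// because they are sorted locally. A URL visited twice in `seen` does not
// hide a different URL that was skipped.
std::vector<std::string> missingUrls(std::vector<std::string> expected,
                                     std::vector<std::string> seen)
{
    std::sort(expected.begin(), expected.end());
    std::sort(seen.begin(), seen.end());

    std::vector<std::string> missing;
    std::set_difference(expected.begin(), expected.end(),
                        seen.begin(), seen.end(),
                        std::back_inserter(missing));
    return missing;
}
```

With one URL visited twice and its neighbour never visited, this returns
just the neighbour.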

A simple way to check whether this is happening on your system is to perform
the SPARQL query "select ?url where { ?r nie:url ?url . FILTER(regex(str(?url),
'foo')) . }", then rename the folder "foo" and perform the query again to
see if some of the files haven't been transferred. Do it on a folder which
has at least 20 files. On my system it's usually the 19th or 20th file that
is skipped.
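
To illustrate why caching can matter, here is a toy model only. It is not a
description of how Virtuoso or Soprano work internally, and it skips rows
far more aggressively than what I see in practice. The store is a
hypothetical vector of URL strings, and all function names are made up. The
"live" cursor recomputes the n-th match against the current store, so every
row that stops matching shifts the remaining results:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using Store = std::vector<std::string>;  // hypothetical: one URL per resource

static bool matches(const std::string& url, const std::string& needle)
{
    return url.find(needle) != std::string::npos;
}

// Toy "live" cursor: finds the n-th row that matches right now.
static bool nthMatch(const Store& store, const std::string& needle,
                     std::size_t n, std::size_t& indexOut)
{
    std::size_t seen = 0;
    for (std::size_t i = 0; i < store.size(); ++i) {
        if (matches(store[i], needle)) {
            if (seen == n) {
                indexOut = i;
                return true;
            }
            ++seen;
        }
    }
    return false;
}

// Replaces the first occurrence of `from` in `url` with `to`.
static void renameInUrl(std::string& url, const std::string& from,
                        const std::string& to)
{
    const std::size_t pos = url.find(from);
    assert(pos != std::string::npos);
    url.replace(pos, from.size(), to);
}

// Modifies rows while walking the live cursor. Returns rows visited.
std::size_t updateWhileIterating(Store& store, const std::string& from,
                                 const std::string& to)
{
    std::size_t visited = 0;
    std::size_t idx = 0;
    for (std::size_t n = 0; nthMatch(store, from, n, idx); ++n) {
        renameInUrl(store[idx], from, to);
        ++visited;
    }
    return visited;
}

// Caches the matching positions first, then modifies. Returns rows visited.
std::size_t updateFromCache(Store& store, const std::string& from,
                            const std::string& to)
{
    std::vector<std::size_t> cached;
    for (std::size_t i = 0; i < store.size(); ++i)
        if (matches(store[i], from))
            cached.push_back(i);
    for (std::size_t i : cached)
        renameInUrl(store[i], from, to);
    return cached.size();
}
```

In this toy the cached variant visits every matching row, while the live
variant leaves some of them untouched.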

- Vishesh Handa


> Cheers,
> Sebastian
>
> On 05/14/2010 03:12 PM, Vishesh Handa wrote:
> > If you rename a folder with loads of data (it should have metadata), the
> > MetadataMover::updateMetadata function would sometimes skip a resource
> > or two. This doesn't always happen, just occasionally. Look at the output:
> >
> > nepomukfilewatch(10425)/nepomuk (filewatch service)
> > Nepomuk::MetadataMover::updateMetadata: r -> * QUrl(
> > "nepomuk:/res/0887efd7-b6ab-4acf-8f0a-74ed7106c6ed" )*
> > nepomukfilewatch(10425)/nepomuk (filewatch service)
> > Nepomuk::MetadataMover::updateMetadata: url ->
> > *KUrl("file:///home/vishesh/Index_LOO/Phonemusic/19.mp3")*
> > nepomukfilewatch(10425)/nepomuk (filewatch service)
> > Nepomuk::MetadataMover::updateMetadata:
> > KUrl("file:///home/vishesh/Index_LOO/Phonemusic/19.mp3") ->
> > KUrl("file:///home/vishesh/Index_L/Phonemusic/19.mp3")
> > nepomukfilewatch(10425)/nepomuk (filewatch service)
> > Nepomuk::MetadataMover::updateMetadata: Old Resource Exists
> > nepomukfilewatch(10425)/nepomuk (filewatch service)
> > Nepomuk::MetadataMover::updateMetadata: r ->  *QUrl(
> > "nepomuk:/res/b9f11a5b-cbe3-4797-9fe0-0f293a215a9e" )*
> > nepomukfilewatch(10425)/nepomuk (filewatch service)
> > Nepomuk::MetadataMover::updateMetadata: url ->
> > *KUrl("file:///home/vishesh/Index_LOO/Phonemusic/19.mp3")*
> > nepomukfilewatch(10425)/nepomuk (filewatch service)
> > Nepomuk::MetadataMover::updateMetadata:
> > KUrl("file:///home/vishesh/Index_LOO/Phonemusic/19.mp3") ->
> > KUrl("file:///home/vishesh/Index_L/Phonemusic/19.mp3")
> > nepomukfilewatch(10425)/nepomuk (filewatch service)
> > Nepomuk::MetadataMover::updateMetadata: Old Resource Exists
> > nepomukfilewatch(10425)/nepomuk (filewatch service)
> > Nepomuk::MetadataMover::updateMetadata: r ->  QUrl(
> > "nepomuk:/res/38616b7c-9540-46de-aa5c-e12971cac64d" )
> > nepomukfilewatch(10425)/nepomuk (filewatch service)
> > Nepomuk::MetadataMover::updateMetadata: url ->
> > KUrl("file:///home/vishesh/Index_LOO/Phonemusic/22.mp3")
> > nepomukfilewatch(10425)/nepomuk (filewatch service)
> > Nepomuk::MetadataMover::updateMetadata:
> > KUrl("file:///home/vishesh/Index_LOO/Phonemusic/22.mp3") ->
> > KUrl("file:///home/vishesh/Index_L/Phonemusic/22.mp3")
> > nepomukfilewatch(10425)/nepomuk (filewatch service)
> > Nepomuk::MetadataMover::updateMetadata: Old Resource Exists
> >
> > In the second case, the resource
> > *nepomuk:/res/b9f11a5b-cbe3-4797-9fe0-0f293a215a9e* actually has a
> > nie:url of *file:///home/vishesh/Index_L/Phonemusic/20.mp3*. This was
> > kinda difficult to track down, but the solution is fairly simple, and
> > was mentioned in the Soprano::QueryResultIterator documentation.
> >
> > Many backends do lock the underlying Model during iteration. Thus,
> > it is always a good idea to cache the results if they are to be used
> > to modify the model to prevent a deadlock:
> >
> > I really should read the documentation more thoroughly.
> > *Other things:*
> >
> > 1. The kinotify currently tracks hidden files as well. Why is that?
> > Tracking hidden files means you track all kinds of temporary files and
> > are alerted when they are altered. It makes debugging a million times
> > harder. It's just a matter of changing
> > *KInotify::Private::watchHiddenFolders* to false. If you don't agree
> > with me, then can we please make it configurable?
> >
> > 2. Minor optimization on *MetadataMover::updateMetadata* -> Patch!
> >
> > - Vishesh Handa
> >
> >
> >
> > _______________________________________________
> > Nepomuk mailing list
> > Nepomuk at kde.org
> > https://mail.kde.org/mailman/listinfo/nepomuk