Nepomuk Metadata Extractor moved to KDE Review

Jörg Ehrichs Joerg.Ehrichs at
Wed Oct 31 22:46:18 GMT 2012


2012/10/31 Sebastian Kügler <sebas at>:
>> While most parts can be safely installed by any user, the
>> Nepomuk2:Service should only be installed/activated, if the user is
>> aware of its function.
> How are you going to make sure the user is aware of its function?

Currently only in the KCM configuration dialog. This lets the user
enable/disable it.
But it is enabled by default.

> If it doesn't behave as the user would expect it to (i.e. first, not render
> the system unusable for a long time, query third party sources), it needs
> fixing. The key is not that it does something, the key is that it should
> provide value to the user, not break his or her system, or endanger privacy.
> Can't you come up with a gentle solution? It should:
> - not bog down the system
> - make sure the user understands which online resources are queried
> Those might need some UI work, but I think it could turn the metadataextractor
> into something rather useful *and* user-friendly.

I'm not really sure what the user would expect when he installs the program.
This program is delivered under the premise that it finds metadata on the
internet automatically for all files.

So basically the user is aware of what this means. For the Dolphin
integration the user has to activate it for each file
on his own, so this should be fine. I'm just not sure if the automatic
background service is what a user really expects.

There is currently a KCM installed along with it that lets the user
decide which web resources will be queried.
The KCM also allows enabling/disabling the background service that
automatically fetches additional information for
each music/video/document file once the Nepomuk fileindexer is aware of it.

The current implementation allows setting the number of processes that
are used to fetch this data at the same time.
The default is 1 process, but even ~6 are no problem and do not stress
the system too much. I have tested it with my hard disk full of TV shows,
and even though it takes a long time, the fetching process wasn't
noticeable while using the computer normally.
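
To illustrate the idea of a configurable limit on parallel lookups, here
is a minimal sketch in Python. It is not the actual implementation: it
uses a thread pool instead of separate processes, and fetch_metadata(),
fetch_all() and max_workers are hypothetical names chosen for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_metadata(path):
    """Hypothetical stand-in for one web lookup; returns a dummy result."""
    return {"file": path, "title": path.rsplit("/", 1)[-1]}


def fetch_all(paths, max_workers=1):
    """Look up metadata for all paths with at most max_workers lookups at once."""
    if max_workers < 1:
        raise ValueError("max_workers must be at least 1")
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns the results in the same order as the input paths.
        return list(pool.map(fetch_metadata, paths))
```

With max_workers=1 the lookups run strictly one after another, which
matches the default described above; raising the value only raises the
upper bound on simultaneous lookups.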

So unlike Nepomuk's old file/mail indexer, which could stress the system,
this is rather harmless.
I couldn't spot any difference apart from the fact that there was
constant low network traffic in the background.

What might be a problem is that the user might not be aware that
this will query some web resources in the background by default,
which, as I mentioned, is not really a good idea for traffic-limited
connections (not that a lot of traffic is produced, but sometimes each
kB counts).

So the UI part is available.
I guess the best solution would be to disable the background service
by default and let the user enable it if he really wants to use it.

This could either be done by the packagers, since the nepomukserverrc
file needs to be changed during installation,
or I'll do a first-run check in the service and have it disable itself
the first time it is started.
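
A first-run check of this kind could look roughly like the following
Python sketch. The group name, the "firstRunDone" marker key and the
function name are hypothetical, and the use of an "autostart" key in a
per-service group is an assumption about the layout of nepomukserverrc.
Python's configparser also only handles the plain INI subset of KDE
config files (for example, it rejects keys that appear before any group).

```python
import configparser

# Hypothetical names, chosen only for this illustration.
SERVICE_GROUP = "Service-metadataextractorservice"
FIRST_RUN_KEY = "firstRunDone"


def apply_first_run_defaults(rc_path):
    """Disable autostart once; return True if the file was changed."""
    config = configparser.ConfigParser(interpolation=None)
    config.optionxform = str  # keep the key case exactly as written
    config.read(rc_path, encoding="utf-8")  # a missing file is ignored

    if not config.has_section(SERVICE_GROUP):
        config.add_section(SERVICE_GROUP)
    group = config[SERVICE_GROUP]

    if group.get(FIRST_RUN_KEY) == "true":
        return False  # already handled; respect whatever the user set later

    group["autostart"] = "false"  # assumed key name
    group[FIRST_RUN_KEY] = "true"
    with open(rc_path, "w", encoding="utf-8") as handle:
        config.write(handle, space_around_delimiters=False)
    return True
```

The marker key is what makes this a one-time change: if the user later
enables the service in the KCM, the check does not switch it off again.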

>> But I assume this is something the packagers have to split.
> That would be almost making sure that it will hit users. In general, never
> assume something's going to be fixed by packagers, they, too, need something
> that can just be slapped into a package, and which should behave well by
> default. :/

Sadly I have no idea how to make it easier for packagers. In the end,
having separate packages that allow uninstalling the service
if it is really not needed is the very best solution.

In general even if everything is in one package, the user won't see
any performance loss.

What might be a problem, on the other hand, is the privacy issue, which I
hadn't thought about until now.
As I do ask a search engine (Microsoft Bing) about every document found,
to see whether more information is available, this could be problematic.

I will change the service so that the lookup for a specific resource
type (namely documents) can be switched off separately,
and besides disabling the service on first run, I will also disable the
document search in the initial settings.
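
As a rough illustration of such per-type switches, here is a small
Python sketch. ExtractorSettings, should_lookup() and the type names are
hypothetical, and leaving music and video lookups enabled in the initial
settings is an assumption; only the disabled service and the disabled
document lookup come from the plan above.

```python
from dataclasses import dataclass, field


def _initial_lookups():
    # Documents start disabled for privacy; the other defaults are assumed.
    return {"music": True, "video": True, "document": False}


@dataclass
class ExtractorSettings:
    service_enabled: bool = False  # background service is off at first
    lookup_enabled: dict = field(default_factory=_initial_lookups)


def should_lookup(settings, resource_type):
    """Return True only if the service and this resource type are both on."""
    if not settings.service_enabled:
        return False
    # Unknown resource types are treated as disabled.
    return settings.lookup_enabled.get(resource_type, False)
```

Nothing is sent to a web resource unless the user has enabled both the
service and the lookup for that particular resource type.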

It's better to let the user enable it by his own decision than to pass
all information right away to some Internet search engine.


More information about the kde-core-devel mailing list