[Nepomuk] Nepomuk-WebMiner integration into nepomuk-core fileindexer

Jörg Ehrichs Joerg.Ehrichs at gmx.de
Tue Jan 8 15:29:03 UTC 2013


2013/1/8 Vishesh Handa <me at vhanda.in>:
> Hey Jörg
>
> Sorry about the delay.

No problem :)

>
> Would it really be that much effort to copy the indexing queue in the
> webminer service? Additionally the fileindexer could even emit signals when
> its queue finishes one file so you can pick it up. If required.

Not really too much effort; I just like to avoid copying stuff around.
But I fully understand all the issues against the nepomuk-core
inclusion.
How about I stay with the Nepomuk2::Service part (which I use at the
moment), but instead of the ResourceWatcher I stick with the
kext:indexingLevel solution that the file indexer uses?

So the service runs with a copy of the eventmonitor for proper
suspend/resume on idle/diskspace/battery (and network) limitations
and checks for the next 10 resources with mimetype pdf/video/audio
that have kext:indexingLevel == 2.
After the webminer has run, no matter whether it was successful or not,
the resource gets kext:indexingLevel 3.
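
A minimal sketch of that loop, in plain C++ without any Nepomuk classes.
The Resource struct, the function names, the "suspended" flag and the
mimetype strings are all made up for illustration; the real service would
query the store and listen to the eventmonitor instead.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical in-memory stand-in for a resource; not the Nepomuk API.
struct Resource {
    std::string url;
    std::string mimeType;
    int indexingLevel; // mirrors kext:indexingLevel
};

constexpr int LevelToMine = 2; // level the service looks for
constexpr int LevelMined = 3;  // level set after the webminer ran

// True for the mimetypes the webminer handles (pdf/video/audio).
// The concrete strings are an assumption.
bool isMinable(const std::string &mime) {
    return mime == "application/pdf"
        || mime.rfind("video/", 0) == 0   // starts with "video/"
        || mime.rfind("audio/", 0) == 0;  // starts with "audio/"
}

// Collect at most batchSize resources that are waiting for the webminer.
std::vector<Resource *> nextBatch(std::vector<Resource> &store,
                                  std::size_t batchSize) {
    std::vector<Resource *> batch;
    for (Resource &r : store) {
        if (batch.size() == batchSize)
            break;
        if (r.indexingLevel == LevelToMine && isMinable(r.mimeType))
            batch.push_back(&r);
    }
    return batch;
}

// Run the miner on one batch and return how many resources were handled.
// The level is raised whether or not the miner reports success.
std::size_t processBatch(std::vector<Resource> &store, std::size_t batchSize,
                         const std::function<bool(const Resource &)> &mine,
                         bool suspended = false) {
    assert(batchSize > 0);
    if (suspended)
        return 0; // eventmonitor asked us to pause
    std::vector<Resource *> batch = nextBatch(store, batchSize);
    for (Resource *r : batch) {
        (void)mine(*r); // result deliberately ignored
        r->indexingLevel = LevelMined;
    }
    return batch.size();
}
```

The service would call processBatch(store, 10, ...) repeatedly until it
returns 0, and pass suspended = true while the eventmonitor reports a
limitation.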

Does this sound like a better solution?

>>
>> The current indexing can then be controlled via the overall indexing
>> status and shown in the nepomuk-controller that sits in the
>> systemtray.
>
>
> It would complicate the code, but we could make it control multiple sources
> - file indexer + webminer. Actually, I think that is a good idea, especially
> because it should also be controlling the akonadi indexing.

I believe the controller needs to be changed anyway (no matter which
solution I select; the way it is now, it doesn't show much
information in its single string).
First, it would be nice if the suspend/resume button emitted a signal
via dbus (in case it doesn't do this already).
Adding proper output to show is a nice task for later.

>
> I'm hesitant about moving the webminer into nepomuk-core because of the
> following reasons -
> [...]
>
> * Splitting of the webminer - I like the code being in one place. You'd have
> to split the miner into the ui parts / core parts.
>
> Overall, these reasons aren't that serious, but I still feel quite uneasy.
> That's the main reason why I took so much time to reply. Maybe someone else
> could tell their opinion?

I'm fine with your reasons, especially as I don't like to split
core/ui parts here.
It was annoying enough that it took me a while to find where the
nepomuk kcm was hidden.

All I need is a proper way to integrate the service into the system.
I don't like opening the webminer ui every time I have a new file
when the same thing can be done automatically.

So if the service approach mentioned above, together with a copied part
of the indexing queue and the kext:indexingLevel, is a good solution,
I'll implement this and we can move the complete webminer repo (after
polishing) somewhere into SC, wherever it might fit.

>> The biggest problem might be the generation of the SimpleResource
>> classes, which takes a very long time currently. Hopefully this can be
>> fixed too, as this problem should be solved by any program that will
>> use them in the future anyway.
>
>
> I don't think everyone will agree with me, but I think the code should be
> pre-generated and saved. The main reason for this is that we have had
> problems when SDO has been changed in a way that the nepomuk-rcgen produces
> different code. This happened when we had set the max cardinality of a lot
> of properties. It resulted in the method signature changing. Eg -
> setFullNames( QList<> .. ) to setFullName( QString ). We eventually had to
> patch up rcgen to generate both the functions.
>
> Also, since we do not guarantee that the generated files will maintain
> source compatibility, it seems better if one could just re-generate them
> before each release, and fix the problems as they arise.
>

Well, we had this discussion at length in kde review.
All in all it would be nice to generate the files on the fly, but
packagers really hate it.

I hope we can add the generated files back into the repo later.
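
For reference, the compatibility fix you describe (rcgen emitting both
functions after the max cardinality change) might look roughly like the
sketch below. This is not real rcgen output: the class name Contact is
hypothetical, std types stand in for QList/QString, and the behaviour of
the old list setter (keep only the first entry) is my assumption.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a generated SimpleResource-style class.
class Contact {
public:
    // Setter matching a max cardinality of 1.
    void setFullName(const std::string &name) { m_fullName = name; }

    // Old list-based setter, kept so existing callers still compile.
    // Assumption: only the first entry is used; an empty list clears the value.
    void setFullNames(const std::vector<std::string> &names) {
        m_fullName = names.empty() ? std::string() : names.front();
    }

    const std::string &fullName() const { return m_fullName; }

private:
    std::string m_fullName;
};
```

With pre-generated files checked into the repo, a change like this shows
up as a normal diff at regeneration time instead of as a build failure
on the user's side.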

Cheers,
Jörg

