Nepomuk Metadata Extractor moved to KDE Review

Wed Nov 7 15:57:04 GMT 2012

Hi Vishesh,
thanks for the input.

>
> 1. You have multiple extractors - One for resources which extract
> information from the file, and some web-extractors. Considering that Nepomuk
> now allows easy Qt based extractor plugins, how about we move your code over
> there? Your poppler based code would be quite useful. Same goes with the
> ODF.
>

Yes, when I started the project the new file extractors weren't available.
I do intend to switch to the new fileindexer approach, combined with some of
the current filename analysis (unless this can go into the fileindexer too).
About moving the pdf/odf analyzer over: I thought you already copied
the important parts to the new indexer?
If they are still missing, I'll add them later on.

>
> 2. Project Name - If one moves the extractors away, the only part that is
> left is the web-extractors. Why not rename the project to
> Nepomuk-WebExtractor or something similar? I know a project by that name
> already exists, but that can be removed. It's a dead project.
>

If we can ignore the naming issue with the "old" project, I would really
like to rename it to Nepomuk-WebExtractor, as this
fits the purpose of the system best.

>
> 3. I would eventually like this to be a part of the KDE SC release. Web
> Extractors are something that I have wanted for a very very long time. I'm
> not sure if we can get this into 4.10, but I'd definitely like it to be a
> part of 4.11.
>

I would like it to be part of KDE SC. I assume it's way too late for 4.10
due to the feature freeze, but 4.11 sounds nice too.
I could go to extragear for now and move it to kde-runtime when
master is unfrozen again?

> As to where it should be placed. I agree with Sebastian Kugler, kdelibs is
> not the place. We had initially planned on splitting kde-runtime/nepomuk
> into multiple repositories, but we're now waiting for KF5. Do you think this
> could go under kde-runtime (not in that repo)
>

I just wonder whether runtime is really the best place.
Besides being a standalone program, it can also serve as a
library that can be used by others.
While this is mostly a "nice to have" case, the additional searching
capabilities could be useful in Bangarang, Amarok, Okular and other
programs working with media files.
Would such a library component still fit into runtime? Or should I just
ignore this for now, as it is unlikely that it will be integrated
into other programs in the near future?

> 4. ResourceWatcher - [...]
>
> This way, you would avoid using the ResourceWatcher, and everything would be
> better integrated. But I'm not sure how we would go about this, so lets
> stick with the current architecture for now.
>

This sounds like a nice idea. We can figure something out. I do have a
few ideas in this area, but not really
the time to work on such large changes at the moment.

>
> 5. Auto generated SimpleResource Headers - You've included them in your
> repo. That was what we originally wanted. We didn't want to repeat the mess
> that happened with breaking kdepim cause of ontology changes.
>

Good to know this was the intended way to go.

>
> Does anyone have a problem with having generated headers in the code? One
> could generate them on the fly, but that would be slow (Jorg says around 10
> minutes?) and if something is changed in the ontologies, the classes would
> change drastically thereby affecting the code.
>

I could push my latest changes, which add a cmake switch to easily
regenerate the ontology classes.
This would combine both solutions, as long as it is fine to keep such
generated classes in the repo.

Cheers,
Joerg