Nepomuk Metadata Extractor moved to KDE Review

Wed Nov 7 14:38:15 GMT 2012

Hey Jorg

A couple of things:

1. You have multiple extractors - one kind for resources, which extracts
information from the file itself, and some web extractors. Considering that
Nepomuk now allows easy Qt-based extractor plugins, how about we move your
resource extractor code over there? Your Poppler-based code would be quite
useful. The same goes for the ODF code.

2. Project Name - If one moves the extractors away, the only part that is
left is the web-extractors. Why not rename the project to
Nepomuk-WebExtractor or something similar? I know a project by that name
already exists, but that can be removed. It's a dead project.

3. I would eventually like this to be a part of the KDE SC release. Web
Extractors are something that I have wanted for a very very long time. I'm
not sure if we can get this into 4.10, but I'd definitely like it to be a
part of 4.11.

As to where it should be placed: I agree with Sebastian Kugler that kdelibs
is not the place. We had initially planned on splitting kde-runtime/nepomuk
into multiple repositories, but we're now waiting for KF5. Do you think
this could go under kde-runtime (not in that repo)?

4. ResourceWatcher - This is something that I would like done in the
future. Not right now. We don't need to be perfectionists.

I would ideally like this to be part 3 of the file indexing system we have.
Currently part 1 pushes the stat information, rdf:type and mimetype. Part 2
indexes the contents of the file (your resource extractors go here), and
part 3 could extract information from the web.

This way, you would avoid using the ResourceWatcher, and everything would
be better integrated. But I'm not sure how we would go about this, so let's
stick with the current architecture for now.
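
To make the three-part idea above concrete, here is a minimal sketch in
Python. It is not Nepomuk code: all names (basic_stage, content_stage,
make_web_stage, run_pipeline) are hypothetical, the "content" step is a
trivial stand-in for real extractors, and the web lookup is an injected
function rather than a real fetcher.

```python
import mimetypes
import os
from typing import Callable, Dict, List

Metadata = Dict[str, object]
Stage = Callable[[str, Metadata], Metadata]


def basic_stage(path: str, meta: Metadata) -> Metadata:
    """Part 1 (sketch): stat information, a type and the mimetype."""
    st = os.stat(path)
    mime, _ = mimetypes.guess_type(path)
    return {
        **meta,
        "size": st.st_size,
        "mtime": st.st_mtime,
        "type": "file",  # placeholder for the rdf:type
        "mimetype": mime or "application/octet-stream",
    }


def content_stage(path: str, meta: Metadata) -> Metadata:
    """Part 2 (sketch): stand-in for content extractors.

    Only handles text files and uses the first line as a title.
    """
    if str(meta.get("mimetype", "")).startswith("text/"):
        with open(path, encoding="utf-8", errors="replace") as fh:
            return {**meta, "title": fh.readline().strip()}
    return meta


def make_web_stage(lookup: Callable[[Metadata], Metadata]) -> Stage:
    """Part 3 (sketch): wrap a hypothetical web lookup as a stage."""
    def web_stage(path: str, meta: Metadata) -> Metadata:
        # The lookup sees everything the earlier parts collected.
        return {**meta, **lookup(meta)}
    return web_stage


def run_pipeline(path: str, stages: List[Stage]) -> Metadata:
    """Run the stages in order, each one building on the previous result."""
    meta: Metadata = {}
    for stage in stages:
        meta = stage(path, meta)
    return meta
```

Because each part only receives what the earlier parts produced, part 3
would not need to watch for resource changes in this arrangement; it is
simply called after part 2 has finished with a file.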

5. Auto-generated SimpleResource headers - You've included them in your
repo. That was what we originally wanted. We didn't want to repeat the mess
that happened when ontology changes broke kdepim.

Does anyone have a problem with having generated headers in the code? One
could generate them on the fly, but that would be slow (Jorg says around 10
minutes?) and if something is changed in the ontologies, the classes would
change drastically thereby affecting the code.

On Wed, Oct 31, 2012 at 3:41 AM, Jörg Ehrichs <Joerg.Ehrichs at gmx.de> wrote:

> Hi all,
>
> today I've moved my metadata extractor into KDE Review [1].
> As kdelibs is frozen until KF5, I would like to get this into
> extragear/base (unless anyone has a better idea where to put this).
>
> For those who are unaware what this little program does:
>
> This program is an extension to Nepomuk and is able to find
> additional metadata for videos, music and documents on the Internet.
> Based on the filename, previous metadata extraction and mimetype, one of
> the existing Python plugin-based fetchers (thanks to Kross) is called
> to get more information for a file.
>
> This can be the title, season, episode, writer, author, cast, cited
> references and so on.
> All this data is saved into Nepomuk and can be used with Dolphin /
> Bangarang to get more information from your files.
>
> The program is integrated into the Dolphin service menu, can be called
> as a command-line program, runs as a Nepomuk2::Service in the background
> (can be switched off),
> and also has adapters to integrate into Konqueror and Chromium.
>
> More information on it can be found on my blog [2].
> Some more technical description is available via doxygen.
>
> Please review the current codebase to help make it as stable as
> possible.
>
> Thanks in advance,
> Joerg
>
> [1] https://projects.kde.org/projects/kdereview/nepomuk-metadata-extractor
> [2] http://joerg-weblog.blogspot.de/search/label/Metadata%20Extractor
>

-- 
Vishesh Handa