[Nepomuk] [RFC] File Indexers

Tue Mar 19 18:05:42 UTC 2013

Hey everyone

As you guys might remember, we moved away from Strigi for the 4.10
release. Our solution, however, still does not support any document formats
apart from PDF. We need to change that and support other formats. There are
2 possible ways to go about this:

1. We use Okular, which supports a number of popular formats.
2. We write our own indexers using the relevant libraries.

(1) would lead to yet another abstraction, and someone would need to split
the Okular code base. Currently the core parts are quite entangled with the
UI parts. I've looked over their code and it doesn't seem like a small task.

(2) would require us to write a large number of extractors. I was thinking
of organizing some kind of online sprint in order to do this. Writing an
extractor is very simple, and I have been getting a large number of
requests from people interested in Nepomuk. This might be a nice way to get
them started (though we do have tasks + juniors as well). Maybe I could
organize something over IRC where we make everyone compile Nepomuk (people
seem to be struggling with this) and write a simple extractor plugin.

(2) might require a lot of hand-holding, but it would be better in the
long run, especially since we might get some contributors out of it.
(1) would require someone experienced to work with the Okular team.

Any comments? Opinions?

-- 
Vishesh Handa