[Nepomuk] How to use Nepomuk Strigi DB

Sebastian Trüg trueg at kde.org
Mon Sep 20 11:48:31 CEST 2010


The simplest way to read the information is to use Nepomuk::Resource
like so:

KUrl myFileUrl = ...;
Nepomuk::Resource res( myFileUrl );

// xyz stands for the URI (a QUrl) of the property you want to read
Nepomuk::Variant v = res.property( xyz );
...

But before implementing this from scratch you should really look into
the work done by Artem in his Summer of Code project. He has created a
framework that does what you describe in a generic pluggable manner. It
is called webextractor and lives in the KDE playground[1].
The idea is that you write a plugin that fetches information from
somewhere and then creates suggestions for annotations like tags. If the
suggestions are good enough they will be applied automatically,
otherwise the user has to approve them.
Also, I have started to write a plugin for that framework which uses the
Scribo library[2] to do exactly what you propose: it analyzes the text
(again using the scribo plugin system, which will soon use "real" NLP to
determine keywords and entities) and creates suggestions. That plugin is
as good as done, since it simply has to be converted from one framework
to the other (my old annotation lib to the new webextraction service).
So please do not rewrite that. Rather, help us make the work that has
already been done stable enough for a release.
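
To make the idea a bit more concrete, here is a rough sketch of the
"apply good suggestions automatically, ask the user about the rest" flow
described above. This is NOT the webextractor API: Suggestion, Triage,
triage() and the numeric confidence threshold are made-up names, written
in plain C++ without any KDE dependencies, and the real framework may
well rate suggestions differently.

```cpp
#include <string>
#include <vector>

// Hypothetical types for illustration only; not taken from webextractor.
struct Suggestion {
    std::string tag;    // proposed annotation, e.g. a tag for a file
    double confidence;  // how sure the plugin is, assumed to be in [0, 1]
};

struct Triage {
    std::vector<Suggestion> applied;  // good enough: applied automatically
    std::vector<Suggestion> pending;  // the user has to approve these
};

// Split the suggestions produced by a plugin into the two groups.
// A suggestion counts as "good enough" when its confidence reaches
// the threshold (an assumed criterion, chosen for this sketch).
Triage triage(const std::vector<Suggestion>& suggestions, double threshold)
{
    Triage result;
    for (const Suggestion& s : suggestions) {
        if (s.confidence >= threshold)
            result.applied.push_back(s);
        else
            result.pending.push_back(s);
    }
    return result;
}
```

A plugin would then only have to produce the list of suggestions; the
decision whether to apply them or to ask the user stays in one place.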

Cheers,
Sebastian

[1] http://websvn.kde.org/trunk/playground/base/nepomuk-kde/webextractor/
[2] http://websvn.kde.org/trunk/playground/base/nepomuk-kde/scribo/

On 09/18/2010 12:43 PM, Vyacheslav wrote:
> Hi. After searching api.kde.org and this forum for some time, I haven't found
> information about using the Strigi DB. As a part of this project:
> http://forum.kde.org/brainstorm.php#idea90070_page1 we need indexing
> information about files. As far as I can see, Strigi stores information about
> all text documents (txt, rtf, odt, etc.).
> At the moment our program reads all text files manually, and this is not right.
> Can you give a link to the API or an example where we can find out how to use
> the Strigi DB and read the information in all txt files without reading the
> files directly? Thanks

