[Nepomuk] word lists - strigi? nepomuk?

Ignacio Serantes kde at aynoa.net
Wed Jul 18 13:00:26 UTC 2012


Hi,

Currently, this is the information that the text file indexer stores in Nepomuk:

   - nfo:characterCount
   - nie:contentSize
   - nfo:lineCount
   - nie:plainTextContent
   - nfo:wordCount

To store more properties, the text analyzer must be modified. This indexer is
available in the libstreamanalyzer git repository.
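
Since nie:plainTextContent already holds the extracted text, a word frequency
distribution could probably be computed from it without a second indexing
pass. The following is only a minimal sketch in Python: it assumes the text
has already been fetched from Nepomuk (the fetch itself is not shown), and the
function name word_frequencies and the sample string are hypothetical.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Return a Counter mapping each lowercased word to its occurrence count."""
    # \w+ matches runs of letters, digits and underscores (Unicode-aware in Python 3).
    words = re.findall(r"\w+", text.lower())
    return Counter(words)


# Hypothetical stand-in for text read from nie:plainTextContent.
sample = "Nepomuk stores text. Strigi extracts text for Nepomuk."
freq = word_frequencies(sample)

# Print the most frequent words first.
for word, count in freq.most_common(3):
    print(word, count)
```

A real categoriser would likely also want stop-word removal and stemming,
which this sketch leaves out.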

Information about the ontologies is available here: http://oscaf.sourceforge.net/

There is also an example that fetches information for TV series, called
nepomuktvnamer, which could be useful to you. You can find its git
repository, and other repositories related to Nepomuk, here:
http://quickgit.kde.org/


On Wed, Jul 18, 2012 at 2:42 PM, Dean Perry <happy.heyoka at gmail.com> wrote:

> Hi,
>
>
>
> I originally posted this here:
> <http://forum.kde.org/viewtopic.php?f=43&t=106919>
>
>
>
> but the forum admin said I should try you directly... if you feel like
> answering, post to the forum or mail me and I'll copy it there; I can't be
> the only one who has wondered about this:
>
>
>
> I have an idea for an application to automatically categorise and tag
> documents based on their contents.
>
>
> To do this I need a frequency distribution of the words in the document.
>
> I have played around with the nepomuk examples and have a few clues about
> the tagging and rdf storage.
>
> I can't find much info on a per-document word list though - nepsak,
> nepoogle don't appear to show it, so maybe it's not stored in virtuoso?
>
> Is there a word list stored (eg: inverted vector index)? How does the full
> text search in Dolphin do its thing?
>
> Do I need to produce this list myself using libstreamanalyzer? I'd prefer
> not to do a second indexing pass.
>
>
>
> _______________________________________________
> Nepomuk mailing list
> Nepomuk at kde.org
> https://mail.kde.org/mailman/listinfo/nepomuk
>
>


-- 
Best wishes,
Ignacio