[Nepomuk] [RFC] Better Full text search
phreedom at yandex.ru
Sat May 4 14:09:59 UTC 2013
On Saturday 04 May 2013 18:49:05 Vishesh Handa wrote:
> Even when you're doing a simple search for one word
> it is still something like this -
>
> select distinct ?r where {
> { ?r ?p ?o .
> bif:contains(?o, "word") .
> }
> UNION {
> ?r ?p ?o1 .
> ?o1 ?p2 ?o .
> bif:contains(?o, "word") .
> }
> }
>
> which is again kinda slow cause we aren't using any of the indexes of the
> statements.
>
> I was thinking of moving all the plain text related to a file into the
> nie:plainTextContent of the resource. So in the case of music we would have
> -
>
> <res> nie:plainTextContent "title artist album whateverElse" .
>
> for the case of files, we would append the file name, and any other plain
> text that we want searched just in the nie:plainTextContent. So a search for
> any combination of text will just have to search through the plain text
> content.
>
> Opinions?
>
> We can easily do this for the 4.11 release cause we already need everyone
> to re-index everything cause of the migration.
Have you asked Virtuoso devs?
Unless someone has tried using plainTextContent as primary storage, it shouldn't
be a problem. But if structured data goes out the window, we might as well
use CLucene to get good performance :(
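
For comparison, here is a rough sketch of what the one-word search might
reduce to under the proposal, assuming everything searchable has already been
concatenated into nie:plainTextContent. The prefix is the standard NIE
namespace and the triple-pattern form of bif:contains is the one Virtuoso
documents; the exact query Nepomuk would generate is an assumption on my part,
not taken from the code:

```sparql
PREFIX nie: <http://www.semanticdesktop.org/ontologies/2007/01/19/nie#>

# Sketch only: the predicate is fixed, so the full-text match is no longer
# combined with an unbound ?p or a second hop through ?o1.
select distinct ?r where {
    ?r nie:plainTextContent ?o .
    ?o bif:contains "word" .
}
```

The catch is that a hit no longer tells you which property matched (title,
artist, file name, ...), which is the structured-data loss mentioned above.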