[Nepomuk] [RFC] Better Full text search

Vishesh Handa me at vhanda.in
Sat May 4 17:09:50 UTC 2013


On Sat, May 4, 2013 at 9:18 PM, <phreedom at yandex.ru> wrote:

> On Saturday 04 May 2013 20:14:37 Vishesh Handa wrote:
> > On Sat, May 4, 2013 at 7:47 PM, Ivan Čukić <ivan.cukic at kde.org> wrote:
> > > > <res> nie:plainTextContent "title artist album whateverElse" .
> > >
> > > For me, the plainTextContent of a song would be the lyrics. This seems
> > > like a misuse of the property. With a very good reason behind it, but
> > > still a misuse.
> > >
> > > I remember when I wanted to keep all activities in one string property
> > > as a \n terminated list to make it speedy :D
> > >
> > > I'd say go for it, but only as a last resort.
> >
> > I would not like Nepomuk to be a data store. It's not the place to store
> > your lyrics to fetch them later, same for emails and files. It is a place
> > to store structured data.
> >
> > In the case of lyrics, the main reason we are storing them is to be able
> > to search through them, not to display them to the user. So we can
> > potentially append other data.
>
> Yes and no. Until discardable graphs were introduced, there was not even a
> distinction between primary storage and cached stuff. Real life is even
> more complicated: you can have local data indexed, you can have remote data
> indexed (and it would be very, very nice to have it cached), and for some
> stuff Nepomuk is used as the primary storage.
>
> The reason people are trying to stuff Nepomuk with their blobs is very
> simple: there's a very real demand for this functionality, and the Nepomuk
> ontologies as-is already allow you to store your whole filesystem,
> including all byte streams/file contents, so it looks like a very
> reasonable approach, especially since nobody actually offers an
> alternative. OK, Akonadi is the only exception, which provides caching of
> remote data, but it's domain-specific.
>
> Imagine a user finding a music video by its lyrics, opening the video only
> to discover that (s)he can't see any lyrics, because Nepomuk got its lyrics
> from some web extractor. Thus the motivation to use Nepomuk at least as a
> cache of data, not only for search purposes.
>

You do have a point. In this case they should be able to access the lyrics.

>
> There's no primary storage for user-generated RDF at all, so the data is
> stored in Nepomuk and users are disappointed when something breaks or
> disappears.
>

If we treat Nepomuk as a data store, then you have to deal with keeping the
store up to date. Specifically in the case of Akonadi: what are
applications supposed to use, Nepomuk or Akonadi? And then we also need a
two-way sync to keep both databases up to date.

So I prefer treating Nepomuk as a cache just for searching, but I get that
it isn't just a cache in the case of tags, ratings, and other specific RDF.
So it's weird.


> I'm currently experimenting with solutions to some of these issues, but I
> can't do it fast due to time constraints. I don't expect anything worth
> going public with in the next couple of months at least, and that's if I'm
> lucky :(
>

Could you elaborate?





-- 
Vishesh Handa