[Kde-pim] [Discussion] Search in Akonadi

Thu Aug 20 13:22:27 BST 2009

On Thursday 20 August 2009 12:46:05 Tobias Koenig wrote:
> On Thu, Aug 20, 2009 at 11:59:45AM +0200, Volker Krause wrote:
> > >   1) Loading everything into a model, iterate over the model, filter
> > > out everything you don't need -> performance and memory problems
> >
> > sure, but not worse than before. Which means it could be an acceptable
> > intermediate solution until the search problem has been solved for real.
>
> Right, but that could be a real performance problem, as Akonadi is supposed
> to handle more data than the previous libs.

Sure, that's why I said "intermediate solution"; of course we don't want this
to be used in the long run.
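
To make the cost of that intermediate approach concrete, here is a minimal
sketch of option 1 (load everything, then filter on the client). The Item
struct and filterItems() are hypothetical stand-ins for illustration only,
not the real Akonadi classes or model API:

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for an Akonadi item; NOT the real Akonadi::Item.
struct Item {
    long long id;
    std::string mimeType;
    std::string payload;
};

// Client-side "search": every item has already been loaded into memory,
// and we simply walk the whole list. Time and memory both grow linearly
// with the number of loaded items, which is the scaling concern above.
std::vector<Item> filterItems(const std::vector<Item> &all,
                              const std::function<bool(const Item &)> &matches)
{
    std::vector<Item> result;
    for (const Item &item : all) {
        if (matches(item))
            result.push_back(item);
    }
    return result;
}

// Small usage check: keep only the contact-like items.
void filterExample()
{
    const std::vector<Item> all = {
        {1, "text/directory", "Tobias"},
        {2, "message/rfc822", "Re: Search"},
        {3, "text/directory", "Volker"},
    };
    const std::vector<Item> contacts = filterItems(
        all, [](const Item &i) { return i.mimeType == "text/directory"; });
    assert(contacts.size() == 2);
    assert(contacts[1].id == 3);
}
```

Nothing here is smarter than what the old libs did, which is the point: it
works, but every search pays for loading the full data set first.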

> > >   2) Having the search implemented as a separate engine, which returns
> > > only the Akonadi UIDs of the items that match -> sounds perfect
> >
> > Yep, although I don't really like the UID-list interface. As a developer
> > you don't need UIDs, you need Items. So, I'd rather suggest an interface
> > similar to ItemFetchJob which can be configured using ItemFetchScope and
> > returns as much payload data as you need for your current task. This also
> > saves you some additional roundtrips to the Akonadi server.
>
> Right, that would be an additional extension.
>
> > >   1) If users don't have a working Nepomuk installation with the Sesame2
> > > backend, they will have either a damn slow search (if the Redland
> > > backend is used) or no working search at all -> half of Akonadi
> > > doesn't work
> >
> > I agree that the setup problems have to be taken seriously; we still have
> > some for the database part. OTOH nothing will change there if we don't
> > push it. Sure, that's painful for everyone involved but it will
> > eventually get us there. At least this time we are not alone with this
> > problem: as you could see in a recent k-c-d thread, Plasma is considering
> > making Nepomuk mandatory as well.
>
> The question is when...? As far as I understood, Sesame2 is the only useful
> backend currently. The current state of Virtuoso doesn't encourage me to
> believe that they will help us much to get things done. So we basically get
> stuck with Sesame2 or invest time in improving the Redland backend...

From my understanding the "only" problem with Sesame2 is deployment; feature-
and performance-wise it does everything we need, right?

> > >   2) Some users refuse to install Java (needed by Sesame2) for
> > > economic reasons (disk space on embedded devices)
> >
> > if your embedded device has a problem with Java, you'd likely not want
> > Akonadi/MySQL either.
>
> MySQL/PostgreSQL/SQLite... some kind of database that understands SQL at
> least ;)

Just look at the amount of database-specific code and fixes we needed for the
rather simple stuff we currently do with SQL. IMHO it's not an option to have
anything that goes more or less directly into an SQL query in the public API.

> > > or political ones (Why should I install Java on a C++ Desktop?)
> >
> > Same as with MySQL, the answer is very simple: It's the best/only
> > available option currently that actually does work.
>
> Yepp, but Nepomuk doesn't work right for this special use case...

What do you mean? Is there a conceptual problem in Nepomuk for this use case, or
is this just about (somehow solvable) technical issues?

> > >   3) The data stored by the Nepomuk engine can easily get out-of-sync
> > > with the Akonadi data since it is held in two places (Akonadi's
> > > MySQL database and the Nepomuk repository). That happens quite easily if
> > > you start a KDEPIM application outside a KDE desktop and therefore the
> > > nepomukservices are not running -> search engine data can't be updated
> > > via DBus
> >
> > That's something that is fixable I think and has to be fixed anyway.
>
> Ok
>
> > > So for Akonadi to be rock-solid we don't need a semantic search but a
> > > _fast_ and _reliable_ internal search!
> >
> > Well, if we could get a fast and reliable Nepomuk version in the near
> > future, that looks preferable to me to doing our own stuff there. If we
> > cannot count on this, however, doing our own thing probably is the only
> > option.
>
> Yes, the 'in the near future' gives me a headache...
> Sebastian, can you give some insight here? Do you have any plans to improve
> the installation/configuration of soprano with Sesame2 or choosing another
> backend?
>
> > >   1) We do not want to depend on external search engines to keep
> > > dependencies small and ensure a working system without additional
> > > configuration from the user => let's use the available search engine:
> > > MySQL
> >
> > Keep in mind that nowadays Akonadi works with PostgreSQL as well and
> > SQLite support is underway. So, relying on specific DB features can be
> > problematic.
>
> I just wanted to say 'Some SQL database', does not have to be MySQL...
>
> > > The feeder agents would now feed the data that should be searchable
> > > into this table. But be aware, only the basic data should be fed in
> > > here, no full text index etc... only the basic stuff!!!
> >
> > How does the feeder agent access this table? It is a separate process and
> > therefore has no direct access to the database.
>
> Could be done via DBus or ASAP, a thread inside the server would fill the
> table with the passed data then.
>
> > > A further advantage, we can fine-tune the indexes of that table to make
> > > the most common searches as fast as possible.
> >
> > Using SQL and exposing various internal implementation details that way
> > sounds like a really bad idea to me.
>
> Right, the user shouldn't have to write SQL there, but we could provide an
> abstract representation:
>
> struct SearchQuery {
>   QString typeIdentifier;
>   QString fieldIdentifier;
>   QString value;
>   ComparisonOperator op;
> };
>
> So we can map it to whatever we want on client side or server side.
>
> > During the last Akonadi meeting we thought about using a XESAM subset,
> > which is XML and therefore (hopefully) much easier to automatically
> > translate into other query languages (IMAP, LDAP, SQL, ...).
>
> Hmm, we should provide an object representation for it though, don't want
> people to write XML documents just to search a contact by name ;)

That's independent of the actual query language; we always need an API for it.
One advantage of XML might be that query translations could be done with XSL
though.
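
To illustrate what "an API independent of the query language" could look
like, here is a rough sketch built on the SearchQuery struct quoted above.
Everything beyond the four struct fields is an assumption made for the
example: std::string replaces QString to keep it self-contained, the
ComparisonOperator values, the XML element names (not actual XESAM) and the
search_index table are all made up:

```cpp
#include <cassert>
#include <string>

// Assumed operator set; the thread does not define ComparisonOperator.
enum class ComparisonOperator { Equals, Contains };

// Same fields as the struct proposed in the thread (std::string for QString).
struct SearchQuery {
    std::string typeIdentifier;
    std::string fieldIdentifier;
    std::string value;
    ComparisonOperator op;
};

// Escape the characters that are special in XML text and attribute values.
std::string escapeXml(const std::string &in)
{
    std::string out;
    for (char c : in) {
        switch (c) {
        case '&': out += "&amp;"; break;
        case '<': out += "&lt;"; break;
        case '>': out += "&gt;"; break;
        case '"': out += "&quot;"; break;
        default:  out += c;
        }
    }
    return out;
}

// Backend 1: an illustrative XML form (element names are hypothetical).
std::string toXml(const SearchQuery &q)
{
    const std::string opName =
        q.op == ComparisonOperator::Equals ? "equals" : "contains";
    return "<query type=\"" + escapeXml(q.typeIdentifier) + "\"><" + opName +
           " field=\"" + escapeXml(q.fieldIdentifier) + "\">" +
           escapeXml(q.value) + "</" + opName + "></query>";
}

// Backend 2: an SQL statement with placeholders only. The caller binds
// type, field and value separately (for Contains, value wrapped in '%'),
// so no user input is ever pasted into the statement text.
std::string toSqlTemplate(const SearchQuery &q)
{
    const std::string cmp =
        q.op == ComparisonOperator::Equals ? "= ?" : "LIKE ?";
    return "SELECT item_id FROM search_index "
           "WHERE type = ? AND field = ? AND value " + cmp;
}

// Small usage check: one query object, two backend representations.
void queryExample()
{
    const SearchQuery q{"contact", "name", "Volker", ComparisonOperator::Equals};
    assert(toXml(q) ==
           "<query type=\"contact\"><equals field=\"name\">Volker</equals></query>");
    assert(toSqlTemplate(q) ==
           "SELECT item_id FROM search_index "
           "WHERE type = ? AND field = ? AND value = ?");
}
```

The application only ever builds a SearchQuery; whether it ends up as XML,
SQL or something else stays an implementation detail behind the API.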

regards
Volker