[Kde-pim] [Discussion] Search in Akonadi

Tobias Koenig tokoe at kde.org
Thu Aug 20 11:46:05 BST 2009


On Thu, Aug 20, 2009 at 11:59:45AM +0200, Volker Krause wrote:
> Hi,
Hej,

> >   1) Loading everything into a model, iterating over the model and filtering out
> > everything you don't need -> performance and memory problems
> 
> sure, but not worse than before. Which means it could be an acceptable 
> intermediate solution until the search problem has been solved for real.
Right, but that could be a real performance problem, as Akonadi is supposed
to handle more data than the previous libs.
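
Just to spell out what variant 1 means in code (the source model stands for
whatever exposes the complete item set, the filtering happens purely on the client):

#include <QAbstractItemModel>
#include <QSortFilterProxyModel>

// Variant 1: the complete data set lives in a client-side model and a proxy
// iterates over it to filter; every item has to be transferred and kept in memory.
QSortFilterProxyModel *createClientSideSearch( QAbstractItemModel *allItems,
                                               const QString &pattern )
{
  QSortFilterProxyModel *proxy = new QSortFilterProxyModel( allItems );
  proxy->setSourceModel( allItems );
  proxy->setFilterFixedString( pattern );   // compared against every single row
  return proxy;
}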

> >   2) Having the search implemented as a separate engine, which returns only
> > the Akonadi UIDs of the items that match -> sounds perfect
> 
> Yep, although I don't really like the UID-list interface. As a developer you 
> don't need UIDs, you need Items. So, I'd rather suggest an interface similar 
> to ItemFetchJob which can be configured using ItemFetchScope and returns as 
> much payload data as you need for your current task. This also saves you some 
> additional roundtrips to the Akonadi server.
Right, that would be an additional extension.
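
For reference, this is how the existing fetch interface is used today; a search
job could offer the same fetchScope() configuration and deliver Items instead of
bare UIDs (the result slot below is made up, the rest is the current API):

#include <akonadi/collection.h>
#include <akonadi/itemfetchjob.h>
#include <akonadi/itemfetchscope.h>

using namespace Akonadi;

// Request only the payload the current task needs; a search job could mirror this.
void fetchWithPayload( const Collection &collection, QObject *receiver )
{
  ItemFetchJob *job = new ItemFetchJob( collection, receiver );
  job->fetchScope().fetchFullPayload();
  QObject::connect( job, SIGNAL( result( KJob* ) ),
                    receiver, SLOT( itemsFetched( KJob* ) ) );  // itemsFetched() is assumed
}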

> >   1) If users don't have a working Nepomuk installation with the Sesame2
> > backend, they will have either a damn slow search (if the Redland
> > backend is used) or no working search at all -> half of Akonadi doesn't
> > work
> 
> I agree that the setup problems have to be taken seriously; we still have some 
> for the database part. OTOH nothing will change there if we don't push it. 
> Sure, that's painful for everyone involved but it will eventually get us 
> there. At least this time we are not alone with this problem, as you could 
> see in a recent k-c-d thread plasma is considering making Nepomuk mandatory 
> as well.
The question is when...? As far as I understand, Sesame2 is currently the only useful
backend. The current state of Virtuoso doesn't encourage me to believe that they
will help us much to get things done. So we are basically stuck with Sesame2, or we
invest time in improving the Redland backend...

> >   2) Some users refuse to install Java (needed by Sesame2) for
> > economic reasons (disk space on embedded devices)
> 
> if your embedded device has a problem with Java, you'd likely not want 
> Akonadi/MySQL either.
MySQL/PostgreSQL/SQLite... some kind of database that understands SQL at least ;)

> > or political ones (Why should I install Java on a C++ Desktop?)
> 
> Same as with MySQL, the answer is very simple: It's the best/only available 
> option currently that actually does work.
Yep, but Nepomuk doesn't work well for this particular use case...

> >   3) The data stored by the Nepomuk engine can easily get out-of-sync with
> > the Akonadi data, since they are held in two places (Akonadi's MySQL
> > database and the Nepomuk repository). That happens quite easily if you start a
> > KDEPIM application outside a KDE desktop and therefore the Nepomuk services
> > are not running -> search engine data can't be updated via DBus
> 
> That's something that is fixable I think and has to be fixed anyway.
Ok

> > So to make Akonadi work rock stable we don't need a semantic search, but a
> > _fast_ and _reliable_ internal search!
> 
> Well, if we could get a fast and reliable Nepomuk version in the near future, 
> that looks preferable to me over doing our own stuff there. If we cannot 
> count on that, however, it probably is the only option.
Yes, the 'in the near future' part gives me a headache...
Sebastian, can you give some insight here? Do you have any plans to improve the
installation/configuration of Soprano with Sesame2, or to choose another backend?

> >   1) We do not want to depend on external search engines to keep
> > dependencies small and ensure a working system without additional
> > configuration from the user => let's use the available search engine: MySQL
> 
> Keep in mind that nowadays Akonadi works with PostgreSQL as well and SQLite 
> support is underway. So, relying on specific DB features can be problematic.
I just meant 'some SQL database'; it does not have to be MySQL...

> > The feeder agents would now feed the data that should be searchable
> > into this table. But be aware, only the basic data should be fed in here,
> > no full-text index etc... only the basic stuff!!!
> 
> How does the feeder agent access this table? It is a separate process and 
> therefore has no direct access to the database.
That could be done via D-Bus or ASAP; a thread inside the server would then fill the
table with the passed data.
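
Roughly like this on the feeder side (service path, interface and method names are
invented here, only the idea counts):

#include <QtDBus/QDBusInterface>

// Feeder side of the D-Bus variant: hand the extracted basic fields to the
// Akonadi server, which fills the search table from a worker thread.
void indexBasicData( qint64 itemId, const QString &type,
                     const QString &field, const QString &value )
{
  QDBusInterface indexer( QLatin1String( "org.freedesktop.Akonadi" ),
                          QLatin1String( "/SearchIndexer" ),                        // made up
                          QLatin1String( "org.freedesktop.Akonadi.SearchIndexer" ) ); // made up
  indexer.asyncCall( QLatin1String( "index" ), itemId, type, field, value );
}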

> > A further advantage, we can fine-tune the indexes of that table to make the
> > most common searches as fast as possible.
> 
> Using SQL and exposing various internal implementation details that way sounds 
> like a really bad idea to me.
Right, the user shouldn't have to write SQL there, but we could provide an abstract
representation:

struct SearchQuery {
  QString typeIdentifier;   // e.g. the item's MIME type
  QString fieldIdentifier;  // e.g. "name"
  QString value;            // the value to compare against
  ComparisonOperator op;    // equals, less than, like, ...
};

So we can map it to whatever we want on the client side or server side.
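
On the server side that could then be turned into SQL along these lines (table and
column names are made up, and a real implementation would of course use prepared
statements instead of string substitution):

// Server-side mapping sketch: the abstract query becomes one WHERE clause.
QString toSql( const SearchQuery &query )
{
  // assumes the enum values are Equals, NotEquals, Less, Greater, Like in this order
  static const char *sqlOp[] = { "=", "<>", "<", ">", "LIKE" };
  return QString::fromLatin1(
           "SELECT itemId FROM SearchEntries "
           "WHERE itemType = '%1' AND field = '%2' AND value %3 '%4'" )
         .arg( query.typeIdentifier, query.fieldIdentifier,
               QString::fromLatin1( sqlOp[ query.op ] ), query.value );
}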

> During the last Akonadi meeting we thought about using a XESAM subset, which 
> is XML and therefore (hopefully) much easier to automatically translate into 
> other query languages (IMAP, LDAP, SQL, ...).
Hmm, we should provide an object representation for it though; we don't want people
to write XML documents just to search for a contact by name ;)
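
I'm thinking of a thin wrapper that generates the XML internally, roughly like this
(the element names are only my guess at what the XESAM subset might look like):

#include <QXmlStreamWriter>

// Possible object-to-XML bridge: the application only sees SearchQuery,
// the XESAM-style document is generated behind the scenes.
QString toXesamXml( const SearchQuery &query )
{
  QString xml;
  QXmlStreamWriter writer( &xml );
  writer.writeStartElement( QLatin1String( "request" ) );
  writer.writeStartElement( QLatin1String( "query" ) );
  writer.writeStartElement( QLatin1String( "equals" ) );   // assuming op == Equals
  writer.writeEmptyElement( QLatin1String( "field" ) );
  writer.writeAttribute( QLatin1String( "name" ), query.fieldIdentifier );
  writer.writeTextElement( QLatin1String( "string" ), query.value );
  writer.writeEndElement();  // equals
  writer.writeEndElement();  // query
  writer.writeEndElement();  // request
  return xml;
}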

Thanks for your comments and input!

Ciao,
Tobias
-- 
Separate politics from religion and economy!