[Kde-pim] Akonadi: single database design mistake?

Martin Steigerwald Martin at lichtvoll.de
Fri Dec 2 08:40:29 GMT 2011


On Thursday, 1 December 2011, Dmitry Torokhov wrote:
> Hi Martin,

Hi Dmitry,

> On Thu, Dec 01, 2011 at 06:50:23PM +0100, Martin Steigerwald wrote:
> > Hi Dmitry!
> > 
> > I rarely post here as a user of KDEPIM, but this I like to comment:
> > 
> > On Tuesday, 29 November 2011, Dmitry Torokhov wrote:
> > > > Another thing: If we ever come to the point where MySQL is the
> > > > bottleneck, the  current architecture should make it rather
> > > > simple to come up with an alternative, optimized architecture.
> > > > Personally, I just doubt that we are able to design a relational
> > > > database from scratch that will outperform MySQL so easily...
> > > 
> > > That is the main question: do we really want a monolithic database
> > > here.
> > 
> > At work we use a Zimbra Collaboration Suite server as our groupware
> > solution. It uses MySQL to store mail metadata, Lucene to provide a
> > search index and files to store the actual mail. It is serving about
> > 30 users all day via Zimbra webclient, outlook and various IMAP
> > clients - I partly use KMail with it.
> > 
> > Now consider this:
> > 
> > - about 100 folders including subfolders
> > - hundreds of thousands of mails
> > - several gigabytes of mail easily (do not see it in the webclient
> > and do not want to open up a VPN to look in the administrative
> > interface for the size right now)
> > - folders with tens of thousands of mails
> 
> Here at work we like to eat our dogfood and so we also converted almost
> entirely to Zimbra. So it is several thousand employees with so many
> mails, server farm running Zimbra that is partitioned properly, etc,
> etc. And it all scaled and partitioned nicely and if our infrastructure
> guys see that the load on one of the nodes gets too big they can bring
> in another node and so forth. So yes, MySQL apparently can handle that.
> I never said that MySQL is not suitable for large amounts of data.

You work at VMware?

Well you said:

> That is the main question: do we really want a monolithic database
> here.

And in your initial post:

> And this brings me to the question: is stuffing everything into a single
> database such a good idea?

And also hinted at the slow speed of KMail during import.
 
> But the point I was trying to make is that I do not want to replicate
> that setup on my tiny laptop. There is a reason I have IMAP - I
> _offload_ tasks from laptop to other boxes, such as receiving and
> sorting mail, anti-spam and anti-virus checks, etc, etc, so that the
> laptop only does fraction of work required. I do not want to fine-tune
> MySQL on laptop to make sure indices fit into memory, that the log size
> is appropriate, and so forth. And my question was - given that there
> is normally a single user (as in person) working with a single folder at a
> given time, would not it be more effective to restrict the size of the
> data we are working with to that single folder instead of trying to
> handle the data as whole.

But then this is a completely different argument IMHO.

Apart from the question of whether it can work fast when we stuff
everything into a database, your argument is: I have that intelligence on
the server, so why replicate it on the client?

Well, right now I use KMail with 5 POP accounts - some freemail and my main
POP account. Intelligence on the server is restricted to what Dovecot,
Postfix and policyd-weight can do. And I do not plan to use Zimbra for my
private mail. I have been thinking about converting to IMAP, but then: what
for? My current setup does what I need, except for fast full-text search
and having everything run in the background, and there I have high hopes
for Akonadi.

KMail is not only for corporate users, but also for personal users. And
while most of them might use IMAP already - although I read about quite a
few POP3-based setups on kdepim-users - not everyone has the amount of
intelligence on the server that Zimbra offers.

But even then I see benefits in Akonadi, such as better disconnected IMAP
with fast full-text search. That is basically what Zimbra Desktop does as
well: the Zimbra Desktop client is essentially a complete Zimbra server
with a web GUI, running on the desktop, that synchronizes all mail from the
server to the desktop. Quite handy for accessing mail when you are offline,
I would think.

> To be fair, after the pain of initial import and after running for a
> couple days, the system has settled down and is now usable.

And that's important. From what I read, I believe that there is *much*
room for optimization. But I do not see why a database-backed Akonadi
shouldn't be fast and lean.

Everybody and his dog is using a database these days: Digikam (SQLite3 by
default), Amarok (embedded MySQL by default), Firefox for bookmarks
(SQLite3). And these are just three examples on the client side. As long as
you do not put Firefox onto BTRFS, these three applications perform well. A
co-worker did have performance issues there, maybe due to a workload with
many fsync() calls; as far as I recall, BTRFS has recently received
optimizations for that.

Also, Nepomuk from KDE 4.7.2 seems to behave much more sanely than before.
It used to slow even this new ThinkPad T520 almost to a halt at times - I
am exaggerating a bit, I admit - but now it is barely noticeable. In
exchange it takes more time to index my home folder - more than it likely
would have taken had it not crashed on some file earlier - but searching
for stuff that's already indexed is really fast. And it's already a 956 MB
Virtuoso index.

So instead of whether to use a database or not - Akonadi could always be
changed to use a flat file backend, and even now it stores mails in files -
the more important questions in your case are:

1) Why have KMail/Akonadi been so slow during import and even days after 
it?

2) Why is Akonadi stuffing more into the database (about 700 MB) than the 
amount of local mail (283 MB - figures taken from your initial post)?

Because I agree that the database should not grow larger than the mails
themselves - at least in the IMAP or local mail case. And even with DIMAP,
large mails should not sit in the database IMHO. From what I read, the
KDEPIM developers would actually agree with that. So that's where to start
looking IMHO.
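
To make the "large mails should not sit in the database" idea concrete,
here is a rough sketch of the general pattern: metadata and small bodies go
into the database, larger bodies go into plain files. This is NOT Akonadi's
actual schema or code. The table layout, the function names and the
4096-byte cutoff are all made up for illustration, and I use SQLite from
Python only to keep the example short.

```python
import hashlib
import sqlite3
from pathlib import Path

# Hypothetical cutoff: bodies larger than this are stored as files.
SIZE_THRESHOLD = 4096  # bytes


def open_store(root: Path) -> sqlite3.Connection:
    """Create the (illustrative) metadata table and the payload directory."""
    root.mkdir(parents=True, exist_ok=True)
    (root / "payloads").mkdir(exist_ok=True)
    db = sqlite3.connect(str(root / "meta.db"))
    db.execute(
        "CREATE TABLE IF NOT EXISTS mail ("
        " id INTEGER PRIMARY KEY,"
        " folder TEXT NOT NULL,"
        " subject TEXT NOT NULL,"
        " size INTEGER NOT NULL,"
        " inline_body BLOB,"      # set only for small mails
        " payload_file TEXT)"     # set only for large mails
    )
    return db


def add_mail(db: sqlite3.Connection, root: Path,
             folder: str, subject: str, body: bytes) -> int:
    """Store metadata in the database; keep large bodies out of it."""
    if len(body) <= SIZE_THRESHOLD:
        inline, file_name = body, None
    else:
        # Name the file after the content hash (an arbitrary choice here).
        file_name = hashlib.sha256(body).hexdigest()
        (root / "payloads" / file_name).write_bytes(body)
        inline = None
    cur = db.execute(
        "INSERT INTO mail (folder, subject, size, inline_body, payload_file)"
        " VALUES (?, ?, ?, ?, ?)",
        (folder, subject, len(body), inline, file_name),
    )
    db.commit()
    return cur.lastrowid


def read_mail(db: sqlite3.Connection, root: Path, mail_id: int) -> bytes:
    """Return the body, from the database or from its payload file."""
    row = db.execute(
        "SELECT inline_body, payload_file FROM mail WHERE id = ?",
        (mail_id,),
    ).fetchone()
    if row is None:
        raise KeyError(mail_id)
    inline, file_name = row
    if file_name is not None:
        return (root / "payloads" / file_name).read_bytes()
    return bytes(inline)
```

With a layout like this the database only grows with the number of mails
and the small bodies, not with the total size of the mail store, which is
the property I would expect in the IMAP and local mail cases.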

And if you want a flat file backend for Akonadi - go write one. I even heard 
that there has been some activity regarding this in the past, but I do not 
know the details.

Thanks,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7
_______________________________________________
KDE PIM mailing list kde-pim at kde.org
https://mail.kde.org/mailman/listinfo/kde-pim
KDE PIM home page at http://pim.kde.org/