[Kde-pim] Kmail2: Akonadi memory requirements

Martin Steigerwald Martin at lichtvoll.de
Fri Sep 10 11:07:53 BST 2010


On Friday 10 September 2010, Volker Krause wrote:
> On Thursday 09 September 2010 09:56:09 Stephan Mueller wrote:
> > Storage space of akonadi
> > ========================
> > 
> > I probably have 3000+ emails in my spool where some of them are
> > sizeable 10+ MB. I was a bit irritated when I looked at the size of
> > the mysql database files: more than 800 MB!
> > 
> > I started to dig into the database by using the akonadiconsole tool.
> > I saw that the ENTIRE mail bodies are stored there! I am really
> > unsure why there is a need to really duplicate the email storage:
> > once in the maildir folders and once in mysql.
> 
> It's not supposed to store content bigger than 4k in the database by
> default, what version of the Akonadi server are you using? What
> settings do you have in ~/.config/akonadi/akonadiserverrc, section
> [%General], keys SizeThreshold and ExternalPayload?
> 
> You should have Akonadi server >= 1.4.0 with SizeThreshold=4096 and
> ExternalPayload=true. This will not fix existing "damage" though, it
> will only affect new or modified content.
> 
> With the settings fixed here, many of the following issues should
> become far less problematic. After all the same setup works fine on an
> N900 phone, not perfectly smooth yet, but still far better than in
> your horror story ;)
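
To check the two keys mentioned above without opening the file by hand, a
minimal sketch along these lines could work. It assumes that
akonadiserverrc is a plain INI-style file that Python's configparser can
parse; the function name and the sample text are made up for illustration
and are not part of Akonadi.

```python
import configparser

# Sample content modelled on the settings quoted above (illustrative only).
SAMPLE_RC = """\
[%General]
SizeThreshold=4096
ExternalPayload=true
"""


def read_payload_settings(rc_text):
    """Return (size_threshold, external_payload) parsed from rc_text.

    Either value is None when the key or the [%General] section is missing.
    """
    # interpolation=None: treat '%' in values literally.
    # strict=False: tolerate duplicate keys instead of raising.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep the original key case
    parser.read_string(rc_text)

    section = parser["%General"] if parser.has_section("%General") else {}
    raw_threshold = section.get("SizeThreshold")
    raw_external = section.get("ExternalPayload")

    threshold = int(raw_threshold) if raw_threshold is not None else None
    external = (
        raw_external.strip().lower() == "true"
        if raw_external is not None
        else None
    )
    return threshold, external


if __name__ == "__main__":
    print(read_payload_settings(SAMPLE_RC))
```

For a real check, the text would come from reading
~/.config/akonadi/akonadiserverrc instead of SAMPLE_RC.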

I read the linux-kernel mailing list and lots of other mailing lists, with
about 250000 mails in total in the active set. Many of them should be
smaller than 4 KiB and thus would land in the MySQL database. Another
just under 600000 mails are archived as mbox folders. This scales with
KMail 1 as long as I take care not to have more than about 40000 mails in
one folder, except for archive folders, where I do not care that much
because I access them rarely. With the new folder view and thread grouping
even that already feels sluggish, because thread groups are rebuilt every
time I visit a folder, but once they are built, it works okay. This was
faster with the old folder view. But I like the new one quite a lot and
hope its performance will be improved. I hope that this thread grouping
will be done by Akonadi, so that KMail merely has to update the GUI as
grouping continues.
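
To get a rough idea of how many mails would fall under the 4 KiB
threshold, something like the following sketch could be run over a maildir
tree, where each mail is one file. It rests on two assumptions of mine:
that the on-disk file size is a fair stand-in for the payload size Akonadi
compares against, and that the comparison is a strict "smaller than". The
function name is hypothetical. It does not apply to mbox folders, where
many mails share one file.

```python
import os


def split_by_threshold(root, threshold=4096):
    """Count files under root that are below / at-or-above threshold bytes.

    Returns a tuple (below, at_or_above).
    """
    below = 0
    at_or_above = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                # File vanished or is unreadable; skip it.
                continue
            if size < threshold:  # assumption: strict comparison
                below += 1
            else:
                at_or_above += 1
    return below, at_or_above
```

Calling split_by_threshold on the top-level mail directory gives the
number of mails that would presumably end up inside the database versus
those stored externally.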

But why is content stored in the MySQL database at all? Doesn't KMail 2
use Nepomuk for search? Shouldn't mail content be indexed by Nepomuk only?
At work we use Zimbra, which also uses a MySQL database for metadata. But
to my knowledge it doesn't store the mail contents in it; mails are still
stored as files. For searching it uses a Lucene index. All of this is
blazingly fast and even scales with my current linux-kernel mailing list
folder with about 90000 mails in it (at work I haven't archived for quite
some time).

Is it possible to set "SizeThreshold" to zero and what consequences would 
this have?

I have not tried KMail 2 myself yet, but I know that for me it would have
to deal with the same amount of mail that KMail 1 is able to handle. For
my usage patterns I need a massively, if not insanely, scalable MUA
regarding the total number of mails, the number of mails per folder and
the number of folders.

And what about:

> Another issue is that it seems this database only grows and never 
> shrinks. I.e. when I delete an email (i.e. remove it from trash or 
> delete it without moving it to trash), the database does not get 
> smaller.

I was concerned about this with Nepomuk already. I excluded some scary
directories, like the kernel collection for my ThinkPad T42, which held
several kernel source trees, and my complete ~/Mail, as I believed Nepomuk
will receive mails for indexing via Akonadi in the future, and since I
wanted to see Nepomuk finish a complete scan of my home directory at least
*once*. Nepomuk did reduce the number of files in the index, but Virtuoso
didn't shrink the database. With further indexing it grew even bigger: it
was 2.2 GiB when I removed the kernel tree directory, and now it is
2.8 GiB. It didn't seem that Virtuoso reused old entries.
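
Whether a database shrinks after deleting entries can be checked by
measuring its directory before and after. A small sketch for that, with
function names made up for illustration; the location of the database
files depends on the installation, so the directory is passed in as a
parameter rather than assumed:

```python
import os


def dir_size_bytes(root):
    """Sum the sizes of all regular files below root, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue
            try:
                total += os.path.getsize(path)
            except OSError:
                # File vanished or is unreadable; skip it.
                continue
    return total


def to_gib(num_bytes):
    """Convert a byte count to GiB (1 GiB = 2**30 bytes)."""
    return num_bytes / 2**30
```

Taking one measurement before deleting mails or excluding directories and
another one afterwards shows whether the space is given back to the file
system.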

Even with today's hard disk sizes I think keeping storage requirements low
is important, especially as SSDs aren't that big yet. With the 160 GB
hard disk I had before I bought the largest 2.5 inch PATA disk from
Western Digital, with 320 GiB, I didn't activate desktop search due to
storage constraints.

So will there be an easy way to optimize databases and clean out stale 
entries?

I think all of this should be thought through before a stable release.

-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7