[Kde-pim] Akonadi architecture: a serious performance bottleneck

Andras Mantia amantia at kde.org
Wed Nov 30 20:55:40 GMT 2011


Hi,

 (writing this mail for the second time as KNode has no crash recovery...)

While looking at several scenarios that cause the maildir resource to 
use a lot of CPU and IO, I found a serious bottleneck that IMO can't be 
solved without new API. 
The issue might be known to some of you, so it is not big news. Still, my 
tests confirm that it is a big problem, and unfortunately I can't see any 
solution other than extending the API.

The problem is with mass operations, when multiple items are changed or 
removed (I didn't test copy/move/fetch, but they might have similar 
problems).
 I found that marking 18000 mails in maildir as read/unread is slow: there 
is heavy disk activity and the CPU is also used a lot. First I looked at 
maildir to see what I could optimize. For each flag change, maildir does 
two stats (QFile::exists) to find the file, one stat to check whether the 
new name already exists, and one QFile::rename (maildir encodes the flags 
in the filename). I could optimize out the two stats, but I saw that this 
doesn't really help. The reason is that it is not maildir itself that is 
busy, but the Akonadi libraries, namely the change recorder.
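
To illustrate why a flag change turns into a rename, here is a minimal 
sketch of the file name computation, assuming the usual maildir convention 
that flags live in a ":2,<flags>" suffix kept in ASCII order. The function 
name withFlag is hypothetical and this is not the code of the maildir 
resource.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical helper: returns fileName with `flag` present in the maildir
// info suffix (":2,<flags>"). If the flag is already set, the name is
// returned unchanged, so no rename would be needed.
std::string withFlag(const std::string &fileName, char flag)
{
    assert(flag >= 'A' && flag <= 'Z');  // standard maildir flags are uppercase letters

    const std::string marker = ":2,";
    const std::string::size_type pos = fileName.rfind(marker);

    std::string base = fileName;
    std::string flags;
    if (pos != std::string::npos) {
        base = fileName.substr(0, pos);
        flags = fileName.substr(pos + marker.size());
    }

    if (flags.find(flag) == std::string::npos) {
        flags.push_back(flag);
        std::sort(flags.begin(), flags.end());  // keep flags in ASCII order
    }
    return base + marker + flags;
}
```

Marking a mail as read then amounts to renaming the file from the old name 
to withFlag(oldName, 'S'), which is the single QFile::rename mentioned 
above.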
 What happens when a lot of files are marked with a flag is that the 
Akonadi server sends a change notification for each item. This goes to the 
change recorder, which delivers the changes one by one to the resource.
There are two problems with this:
- the delivery is very slow and the change recorder is quite busy while 
delivering. It also saves the changes to disk in a file (.dat file) 
and cleans that file as the changes are processed, resulting in heavy disk 
activity. Remember, we are talking about 18000 changes.
- the resource gets the items one by one and cannot do any optimization on 
the resource side. Imagine that the resource talks to a network server 
that supports both delete per item and delete for a bunch of items. With 
the current Akonadi architecture the resource would be forced to use the 
delete per item method, even if there is a more optimal way to perform the 
deletion. Same for maildir: it has to find the maildir folder every time 
(even if all the messages are in the same folder), and it has to disable 
file watching in the folder while the mail is deleted and re-enable it 
afterwards. If it got the items in one go (e.g. grouped by collection), 
then it could look for the folder only once and would need to 
disable/enable the file system watchers only once, not for every mail.
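
The grouping idea can be sketched as follows. This is only an illustration 
in plain C++: ItemChange, groupByCollection and countFolderSetups are 
hypothetical names, and the id types are stand-ins rather than the real 
Akonadi classes.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Stand-ins for Akonadi ids (assumption, not the real API).
using CollectionId = std::int64_t;
using ItemId = std::int64_t;

struct ItemChange {
    ItemId item;
    CollectionId collection;  // parent collection of the changed item
};

// Turn a flat stream of per-item notifications into one batch per
// collection, keeping the arrival order of the items inside each batch.
std::map<CollectionId, std::vector<ItemId>>
groupByCollection(const std::vector<ItemChange> &changes)
{
    std::map<CollectionId, std::vector<ItemId>> batches;
    for (const ItemChange &change : changes) {
        batches[change.collection].push_back(change.item);
    }
    return batches;
}

// How often the per-folder setup (folder lookup, watcher off/on) would run
// with per-item delivery versus per-collection delivery.
struct SetupCount {
    int perItem = 0;
    int perBatch = 0;
};

SetupCount countFolderSetups(const std::vector<ItemChange> &changes)
{
    SetupCount count;
    count.perItem = static_cast<int>(changes.size());
    count.perBatch = static_cast<int>(groupByCollection(changes).size());
    assert(count.perBatch <= count.perItem);
    return count;
}
```

For 18000 changes in a single folder, the per-folder setup would run once 
instead of 18000 times.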

Now, how serious is it? Here are my (not that scientific) tests. Again, 
maildir, 18000 mails in one folder, regular hard disk, Core2Quad 2.8 GHz, 
8 GB RAM. Nepomuk is disabled.

Test 1: maildir as we have it now in KDE 4.8, mark all the messages with a 
flag.
- time until KMail starts to update the GUI (unread count or important marks 
appear in the message list): 40-50 sec. During this time both akonadiserver 
and mysqld are busy. The time is probably spent in SQL queries.
- time until the changes are propagated to the file system (files are 
renamed): 47 minutes (!). During these 47 minutes akonadi_maildir_resource 
is using as much CPU as it can, and mysqld is also busy. The hard disk is 
used continuously, with the maildir resource at the top in iotop.

Test 2: maildir with some optimization: it caches the folder for items (both 
the maildir folder and whether they are in "new" or "cur"), so it avoids 
some recursion and some file system stats.
- time until KMail updates the GUI: 40-50 sec
- time until the changes are propagated to disk: 46 minutes. Almost nothing 
was improved; the difference is well within the measurement error (as 
I used the computer in the meantime, I don't have free hours just to watch 
maildir working ;) )

Test 3: I simply disabled the code in maildir that makes modifications on 
the file system; instead I cancel the modification task in the resource. 
This means the mails are marked in the Akonadi server, but the file names 
are untouched (well, the files are untouched).
- time until KMail updates the GUI: 42 sec (so similar).
- time until akonadi_maildir_resource stops using the CPU and the disk: 45 
minutes! Still. So it means that with resource code doing almost nothing 
the system is busy for 45 minutes, which is about 150 ms per item.
The only interesting thing here (that I don't quite understand) is that with 
file name changes disabled mysqld was NOT busy after the first minute.

Anyway, this clearly indicates that we need signals for mass processing. 
On the client side we already have Akonadi::ItemModifyJob and 
Akonadi::ItemDeleteJob taking a list of items.
The suggestions are:
- make the server group the items received per collection
- issue SQL commands in a grouped way (IIRC Volker told me something about 
this being already done)
- issue change notifications in a grouped way, so if there were 10 items 
changed in the collection with id 1, and 20 in the collection with id 2, 
only two signals go out, like "itemsChanged(Collection, Item::List)".
- add an Akonadi::AgentBase::ObserverV3 that has methods itemsChanged 
and itemsRemoved with Collection, Item::List pair arguments. Same for other 
actions if it makes sense.
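
A rough sketch of what such a batch observer could look like, in plain C++ 
so that it stands alone. ObserverV3 here only mirrors the proposal above 
and does not exist in Akonadi; Collection, Item, ItemList, deliverChanges 
and CountingResource are stand-ins invented for the illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Stand-ins for Akonadi::Collection and Akonadi::Item::List (assumptions).
struct Collection { std::int64_t id; };
struct Item { std::int64_t id; };
using ItemList = std::vector<Item>;

// Hypothetical batch observer named after the proposal; not an existing class.
class ObserverV3 {
public:
    virtual ~ObserverV3() = default;
    virtual void itemsChanged(const Collection &collection, const ItemList &items) = 0;
    virtual void itemsRemoved(const Collection &collection, const ItemList &items) = 0;
};

// Delivers one itemsChanged() call per collection instead of one per item.
void deliverChanges(ObserverV3 &observer,
                    const std::map<std::int64_t, ItemList> &changedByCollection)
{
    for (const auto &entry : changedByCollection) {
        assert(!entry.second.empty());  // an empty batch should never be queued
        observer.itemsChanged(Collection{entry.first}, entry.second);
    }
}

// Example resource that only records how many calls and items it received.
class CountingResource : public ObserverV3 {
public:
    int changedCalls = 0;
    int changedItems = 0;
    int removedCalls = 0;

    void itemsChanged(const Collection &, const ItemList &items) override
    {
        ++changedCalls;
        changedItems += static_cast<int>(items.size());
    }

    void itemsRemoved(const Collection &, const ItemList &) override
    {
        ++removedCalls;
    }
};
```

With 10 changed items in collection 1 and 20 in collection 2, 
deliverChanges() calls itemsChanged() twice and the resource sees all 30 
items.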

Now if you tell me this is not a common operation, here are two cases where 
you run into it:
- emptying the trash
- marking all mails as read, e.g. in a mailing list folder that you don't 
want to read (it got filled up while you were on vacation)

Comments are welcome. Code is welcome even more. :)

Andras
_______________________________________________
KDE PIM mailing list kde-pim at kde.org
https://mail.kde.org/mailman/listinfo/kde-pim
KDE PIM home page at http://pim.kde.org/


