[Kde-pim] Akonadi architecture: a serious performance bottleneck
Andras Mantia
amantia at kde.org
Wed Nov 30 20:55:40 GMT 2011
Hi,
(writing this mail for the second time as KNode has no crash recovery...)
While trying to look at several scenarios that cause the maildir resource to
use a lot of CPU and IO, I found something that is a serious bottleneck and
IMO can't be solved without new API.
The issues might be known to some of you, so this is not big news. Still, my
tests confirm that it is a big problem, and unfortunately I can't see any
solution other than extending the API.
The problem is with mass operations, when multiple items are changed or
removed (I didn't test copy/move/fetch, but they might have similar
problems).
I found that marking 18000 mails in maildir as read/unread is slow: there
is heavy disk activity and the CPU is also used a lot. First I looked at
maildir to see what I could optimize. For each flag change, maildir does two stats
(QFile::exists) to find the file, one stat to check the existence of a new
name and one QFile::rename (maildir encodes the flags in the filename). From
this I could optimize out the two stats, but I saw that this doesn't really
help. The reason it doesn't help is that it is not maildir itself that is
busy, but the Akonadi libraries, namely the change recorder.
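
To illustrate what "maildir encodes the flags in the filename" means, here is
a minimal sketch in Python. It assumes the standard maildir convention where
the flags follow a ":2," suffix in ASCII order; the function name with_flag
and the example file names are made up for illustration and are not the KDE
maildir code.

```python
def with_flag(filename: str, flag: str, enabled: bool) -> str:
    """Return the maildir file name with a single flag set or cleared.

    Assumes the usual "<unique>:2,<flags>" layout; a name without the
    ":2," suffix is treated as having no flags yet.
    """
    base, sep, info = filename.partition(":2,")
    flags = set(info) if sep else set()
    if enabled:
        flags.add(flag)
    else:
        flags.discard(flag)
    # Flags are stored in ASCII order after the ":2," marker.
    return base + ":2," + "".join(sorted(flags))


# Marking a mail as read ("S" = seen) means renaming the file:
old_name = "1322686540.R42.host:2,F"      # hypothetical file name
new_name = with_flag(old_name, "S", True)
print(new_name)  # 1322686540.R42.host:2,FS
```

So every flag change is one rename on disk, plus whatever stats are needed to
locate the file first.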
What happens when a lot of files are marked with a flag is that the
Akonadi server sends a change notification for each item. This goes to the
change recorder and that delivers the changes one by one to the resource.
There are two problems with this:
- the delivery is very slow and the changerecorder is quite busy while
delivering. It also saves the changes to the disk into a file (.dat file)
and cleans that file as the changes are processed, resulting in heavy disk
activity. Remember, we are talking about 18000 changes.
- the resource gets the items one by one and cannot do any optimization on
the resource side. Imagine that the resource talks to a network server
that supports both delete per item and delete for a bunch of items. With the
current Akonadi architecture the resource would be forced to use the
per-item delete method, even if there is a more optimal way to perform the
deletion. Same for maildir: it has to find the maildir folder every time
(even if all the messages are in the same folder), and it has to disable file
watching in the folder while the mail is deleted and re-enable it again.
If it got the items in one go (e.g. grouped by collection), then it
could look for the folder only once and would need to disable/enable the
file system watchers only once, not for every mail.
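
The second point can be sketched roughly as follows. This is only a toy model
in Python: FakeMaildirStore, delete_one_by_one and delete_grouped are
hypothetical names, and the store just counts the expensive steps instead of
touching the disk. It compares per-item delivery with delivery grouped by
collection.

```python
from collections import defaultdict


class FakeMaildirStore:
    """Hypothetical stand-in for a resource backend; only counts work."""

    def __init__(self):
        self.folder_lookups = 0
        self.watcher_toggles = 0
        self.deleted = []

    def find_folder(self, collection_id):
        self.folder_lookups += 1
        return f"/hypothetical/maildir/{collection_id}"

    def set_watching(self, folder, enabled):
        self.watcher_toggles += 1

    def delete_file(self, folder, item_id):
        self.deleted.append((folder, item_id))


def delete_one_by_one(store, items):
    """Per-item delivery: every item pays for lookup and watcher toggling."""
    for collection_id, item_id in items:
        folder = store.find_folder(collection_id)
        store.set_watching(folder, False)
        store.delete_file(folder, item_id)
        store.set_watching(folder, True)


def delete_grouped(store, items):
    """Batched delivery: lookup and watcher toggling happen once per collection."""
    by_collection = defaultdict(list)
    for collection_id, item_id in items:
        by_collection[collection_id].append(item_id)
    for collection_id, item_ids in by_collection.items():
        folder = store.find_folder(collection_id)
        store.set_watching(folder, False)
        try:
            for item_id in item_ids:
                store.delete_file(folder, item_id)
        finally:
            store.set_watching(folder, True)


items = [(1, item_id) for item_id in range(18000)]  # one folder, 18000 mails

slow, fast = FakeMaildirStore(), FakeMaildirStore()
delete_one_by_one(slow, items)
delete_grouped(fast, items)
print(slow.folder_lookups, slow.watcher_toggles)  # 18000 36000
print(fast.folder_lookups, fast.watcher_toggles)  # 1 2
```

Both variants delete the same 18000 files; only the bookkeeping around them
differs.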
Now, how serious is it? Here are my (not that scientific) tests. Again:
maildir, 18000 mails in one folder, regular hard disk, Core2Quad 2.8 GHz, 8 GB
RAM. Nepomuk is disabled.
Test 1: maildir as we have now in KDE 4.8, mark all the messages with a
flag.
- time until KMail starts to update the GUI (unread count or important marks
appear in the message list): 40-50 sec. During this time both Akonadiserver
and mysqld are busy. The time is probably spent in SQL queries.
- time until the changes are propagated to the file system (files are
renamed): 47 minutes (!). During these 47 minutes akonadi_maildir_resource is
using as much CPU as it can, and mysqld is also busy. The hard disk is
continuously in use, and the maildir resource is at the top in iotop.
Test 2: maildir with some optimization: it caches the folder for items (both
the maildir folder and whether they are in "new" or "cur"), so it avoids some
recursion and some file system stats.
- time until KMail updates the GUI: 40-50 sec
- time until the changes are propagated to disk: 46 minutes. Almost nothing
improved; the difference is well within the measurement error (as
I used the computer in the meantime, I don't have free hours just to watch
maildir working ;) )
Test 3: I simply disabled the code in maildir that makes modifications on
the file system, instead I cancel the modification task in the resource.
This means the mails are marked in the akonadi server, but the file names
are untouched (well, the files are untouched).
- time until KMail updates the GUI: 42 sec (so similar).
- time until akonadi_maildir_resource stops using the CPU and the disk: 45
minutes! Still. So even with resource code that does almost nothing,
the system is busy for 45 minutes.
The only interesting thing here (that I don't quite understand) is that with
file name changes disabled, mysqld was NOT busy after the first minute.
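
To put the three measurements in perspective, here is the per-item cost
worked out in Python. It uses only the numbers above (18000 items, and 47, 46
and 45 minutes); the helper name per_item_ms is made up, and given how rough
the timings are the results are only ballpark figures.

```python
ITEMS = 18000


def per_item_ms(minutes: float, items: int = ITEMS) -> float:
    """Average wall-clock time per item, in milliseconds."""
    return minutes * 60 * 1000 / items


for label, minutes in [("Test 1", 47), ("Test 2", 46), ("Test 3", 45)]:
    print(label, round(per_item_ms(minutes), 1), "ms per item")

# Share of the Test 1 time that remains when the resource does no
# file system work at all (Test 3):
overhead_share = 45 / 47
print(round(overhead_share * 100), "% of the time is not spent in maildir")
```

That is roughly 150 ms per changed item, nearly all of which remains even when
the file system changes are skipped.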
Anyway, this clearly indicates that we need signals for mass processing.
On the client side we already have Akonadi::ItemModifyJob and
Akonadi::ItemDeleteJob taking a list of items.
The suggestions are:
- make the server group the items received per collection
- issue SQL commands in a grouped way (IIRC Volker told me something about
this being already done)
- issue change notifications in a grouped way, so if 10 items were
changed in the collection with id 1, and 20 in the collection with id 2, only
two signals go out, like "itemsChanged(Collection, Item::List)".
- add an Akonadi::AgentBase::ObserverV3 that has itemsChanged and
itemsRemoved methods taking a (Collection, Item::List) pair of arguments. Same
for other actions if it makes sense.
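
A rough sketch of the grouped notification idea, again in Python and again
with made-up names: group_notifications, BatchObserver and dispatch are
illustrations of the proposal, not existing Akonadi API, and the proposed
ObserverV3 would of course be C++.

```python
def group_notifications(changes):
    """Coalesce (collection_id, item_id) pairs into one batch per collection.

    Collections keep the order in which they were first seen.
    """
    grouped = {}
    for collection_id, item_id in changes:
        grouped.setdefault(collection_id, []).append(item_id)
    return list(grouped.items())


class BatchObserver:
    """Hypothetical stand-in for the proposed ObserverV3 interface."""

    def __init__(self):
        self.calls = []

    def items_changed(self, collection_id, item_ids):
        # A real resource would do its work here, once per collection.
        self.calls.append((collection_id, len(item_ids)))


def dispatch(changes, observer):
    """Deliver one items_changed call per collection instead of one per item."""
    for collection_id, item_ids in group_notifications(changes):
        observer.items_changed(collection_id, item_ids)


# 10 changed items in collection 1 and 20 in collection 2, interleaved.
changes = [(1, i) for i in range(10)] + [(2, 100 + i) for i in range(20)]
changes.sort(key=lambda change: change[1] % 10)

observer = BatchObserver()
dispatch(changes, observer)
print(observer.calls)  # [(1, 10), (2, 20)]
```

Thirty per-item changes turn into two calls, which is the "only two signals go
out" case described above.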
Now if you tell me this is not a common operation, I can give you two cases
where you run into it:
- emptying the trash
- marking all mails as read, e.g. in a mailing list folder that you don't want
to read (it got filled up while you were on vacation)
Comments are welcome. Code is welcome even more. :)
Andras