[Kde-pim] Akonadi architecture: a serious performance bottleneck

Volker Krause vkrause at kde.org
Fri Dec 16 09:49:18 GMT 2011


Hi,

thanks for looking into this, one of the many things on my looong kdepim todo 
list :)

On Wednesday 30 November 2011 22:55:40 Andras Mantia wrote:
> While trying to look at several scenarios that cause the maildir resource to
> use a lot of CPU and IO, I found something that is a serious bottleneck and
> IMO can't be solved without new API.
> The issues might be known to some of you, so this is not big news, but my
> tests confirm that it is a big problem and that unfortunately I can't see
> any solution other than extending the API.
> 
> The problem is with mass operations, when multiple items are changed or
> removed (I didn't test copy/move/fetch, but they might have similar
> problems).
>  I found that marking 18000 mails in maildir as read/unread is slow: there
> is heavy disk activity and the CPU is also used a lot. First I looked at
> maildir to see how it could be optimized. For each flag change, maildir does two stats
> (QFile::exists) to find the file, one stat to check the existence of a new
> name and one QFile::rename (maildir encodes the flags in the filename). From
> this I could optimize out the two stats, but I saw that this doesn't really
> help. The reason it doesn't help is that it is not maildir itself that is
> busy, but the Akonadi libraries, namely the change recorder.
>  What happens when a lot of items are marked with a flag is that the
> Akonadi server sends a change notification for each item. This goes to the
> change recorder and that delivers the changes one by one to the resource.
> There are two problems with this:
> - the delivery is very slow and the change recorder is quite busy while
> delivering. It also saves the changes to disk in a file (a .dat file)
> and cleans that file as the changes are processed, resulting in heavy disk
> activity. Remember, we are talking about 18000 changes.

From previous measurements (and optimizations) by Tobias, David and myself I 
somehow suspect that this is the real problem (at least in the maildir case 
you describe, as the backend (the file system) doesn't have mass-modify 
commands).

This code is optimized for data consistency rather than performance and writes 
the replay queue back to disk after each modification. This would allow some 
obvious optimizations for common use cases, such as only writing the index of 
the next entry instead of a complete new file (which can get quite large), and 
only rewriting the full file once in a while. Pretty much the same consistency 
level, but much better I/O scaling.

Optimizing this would not require API changes and could even be done in a 4.8 
patch level release. Also, this problem would still be around with mass-modify 
change notifications, so we have to do it anyway I guess. Thus this looks like 
the best place to start.

> - the resource gets the items one by one and cannot do any optimization on
> the resource side. Imagine that the resource talks to a network server
> that supports deletion per item or deletion of a bunch of items. With the
> current Akonadi architecture the resource would be forced to use the
> per-item delete method, even if there is a more optimal way to perform the
> deletion. Same for maildir: it has to find the maildir folder every time
> (even if all the messages are in the same folder), and it has to disable
> file watching in the folder while a mail is deleted and re-enable it again.
> If it got the items in one go (e.g. grouped by collection), it could look
> up the folder only once and would need to disable/enable the file system
> watchers only once, not for every mail.
> 
> Anyway, this clearly indicates that we need signals for mass processing.
> On the client side we already have Akonadi::ItemModifyJob and
> Akonadi::ItemDeleteJob taking a list of items.
> The suggestions are:
> - make the server group the items received per collection
> - issue SQL commands in a grouped way (IIRC Volker told me something about
> this being already done)

right, we have bulk versions for most commands by now.

> - issue change notifications in a grouped way, so if 10 items were changed
> in the collection with id 1, and 20 in the collection with id 2, only two
> signals go out, like "itemsChanged(Collection, Item::List)".

For deletion it's straightforward, but changing items is slightly more 
complicated. To use mass-modify commands on the backend efficiently, you need 
to know what exactly has changed, and you need the items grouped by that 
change. IMAP, for example, can do mass-modification of flag changes ("mark all 
as read") if it's exactly the same change for all items. Right now we already 
provide information on which property changed (e.g. "flags"), but not how it 
changed ("+ \Seen").

We also need safe fallbacks to the single-item path, of course, but that 
shouldn't be too hard: a mass-change notification can be decomposed into 
single ones (thus keeping the same consistency guarantees on the replay 
queue).

> - add an Akonadi::AgentBase::ObserverV3 that has methods itemsChanged
> and itemsRemoved with Collection, Item::List pair arguments. Same for other
> actions where it makes sense.

btw, with the Qt5/KF5 progress we will also get a BIC opportunity in the not 
so distant future to clean up the ObserverVx hacks.

> Now if you tell me this is not a common operation, here are two cases where
> you run into it:
> - emptying the trash
> - marking all mails as read, e.g. in a mailing list folder that you don't
> want to read (it filled up while you were on vacation)
> 
> Comments are welcome. Code is welcome even more. :)

IIRC Stephen looked into this already for model optimizations, which is easier 
as it doesn't require information about what actually changed.

regards,
Volker