[kmail2] [Bug 301470] Threading very slow
Bernd Paysan
bernd.paysan at gmx.de
Sat Sep 12 00:19:41 BST 2015
https://bugs.kde.org/show_bug.cgi?id=301470
--- Comment #4 from Bernd Paysan <bernd.paysan at gmx.de> ---
Am Freitag, 11. September 2015, 21:36:26 schrieb Martin Steigerwald:
> https://bugs.kde.org/show_bug.cgi?id=301470
>
> --- Comment #3 from Martin Steigerwald <Martin at Lichtvoll.de> ---
> Note that it will take till December for 15.12 to be released. Current git
> master, buildable with kdesrc-build has quite some improvements already. But
> of course you can also wait for 15.12 to appear in your distro.
Due to the slow speed, I've split up my folders; the largest one has 3500
deeply threaded messages in it; at the moment (with a 3GHz Core i7, Kmail
4.14.9), it takes 6 seconds to completely thread in interactive mode, and
4.something in batch mode - you can expect that a slow machine with an Atom or
so will take a minute or more for that task, which is certainly not tolerable.
That's the only folder I've left which is somewhat slow; it used to be worse
than that. If you want to get some better feedback, I suggest you add a stop
timer while threading and display the final timing after "Ready", bottom left.
I've some mailing lists which accumulate deeply threaded messages relatively
quickly and will let them grow until I get 15.12. If you want to get a folder
full with lots of deeply threaded messages, just subscribe to the Linux kernel
mailing list ;-).
I suspect the current threading code in 4.14.9 is at least O(n²) or worse,
because all folders somewhat below 2000 messages are threaded fast enough. If
you need help with an O(n) algorithm for threading messages, I propose the
following - requires only two linear walk through the list of messages:
First walk: Build a hash message id->message, so you can access each message
by its id. That walk you need to do only once when loading the messages from
the backend.
Second walk: Go through the list of referenced message ids, and if you find a
message, append it in the "nested messages" array/vector of that message
object. If there's no reference found, append it to the top-level
array/vector. Choose a data structure where append is an O(1) function.
Next: Sort the arrays from bottom (1st level messages) with the selected sort
order, and walk the threading tree downwards to sort the sub-lists. This is
O(n log(n)). To make it easier for displaying, in the return path of the tree
walk, calculate the number of messages in each sub-thread, and annotate them
with their absolute position in in the message list and their nesting depth
(that information is easily available in a tree walk).
The last walk is then ready to display.
IMHO it should at most take some hundred microseconds to thread a few thousand
messages that way, where the headers are all in memory, and selection of the
right data structures.
--
You are receiving this mail because:
You are the assignee for the bug.
More information about the Kdepim-bugs
mailing list