Ideas for improving sorting. Goal: under 1 second for 200.000 files.
Christoph Feck
christoph at maxiom.de
Wed Oct 24 00:13:40 BST 2012
Hi,
you wrote:
> Copy KStringHandler::naturalCompare into dolphin for further in
> depth optimizations
A good read about locale-aware sorting is this link from glibc doc:
http://www.gnu.org/software/libc/manual/html_node/Collation-
Functions.html
Basically, it says to get an effective locale-aware sorting, you need
to cache a sort key ("collating sequence") for each string, then use
that for the sort algorithm. That sort key could additionally encode
the digit sequences in "magnitude - digits" order, so that "a12" comes
after "a2", and do case mapping for case insensitive compares.
If you sort N = 200.000 entries, you only have to create 200.000
collating sequences, instead of 2*N*log(N) = 7.000.000. Speed of
comparing the sequences is that of a memcpy() if done properly. See
Thiago's posts about QString performance optimizations.
So what you have to work on before doing anything else, is to ask
Frank, if he is okey with adding a sort key (QByteArray basically) for
each file item. Your code would have to make sure it is always updated
on file renames, locale changes, sort mode changes (natural vs.
simple) etc.
For more information about collating, I already gave you the link for
the official Unicode collating code. You would need to improve the
algorithm to include the "magnitude - digits" coding.
Christoph Feck (kdepepo)
KDE Quality Team
More information about the kfm-devel
mailing list