Ideas for improving sorting. Goal: under 1 second for 200.000 files.

Frank Reininghaus frank78ac at googlemail.com
Wed Oct 24 08:57:00 BST 2012


Hi Christoph,

thanks for sharing your ideas!

2012/10/24 Christoph Feck:
> On Wednesday 24 October 2012 01:13:40 Christoph Feck wrote:
>> Speed of comparing the sequences is that of a memcpy()
>
> Of course I meant memcmp().
>
>> So what you have to work on before doing anything else, is to ask
>> Frank, if he is okey with adding a sort key (QByteArray basically)
>> for each file item. Your code would have to make sure it is always
>> updated on file renames, locale changes, sort mode changes
>> (natural vs. simple) etc.
>
> Another idea is to not store this side information "permanently", but
> only create it temporarily when sorting. So you effectively only sort
> a list with side information.
>
>         struct SortEntry
>         {
>                 QByteArray collatingKey;
>                 KFileItem *fileItem;
>         };

That approach looks like it might be worth considering.

I think that the function that generates the collatingKey should be in
the same place as the naturalCompare() function, i.e., in
KStringHandler in kdelibs. Ideally, both functions would also share
some code - if a bug is found in the comparison algorithm, we don't
want to fix it twice ;-) One approach would be to make
KStringHandler::naturalCompare() a thin wrapper around the
collatingKey generator and a straightforward comparison. However, this
might slow down other users of KStringHandler::naturalCompare() in
some situations.

Best regards,
Frank




More information about the kfm-devel mailing list