Ideas for improving sorting. Goal: under 1 second for 200.000 files.

Mark markg85 at gmail.com
Fri Oct 26 20:48:10 BST 2012


On Fri, Oct 26, 2012 at 9:40 PM, Bernd Buschinski
<b.buschinski at googlemail.com> wrote:
> On Friday 26 October 2012 21:21:39 Mark wrote:
>>
>> Oke, i got "something" working now but i kinda miss some more
>> knowledge in the sorting area.
>>
>> First the current implementation. I went for a structure like this:
>> struct KFileItem
>> {
>>     QString collatingSequence;
>>     QString filename;
>>     ....
>> };
>>
>> Note: this is just in a clean new (one file) project so don't bother the
>> names.
>>
>> filename is obviously the filename. collatingSequence is a copy of
>> filename. In that collatingSequence i'm tweaking the unicode values
>> and lower them for numbers. So the intention is to have numbers shown
>> before characters.
>>
>> So:
>> 000.txt (in unicode: 48, 48, 48, 46, 116, 120, 116)
>> a.txt (in unicode: 97, 46, 116, 120, 116)
>> ...
>>
>> And i tweaked that to look like this: (1 + num)
>> 000.txt (in unicode: 1, 1, 1, 46, 116, 120, 116)
>> a.txt (in unicode: 97, 46, 116, 120, 116)
>> ...
>>
>>
>> That part "seems" to be oke in theory but i'm getting stuck when
>> sorting. For this "proof of concept" i simply use qSort like so:
>> QList<KFileItem> fileList;
>>
>> ... add a bunch of items in fileList ...
>>
>> qSort(fileList.begin(), fileList.end(), lessThan);
>>
>> lessThan is where i'm stuck. I simply don't understand yet how
>> lessThan should work. What conditions i should do and when i should
>> return false or true.. My current version looks like this:
>> bool intSort(const KFileItem& s1, const KFileItem& s2)
>> {
>>     const QChar* sOne = s1.collatingSequence.unicode();
>>     const QChar* sTwo = s2.collatingSequence.unicode();
>>
>>     while(!sOne->isNull() && !sTwo->isNull())
>>     {
>>         if(sTwo->unicode() < sOne->unicode())
>>         {
>>             return false;
>>         }
>>
>>         ++sOne;
>>         ++sTwo;
>>     }
>>
>>     return true;
>> }
>>
>> Which compiles and runs, but does a very strange sorting :p
>> If someone can explain what the intended implementation is for a lessThan..?
>>
>> Kind regards,
>> Mark
>
> isn't the if wrong? shouldn't this be "if a is less than b"?
>
> so
>         if(sTwo->unicode() < sOne->unicode())
>         {
>             return false;
>         }
>
> should be
>
>         if(sOne->unicode() < sTwo->unicode())
>         {
>             return false;
>         }
>
> or? (untested)
>
>
>
> all:
>
> bool intSort(const KFileItem& s1, const KFileItem& s2)
> {
>     const QChar* sOne = s1.collatingSequence.unicode();
>     const QChar* sTwo = s2.collatingSequence.unicode();
>
>     while(!sOne->isNull() && !sTwo->isNull())
>     {
>         if(sOne->unicode() < sTwo->unicode())
>         {
>             return false;
>         }
>
>         ++sOne;
>         ++sTwo;
>     }
>
>     return true;
> }
>
>
> Best regards
> Bernd
>

Nope, i tried that but it doesn't improve sorting.
Here is the VERY HACKY single file project.
http://paste.kde.org/582536/

Make an empty Qt console project and paste this as the main.cpp and it
should just run. There is quite a bit of debug info in there and quite
a lot more commented out.




More information about the kfm-devel mailing list