Ideas for improving sorting. Goal: under 1 second for 200.000 files.

Bernd Buschinski b.buschinski at googlemail.com
Fri Oct 26 20:40:36 BST 2012


On Friday 26 October 2012 21:21:39 Mark wrote:
> 
> Oke, i got "something" working now but i kinda miss some more
> knowledge in the sorting area.
> 
> First the current implementation. I went for a structure like this:
> struct KFileItem
> {
>     QString collatingSequence;
>     QString filename;
>     ....
> };
> 
> Note: this is just in a clean new (one file) project so don't bother the
> names.
> 
> filename is obviously the filename. collatingSequence is a copy of
> filename. In that collatingSequence i'm tweaking the unicode values
> and lower them for numbers. So the intention is to have numbers shown
> before characters.
> 
> So:
> 000.txt (in unicode: 48, 48, 48, 46, 116, 120, 116)
> a.txt (in unicode: 97, 46, 116, 120, 116)
> ...
> 
> And i tweaked that to look like this: (1 + num)
> 000.txt (in unicode: 1, 1, 1, 46, 116, 120, 116)
> a.txt (in unicode: 97, 46, 116, 120, 116)
> ...
> 
> 
> That part "seems" to be oke in theory but i'm getting stuck when
> sorting. For this "proof of concept" i simply use qSort like so:
> QList<KFileItem> fileList;
> 
> ... add a bunch of items in fileList ...
> 
> qSort(fileList.begin(), fileList.end(), lessThan);
> 
> lessThan is where i'm stuck. I simply don't understand yet how
> lessThan should work. What conditions i should do and when i should
> return false or true.. My current version looks like this:
> bool intSort(const KFileItem& s1, const KFileItem& s2)
> {
>     const QChar* sOne = s1.collatingSequence.unicode();
>     const QChar* sTwo = s2.collatingSequence.unicode();
> 
>     while(!sOne->isNull() && !sTwo->isNull())
>     {
>         if(sTwo->unicode() < sOne->unicode())
>         {
>             return false;
>         }
> 
>         ++sOne;
>         ++sTwo;
>     }
> 
>     return true;
> }
> 
> Which compiles and runs, but does a very strange sorting :p
> If someone can explain what the intended implementation is for a lessThan..?
> 
> Kind regards,
> Mark

isn't the if wrong? shouldn't this be "if a is less than b"?

so
        if(sTwo->unicode() < sOne->unicode())
        {
            return false;
        }

should be

        if(sOne->unicode() < sTwo->unicode())
        {
            return false;
        }

or? (untested)



all:

bool intSort(const KFileItem& s1, const KFileItem& s2)
{
    const QChar* sOne = s1.collatingSequence.unicode();
    const QChar* sTwo = s2.collatingSequence.unicode();

    while(!sOne->isNull() && !sTwo->isNull())
    {
        if(sOne->unicode() < sTwo->unicode())
        {
            return false;
        }

        ++sOne;
        ++sTwo;
    }

    return true;
}


Best regards
Bernd





More information about the kfm-devel mailing list