Ideas for improving sorting. Goal: under 1 second for 200.000 files.

Tue Oct 23 20:01:41 BST 2012

Hi Mark,

Thanks for trying to make sorting faster :-)

Am 23.10.2012 14:39 schrieb "Mark":
>
> Hi,
>
> Right now we have 2 bug reports for sorting and suggestions/patches on
> how to improve them. [1, 2]
>
> I tried a lot already to improve sorting and made it twice as fast in
> my local experiments [2]. The patch for that is certainly unacceptable
> and very very hacky! Yesterday i just sat down (had to since i was
> having a bus ride anyway :p) and just think about the actual issue and
> how it might be resolved.
>
> Doing that i came to a whole different conclusion then before. It's
> only theoretical at the moment, but i think this might actually work
> and greatly speed things up.
>
> 1. Splitting out sorting for folders and the rest
> -------------------
> This one seemingly simple makes a noticeable dent in the time it takes
> to sort. The lessThan function [3] (line 1320) is called for the
> sorting algorithm in O(n log n). Optimizing anything in here will
> result in a faster sort. File and Folder sorting is split out in
> "specialized" lessThan functions (lessThanFolder and lessThanFiles)
> then you can already prevent the "if (m_sortDirsFirst || m_sortRole ==
> SizeRole) {" construction. I've tried this and it speeds up the
> sortingquite a bit. It still remains slow though.

How much is "quite a bit"? I very much doubt that avoiding the
if-check that you quoted makes a big difference. On the other hand,
such a change would require more code paths and more complexity in
general.

> 2. Copy KStringHandler::naturalCompare into dolphin for further in
> depth optimizations

If there are ways to improve that function, why not do it in kdelibs?
I think that increasing the maintenance burden by having two different
implementations in different places is not a good idea.

> 2.1 Unicode cache for isNull, isPunct, isSpace and is isDigit.
> -------------------
> KStrinHandler::naturalCompare is - by far - the biggest resource user.
> It needs improvements and thus far i simply didn't know how to do
> that. Yesterday i figured out a bunch of optimizations for this. What
> i;d do is make a structure (class):
> class CharUnicodeCache
> {
> public:
>     bool isSpace;
>     bool isPunct;
>     bool isNull;
>     bool isDigit;
> }
>
> Putting that in a list: QList<CharUnicodeCache> m_chatUnicodeCache;
> Then in the naturalCompare it needs to fetch the cached values from
> the list rather then using the function calls. I'm not quite sure
> about this optimization since it's quite tricky and "might" not
> improve performance that much. Note: the key is the numeric
> representation of the hex value of the unicode character thus the
> lookup is indeed in constant time and should be very rapid.
>
> 2.2 Get rid of localeAwareCompare - kind of
> -------------------
> localeAwareCompare is - by far - the biggest resource user in
> naturalCompare so one way of making it a lot faster is by getting rid
> of it all together :)  Sadly that's not entirely possible, but we can
> use the stuff that Qt uses to make it faster and prevent string
> allocations.
> To follow this, here is the stack when localeAwareCompare is called:
> - localeAwareCompare
> - - localeAwareCompare_helper
> - - - ... system dependent stuff. if nothing is found: ucstrcmp Lets
> assume we go through ucstrcmp [4]
> - - - - ucstrncmp [5]
>
> The deepest functions for localeAwareCompare are ucstrcmp and
> ucstrncmp. We can simply copy those 2 functions and make "our own"
> localeAwareCompare that just passes around QChar pointers without
> allocating anything. Doing this is likely to greatly speedup the new
> "localeAwareCompare".

Before you even consider to fork localeAwareCompare() and try to
improve it, you should check what it's like in Qt 5. It's not unlikely
that there are some improvements in there, and you should avoid
reinventing the wheel :-)

> Note: the naturalCompare has a comment that complains about
> insensitive comparison and having to do toLower to get around that. If
> we copy the Qt functions we might as well copy the case insensitive
> version [6] and [7] which gets rig of the toLower which is also quite
> high on the function call cost list.
>
> 2.3 Pass around QChar* and get rid of QString
> -------------------
> Right now naturalCompare takes two QString arguments while it
> internally grabs a QChar* from the string. I'm not sure here, but i
> think you save some time if you prevent QString passing around and
> simply pass around QChar* then i imagine that you have a lot less
> unicode since the string arrays are already in unicode. That might be
> another (small?) speed improvement i guess.
>
> 3. Multithreaded sorting
> -------------------
> I don't know if the current sorting algorithm is fit for
> multithreading usage, but the issue we have in sorting is CPU bound.
> If we add in threading it should speed up significantly.

The sort algorithm that we use (Mergesort) can easily be parallelized.
Emmanuel is working on it.

> Right now i'm implementing 2.1 in a local branch and 2.2 is next on my
> list. I expect 2.2 to give the biggest speedboost.
> I will report my findings in this thread once i have something.
>
> It would be best if this sorting is greatly improved and added to
> Dolphin's codebase. The above suggestions do add quite a bit of code
> to dolphin, but it's all fairly self contained. I hope it's also low
> maintenance, but that remains to be seen :)

I have yet to see any non-trivial addition of code that is low maintenance ;-)

Moreover, please consider that much of the code that you are trying to
work on is currently not low-maintenace, but no-maintenance from a
Dolphin point of view. Forking stuff into Dolphin and adding much
complexity is something that I'm not really willing to accept.

Best regards
Frank