Review Request 108824: New sorting algorithm for Dolphin

Emmanuel Pescosta emmanuelpescosta099 at gmail.com
Fri Feb 8 10:37:23 GMT 2013



> On Feb. 8, 2013, 9:55 a.m., Frank Reininghaus wrote:
> > Thanks Emmanuel! This looks like it's been a massive amount of work (BTW, it might be good to discuss such ideas on the mailing list before starting to code). I've also played around with Timsort some time ago, but only with a quick hack of mine that just made use of already sorted runs and then with https://github.com/gfx/cpp-TimSort.
> > 
> > I don't have the time now to go through everything, but here are two things that come to my mind right now:
> > 
> > 1. Were you aware of https://github.com/gfx/cpp-TimSort? If yes, what is different about your implementation? The one on Github looks like it has a permissive license and could in principle be used, and it's not always a good idea to 'reinvent the wheel'. However, it looks like it's just a Github repository and has not seen any 'official' release yet, so we would have to just copy the header to our repository, which is not quite an elegant approach.
> > 
> > 2. I think that such a thing should in principle not be implemented on the application level. Having easy access to a sort function that minimizes the number of comparisons and works well for data with some structure in it might also be useful in other contexts where comparisons are expensive. It would probably not be accepted into kdelibs right now due to the feature freeze, so adding it to Dolphin itself might be the way to go in the short term, but I think that it should be put into some kind of library in the long term. Maybe a new 'framework' that bundles a couple of sorting-related things like this and the parallel sorting implementation which are not available out-of-the box on every system which has a C++ compiler+Qt?
> > 
> > 3. I agree that benchmarks and, more importantly, unit tests should definitely be added.

> and it's not always a good idea to 'reinvent the wheel', BTW, it might be good to discuss such ideas on the mailing list before starting to code
Yes I know. But in this case I tried to learn how the Tim Sort algorithm works ;) (only possible when you implement/debug it by yourself with the help of http://svn.python.org/projects/python/trunk/Objects/listsort.txt)

> 1. Were you aware of https://github.com/gfx/cpp-TimSort? If yes, what is different about your implementation?
Yes. My final implementation is a mix between TimSort.java/cpp-TimSort and my original implementation (This implementation was by far not as fast/beautiful as cpp-TimSort)

I have also replaced the normal insertion sort (Used by my first implementation) with binary insertion sort (used in TimSort.java)

My final implementation is a little bit faster than cpp-TimSort and a little bit easier to understand I think (+ it uses Qt containers/algorithms)

2. Yep. A sorting algorithm framework would be great :)


- Emmanuel


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://git.reviewboard.kde.org/r/108824/#review26941
-----------------------------------------------------------


On Feb. 6, 2013, 11:53 p.m., Emmanuel Pescosta wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://git.reviewboard.kde.org/r/108824/
> -----------------------------------------------------------
> 
> (Updated Feb. 6, 2013, 11:53 p.m.)
> 
> 
> Review request for Dolphin and Frank Reininghaus.
> 
> 
> Description
> -------
> 
> Implemented a new sorting algorithm called Tim Sort which is a modified Merge Sort algorithm (http://svn.python.org/projects/python/trunk/Objects/listsort.txt) - Default sorting algorithm in Python and Java
> 
> Tim Sort is really fast for already sorted lists (Great for inserting new items, sort role changes - see benchmark 2, ...)
> 
> Some benchmarks (200k folders and files):
> # Sort Type: Name => TimSort: 1755 ms - Parallel MergeSort: 1547 ms
> # Sort Type: Name (Reorder descending) => TimSort: 163 ms - Parallel MergeSort: 434 ms
> # Sort Type: Size => TimSort: 259 ms - Parallel MergeSort: 516 ms
> # Sort Type: Type => TimSort: 269 ms - Parallel MergeSort: 594 ms
> 
> Btw.: Tim Sort uses only 1 core!
> 
> ToDo: Write a benchmark for sorting algorithms
> 
> 
> Diffs
> -----
> 
>   dolphin/src/kitemviews/private/kfileitemmodelsortalgorithm.h 1d56894 
> 
> Diff: http://git.reviewboard.kde.org/r/108824/diff/
> 
> 
> Testing
> -------
> 
> Sorting works fine for me
> 
> 
> Thanks,
> 
> Emmanuel Pescosta
> 
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.kde.org/mailman/private/kfm-devel/attachments/20130208/0f146a76/attachment.htm>


More information about the kfm-devel mailing list