<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<META NAME="Generator" CONTENT="MS Exchange Server version 5.5.2653.12">
<TITLE>RE: Sorting speed improvement</TITLE>
</HEAD>
<BODY>
<P><FONT SIZE=2>> Some points:</FONT>
<BR><FONT SIZE=2>> </FONT>
<BR><FONT SIZE=2>> 1) Postman's sort is radix sort with "string" keys, therefore it has </FONT>
<BR><FONT SIZE=2>> complexity O(n*k) where k is the max. length of the string. </FONT>
<BR><FONT SIZE=2>> Claiming O(n) </FONT>
<BR><FONT SIZE=2>> would not be accurate IMO. O(n) would be true if and only if </FONT>
<BR><FONT SIZE=2>> the number of </FONT>
<BR><FONT SIZE=2>> digits in the key, here corresponding to the length of </FONT>
<BR><FONT SIZE=2>> string, were constant </FONT>
<BR><FONT SIZE=2>> which is not the case here. Note that the real complexity analysis is </FONT>
<BR><FONT SIZE=2>> slightly more complex than that (I will do it if you like but it's </FONT>
<BR><FONT SIZE=2>> unnecessary, same as radix sort. See MIT's "Introduction to </FONT>
<BR><FONT SIZE=2>> Algorithms" </FONT>
<BR><FONT SIZE=2>> textbook if you are curious. I have it on my desk now :))</FONT>
</P>
<P><FONT SIZE=2>First. I know that the algorithm is a radix sort. When I was playing with the Spectrum,</FONT>
<BR><FONT SIZE=2>18 years ago, I implemented this algorithm the first time in basic.</FONT>
</P>
<P><FONT SIZE=2>Second. In the University, I've learned two complexity notations:</FONT>
<BR><FONT SIZE=2>O(....) o in uppercase, that talks about the higher exponent in the complexity</FONT>
<BR><FONT SIZE=2>and</FONT>
<BR><FONT SIZE=2>o(.....) o in lowercase, that talks about all the factors in the complexity.</FONT>
</P>
<P><FONT SIZE=2>In that way, an algorith with o(n^2*k) or o(3*n^2+50) is O(n^2) if k is constant.</FONT>
<BR><FONT SIZE=2>In this notation, you are talking about an o(n*k) complexity.</FONT>
<BR><FONT SIZE=2> </FONT>
<BR><FONT SIZE=2>> 2) The above point is reflected in your code. There are three </FONT>
<BR><FONT SIZE=2>> loops. Counts </FONT>
<BR><FONT SIZE=2>> are max_len, 256, length(B[k]) which is O( max_len * </FONT>
<BR><FONT SIZE=2>> length(A) ), where B is </FONT>
<BR><FONT SIZE=2>> the array of intermediate lists and A is the input list . You </FONT>
<BR><FONT SIZE=2>> should also be </FONT>
<BR><FONT SIZE=2>> overly skeptical when somebody dubs "his algorithm" "WORLD'S </FONT>
<BR><FONT SIZE=2>> FASTEST SORTING </FONT>
<BR><FONT SIZE=2>> PROGRAM" which is in fact just an instance of a well known </FONT>
<BR><FONT SIZE=2>> sorting technique. </FONT>
<BR><FONT SIZE=2>> I would be much more modest with the naming!!!!! [His </FONT>
<BR><FONT SIZE=2>> "whitepaper" is faaar </FONT>
<BR><FONT SIZE=2>> from modest while his technique is almost trivial, </FONT>
<BR><FONT SIZE=2>> interesting attitude]</FONT>
</P>
<P><FONT SIZE=2>Does anyone knows a sorting algorithm faster than a radix sort ?</FONT>
</P>
<P><FONT SIZE=2>I do not know where I said that this was "my" "program", but I will review it </FONT>
<BR><FONT SIZE=2>and fix it. And I will include links to the other pages that talk about</FONT>
<BR><FONT SIZE=2>this algorithm.</FONT>
</P>
<P><FONT SIZE=2>> 3) Your implementation seems to be OK but you have to be </FONT>
<BR><FONT SIZE=2>> skeptical when you </FONT>
<BR><FONT SIZE=2>> are implementing a sorting algorithm! Did you actually test it?</FONT>
</P>
<P><FONT SIZE=2>Not the implementation I made yesterday night. Well, I'm using the generated</FONT>
<BR><FONT SIZE=2>QT to run KDE now, but this is not a good test.</FONT>
<BR><FONT SIZE=2> </FONT>
<BR><FONT SIZE=2>> 4) Your implementation might have some hidden costs. Are you </FONT>
<BR><FONT SIZE=2>> sure this line </FONT>
<BR><FONT SIZE=2>> will result in constant time?:</FONT>
<BR><FONT SIZE=2>> lst[(*it).at(i)].append(*it); </FONT>
<BR><FONT SIZE=2>> Qt does have "value sharing" but I would double-check such </FONT>
<BR><FONT SIZE=2>> things. It looks </FONT>
<BR><FONT SIZE=2>> like you are assuming you are just moving around pointers, </FONT>
<BR><FONT SIZE=2>> right? If these </FONT>
<BR><FONT SIZE=2>> calls induce memory calls they will be overly expensive.</FONT>
</P>
<P><FONT SIZE=2>I do not know enougth of QT (still) to move the pointers, I just used the</FONT>
<BR><FONT SIZE=2>provided methods (QT methods are easy to learn).</FONT>
</P>
<P><FONT SIZE=2>This implementation, therefore, is quite expensive in CPU and memory,</FONT>
<BR><FONT SIZE=2>and, of course, is not optimal at all. It was intended as a hint.</FONT>
</P>
<P><FONT SIZE=2>> Also, w.r.t. point 1, quicksort with ordinary string </FONT>
<BR><FONT SIZE=2>> comparison would really </FONT>
<BR><FONT SIZE=2>> have O(n*logn*k) worst case running time complexity... Is </FONT>
<BR><FONT SIZE=2>> that the case </FONT>
<BR><FONT SIZE=2>> currently in QStringList? If so, I strongly suggest that this </FONT>
<BR><FONT SIZE=2>> algorithm be </FONT>
<BR><FONT SIZE=2>> made available as an optional code in Qt, maybe something like </FONT>
<BR><FONT SIZE=2>> QStringList::fastSortASCII(). Please license it under BSD </FONT>
<BR><FONT SIZE=2>> after you test it! </FONT>
<BR><FONT SIZE=2>> It would be nice if you could write a test suite code with </FONT>
<BR><FONT SIZE=2>> assert()'s spewed </FONT>
<BR><FONT SIZE=2>> out all over the place. You should already have such a thing, no?</FONT>
</P>
<P><FONT SIZE=2>First. In the above notation, the Quicksort algorith is</FONT>
<BR><FONT SIZE=2>o(n*log2(n)*k) or O(n*log2(n)).</FONT>
</P>
<P><FONT SIZE=2>Second. QT is using a heapSort, that is O(n*log(n)).</FONT>
</P>
<P><FONT SIZE=2>Third. In the Quicksort, there are String comparations, that are more</FONT>
<BR><FONT SIZE=2>expensive than accesing an array.</FONT>
</P>
<P><FONT SIZE=2>Fourth. No, I do not have such an implementation, but I will do it (now that</FONT>
<BR><FONT SIZE=2>I have again some free time) and release it under an Open Source License.</FONT>
</P>
<P><FONT SIZE=2>> Cheers,</FONT>
</P>
<P><FONT SIZE=2>Regards.</FONT>
<BR><FONT SIZE=2> </FONT>
<BR><FONT SIZE=2>> -- </FONT>
<BR><FONT SIZE=2>> Eray Ozkural (exa) <erayo@cs.bilkent.edu.tr></FONT>
<BR><FONT SIZE=2>> Comp. Sci. Dept., Bilkent University, Ankara KDE Project: </FONT>
<BR><FONT SIZE=2><A HREF="http://www.kde.org" TARGET="_blank">http://www.kde.org</A></FONT>
<BR><FONT SIZE=2>www: <A HREF="http://www.cs.bilkent.edu.tr/~erayo" TARGET="_blank">http://www.cs.bilkent.edu.tr/~erayo</A> Malfunction: <A HREF="http://mp3.com/ariza" TARGET="_blank">http://mp3.com/ariza</A></FONT>
<BR><FONT SIZE=2>GPG public key fingerprint: 360C 852F 88B0 A745 F31B EA0F 7C07 AE16 874D 539C</FONT>
</P>
</BODY>
</HTML>