Review Request 110262: KRandomSequence::randomize: use the Fisher-Yates Algorithm to achieve O(N) complexity

Thu May 2 13:52:43 BST 2013

> On May 1, 2013, 9:03 p.m., Parker Coates wrote:
> > Since KRandomSequence is a class for generating a predictable random sequence, is it not possible that this change would break applications that relied on the code below producing a consistent result between kdelibs releases?
> > 
> > QList list = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
> > KRandomSequence pseudoRand( 123545 );
> > pseudoRand.randomize( list );
> > return list;
> > 
> > 
> >
> 
> Frank Reininghaus wrote:
>     I had thought about that as well, this is the reason why I propose to push that to master only, and not to KDE/4.10. Even though I think that a design which relies on the order of the items in a randomized sequence might be questionable in an application, it might make sense in unit tests. If the patch is pushed to master only, it would give people some time to react if anything breaks. At first sight, I did not notice anything in http://lxr.kde.org/ident?i=randomize that looks dangerous though.
>     
>     On the other hand, if people are worried that this change might break things, then we could just deprecate the existing method and implement Fisher-Yates in a new one (maybe KRandomSequence::shuffle ?). The disadvantage would be that every application that uses the old method needs to be ported to the new one in order to stop wasting CPU cycles and prevent freezes (like when you set up a Plasma picture frame slideshow with many pictures).
>
> 
> Parker Coates wrote:
>     The only reason I mentioned this is that I considered using KRandomSequence::randomize() for shuffling in card games. There I needed to take an integer deal number and a sorted list of cards to produce a predictably randomised deck. I wouldn't have been all that impressed if my deal numbers suddenly changed meaning due to a kdelibs update. (In the end, I didn't end up using KRandomSequence::randomize(), so this is a non issue for me personally.)
>     
>     On the other hand, if the change is planned for a new major library version, and your survey of current uses didn't turn up any that seem to rely on the predictability of the result, I see no reason not to make the change. I just wanted to make sure that you'd fully considered the implications. Maybe a note in the API docs mentioning the change in behaviour would be a good idea, though?
>     
>     (Oh and sorry about the triple comment. ReviewBoard's feedback can be... interesting at times. I'll see if I can manage to send this one only once.)

Thanks for the quick reply. The deal number issue is a valid concern, of course, but none of the places I found using lxr seem to use something like that. In fact, it looks like basically everyone is just using the default seed value. But I'll wait for futher feedback from other people and see if anyone whom I missed relies on a reproducible shuffling.

BTW, your shuffling algorithm in kpat.git/libkcardgame/shuffle.h looks equivalent to what I'm proposing here :-)

- Frank

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://git.reviewboard.kde.org/r/110262/#review31855
-----------------------------------------------------------

On May 1, 2013, 8:34 p.m., Frank Reininghaus wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://git.reviewboard.kde.org/r/110262/
> -----------------------------------------------------------
> 
> (Updated May 1, 2013, 8:34 p.m.)
> 
> 
> Review request for kdelibs.
> 
> 
> Description
> -------
> 
> The current algorithm that is used to shuffle lists is rather inefficient. It works by removing the first item of the list repeatedly and inserting it at a random position in a new list, which is finally used to replace the original list. Unfortunately, this results in O(N^2) run time complexity because inserting into a list, which is done N itmes, is O(N).
> 
> I propose to replace this algorithm by the Fisher-Yates algorithm, which works by swapping items N - 1 times. One could modify the entire thing further, like providing randomization also for other containers and not only QList, but that would probably be frameworks material. 
> 
> 
> Diffs
> -----
> 
>   kdecore/util/krandomsequence.h 46949b4 
> 
> Diff: http://git.reviewboard.kde.org/r/110262/diff/
> 
> 
> Testing
> -------
> 
> I wrote a small benchmark: http://paste.kde.org/735914/
> 
> I got the following results with the existing algorithm:
> 
> RESULT : Benchmark::randomSequenceBenchmark():"n=0":
>      0.000015 msecs per iteration (total: 66, iterations: 4194304)
> RESULT : Benchmark::randomSequenceBenchmark():"n=1":
>      0.000192 msecs per iteration (total: 101, iterations: 524288)
> RESULT : Benchmark::randomSequenceBenchmark():"n=3":
>      0.00070 msecs per iteration (total: 93, iterations: 131072)
> RESULT : Benchmark::randomSequenceBenchmark():"n=10":
>      0.0025 msecs per iteration (total: 83, iterations: 32768)
> RESULT : Benchmark::randomSequenceBenchmark():"n=30":
>      0.0070 msecs per iteration (total: 58, iterations: 8192)
> RESULT : Benchmark::randomSequenceBenchmark():"n=100":
>      0.023 msecs per iteration (total: 97, iterations: 4096)
> RESULT : Benchmark::randomSequenceBenchmark():"n=300":
>      0.077 msecs per iteration (total: 79, iterations: 1024)
> RESULT : Benchmark::randomSequenceBenchmark():"n=1000":
>      0.35 msecs per iteration (total: 90, iterations: 256)
> RESULT : Benchmark::randomSequenceBenchmark():"n=3000":
>      1.8 msecs per iteration (total: 58, iterations: 32)
> RESULT : Benchmark::randomSequenceBenchmark():"n=10000":
>      18 msecs per iteration (total: 72, iterations: 4)
> RESULT : Benchmark::randomSequenceBenchmark():"n=30000":
>      283 msecs per iteration (total: 283, iterations: 1)
> RESULT : Benchmark::randomSequenceBenchmark():"n=100000":
>      3,823 msecs per iteration (total: 3,823, iterations: 1)
> 
> Here are the numbers for the proposed new algorithm:
> 
> RESULT : Benchmark::randomSequenceBenchmark():"n=0":
>      0.000015 msecs per iteration (total: 65, iterations: 4194304)
> RESULT : Benchmark::randomSequenceBenchmark():"n=1":
>      0.000015 msecs per iteration (total: 65, iterations: 4194304)
> RESULT : Benchmark::randomSequenceBenchmark():"n=3":
>      0.00018 msecs per iteration (total: 98, iterations: 524288)
> RESULT : Benchmark::randomSequenceBenchmark():"n=10":
>      0.00079 msecs per iteration (total: 52, iterations: 65536)
> RESULT : Benchmark::randomSequenceBenchmark():"n=30":
>      0.0025 msecs per iteration (total: 83, iterations: 32768)
> RESULT : Benchmark::randomSequenceBenchmark():"n=100":
>      0.0084 msecs per iteration (total: 69, iterations: 8192)
> RESULT : Benchmark::randomSequenceBenchmark():"n=300":
>      0.025 msecs per iteration (total: 52, iterations: 2048)
> RESULT : Benchmark::randomSequenceBenchmark():"n=1000":
>      0.085 msecs per iteration (total: 88, iterations: 1024)
> RESULT : Benchmark::randomSequenceBenchmark():"n=3000":
>      0.25 msecs per iteration (total: 66, iterations: 256)
> RESULT : Benchmark::randomSequenceBenchmark():"n=10000":
>      0.85 msecs per iteration (total: 55, iterations: 64)
> RESULT : Benchmark::randomSequenceBenchmark():"n=30000":
>      2.6 msecs per iteration (total: 86, iterations: 32)
> RESULT : Benchmark::randomSequenceBenchmark():"n=100000":
>      10 msecs per iteration (total: 81, iterations: 8)
> 
> 
> Thanks,
> 
> Frank Reininghaus
> 
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.kde.org/pipermail/kde-core-devel/attachments/20130502/2bd4cc63/attachment.htm>