Reply
 
Thread Tools Display Modes
  #1  
Old 11-06-2019, 02:45 AM
moes lotion is offline
Guest
 
Join Date: Nov 2002
Location: North of a Great Lake
Posts: 232

Random Shuffle - expected distributions


If an ordered array is randomly shuffled, my possibly faulty understanding is that the elements should have roughly equal probabilities of ending up anywhere in the shuffled array. For simplicity, say I define the initial ordered array A with 100 elements such that ai = i for i = 1 to 100, randomly shuffle it and then note that the element i (which was initially in position i) is now in position j. If I then define a 100 x 100 element distribution array D with all elements initially set to 0, and each time I randomly shuffle my initially ordered array A I increment the elements dij where the initial element i in A ends up in position j, should I expect that after some reasonably large number of random shuffles the elements of D should all be approximately equal? My initial guess is that any element i in A is less likely to remain in position i after a random shuffle so I would expect that the values along the diagonal of D should be lower than the rest, but on the other hand if the shuffle is truly random, why wouldn't an element occasionally remain in place?

Note that I'm not asking for how to perform a random shuffle, just trying to wrap my head around what a random shuffle actually means.
  #2  
Old 11-06-2019, 03:42 AM
DPRK is offline
Guest
 
Join Date: May 2016
Posts: 3,967
What it means... you can visualize what happens to a deck of cards (could be a small one, like with 4, if you want to enumerate all the distinct possibilities).

By symmetry, any card has an equal chance of ending up in any position. If you count where it ends up each time, the counts will not be exactly the same, but after a short while you should see them form a uniform distribution with few significant deviations from the expected number.
  #3  
Old 11-06-2019, 05:44 AM
septimus's Avatar
septimus is offline
Guest
 
Join Date: Dec 2009
Location: The Land of Smiles
Posts: 19,963
It's not completely clear how OP defines random shuffle. A single riffle shuffle of a brand-new deck of cards will leave much order intact. Riffle the deck a few dozen times, using multiple riffle methods, and after all that you may approximate a random uniform shuffle, where each of the 52! possible orderings is expected to occur with 1/52! probability.

When shuffling the deck (#1, #2, ..., #52) via computer there is a standard algorithm which almost every programmer knows:
Swap #1 with #j (j uniform random on {1, 2, 3, ..., 52})
Swap #2 with #j (j uniform random on {2, 3, 4, ..., 52})
Swap #3 with #j (j uniform random on {3, 4, 5, ..., 52})
...
Swap #51 with #j (j uniform random on {51, 52})
Each of the 52! orderings will occur with 1/52! probability if the "uniform randoms" are reliable. Note that 1/52 of the time you swap #1 with itself (i.e. do nothing) and so on.

"Almost every programmer" may have been an over-statement. Many get it wrong, including one on-line casino sorry, I'm not going to try Google which used a trivial bad shuffle and got exploited for it!
  #4  
Old 11-06-2019, 06:53 AM
Chronos's Avatar
Chronos is offline
Charter Member
Moderator
 
Join Date: Jan 2000
Location: The Land of Cleves
Posts: 85,446
As an aside, when you shuffle a list, on average, you'll expect exactly one element to end up in its original spot. And this is true regardless of the size of the list.
  #5  
Old 11-06-2019, 08:36 AM
Dead Cat is offline
I was curious...
 
Join Date: Feb 2005
Location: UK
Posts: 4,305
I must be interpreting this incorrectly. Looking at small sets, for the set with 1 element, exactly 1 element ends up in its original spot 100% of the time. For 2 elements, exactly 1 element ends up in its original spot 0% of the time (because either both elements will be in the same spot, or neither will). For 3 elements, half the possible orderings after a shuffle result in exactly one element in its original spot. For 4 elements, I make it 8 out of 24 orderings that have exactly one element in its original spot. What am I missing here? Does it only work for sets with an odd number of elements? That's in line with what I would expect intuitively. But intuition is often extremely misleading when it comes to probability and statistics.

Last edited by Dead Cat; 11-06-2019 at 08:36 AM.
  #6  
Old 11-06-2019, 09:19 AM
Hari Seldon is offline
Member
 
Join Date: Mar 2002
Location: Trantor
Posts: 13,159
It has been shown that 7 random riffle shuffles of a 52 card deck will leave it within 1% of a perfectly random distribution. What is a random riffle shuffle? Divide the deck in two at a random place then randomly choose between the two packs and take the top card from the chosen pack. If you google "Trailing the riffle shuffle to its lair" you will get a pdf of the paper by D. Bayer and P. Diaconis that establishes this result. Warning: heavy going, including a detour into the cohomology of commutative algebras in which an animal called the shuffle idempotent is described.
  #7  
Old 11-06-2019, 09:28 AM
Quercus is offline
Guest
 
Join Date: Dec 2000
Location: temperate forest
Posts: 7,196
Quote:
Originally Posted by Dead Cat View Post
I must be interpreting this incorrectly. Looking at small sets, for the set with 1 element, exactly 1 element ends up in its original spot 100% of the time. For 2 elements, exactly 1 element ends up in its original spot 0% of the time (because either both elements will be in the same spot, or neither will). For 3 elements, half the possible orderings after a shuffle result in exactly one element in its original spot. For 4 elements, I make it 8 out of 24 orderings that have exactly one element in its original spot. What am I missing here? Does it only work for sets with an odd number of elements? That's in line with what I would expect intuitively. But intuition is often extremely misleading when it comes to probability and statistics.
Chronos meant that, averaging over a bunch of shuffles, you'd expect an average of one element in its original spot per shuffle.

So, for two elements, half the time they'll move (zero elements in the original spot), and half the time they won't (two elements in the original spot). So the average number of elements in the same spot (over lots of shuffles) is ( 0.5 x 0 + 0.5 x 2=) one.

Likewise for any other list with more than one item: sometimes you'll get no elements in the original spot, sometimes you'll get more than one, but over lots of shuffles the average should be close to one element in its original spot.
  #8  
Old 11-06-2019, 09:32 AM
septimus's Avatar
septimus is offline
Guest
 
Join Date: Dec 2009
Location: The Land of Smiles
Posts: 19,963
Chronos: "you'll expect exactly one element to end up in its original spot."

Chronos' usage of "expect" may be somewhat technical. With two elements, you'll get two stay-at-homes half the time, and zero the other half the time. You expect that exactly one element in its original spot never happens! But on average there will be one stay-at-home. (50%2 + 50%0 = 1.)

Quote:
Originally Posted by Dead Cat View Post
... For 4 elements, I make it 8 out of 24 orderings that have exactly one element in its original spot. What am I missing here? ...
With four elements you'll get four stay-at-homes 1 time in 24, two 6 times in 24, one 8 times in 24, and zero 9 times in 24. Again the average is one.


ETA: Ninja'ed by Quercus.

Last edited by septimus; 11-06-2019 at 09:35 AM.
  #9  
Old 11-06-2019, 09:48 AM
Dead Cat is offline
I was curious...
 
Join Date: Feb 2005
Location: UK
Posts: 4,305
Thanks. It seems the conjunction of "expect" and "exactly one" in the original statement is what threw me.
  #10  
Old 11-06-2019, 10:53 AM
ftg's Avatar
ftg is offline
Member
 
Join Date: Feb 2001
Location: Not the PNW :-(
Posts: 20,506
Quote:
Originally Posted by septimus View Post
When shuffling the deck (#1, #2, ..., #52) via computer there is a standard algorithm which almost every programmer knows:
Swap #1 with #j (j uniform random on {1, 2, 3, ..., 52})
Swap #2 with #j (j uniform random on {2, 3, 4, ..., 52})
Swap #3 with #j (j uniform random on {3, 4, 5, ..., 52})
...
Swap #51 with #j (j uniform random on {51, 52})
Each of the 52! orderings will occur with 1/52! probability if the "uniform randoms" are reliable. Note that 1/52 of the time you swap #1 with itself (i.e. do nothing) and so on.

"Almost every programmer" may have been an over-statement. Many get it wrong, including one on-line casino sorry, I'm not going to try Google which used a trivial bad shuffle and got exploited for it!
You greatly overestimate the skills of typical programmers. Most given this task would do like you did but select from 1..52 each time. That means that there are 5252 outcomes which is not a multiple of 52!.

When I was CS prof I saw this all over the place. Not just with students but sample code posted on the Net. A USENET posting for a permutation generator was almost certainly to be bad. But at least there'd soon be a follow up pointing out the error. (Which wasn't alway believed. "Hey, it looks random." Sigh.)
  #11  
Old 11-06-2019, 11:32 AM
septimus's Avatar
septimus is offline
Guest
 
Join Date: Dec 2009
Location: The Land of Smiles
Posts: 19,963
Yes; my "'Almost every programmer' may have been an over-statement" was itself a tongue-in-cheek understatement.

Think about the fact I mentioned: An on-line casino, with huge incentive to get this correct, managed to come up with the same wrong code as those CS drop-outs. The mind boggles. (And there are other simple facts about working with random numbers that are often overlooked.)


ETA: Referring to them as "CS drop-outs" "was another misstatement on my part. Some are doubtless earning high salaries as software managers at Microsoft.

Last edited by septimus; 11-06-2019 at 11:35 AM.
  #12  
Old 11-06-2019, 02:07 PM
manson1972's Avatar
manson1972 is online now
Member
 
Join Date: Jan 2004
Posts: 12,288
Quote:
Originally Posted by ftg View Post
You greatly overestimate the skills of typical programmers. Most given this task would do like you did but select from 1..52 each time. That means that there are 5252 outcomes which is not a multiple of 52!.
What's wrong with doing it this way?
  #13  
Old 11-06-2019, 02:12 PM
Buck Godot's Avatar
Buck Godot is offline
Guest
 
Join Date: Mar 2010
Location: MD outside DC
Posts: 6,062
Quote:
Originally Posted by moes lotion View Post
If an ordered array is randomly shuffled, my possibly faulty understanding is that the elements should have roughly equal probabilities of ending up anywhere in the shuffled array. For simplicity, say I define the initial ordered array A with 100 elements such that ai = i for i = 1 to 100, randomly shuffle it and then note that the element i (which was initially in position i) is now in position j. If I then define a 100 x 100 element distribution array D with all elements initially set to 0, and each time I randomly shuffle my initially ordered array A I increment the elements dij where the initial element i in A ends up in position j, should I expect that after some reasonably large number of random shuffles the elements of D should all be approximately equal? My initial guess is that any element i in A is less likely to remain in position i after a random shuffle so I would expect that the values along the diagonal of D should be lower than the rest, but on the other hand if the shuffle is truly random, why wouldn't an element occasionally remain in place?

Note that I'm not asking for how to perform a random shuffle, just trying to wrap my head around what a random shuffle actually means.
If you have N elements in your array, on each shuffle, dij will be 1 with probability 1/N and zero otherwise. This is true regardless of whether i and j are equal, there will be some correlations, involved since if dij =1 then dik is necessarily 0 if j is not equal to k. But each individual count will work according to that probability.

If you repeat the experiment R times, and add up the dij's then each sum will follow a binomial distribution. On average each one will have a value equal to about R/N, plus or minus sqrt(R*(N-1))/N.
  #14  
Old 11-06-2019, 03:47 PM
zimaane is offline
Guest
 
Join Date: Jan 2003
Location: washington, dc
Posts: 1,027
Quote:
Originally Posted by septimus View Post
It's not completely clear how OP defines random shuffle. A single riffle shuffle of a brand-new deck of cards will leave much order intact. Riffle the deck a few dozen times, using multiple riffle methods, and after all that you may approximate a random uniform shuffle, where each of the 52! possible orderings is expected to occur with 1/52! probability.

When shuffling the deck (#1, #2, ..., #52) via computer there is a standard algorithm which almost every programmer knows:
Swap #1 with #j (j uniform random on {1, 2, 3, ..., 52})
Swap #2 with #j (j uniform random on {2, 3, 4, ..., 52})
Swap #3 with #j (j uniform random on {3, 4, 5, ..., 52})
...
Swap #51 with #j (j uniform random on {51, 52})
Each of the 52! orderings will occur with 1/52! probability if the "uniform randoms" are reliable. Note that 1/52 of the time you swap #1 with itself (i.e. do nothing) and so on.


"Almost every programmer" may have been an over-statement. Many get it wrong, including one on-line casino — sorry, I'm not going to try Google — which used a trivial bad shuffle and got exploited for it!

How many times do you have to repeat this procedure to get a random ordering?

Last edited by zimaane; 11-06-2019 at 03:50 PM.
  #15  
Old 11-06-2019, 04:01 PM
ftg's Avatar
ftg is offline
Member
 
Join Date: Feb 2001
Location: Not the PNW :-(
Posts: 20,506
Quote:
Originally Posted by zimaane View Post
How many times do you have to repeat this procedure to get a random ordering?
Once.

Quote:
Originally Posted by manson1972 View Post
What's wrong with doing it this way?
Um, it's not a uniform distribution. Some permutations occur more than others. It is highly unlikely that this non-uniform distribution is more desirable than the uniform one and it is trivial to modify the code to get a uniform one.

Example: with 3 elements you get will generate one of 33=27 outcomes, and there are 3!=6 permutations. 27/6 is 4.4. So some permutations occur 4 times in the outcomes and some occur 5 times. If you do it right all appear the same number of times. In this case once.
  #16  
Old 11-06-2019, 04:07 PM
manson1972's Avatar
manson1972 is online now
Member
 
Join Date: Jan 2004
Posts: 12,288
Quote:
Originally Posted by ftg View Post
Example: with 3 elements you get will generate one of 33=27 outcomes, and there are 3!=6 permutations. 27/6 is 4.4. So some permutations occur 4 times in the outcomes and some occur 5 times. If you do it right all appear the same number of times. In this case once.
Sorry, I don't understand the difference between "outcomes" and "permutations" in this context.
  #17  
Old 11-06-2019, 06:56 PM
DPRK is offline
Guest
 
Join Date: May 2016
Posts: 3,967
Quote:
Originally Posted by DPRK View Post
If you count where it ends up each time, the counts will not be exactly the same, but after a short while you should see them form a uniform distribution with few significant deviations from the expected number.
Not a "uniform" distribution, of course; let's call it a bell-shaped binomial distribution. Sorry for the typo. What is uniform is the probability of ending up in position i (it is always 1/100 in your case )
  #18  
Old 11-06-2019, 07:36 PM
gazpacho is offline
Guest
 
Join Date: Oct 1999
Posts: 5,709
Quote:
Originally Posted by Hari Seldon View Post
It has been shown that 7 random riffle shuffles of a 52 card deck will leave it within 1% of a perfectly random distribution. What is a random riffle shuffle? Divide the deck in two at a random place then randomly choose between the two packs and take the top card from the chosen pack. If you google "Trailing the riffle shuffle to its lair" you will get a pdf of the paper by D. Bayer and P. Diaconis that establishes this result. Warning: heavy going, including a detour into the cohomology of commutative algebras in which an animal called the shuffle idempotent is described.
There is a good youtube video with P. Diaconis discussing this paper. Numberphile has 6 or so videos with Diaconis discussing coin flips dice and things of that nature.
https://www.youtube.com/watch?v=AxJubaijQbI
  #19  
Old 11-06-2019, 08:02 PM
moes lotion is offline
Guest
 
Join Date: Nov 2002
Location: North of a Great Lake
Posts: 232
Thanks to everyone who has replied so far, especially septimus and Chronos. I may not have been clear in my description of the distribution array, it holds a cumulative count of the outcomes of each shuffle, in reply to Buck Godot.

The reason I posted this was that I needed to code up a random shuffle, found the Fisher-Yates algorithm and Durstenfeld's implementation thanks to wikipedia and coded it up. I did exactly as I described in my original post as a brute-force check that I was getting the distribution that I thought I should but the cumulative values in the top left corner of the distribution (for a start to end shuffle) were all about half of the expected values, and when I switched to an end to start shuffle, the bottom right corner values likewise were about half of the expected values.

As it turns out, I had made a glaringly obvious (in hindsight) error in my code, instead of selecting the swap index j in the range of i <= j <= N I was restricting j to be greater than (and not equal) to i.
  #20  
Old 11-06-2019, 08:24 PM
Chronos's Avatar
Chronos is offline
Charter Member
Moderator
 
Join Date: Jan 2000
Location: The Land of Cleves
Posts: 85,446
Yes, to clarify, I was using "expected value" in the technical sense of the mean, even though (as often happens with means) sometimes the "expected" value is itself impossible. Just like the average human has approximately one testicle and one ovary.

And "exactly 1" was a statement about that mathematically-defined expected value. 1 isn't just a good approximation for the expected value, or valid for sufficiently large N, or the like: For any deck size, large or small, it's always exactly 1.
  #21  
Old 11-06-2019, 10:02 PM
septimus's Avatar
septimus is offline
Guest
 
Join Date: Dec 2009
Location: The Land of Smiles
Posts: 19,963
Quote:
Originally Posted by ftg View Post
Um, it's not a uniform distribution. Some permutations occur more than others....

Example: with 3 elements you get will generate one of 33=27 outcomes, and there are 3!=6 permutations. 27/6 is 4.5. [corrected typo -- sgs7] So some permutations occur 4 times in the outcomes and some occur 5 times. If you do it right all appear the same number of times. In this case once.
It gets even worse for larger decksizes, e.g. 6. Just now, hoping my coding skill isn't too rusty, I did something like
Code:
ln -s /dev/tty badsort.c
cc -o badsort badsort.c
#include        <stdio.h>
#include        <stdlib.h>

#define NUMEL   6

int main(int argc, char **argv)
{
        int     a[NUMEL], b[NUMEL], u, t, k;

        for (a[0] = 0; a[0] < NUMEL; a[0]++)
        for (a[1] = 0; a[1] < NUMEL; a[1]++)
        for (a[2] = 0; a[2] < NUMEL; a[2]++)
        for (a[3] = 0; a[3] < NUMEL; a[3]++)
        for (a[4] = 0; a[4] < NUMEL; a[4]++)
        for (a[5] = 0; a[5] < NUMEL; a[5]++) {
                for (k = 0; k < NUMEL; k++)
                        b[k] = k;
                for (k = 0; k < NUMEL; k++) {
                        t = a[k];
                        u = b[t];
                        b[t] = b[k];
                        b[k] = u;
                }
                for (k = 0; k < NUMEL; k++)
                        printf("%c", b[k] + 'A');
                printf("\n");
        }
}
^D
badsort | sort | uniq -c | sort -nr | more
and discovered that BCAEFD occurred a whopping 159 times, while FABCDE occurred just 32 times, barely a fifth as often! Each should have occurred 6^6/6! = 64.8 times.
  #22  
Old 11-06-2019, 10:36 PM
Chronos's Avatar
Chronos is offline
Charter Member
Moderator
 
Join Date: Jan 2000
Location: The Land of Cleves
Posts: 85,446
Which in turn means that, once you see a few cards, you can make a much better guess than you should be able to about which cards will be coming next. Which breaks pretty much any card-based casino game.
  #23  
Old 11-07-2019, 01:15 AM
md2000 is offline
Guest
 
Join Date: Feb 2009
Posts: 15,157
Quote:
Originally Posted by manson1972 View Post
What's wrong with doing it this way?
If you do 52 swaps where the first swap element is random (not sequential) there must be a bigger chance that a number of slots did not get selected to be swapped from. Of course they may be on the receiving end of a swap but intuitively (which is a bad thing to say in probability) the odds of a slot not being swapped from or to is higher. Whereas doing each slot sequentially, the only way a number ends up in its original slot is if it is swapped out and then back again (or swapped through three or more slots)

OTOH, there's no reason to stop at 52 if the slot you swap from is selected at random. Do 1000 swaps of random slots and it's highly unlikely There's a slot that does not get swapped from or to.
  #24  
Old 11-07-2019, 08:37 AM
Chronos's Avatar
Chronos is offline
Charter Member
Moderator
 
Join Date: Jan 2000
Location: The Land of Cleves
Posts: 85,446
Yeah, repeat it a thousand times, and even a not-very-good shuffling procedure will probably be acceptable. But random numbers are expensive.
  #25  
Old 11-07-2019, 04:14 PM
ftg's Avatar
ftg is offline
Member
 
Join Date: Feb 2001
Location: Not the PNW :-(
Posts: 20,506
Quote:
Originally Posted by manson1972 View Post
Sorry, I don't understand the difference between "outcomes" and "permutations" in this context.
There are 27 different ways the array can be shuffled. A permutation is ... a permutation.

Doing it the right way and there are 6 different ways the array can be shuffled.

6 is a multiple of 6. 27 isn't. Some outcomes are going to occur more often than others. That's not likely to be desired.
  #26  
Old 11-07-2019, 04:35 PM
manson1972's Avatar
manson1972 is online now
Member
 
Join Date: Jan 2004
Posts: 12,288
Quote:
Originally Posted by md2000 View Post
Whereas doing each slot sequentially, the only way a number ends up in its original slot is if it is swapped out and then back again (or swapped through three or more slots)
With the code from above:

Quote:
Originally Posted by septimus View Post
Swap #1 with #j (j uniform random on {1, 2, 3, ..., 52})
What if j is 1 for card number 1? Doesn't that mean the number stays in its original slot?

Quote:
OTOH, there's no reason to stop at 52 if the slot you swap from is selected at random. Do 1000 swaps of random slots and it's highly unlikely There's a slot that does not get swapped from or to.
Actually, thinking about this more, this came up when my son had a blackjack game he had to program for school, and I was just showing him a quick way to shuffle the deck - first card random, second card random, swap them, do that 1000 times and it's good enough for your program. He then proceeded to ask me "What if each number is the same, isn't it swapping the card with itself?" at which point I grounded him for talking back
  #27  
Old 11-07-2019, 04:46 PM
zimaane is offline
Guest
 
Join Date: Jan 2003
Location: washington, dc
Posts: 1,027
Quote:
What if j is 1 for card number 1? Doesn't that mean the number stays in its original slot?
Yes. If you want a random reordering of the deck, you have to have the possibility that the top card stays on top.
  #28  
Old 11-07-2019, 05:26 PM
Lance Turbo is offline
Guest
 
Join Date: Aug 1999
Location: Asheville, NC
Posts: 4,313
Quote:
Originally Posted by md2000 View Post
OTOH, there's no reason to stop at 52 if the slot you swap from is selected at random. Do 1000 swaps of random slots and it's highly unlikely There's a slot that does not get swapped from or to.
I have a personal anecdote that is exactly on point. Big ass technical post follows...

In a former job one of the things my company did was verify the math models of casino games. This included analysis of PRNGs in general and PRNG applications that shuffled decks of cards. Part of this was done by generating millions of shuffles and analyzing those with a battery of tests that determined if the data set was statistically distinguishable from perfectly uniformly random.

A different mathematician was evaluating a blackjack game that used the following algorithm for shuffling blackjack shoes that contained 1, 2, 6, and 8 decks:

1. Start with an array [0, 1, .... , 50, 51] * (number of decks).
2. Randomly choose two distinct indices of this array and swap the elements in those spots. (The underlying PRNG used for this was the Mersenne Twister for those interested.)
3. Do step 2 a total of 1000 * (number of decks) times.

This was not a perfectly uniformly random shuffle for a couple reasons, but we could certify for use in the field if it was statistically indistinguishable from an unbiased shuffle.

So this algorithm was passing most of the tests. As an aside, one of the tests was essentially the test that was proposed in the OP. Look at what shows up at every position and see if that matches the expected distribution (uniform in this case).

It was, however, super failing one of the tests. This test was called the longest cycle test. This looked at each shuffle as a permutation in disjoint cycle notation (Seen here), and compared the distribution of the length of the longest cycle of each to the expected distribution of an unbiased shuffle (very very not uniform, fyi).

The manner of the failure was pretty bizarre, and the reason I was called in to the project. Only the single deck implementation was failing and it was failing because it never produced a shuffle with a longest cycle of length 52. In the grand scheme of things, these are relatively rare, but we had a huge data set and were expecting thousands and were instead getting exactly zero.

What was the problem? Permutations come in even and odd varieties. There's a couple different logically equivalent to define this, but I'll go with every there is a canonical matrix group representation of every perumutation group. Half the matrices in this group will have a determinant of 1, we'll call these even, and the other half will have a determinant of -1, which we will call odd (provided we a permuting more than one item). The reason I'm defining things this way is that the determinant of the product of matrices is the product of the determinants of the matrices.

One more fact that we need is that the determinant of a transposition (swapping two elements) is always -1.

So the algorithm always produced a shuffle equivalent to the product of 1,000 transpositions so it had a determinant of (-1)^1000 = 1. While every cycle of length 52 can be written as the product of 51 transpositions so it has a determinant of (-1)^51 = -1. So while the algorithm was pretty good in most ways, it was incapable of producing a specific type of permutation and we lucked out in that a subset of that set was a kind of permutation that we were looking for.

One extra bit of weirdness was that the multi-deck versions of algorithm weren't failing because sometimes the algorithm we swap a card for a copy of itself which was equivalent to doing nothing. Thus the multi-deck version could produce both even and odd permutations and did so with equal frequency. I'm leaving out some technical details of decomposing multi-deck shuffles into disjoint cycles, but that's the gist of it.
  #29  
Old 11-07-2019, 05:32 PM
Lance Turbo is offline
Guest
 
Join Date: Aug 1999
Location: Asheville, NC
Posts: 4,313
Epilogue: Our recommended fix? Just use a Fisher Yates shuffle like you should have in the first place.
  #30  
Old 11-07-2019, 05:48 PM
Lance Turbo is offline
Guest
 
Join Date: Aug 1999
Location: Asheville, NC
Posts: 4,313
Quote:
Originally Posted by manson1972 View Post
He then proceeded to ask me "What if each number is the same, isn't it swapping the card with itself?" at which point I grounded him for talking back
As my anecdote shows, you definitely want this to be the case. At least once in a while.

But also your described algorithm is almost 40 times the work of a Fisher Yates shuffle and the results aren't as good. Although this difference is in all likelihood undetectable.
  #31  
Old 11-07-2019, 06:09 PM
Chronos's Avatar
Chronos is offline
Charter Member
Moderator
 
Join Date: Jan 2000
Location: The Land of Cleves
Posts: 85,446
On the one hand, at some point programmer time is more valuable than computer time, so you want something that can be programmed quickly.

On the other hand, what takes least programmer time of all is to just copy-and-paste someone else's implementation of a standard algorithm.

And to sum up the bug in Lance Turbo's example, it only gave half of all possible shuffles. Which means that if you dealt out 50 of the 52 cards, you'd be able to say, with 100% certainty, what order the last two were in. OK, games rarely go all the way through a deck, but that's still something you want to avoid.
  #32  
Old 11-07-2019, 06:57 PM
Lance Turbo is offline
Guest
 
Join Date: Aug 1999
Location: Asheville, NC
Posts: 4,313
Quote:
Originally Posted by Chronos View Post
On the one hand, at some point programmer time is more valuable than computer time, so you want something that can be programmed quickly.

On the other hand, what takes least programmer time of all is to just copy-and-paste someone else's implementation of a standard algorithm.
One crazy thing is that almost every roll your own shuffle algo that I've seen is much more complicated than a Fisher Yates shuffle and more than half the time fatally flawed.

Here's a Python implementation...

Code:
from random import randrange

def FYShuffle(items):
    i = len(items)
    while i > 1:
        i = i - 1
        j = randrange(i)  # 0 <= j <= i-1
        items[j], items[i] = items[i], items[j]
It's also already coded up for you in the random library.
  #33  
Old 11-07-2019, 08:01 PM
Chronos's Avatar
Chronos is offline
Charter Member
Moderator
 
Join Date: Jan 2000
Location: The Land of Cleves
Posts: 85,446
Quote:
...and more than half the time fatally flawed.
And that's the great thing with using a standard algorithm: You don't have to waste any valuable programmer time determining whether it works.
  #34  
Old 11-07-2019, 08:09 PM
manson1972's Avatar
manson1972 is online now
Member
 
Join Date: Jan 2004
Posts: 12,288
Quote:
Originally Posted by Lance Turbo View Post
As my anecdote shows, you definitely want this to be the case. At least once in a while.

But also your described algorithm is almost 40 times the work of a Fisher Yates shuffle and the results aren't as good. Although this difference is in all likelihood undetectable.
I understand that it would happen due to the nature of randomness. I was just happy that he saw it. Plus, I didn't really ground him, you know?

Also, that depends on your definition of "work". Took me about 2 minutes to explain it and code it up. Much longer than searching for "Fisher Yates shuffle" and then trying to explain that.

Programming is based on requirements, this one being "fast to code up and easily explainable"
  #35  
Old 11-07-2019, 08:41 PM
septimus's Avatar
septimus is offline
Guest
 
Join Date: Dec 2009
Location: The Land of Smiles
Posts: 19,963
Quote:
Originally Posted by Chronos View Post
... But random numbers are expensive.
Most programmers don't seem overly concerned about the cost. 32 random bits are usually spent just to pick a uniform variate on 1 <= x <= 52; instead less than 6 bits are needed (on average) if you do it properly.

Quote:
Originally Posted by Lance Turbo View Post
https://docs.python.org/3/library/random.html"]random library
How many random bits do they spend to shuffle a deck? I spend about 7.05*32 bits, in close agreement with log2(52!). I think my code is correct, but will PM it to skeptics.

Quote:
Originally Posted by manson1972 View Post
Much longer than searching for "Fisher Yates shuffle" and then trying to explain that.
Fisher-Yates is trivial if you think methodically. You want one of 52 cards to appear in the #1 slot with equal likelihood; the 1st line of my pseudocode above does exactly that. And so on.

Last edited by septimus; 11-07-2019 at 08:43 PM.
  #36  
Old 11-07-2019, 08:58 PM
Lance Turbo is offline
Guest
 
Join Date: Aug 1999
Location: Asheville, NC
Posts: 4,313
Quote:
Originally Posted by septimus View Post
How many random bits do they spend to shuffle a deck? I spend about 7.05*32 bits, in close agreement with log2(52!). I think my code is correct, but will PM it to skeptics.
It's a straightforward implementation of Fisher Yates which is logically equivalent to what you posted.

Here's the source...

Code:
def shuffle(self, x, random=None, int=int):
    """x, random=random.random -> shuffle list x in place; return None.
    
    Optional arg random is a 0-argument function returning a random
    float in [0.0, 1.0); by default, the standard random.random.
    """
    
    if random is None:
        random = self.random
    for i in reversed(xrange(1, len(x))):
        # pick an element in x[:i+1] with which to exchange x[i]
        j = int(random() * (i+1))
        x[i], x[j] = x[j], x[i]
  #37  
Old 11-07-2019, 09:03 PM
septimus's Avatar
septimus is offline
Guest
 
Join Date: Dec 2009
Location: The Land of Smiles
Posts: 19,963
Quote:
Originally Posted by Lance Turbo View Post
... Optional arg random is a 0-argument function returning a random
float in [0.0, 1.0); by default, the standard random.random.
So does it grab 51 random floats when sorting a 52-deck? In that case it is MUCH more spendthrift than my code which, as I said, uses only 7.05 32-bit random words on average.
  #38  
Old 11-07-2019, 09:28 PM
Lance Turbo is offline
Guest
 
Join Date: Aug 1999
Location: Asheville, NC
Posts: 4,313
It's equivalent to the algorithm you outlined in post #3.

Last edited by Lance Turbo; 11-07-2019 at 09:28 PM.
  #39  
Old 11-07-2019, 09:31 PM
manson1972's Avatar
manson1972 is online now
Member
 
Join Date: Jan 2004
Posts: 12,288
Quote:
Originally Posted by septimus View Post
Fisher-Yates is trivial if you think methodically. You want one of 52 cards to appear in the #1 slot with equal likelihood; the 1st line of my pseudocode above does exactly that. And so on.
True. But that only helps if you already know Fisher-Yates. Which I didn't. I DID know swap two random cards. And I did make sure to tell him it wasn't the best way to do it.

But now I do know Fisher-Yates, so there's that. Thanks!
  #40  
Old 11-07-2019, 09:41 PM
septimus's Avatar
septimus is offline
Guest
 
Join Date: Dec 2009
Location: The Land of Smiles
Posts: 19,963
Quote:
Originally Posted by Lance Turbo View Post
It's equivalent to the algorithm you outlined in post #3.
I'd better stop after this post since you seem to be missing the complete context of my question. We're not speaking about the validity of the random shuffle. We're asking
How many random bits are consumed?

A naive implementation of shuffle will consume 51 random words, however the library provides a word, but ' random on {1,2,3,4,5,6,7,8} ' only needs to consume three random bits.

And nevermind that (random_word % 49) is very very slightly biased against large residues. Similarly (random_float * 49) must also be biased unless the exact number of possible random_floats is a multiple of 49.

Last edited by septimus; 11-07-2019 at 09:42 PM.
  #41  
Old 11-07-2019, 10:46 PM
Lance Turbo is offline
Guest
 
Join Date: Aug 1999
Location: Asheville, NC
Posts: 4,313
I get what you're saying now. One could put some effort into more efficiently using random bits. I've never really thought about the problem or had a problem where random number generation was a bottleneck.

If anyone is interested in this problem it looks like the paper Fast Random Integer Generation in an Interval by Daniel Lemire might be a good place to start. The blog post Efficiently Generating a Number in a Range by M.E. O'Neill found Lemire's method to be the fastest.
  #42  
Old 11-07-2019, 11:27 PM
septimus's Avatar
septimus is offline
Guest
 
Join Date: Dec 2009
Location: The Land of Smiles
Posts: 19,963
Quote:
Originally Posted by Lance Turbo View Post
... I've never ... had a problem where random number generation was a bottleneck.
AFAIK, DSM-VI doesn't even have a code for my pathology.

... And based on the date of your cite, I'm happy to conclude that Lemire's method is based on my own publications on another message board!
  #43  
Old 11-08-2019, 06:32 AM
Dead Cat is offline
I was curious...
 
Join Date: Feb 2005
Location: UK
Posts: 4,305
Quote:
Originally Posted by Chronos View Post
And to sum up the bug in Lance Turbo's example, it only gave half of all possible shuffles. Which means that if you dealt out 50 of the 52 cards, you'd be able to say, with 100% certainty, what order the last two were in. OK, games rarely go all the way through a deck, but that's still something you want to avoid.
Thanks for this; I was going to ask what the practical implications might be, and whether this would even be detectable by a human. I somehow doubt it.

I also don't think a player would be in a position to take advantage often enough even if they knew about the flaw, unless the program also allowed nearly all the cards to be used before a shuffle.
  #44  
Old 11-08-2019, 07:02 AM
Chronos's Avatar
Chronos is offline
Charter Member
Moderator
 
Join Date: Jan 2000
Location: The Land of Cleves
Posts: 85,446
Keeping track of the parity of a permutation isn't something that would be obvious to most people, but it is something that a human could do without too much difficulty if they learned how (about the same level of difficulty as counting cards).
  #45  
Old 11-08-2019, 07:20 AM
Chronos's Avatar
Chronos is offline
Charter Member
Moderator
 
Join Date: Jan 2000
Location: The Land of Cleves
Posts: 85,446
Oh, and what if you take a Quicksort, but replace all of the comparisons with random selections? That should use the right number of bits-- Is it a valid shuffle?

(I think this is equivalent to asking whether Quicksort is optimally efficient)

Come to think of it, it can't be perfect, because Quicksort has a bounded worst-case performance, but a random selection between any number of alternatives that's not a power of 2 has an unbounded worst case. But is it good enough?
  #46  
Old 11-08-2019, 07:41 AM
septimus's Avatar
septimus is offline
Guest
 
Join Date: Dec 2009
Location: The Land of Smiles
Posts: 19,963
Quote:
Originally Posted by Chronos View Post
Oh, and what if you take a Quicksort, but replace all of the comparisons with random selections? That should use the right number of bits-- Is it a valid shuffle?
Any oblivious procedure using 50-50 coin tosses will have a number of equally likely outcomes equal to a power of two (i.e. some 2n). This is (conceptually) true even if some sequences allow premature completion: omitting the final k tosses is the same as treating 2k cases identically.

And, as ftg implied, such a set of size 2n cannot possibly provide a perfectly perfect shuffle! Because (52!) will not divide 2n evenly no matter how large n is.


In practice, computer PRNG's are (almost?) always a sequence of 50-50 bits. So, a perfect perfect shuffle is, strictly speaking, impossible to guarantee in finite time! In practice this is not a problem but — given that simple sentences like "How many random bits do they spend to shuffle a deck? I spend about 7.05*32 bits, in close agreement with log2(52!)" turned out to be hard-to-understand — I'll just let you take this on faith.

Last edited by septimus; 11-08-2019 at 07:42 AM.
  #47  
Old 11-08-2019, 03:42 PM
Chronos's Avatar
Chronos is offline
Charter Member
Moderator
 
Join Date: Jan 2000
Location: The Land of Cleves
Posts: 85,446
Yes, and that's basically what I realized in my third paragraph. Quicksort has a maximum number of comparisons it will ever need to make, but a perfect shuffle algorithm does not have a maximum number of random bits that's guaranteed to be enough, ergo, randomized Quicksort cannot be a perfect shuffle algorithm.
  #48  
Old 11-08-2019, 03:55 PM
ftg's Avatar
ftg is offline
Member
 
Join Date: Feb 2001
Location: Not the PNW :-(
Posts: 20,506
Quicksort is Ω(n log n). The algorithm given by septimus back in the 3rd post is linear time. Basically as fast as one can get. It's quick and dirty code without the dirt. Anything else is pretty much going to be slower and take longer to code.

The only real reason to discuss other methods concerns what methods use fewer random bits or are better if you don't completely trust your RNG.

I just cannot fathom that there are companies out there who pay people whose brains come up with idiocies like "Let's do a 1000 pair-wise random swaps." It's not just flawed, it's a waste of computer and people time. (Remember: the maintenance cost of code is far greater than the original development cost. Doing stupid stuff makes the maintenance cost a lot greater. And in this particular case ridiculously worse.)
  #49  
Old 11-08-2019, 04:24 PM
Chronos's Avatar
Chronos is offline
Charter Member
Moderator
 
Join Date: Jan 2000
Location: The Land of Cleves
Posts: 85,446
septimus's algorithm there is also n log n in number of bits required. Remember, choosing a random number from 1 to n requires more bits (on average) for a larger n. And the bound that a sorting algorithm must be at least n log n is based on the same thing we're talking about here, the number of bits needed to completely represent an n-element permutation.
  #50  
Old 11-08-2019, 06:47 PM
septimus's Avatar
septimus is offline
Guest
 
Join Date: Dec 2009
Location: The Land of Smiles
Posts: 19,963

Stirling's Approximation


Quote:
Originally Posted by Chronos View Post
septimus's algorithm there is also n log n in number of bits required.
More precisely, and quite close for largish n,
(n+.5) * log2(n) + (.919 - n)*1.442695 bits of random entropy
are consumed for an entropy-minimal n-shuffle, where 1.442695 is approx log2(e).

Last edited by septimus; 11-08-2019 at 06:49 PM.
Reply

Bookmarks

Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off

Forum Jump


All times are GMT -5. The time now is 10:25 PM.

Powered by vBulletin® Version 3.8.7
Copyright ©2000 - 2019, vBulletin Solutions, Inc.

Send questions for Cecil Adams to: cecil@straightdope.com

Send comments about this website to: webmaster@straightdope.com

Terms of Use / Privacy Policy

Advertise on the Straight Dope!
(Your direct line to thousands of the smartest, hippest people on the planet, plus a few total dipsticks.)

Copyright 2019 STM Reader, LLC.

 
Copyright © 2017