  #51  
Old 11-08-2019, 10:44 PM
DPRK is offline
Guest
 
Join Date: May 2016
Posts: 3,967
Quote:
Originally Posted by Chronos View Post
Yeah, repeat it a thousand times, and even a not-very-good shuffling procedure will probably be acceptable. But random numbers are expensive.
If I'm remembering posts from people in the industry correctly, slot machines and digital casinos have to pass all kinds of inspection and certification, but having a source of truly random numbers is not a requirement.

Q: is it a requirement to be able to generate every possible permutation of even a single 52-card deck?

Last edited by DPRK; 11-08-2019 at 10:48 PM.
  #52  
Old 11-08-2019, 11:31 PM
Lance Turbo is offline
Guest
 
Join Date: Aug 1999
Location: Asheville, NC
Posts: 4,315
Quote:
Originally Posted by DPRK View Post
If I'm remembering posts from people in the industry correctly, slot machines and digital casinos have to pass all kinds of inspection and certification, but having a source of truly random numbers is not a requirement.

Q: is it a requirement to be able to generate every possible permutation of even a single 52-card deck?
I worked for a company that was a trusted third party between casino game manufacturers and gambling regulators. We were an international company that did work in pretty much every gambling jurisdiction in the world. There is no jurisdiction that I know of where a pseudo-random number generator would not suffice.

I did twice have to test hardware RNGs. One was essentially a clever application of a Geiger counter. That thing was slow as fuck and actually failed on first pass. It didn't fail because of the source of randomness, though; they were introducing detectable bias due to improper scaling.
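For a concrete picture of how scaling alone can sink an otherwise fine bit source, here's a toy Python 3 illustration (my own example, not what that device actually did): mapping raw 8-bit samples onto a 52-card range with a plain modulo.

Code:
from collections import Counter
import random

def biased_card(rng=random):
    # 256 raw values folded onto 52 cards: 256 = 4*52 + 48, so cards 0-47
    # each receive 5 raw values while cards 48-51 receive only 4.
    return rng.getrandbits(8) % 52

counts = Counter(biased_card() for _ in range(10**6))
print(counts[0] / counts[51])   # about 1.25 -- an easily detectable bias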

The other was a pretty cool device that relied on the behavior of photons as governed by quantum mechanics. That thing was pretty sweet.

It is generally a requirement in every jurisdiction that every advertised outcome be possible. For card games there was often additional language that games needed to faithfully simulate what would be expected if the game had been played with actual cards. We interpreted this to mean that every shuffle must be possible, but we never actually checked that directly because it would be computationally infeasible. We did, however, fail games when we could prove that particular shuffles could not be produced, as I alluded to in my earlier anecdote.
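As a back-of-the-envelope check on the "every shuffle must be possible" reading (plain arithmetic, not tied to any particular product): a deck has 52! orderings, so a generator whose seed or internal state carries fewer than log2(52!) bits cannot reach them all.

Code:
import math

# Bits needed to index every ordering of a 52-card deck.
print(round(sum(math.log2(k) for k in range(2, 53)), 2))   # 225.58

# A generator seeded with s bits has at most 2**s starting states,
# hence at most 2**s distinct shuffles of a fresh deck.
for s in (32, 64, 128, 256):
    print(s, 2 ** s >= math.factorial(52))                 # only s = 256 passes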
  #53  
Old 11-09-2019, 08:15 AM
ftg's Avatar
ftg is offline
Member
 
Join Date: Feb 2001
Location: Not the PNW :-(
Posts: 20,506
Quote:
Originally Posted by Chronos View Post
septimus's algorithm there is also n log n in number of bits required. Remember, choosing a random number from 1 to n requires more bits (on average) for a larger n. And the bound that a sorting algorithm must be at least n log n is based on the same thing we're talking about here, the number of bits needed to completely represent an n-element permutation.
You are confusing two different aspects of Complexity Theory. Quicksort analysis counts the number of major operations (comparisons, swaps, and so on).

If you count the number of bit operations, then even a simple loop that counts from 1 to n takes O(n log n) time. We do not do this in Algorithm Analysis as long as the word size is small, and log n is definitely considered small. After all, a 64-bit word size on a computer means you can do additions and such in constant time on numbers up to a bit over 18 quintillion (2^64). Go to double word size and a counter running from 0 to 2^128 would take an almost unimaginable amount of time.

If the algorithm actually requires long integers or some such, then counting bits matters. (E.g., prime factoring.)

There's a lot going on in Algorithm Analysis to keep things in the proper perspective that a non-expert wouldn't understand. But let me assure you that your claim that needing n log n random bits immediately implies n log n time is not at all how anyone knowledgeable in the field would see things. Again, it is no different than claiming that a simple for loop with no body takes n log n time.
  #54  
Old 11-09-2019, 12:04 PM
Chronos's Avatar
Chronos is offline
Charter Member
Moderator
 
Join Date: Jan 2000
Location: The Land of Cleves
Posts: 85,446
Yes, Quicksort needs n log(n) in all operations, but that's because the other operations are proportional to the comparisons. Any sorting routine must use at least n log(n) comparisons, because in order to sort a list, you must (directly or indirectly) determine what order it was in originally, which means you must distinguish one of the n! permutations, which means you need n log(n) bits of information about the original arrangement.

And yes, it's true that comparison isn't actually a constant-time operation, because it in principle takes longer for larger numbers. But for any practical number size, it's still far quicker than generating a good random bit. A random bit takes far longer than anything that would ordinarily be considered a "fundamental operation", and so it's absurd to measure the complexity of a shuffle as a count of anything other than random bits.
  #55  
Old 11-09-2019, 02:29 PM
Lance Turbo is offline
Guest
 
Join Date: Aug 1999
Location: Asheville, NC
Posts: 4,315
Some Python 3 code...

Code:
from random import shuffle
import datetime

def shuffle_time_trial(n, trials):
    # Time `trials` in-place shuffles of an n-element list
    # (each shuffle() call uses the RNG n - 1 times).
    my_list = list(range(n))
    start = datetime.datetime.now()
    for t in range(trials):
        shuffle(my_list)
    print(n, trials, datetime.datetime.now() - start)

trials = 100000
for n in range(100, 600, 100):
    shuffle_time_trial(n, trials)
It outputs...

Code:
100 100000 0:00:07.136390
200 100000 0:00:14.232014
300 100000 0:00:21.538563
400 100000 0:00:29.174170
500 100000 0:00:36.412494
This is not surprising since each shuffle calls the RNG n - 1 times.

Each call to the RNG produces 64 random bits. septimus pointed out that this is far more random bits than is absolutely necessary and suggests that he has a method that, on average, needs just a touch over log2(n!) bits, which is a lot less than 64 * (n - 1) for all reasonable values of n. (Note: log2(n!) does eventually overtake the linear bit count, but only at values of n well above the number of things we might ever want to shuffle; it only starts catching up around n = 2^64.)
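To put rough numbers on that gap, a quick sketch that takes the 64-bits-per-call figure at face value (the helper names are made up):

Code:
import math

def bits_consumed(n):
    # What the timed code above spends: 64 bits per RNG call, n - 1 calls per shuffle.
    return 64 * (n - 1)

def bits_required(n):
    # Information-theoretic floor: log2(n!) bits to select one of n! permutations.
    return sum(math.log2(k) for k in range(2, n + 1))

for n in (52, 100, 500):
    print(n, bits_consumed(n), round(bits_required(n)))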
  #56  
Old 11-09-2019, 04:36 PM
Chronos's Avatar
Chronos is offline
Charter Member
Moderator
 
Join Date: Jan 2000
Location: The Land of Cleves
Posts: 85,446
There is no linear method. You're assuming that a routine that gives n random bits is always the same thing, no matter the value of n.
  #57  
Old 11-09-2019, 08:03 PM
Lance Turbo is offline
Guest
 
Join Date: Aug 1999
Location: Asheville, NC
Posts: 4,315
The code I posted generates 64 * (n - 1) random bits. That is indeed linear with respect to n.
  #58  
Old 11-09-2019, 11:49 PM
septimus's Avatar
septimus is offline
Guest
 
Join Date: Dec 2009
Location: The Land of Smiles
Posts: 19,964
Miscellaneous remarks:

Quoted complexity bounds can get weird when you try to be very precise. I recall a computational geometry paper (addressing a problem like taking an awkwardly shaped piano through a series of doorways) where inside the O(), in addition to the usual n and log n factors, was A⁻¹(n), the inverse Ackermann function! (Not the ridiculously rapid-growing Ackermann; the inverse Ackermann, which plods along far more slowly than any imaginable tortoise!)

Even very high quality PRNGs are very fast. Hardware RNGs may be much more expensive. (How do the applications which think they need "true" randoms work anyway? I'd guess that, if speed is an issue, they use the hardware randoms as seeds for a software PRNG.)
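One plausible shape of that, sketched in Python 3 (just the idea, not a claim about how any particular application does it):

Code:
import os
import random

# Ask the OS entropy pool (hardware-backed on many systems) for a seed once,
# then let a fast software PRNG do the bulk of the generation.
rng = random.Random(int.from_bytes(os.urandom(32), "big"))
print(rng.random())

# Or draw every value from the OS source directly: slower, and not seedable.
print(random.SystemRandom().random())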

I'm not sure what Lance's demo is intended to prove. Of course a procedure that requires O(n log n) time can be presented as O(k n) if we choose some k > log max_n and experiment only with n < max_n!

OTOH, I'm happy to admit that my own random library, which minimizes random-bit consumption, was written just for fun. Fun is good!

BTW, another common use for randomness is ' { return 1 with probability p, otherwise 0 } ' Most codes will consume 32 (or 64) bits for this, although only 1 bit is needed when p is exactly 0.5. If we're thrifty with our random bits we can consume less than 1 bit when p ≠ 0.5 ! For example, suppose p = 0.25 exactly or p = 0.75. There's a trivial way to 'toss such a biased coin' while consuming exactly 2 random bits; in fact barely 0.811 bits are needed. Coding for this is, again, unnecessary, but Fun!
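For the record, the trivial exactly-2-bits version is a one-liner each way (a throwaway sketch, not the sub-1-bit trick):

Code:
import random

def coin_quarter(rng=random):
    # P(1) = 1/4: heads only when both fair bits come up 1. Always 2 bits.
    return rng.getrandbits(1) & rng.getrandbits(1)

def coin_three_quarters(rng=random):
    # P(1) = 3/4: heads unless both fair bits come up 0. Always 2 bits.
    return rng.getrandbits(1) | rng.getrandbits(1)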

Finally, looking over my own random library just now, I see a puzzle: Pick a random point uniformly distributed on an n-dimensional ball (or an n-dimensional sphere, take your pick). Assume n may be large. No fair Googling! Please start a new MPSIMS thread to present any solution.
  #59  
Old 11-09-2019, 11:57 PM
septimus's Avatar
septimus is offline
Guest
 
Join Date: Dec 2009
Location: The Land of Smiles
Posts: 19,964
Quote:
Originally Posted by septimus View Post
If we're thrifty with our random bits we can consume less than 1 bit when p ≠ 0.5 !
This should be
If we're thrifty with our random bits we can consume less than 1 bit (on average) whenever p ≠ 0.5 !
Damn that 5-second Edit window!
  #60  
Old 11-10-2019, 12:48 AM
Lance Turbo is offline
Guest
 
Join Date: Aug 1999
Location: Asheville, NC
Posts: 4,315
Quote:
Originally Posted by septimus View Post
I'm not sure what Lance's demo is intended to prove. Of course a procedure that requires O(n log n) time can be presented as O(k n) if we choose some k > log max_n and experiment only with n < max_n!
That is exactly what it is intended to show.
  #61  
Old 11-10-2019, 08:40 AM
Chronos's Avatar
Chronos is offline
Charter Member
Moderator
 
Join Date: Jan 2000
Location: The Land of Cleves
Posts: 85,446
OK, I can see how a 25% chance needs less than 2 bits on average, because half the time, you can stop after the first bit, and not need to call the second one at all. But I can't see how you could do it in less than 1 bit on average, because that means that you'd sometimes be using zero bits, and how do you decide if you're in the zero-bit case without using any randomness?

Unless you're referring to a large number of such "biased coin flips" in series, and we're using extra entropy left over from one of the previous trials, I suppose.
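For concreteness, the early-stop version of the p = 1/4 coin looks like this (a sketch; expected cost is 0.5*1 + 0.5*2 = 1.5 bits per call, and getting below 1 bit per flip does indeed require carrying leftover entropy across calls):

Code:
import random

def coin_quarter_lazy(rng=random):
    # First fair bit 0 -> tails immediately (probability 1/2, cost 1 bit);
    # otherwise a second fair bit decides (cost 2 bits).
    if rng.getrandbits(1) == 0:
        return 0
    return rng.getrandbits(1)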
  #62  
Old 11-10-2019, 09:22 AM
DPRK is offline
Guest
 
Join Date: May 2016
Posts: 3,967
Quote:
Originally Posted by Chronos View Post
OK, I can see how a 25% chance needs less than 2 bits on average, because half the time, you can stop after the first bit, and not need to call the second one at all. But I can't see how you could do it in less than 1 bit on average, because that means that you'd sometimes be using zero bits, and how do you decide if you're in the zero-bit case without using any randomness?

Unless you're referring to a large number of such "biased coin flips" in series, and we're using extra entropy left over from one of the previous trials, I suppose.
If your question is, what if "H" has probability 25% and "T" has probability 75%, and you want to represent a sequence of coin flips using a string of bits, then you can do it using a simple arithmetic encoding.
  #63  
Old 11-10-2019, 11:00 AM
septimus's Avatar
septimus is offline
Guest
 
Join Date: Dec 2009
Location: The Land of Smiles
Posts: 19,964
Quote:
Originally Posted by DPRK View Post
If your question is, what if "H" has probability 25% and "T" has probability 75%, and you want to represent a sequence of coin flips using a string of bits, then you can do it using a simple arithmetic encoding.
+++

The optimal such code will use 0.811278124459132864 bits on average, from Shannon's entropy formula:
H = -0.25*log2(0.25) - 0.75*log2(0.75) ≈ 0.8113 bits (equivalently -0.25*ln(0.25) - 0.75*ln(0.75) ≈ 0.5623 nats).
Arithmetic coding converts biased flips to bits ("50-50 bits"). We need the inverse: converting bits to biased flips. As I mentioned above ("in fact barely 0.811 bits are needed"), the library function for that is essentially optimal.
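Not my library's code, but one way to sketch that bits-to-biased-flips direction in Python 3: an exact-fraction interval scheme in the spirit of arithmetic decoding, written for clarity rather than speed (names and structure are made up for the illustration):

Code:
from fractions import Fraction
import random

def biased_flips(p, n, rng=random):
    # Produce n flips with P(1) = p from fair bits, reusing leftover entropy
    # across flips; average consumption approaches H(p) bits per flip
    # (about 0.811 for p = 1/4) as n grows.
    p = Fraction(p)
    x_lo, x_width = Fraction(0), Fraction(1)   # what we know so far about a uniform x in [0,1)
    lo, hi = Fraction(0), Fraction(1)          # interval encoding the flips emitted so far
    out, bits_used = [], 0
    for _ in range(n):
        cut = lo + p * (hi - lo)               # [lo, cut) will mean "1", [cut, hi) will mean "0"
        while True:
            if x_lo + x_width <= cut:          # x is certainly below the cut
                out.append(1); hi = cut; break
            if x_lo >= cut:                    # x is certainly above the cut
                out.append(0); lo = cut; break
            x_width /= 2                       # undecided: reveal one more fair bit of x
            if rng.getrandbits(1):
                x_lo += x_width
            bits_used += 1
    return out, bits_used

flips, bits = biased_flips(Fraction(1, 4), 2000)
print(sum(flips) / len(flips), bits / len(flips))   # roughly 0.25 heads, ~0.81 bits per flip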
  #64  
Old 11-10-2019, 11:22 AM
DPRK is offline
Guest
 
Join Date: May 2016
Posts: 3,967
Call it arithmetic decoding, then.
  #65  
Old 11-10-2019, 06:01 PM
ftg's Avatar
ftg is offline
Member
 
Join Date: Feb 2001
Location: Not the PNW :-(
Posts: 20,506
Quote:
Originally Posted by Chronos View Post
Yes, Quicksort needs n log(n) in all operations, but that's because the other operations are proportional to the comparisons. Any sorting routine must use at least n log(n) comparisons, because in order to sort a list, you must (directly or indirectly) determine what order it was in originally, which means you must distinguish one of the n! permutations, which means you need n log(n) bits of information about the original arrangement.

And yes, it's true that comparison isn't actually a constant-time operation, because it in principle takes longer for larger numbers. But for any practical number size, it's still far quicker than generating a good random bit. A random bit takes far longer than anything that would ordinarily be considered a "fundamental operation", and so it's absurd to measure the complexity of a shuffle as a count of anything other than random bits.
Sorting by comparison requires Ω(n log n) time. Not all sorting methods use comparisons; radix sort is a famous example. It's touted as "linear time", but that ignores the length of the keys. If you process m-bit keys in blocks of b bits, it takes O(n m/b) time. If m/b is really small then it's linear. (One good use of radix sort is to sort n things whose keys are between 0 and n. So the key size is log n, and that'll fit into a 64-bit word, two in really bad cases.) But if the key size is hundreds of bytes then things start looking very non-linear.
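A bare-bones LSD radix sort in Python 3, just to make the O(n m/b) accounting concrete (parameter names are made up; one stable bucketing pass per b-bit block):

Code:
def radix_sort(keys, key_bits=64, block=8):
    # Non-negative integer keys, sorted in key_bits/block stable passes,
    # each pass bucketing on one block-bit digit from least significant up.
    mask = (1 << block) - 1
    for shift in range(0, key_bits, block):
        buckets = [[] for _ in range(1 << block)]
        for k in keys:
            buckets[(k >> shift) & mask].append(k)
        keys = [k for bucket in buckets for k in bucket]
    return keys

print(radix_sort([90, 5, 170, 3, 255, 42], key_bits=8))   # [3, 5, 42, 90, 170, 255]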

Now, about comparisons being non-constant. If you do something like Mergesort and keep track of how "far in" a comparison got before hitting a mismatch between two consecutive keys in a sublist, you can save a lot of comparison time by using that measure to resume comparisons when merging lists. The total comparison time falls quite dramatically.

But the nature of the data Rules All. I would point out to my students that while the name field in the class roster is quite long, there would typically be only a handful of names that agree on two letters and rarely any that agree on 3 letters. So comparisons are fast here. OTOH, a ton of stuff like part numbers tend to have similar prefixes and mostly differ in the last symbols. So radix sort starts off really doing a lot of useful work. But then it starts wasting time, too. So a mixed approach is best.

There are a ton of tricks one can use to tweak sorting that the common undergrad Algorithms class doesn't cover.

I.e., there is no one "best" sorting method*. One has to be very careful in extrapolating sorting algorithms of one type to another domain.

And you are unreasonably hung up on certain bit operations but not others. Either you count bits on everything (and get absurd results), or you are practical and treat operations on word-sized quantities of around log n bits, which are built in on computers, as constant time (and get sane results). Look at WELL and similar RNGs: a constant number of single-word operations generates the next word.

* That goes for "Quicksort". It's okay on some things. Nothing special on others. YMMV applies.
  #66  
Old 11-10-2019, 07:12 PM
Chronos's Avatar
Chronos is offline
Charter Member
Moderator
 
Join Date: Jan 2000
Location: The Land of Cleves
Posts: 85,446
Doesn't radix sort depend on an assumed distribution of the objects to be sorted? Like, if you were sorting names, but by some fluke you had 50 names starting with A and only ten for all other letters combined (and didn't know that from the outset), then radix sort would perform poorly.
  #67  
Old 11-11-2019, 10:20 AM
ftg's Avatar
ftg is offline
Member
 
Join Date: Feb 2001
Location: Not the PNW :-(
Posts: 20,506
Quote:
Originally Posted by Chronos View Post
Doesn't radix sort depend on an assumed distribution of the objects to be sorted? Like, if you were sorting names, but by some fluke you had 50 names starting with A and only ten for all other letters combined (and didn't know that from the outset), then radix sort would perform poorly.
A vanilla Radix sort doesn't care about the distribution. The longer the keys the worse it goes. There are a ton of adaptations to speed things up in certain cases. Buckets and all that. And then the distribution starts to matter.
  #68  
Old Yesterday, 01:04 PM
Jim Peebles is offline
Guest
 
Join Date: Jul 2017
Posts: 446
Check out the research of Persi Diaconis on card shuffling:
https://en.m.wikipedia.org/wiki/Persi_Diaconis
(I haven't read this thread, and have no interest in doing so, but I searched for his name, and was surprised it hasn't come up.)
  #69  
Old Yesterday, 01:13 PM
DPRK is offline
Guest
 
Join Date: May 2016
Posts: 3,967
Quote:
Originally Posted by Jim Peebles View Post
Check out the research of Persi Diaconis on card shuffling:
https://en.m.wikipedia.org/wiki/Persi_Diaconis
(I haven't read this thread, and have no interest in doing so, but I searched for his name, and was surprised it hasn't come up.)
That paper has been mentioned at least 2 or 3 times in this thread already.