What's the difference between the Monte Carlo method and polling?

billfish678 · February 1, 2016, 5:55pm

Is a random number generator and a computer program that is used to calculate the number of heads or tails on average for a perfect coin a Monte Carlo simulation?

Pasta · February 1, 2016, 6:45pm

Your statement is ever-so-slightly underspecified. By “number of heads on average” I’ll assume you mean something like “number of heads on average for a sequence of N throws.” In this case you can flip the fair coin via a random number generator N times, count the number of heads, and repeat this for a large number of trials. The average head count over this ensemble of trials is your estimate.
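The procedure described above can be sketched in a few lines of Python. This is only an illustration: the function name `average_heads` and its parameters are made up here, and the standard library `random` module stands in for whatever random number generator you prefer.

```python
import random

def average_heads(n_throws, n_trials, seed=None):
    """Estimate the mean number of heads in n_throws fair-coin throws."""
    rng = random.Random(seed)  # private generator so runs are reproducible
    total_heads = 0
    for _ in range(n_trials):
        # One trial: throw the coin n_throws times and count the heads.
        total_heads += sum(rng.random() < 0.5 for _ in range(n_throws))
    # Average head count over the ensemble of trials.
    return total_heads / n_trials

if __name__ == "__main__":
    # The analytic answer for N = 10 is N/2 = 5; the estimate should land nearby.
    print(average_heads(10, 100_000, seed=1))
```

The estimate wanders around N/2 with a spread that shrinks roughly like one over the square root of the number of trials.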

This is about as classic a Monte Carlo simulation as you can get, so yes. Obviously the answer to this case can be trivially obtained analytically, but you wouldn’t need to add much to the task to make a Monte Carlo simulation the natural approach to solve it (or at least to check an analytic calculation). Example: How many heads on average do you get in a sequence of throws of a fair coin where you stop throwing after seeing 1, 2, and 3 heads in a row, in that order, with only tails separating the subsequences?
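A rough sketch of how that less trivial example might be simulated follows. The stopping rule is stated loosely in the post, so this code assumes one particular reading: a run of exactly 1 head, then a run of exactly 2 heads, then 3 heads in a row, with one or more tails between consecutive runs, stopping the moment the third head of the final run appears. The names `heads_until_pattern` and `mean_heads` are hypothetical.

```python
import random

def heads_until_pattern(rng):
    """Throw a fair coin until the assumed stopping rule fires; return heads seen."""
    targets = (1, 2, 3)  # required lengths of consecutive runs of heads
    stage = 0            # how many target runs have been matched so far
    run = 0              # length of the current run of heads
    heads = 0            # total heads thrown in this sequence
    while True:
        if rng.random() < 0.5:          # heads
            heads += 1
            run += 1
            if stage == 2 and run == 3:
                return heads            # runs of 1 and 2 are done, third run hit 3
        elif run > 0:                   # a tail that ends a run of heads
            if run == targets[stage]:
                stage += 1              # this run matched the next target
            elif run == 1:
                stage = 1               # a lone head can start the pattern afresh
            else:
                stage = 0               # pattern broken, start over
            run = 0
        # a tail that follows another tail changes nothing

def mean_heads(n_trials, seed=None):
    """Average the head count over many independent sequences."""
    rng = random.Random(seed)
    return sum(heads_until_pattern(rng) for _ in range(n_trials)) / n_trials

if __name__ == "__main__":
    print(mean_heads(100_000, seed=1))
```

A different reading of the rule (for example, requiring exactly one tail between runs) would change only the bookkeeping in the `elif` branch, not the overall approach.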

billfish678 · February 1, 2016, 6:50pm

But it is classic to the point of absurdity. And I don’t think that is the only point but I am having trouble putting the ideas into words at the moment.

It’s kinda like Chronos’s point but at the other end of the spectrum.

Pasta · February 1, 2016, 6:59pm

Indeed, which is why I gave a less absurd example in my reply. The absurd version is perfectly reasonable in a classroom setting where, say, code structure is being explained and the student is expected to abstract away certain pieces of the problem. But, yeah, no one would try to do that for the purpose of actually learning the answer.

I’m not sure which pieces of the thread you are digging into. What do you think of my chess example?

billfish678 · February 1, 2016, 7:30pm

Yes, no doubt the absurd version has its uses in the classroom. I guess part of my point was you were using what you know to calculate what you know, which is silly. And again, this is in response to Chronos’s point that using random number generators to gather data makes the process a Monte Carlo process, which IMO is at another silly corner of this discussion space. Technically speaking ALL data gathering is random at some level.

I’m not sure I like the chess example to be honest (sorry). But then again I haven’t thought it through much and for some reason today I’m better at asking questions than giving explanations. Maybe my inner Buddha Feynman is coming out?

For some reason this question seems vaguely related. Does an integral you don’t have an exact solution for have an exact value?

Pasta · February 1, 2016, 8:09pm

I guess I meant whether you thought that that was a Monte Carlo calculation or not. Here the state space is small (only 64 sites to probe), but there are many real-world examples where the state space is combinatorially enormous, yet asking a question about any single randomly drawn state is easy. That is, being able to ask something about a random state (or site on the chessboard) is only part of the problem. You still have to sample the states, and in many problems that is hard to do exhaustively even if there is a finite number of them available and even if asking a question about each is easy.

The distinction I think many are leaning on is whether the information gathered is data or “just math”. What if the chessboard were instead a 15-dimensional version played by aliens, with 8^15 possible hyper-squares to probe? Each board site still has a yes/no answer about whether a pawn is there, but we know so little about this 15-D game that the MC procedure I outlined earlier feels a lot like data gathering rather than a calculation. But the only real distinction is that I increased the number of board sites to a level that you couldn’t practically enumerate with the tools or time available.
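To make the sampling idea concrete, here is a minimal sketch of probing random sites on a 15-dimensional board with 8 squares per side, which has 8^15 = 2^45 (about 3.5 × 10^13) sites. The earlier procedure is not reproduced in this part of the thread, so this is only a guess at its general shape. The `has_pawn` rule is entirely hypothetical, a stand-in for whatever yes/no question can be asked of a single site; it was chosen so that exactly 1/8 of all sites answer yes, which gives the estimate something to be checked against.

```python
import random

DIMS = 15   # number of board dimensions
SIDE = 8    # squares per side, so SIDE ** DIMS sites in total

def has_pawn(site):
    """Hypothetical placement rule: a pawn sits where the coordinates sum to a multiple of 8."""
    return sum(site) % SIDE == 0

def estimate_pawn_fraction(n_samples, seed=None):
    """Estimate the fraction of occupied sites by probing uniformly random sites."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        # Draw one site uniformly at random: an independent coordinate per dimension.
        site = tuple(rng.randrange(SIDE) for _ in range(DIMS))
        hits += has_pawn(site)
    return hits / n_samples

if __name__ == "__main__":
    print(SIDE ** DIMS)                           # total sites, far too many to enumerate casually
    print(estimate_pawn_fraction(100_000, seed=1))  # true fraction under this rule is 0.125
```

Nothing in the sampling loop depends on how the yes/no answer is produced, which is the point being made: the same code works whether `has_pawn` is a formula or a lookup into observed data.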

Some classes of data gathering are, but certainly not all. If the data I want to gather is “How many people attended the Super Bowl in person last year?” there needn’t be anything random about that.

A bit underspecified again, I think. Integral(x dx; x from 0 to 1.23456) has an exact value. However, I don’t have an exact solution for it because I haven’t attempted to get one. But that’s a question of practicality, which might not be what you mean by “don’t have an exact solution for”. Do you mean “no one knows how to calculate an exact solution (yet)?”
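For what it is worth, that particular integral is easy to work out: the antiderivative of x is x²/2, so the exact value is 1.23456² / 2 = 0.7620691968. The sketch below compares that with a plain Monte Carlo estimate (average the integrand at uniformly random points and multiply by the interval length). The helper name `mc_integrate` is made up for this illustration.

```python
import random

def mc_integrate(f, a, b, n_samples, seed=None):
    """Estimate the integral of f over [a, b] from uniformly random sample points."""
    rng = random.Random(seed)
    total = sum(f(rng.uniform(a, b)) for _ in range(n_samples))
    # Mean value of f on the interval times the interval length.
    return (b - a) * total / n_samples

UPPER = 1.23456
EXACT = UPPER ** 2 / 2  # closed-form value of the integral of x from 0 to UPPER

if __name__ == "__main__":
    estimate = mc_integrate(lambda x: x, 0.0, UPPER, 200_000, seed=1)
    print(estimate, EXACT)
```

The estimate agrees with the exact value only to within statistical noise, which is why nobody would use it for an integrand this simple. The same function works unchanged for integrands with no known closed form.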

billfish678 · February 1, 2016, 8:14pm

Something pooped (heh) into my head just now.

There is a difference between a Monte Carlo measurement and Monte Carlo calculation (and a Monte Carlo simulation for that matter).

A Monte Carlo measurement is what Chronos is talking about. But again IMO technically ALL data collection is random at some level. And even if you disagree with that, it is pretty much understood that you have to be careful when collecting data to avoid bias and blah blah blah so some random aspect MAY be designed into the process. This is such a given that something like polling will not mention Monte Carlo.

A Monte Carlo calculation uses randomness to calculate something that has an exact value in theory, but the math is too complex or there is no explicit solution, so Monte Carlo becomes an option. These WILL typically refer to Monte Carlo to differentiate them from the OTHER methods of doing the same.

A Monte Carlo simulation will often combine the two.

PS. Typing while you were typing and have to run for a while…

billfish678 · February 1, 2016, 8:18pm

No. I mean an exact value period (not sure my math vocab is up to snuff or that my point is well thought out for that matter).

Gotta run.

lazybratsche · February 1, 2016, 8:57pm

Except in the real world there is no easily-enumerated space from which to sample for a political poll. There is no single database with a unique record and reliable contact information for “All Americans”. After that it becomes even more of an art than a science because people don’t have to answer or may not give an accurate answer. There’s also no way to affordably contact a sample of people that doesn’t introduce biases. This is all to say that the best-attempt-at-random-sampling that pollsters do is such a trivial part of their job that using the modifier “Monte Carlo” doesn’t aid communication in any way.

Otherwise I mostly agree with you, because the boundary between sampling physical objects or their digital representations can be fuzzy. If you really wanted to, all the Monte Carlo methods I’m acquainted with for DNA and protein sequence analysis could be done by physically pulling balls from an urn or looking at bands on an old fashioned sequencing gel. But if it’s OK with you I’m going to use all the computational resources I have available so I can finish my PhD before the death of the solar system…

watchwolf49 · February 1, 2016, 11:04pm

This might be a hijack, and if so I apologize. In Monte Carlo, we can very accurately calculate the chances of hitting on black 29, 1 in 37, without doing any sampling at all. Now if we run 37,000,000 trials, our results should be very close to 1,000,000 wins. That confirms our calculation, while a large deviation would alert us that the wheel is unbalanced.
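A small simulation of that check might look like the following. It assumes a fair single-zero wheel with pockets numbered 0 to 36, and the function name `count_wins` is hypothetical. For a fair wheel the win count is binomial, so over 37,000,000 spins the standard deviation is sqrt(N · p · (1 − p)) ≈ 986 wins; "very close to 1,000,000" therefore means within a few thousand, not exact.

```python
import math
import random

POCKETS = 37  # single-zero wheel: pockets 0 through 36

def count_wins(n_spins, target=29, seed=None):
    """Spin a simulated fair wheel n_spins times and count how often target comes up."""
    rng = random.Random(seed)
    return sum(rng.randrange(POCKETS) == target for _ in range(n_spins))

def expected_and_spread(n_spins):
    """Binomial mean and standard deviation of the win count for a fair wheel."""
    p = 1 / POCKETS
    return n_spins * p, math.sqrt(n_spins * p * (1 - p))

if __name__ == "__main__":
    n = 3_700_000  # a tenth of the trial count in the post, to keep pure Python quick
    wins = count_wins(n, seed=1)
    mean, sd = expected_and_spread(n)
    print(wins, mean, sd)
```

On a real wheel, a win count many standard deviations away from the expected value would be the signal that something is off with the wheel.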

Chronos · February 2, 2016, 2:19am

OK, how about this? Is the Monte Carlo method for numerical integration a Monte Carlo method?

Measure_for_Measure · February 3, 2016, 2:26am

Yes.

Blink. I guess they probably do. :o I can fall back to common usage criteria: polling isn’t typically referred to as a Monte Carlo exercise. Probably because Monte Carlo methods typically involve more than simply pulling a number out of an urn: there are computational algorithms that are coded as well that use these random numbers as inputs. That is true for both simulation and numerical integration.