Getting someone to write a program for me?

brossa · March 24, 2009, 8:56pm

In order to serve that most noble of causes, namely winning an Internet bet, I need to find a way to solve a particular problem by either using an existing program or having someone write some code for me. The first approach is preferable, since I have no idea as to how to do the second.

Here is the problem: I want to create a database of all possible means of n cards drawn without replacement from a hat of m cards numbered x1,x2,x3,x4…etc. So drawing 4 cards from a hat of 200 gives 200x199x198x197 = 1,552,438,800 possible combinations, each with its own mean. I wish then to compare this database to a given number and determine exactly how many of the means are greater or less than that given number.

Can anyone point me in the right direction - such as a way to do this with Excel or something, or how to find someone to write me a program? I have a Statistics101 program with which I can check p means of n cards against a given number, but the p means are drawn randomly rather than systematically. For large n and m, I’d expect to have to let the thing run on a laptop for a looong time, but that’s not a dealbreaker.

iamthewalrus_3 · March 24, 2009, 10:24pm

It sounds like the problem isn’t that hard of one to solve in any reasonable programming language.

So, you’ve got a set of m cards with some numbers on them x[sub]i[/sub], and you want to consider all sets of n of them. By “mean”, do you mean arithmetic mean? as in: the average?

So, then you’ve got a whole bunch of averages, and you want to know how many of them are larger than some number p?

Do you need to know exactly? If not, then the easiest solution is to just do some random samples and see what fraction are larger than p. You can make this as accurate as you want by increasing the size of the test run.

Chronos · March 24, 2009, 11:01pm

In principle, this isn’t a hard problem, but the devil is in the details. If you literally want to do all 1.5 billion cases (as opposed to the Monte Carlo sampling (:3= suggests), you’re going to have to use a ton of memory, and most programming environments will make you jump through a bunch of hoops to use that much memory.

If there’s only one given number that you want to compare to (or a small number of them), then you could get away with not actually storing the database, but just looking at one draw at a time, and keeping running totals.

runcible_spoon · March 24, 2009, 11:41pm

Well, skipping over the limits that **Chronos **mentions, which are going to be an issue no matter how you do it if you really want to use a 200-card deck, you could go through Excel to Visual Basic for Applications. I started poking around with it, and it shouldn’t be too complicated. If no one has a better option, I’ll finish it tomorrow.

If you’re curious about the process, I started out with a macro to create a database for small sets of cards (like, 3 cards from a 6-card deck) and once I’m sure that’s working properly, I’ll probably drop the write portion, and just start counting how many of the means are sufficiently large. Of course, you could also probably just write an algebraic function to determine the answer, but I’m unfamiliar enough with statistics that I like being able to check my answers (at least, the early ones).

zut · March 25, 2009, 12:11am

Do you have to actually run all cases, or can you just calculate results? In your example, the averages are discrete values from 2.5 (10/4) up to 198.5 (794/4). There’s only one combination that results in 2.5 (1 + 2 + 3 + 4)/4, one that results in 2.75, two that result in 3, three that result in 3.25, etc. Seems to me there should be some standard permutation/combination calucualtion you could use to step through all possible discrete values.

brossa · March 25, 2009, 12:38am

Thanks very much for the quick responses.

Yes, I want to exhaustively pick n cards and calculate the arithmetic mean of their values. I’ll need to be able to specify what the card values are, since they won’t necessarily follow a nice pattern: 0, 1, 3, 3.45, 6.7, 9, 57, etcetera. I will be comparing the means to only one or two ‘experimental’ numbers. So far I can’t see a reason to actually keep track of the individual means, as long as an exact running tally is kept of whether each mean is greater than the ‘experimental’ number, and it can be shown that every possible combination of n cards has been tested.

(:3= and Chronos have struck right to the heart of the matter: the issue is one of comparing Monte Carlo approximations (for a variable number of samplings) to an exact answer. I’ve already got a way to run Monte Carlo but to settle the issue I need to show what the brute-force result is. For the purposes of this bet, just running Monte Carlo with a number of samplings that is, say, 100 times the number of possible combinations of n cards from m is insufficient.

Sunspace · March 25, 2009, 12:54am

If you have money, I know a guy who wants programming jobs…

RaftPeople · March 25, 2009, 2:43am

I have a working program (kind of a fun little challenge). How many numbers (cards) do you have, if not a ridiculous amount, just post them with commas separating them, I’ll copy into the program and run it for you.

Peter_Morris · March 25, 2009, 3:06am

Not quite that bad, actually, because the order of the cards doesn’t matter. I make it 64.6 million.

brossa · March 25, 2009, 4:37am

I don’t know that it matters, but I forgot to add that some of the cards are repeated multiple times. It’s late here and I’m tired and can’t figure out how the repeated cards affect Peter Morris’s point about ordering.

At any rate, here’s the first set - 72 cards in total - six copies each of the following: 0, 10, 20, 25, 27.5, 29, 30, 30.5, 31, 31.5, 32, and 32.5. Draw 4 cards without replacement and determine the arithmetic mean of their values. Compare with the ‘experimental mean’ of 31.75. What percentage of means are equal to or greater than 31.75?

If that appears doable, the next set gets quite a bit harder: 340 cards made up of 12 groups. Ten copies of 0, then 30 copies each of the following: 10, 20, 25, 27.5, 29, 30, 30.5, 31, 31.5, 32, and 32.5. Draw five without replacement. Compare the means to the ‘experimental mean’ of 32.1. What percentage of means are equal to or greater than the experimental mean?

RaftPeople · March 25, 2009, 6:27am

brossa:

I don’t know that it matters, but I forgot to add that some of the cards are repeated multiple times. It’s late here and I’m tired and can’t figure out how the repeated cards affect Peter Morris’s point about ordering.

At any rate, here’s the first set - 72 cards in total - six copies each of the following: 0, 10, 20, 25, 27.5, 29, 30, 30.5, 31, 31.5, 32, and 32.5. Draw 4 cards without replacement and determine the arithmetic mean of their values. Compare with the ‘experimental mean’ of 31.75. What percentage of means are equal to or greater than 31.75?

If that appears doable, the next set gets quite a bit harder: 340 cards made up of 12 groups. Ten copies of 0, then 30 copies each of the following: 10, 20, 25, 27.5, 29, 30, 30.5, 31, 31.5, 32, and 32.5. Draw five without replacement. Compare the means to the ‘experimental mean’ of 32.1. What percentage of means are equal to or greater than the experimental mean?

When I said working program, I may have wanted to qualify that as “it compiled and the combinations looked accurate when tested with 6 cards 3 drawn”. I believe my program is creating the set of combinations not permutations, hopefully that is what you want.

For the first set, this is what I get:
Total combinations=1028790
Total with mean (as in the average of the 4 cards drawn) > =31.75=8706

RaftPeople · March 25, 2009, 7:06am

I’m running the big one (340 with 5 cards drawn). It’s going through about 1 billion combinations per minute, this is probably going to take a while.

It’s at 9 billion and not a single hit yet (my algorithm starts with the first 5 cards selected, then the rightmost card shifts up one position, etc. so it starts with the lowest numbers in your list)

brossa · March 25, 2009, 11:58am

Thanks very much for doing this. The first one (8706/1028790 = 0.008462) matches the Monte Carlo numbers that I was getting (0.00846). I’m not gonna sweat significant digits on this one. Looking great so far!

When you say ‘total combinations’ I take it that that means total number of unique means, so for example all the various ways of drawing four zeros get counted once? That’s fine as long as we can establish somehow what the exact relationship is between unique means and total permutations - the formula for that escapes me at the moment.

RaftPeople · March 25, 2009, 2:55pm

It finished after I went to bed, here are results (it took 36 minutes):

Total combinations: 36,760,655,568
Mean>=32.1: 22,536,306

RaftPeople · March 25, 2009, 3:10pm

This is how my algorithm works (example with 6 cards, 3 drawn at a time):



Card Value               0     0     10    10    20    20
Card Index               0     1     2     3     4     5
Cards Drawn 1st          X     X     X
Cards Drawn 2nd          X     X           X
Cards Drawn 3rd          X     X                 X
Cards Drawn 4th          X     X                       X
Cards Drawn 5th          X           X     X
Cards Drawn 6th          X           X           X
Cards Drawn 7th          X           X                 X
Cards Drawn 8th          X                 X     X
Cards Drawn 9th          X                 X           X
Cards Drawn 10th         X                       X     X
Cards Drawn 11th               X     X     X
Cards Drawn 12th               X     X           X
etc.

It’s not perrmutations because the order of drawing the cards doesn’t matter, it’s just unique combinations of the cards available.

But it’s not unique means because 2 different combinations could have the same mean. For example, in the set you gave me with 10 zero cards, the first 2 sets of 5 cards drawn (1,2,3,4,5 and 1,2,3,4,6) will both have a mean of 0, but they are unique combinations of the cards so they both need to be counted (unless you were looking for something else).

brossa · March 25, 2009, 6:07pm

Thirty-six billion thanks for running that for me, RaftPeople! Man, the SDMB is an awesome place. Now I can go off and school Somebody Who Is Wrong on the Internet.

RaftPeople · March 25, 2009, 7:11pm

Did the percent work out ok on that one?

Also, you have me curious, what was the nature of the debate?

RaftPeople · March 25, 2009, 7:18pm

Before you do that, we should check our work. I’m checking to see if number of combinations is same as if using the formula to calc number of combo’s.

RaftPeople · March 25, 2009, 7:25pm

Ok, it looks correct.

Topic		Replies	Views
Stats Question Factual Questions	9	749	March 1, 2003
Statistics help for an old codger Factual Questions	6	859	December 15, 2008
Math problem for a dope, Part Deux... Factual Questions	18	1909	September 13, 2016
Several probabilities questions Factual Questions	7	1814	November 27, 2012
Probabilities involving multiple groups Factual Questions	8	784	September 19, 2006

Getting someone to write a program for me?

Related topics