I'm thinking of a number. How many question do you need to find it?

A question in MPSIMS got me thinking about this.

Suppose I’m thinking of a number from 1 to 100, inclusive.
I’ll answer as many “Yes-No” questions as you need to find the answer. What is the best question asking strategy?

Yeah, I know the answer, too… Ask if the number is greater than 50. If it is, ask if the number is greater than 75, and so on. In essence, using the middle point of all the possible numbers as the number you ask the question about.

Is this the best strategy, however? Intuition tells me “yes”, but intuition also tells me not to pick “1-2-3-4-5-6” when I fill out my lottery slips.

Is there any other strategy that is equally as good?

Firstly I tie you to a rack and heat up the pliers. Then I ask “Are you going to tell me the f***ing number?”

If you answer “no” there will be a brief interlude involving much screaming.

I bet I could get the answer within two questions. :wink:

First, I’d want to know if it was odd or even. Then a multiple of 3. Then over or under 50.

O.K. I am completely inept when it comes to math. Is there any distinct advantage to asking about odd or even numbers before asking whether the number is greater than or less than fifty? Personally, I can’t see one, but that means nothing. If there is ne difference, however, I think asking whether the number is greater or less than fifty ought to come first. No answer to the odd or even question could answer the question by itself. There is a one percent chance that the “fifty” question could produce a final answer.

What if you ask if it’s higher or lower than 33?

I don’t think there’s any way you can improve on this.

Think of it this way. You’re basically breaking up the 100 numbers into two piles–you’ll know it’s in pile one if the person answers “yes”, pile two if the answer is “no”. What if the “yes” pile has only one number in it and the answer is “no”? That means you’ve now narrowed it down to 99 possible numbers–not really much of a help. So the strategy should be to divide the “yes” and “no” piles evenly–picking the midpoint and asking “higher or lower” is one way of doing this (asking odd/even would also be a good first question–anything to guarantee you’ll cut the pool of possibilities in half).

How many guesses will it take this way? Basically, take the logarithm (base two) of 100 (or however large the pool of numbers is) and round up. So it should take at most seven guesses if the number is from 1 through 100.

If the most information that can be gained in one answer is binary (i.e. yes or no), then the initial algorithm is as good as any. There are other possibilities, but that’s the simplist, and as I say nothing is more effective. The maximum number of questions needed is the total possibilities log 2.

I hadn’t thought of that, although technically asking “is it greater than or less than 50?” would elicit the answer “yes” under the terms of the OP. There really is no difference in terms of the number of possibilities eliminated. Using 33 might eliminate 67 of the possibilities, but it is more likely to eliminate 33 of them. Uisng 50 always eliminates 50% of the possible numbers. The two multiple questions, taken together, eliminate up to 84 possibilities.

The method you guys are working on with the halving of the numbers is known as a “binary search.” It is the optimal method for searching with the lowest number of steps. There’s a formula I used to know to determine the minimum number of steps but I’ve forgotten. You can find a detailed examination of the binary search algorithm in Donald Knuth’s “The Art of Computer Programming, vol 3, Sorting and Searching.”

saoirse wrote:

Only 1/3rd of the time. The other 2/3rds of the time, they only eliminate 67 possibilites. Averaged out over multiple guessing games, that’s about 73 possibilities gone after the first two questions.

Go for the extreme case: ask 1) is the number even or odd, and 2) is the number greater than 2? 2% of the time, you’ll have the number after these two questions (if answer 2 is ‘no’). The other 98% of the time, you’ll still have 49 possible numbers left. It works out to, on average, just 52.02 numbers eliminated after these two questions.

On the other hand, a straight binary search guarantees that after the first two questions, 75 possibilities will be eliminated, every time.

Of course, all of what’s been said here so far ignores an implication in the OP: that picking winning lottery numbers is somehow similar to a binary search. Lotteries don’t give you feedback (“the winning number is higher than the one you’ve picked”) until it’s too late. The optimal strategy for the lottery is not to play. :slight_smile:

Picking 1-2-3-4-5-6 as your lottery numbers may actually be a good strategy.

All combinations of numbers are equally likely (unlikely) to come up in a lottery drawing. 1-2-3-4-5-6 is as likely as 5-8-11-13-19-30 or any other combination of six numbers (or however many numbers you pick in your version of the lottery.

The reason picking 1 through 6 may be a better strategy is that fewer people pick this combination as it is considered more unlikely. That way if/when you win there is a bigger chance that you are alone in having 6 correct numbers and do not have to share your millions with other winners.

A good book on this subject is The winner’s curse : paradoxes and anomalies of economic life by Richard H. Thaler.

According to Thaler certain numbers are picked more often than others in Lotteries and by picking unpopular numbers you can increase the size of your win (but not your chance of actually winning). According to Thaler people often use birthdays of friends and family in picking their numbers which means that numbers below 31 will be picked more often than numbers above 31. So picking high numbers may be a better strategy.

But as DaveW pointed out the best strategy of all may be not playing at all. Atleast if you are going by expected return.

/Andreas

Yeah, it does seem somewhat natural to think that, but I’ve heard several times that in the VA state lottery, for example, many people play 1-2-3-4-5-6 every drawing; you would never win much playing those numbers. It’s much better to play numbers with no obvious pattern in them, to help insure you don’t have to split (assuming you win at all, of course).

ya know, i think that’s the best answer (considering the original question) so far. :smiley:

dives for cover

It seems to me you could do slightly better than a binary search. My understanding is that a binary search over 100 numbers is the same “power” as one over 128 numbers. Still have up to 7 guesses either way, right?

How about question 1,
is the number less than or equal to 64?

If the answer is yes, you have 64 answers to search, and your binary search has the same power as if there were 60. If the answer is no, you’re down to only 36 instead of 50.

No wait, 36 is still more than 32 which seems to be the threshold you would need.

But there’s something here… anyone else want to chip in on this idea?

muttrox, I think a scheme like yours is better, at least in reducing the average number of questions required:

Q1: Is the number greater than 64? If not, perform a standard binary search, which will take exactly 6 more questions, a total of 7 questions.

Otherwise,Q2: Is it greater than 80? If not, 4 more questions required, a total of 6 Qs.

Otherwise, Q3: Is it greater than 88? If not, 3 more Qs reqired, total 6.

Otherwise, Q4: Is it greater than 92? If not, 2 more Qs required, total 6.

Otherwise, 3 more questions are required, for a total of 7 questions.

So this scheme would take 6 questions to identify a number between 64 and 92, otherwise it would take 7 questions, which is certainly better than the binary search, which would almost always take 7 questions.

Oops. On second thoughts, it would often take only 6 questions.

Thanks Usram, that’s exactly the kind of scheme I was thinking about, I just couldn’t articulate it.

muttrox wrote:

Well, see above. If the first question you ask is “is the number greater than 64?” then there’s a 64% chance that you’ve eliminated 36 possibilities, and a 36% chance you’ve gotten rid of 64 numbers. Over many guessing games, this first question eliminates, on average, 46.08 numbers. Asking about 50, though, eliminates 50, every time.

Thinking about this more, if the answer to the 64 question is no, you’re left with 64 possibilities, which will require a further 6 questions, or 7 total, which is the most you’ll need for the standard search, anyway. No difference. And, because 36 is more than 32, you’ll need either 5 or 6 more questions if the answer to the first question is ‘yes’, so no difference there, either (since in the usual search, you’ll sometimes get it in 6 questions instead of 7 - those nasty fractions).

So, it seems to me there’s no real difference at all, except in how many possible numbers you want to eliminate early on. But using a first number of more than 64 (or less than 36) will definitely mess things up by adding unneeded questions.

Now, here’s something that’s a little more interesting: the above is written with the assumption that if the number to guess is, indeed, 50, asking the question “is the number greater than 50?” will result in a ‘no’ answer, and the search will continue.

But if we “change the rules” so that a third answer is possible (“the number is 50”), which ends the search, we throws things off. See, in a “pure” search, you always need either 6 or 7 questions to get the answer. If this twist on the rules is tossed in, you can get the right answer in just one question. If not, and the answer is ‘no’, then there are just 49 possibilties left, because you know it’s less than 51, but it’s also not 50, leaving just 1 through 49.

The binary searches we computer people do all the time actually do this, instead of the “pure” search, in the interest of speed. If you’re going to go through the trouble of getting the data and comparing it to the “current guess,” well, dangit, if it’s equal you can stop right away.

Usram wrote:

Nope. Same number exactly. Go ahead and work out the binary tree for guessing a number between 1 and 25. 7 of these numbers will take just 4 questions, the rest 5. Multiply this result by four (the first two questions in the full 1-100 game divide things up into 4 blocks of 25 numbers), and you find that 28 numbers get guessed in 6 questions, the rest in 7, which is just what you wrote down (given that you’ve got a typo in there - the number 64 needs 7 questions, it’s 65 through 92 which only need 6).

Hey, here’s something nifty:

Using a “pure” binary search of n possible numbers, the minimum number of questions (call it Q[sub]min[/sub]) is the integer part (no decimals) of log[sub]2[/sub]n. The maximum, Q[sub]max[/sub] is Q[sub]min[/sub]+1 (except when n is a power of 2, in which case don’t add 1). This isn’t the nifty part yet, as this has already been pointed out by Cabbage and Bill H..

The nifty part is this: take the fractional part of log[sub]2[/sub]n and subtract it from 1 - call this value x, for the heck of it. Calculate 2[sup]x[/sup], and take the fractional part of the result. This appears to be the percentage of numbers for which Q[sub]min[/sub] will be the number of questions needed.

This works for 100 (28%, see my previous post), 3 (33%), 5 (60%), 6 (33%), 7 (14%), 9 (77%) as well as for any n which is a power of 2 (well, they’re special cases, since Q[sub]min[/sub]=Q[sub]max[/sub], .000 must be taken as 100%). I’m quickly running up against the limit of my ability to create binary trees in my head, and after having drawn two of them today already, I’ll leave 10 and up for braver souls than I (gotta check the math answer against real life, after all).

Or is this already in some book somewhere?