Okay, I have a pretty solid math background but it’s been a long time (I’ve spent the last three years learning useful stuff like the parol evidence rule and the rule against perpetuities). But I have a general interest in stuff like game theory, so here goes my simple question.
Given a basic payoff matrix, as follows:
P Q
R S
with player A deciding which row and B deciding which column to select, what’s the formula for an optimal strategy?
If A chooses row 1 with probability x, and B chooses column 1 with probability y (both between 0 and 1), then A’s expected payoff is
P(a) = xyP + x(1-y)Q + (1-x)yR + (1-x)(1-y)S
Since this is a zero-sum game, B’s is simply -P(a), and of course both players wish to maximize their profit / minimize their loss given perfect play from the other.
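For anyone who wants to experiment, here is a minimal Python sketch of that expected-payoff formula. The function name `expected_payoff` is just an illustrative choice, and x and y are assumed to be probabilities between 0 and 1.

```python
def expected_payoff(x, y, P, Q, R, S):
    """A's expected payoff in the zero-sum game [[P, Q], [R, S]].

    x: probability that A plays row 1 (assumed to lie in [0, 1])
    y: probability that B plays column 1 (assumed to lie in [0, 1])
    """
    return (x * y * P
            + x * (1 - y) * Q
            + (1 - x) * y * R
            + (1 - x) * (1 - y) * S)


def expected_payoff_for_B(x, y, P, Q, R, S):
    """B's expected payoff: the negative of A's, since the game is zero-sum."""
    return -expected_payoff(x, y, P, Q, R, S)
```

With pure strategies this just reads off a matrix entry: `expected_payoff(1, 1, P, Q, R, S)` returns P and `expected_payoff(0, 0, P, Q, R, S)` returns S.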
So the basic question is how do you calculate this? Is there a simple formula, or do you have to do some sort of recursive iteration?
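If memory serves, you first check for a saddle point: an entry that is both the smallest in its row and the largest in its column. If one exists, both players should simply play the pure strategies that lead to it (this covers the case where one row or column “dominates” the other), so A picks either the first row all the time or the second all the time. If there is no saddle point, then to find the optimal mixed strategy for A you solve the equation xP+(1-x)R=xQ+(1-x)S, which gives x=(S-R)/(P-Q-R+S). With no saddle point this x always lies strictly between 0 and 1, and A should choose the first row with probability x and the second with probability 1-x. The saddle-point check matters: for the matrix with rows 5 4 and 1 2 the equation gives x=1/2, yet the first row dominates and A should pick it every time.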
If this is correct (my memory could be off), then the corresponding optimal mixed strategy for B is given by the solution to yP+(1-y)Q=yR+(1-y)S, i.e. y=(S-Q)/(P-Q-R+S) (again, if there exists a solution between 0 and 1).
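The two indifference equations can be solved in closed form, so no iteration is needed. Below is a rough Python sketch of that recipe, not a definitive implementation. The name `solve_2x2` is made up for illustration, and it assumes integer (or otherwise exactly representable) payoffs so that `fractions.Fraction` keeps the arithmetic exact. It looks for a pure-strategy saddle point first, because the equations only give the optimal mix when there is none.

```python
from fractions import Fraction


def solve_2x2(P, Q, R, S):
    """Return (x, y, value) for the zero-sum game [[P, Q], [R, S]].

    x = probability that A plays row 1
    y = probability that B plays column 1
    value = A's expected payoff under optimal play by both sides
    """
    P, Q, R, S = (Fraction(v) for v in (P, Q, R, S))
    rows = [(P, Q), (R, S)]
    cols = [(P, R), (Q, S)]

    # Pure-strategy check: best worst case for A vs. best worst case for B.
    maximin = max(min(r) for r in rows)
    minimax = min(max(c) for c in cols)
    if maximin == minimax:
        # Saddle point: both players stick to a single row / column.
        i = max(range(2), key=lambda k: min(rows[k]))
        j = min(range(2), key=lambda k: max(cols[k]))
        return Fraction(1 - i), Fraction(1 - j), maximin

    # No saddle point: d is nonzero and both x and y land strictly in (0, 1).
    d = P - Q - R + S
    x = (S - R) / d            # from xP + (1-x)R = xQ + (1-x)S
    y = (S - Q) / d            # from yP + (1-y)Q = yR + (1-y)S
    value = (P * S - Q * R) / d
    return x, y, value
```

For instance, `solve_2x2(5, 4, 1, 2)` takes the saddle-point branch and returns x = 1, y = 0 with value 4, meaning A always plays row 1 and B always plays column 2.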
So for example if the payoff matrix is
3 1
0 2
Then A’s best strategy is to pick the first row 1/2 of the time and the second 1/2 of the time, while B’s best strategy is to pick the first column 1/4 of the time and the second column 3/4 of the time. Then the expected payoff for A is 3/2.
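Those numbers can be double-checked with a few lines of Python. This is only a sanity check on the example matrix, using exact fractions. The variable names are arbitrary.

```python
from fractions import Fraction

M = [[3, 1],
     [0, 2]]
x = Fraction(1, 2)   # probability that A plays row 1
y = Fraction(1, 4)   # probability that B plays column 1

# A's expected payoff against each of B's pure columns, given A's mix x.
vs_col = [x * M[0][j] + (1 - x) * M[1][j] for j in range(2)]

# A's expected payoff from each pure row, given B's mix y.
from_row = [y * M[i][0] + (1 - y) * M[i][1] for i in range(2)]

print(vs_col, from_row)
```

All four entries come out equal to 3/2. That is the point of the two equations: each player's mix leaves the opponent indifferent between their two pure choices.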
I probably should point out that by “optimal”, I mean that this should be an optimal minimax strategy; i.e. a strategy producing the best possible worst-case scenario. The idea is that any other mixed strategy by A fares worse against some strategy by B, and vice versa.
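To see the “best possible worst case” idea concretely, here is a small brute-force sketch for the example matrix 3 1 / 0 2. It is only meant as an illustration. The helper name `worst_case_for_A` and the grid of 101 candidate values for x are arbitrary choices. Because A’s payoff is linear in y, B’s most damaging reply to any fixed x is always one of the two pure columns, so it is enough to take the minimum over those.

```python
from fractions import Fraction


def worst_case_for_A(x, P, Q, R, S):
    """A's guaranteed expected payoff when playing row 1 with probability x."""
    against_col_1 = x * P + (1 - x) * R
    against_col_2 = x * Q + (1 - x) * S
    return min(against_col_1, against_col_2)


# Try x = 0, 1/100, 2/100, ..., 1 and keep the x with the best worst case.
grid = [Fraction(k, 100) for k in range(101)]
best_x = max(grid, key=lambda x: worst_case_for_A(x, 3, 1, 0, 2))
best_guarantee = worst_case_for_A(best_x, 3, 1, 0, 2)
print(best_x, best_guarantee)
```

On this grid the search lands on x = 1/2 with a guaranteed payoff of 3/2, matching the example. Any other x on the grid does strictly worse against one of B’s columns.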