A Condition for Optimality


Recall that the stopping rule for the game of red and black is to continue playing until the gambler is ruined or his fortune reaches the target fortune a. Thus, the gambler's strategy is to decide how much to bet on each trial before he must stop. Suppose that we have a class of strategies that correspond to certain valid fortunes and bets:

A: set of fortunes, Bx: set of valid bets for x in A.

For example, sometimes (as with timid play) we might want to restrict the fortunes to integers from 0 to a; other times (as with bold play) we might want to use the interval [0, 1] as the fortune space. As for the bets, we will always assume that the gambler cannot bet what he does not have and does not bet more than he needs in order to reach the target. Thus, we at least have the conditions

x in A, y in Bx implies 0 ≤ y ≤ min{x, a - x}
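For concreteness, the fortune and bet structure can be sketched in code (an illustration, not from the text; the helper names valid_bets, timid_bet, and bold_bet are hypothetical), using the integer fortune space A = {0, 1, ..., a}:

```python
# Hypothetical sketch of the bet structure on A = {0, 1, ..., a}.

def valid_bets(x, a):
    """All bets y with 0 <= y <= min(x, a - x)."""
    return list(range(min(x, a - x) + 1))

def timid_bet(x, a):
    """Timid play: bet a single unit while play continues."""
    return 1 if 0 < x < a else 0

def bold_bet(x, a):
    """Bold play: bet the whole fortune, or just enough to reach the target."""
    return min(x, a - x)

a = 10
for x in (2, 5, 8):
    print(x, valid_bets(x, a), timid_bet(x, a), bold_bet(x, a))
```

Timid and bold play are the two extreme choices from the same set of valid bets.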

Moreover, we always restrict our strategies to those for which the stopping time N is finite.

A strategy with win probability function V is optimal if for any other strategy with win probability function U we have

U(x) ≤ V(x) for x in A.

Mathematical Exercise 1. Show that if there exists an optimal strategy, then the optimal win probability function is unique.

However, there may not exist an optimal strategy or there may be several optimal strategies. Moreover, the optimality question depends on the value of the trial win probability p, in addition to the structure of fortunes and bets.

Suppose now that S is a strategy with win probability function V. Our goal in this section is to show that if

  1. pV(x + y) + qV(x - y) ≤ V(x) for x in A, y in Bx,

then S is optimal.
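As a quick numeric illustration (not part of the original argument), take S to be timid play on A = {0, 1, ..., a} with unit bets, so that V is the familiar gambler's-ruin win probability. For this V, condition (1) holds with equality at the only valid bet y = 1:

```python
# Check, for timid play with unit bets, that p V(x+1) + q V(x-1) = V(x).
# timid_V is the standard gambler's-ruin win probability (assumed here
# for concreteness).

def timid_V(x, a, p):
    q = 1 - p
    if p == 0.5:
        return x / a
    r = q / p
    return (r**x - 1) / (r**a - 1)

a = 10
for p in (0.4, 0.5, 0.6):
    q = 1 - p
    for x in range(1, a):
        lhs = p * timid_V(x + 1, a, p) + q * timid_V(x - 1, a, p)
        assert abs(lhs - timid_V(x, a, p)) < 1e-12
print("condition (1) holds with equality for unit bets")
```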

Mathematical Exercise 2. Consider the following strategy: if the initial fortune is x in A, we pick y in Bx and then bet y on the first trial; thereafter we follow strategy S. Condition on the outcome of the first trial to show that the win probability function for this new strategy is

U(x) = pV(x + y) + qV(x - y)
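The identity can also be checked by simulation. In the sketch below (illustrative only), S is taken to be timid play with the familiar gambler's-ruin win probability, and the modified strategy bets y on the first trial and one unit thereafter:

```python
import random

def timid_V(x, a, p):
    """Gambler's-ruin win probability for unit bets (assumed form)."""
    q = 1 - p
    if p == 0.5:
        return x / a
    r = q / p
    return (r**x - 1) / (r**a - 1)

def play(x, a, p, first_bet, rng):
    """Bet first_bet on trial 1, then follow timid play until 0 or a."""
    bet = first_bet
    while 0 < x < a:
        x += bet if rng.random() < p else -bet
        bet = 1
    return x == a

rng = random.Random(1)
a, p, x, y = 10, 0.45, 5, 3
n = 100_000
wins = sum(play(x, a, p, y, rng) for _ in range(n))
exact = p * timid_V(x + y, a, p) + (1 - p) * timid_V(x - y, a, p)
print(wins / n, exact)  # the two values should agree to a few decimals
```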

Thus, the theorem we are trying to prove in this section can be restated as follows: If S is optimal with respect to the class of strategies in Exercise 2, then S is optimal over all strategies.

Suppose now that (1) holds. Let T be an arbitrary strategy with win probability function U. The random variable

V(X_n)

can be interpreted as the probability that the gambler wins if he follows strategy T up to time n and strategy S thereafter.

Mathematical Exercise 3. Condition on the outcome of trial n to show that

E[V(X_n) | X_0 = x] = E[pV(X_{n-1} + Y_n) + qV(X_{n-1} - Y_n) | X_0 = x]

Mathematical Exercise 4. Use the result of Exercise 3 and the optimality condition (1) to show that for n = 1, 2, ...

E[V(X_n) | X_0 = x] ≤ E[V(X_{n-1}) | X_0 = x]

Mathematical Exercise 5. Use the result of Exercise 4 to show that

E[V(X_n) | X_0 = x] ≤ V(x) for n = 1, 2, ...

Mathematical Exercise 6. Take the limit as n → ∞ in the result of Exercise 5 to show that

E[V(X_N) | X_0 = x] ≤ V(x), where N is the stopping time for strategy T.

Mathematical Exercise 7. Show that

E[V(X_N) | X_0 = x] = U(x)
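The key observation is that play stops only at fortune 0 or a, and every win probability function satisfies V(0) = 0 and V(a) = 1, so V(X_N) is simply the indicator that the gambler reaches the target under T. The sketch below (illustrative; T is taken to be bold play, and V to be the timid-play win probability) shows the average of V(X_N) agreeing exactly with the empirical win frequency:

```python
import random

def timid_V(x, a, p):
    """Win probability function V for strategy S (timid play, assumed form)."""
    q = 1 - p
    if p == 0.5:
        return x / a
    r = q / p
    return (r**x - 1) / (r**a - 1)

def bold_final(x, a, p, rng):
    """Play the bold strategy T until ruin or the target; return the final fortune."""
    while 0 < x < a:
        y = min(x, a - x)
        x += y if rng.random() < p else -y
    return x

rng = random.Random(2)
a, p, x, n = 8, 0.45, 3, 100_000
finals = [bold_final(x, a, p, rng) for _ in range(n)]
mean_V = sum(timid_V(f, a, p) for f in finals) / n
win_freq = sum(f == a for f in finals) / n
print(mean_V, win_freq)  # identical, since V(0) = 0 and V(a) = 1
```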

Finally, from Exercises 6 and 7 we have

U(x) ≤ V(x) for x in A,

and so strategy S is optimal.

