The Simple Birthday Problem |
Simulation of the simple birthday experiment
Consider the birthday experiment of distributing k balls into the n cells. We will record the outcome of the experiment as the random vector
X = (X1, X2, ..., Xk)
where Xi is the number of the cell containing ball i.
Thus, the sample space is
S = {1, 2, ..., n}k.
Our basic modeling assumption is that the balls are independently and randomly placed in the cells. Mathematically this means that the basic outcome variables
X1, X2, ..., Xk
are independent and each is uniformly distributed on {1, 2, ..., n}.
1. Show that the modeling assumption is equivalent to X being uniformly distributed on S.
The simple birthday problem is to compute the probability of the event
Bn,k: some cell contains 2 or more balls.
If we choose k people at random and note their birthdays (thus n = 365), then the birthday problem is to compute the probability that at least two people have the same birthday (this special case is the origin of the name).
2. Show that in terms of the basic outcome variables
Bn,k = {Xi = Xj for at least one distint pair of indices i, j}
The solution of the birthday problem is an easy exercise in combinatorial probability.
3. Use the multiplication rule of combinatorics and Exercise 1 to show that
Hint: The complementary event occurs if and only if the outcome vector X forms a permutation of size k from {1, 2, ..., n}
The fact that the probability is 1 for k > n is sometimes referred to as the pigeonhole principle: if more than n pigeons are placed into n holes then at least one hole has 2 or more pigeons.
4. Let n = 365 (the true birthday problem). Show that the probability that some cell contains 2 or more balls is
In the simulation of the birthday experiment, you can vary n and k with scroll bars. When you run the experiment, the basic varibles Xi and the cell counts of the occupied cells are shown on each update. The indicator variable for the birthday event is recorded on each update and the density function is shown in the distribution graph and table.
5. In the birthday experiment, set n = 365. For k = 20, 30, 40, 50, and 60, run the experiment 1000 times each and compute the relative frequency of the event that some cell contains 2 or more balls. Compare the relative frequencies with the probabilities computed in Exercise 4.
In spite of its easy solution, the simple birthday problem is famous because, numerically, the probabilities can be a bit surprising. In the setting of Exercise 4, note that with a just 60 people, the event is almost certain! Mathematically, the rapid increase in the birthday probability in Exercise 3, as k increases, is due to the fact that the denominator grows much faster than the numerator.
6. In the birthday experiment, set n = 365. Vary k and note graphically how the probability changes.
7. Repeat Exercise 6 with n = 100.
8. Repeat Exercise 6 with n = 200.
9. Let bn,k denote the probability of the complementary event, that no cell contains two or more balls. Prove the following recursion relation in two ways: first starting with the result in Exercise 3, and then by using a conditional probability argument.
10. Let n = 52 (corresponding to birth weeks). Use Exercise 9 to find the smallest value of k such that the probability that some cell contains 2 or more balls at least 1 / 2.
11. Run the birthday experiment 1000 times with n = 52 and the value of k that you found in Exercise 10. Compare the relative frequency of the event that some cell contains 2 or more balls with the probability that you found in Exercise 10.
Ball and Cell Experiments |