The Maximum Ball Number
In the ball and urn experiment, let V be the random variable that gives the largest number of a ball in the sample:
V = max{X1, X2, ..., Xn}
where Xi is the number of the i'th ball selected.
Suppose first that the sampling is with replacement.
1. Use the fact that the sample ball numbers are independent, and each is uniformly distributed on {1, 2, ..., N}, to show that
P(V ≤ k) = (k / N)^n for k = 1, 2, ..., N
2. Use the result of Exercise 1 to show that the density function of V is given by
P(V = k) = [k^n - (k - 1)^n] / N^n for k = 1, 2, ..., N
3. In the urn experiment, select sampling with replacement and random variable V. Vary the parameters and note the shape of the graph of the density function. For N = 50 and n = 10, run the experiment with an update frequency of 100 and watch the apparent convergence of the relative frequency function to the density function.
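The experiment in Exercise 3 can also be sketched outside the applet. The Python fragment below is a minimal illustration, not the applet's own code: it assumes the with-replacement density P(V = k) = [k^n - (k - 1)^n] / N^n, and the function names, the number of runs, and the fixed seed are all choices made for this example.

```python
import random
from collections import Counter

def density_with_replacement(k, N, n):
    """P(V = k) when n balls are drawn with replacement from {1, ..., N}."""
    return (k**n - (k - 1)**n) / N**n

def simulate_max(N, n, runs, seed=0):
    """Relative frequency of each value of V over the given number of runs."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) for reproducibility
    counts = Counter(
        max(rng.randint(1, N) for _ in range(n)) for _ in range(runs)
    )
    return {k: counts[k] / runs for k in range(1, N + 1)}

N, n = 50, 10
freq = simulate_max(N, n, runs=10_000)

# Compare relative frequency and density for the five largest values of k.
for k in range(N - 4, N + 1):
    print(k, round(freq[k], 4), round(density_with_replacement(k, N, n), 4))
```

Increasing the number of runs should bring the two printed columns closer together, which is the convergence the exercise asks you to watch.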
4. For fixed n and k, show that
P(V = k) → 0 as N → ∞
Interpret the result.
5. For fixed N, show that
P(V = N) → 1 as n → ∞
Interpret the result.
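A quick numerical look can suggest what the limits in Exercises 4 and 5 should be before proving them. The sketch below assumes the with-replacement density P(V = k) = [k^n - (k - 1)^n] / N^n, and assumes the limits of interest are N → ∞ with n and k fixed, and n → ∞ with N fixed. The particular parameter values are arbitrary.

```python
def density_with_replacement(k, N, n):
    """P(V = k) when n balls are drawn with replacement from {1, ..., N}."""
    return (k**n - (k - 1)**n) / N**n

# Exercise 4: n = 3 and k = 5 fixed, population size N growing.
for N in (10, 100, 1000):
    print("N =", N, "P(V = 5) =", density_with_replacement(5, N, 3))

# Exercise 5: N = 20 fixed, sample size n growing.
for n in (1, 10, 100, 1000):
    print("n =", n, "P(V = 20) =", density_with_replacement(20, 20, n))
```

The first column of probabilities shrinks as N grows and the second approaches 1 as n grows. This is evidence, not a proof.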
Suppose now that the sampling is without replacement.
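6. Use a combinatorial argument to show that
P(V = k) = C(k - 1, n - 1) / C(N, n) for k = n, n + 1, ..., N
where C(m, j) denotes the number of combinations of size j chosen from m objects.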
Hint: Consider the outcome of the experiment as an (unordered) combination of size n chosen from the population of size N and recall that these combinations are equally likely.
7. In the ball and urn experiment, select sampling without replacement and random variable V. Vary the parameters and note the shape of the graph of the density function. With N = 50 and n = 10, run the experiment with an update frequency of 100 and watch the apparent convergence of the relative frequency function to the density function.
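The same comparison can be sketched in Python for sampling without replacement. This is an illustration under stated assumptions, not the applet's code: it takes the density to be P(V = k) = C(k - 1, n - 1) / C(N, n) for k = n, ..., N, and the function names, run count, and seed are choices made for the example.

```python
import random
from collections import Counter
from math import comb

def density_without_replacement(k, N, n):
    """P(V = k) when n balls are drawn without replacement from {1, ..., N}."""
    return comb(k - 1, n - 1) / comb(N, n)

def simulate_max(N, n, runs, seed=0):
    """Relative frequency of each possible value of V over the given runs."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) for reproducibility
    population = range(1, N + 1)
    counts = Counter(max(rng.sample(population, n)) for _ in range(runs))
    return {k: counts[k] / runs for k in range(n, N + 1)}

N, n = 50, 10
freq = simulate_max(N, n, runs=10_000)

# Compare relative frequency and density for the five largest values of k.
for k in range(N - 4, N + 1):
    print(k, round(freq[k], 4), round(density_without_replacement(k, N, n), 4))

# The density values sum to 1, which is the content of the identity in Exercise 8.
print(sum(comb(k - 1, n - 1) for k in range(n, N + 1)) == comb(N, n))
```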
8. Show that the density function of V in Exercise 6 establishes the following binomial coefficient identity:
C(n - 1, n - 1) + C(n, n - 1) + ··· + C(N - 1, n - 1) = C(N, n)
9. If the sampling is without replacement, show that the expected value of V is
E(V) = n(N + 1) / (n + 1)
10. Use the result of Exercise 9 to show that
[(n + 1)V / n] - 1
is an unbiased estimator of N.
The estimator in Exercise 10 was used during World War II to estimate the number of German tanks N that had been produced. German tanks had serial numbers and the captured German tanks formed the sample data.
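The results of Exercises 9 and 10 can be checked with exact arithmetic for particular parameter values. The sketch below assumes the without-replacement density P(V = k) = C(k - 1, n - 1) / C(N, n). The helper names are hypothetical, and the sample of 4 serial numbers with maximum 60 is invented purely for illustration.

```python
from fractions import Fraction
from math import comb

def expected_max(N, n):
    """E(V), computed exactly from the without-replacement density."""
    total = comb(N, n)
    return sum(Fraction(k * comb(k - 1, n - 1), total) for k in range(n, N + 1))

def estimate_population(v, n):
    """The estimate [(n + 1)v / n] - 1 of N from an observed maximum v."""
    return Fraction(n + 1, n) * v - 1

N, n = 100, 15
ev = expected_max(N, n)

# E(V) should agree with n(N + 1) / (n + 1).
print(ev == Fraction(n * (N + 1), n + 1))

# By linearity, the mean of the estimator is [(n + 1) E(V) / n] - 1.
estimate_mean = Fraction(n + 1, n) * ev - 1
print(estimate_mean == N)

# Hypothetical data: 4 serial numbers observed, the largest being 60.
print(estimate_population(60, 4))
```

Exact fractions are used instead of floating point so that the comparisons are equalities and not approximations.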
11. In the ball and urn experiment, select sampling without replacement and set N = 100, R = 30, and n = 15. Run the experiment 50 times, updating after each run. On each run, compute the estimate of N based on Y and the estimate of N based on V. For each estimator, compute the square root of the average of the squares of the errors over the 50 runs. Based on these empirical error estimates, which estimator of N works better?
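Exercise 11 can also be tried outside the applet. The sketch below rests on assumptions that this section does not state: that R is the number of red balls in the urn, that Y is the number of red balls in the sample, and that the estimate of N based on Y is nR / Y. Labeling balls 1 through R as red is an arbitrary convention for the example. A run with Y = 0 has no finite estimate, so the sketch leaves such runs out of the Y-based error and reports how many runs were used.

```python
import random
from math import sqrt

def run_once(N, R, n, rng):
    """One run: draw n balls without replacement and compute both estimates."""
    sample = rng.sample(range(1, N + 1), n)
    y = sum(1 for ball in sample if ball <= R)  # assumption: balls 1..R are red
    v = max(sample)
    est_y = n * R / y if y > 0 else None        # assumed Y-based estimate nR / Y
    est_v = (n + 1) * v / n - 1                 # estimate from Exercise 10
    return est_y, est_v

def rmse(estimates, N):
    """Square root of the average squared error of the estimates."""
    return sqrt(sum((e - N) ** 2 for e in estimates) / len(estimates))

N, R, n, runs = 100, 30, 15, 50
rng = random.Random(0)  # fixed seed (an arbitrary choice) for reproducibility
results = [run_once(N, R, n, rng) for _ in range(runs)]

y_estimates = [ey for ey, _ in results if ey is not None]
v_estimates = [ev for _, ev in results]

print("Error based on Y:", rmse(y_estimates, N), "runs used:", len(y_estimates))
print("Error based on V:", rmse(v_estimates, N))
```

With only 50 runs the two error figures are themselves noisy, so repeating the experiment with different seeds gives a better sense of which estimator is more reliable.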
12. Since the estimator of N based on V is unbiased, its variance is a measure of the quality of the estimator. Show that
var{[(n + 1)V / n] - 1} = (N + 1)(N - n) / [n(n + 2)]
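Before attempting the proof, the variance formula can be compared with an exact computation for a few parameter values. The sketch assumes the without-replacement density P(V = k) = C(k - 1, n - 1) / C(N, n). The function names and the parameter pairs are illustrative choices.

```python
from fractions import Fraction
from math import comb

def moments_of_max(N, n):
    """Exact E(V) and E(V^2) from the without-replacement density."""
    total = comb(N, n)
    ks = range(n, N + 1)
    m1 = sum(Fraction(k * comb(k - 1, n - 1), total) for k in ks)
    m2 = sum(Fraction(k * k * comb(k - 1, n - 1), total) for k in ks)
    return m1, m2

def estimator_variance(N, n):
    """Variance of [(n + 1)V / n] - 1, computed from the moments of V."""
    m1, m2 = moments_of_max(N, n)
    var_v = m2 - m1**2
    # Subtracting the constant 1 does not change the variance;
    # the factor (n + 1) / n scales it by its square.
    return Fraction(n + 1, n) ** 2 * var_v

def closed_form(N, n):
    """The formula stated in Exercise 12."""
    return Fraction((N + 1) * (N - n), n * (n + 2))

for N, n in [(10, 3), (50, 10), (100, 15)]:
    print(N, n, estimator_variance(N, n) == closed_form(N, n))
```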