Estimation of N with R Known


Java Applet Simulation of the ball and urn experiment


The Estimation Problem

In the ball and urn experiment, suppose that the total number of red balls R is known but that the total number of balls N is unknown. We wish to estimate N after sampling n balls and observing Y, the number of red balls in the sample. Recall that if the sampling is without replacement, Y has the hypergeometric distribution with parameters N, R, and n, while if the sampling is with replacement, Y has the binomial distribution with parameters n and p = R / N.

As a more realistic example of this type of problem, suppose that we have a lake containing N fish where N is unknown. We capture R of the fish, tag them, and return them to the lake. Next we capture n of the fish and observe Y, the number of tagged fish in the sample. We wish to estimate N from this data. In this context, the estimation problem is sometimes called the capture-recapture problem. Naturally, sampling without replacement is usually used.

Mathematical Exercise 1. Do you think that the main assumption of the ball and urn experiment, namely equally likely samples, would be satisfied for a real capture-recapture problem? Explain.

Once again, we can derive a simple estimate of N by hoping that the sample proportion of red balls is close to the population proportion of red balls. That is,

Y / n ≈ R / N, so N ≈ n R / Y (if Y > 0).

Thus, our estimator of N is n R / Y if Y > 0 and is undefined if Y = 0.
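The estimator can be written as a small function. This is only a sketch: the name estimate_population and the fish counts in the example are hypothetical and chosen for illustration, and returning None is just one way to represent the undefined case Y = 0.

```python
from typing import Optional


def estimate_population(n: int, R: int, Y: int) -> Optional[float]:
    """Return the estimate n * R / Y, or None when Y == 0 (estimate undefined)."""
    if Y == 0:
        return None
    return n * R / Y


# Hypothetical numbers: 100 tagged fish, a second sample of 50, 10 of them tagged.
print(estimate_population(n=50, R=100, Y=10))  # 500.0
```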

Simulation Exercise 2. In the urn experiment, select sampling without replacement and set N = 100, R = 30, and n = 20. Run the experiment 50 times, updating after each run. On each run, compute n R / Y, the estimate of N. Then compute the empirical root mean square error: the square root of the average, over the 50 runs, of the squared differences between the estimate and N.

Simulation Exercise 3. Repeat Exercise 2, sampling with replacement. Compare the results.
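For readers without access to the applet, the two simulation exercises can be approximated with a short script. This is a rough sketch, not the applet's method: the function names, the seed, and the decision to skip runs with Y = 0 (where the estimate is undefined) are all assumptions made here.

```python
import math
import random


def draw_red_count(N, R, n, replace, rng):
    """Sample n balls from an urn with R red balls out of N; return the red count Y."""
    if replace:
        # Each draw is red with probability R / N, independently of the others.
        return sum(rng.random() < R / N for _ in range(n))
    # Without replacement: label the balls 0..N-1 and treat the first R as red.
    return sum(ball < R for ball in rng.sample(range(N), n))


def rmse_of_estimates(N, R, n, replace, runs=50, seed=0):
    """Empirical root mean square error of n * R / Y over the runs with Y > 0."""
    rng = random.Random(seed)
    squared_errors = []
    for _ in range(runs):
        Y = draw_red_count(N, R, n, replace, rng)
        if Y > 0:  # assumption: runs with Y == 0 are skipped, not counted
            squared_errors.append((n * R / Y - N) ** 2)
    if not squared_errors:
        return float("nan")
    return math.sqrt(sum(squared_errors) / len(squared_errors))


print("without replacement:", rmse_of_estimates(100, 30, 20, replace=False))
print("with replacement:   ", rmse_of_estimates(100, 30, 20, replace=True))
```

With only 50 runs the two values vary noticeably from seed to seed, so it helps to repeat the comparison with several seeds or a larger number of runs before drawing a conclusion.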

Properties

Mathematical Exercise 4. Suppose that the sampling is with replacement. Show that if k > 0, then n R / k maximizes P(Y = k) as a function of N for fixed R and n, where N is treated as a continuous variable. This means that n R / Y is the maximum likelihood estimator of N.
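A numerical check can make the claim in Exercise 4 plausible before proving it. The sketch below evaluates the binomial log-likelihood on a grid of integer values of N and reports the maximizer. The function names and the upper bound N_max are assumptions made for illustration, and the values R = 30, n = 20, k = 6 were picked so that n R / k is an integer.

```python
import math


def log_likelihood(N, R, n, k):
    """Binomial log-likelihood of N, dropping the constant term log C(n, k)."""
    p = R / N
    return k * math.log(p) + (n - k) * math.log(1 - p)


def best_integer_N(R, n, k, N_max):
    """Grid search over integers N with R < N <= N_max for the largest log-likelihood."""
    return max(range(R + 1, N_max + 1), key=lambda N: log_likelihood(N, R, n, k))


print(best_integer_N(R=30, n=20, k=6, N_max=1000))  # 100, which equals 20 * 30 / 6
```

A grid search is not a proof. The exercise itself asks for the calculus argument, which maximizes the likelihood in p = R / N and then solves for N.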


