Variance

Definition

Recall that the expected value or mean of a random variable X gives the center of the distribution of X. The variance of X is a measure of the spread of the distribution about the mean and is defined by

var(X) = E{[X - E(X)]^2}

Mathematical Exercise 1. Suppose that X is a discrete random variable taking values in a subset S of R, with density function f. Use the change of variables theorem to show that

var(X) = Σ_{x ∈ S} [x - E(X)]^2 f(x)

Mathematical Exercise 2. Suppose that X is a continuous random variable taking values in a subset S of R, with density function f. Use the change of variables theorem to show that

var(X) = ∫_S [x - E(X)]^2 f(x) dx

The standard deviation of X is the square root of the variance:

sd(X) = [var(X)]^(1/2)

It also measures dispersion about the mean but is in the same units as the variable X.
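
As an illustration of the definitions, the following is a minimal sketch that computes the variance and standard deviation of a discrete random variable directly from its density. The dictionary representation of the density, the helper names mean and variance, and the fair-die example are assumptions made for this sketch, not part of the text.

```python
from fractions import Fraction
from math import sqrt

def mean(density):
    """E(X) for a discrete density given as {value: probability}."""
    return sum(x * p for x, p in density.items())

def variance(density):
    """var(X) as the sum over x of [x - E(X)]^2 f(x)."""
    mu = mean(density)
    return sum((x - mu) ** 2 * p for x, p in density.items())

# Hypothetical example: a fair six-sided die.
die = {x: Fraction(1, 6) for x in range(1, 7)}

die_var = variance(die)   # exact rational value of var(X)
die_sd = sqrt(die_var)    # standard deviation, in the same units as X
```

Using exact fractions keeps the variance free of rounding error; only the square root is a floating-point value.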

Properties

The following exercises give some basic properties of variance, which in turn rely on basic properties of expected value:

Mathematical Exercise 3. Show that var(X) = E(X^2) - [E(X)]^2.

Mathematical Exercise 4. Show that var(X) ≥ 0.

Mathematical Exercise 5. Show that var(X) = 0 if and only if P(X = c) = 1 for some constant c.

Mathematical Exercise 6. Show that if a and b are constants then var(aX + b) = a^2 var(X).
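
A numerical check is not a proof, but it can make the properties concrete. The sketch below verifies the computational formula var(X) = E(X^2) - [E(X)]^2 and the scaling rule var(aX + b) = a^2 var(X) on one small example. The density f, the constants a and b, and the helper names are hypothetical choices for illustration.

```python
from fractions import Fraction

def expect(density, g=lambda x: x):
    """E[g(X)] for a discrete density given as {value: probability}."""
    return sum(g(x) * p for x, p in density.items())

def variance(density):
    """var(X) from the definition E{[X - E(X)]^2}."""
    mu = expect(density)
    return expect(density, lambda x: (x - mu) ** 2)

# Hypothetical density on the values 0, 1, 4.
f = {0: Fraction(1, 2), 1: Fraction(1, 4), 4: Fraction(1, 4)}

# Computational formula: E(X^2) - [E(X)]^2.
shortcut = expect(f, lambda x: x ** 2) - expect(f) ** 2

# Density of Y = aX + b (a is nonzero, so distinct x give distinct y).
a, b = -3, 7
g = {a * x + b: p for x, p in f.items()}

var_x = variance(f)
var_y = variance(g)
```

In this example the shift b has no effect on the variance, and the sign of a is irrelevant because a enters only through a^2.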

Examples

Mathematical Exercise 7. Suppose that I is an indicator variable with

P(I = 1) = p, P(I = 0) = 1 - p

  1. Show that var(I) = p(1 - p).
  2. Sketch the graph of var(I) as a function of p.
  3. Find the value of p that maximizes var(I).
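
To explore the indicator variable numerically before working the exercise, the following sketch tabulates var(I) on a grid of values of p, computing it from the definition of variance rather than from the closed form. The grid spacing and the names indicator_variance, table, and best_p are assumptions of this sketch.

```python
from fractions import Fraction

def indicator_variance(p):
    """var(I) from the definition, for P(I = 1) = p and P(I = 0) = 1 - p."""
    mean = p  # E(I) = 1 * p + 0 * (1 - p)
    return (1 - mean) ** 2 * p + (0 - mean) ** 2 * (1 - p)

# Grid p = 0, 0.1, ..., 1 in exact arithmetic.
grid = [Fraction(k, 10) for k in range(11)]
table = {p: indicator_variance(p) for p in grid}

# Grid point with the largest variance.
best_p = max(table, key=table.get)
```

The tabulated values trace out a parabola in p that vanishes at p = 0 and p = 1, which is the shape the sketch in part 2 should show.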

Mathematical Exercise 8. Suppose that X is uniformly distributed on {1, 2, ..., n}. Show that

var(X) = (n^2 - 1) / 12
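
The formula for the discrete uniform distribution can be spot-checked by brute force. The sketch below computes the variance of the uniform distribution on {1, 2, ..., n} from the definition, in exact arithmetic; the function name uniform_variance and the range of n tested are assumptions of this sketch.

```python
from fractions import Fraction

def uniform_variance(n):
    """Variance of the uniform distribution on {1, 2, ..., n}, from the definition."""
    values = range(1, n + 1)
    mean = Fraction(sum(values), n)
    # Each value has probability 1/n.
    return sum((x - mean) ** 2 for x in values) / n

# Compare with (n^2 - 1) / 12 for small n.
checks = {n: uniform_variance(n) == Fraction(n * n - 1, 12) for n in range(1, 21)}
```

Agreement for n = 1, ..., 20 is evidence for the formula, not a substitute for the derivation asked for in the exercise.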

Mathematical Exercise 9. Suppose that X is uniformly distributed on the interval (a, b) where a < b. Show that

var(X) = (b - a)^2 / 12

Note in particular that the variance depends only on the length of the interval, which is intuitively reasonable.

Mathematical Exercise 10. Suppose that X has the power distribution with parameter a > 1, which has probability density function

f(x) = (a - 1) x^(-a) for x > 1

Show that if a > 3,

var(X) = (a - 1) / [(a - 3)(a - 2)^2]

Mathematical Exercise 11. Suppose that X is a real-valued random variable. Define

Z = [X - E(X)] / sd(X)

Show that Z has mean 0 and variance 1.

The random variable Z in Exercise 11 is sometimes called the standard score associated with X. Since X and its mean and standard deviation all have the same units, the standard score Z is dimensionless. It measures the directed distance from X to its mean in terms of standard deviations.
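
As a small illustration of standard scores, the sketch below standardizes a list of numbers by subtracting the mean and dividing by the standard deviation, treating the list as a distribution in which each entry is equally likely. The sample values and the function name standard_scores are hypothetical.

```python
from statistics import fmean, pstdev

def standard_scores(data):
    """Return (x - mean) / sd for each x, using the population standard deviation."""
    mu = fmean(data)
    sigma = pstdev(data)
    return [(x - mu) / sigma for x in data]

# Hypothetical data with mean 5 and standard deviation 2.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
z = standard_scores(sample)
```

The resulting scores have mean 0 and standard deviation 1 regardless of the units of the original data, so a score of 2 always means two standard deviations above the mean.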

Mathematical Exercise 12. Marilyn vos Savant has an IQ of 228. Assuming that the distribution of IQ scores has mean 100 and standard deviation 15, find Marilyn's standard score.

Mean Square Error

Suppose that we want to approximate a random variable X with a single real number t, and we measure the quality of the approximation by the mean square error

MSE(t) = E[(X - t)^2]

(recall that this is the second moment of X about t).

Mathematical Exercise 13. Show that

MSE(t) = E(X^2) - 2t E(X) + t^2.

Mathematical Exercise 14. Show that MSE(t) is minimized when t = E(X) and that the minimum value is var(X).

The root mean square error is the square root of the mean square error:

RMSE(t) = [MSE(t)]^(1/2).

Mathematical Exercise 15. Show that RMSE(t) is minimized when t = E(X) and that the minimum value is sd(X).
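
The minimization can be seen numerically by evaluating the mean square error on a grid of candidate values of t. The following is a rough sketch under stated assumptions: the density f, the grid of candidates, and the helper names mse and best_t are all hypothetical choices for illustration.

```python
from fractions import Fraction
from math import sqrt

def mse(density, t):
    """MSE(t) = E[(X - t)^2] for a discrete density {value: probability}."""
    return sum((x - t) ** 2 * p for x, p in density.items())

# Hypothetical density on the values 1, 2, 6.
f = {1: Fraction(1, 5), 2: Fraction(1, 2), 6: Fraction(3, 10)}

mean = sum(x * p for x, p in f.items())

# Candidate approximations t = 0.00, 0.01, ..., 7.00 in exact arithmetic.
candidates = [Fraction(k, 100) for k in range(701)]
best_t = min(candidates, key=lambda t: mse(f, t))

min_mse = mse(f, best_t)   # equals var(X) when best_t is the mean
min_rmse = sqrt(min_mse)   # equals sd(X)
```

Because the square root is increasing, the same t minimizes both MSE(t) and RMSE(t); only the minimum values differ.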

For more on this topic, see the section on mean square error for frequency distributions. The section on mean absolute error for frequency distributions gives some insight into why mean square error is the best choice for measuring the error.

Chebyshev's Inequality

Mathematical Exercise 16. Use Markov's inequality to prove Chebyshev's inequality: for t > 0,

P[|X - E(X)| ≥ t] ≤ var(X) / t^2

Mathematical Exercise 17. Establish the following equivalent version of Chebyshev's inequality: for k > 0,

P[|X - E(X)| ≥ k sd(X)] ≤ 1 / k^2
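
To see how the bound compares with an actual tail probability, the sketch below computes P[|X - E(X)| ≥ k sd(X)] exactly for a discrete density and returns it alongside the Chebyshev bound 1 / k^2. The three-point density and the function name chebyshev_check are assumptions of this sketch; the example is chosen so that the bound is attained at k = 2.

```python
from fractions import Fraction
from math import sqrt

def chebyshev_check(density, k):
    """Return (exact tail probability, Chebyshev bound 1/k^2) for a discrete density."""
    mu = sum(x * p for x, p in density.items())
    var = sum((x - mu) ** 2 * p for x, p in density.items())
    sd = sqrt(var)
    # Total probability of values at least k standard deviations from the mean.
    tail = sum(p for x, p in density.items() if abs(x - mu) >= k * sd)
    return tail, 1 / k ** 2

# Hypothetical density with mean 0 and variance 1.
f = {-2: Fraction(1, 8), 0: Fraction(3, 4), 2: Fraction(1, 8)}

tail_2, bound_2 = chebyshev_check(f, 2)   # bound is attained here
tail_3, bound_3 = chebyshev_check(f, 3)   # bound is far from tight here
```

For most distributions the true probability is well below the bound; the value of the inequality is that it holds for every distribution with finite variance.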

Mathematical Exercise 18. Suppose that X is uniformly distributed on the interval (0, 6). Compute the true value and the Chebyshev bound for the probability that X is at least 2 standard deviations away from the mean.

Mathematical Exercise 19. Suppose that X has the power distribution with parameter a > 3:

f(x) = (a - 1) x^(-a) for x > 1

Compute the true value and the Chebyshev bound for the probability that X is at least 3 standard deviations away from the mean.

The variance of a sum of random variables is best understood in terms of a related concept known as covariance.

