Variance
Recall that the expected value or mean of a random variable X gives the center of the distribution of X. The variance of X is a measure of the spread of the distribution about the mean and is defined by
$$\operatorname{var}(X) = E\{[X - E(X)]^2\}$$
1. Suppose that X is a discrete random variable taking values in a subset S of R, with density function f. Use the change of variables theorem to show that
$$\operatorname{var}(X) = \sum_{x \in S} [x - E(X)]^2 f(x)$$
2. Suppose that X is a continuous random variable taking values in a subset S of R, with density function f. Use the change of variables theorem to show that
$$\operatorname{var}(X) = \int_S [x - E(X)]^2 f(x) \, dx$$
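The discrete case in Exercise 1 can be sketched in a few lines of code. The example below assumes the density is given as a dictionary mapping each value to its probability, and uses a fair six-sided die as a hypothetical distribution (neither is part of the original text). Exact fractions are used so that the results are not blurred by rounding.

```python
from fractions import Fraction

def mean(density):
    """E(X): sum of x * f(x) over the support."""
    return sum(x * p for x, p in density.items())

def variance(density):
    """var(X): sum of [x - E(X)]^2 * f(x) over the support."""
    mu = mean(density)
    return sum((x - mu) ** 2 * p for x, p in density.items())

# Hypothetical example: a fair six-sided die.
die = {x: Fraction(1, 6) for x in range(1, 7)}

print(mean(die))      # 7/2
print(variance(die))  # 35/12
```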
The standard deviation of X is the square root of the variance:
$$\operatorname{sd}(X) = [\operatorname{var}(X)]^{1/2}$$
It also measures dispersion about the mean but is in the same units as the variable X.
The following exercises give some basic properties of variance, which in turn rely on basic properties of expected value:
3. Show that
$$\operatorname{var}(X) = E(X^2) - [E(X)]^2$$
4. Show that
$$\operatorname{var}(X) \ge 0$$
5. Show that var(X) = 0 if and only if P(X = c) = 1 for some constant c.
6. Show that if a and b are constants then
$$\operatorname{var}(aX + b) = a^2 \operatorname{var}(X)$$
7. Suppose that I is an indicator variable with P(I = 1) = p, P(I = 0) = 1 - p. Show that
$$\operatorname{var}(I) = p(1 - p)$$
8. Suppose that X is uniformly distributed on {1, 2, ..., n}. Show that
$$\operatorname{var}(X) = \frac{n^2 - 1}{12}$$
9. Suppose that X is uniformly distributed on the interval (a, b) where a < b. Show that
$$\operatorname{var}(X) = \frac{(b - a)^2}{12}$$
Note in particular that the variance depends only on the length of the interval, which is intuitively reasonable.
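Several of the properties above can be spot-checked with exact arithmetic. The sketch below is only a numerical check for one case, not a proof; the choice n = 10 and the constants a = -3, b = 7 are arbitrary illustrative values.

```python
from fractions import Fraction

def mean(density):
    """E(X) for a discrete density given as {value: probability}."""
    return sum(x * p for x, p in density.items())

def variance(density):
    """var(X) computed directly from the definition."""
    mu = mean(density)
    return sum((x - mu) ** 2 * p for x, p in density.items())

def uniform_density(n):
    """Density of the uniform distribution on {1, 2, ..., n}."""
    return {x: Fraction(1, n) for x in range(1, n + 1)}

n = 10  # arbitrary illustrative value
f = uniform_density(n)

# Exercise 8: var(X) = (n^2 - 1) / 12
assert variance(f) == Fraction(n**2 - 1, 12)

# Exercise 3: var(X) = E(X^2) - [E(X)]^2
second_moment = sum(x**2 * p for x, p in f.items())
assert variance(f) == second_moment - mean(f) ** 2

# Exercise 6: var(aX + b) = a^2 var(X), with arbitrary constants a and b
a, b = -3, 7
g = {a * x + b: p for x, p in f.items()}  # density of aX + b
assert variance(g) == a**2 * variance(f)
```

The shift b drops out of the variance entirely, while the scale a enters squared, which is why the standard deviation scales by |a|.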
10. Suppose that X has the power distribution with parameter a > 1, which has probability density function
$$f(x) = (a - 1) x^{-a} \text{ for } x > 1$$
Show that if a > 3,
$$\operatorname{var}(X) = \frac{a - 1}{(a - 2)^2 (a - 3)}$$
11. Suppose that X is a real-valued random variable. Define
$$Z = \frac{X - E(X)}{\operatorname{sd}(X)}$$
Show that Z has mean 0 and variance 1.
The random variable Z in Exercise 11 is sometimes called the standard score associated with X. Since X and its mean and standard deviation all have the same units, the standard score Z is dimensionless. It measures the directed distance from X to its mean in terms of standard deviations.
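As a sketch of why the standard score has mean 0 and variance 1, write Z = [X - E(X)] / sd(X) as a linear function of X and apply the linearity of expected value together with the result of Exercise 6. This assumes sd(X) > 0, so that the division makes sense.

$$
\begin{aligned}
E(Z) &= \frac{E(X) - E(X)}{\operatorname{sd}(X)} = 0 \\
\operatorname{var}(Z) &= \frac{1}{[\operatorname{sd}(X)]^2} \operatorname{var}(X) = \frac{\operatorname{var}(X)}{\operatorname{var}(X)} = 1
\end{aligned}
$$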
12. Marilyn vos Savant has an IQ of 228. Assuming that the distribution of IQ scores has mean 100 and standard deviation 15, find Marilyn's standard score.
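The computation in Exercise 12 is a one-line application of the standard score. The helper name `standard_score` below is a hypothetical choice for illustration.

```python
def standard_score(x, mean, sd):
    """Directed distance from x to the mean, in units of standard deviations."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    return (x - mean) / sd

# Values taken from Exercise 12: IQ 228, mean 100, standard deviation 15.
z = standard_score(228, mean=100, sd=15)
print(round(z, 2))  # 8.53
```

A score of about 8.5 standard deviations above the mean is dimensionless, so it can be compared directly with standard scores from other scales.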
Suppose that we want to approximate a random variable X with a single real number t, and we measure the quality of the approximation by the mean square error
$$\operatorname{MSE}(t) = E[(X - t)^2]$$
(recall that this is the second moment of X about t).
13. Show that
$$\operatorname{MSE}(t) = E(X^2) - 2t \, E(X) + t^2$$
14. Show that MSE(t) is minimized when t = E(X) and that the minimum value is var(X).
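One way to see the result of Exercise 14 is to complete the square in the expression from Exercise 13. The following is a sketch that uses only the computational formula for variance from Exercise 3:

$$
\begin{aligned}
\operatorname{MSE}(t) &= E(X^2) - 2t \, E(X) + t^2 \\
&= \{E(X^2) - [E(X)]^2\} + \{[E(X)]^2 - 2t \, E(X) + t^2\} \\
&= \operatorname{var}(X) + [t - E(X)]^2
\end{aligned}
$$

The second term is nonnegative and vanishes only when t = E(X), which leaves var(X) as the minimum value.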
The root mean square error is the square root of the mean square error:
$$\operatorname{RMSE}(t) = [\operatorname{MSE}(t)]^{1/2}$$
15. Show that RMSE(t) is minimized when t = E(X) and that the minimum value is sd(X).
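The minimization in Exercises 14 and 15 can be illustrated numerically. The sketch below assumes a fair six-sided die as a hypothetical distribution and scans a grid of candidate values t; it is a check for one example, not a proof.

```python
import math

# Hypothetical example distribution: a fair six-sided die.
values = [1, 2, 3, 4, 5, 6]
probs = [1 / 6] * 6

def mse(t):
    """MSE(t) = E[(X - t)^2] for the discrete distribution above."""
    return sum(p * (x - t) ** 2 for x, p in zip(values, probs))

def rmse(t):
    """RMSE(t) is the square root of MSE(t)."""
    return math.sqrt(mse(t))

mu = sum(p * x for x, p in zip(values, probs))
var = mse(mu)  # the second moment about the mean is the variance

# Scan candidate values t in [1, 6] with step 0.01.
grid = [1 + k / 100 for k in range(501)]
best_mse = min(grid, key=mse)
best_rmse = min(grid, key=rmse)

print("mean:", mu)
print("minimizer of MSE:", best_mse, "minimizer of RMSE:", best_rmse)
print("var:", var, "sd:", math.sqrt(var))
```

Because the square root is increasing, MSE and RMSE share the same minimizer; only the minimum value changes, from var(X) to sd(X).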
For more on this topic, read the section on mean square error for frequency distributions. The section on mean absolute error for frequency distributions gives some insight into why mean square error is the best choice for measuring the error.
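16. Use Markov's inequality to prove Chebyshev's inequality: for t > 0,
$$P[|X - E(X)| \ge t] \le \frac{\operatorname{var}(X)}{t^2}$$
17. Establish the following equivalent version of Chebyshev's inequality: for k > 0,
$$P[|X - E(X)| \ge k \operatorname{sd}(X)] \le \frac{1}{k^2}$$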
18. Suppose that X is uniformly distributed on the interval (0, 6). Compute the true value and the Chebyshev bound for the probability that X is at least 2 standard deviations away from the mean.
19. Suppose that X has the power distribution with parameter a > 3:
$$f(x) = (a - 1) x^{-a} \text{ for } x > 1$$
Compute the true value and the Chebyshev bound for the probability that X is at least 3 standard deviations away from the mean.
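The comparisons in Exercises 18 and 19 can be sketched in code. The function names are hypothetical, and the value a = 4 used for the power distribution is an arbitrary illustrative choice. The power distribution formulas rely on the mean (a - 1) / (a - 2), the variance (a - 1) / [(a - 2)^2 (a - 3)], and the tail probability P(X >= x) = x^-(a - 1) for x >= 1, all of which follow from integrating the density.

```python
import math

def chebyshev_bound(k):
    """Upper bound on P(|X - E(X)| >= k sd(X)) for k > 0."""
    return 1 / k**2

def uniform_tail(a, b, k):
    """P(|X - mean| >= k sd) for X uniform on (a, b)."""
    mean = (a + b) / 2
    sd = (b - a) / math.sqrt(12)
    lo, hi = mean - k * sd, mean + k * sd
    # Fraction of the interval (a, b) that lies inside (lo, hi).
    inside = max(0.0, min(hi, b) - max(lo, a)) / (b - a)
    return 1 - inside

def power_tail(a, k):
    """P(|X - mean| >= k sd) for the power distribution with parameter a > 3."""
    mean = (a - 1) / (a - 2)
    sd = math.sqrt((a - 1) / (a - 3)) / (a - 2)

    def survival(x):
        # P(X >= x) = x^-(a - 1) for x >= 1, and 1 below the support.
        return 1.0 if x <= 1 else x ** (-(a - 1))

    upper = survival(mean + k * sd)       # mass at or above mean + k sd
    lower = 1 - survival(mean - k * sd)   # mass at or below mean - k sd
    return upper + lower

# Exercise 18: uniform on (0, 6), k = 2.
print(uniform_tail(0, 6, 2), chebyshev_bound(2))

# Exercise 19 with the illustrative value a = 4, k = 3.
print(power_tail(4, 3), chebyshev_bound(3))
```

In the uniform case the interval of 2 standard deviations around the mean already covers all of (0, 6), so the true probability is 0 while the bound is 1/4. Chebyshev's inequality holds for every distribution with finite variance, so it is often far from sharp for any particular one.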
The variance of a sum of random variables is best understood in terms of a related concept known as covariance.