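We want to estimate the variance of the difference in the means.
We already know how to estimate the variance of the height of men, $s_m^2$, and
the corresponding quantity for women, $s_w^2$; it's done in eqn 2.9.
Now call the $i$th data point for the men $m_i$, and for the women
call it $w_i$. Call the numbers of data points $N_m$ and $N_w$, respectively, because
they might be different. So the variance of the difference of the means, $\bar m - \bar w$, is
$$\mathrm{var}(\bar m - \bar w) = \mathrm{var}\left(\frac{1}{N_m}\sum_{i=1}^{N_m} m_i - \frac{1}{N_w}\sum_{i=1}^{N_w} w_i\right) \qquad (2.10)$$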
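Now we've got to do the same kind of manipulation that we've been doing before,
except now it's more involved. If you assume that all the true variances are equal
to a common value $\sigma^2$ (the true variances, not the estimates $s_m^2$ and $s_w^2$), then you get
$\mathrm{var}(\bar m - \bar w) = \sigma^2 \left(\frac{1}{N_m} + \frac{1}{N_w}\right)$.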
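The manipulation can be sketched in a couple of lines. The labels are assumed here for illustration: $\bar m$ and $\bar w$ are the two sample means, $N_m$ and $N_w$ the numbers of data points, and $\sigma^2$ the common true variance. The sketch only uses two facts: variances of independent data add, and the variance of a mean of $N$ independent points is $1/N$ times the variance of the data (eqn 1.57).

```latex
\begin{align*}
\mathrm{var}(\bar m - \bar w)
  &= \mathrm{var}(\bar m) + \mathrm{var}(\bar w)
     && \text{(independent samples; the minus sign gets squared away)} \\
  &= \frac{\sigma^2}{N_m} + \frac{\sigma^2}{N_w}
     && \text{(variance of a mean, eqn 1.57)} \\
  &= \sigma^2 \left( \frac{1}{N_m} + \frac{1}{N_w} \right).
\end{align*}
```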
But now we need to figure out how to estimate this. We know that the means
could be different, and the test we're devising is supposed to decide
if they are or not. So we're going to come up with an estimate for this variance
with the assumption of equal variances and possibly unequal means.
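When all the smoke clears you get that the unbiased estimate for this is
$$s_{\bar m - \bar w}^2 = \frac{(N_m - 1)\, s_m^2 + (N_w - 1)\, s_w^2}{N_m + N_w - 2}\left(\frac{1}{N_m} + \frac{1}{N_w}\right) \qquad (2.11)$$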
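To see the estimate as arithmetic, here is a minimal Python sketch. It assumes the standard pooled form: the summed squared deviations of each group about its own mean, divided by $N_m + N_w - 2$, times $(1/N_m + 1/N_w)$. The function name `var_of_mean_difference` and the heights in the usage note are made up for illustration.

```python
def summed_sq_dev(data):
    """Sum of squared deviations of the data about their own sample mean."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** 2 for x in data)


def var_of_mean_difference(men, women):
    """Estimate the variance of mean(men) - mean(women).

    Assumes equal true variances in the two groups; the two means are
    allowed to differ because each group is centered on its own mean.
    """
    n_m, n_w = len(men), len(women)
    if n_m < 1 or n_w < 1 or n_m + n_w <= 2:
        raise ValueError("need data in both groups and more than two points in total")
    # Pooled variance: the -2 accounts for the two sample means that were fitted.
    pooled = (summed_sq_dev(men) + summed_sq_dev(women)) / (n_m + n_w - 2)
    # Variance of a mean is 1/N times the data variance; the two contributions add.
    return pooled * (1.0 / n_m + 1.0 / n_w)
```

For example, with hypothetical heights in cm, `var_of_mean_difference([178.0, 182.0, 175.0, 180.0], [165.0, 160.0, 170.0])` pools the squared deviations 26.75 and 50 over 5 degrees of freedom and multiplies by 1/4 + 1/3.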
But you can get the gist of what's going on
as follows. We've seen that variances of independent data add. So
in the right-hand side of the above equation, we can just add the variances
of the two terms separately. When you average, you know that the variances of all
the terms like $m_i$ are all the same by assumption. That
basically gives you the right-hand side, but it's not totally right because
this is a biased estimate. To make it unbiased, you've got to put in that $-2$
in the denominator.
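If you'd rather see the bias than take it on faith, a small simulation does the job. This is only a sketch under assumed conditions: Gaussian heights, made-up true means of 175 and 162, a common true standard deviation `sigma`, and a fixed seed. It averages the summed squared deviations over many fake experiments and divides once by $N_m + N_w$ and once by $N_m + N_w - 2$.

```python
import random


def summed_sq_dev(data):
    """Sum of squared deviations of the data about their own sample mean."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** 2 for x in data)


def average_variance_estimates(n_m=3, n_w=4, sigma=2.0, trials=20000, seed=0):
    """Return (naive, corrected) variance estimates averaged over many trials.

    naive divides by n_m + n_w; corrected divides by n_m + n_w - 2.
    The two groups share the true variance sigma**2 but have different means.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        men = [rng.gauss(175.0, sigma) for _ in range(n_m)]
        women = [rng.gauss(162.0, sigma) for _ in range(n_w)]
        total += summed_sq_dev(men) + summed_sq_dev(women)
    mean_ss = total / trials
    return mean_ss / (n_m + n_w), mean_ss / (n_m + n_w - 2)


naive, corrected = average_variance_estimates()
print("true variance:", 2.0 ** 2)
print("dividing by N_m + N_w:    ", naive)
print("dividing by N_m + N_w - 2:", corrected)
```

With these settings the corrected average should land close to the true variance of 4, while the naive one should sit low by roughly the factor $(N_m + N_w - 2)/(N_m + N_w) = 5/7$.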
I'm not trying to give a detailed derivation at this stage, but it is worth
understanding how the equation behaves and where it comes from intuitively.
Without the factor $\left(\frac{1}{N_m} + \frac{1}{N_w}\right)$, this is just an estimate of the pooled
variance of the height. Those factors like $N_m - 1$ cancel out with the variance
estimate in eqn 2.9. So this is just like the total variance
assuming all the data is from the same population.
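Spelled out, the cancellation looks like this. The sketch assumes eqn 2.9 is the usual variance estimate with $N - 1$ in the denominator; $s_m^2$, $s_w^2$, $\bar m$ and $\bar w$ are labels assumed here for the two variance estimates and the two sample means.

```latex
\begin{align*}
(N_m - 1)\, s_m^2
  &= (N_m - 1) \cdot \frac{1}{N_m - 1} \sum_{i=1}^{N_m} (m_i - \bar m)^2
   = \sum_{i=1}^{N_m} (m_i - \bar m)^2, \\
\frac{(N_m - 1)\, s_m^2 + (N_w - 1)\, s_w^2}{N_m + N_w - 2}
  &= \frac{\sum_{i=1}^{N_m} (m_i - \bar m)^2 + \sum_{i=1}^{N_w} (w_i - \bar w)^2}{N_m + N_w - 2}.
\end{align*}
```

Each group's deviations are taken about its own mean, which is what lets the means differ while the variance is shared.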
We learned in eqn 1.57 that if we have $N$
independent data points, the variance of the mean is just $1/N$ times
the variance of the data. Since we're interested in the deviation
of the difference of the means, these involve both the men and women.
So we just add those two mean-variances together. This gives the factor
$\left(\frac{1}{N_m} + \frac{1}{N_w}\right)$. If either $N_m$ or $N_w$ is small, this makes our
estimate for this difference rather shaky, as is to be expected, so you get
a big variance.
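A quick way to get a feel for this is to tabulate the factor $1/N_m + 1/N_w$ for a few sample sizes. The sizes below are made up for illustration, and `size_factor` is just a name chosen for this sketch.

```python
from fractions import Fraction


def size_factor(n_m, n_w):
    """The factor 1/N_m + 1/N_w multiplying the pooled variance, as an exact fraction."""
    return Fraction(1, n_m) + Fraction(1, n_w)


# Hypothetical sample sizes: both large, one small, both small.
for n_m, n_w in [(100, 100), (100, 2), (2, 2)]:
    print(f"N_m={n_m:3d}  N_w={n_w:3d}  factor={float(size_factor(n_m, n_w)):.2f}")
```

Note that piling up data on one side doesn't rescue you: with 100 men and only 2 women the factor is still dominated by the $1/2$ from the women.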
Josh Deutsch
2009-03-05