Linear Regression
Say that, in the example above in section 1.8, you
want to predict the height of a Swedish man, knowing his
weight. The simplest way to do this is to try to find a linear
relationship between his weight $x$ and height $y$. We write
this in the standard form $y = ax + b$. The slope is $a$ and the intercept with
the $y$-axis is $b$.
The red line is a best fit to the data. There are many ways
of doing this, but the most common is called "linear regression".
You'll never get a perfect fit; there will be errors for each data
point that are the difference between the true value $y_i$ you
measured and the line $a x_i + b$. So the error for the
$i$th point is
$$\epsilon_i = y_i - (a x_i + b).$$
You try to find the values of $a$ and $b$ that best fit
the data by minimizing a measure of the error. This is
normally done by taking the measure of the total error
to be the sum of the squares of the individual errors (technically
these individual errors are actually called
residuals):
$$E = \sum_{i=1}^{N} \left(y_i - a x_i - b\right)^2 \qquad (1.64)$$
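To make the error measure concrete, here is a minimal Python sketch that evaluates the sum of squared residuals for a candidate line $y = ax + b$. The function name `sum_squared_residuals` and the four data points are made up for illustration; they are not the Swedish data from section 1.8.

```python
def sum_squared_residuals(x, y, a, b):
    """Sum over all points of (y_i - (a*x_i + b))**2 for the line y = a*x + b."""
    return sum((yi - (a * xi + b)) ** 2 for xi, yi in zip(x, y))


# Hypothetical illustrative data: weights in kg, heights in cm.
weights = [60.0, 70.0, 80.0, 90.0]
heights = [165.0, 172.0, 178.0, 185.0]

# Two candidate lines; the one with the smaller total error fits better.
error_close = sum_squared_residuals(weights, heights, 0.65, 126.0)
error_far = sum_squared_residuals(weights, heights, 0.50, 140.0)
print(error_close, error_far)
```

Trying different values of `a` and `b` by hand shows how the total error changes from line to line; the best-fit line is the one that makes it as small as possible.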
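You minimize this with respect to $a$ and $b$ and get
$$a = \frac{\langle xy\rangle - \langle x\rangle\langle y\rangle}{\langle x^2\rangle - \langle x\rangle^2} \qquad (1.65)$$
where $\langle\cdot\rangle$ denotes an average over the data points.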
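The minimization step can be spelled out as follows. This is only a sketch, with notation assumed here: the total squared error is written as $E = \sum_i (y_i - a x_i - b)^2$, and $\langle\cdot\rangle$ stands for an average over the $N$ data points. Setting both partial derivatives of $E$ to zero and dividing by $N$ gives two linear conditions on $a$ and $b$:

```latex
\begin{align*}
\frac{\partial E}{\partial b} &= -2\sum_{i=1}^{N}\left(y_i - a x_i - b\right) = 0
  &&\Rightarrow\quad \langle y\rangle = a\langle x\rangle + b,\\
\frac{\partial E}{\partial a} &= -2\sum_{i=1}^{N} x_i\left(y_i - a x_i - b\right) = 0
  &&\Rightarrow\quad \langle xy\rangle = a\langle x^2\rangle + b\langle x\rangle .
\end{align*}
```

Eliminating $b$ between the two conditions leaves a single equation for the slope $a$.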
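You get $b$ by averaging $y = ax + b$ over the data:
$$\langle y\rangle = a\langle x\rangle + b$$
so
$$b = \langle y\rangle - a\langle x\rangle \qquad (1.66)$$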
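The closed-form result can be checked numerically. The sketch below assumes the slope is the covariance of $x$ and $y$ divided by the variance of $x$, and that the intercept then follows from the two means. The function name `fit_line` and the data are hypothetical, chosen only for illustration.

```python
def fit_line(x, y):
    """Least-squares slope a and intercept b of y = a*x + b, computed from averages."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    mean_xy = sum(xi * yi for xi, yi in zip(x, y)) / n
    mean_xx = sum(xi * xi for xi in x) / n
    # Slope: (<xy> - <x><y>) / (<x^2> - <x>^2)
    a = (mean_xy - mean_x * mean_y) / (mean_xx - mean_x ** 2)
    # Intercept: <y> - a<x>
    b = mean_y - a * mean_x
    return a, b


# Hypothetical illustrative data: weights in kg, heights in cm.
weights = [60.0, 70.0, 80.0, 90.0]
heights = [165.0, 172.0, 178.0, 185.0]

a, b = fit_line(weights, heights)
print(f"slope a = {a:.3f}, intercept b = {b:.3f}")

# Predict the height of a man weighing 75 kg from the fitted line.
print(f"predicted height at 75 kg: {a * 75.0 + b:.1f} cm")
```

The denominator is the variance of the weights, so the formula breaks down if every data point has the same $x$ value; a more careful version would check for that case.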
Here's
a nice applet demonstrating these concepts.
Josh Deutsch
2009-03-05