Add a line to the plot that “fits the data well”. Don’t do any calculations, just add the line.
Now estimate the residuals for each point relative to your line.
Now square each residual and add the squared residuals together. This sum is sometimes written SSR (sum of squared residuals) or SSE (sum of squared errors), even though “error” is a bit of a misnomer.
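In symbols, the last two steps can be written as follows. The notation is an assumption introduced here for illustration: \(\hat{y}_i\) stands for the value your line predicts at \(x_i\), \(e_i\) for the residual, and \(n\) for the number of points (here \(n = 5\)).

\[
e_i = y_i - \hat{y}_i, \qquad \mathrm{SSE} = \sum_{i=1}^{n} e_i^2
\]

A residual is positive when the observed point lies above your line and negative when it lies below.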
You may find it handy to use the following table to organize your work.
x | observed y | predicted y | residual | residual squared |
---|---|---|---|---|
0 | -2 | | | |
1 | 1 | | | |
3 | 0 | | | |
4 | 3 | | | |
5 | 2 | | | |
Estimate the slope and intercept of your line.
Recall slope = \(\frac{\mathrm{rise}}{\mathrm{run}}\). You can compute this by choosing two points on your line and figuring out how much the y-values change (rise) and how much the x-values change (run).
The intercept is where your line hits the y-axis.
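If you would like to check your arithmetic, the rise-over-run calculation might be sketched in R like this. The two points are hypothetical readings from a hand-drawn line (they are not part of the data), and the variable names are made up for illustration.

```r
# Two points read off a hand-drawn line (hypothetical values, not data points)
p1 <- c(x = 0, y = -1.4)
p2 <- c(x = 5, y = 3.6)

rise <- p2[["y"]] - p1[["y"]]   # change in the y-values
run  <- p2[["x"]] - p1[["x"]]   # change in the x-values
slope <- rise / run

# The line is y = intercept + slope * x, so solve for the intercept
# using either point
intercept <- p1[["y"]] - slope * p1[["x"]]

slope      # about 1
intercept  # about -1.4
```

Because the first point sits at x = 0, its y-value is already the intercept. The subtraction matters only when neither point is on the y-axis.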
Check to see which person in your group has the smallest value for SSE. Double check their work.
# Note: This SSE function is not available for general use
SSE(slope = 1, intercept = -1.4)
##   x  y observed predicted residual
## 1 0 -2       -2      -1.4     -0.6
## 2 1  1        1      -0.4      1.4
## 3 3  0        0       1.6     -1.6
## 4 4  3        3       2.6      0.4
## 5 5  2        2       3.6     -1.6
SSE(slope = -1, intercept = 5.0)
##   x  y observed predicted residual
## 1 0 -2       -2         5       -7
## 2 1  1        1         4       -3
## 3 3  0        0         2       -2
## 4 4  3        3         1        2
## 5 5  2        2         0        2
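Since the SSE function used above is not generally available, here is a minimal stand-in you could use to total the squared residuals yourself. The helper name `sse` and its arguments are hypothetical, not the function from this worksheet, and it returns only the sum rather than printing a table.

```r
# The five data points from the table
x <- c(0, 1, 3, 4, 5)
y <- c(-2, 1, 0, 3, 2)

# Hypothetical helper (NOT the worksheet's SSE function):
# sum of squared residuals for the line y = intercept + slope * x
sse <- function(slope, intercept, x, y) {
  predicted <- intercept + slope * x
  residual  <- y - predicted      # observed minus predicted
  sum(residual^2)
}

sse(slope = 1,  intercept = -1.4, x = x, y = y)  # about 7.6
sse(slope = -1, intercept = 5.0,  x = x, y = y)  # 70
```

The first line has a much smaller sum than the second, so by this measure it fits the data better.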
The “best” line is best in the sense of giving the smallest SSE value. R’s lm() function finds the slope and intercept of that line.
lm( y ~ x, data = someData)
##
## Call:
## lm(formula = y ~ x, data = someData)
##
## Coefficients:
## (Intercept)            x
##     -1.1047       0.7326
SSE(slope = 0.7326, intercept = -1.1047)
##   x  y observed predicted residual
## 1 0 -2       -2   -1.1047  -0.8953
## 2 1  1        1   -0.3721   1.3721
## 3 3  0        0    1.0931  -1.0931
## 4 4  3        3    1.8257   1.1743
## 5 5  2        2    2.5583  -0.5583
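One way to convince yourself that this line really has the smallest SSE is to nudge its slope or intercept and watch the sum grow. The sketch below assumes `someData` holds the five points from the table (how it was originally created is not shown here), and `sse_at` is a hypothetical helper written for this check.

```r
# Assumed reconstruction of someData from the five points in the table
someData <- data.frame(x = c(0, 1, 3, 4, 5), y = c(-2, 1, 0, 3, 2))

fit <- lm(y ~ x, data = someData)
b <- coef(fit)   # named vector: "(Intercept)" and "x"

# Hypothetical helper: SSE for the line y = intercept + slope * x
sse_at <- function(intercept, slope) {
  sum((someData$y - (intercept + slope * someData$x))^2)
}

sse_best <- sse_at(b[["(Intercept)"]], b[["x"]])
sse_best   # about 5.57

# Moving away from the fitted values in either direction increases SSE
sse_at(b[["(Intercept)"]] + 0.1, b[["x"]]) > sse_best   # TRUE
sse_at(b[["(Intercept)"]], b[["x"]] + 0.1) > sse_best   # TRUE
```

The same value is available directly from the fitted model as `sum(resid(fit)^2)`.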