Chapter 4: Analysing the Data

## Least-squares regression line

Regression generates what is called the "least-squares" regression line. The regression line takes the form $\hat{Y} = a + bX$, where a and b are both constants, $\hat{Y}$ (pronounced y-hat) is the predicted value of Y, and X is a specific value of the independent variable. Such a formula can be used to generate a value of $\hat{Y}$ for a given value of X. For example, suppose a = 10 and b = 7. If X is 5, then the formula produces a predicted value for Y of 45 (from 10 + 7*5).

It turns out that with any two variables X and Y, there is one equation that produces the "best fit" linking X to Y. In other words, there exists one formula that will produce the best, or most accurate, predictions for Y given X. Any other equation would not fit as well and would predict Y with more error. That equation is called the least-squares regression line.

But how do we measure "best"? The criterion is called the least squares criterion and it looks like this:

$$\min_{a,\,b} \sum (Y - \hat{Y})^2$$

You can imagine a formula that produces predictions for Y from each value of X in the data. Those predictions will usually differ from the actual value of Y that is being predicted (unless the Y values lie exactly on a straight line). If you square each difference and add up these squared differences across all the predictions, you get a number called the residual (or error) sum of squares (SS). It is possible to derive by hand computation the values of a and b that minimise SS.
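As a rough illustration of the least squares criterion, the Python sketch below computes a and b with the standard closed-form formulas for a single predictor, then checks that nudging either constant increases the sum of squared differences. The data values and the function names (`predict`, `sum_squared_errors`, `least_squares_fit`) are invented for this example and do not come from the chapter.

```python
# Hypothetical data, invented purely for illustration (not from the chapter).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]


def predict(a, b, x):
    """Predicted value of Y (y-hat) for a given X: a + b*X."""
    return a + b * x


def sum_squared_errors(a, b, xs, ys):
    """Add up the squared differences between actual and predicted Y."""
    return sum((y - predict(a, b, x)) ** 2 for x, y in zip(xs, ys))


def least_squares_fit(xs, ys):
    """Return the (a, b) that minimise the sum of squared differences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # intercept
    return a, b


# The arithmetic 10 + 7*5 = 45, written as a prediction with a = 10, b = 7.
print(predict(10, 7, 5))  # 45

a, b = least_squares_fit(xs, ys)
best = sum_squared_errors(a, b, xs, ys)
print(round(a, 3), round(b, 3), round(best, 3))  # 2.2 0.6 2.4

# Any other line predicts Y with more squared error than the fitted one.
print(sum_squared_errors(a + 0.5, b, xs, ys) > best)  # True
print(sum_squared_errors(a, b - 0.2, xs, ys) > best)  # True
```

Checking a handful of alternative lines does not prove that the fitted line is optimal, but it shows what "best fit" means in practice: every other choice of a and b gives a larger sum of squares.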
## Regression

More detailed comments on Output 4.4
So the least squares regression formula is:

predicted creativity score = 5.231 + 0.654*reasoning test score

Someone who scored 10 on the logical reasoning test would be predicted to score 5.231 + 0.654*10, or about 11.77, on the creativity test. This is the best-fitting equation because it minimises the sum of the squared differences between the predicted values and the actual values. That value is called the residual (or error) sum of squares (SS).

The variability in creativity that can be predicted by knowing logical reasoning can be found by

Notice that the value for "Beta" under "Standardized Coefficients" is the same as the correlation between the two variables. The slope of the regression line (symbolised by "B") can be computed from
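The prediction from the regression formula, and the observation that "Beta" equals the correlation when there is a single predictor, can both be checked with a short Python sketch. The coefficients 5.231 and 0.654 are taken from the text. The small data set and the helper names (`predict_creativity`, `pearson_r`, `raw_slope`) are assumptions made for illustration and are not the chapter's data.

```python
from statistics import mean, stdev

# Coefficients reported in the text for Output 4.4.
A = 5.231  # intercept
B = 0.654  # slope


def predict_creativity(reasoning_score):
    """Predicted creativity score for a given logical reasoning score."""
    return A + B * reasoning_score


predicted = predict_creativity(10)
print(round(predicted, 2))  # 11.77

# Hypothetical scores, invented for illustration (not the chapter's data).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def raw_slope(xs, ys):
    """Unstandardised least-squares slope of Y on X."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


# Standardising the slope: rescale it by the ratio of standard deviations.
beta = raw_slope(xs, ys) * stdev(xs) / stdev(ys)
r = pearson_r(xs, ys)
print(abs(beta - r) < 1e-12)  # True
```

The equality of the standardised slope and the correlation holds only for regression with one predictor. With several predictors the standardised coefficients generally differ from the simple correlations.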
Figure 4.10: The regression line "fitted to" the scatterplot of values shown in Figure 4.9.

This regression equation (or line of best fit) is depicted graphically in Figure 4.10. The value of b in the regression equation (also called the

© Copyright 2000 University of New England, Armidale, NSW, 2351. All rights reserved. Maintained by Dr Ian Price.