Linear Regression with SPSS
- What is Linear Regression?
In statistics, Linear Regression is an approach to modelling the relationship between variables, where scores on one variable are predicted from scores on a second variable. The variable being predicted is called the criterion variable (and is referred to as Y), while the variable the prediction is based on is called the predictor variable (and is referred to as X). When there is only one predictor variable, the prediction method is called simple regression.
Download the dataset from here and start your version of SPSS (in my case, I am using SPSS 16). The command to run linear regression can be found under Analyze -> Regression -> Linear, as shown below:
Clicking on the Linear label, as described above, opens the Linear Regression dialog:
Let's assume we want to find out whether there is a relationship between these two variables: “I live a fast paced life” (the criterion variable) and “I rely on a calendar” (the predictor variable). In a nutshell, we want to predict the value of “I live a fast paced life” given the value of “I rely on a calendar”. If more than one variable is placed in the Independent box, Multiple Regression will be performed instead.
By clicking on the Statistics button, we are presented with the Statistics dialog box, which displays a number of regression options. Select the box next to Descriptives:
Click on Continue and the SPSS Output Viewer will generate the following reports:
The Descriptive Statistics table returns the mean, standard deviation, and observation count (N) for the dependent and independent variables. The Correlations table shows the correlation coefficients: the first row (Pearson Correlation) provides the correlation between the independent and dependent variables, the second row returns the significance of those coefficients, and the last row gives the number of observations for each variable, as well as the number of observations with values for all the independent and dependent variables.
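To make the Pearson correlation concrete, here is a minimal Python sketch that computes it by hand. The scores below are made up for illustration (the tutorial's actual dataset is not reproduced here), with x standing in for “I rely on a calendar” and y for “I live a fast paced life”:

```python
# Hypothetical paired scores (not the tutorial's dataset)
x = [1, 2, 2, 3, 4, 5, 5]   # predictor: "I rely on a calendar"
y = [4, 4, 3, 3, 2, 2, 1]   # criterion: "I live a fast paced life"

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Pearson r = covariance(x, y) / (sd(x) * sd(y))
cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
var_x = sum((a - mean_x) ** 2 for a in x)
var_y = sum((b - mean_y) ** 2 for b in y)
r = cov / (var_x * var_y) ** 0.5

print(round(r, 3))
```

A strongly negative r here would mean that higher calendar-reliance scores go with lower fast-paced-life scores in this made-up sample; SPSS reports the same quantity in the first row of its Correlations table.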
The Variables Entered/Removed table simply lists the variables taken into account (dependent and independent), whereas the Model Summary is mainly used when performing multiple regression (which will be treated in another tutorial/experiment). Capital R is the multiple correlation coefficient, which tells us how strongly the independent variables are related to the dependent variable.
The ANOVA (Analysis of Variance) table provides information about the levels of variability within the regression equation. In a nutshell, it tells us whether the regression equation explains a statistically significant portion of the variability in the dependent variable from the variability in the independent variables.
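The partition of variability behind the ANOVA table can be sketched in a few lines of Python. This uses made-up scores (not the tutorial's dataset) and computes the F statistic for a simple regression the way SPSS does: explained sum of squares over residual sum of squares, each divided by its degrees of freedom:

```python
# Hypothetical paired scores (not the tutorial's dataset)
x = [1, 2, 2, 3, 4, 5, 5]   # predictor
y = [4, 4, 3, 3, 2, 2, 1]   # criterion

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
sxx = sum((a - mean_x) ** 2 for a in x)
ss_total = sum((b - mean_y) ** 2 for b in y)

# Partition total variability, as in the SPSS ANOVA table
ss_regression = sxy ** 2 / sxx          # variability explained by the model
ss_residual = ss_total - ss_regression  # unexplained variability

# F = (SS_regression / df_regression) / (SS_residual / df_residual)
# For simple regression: df_regression = 1, df_residual = n - 2
f_stat = (ss_regression / 1) / (ss_residual / (n - 2))
print(round(f_stat, 2))
```

A large F (relative to its degrees of freedom) means the regression explains far more variability than chance alone, which SPSS summarises as a small Sig. value in the ANOVA table.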
The Coefficients table provides the information we need in order to write the regression equation, which takes the form:
Predicted variable (dependent variable) = slope * independent variable + intercept
The slope describes the direction and steepness of the regression line. For example, a slope of 0 gives a horizontal line, whereas a slope of 1 gives a diagonal line running from the lower left to the upper right. The intercept is the expected mean value of Y when the independent variable is 0.
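These two coefficients are what SPSS estimates by least squares. As a minimal sketch, assuming the same made-up scores as before (not the tutorial's dataset), the slope and intercept can be computed directly from their textbook formulas:

```python
# Hypothetical paired scores (not the tutorial's dataset)
x = [1, 2, 2, 3, 4, 5, 5]   # predictor: "I rely on a calendar"
y = [4, 4, 3, 3, 2, 2, 1]   # criterion: "I live a fast paced life"

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Least-squares estimates: slope = cov(x, y) / var(x)
slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
         / sum((a - mean_x) ** 2 for a in x))
intercept = mean_y - slope * mean_x  # the fitted line passes through the means

print(round(slope, 4), round(intercept, 4))
```

The second line of the calculation reflects a useful fact: the least-squares line always passes through the point (mean of X, mean of Y), so once the slope is known the intercept follows immediately.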
The predicted variable is the dependent variable: in this example, “I live a fast paced life”. The independent variable is “I rely on a calendar”, and the intercept is found at the intersection of the row labeled (Constant) and the column labeled B. In this example, the intercept is 2.440. The regression equation will then look like this:
Predicted value of “I live a fast paced life” = -.002 × value of “I rely on a calendar” + 2.440
This means that if a person has an “I rely on a calendar” score of 2, we would estimate that their “I live a fast paced life” score would be -.002 × 2 + 2.440 = 2.436. Note that with a slope this close to zero, the calendar score barely changes the prediction: relying on a calendar more or less leaves the predicted “I live a fast paced life” score at about 2.44.
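Plugging the numbers in is just the regression equation as code. This sketch uses the coefficients reported above (slope -.002, intercept 2.440); the function name is my own, not anything from SPSS:

```python
# Coefficients taken from the tutorial's SPSS Coefficients table
slope = -0.002
intercept = 2.440

def predict_fast_paced(calendar_score):
    """Predicted 'I live a fast paced life' score for a given
    'I rely on a calendar' score (hypothetical helper name)."""
    return slope * calendar_score + intercept

print(predict_fast_paced(2))
```

Trying a few other calendar scores shows the same thing the near-zero slope already told us: the prediction hardly moves from the intercept.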
The next tutorial will show how to calculate Multiple Regression with SPSS.