Linear Regression with SPSS

This tutorial shows how to perform linear regression using SPSS 16.0. The dataset and steps are taken from here. We start with the definition of linear regression.

  • What is Linear Regression?

In statistics, linear regression is an approach to modelling the relationship between variables in which scores on one variable are predicted from the scores on a second variable. The variable being predicted is called the criterion variable (referred to as Y), and the variable on which the prediction is based is called the predictor variable (referred to as X). When there is only one predictor variable, the prediction method is called simple regression.

———————————————————————————————————————————————————-

Download the dataset from here and start your version of SPSS (I am using SPSS 16). The command that runs the linear regression can be found under Analyze -> Regression -> Linear, as shown below:

[Screenshot: the Analyze -> Regression -> Linear menu path]

Clicking on Linear opens the Linear Regression dialog box:

[Screenshot: the Linear Regression dialog box]

Let us assume that we want to find out whether there is a relationship between these two variables: “I live a fast paced life” (criterion variable) and “I rely on a calendar” (predictor variable). In a nutshell, we want to predict the value of the “I live a fast paced life” variable given the value of the “I rely on a calendar” variable. To do so, move “I live a fast paced life” into the Dependent box and “I rely on a calendar” into the Independent(s) box. If more than one variable is placed in the Independent(s) box, Multiple Regression will be performed.

Clicking on the Statistics button opens the Statistics dialog box, which offers a number of options for the regression output. Select the box next to Descriptives:

[Screenshot: the Statistics dialog box with Descriptives selected]

Click on Continue, then on OK in the main dialog box, and the SPSS Output Viewer will generate the following reports:

[Screenshot: the Descriptive Statistics and Correlations output]

The Descriptive Statistics part returns the mean, standard deviation, and observation count (N) for the dependent and independent variables. The Correlations part shows the correlation coefficients. The first row (Pearson Correlation) provides the correlations between the independent and dependent variables, whereas the second row returns the significance of those correlation coefficients. The last row gives the number of observations for each of the variables, counting only the observations that have values for all the independent and dependent variables.
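To make these two tables more concrete, here is a minimal Python sketch of the same quantities. The scores are invented for illustration and are not the tutorial's dataset; the function names `describe` and `pearson_r` are also my own. It assumes the sample standard deviation (dividing by N - 1), which is what SPSS reports in Descriptive Statistics.

```python
import math
import statistics

# Hypothetical agreement scores, invented for illustration only.
# They are NOT the values from the tutorial's dataset.
fast_paced = [2, 3, 1, 4, 2, 3, 5, 2]
calendar = [1, 3, 2, 4, 2, 5, 4, 1]


def describe(scores):
    """Return (mean, sample standard deviation, N) for one variable."""
    return statistics.mean(scores), statistics.stdev(scores), len(scores)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long lists."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)


print("fast paced:", describe(fast_paced))
print("calendar:  ", describe(calendar))
print("Pearson r: ", pearson_r(calendar, fast_paced))
```

The sketch does not compute the significance row; that requires the t distribution, which is easier to take from SPSS itself or from a statistics library.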

[Screenshot: SPSS regression output]

The Variables Entered/Removed area simply shows which variables were taken into account (dependent and independent). The Model Summary is most useful when performing multiple regression (which will be treated in another tutorial/experiment). Capital R is the multiple correlation coefficient, which tells us how strongly the independent variables, taken together, are related to the dependent variable. With a single independent variable, as here, R equals the absolute value of the Pearson correlation between the two variables.

[Screenshot: SPSS regression output]

[Screenshot: SPSS regression output]

ANOVA (Analysis of Variance) provides information about the levels of variability within the regression model. In a nutshell, it indicates whether the regression equation explains a statistically significant portion of the variability in the dependent variable, using the variability in the independent variables.
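The idea behind the ANOVA table can be sketched in a few lines of Python. This is an illustration under stated assumptions, not SPSS's own procedure: the data are invented, the function name `regression_anova` is hypothetical, and only the simple (one predictor) case is handled. It splits the variability of the dependent variable into the part explained by the regression line and the leftover residual part, then forms the F ratio from the two.

```python
def regression_anova(x, y):
    """Return (SS regression, SS residual, F) for a simple linear regression."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Least-squares line through the data.
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    intercept = mean_y - slope * mean_x
    predicted = [intercept + slope * a for a in x]

    # Variability explained by the line versus variability left over.
    ss_regression = sum((p - mean_y) ** 2 for p in predicted)
    ss_residual = sum((b - p) ** 2 for b, p in zip(y, predicted))

    df_regression = 1       # one predictor
    df_residual = n - 2     # N minus the two estimated coefficients
    f_ratio = (ss_regression / df_regression) / (ss_residual / df_residual)
    return ss_regression, ss_residual, f_ratio


# Invented data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
print(regression_anova(x, y))
```

A large F means the line explains much more variability than it leaves unexplained. The Sig. column in SPSS turns this F value into a p-value, which the sketch leaves out.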

[Screenshot: the Coefficients table]

The Coefficients table provides the information we need in order to write the regression equation, which has the form:

Predicted variable (dependent variable) = slope * independent variable + intercept

The slope describes the direction and steepness of the regression line. For example, a slope equal to 0 gives a horizontal line, whereas a slope equal to 1 gives a diagonal line from the lower left to the upper right (when both axes use the same scale). The intercept is the predicted value of the dependent variable when the independent variable is 0.
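For readers curious where the slope and intercept come from, the following Python sketch computes them with the usual least-squares formulas. It is a minimal illustration: the function name `fit_line` and the small datasets are my own and do not come from the tutorial's data.

```python
def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    # The fitted line always passes through the point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented examples, for illustration only.
print(fit_line([1, 2, 3], [4, 4, 4]))  # y never changes: horizontal line
print(fit_line([0, 1, 2], [0, 1, 2]))  # y rises one unit per unit of x: diagonal line
```

In the first example the slope is 0 and the intercept equals the constant value of y; in the second the slope is 1 and the intercept is 0, matching the two cases described above.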

The predicted variable is the dependent variable, which here is “I live a fast paced life”. The independent variable is “I rely on a calendar”. The intercept is found at the intersection of the row labeled (Constant) and the column labeled B; in this example, the intercept is 2.440. The slope is found in the same column, in the row labeled with the independent variable; in this example, the slope is -.002. The regression equation will then look like this:

Predicted value of “I live a fast paced life” = -.002 × value of “I rely on a calendar” + 2.440

This means that if a person has an “I rely on a calendar” score of 2, we would estimate that their “I live a fast paced life” score would be -.002 × 2 + 2.440 = 2.436. Because the slope is so close to zero, the prediction barely moves away from the intercept of 2.440 whatever the calendar score is. In other words, how much a person relies on a calendar tells us almost nothing about how they rate the statement “I live a fast paced life”.
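The prediction step can be checked with a short Python sketch. It uses the slope (-.002) and intercept (2.440) exactly as they appear in the regression equation above; the function name `predict_fast_paced` is hypothetical, and looping over scores 1 to 5 is only an assumption about the response scale, which the tutorial does not state.

```python
# Coefficients as written in the regression equation above.
SLOPE = -0.002      # B for "I rely on a calendar"
INTERCEPT = 2.440   # B for (Constant)


def predict_fast_paced(calendar_score):
    """Predicted "I live a fast paced life" score for a given calendar score."""
    return SLOPE * calendar_score + INTERCEPT


# Assumed 1-5 response scale; adjust the range to match the real questionnaire.
for score in range(1, 6):
    print(score, round(predict_fast_paced(score), 3))
```

With a slope this small, the printed predictions differ only in the third decimal place, so the prediction is driven almost entirely by the intercept.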

The next tutorial will show how to run Multiple Regression with SPSS.
