Name: ______________________________________________________
You are working on an alternative energy source, and biomass is a key component. You want to predict above-ground biomass for this region, and you believe that biomass is related to five substrate (subsoil) variables: salinity, water acidity (pH), potassium, sodium, and zinc. Your crew collects information on biomass and these five variables for 45 plots.
1) Before you create this regression model, you must examine the relationships between each of the five predictor variables and biomass (the response variable). Create five scatterplots using biomass as the response variable (y) and each of the predictor variables (x). Compute the linear correlation coefficient for each pair. Describe the relationships.
GRAPH>Scatterplot>Simple>OK. The response variable (y-variable) is Bio and the five predictor variables are the x-variables. Look at the scatterplots and describe each relationship below. Next, compute the correlation coefficient for each pair and write the r-value below: STAT>Basic Statistics>Correlation. You can do all the correlations at once by creating a correlation matrix: put Bio and all five predictor variables in the Variables box together, then read the r-values from the Bio row (or column) of the matrix.
Correlation (r) Description
Bio v. sal ______________________________________________________
Bio v. pH ______________________________________________________
Bio v. K _______________________________________________________
Bio v. Na ______________________________________________________
Bio v. Zn ______________________________________________________
Circle the above pair that has the strongest linear relationship.
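If you want to cross-check the Minitab correlations outside of Minitab, the same calculation can be sketched in Python with NumPy. This is only an illustration: the data below are simulated placeholders (not the 45 plots your crew collected), the coefficients used to simulate Bio are arbitrary, and the variable names simply mirror the labels used in the table above.

```python
import numpy as np

# PLACEHOLDER DATA: 45 simulated plots, not the lab's measurements.
# The simulation coefficients below are arbitrary assumptions.
rng = np.random.default_rng(0)
n_plots = 45
names = ["sal", "pH", "K", "Na", "Zn"]
X = rng.normal(size=(n_plots, len(names)))
bio = 2.0 * X[:, 1] - 1.0 * X[:, 4] + rng.normal(size=n_plots)

# Pearson correlation coefficient r between Bio and each predictor.
r_values = {
    name: float(np.corrcoef(bio, X[:, j])[0, 1])
    for j, name in enumerate(names)
}
for name, r in r_values.items():
    print(f"Bio v. {name}: r = {r:+.3f}")

# The strongest linear relationship has the largest |r|, whatever its sign.
strongest = max(r_values, key=lambda k: abs(r_values[k]))
print("Strongest linear relationship: Bio v.", strongest)
```

Note that strength is judged by the absolute value of r, so a strongly negative correlation counts as a strong linear relationship.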
2) You are now going to create four regression models using the predictor variables. You will compare the adjusted R², regression standard error, p-values for each coefficient, and the residuals for each model. Using this information, you will select the best model and state your reasons for this choice.
Begin with the full model using all five predictor variables. STAT>Regression>General Regression. Put Bio in the Response box and all five predictor variables in the Model box (see image). Click Results and make sure that the Regression equation, Coefficient table, Display confidence intervals, Summary of Model, and Analysis of Variance Table are checked (see image). Click OK. Click Graphs and make sure that, under Residual Plots, Individual plots and Residuals versus Fits are selected (see image). Click OK.
MODEL 1
Write the regression model _______________________________________________
Write the adj. R² ________________________________________________________
Write the regression standard error _________________________________________
Examine the residual plot. Are there any problems? ____________________________
Write the variables which are NOT significant ________________________________
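The quantities you record for Model 1 can also be computed by hand from an ordinary least-squares fit. The sketch below is an assumption-laden illustration, not the Minitab procedure: it uses simulated placeholder data, and `fit_ols` is a hypothetical helper written for this example. It returns the coefficients, R², adjusted R², the regression standard error S, the t-statistic for each coefficient, and the residuals and fitted values you would plot against each other.

```python
import numpy as np

def fit_ols(y, X):
    """Least-squares fit of y on the columns of X, with an intercept added."""
    n, k = X.shape
    A = np.column_stack([np.ones(n), X])          # design matrix
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)  # b0, b1, ..., bk
    fitted = A @ coef
    resid = y - fitted
    sse = float(resid @ resid)                    # residual sum of squares
    sst = float(((y - y.mean()) ** 2).sum())      # total sum of squares
    df_resid = n - k - 1
    s = np.sqrt(sse / df_resid)                   # regression standard error
    r2 = 1.0 - sse / sst
    adj_r2 = 1.0 - (sse / df_resid) / (sst / (n - 1))
    se = s * np.sqrt(np.diag(np.linalg.inv(A.T @ A)))
    return {"coef": coef, "t": coef / se, "r2": r2, "adj_r2": adj_r2,
            "s": s, "resid": resid, "fitted": fitted}

# PLACEHOLDER DATA: 45 simulated plots, not the lab's measurements.
rng = np.random.default_rng(0)
names = ["sal", "pH", "K", "Na", "Zn"]
X = rng.normal(size=(45, len(names)))
bio = 2.0 * X[:, 1] - 1.0 * X[:, 4] + rng.normal(size=45)

fit = fit_ols(bio, X)
print(f"adj. R-squared = {fit['adj_r2']:.3f}, S = {fit['s']:.3f}")
for name, b, t in zip(["const"] + names, fit["coef"], fit["t"]):
    print(f"{name:>5}: coef = {b:+.3f}, t = {t:+.2f}")
```

Turning each t-statistic into a p-value requires the t distribution with n - k - 1 degrees of freedom, which NumPy alone does not provide. For the residual plot, graph `fit["resid"]` against `fit["fitted"]` and look for curvature or a fan shape.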
MODEL 2
Now remove the LEAST significant variable (highest p-value) and repeat the steps using only the remaining variables.
Write the regression model _______________________________________________
Write the adj. R² ________________________________________________________
Write the regression standard error _________________________________________
Examine the residual plot. Are there any problems? ____________________________
Write the variables which are NOT significant ________________________________
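The removal step used to go from one model to the next can be sketched as follows. Within a single model every coefficient is tested with the same residual degrees of freedom, so the variable with the highest p-value is the one whose t-statistic is smallest in absolute value. The example relies on that fact to avoid computing p-values. The data are simulated placeholders and `t_stats` is a hypothetical helper, not part of Minitab.

```python
import numpy as np

def t_stats(y, X):
    """t-statistic for each predictor's coefficient (intercept excluded)."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    s2 = float(resid @ resid) / (n - A.shape[1])   # mean squared error
    se = np.sqrt(s2 * np.diag(np.linalg.inv(A.T @ A)))
    return (coef / se)[1:]

# PLACEHOLDER DATA: 45 simulated plots, not the lab's measurements.
rng = np.random.default_rng(0)
names = ["sal", "pH", "K", "Na", "Zn"]
X = rng.normal(size=(45, len(names)))
bio = 2.0 * X[:, 1] - 1.0 * X[:, 4] + rng.normal(size=45)

# Smallest |t| corresponds to the highest p-value in this model.
t = t_stats(bio, X)
drop = int(np.argmin(np.abs(t)))
print("Remove:", names[drop])

# The next model is refit using only the remaining variables.
kept_names = [name for j, name in enumerate(names) if j != drop]
X_reduced = np.delete(X, drop, axis=1)
print("Next model uses:", kept_names)
```

Only one variable is removed at a time because the p-values of the remaining variables can change once the model is refit.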
MODEL 3
Now remove the LEAST significant variable (highest p-value) and repeat the steps using only the remaining variables.
Write the regression model _______________________________________________
Write the adj. R² ________________________________________________________
Write the regression standard error _________________________________________
Examine the residual plot. Are there any problems? ____________________________
Write the variables which are NOT significant ________________________________
MODEL 4
Now remove the LEAST significant variable (highest p-value) and repeat the steps using only the remaining variables.
Write the regression model _______________________________________________
Write the adj. R² ________________________________________________________
Write the regression standard error _________________________________________
Examine the residual plot. Are there any problems? ____________________________
Write the variables which are NOT significant ________________________________
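To see the four models side by side, the fit-and-remove sequence above can be run in a loop that records the adjusted R² and regression standard error at each stage. This is a rough sketch on simulated placeholder data, with a hypothetical `summarize` helper; the numbers it prints say nothing about your real plots, and the residual plots still have to be inspected by eye.

```python
import numpy as np

def summarize(y, X):
    """Return (adjusted R-squared, regression standard error, predictor t-stats)."""
    n, k = X.shape
    A = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    sse = float(resid @ resid)
    sst = float(((y - y.mean()) ** 2).sum())
    df_resid = n - k - 1
    s = np.sqrt(sse / df_resid)
    adj_r2 = 1.0 - (sse / df_resid) / (sst / (n - 1))
    se = s * np.sqrt(np.diag(np.linalg.inv(A.T @ A)))
    return adj_r2, s, (coef / se)[1:]

# PLACEHOLDER DATA: 45 simulated plots, not the lab's measurements.
rng = np.random.default_rng(0)
names = ["sal", "pH", "K", "Na", "Zn"]
X = rng.normal(size=(45, len(names)))
bio = 2.0 * X[:, 1] - 1.0 * X[:, 4] + rng.normal(size=45)

cols = list(range(len(names)))   # column indices still in the model
models = []
for number in range(1, 5):       # Model 1 through Model 4
    adj_r2, s, t = summarize(bio, X[:, cols])
    models.append((number, [names[j] for j in cols], adj_r2, s))
    if number < 4:
        # Drop the least significant variable (smallest |t|) before refitting.
        cols.remove(cols[int(np.argmin(np.abs(t)))])

for number, used, adj_r2, s in models:
    print(f"Model {number}: {', '.join(used):<22} adj. R-squared = {adj_r2:.3f}  S = {s:.3f}")
```

In general, a higher adjusted R² and a lower regression standard error favor a model, but those two numbers should be weighed together with the coefficient p-values and the residual plots.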
3) Select the best model and state your reasons for selecting this model.
Candela Citations
- Natural Resources Biometrics. Authored by: Diane Kiernan. Located at: https://textbooks.opensuny.org/natural-resources-biometrics/. Project: Open SUNY Textbooks. License: CC BY-NC-SA: Attribution-NonCommercial-ShareAlike