The Coefficient of Determination

Learning Outcomes

Interpret the coefficient of determination in context

The quantity [latex]\mathbf{r^2}[/latex] is called the coefficient of determination. It is the square of the correlation coefficient r and is usually stated as a percent rather than as a decimal. It has an interpretation in the context of the data:

r², when expressed as a percent, represents the percent of variation in the dependent (predicted) variable y that can be explained by variation in the independent (explanatory) variable x using the regression (best-fit) line.
1 – r², when expressed as a percent, represents the percent of variation in y that is NOT explained by variation in x using the regression line. This unexplained variation can be seen as the scattering of the observed data points about the regression line.
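
The two interpretations above can be checked numerically. The following is a minimal Python sketch, assuming a small made-up data set (the values in xs and ys are hypothetical and do not come from this section) and a helper name, coefficient_of_determination, chosen only for illustration. It fits the least-squares line, computes the explained share of variation as 1 – SSE/SST, and compares that with the square of the correlation coefficient.

```python
from math import sqrt


def coefficient_of_determination(xs, ys):
    """Return (r, r_squared) for a simple least-squares line fit of ys on xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n

    # Sums of squares and cross-products about the means.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)  # total variation in y (SST)

    # Least-squares (best-fit) line: y_hat = intercept + slope * x.
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar

    # Correlation coefficient r.
    r = s_xy / sqrt(s_xx * s_yy)

    # Variation left unexplained by the line (SSE), i.e. the scatter about it.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

    # Share of the variation in y explained by the line.
    r_squared = 1 - sse / s_yy
    return r, r_squared


# Hypothetical illustration data, not taken from the text.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r, r_squared = coefficient_of_determination(xs, ys)
print(f"r = {r:.4f}")
print(f"r^2 from 1 - SSE/SST = {r_squared:.4f}")
print(f"r^2 from squaring r  = {r ** 2:.4f}")
```

For this made-up data both routes give r² = 0.6, so about 60% of the variation in y is explained by the line and about 40% is not.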

Consider the Third Exam vs Final Exam example (Example 2) introduced in the previous section:

The line of best fit is [latex]\displaystyle\hat{{y}}=-{173.51}+{4.83}{x}[/latex]
The correlation coefficient is r = 0.6631
The coefficient of determination is r² = 0.6631² = 0.4397
Interpretation of r² in the context of this example:
Approximately 44% of the variation (0.4397 is approximately 0.44) in the final-exam grades can be explained by the variation in the grades on the third exam using the best-fit regression line.
Therefore, approximately 56% of the variation (1 – 0.44 = 0.56) in the final-exam grades can NOT be explained by the variation in the grades on the third exam using the best-fit regression line. (This is seen as the scattering of the points about the line.)
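
The arithmetic in this example can be reproduced with a short Python sketch. It uses only the correlation coefficient reported above (r = 0.6631); the variable names are chosen here for illustration.

```python
# Correlation coefficient reported for the Third Exam vs Final Exam example.
r = 0.6631

# Coefficient of determination: the square of r.
r_squared = r ** 2

# Express as whole-number percents, as in the interpretation above.
explained_pct = round(100 * r_squared)
unexplained_pct = 100 - explained_pct

print(f"r^2 = {r_squared:.4f}")                 # r^2 = 0.4397
print(f"explained:   about {explained_pct}%")   # explained:   about 44%
print(f"unexplained: about {unexplained_pct}%") # unexplained: about 56%
```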
