The Coefficient of Determination

Learning Outcomes

  • Interpret the coefficient of determination in context

The Coefficient of Determination

The quantity r² is called the coefficient of determination. It is the square of the correlation coefficient r, and it is usually stated as a percent rather than in decimal form. It has an interpretation in the context of the data:

  • r², when expressed as a percent, represents the percent of variation in the dependent (predicted) variable y that can be explained by variation in the independent (explanatory) variable x using the regression (best-fit) line.
  • 1 – r², when expressed as a percent, represents the percent of variation in y that is NOT explained by variation in x using the regression line. This can be seen as the scattering of the observed data points about the regression line.
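
The two interpretations above can be checked numerically. The following is a minimal sketch, not a definitive implementation: the data set and the function names pearson_r and explained_fraction are hypothetical and chosen only for illustration. It computes r² once by squaring the correlation coefficient and once as 1 minus the ratio of unexplained variation (squared residuals about the least-squares line) to total variation in y, and the two values should agree.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def explained_fraction(xs, ys):
    """Fraction of variation in y explained by the least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Unexplained variation: squared vertical distances from the line.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    # Total variation: squared distances of y from its mean.
    sst = sum((y - mean_y) ** 2 for y in ys)
    return 1 - sse / sst

# Hypothetical data, invented for this illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = pearson_r(xs, ys)
r_squared = r ** 2
fraction = explained_fraction(xs, ys)

print(f"r = {r:.4f}")
print(f"r squared = {r_squared:.4f}")
print(f"1 - SSE/SST = {fraction:.4f}")
print(f"unexplained = {1 - r_squared:.4f}")
```

For this made-up data set, both calculations give the same value, which is why r² can be read as the share of variation in y explained by the regression line.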

Consider the Third Exam vs. Final Exam example (Example 2) introduced in the previous section:

  • The line of best fit is ŷ = –173.51 + 4.83x
  • The correlation coefficient is r = 0.6631
  • The coefficient of determination is r² = 0.6631² = 0.4397
  • Interpretation of r² in the context of this example:
      • Approximately 44% of the variation (0.4397 is approximately 0.44) in the final exam grades can be explained by the variation in the grades on the third exam using the best-fit regression line.
      • Therefore, approximately 56% of the variation (1 – 0.44 = 0.56) in the final exam grades can NOT be explained by the variation in the grades on the third exam using the best-fit regression line. (This is seen as the scattering of the points about the line.)
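
The arithmetic in this example can be reproduced in a few lines. This is a small sketch that starts from the correlation coefficient r = 0.6631 stated above; the variable names are assumptions made for illustration, and the underlying exam scores are not used.

```python
# Correlation coefficient from the Third Exam vs. Final Exam example.
r = 0.6631

# Coefficient of determination: the square of r.
r_squared = r ** 2

# Express both shares as whole percents, as in the interpretation above.
explained_pct = round(100 * r_squared)
unexplained_pct = 100 - explained_pct

print(f"r squared = {r_squared:.4f}")
print(f"explained by third exam grades: about {explained_pct}%")
print(f"not explained: about {unexplained_pct}%")
```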