Summary: Testing the Significance of the Correlation Coefficient

Key Concepts

  • The null hypothesis is that the population correlation coefficient ρ is not significantly different from zero. This means there is not a significant linear relationship (correlation) between x and y in the population, so the regression line should not be used for making predictions.
  • The alternate hypothesis is that the population correlation coefficient ρ is significantly different from zero. This means there is a significant linear relationship (correlation) between x and y in the population, so the regression line may be used for making predictions.
  • The p-value is calculated using $n - 2$ degrees of freedom, and the test statistic is $t = \dfrac{r\sqrt{n-2}}{\sqrt{1-r^{2}}}$.
  • The critical value approach is an alternative way to carry out the test of significance for a correlation coefficient.
  • Several assumptions must be verified before testing the significance of a correlation coefficient:
    • The underlying relationship is a linear relationship.
    • The y values for any particular x value are normally distributed about the line.
    • The standard deviations of the population y values about the line are equal for each value of x. There is no pattern in a plot of the residuals.
    • The data are produced from a well-designed, random sample or randomized experiment.
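
The test statistic and p-value described above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function names (`t_statistic`, `t_pdf`, `two_tailed_p`) are hypothetical, the sample values of r and n are made up, and the p-value is approximated by integrating the Student t density with Simpson's rule rather than read from a table or calculator.

```python
import math

def t_statistic(r, n):
    """Test statistic t = r*sqrt(n-2)/sqrt(1-r^2), used with n-2 degrees of freedom."""
    if n <= 2:
        raise ValueError("need at least 3 data points")
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    # math.gamma overflows for large arguments, so this sketch is only
    # suitable for small to moderate samples (df up to a few hundred).
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=2000):
    """Approximate two-tailed p-value via Simpson's rule on [0, |t|].

    Assumption: steps is even, as Simpson's rule requires.
    """
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3          # area under the density between 0 and |t|
    return max(0.0, 1 - 2 * area)  # both tails together

if __name__ == "__main__":
    r, n = 0.5, 6                 # hypothetical sample values
    t = t_statistic(r, n)
    p = two_tailed_p(t, n - 2)
    print(f"t = {t:.4f}, df = {n - 2}, p = {p:.4f}")
```

If the p-value is below the chosen significance level (commonly 0.05), the null hypothesis is rejected and the correlation is treated as significant.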

Glossary

Coefficient of Correlation: a measure developed by Karl Pearson (early 1900s) that gives the strength of association between the independent variable and the dependent variable; the formula is

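$$r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{\left[n\sum x^{2} - \left(\sum x\right)^{2}\right]\left[n\sum y^{2} - \left(\sum y\right)^{2}\right]}}$$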

where n is the number of data points. The coefficient cannot be more than 1 or less than –1. The closer the coefficient is to ±1, the stronger the evidence of a significant linear relationship between x and y.
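
As a rough illustration, the glossary formula for r can be evaluated directly from the sums it contains. The function name `pearson_r` and the small data set are hypothetical, chosen only to show the arithmetic.

```python
import math

def pearson_r(xs, ys):
    """Correlation coefficient computed from the sums in the glossary formula."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("xs and ys must have the same length, at least 2")
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        raise ValueError("r is undefined when x or y has zero variance")
    return numerator / denominator

if __name__ == "__main__":
    xs = [1, 2, 3]   # hypothetical data
    ys = [1, 3, 2]
    print(pearson_r(xs, ys))
```

For this made-up data, n = 3, Σx = Σy = 6, Σxy = 13, and Σx² = Σy² = 14, so the numerator is 3 and the denominator is 6, giving r = 0.5.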