Putting It Together: The Chi-Square Distribution

Let’s Summarize

  • The following concepts apply to all of the chi-square hypothesis tests in this module:
    • The chi-square distribution is a distribution that is skewed to the right.
    • The variability (or spread) of the chi-square distribution depends on the degrees of freedom of the distribution.
    • The chi-square test statistic is always greater than or equal to zero.
  • The following concepts apply for a chi-square goodness-of-fit test:
    • The null hypothesis is that the distribution fits the hypothesized proportions. The alternative hypothesis is that the distribution does not fit the hypothesized proportions.
    • Expected counts are found by taking the total count and multiplying by each of the hypothesized proportions.
    • The expected counts need to be 5 or more to conduct a chi-square test and are NOT rounded to the nearest whole number.
    • The degrees of freedom is [latex]k - 1[/latex], where [latex]k[/latex] is the number of categories.
    • The chi-square test statistic is the sum of [latex]\frac{(\mathrm{Observed} - \mathrm{Expected})^2}{\mathrm{Expected}}[/latex] for each category.
  • The following concepts apply for a chi-square test of independence:
    • The null hypothesis is that there is no association between the two categorical variables. The alternative hypothesis is that there is an association between the two categorical variables.
    • The degrees of freedom is [latex](r - 1)(c - 1)[/latex], where [latex]r[/latex] is the number of rows in the contingency table and [latex]c[/latex] is the number of columns in the contingency table.
    • The expected count for each cell is found by taking the row total times the column total and dividing it by the grand total.
    • The expected counts need to be 5 or more to conduct a chi-square test and are NOT rounded to the nearest whole number.
    • The chi-square test statistic is the sum of [latex]\frac{(\mathrm{Observed} - \mathrm{Expected})^2}{\mathrm{Expected}}[/latex] for each cell in the contingency table.
  • The following concepts apply for a chi-square test of homogeneity:
    • The null hypothesis is that the categorical variable has the same distribution in both populations. The alternative hypothesis is that its distribution differs between the two populations.
    • The expected count for each cell is found by taking the row total times the column total and dividing it by the grand total.
    • The expected counts need to be 5 or more to conduct a chi-square test and are NOT rounded to the nearest whole number.
    • The degrees of freedom for a chi-square test of homogeneity for two populations is [latex]k - 1[/latex], where [latex]k[/latex] is the number of response values.
    • The chi-square test statistic is the sum of [latex]\frac{(\mathrm{Observed} - \mathrm{Expected})^2}{\mathrm{Expected}}[/latex] for each cell in the two-way table (each population and response value combination).
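
The goodness-of-fit steps summarized above can be sketched in a few lines of Python. This is only an illustration: the function name `goodness_of_fit` and the sample counts are made up for this example and do not come from the module. The sketch finds the expected counts, checks that each is 5 or more, and then computes the test statistic and degrees of freedom.

```python
def goodness_of_fit(observed, proportions):
    """Return (expected counts, chi-square statistic, degrees of freedom)."""
    total = sum(observed)
    # Expected count = total count times each hypothesized proportion (not rounded).
    expected = [total * p for p in proportions]
    if min(expected) < 5:
        raise ValueError("every expected count should be 5 or more")
    # Sum of (Observed - Expected)^2 / Expected over the categories.
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Degrees of freedom: k - 1, where k is the number of categories.
    df = len(observed) - 1
    return expected, statistic, df


# Hypothetical data: 100 observations in 4 categories, each hypothesized at 25%.
expected, statistic, df = goodness_of_fit([30, 20, 25, 25], [0.25, 0.25, 0.25, 0.25])
print(expected, statistic, df)  # [25.0, 25.0, 25.0, 25.0] 2.0 3
```

Here every expected count is 25, so the statistic is 25/25 + 25/25 + 0 + 0 = 2.0 with 3 degrees of freedom.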

To determine which type of chi-square test is being done, consider the number of samples and the general research question that is being answered.

| Type of Chi-Square Test | Number of Samples | Question |
|---|---|---|
| Goodness-of-Fit | One Sample | Does the population fit the given distribution? |
| Test of Independence | One Sample | Is there an association between the two categorical variables? |
| Test of Homogeneity | Two Independent Samples | Do the two populations follow the same distribution? |
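
Although the test of independence and the test of homogeneity answer different questions, both use the same arithmetic on a two-way table of counts. The following is a minimal sketch of that shared computation; the function name `two_way_chi_square` and the 2 × 2 table of counts are hypothetical and chosen only to keep the numbers easy to check by hand.

```python
def two_way_chi_square(table):
    """Return (expected counts, chi-square statistic, degrees of freedom)
    for a two-way table given as a list of rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    # Expected count for a cell = row total * column total / grand total (not rounded).
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    if min(min(row) for row in expected) < 5:
        raise ValueError("every expected count should be 5 or more")
    # Sum of (Observed - Expected)^2 / Expected over every cell.
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    # Degrees of freedom: (r - 1)(c - 1).
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, statistic, df


# Hypothetical 2 x 2 table of observed counts.
expected, statistic, df = two_way_chi_square([[20, 30], [30, 20]])
print(expected, statistic, df)  # [[25.0, 25.0], [25.0, 25.0]] 4.0 1
```

When the rows are two populations and the columns are the k response values, (r - 1)(c - 1) reduces to k - 1, which matches the degrees of freedom stated for the test of homogeneity.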