Test of Independence (3 of 3)

Learning outcomes

  • Conduct a chi-square test of independence. Interpret the conclusion in context.

On this page, we practice the chi-square test for independence in its entirety and learn how to use statistical software to conduct this test. We also investigate the effect of sample size on the chi-square test statistic.

Try It

A Real Court Case

In the early 1970s, a young man challenged an Oklahoma state law that prohibited the sale of 3.2% beer to males under age 21 but allowed its sale to females in the same age group. The case (Craig v. Boren, 429 U.S. 190, 1976) was ultimately heard by the U.S. Supreme Court. The state of Oklahoma argued that the law improved traffic safety. One of the three main pieces of data presented to the court was the result of a “random roadside survey.” This survey gathered information on gender and whether or not the driver had been drinking alcohol in the previous 2 hours. A total of 619 drivers under 21 years of age were included in the survey.

Here are the data presented as evidence in court.

Driver       Gender   Alcohol in Last Two Hours?
Driver 1     M        Yes
Driver 2     F        No
Driver 3     F        Yes
*            *        *
*            *        *
*            *        *
Driver 619   M        No

          Drank Alcohol in Last Two Hours?
          Yes    No     Totals
Male      77     404    481
Female    16     122    138
Totals    93     526    619

In this table the expected counts are in parentheses next to the observed counts.

          Drank Alcohol in Last Two Hours?
          Yes           No             Totals
Male      77 (72.27)    404 (408.73)   481
Female    16 (20.73)    122 (117.27)   138
Totals    93            526            619
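
The expected counts in parentheses can be reproduced from the row and column totals. The following is a minimal sketch, assuming the usual rule for a test of independence (expected count = row total x column total / grand total); the function name `expected_counts` and the dictionary layout are illustrative choices, not part of the original page.

```python
# Observed counts from the roadside survey.
# Rows: driver gender. Columns: drank alcohol in the last two hours?
observed = {
    "Male": {"Yes": 77, "No": 404},
    "Female": {"Yes": 16, "No": 122},
}


def expected_counts(table):
    """Expected counts under independence: row total * column total / grand total."""
    row_totals = {row: sum(cols.values()) for row, cols in table.items()}
    col_totals = {}
    for cols in table.values():
        for col, count in cols.items():
            col_totals[col] = col_totals.get(col, 0) + count
    grand_total = sum(row_totals.values())
    return {
        row: {col: row_totals[row] * col_totals[col] / grand_total for col in cols}
        for row, cols in table.items()
    }


expected = expected_counts(observed)
for row, cols in expected.items():
    print(row, {col: round(value, 2) for col, value in cols.items()})
```

For example, the expected count for males who drank is 481 x 93 / 619, which rounds to 72.27.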

Comment: The Effect of Sample Size on Chi-Square

With other hypothesis tests, we have seen that sample size can affect the P-value and our conclusion. This is also true for chi-square. To illustrate this idea, we multiplied all of the counts in the Oklahoma data by 3.

Original Data

          Drank Alcohol in Last Two Hours?
          Yes    No     Totals
Male      77     404    481
Female    16     122    138
Totals    93     526    619

Data x 3

          Drank Alcohol in Last Two Hours?
          Yes    No     Totals
Male      231    1212   1443
Female    48     366    414
Totals    279    1578   1857

Original Data (Conditional Percents)

          Drank Alcohol in Last Two Hours?
          Yes               No                 Totals
Male      77/481 (16.0%)    404/481 (84.0%)    481
Female    16/138 (11.6%)    122/138 (88.4%)    138
Totals    93                526                619

Data x 3 (Conditional Percents)

          Drank Alcohol in Last Two Hours?
          Yes                No                  Totals
Male      231/1443 (16.0%)   1212/1443 (84.0%)   1443
Female    48/414 (11.6%)     366/414 (88.4%)     414
Totals    279                1578                1857

Notice that the conditional percentages do not change, so the new “data” show the same relationship between gender and drinking before driving. The proportion of drivers under the age of 21 who drank alcohol before driving is still about 15.0% (279/1857, the same as 93/619). Males are still more likely to have consumed alcohol before driving (231/1443 = 16.0%) than are females (48/414 = 11.6%), with the same difference of 4.4 percentage points that we saw in the original data.
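
A short check makes this concrete. The sketch below computes the row (conditional) percentages for both tables and confirms they match; the helper name `row_percents` and the list-of-lists layout are assumptions made for illustration.

```python
import math

# Rows: Male, Female. Columns: Yes, No.
original = [[77, 404], [16, 122]]
tripled = [[3 * count for count in row] for row in original]


def row_percents(table):
    """Percent of each row total that falls in each column."""
    return [[100 * count / sum(row) for count in row] for row in table]


original_pct = row_percents(original)
tripled_pct = row_percents(tripled)

# Every conditional percentage is unchanged by multiplying all counts by 3.
same = all(
    math.isclose(a, b)
    for row_a, row_b in zip(original_pct, tripled_pct)
    for a, b in zip(row_a, row_b)
)

print([[round(p, 1) for p in row] for row in original_pct])
print(same)
```

Multiplying every count by the same factor scales each row total by that factor too, so the factor cancels in every ratio.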

We used technology to find expected counts and the chi-square test statistic.

Original Data (Expected Counts in Parentheses)

          Drank Alcohol Yes   Drank Alcohol No   Total
Male      77 (72.27)          404 (408.7)        481
Female    16 (20.73)          122 (117.3)        138
Totals    93                  526                619

Data x 3 (Expected Counts in Parentheses)

          Drank Alcohol Yes   Drank Alcohol No   Total
Male      231 (216.8)         1212 (1226)        1443
Female    48 (62.2)           366 (351.8)        414
Totals    279                 1578               1857

Chi-Square test:

Data            Statistic    DF   Value       P-value
Original Data   Chi-square   1    1.6365641   0.2008
Data x 3        Chi-square   1    4.9096923   0.0267
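
The page does not say which software produced this output, so the following is only a sketch of one way to reproduce it with the Python standard library. It assumes the uncorrected chi-square statistic (no continuity correction) and uses the fact that, for 1 degree of freedom, the upper-tail chi-square probability equals `erfc(sqrt(x / 2))`. The function name `chi_square_2x2` is hypothetical.

```python
import math


def chi_square_2x2(table):
    """Return (chi-square statistic, P-value) for a 2x2 table of observed counts.

    Assumes no continuity correction. The P-value formula is valid only
    for 1 degree of freedom, which is the case for a 2x2 table.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / grand_total
            statistic += (obs - exp) ** 2 / exp

    # For df = 1, P(chi-square > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


original = [[77, 404], [16, 122]]          # rows: Male, Female; columns: Yes, No
tripled = [[3 * count for count in row] for row in original]

stat_original, p_original = chi_square_2x2(original)
stat_tripled, p_tripled = chi_square_2x2(tripled)

print(round(stat_original, 4), round(p_original, 4))
print(round(stat_tripled, 4), round(p_tripled, 4))
```

For tables larger than 2x2 the degrees of freedom exceed 1, and a statistics library would be needed for the P-value.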

Notice that multiplying the observed counts by 3 also triples the expected counts and the chi-square value. The larger chi-square value gives a P-value of 0.0267, which is statistically significant at the 5% level and changes our conclusion. With this larger sample, the evidence is strong enough to reject the null hypothesis. We conclude that gender is associated with drinking alcohol before driving; that is, the variables are dependent for drivers under the age of 21 in Oklahoma. With this sample size, the data would provide evidence in support of the Oklahoma law, which forbids the sale of 3.2% beer to males under 21 but permits it to females of the same age, with the goal of improving traffic safety.
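
The tripling of the chi-square value is not a coincidence. As a brief worked derivation, suppose every observed count is multiplied by a factor k. The symbols O (observed count), E (expected count), R (row total), C (column total), and n (grand total) are notation assumed here for illustration.

```latex
% Expected counts scale by k, because both totals and the grand total scale by k:
E' = \frac{(kR)(kC)}{kn} = k \cdot \frac{RC}{n} = kE

% The chi-square statistic therefore also scales by k:
\chi'^2 = \sum \frac{(kO - kE)^2}{kE}
        = \sum \frac{k^2 (O - E)^2}{kE}
        = k \sum \frac{(O - E)^2}{E}
        = k\,\chi^2

% With k = 3: 3 \times 1.6365641 = 4.9096923
```

The degrees of freedom stay at 1, so the larger statistic alone accounts for the smaller P-value.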

What’s the point? We see once again that sample size affects the P-value in a hypothesis test. This means that a small sample may not detect a relationship that exists between two categorical variables in a population. Conversely, a large sample may indicate that a relationship is statistically significant on the basis of differences in observed and expected counts that are not important in a practical sense.
