In-Class Activity 15.D

Recall from the preview assignment that we are analyzing data from a representative sample of American adults polled by the Pew Research Center,[1] and we are considering the question, “Are the variables Income level and Education level independent?”

Count of respondents by Education level (rows) and Income level (columns)

Education level     < $30,000   $30,000-$74,999   $75,000 and up   Total
Post-Grad Degree            2                 8               46      56
College Degree             39               113              202     354
Some College              131               138              120     389
HS Grad                   175               129               65     369
No HS Degree               78                32                8     118
Total                     425               420              441   1,286

Question 1

1) Make a conjecture. Do you think that income level and education level are independent? Explain.

Question 2

2) We will conduct a chi-square test of independence for the variables Income level and Education level. What are the null and alternative hypotheses?

Question 3

3) Check whether or not the conditions are met to perform a chi-square test of independence.
Part A: This is not an official condition, but to use this type of test, it is important that the data represent the counts for two categorical variables measured for individuals in one sample from one population. Is this condition met? Explain.
Part B: Independence/Randomness Condition: Is the sample an independent random sample, or is it an independent sample that can be considered representative of the population? (Notice that this condition is very similar to the one for the chi-square test of homogeneity, but in this case, we are only considering one sample from one population and not multiple samples from respective populations.) Is this condition met? Explain.
Part C: Large Sample Size Condition: The sample size must be large enough so that the expected count in each cell is at least five. Is this condition met? Explain.
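
One way to check the Large Sample Size Condition by hand is to compute every expected count as (row total × column total) / grand total and look at the smallest one. The sketch below assumes the counts from the contingency table above; the names observed and expected_counts are illustrative only and are not part of the DCMP tool.

```python
# Observed counts from the contingency table
# (rows: education level, columns: income level).
observed = [
    [2, 8, 46],       # Post-Grad Degree
    [39, 113, 202],   # College Degree
    [131, 138, 120],  # Some College
    [175, 129, 65],   # HS Grad
    [78, 32, 8],      # No HS Degree
]


def expected_counts(table):
    """Expected count per cell under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


expected = expected_counts(observed)
smallest = min(min(row) for row in expected)

print(f"Smallest expected count: {smallest:.2f}")
print("Every expected count is at least five:", smallest >= 5)
```

The smallest expected count always falls in the cell with the smallest row total and the smallest column total, so checking that single cell is enough.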

Question 4

4) Go to the DCMP Chi-Square Test tool at https://dcmathpathways.shinyapps.io/ChiSquaredTest/ and input the data in the contingency table. Notice that since we are checking whether there is evidence that these two variables are not independent, we can use either variable for the rows and the other for the columns. The order of the two variables does not matter.
Part A: Give the χ² test statistic and the P-value.
Part B: What conclusion do you draw based on the P-value? Explain.
Part C: Interpret your conclusion in context.
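
To cross-check the tool's output, the test statistic and P-value can also be computed directly. This is a minimal sketch, not the DCMP tool's implementation: it assumes the counts from the contingency table, and the function names are hypothetical. The P-value helper uses the closed-form upper-tail probability that holds only when the degrees of freedom are even, which is the case here since (5 − 1)(3 − 1) = 8.

```python
import math

# Observed counts (rows: education level, columns: income level).
observed = [
    [2, 8, 46],       # Post-Grad Degree
    [39, 113, 202],   # College Degree
    [131, 138, 120],  # Some College
    [175, 129, 65],   # HS Grad
    [78, 32, 8],      # No HS Degree
]


def chi_square_statistic(table):
    """Return (statistic, degrees of freedom) for a test of independence."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n
            stat += (obs - exp) ** 2 / exp
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df


def chi_square_upper_tail_even_df(x, df):
    """P(X > x) for a chi-square variable; valid only for even, positive df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires even, positive df")
    half = x / 2
    term = 1.0
    total = 1.0
    # Sum (x/2)^i / i! for i = 0 .. df/2 - 1, building each term from the last.
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


stat, df = chi_square_statistic(observed)
p = chi_square_upper_tail_even_df(stat, df)
print(f"chi-square = {stat:.2f}, df = {df}, P-value = {p:.3g}")
```

Swapping the rows and columns of observed leaves the statistic, the degrees of freedom, and the P-value unchanged, which matches the remark that the order of the two variables does not matter.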

Question 5

5) Recall from In-Class Activity 15.C that the standardized residuals can be treated as normal z-scores that indicate how large the difference is between the observed count and the expected count for each cell. Interpret and compare the standardized residuals for individuals with post-graduate degrees and individuals with high school degrees.
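
The residuals for the two rows of interest can be sketched as follows. This assumes the common definition of a standardized (adjusted) residual, (observed − expected) / sqrt(expected × (1 − row proportion) × (1 − column proportion)); the DCMP tool may use a different variant, so treat the values as a cross-check rather than a replacement for the tool's output. All names in the code are illustrative.

```python
import math

education_levels = ["Post-Grad Degree", "College Degree", "Some College",
                    "HS Grad", "No HS Degree"]
income_levels = ["< $30,000", "$30,000-$74,999", "$75,000 and up"]

# Observed counts (rows: education level, columns: income level).
observed = [
    [2, 8, 46],
    [39, 113, 202],
    [131, 138, 120],
    [175, 129, 65],
    [78, 32, 8],
]


def standardized_residuals(table):
    """Adjusted standardized residual for each cell (assumed definition)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    residuals = []
    for i, row in enumerate(table):
        out_row = []
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n
            # Standard error shrinks the raw difference to a z-score scale.
            se = math.sqrt(exp * (1 - row_totals[i] / n) * (1 - col_totals[j] / n))
            out_row.append((obs - exp) / se)
        residuals.append(out_row)
    return residuals


residuals = standardized_residuals(observed)
for level in ("Post-Grad Degree", "HS Grad"):
    i = education_levels.index(level)
    cells = ", ".join(f"{income}: {z:+.2f}"
                      for income, z in zip(income_levels, residuals[i]))
    print(f"{level} -> {cells}")
```

A positive residual means more respondents were observed in that cell than independence would predict, and a negative residual means fewer.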

Question 6

6) Based on the results of this test alone, can you assure someone that if they pursue more education, they will have a larger income? Explain. Hint: Since we have concluded that education level and income level are not independent, we have concluded that they are associated in some way, but do we know how they are associated?

Question 7

7) We concluded from our hypothesis test that the variables Income level and Education level are not independent, but we do not know how they are associated. It could be that there is a third variable not included in our study that impacts the values of both of the variables we are considering. Such a variable is called a lurking variable. Give an example of a lurking variable that could arise when considering the association of these two variables.

Question 8

8) Now that you’ve seen both the chi-square test of homogeneity and the chi-square test of independence in action, summarize the difference between the two tests in your own words.


  1. Pew Research Center. (2019). Core trends survey: Mobile technology and home broadband 2019. https://www.pewresearch.org/internet/dataset/core-trends-survey/