15D Preview

Preparing for the next class

In the next in-class activity, you will need to understand the difference between the chi-square test of homogeneity and the chi-square test of independence, as well as what it means for two variables to be independent. You will also need to be able to identify the null and alternative hypotheses for a chi-square test of independence and find expected counts for the cells of the contingency table in a chi-square test of independence.

The Pew Research Center is a non-partisan fact tank that conducts polls and social science research. One survey that it conducts periodically is the Core Trends Survey, which measures a wide variety of variables for a representative sample of American adults, including demographic information and information on Internet and social media use. Two of the variables included in the survey are Education level and Income level. The observed counts from the 2019 Core Trends Survey for these two variables are displayed in the following two-way table. ^[1]

We’ve seen two-way tables (also called contingency tables) before in a couple of contexts. In the previous lesson, we saw contingency tables that displayed values for one categorical variable for samples from multiple populations. In this situation, the two-way table classifies counts for a sample of individuals from one population on two categorical variables.

Count, by Education level (rows) and Income level (columns)
Education level	< $30,000	$30,000-$74,999	$75,000 and up	Total
Post-Grad Degree	2	8	46	56
College Degree	39	113	202	354
Some College	131	138	120	389
HS Grad	175	129	65	369
No HS Degree	78	32	8	118
Total	425	420	441	1,286
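
As a quick check on the table above, the observed counts can be entered into a small Python structure and the margins recomputed. This is only a minimal sketch; the variable names (`observed`, `row_totals`, `col_totals`, `grand_total`) are illustrative and are not part of the course materials.

```python
# Observed counts copied from the two-way table above.
# Rows are education levels; columns are income levels.
education_levels = [
    "Post-Grad Degree",
    "College Degree",
    "Some College",
    "HS Grad",
    "No HS Degree",
]
income_levels = ["< $30,000", "$30,000-$74,999", "$75,000 and up"]

observed = [
    [2, 8, 46],
    [39, 113, 202],
    [131, 138, 120],
    [175, 129, 65],
    [78, 32, 8],
]

# Row totals: number of respondents at each education level.
row_totals = [sum(row) for row in observed]

# Column totals: number of respondents at each income level.
col_totals = [sum(col) for col in zip(*observed)]

# Grand total: size of the whole sample.
grand_total = sum(row_totals)

for level, total in zip(education_levels, row_totals):
    print(f"{level}: {total}")
for level, total in zip(income_levels, col_totals):
    print(f"{level}: {total}")
print(f"Total: {grand_total}")
```

The recomputed margins should agree with the Total row and Total column of the table, which confirms that every respondent is counted in exactly one cell.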

Since we have two categorical variables measured for the same sample of individuals, the natural question to ask is, “Are these two variables independent?” In other words, “Is income level independent of education level?” We address this question using the chi-square test of independence.

Recall from In-Class Activity 7.C that two events, A and B, are independent if 𝑃(𝐴) = 𝑃(𝐴 | 𝐵) (i.e., knowing whether event B happens has no effect on how likely event A is to occur). If the two variables Income level and Education level are independent, knowing a person’s education level should not change the probability that they have a particular income level, so the distribution of Income level should be the same for every education level. Similarly, the distribution of Education level should be the same for every income level.

This should feel reminiscent of the chi-square test of homogeneity, but it differs in two important ways. The homogeneity test considered one categorical variable measured for samples from different populations and asked whether the distribution of that one variable was the same among the populations. In this case, we have one sample from one population of individuals for which two categorical variables are measured, and we’re asking whether those two variables are independent.
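
To make the idea of "the same distribution of Income level for every education level" concrete, the sketch below computes the observed income distribution within each education level and compares it with the overall income distribution. It also computes the counts we would expect in each cell if the two variables were independent, using the standard rule (row total × column total) / grand total. That rule is not stated in the text above and is included here as an assumption about how expected counts will be found in class. All variable names are illustrative.

```python
# Observed counts from the two-way table (rows: education, columns: income).
education_levels = [
    "Post-Grad Degree",
    "College Degree",
    "Some College",
    "HS Grad",
    "No HS Degree",
]
observed = [
    [2, 8, 46],
    [39, 113, 202],
    [131, 138, 120],
    [175, 129, 65],
    [78, 32, 8],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Overall distribution of income level, ignoring education level.
overall_income_dist = [c / grand_total for c in col_totals]

# Distribution of income level within each education level.
# Under independence, every row here should resemble overall_income_dist.
income_dist_by_education = [
    [count / total for count in row]
    for row, total in zip(observed, row_totals)
]

# Expected counts under independence (assumed standard rule):
# expected = (row total * column total) / grand total
expected = [
    [r * c / grand_total for c in col_totals]
    for r in row_totals
]

print("Overall:", [round(p, 3) for p in overall_income_dist])
for level, dist in zip(education_levels, income_dist_by_education):
    print(level, [round(p, 3) for p in dist])
for level, row in zip(education_levels, expected):
    print(level, [round(e, 1) for e in row])
```

If the within-row distributions differ noticeably from the overall distribution, the observed counts will also differ noticeably from the expected counts, which is the kind of discrepancy the chi-square test of independence evaluates.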

Question 1

1) For each of the following statements, select whether it applies to the chi-square test of homogeneity, the chi-square test of independence, or the chi-square goodness of fit test.

Part A: The question we ask is, “Are the variables independent?”

a) Chi-square test of homogeneity