14C Coreq

In the next preview assignment and in the next class, you will need to determine whether different groups of data have similar variation. In this corequisite support activity, we’ll use different tools to explore data before conducting formal analyses.

What Should You Pack?

Suppose you’re taking a trip across the country to visit several friends. You’ll be traveling through Miami, Florida; Phoenix, Arizona; and Honolulu, Hawaii, but you’re not sure what kinds of clothes to pack. What’s the weather like in those cities? A quick Internet search tells you that the mean high temperature in each city is as follows:

City Mean High Temperature[1]
Miami 84 °F
Phoenix 87 °F
Honolulu 84 °F

Question 1

1) What are your first impressions of the weather in each city? What other information would you like to have before deciding what to pack?

Question 2

2) A single summary value, such as the mean, is not enough information to help you pack. The following are boxplots representing the daily high temperatures in each city over the course of a year.[2]

A box plot labeled “Daily High Temperature” on the horizontal axis, with “Miami,” “Phoenix,” and “Honolulu” on the vertical axis. For Miami, the low point is at approximately 75 and the high point is at approximately 92. The low end of the box is at approximately 80, the high end is at approximately 88, and the middle line is at approximately 84. For Phoenix, the low point is at approximately 66 and the high point is at approximately 105. The low end of the box is at approximately 75, the high end is at approximately 101, and the middle line is at approximately 87. For Honolulu, the low point is at approximately 80 and the high point is at approximately 89. The low end of the box is at approximately 81, the high end of the box is at approximately 87, and the middle line is at approximately 84. There are also points on the horizontal axis corresponding to each of the middle lines and labeled “y bar sub 1,” “y bar sub 2,” and “y bar sub 3,” respectively.

Part A: Which city has the greatest variability in the data? Explain what this means about the weather in this city.

Part B: Which city has the least variability in the data? Explain what this means about the weather in this city.
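One way to put numbers on what the boxplots show is to compute the range and the interquartile range (IQR) for each city. The following is a minimal sketch that uses the approximate values read from the boxplot description above; the names `boxplots`, `spread`, and `spreads` are made up for this illustration.

```python
# Approximate values read from the boxplots above, in degrees Fahrenheit:
# (low point, low end of box, middle line, high end of box, high point)
boxplots = {
    "Miami": (75, 80, 84, 88, 92),
    "Phoenix": (66, 75, 87, 101, 105),
    "Honolulu": (80, 81, 84, 87, 89),
}


def spread(summary):
    """Return (range, IQR) for one five-number summary."""
    low, q1, _median, q3, high = summary
    return high - low, q3 - q1


# Compute both measures of spread for every city.
spreads = {city: spread(values) for city, values in boxplots.items()}

for city, (data_range, iqr) in spreads.items():
    print(f"{city}: range = {data_range} °F, IQR = {iqr} °F")
```

A larger range or IQR means the daily high temperatures are more spread out. Because the inputs are approximate readings from a plot, the results are estimates only.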

  2. U.S. climate data. (2021). https://www.usclimatedata.com/climate/united-states/us

Question 3

3) For all three cities, the lowest temperatures are seen in December, January, and February. If your trip is in January, how will this affect what clothes you pack?

Question 4

4) Practice evaluating variability one more time. Which of the following boxplots represents the data with the greatest variability?

A box plot with A, B, and C on the vertical axis. For A, the low point is at approximately 8 and the high point is at approximately 12. The low end of the box is at approximately 10, the high end is at approximately 11.4, and the middle line is at approximately 10.9. For B, the low point is at approximately 7 and the high point is at approximately 10. The low end of the box is at approximately 7.8, the high end is at approximately 9.3, and the middle line is at approximately 8.5. For C, the low point is at approximately 3 and the high point is at approximately 12. The low end of the box is at approximately 5.4, the high end is at approximately 9.8, and the middle line is at approximately 7.6. There is a point labeled “y bar sub 1” at approximately 10.8, another labeled “y bar sub 2” at approximately 8.4, and one more labeled “y bar sub 3” at approximately 7.6.

Variance From Summary Statistics

Question 5

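5) When statistical software provides summary statistics of data, it usually provides standard deviation instead of variance. Because variance is the square of the standard deviation, the group with the larger standard deviation also has the larger variance, so we can use standard deviation to compare the variability between groups.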
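The link between standard deviation and variance can be checked with a short sketch. The data below are hypothetical and are not taken from this activity; `group_a`, `group_b`, and `variance_from_sd` are names invented for the example. The sketch uses Python's built-in `statistics` module, which computes the sample standard deviation and sample variance.

```python
import statistics

# Hypothetical data for illustration only; not from the activity.
group_a = [10, 12, 11, 13, 9]
group_b = [4, 18, 11, 20, 2]


def variance_from_sd(sd):
    """Variance is the square of the standard deviation."""
    return sd ** 2


sd_a = statistics.stdev(group_a)  # sample standard deviation of group A
sd_b = statistics.stdev(group_b)  # sample standard deviation of group B

# Squaring each standard deviation gives the same value as the variance.
print("Group A:", sd_a, variance_from_sd(sd_a), statistics.variance(group_a))
print("Group B:", sd_b, variance_from_sd(sd_b), statistics.variance(group_b))
```

Since standard deviations are never negative, ranking groups by standard deviation gives the same order as ranking them by variance.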

Look at the following summary statistics for two groups of data. Which group has more variability? Explain.

[Two images of summary statistics; not reproduced here]

Question 6

6) This example uses four groups instead of two and compares SAT scores in different regions of the United States.

Part A: Look at the following summary statistics. Which region has the most variability? Explain.

Part B: Which region has the least variability? Explain.

Part C: In order for an ANOVA to be appropriate, it is recommended that the largest standard deviation be no more than two times the smallest standard deviation when the sample sizes are equal.

Would these groups meet that recommendation for using an ANOVA to compare means?
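The recommendation in Part C can be written as a quick check: divide the largest standard deviation by the smallest and see whether the result is at most 2. The following is a minimal sketch; the function name `meets_sd_rule`, the parameter `max_ratio`, and the example values are hypothetical and are not the SAT statistics from this question.

```python
def meets_sd_rule(standard_deviations, max_ratio=2.0):
    """Return True if the largest SD is no more than max_ratio times the smallest."""
    largest = max(standard_deviations)
    smallest = min(standard_deviations)
    if smallest <= 0:
        raise ValueError("standard deviations must be positive")
    return largest / smallest <= max_ratio


# Hypothetical standard deviations for four groups (illustration only).
example_sds = [95.0, 110.0, 102.0, 120.0]
print(meets_sd_rule(example_sds))    # 120 / 95 is about 1.26

# A second hypothetical set where one group is much more spread out.
print(meets_sd_rule([50.0, 120.0]))  # 120 / 50 = 2.4
```

To answer Part C, replace the example values with the four standard deviations from the summary statistics.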

Interpreting ANOVA Hypotheses

When conducting an ANOVA, the null and alternative hypotheses can be written as:

  • Null – [latex]H_{0}:\;\mu_{1}=\mu_{2}=\ldots=\mu_{k}[/latex]
  • Alternative – [latex]H_{a}[/latex]: At least two means are different.
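To see what an ANOVA actually produces, here is a minimal sketch that computes the one-way ANOVA F statistic by hand: the variation between the group means divided by the variation within the groups. The function name `one_way_anova_f` and the data are hypothetical, and the sketch stops at the F statistic; turning it into a p-value requires the F distribution, which is not computed here.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups of numbers."""
    k = len(groups)                           # number of groups
    all_values = [x for g in groups for x in g]
    n_total = len(all_values)                 # total number of observations
    grand_mean = mean(all_values)

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Hypothetical data for illustration only: three groups of three values.
example_groups = [[1, 2, 3], [2, 3, 4], [7, 8, 9]]
f_statistic = one_way_anova_f(example_groups)
print(f_statistic)
```

Notice that the result is a single number for the whole set of groups. Nothing in it identifies which group is responsible for a large value.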

But what does it mean when we reject the null hypothesis? Remember that an ANOVA only tells us whether there is a difference among the group means, not which group(s) differ. Let’s use colors to understand it better.

Question 7

7) Suppose there is a version of ANOVA that compares colors instead of means. We have five circles, and we can’t decide if the colors are significantly different without the help of this analysis.

A dark blue circle with the number 1 in it. A somewhat dark blue circle with the number 2 in it. A light blue circle with the number 3 in it. A dull blue circle with the number 4 in it. A bright blue circle with the number 5 in it.

Part A: Write the null and alternative hypotheses for this scenario.

Part B: After the analysis, we fail to reject the null hypothesis. A friend says, “That means all the colors are the same!” How would you respond?

Question 8

8) Now we’ll repeat the analysis on a new set of circles.

A dark blue circle with the number 1 in it. A reddish purple circle with the number 2 in it. A bright indigo circle with the number 3 in it. A purple circle with the number 4 in it. A dark indigo circle with the number 5 in it.

Part A: Write the null and alternative hypotheses for this scenario.

Part B: After the analysis, we reject the null hypothesis. Your friend asks you which circles are different colors. How would you respond?

Question 9

9) Let’s do one more example.

A blue circle with the number 1 in it. A blue circle with the number 2 in it. A yellow circle with the number 3 in it. A blue circle with the number 4 in it. A blue circle with the number 5 in it.

Part A: Write the null and alternative hypotheses for this scenario.

Part B: After the analysis, we reject the null hypothesis. Using this analysis, can we determine which circles have different colors? Explain.


  1. U.S. climate data. (2021). https://www.usclimatedata.com/climate/united-states/us