Overview
- In this in-class activity, students will use residual plots to investigate the appropriateness of linear regression models and explore the effect of outliers.
- The central dataset used in this lesson comes from several large grocery stores, all in the same chain, in San Antonio, Texas. The data are from 2019.
- This activity connects back to evaluating the strength of linear relationships, and prepares students for evaluating whether a linear model is appropriate for making predictions.
- [a list of tags like S2, O1, B1, V3] ← Link to EBTP descriptions
Prerequisite assumptions
Students should be able to do each of the following after completing the What to Know assignment.
- Calculate and interpret residuals.
- Identify violations of assumptions needed to perform linear regression.
- Discuss the effect of influential points on [latex]R^{2}[/latex].
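For instructors who want a quick refresher to project or share, the prerequisite skills above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data values are made up (they are not the grocery store dataset), and the helper names `fit_line`, `residuals`, and `r_squared` are hypothetical, not part of the course materials.

```python
import numpy as np

def fit_line(x, y):
    """Return the least-squares slope and intercept for y = intercept + slope * x."""
    slope, intercept = np.polyfit(x, y, 1)
    return slope, intercept

def residuals(x, y, slope, intercept):
    """Residual = observed y minus the y predicted by the line."""
    return y - (intercept + slope * x)

def r_squared(x, y):
    """R^2 = 1 - (sum of squared residuals) / (total sum of squares)."""
    slope, intercept = fit_line(x, y)
    res = residuals(x, y, slope, intercept)
    ss_res = np.sum(res ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1 - ss_res / ss_tot

# Hypothetical, nearly linear data (NOT the grocery store dataset).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.0])

# The same data plus one made-up point that falls far from the trend.
x_out = np.append(x, 7.0)
y_out = np.append(y, 2.0)

r2_clean = r_squared(x, y)
r2_with_point = r_squared(x_out, y_out)

print("R^2 without the unusual point:", round(r2_clean, 3))
print("R^2 with the unusual point:   ", round(r2_with_point, 3))
```

Comparing the two printed values gives students a concrete starting point for discussing how a single influential point can change [latex]R^{2}[/latex].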
Intended goals for this activity
After completing this activity, students should understand that residual plots can magnify potential issues with using a linear model, that linear regression models may not be appropriate when we observe non-linear data trends or non-constant variance of residuals, and that outliers affect the strength of a model and should be investigated. They should be able to construct and interpret a residual plot, as well as informally assess the appropriateness of a linear regression model.
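The goals above can be illustrated with a short Python sketch that builds the values behind a residual plot and, optionally, draws the scatterplot and residual plot side by side with dotted residual lines on both. Treat it as one possible demonstration, not the activity's prescribed method: the curved data are invented for illustration, the function names `residual_plot_data` and `draw_plots` are hypothetical, and drawing assumes matplotlib is installed.

```python
import numpy as np

def residual_plot_data(x, y):
    """Fit a least-squares line and return (predicted values, residuals)."""
    slope, intercept = np.polyfit(x, y, 1)
    predicted = intercept + slope * x
    return predicted, y - predicted

def draw_plots(x, y):
    """Draw the scatterplot with its fitted line next to the residual plot."""
    import matplotlib.pyplot as plt  # imported here so the rest runs without it

    predicted, res = residual_plot_data(x, y)
    fig, (ax_scatter, ax_resid) = plt.subplots(1, 2, figsize=(9, 4))

    # Left panel: data, fitted line, and a dotted segment for each residual.
    ax_scatter.scatter(x, y)
    ax_scatter.plot(x, predicted)
    ax_scatter.vlines(x, predicted, y, linestyles="dotted")
    ax_scatter.set_title("Scatterplot with fitted line")

    # Right panel: the same residuals measured from a horizontal line at 0.
    ax_resid.scatter(x, res)
    ax_resid.axhline(0)
    ax_resid.vlines(x, 0, res, linestyles="dotted")
    ax_resid.set_title("Residual plot")
    return fig

# Made-up curved data (NOT the grocery store dataset): y grows like x squared.
x_curved = np.arange(1.0, 11.0)
y_curved = x_curved ** 2

_, res_curved = residual_plot_data(x_curved, y_curved)
print(np.round(res_curved, 2))
```

For these invented data, the residuals are positive at both ends of the x range and negative in the middle. That U-shaped pattern is the kind of feature a residual plot makes easier to see than the scatterplot alone. Calling `draw_plots(x_curved, y_curved)` displays both panels when matplotlib is available.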
Synchronous Delivery and Activity Flow
The sample activity delivery below assumes a face-to-face class meeting but can be adapted to a fully online or hybrid delivery by using break-out rooms for pairs and small groups.
Frame the activity (3 minutes)
- Question 1 — Think-Pair-Share S2, C4, V1, V4, O3
- Have students read the question individually, then discuss with a partner before sharing with the class.
- Transition to the activity by briefly discussing the Objectives.
Activity Flow (20 minutes)
- Questions 2 and 3 — Working in Groups V1, V4, O3, S2, C6
- As a class, discuss responses for Question 2 before moving on to Question 3. Students need a complete understanding of Question 2 to complete Question 3 successfully.
- Since residual plots will be a new concept, ask groups that do well with these tasks to share their thinking with the class. This will help groups who may struggle with these tasks.
- Drawing the residual lines both on the scatterplot and in the residual plot is a good scaffold for students who have trouble seeing the connection.
- Questions 4 and 5 — Working in Groups V1, V4, O3, S2, C6
- For each of these questions, students may feel uncomfortable with the fact that there is no single “right answer.” Emphasize that statistics is about making a choice and justifying it with sound reasoning. The sample answers provide good models of defensible reasoning.
- As a frame, ask students: “What would be a clearly misleading thing to do in this situation?” This may help students think through which choices are reasonable and which choices are less reasonable.
- Question 6 — Whole Class Discussion S4, C3, V1, O1, B2, B4
- Have students answer this question independently. Then, allow them to discuss in small groups before asking several individuals to share their answers with the class.
- Ensure students include specific reasoning and sample a variety of viewpoints.
Wrap-up/transition (2 minutes)
- Have students refer back to the Objectives for the activity and check the ones they recognize.
- Assign the homework or Practice and any What to Know pages for the Forming Connections activities you plan to complete in the next class meeting. C2