6D | Exemplar Statistics Alpha Book

State	Losses ($)	Observed Insurance Premiums ($)	Predicted Insurance Premiums ($)	Residual ($)
Louisiana	$194.78	$1,281.50
Idaho	$82.75	$642.00
Montana	$85.15	$816.20
Oklahoma	$178.86	$881.50
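
The two empty columns in the table can be filled in from a regression line: the predicted premium is the line's value at a state's losses, and the residual is the observed premium minus the predicted premium. The sketch below assumes the line y = 285 + 4.47x quoted in the scatterplot description for losses and insurance premiums is the one meant for this table; the function names are illustrative only.

```python
# Assumption: the line of best fit is y = 285 + 4.47x, as quoted in the
# scatterplot description for losses vs. insurance premiums.
INTERCEPT = 285.0
SLOPE = 4.47

# (state, losses in $, observed premium in $), copied from the table.
rows = [
    ("Louisiana", 194.78, 1281.50),
    ("Idaho", 82.75, 642.00),
    ("Montana", 85.15, 816.20),
    ("Oklahoma", 178.86, 881.50),
]


def predicted_premium(losses):
    """Premium predicted by the regression line for a given loss amount."""
    return INTERCEPT + SLOPE * losses


def residual(observed, losses):
    """Residual = observed value minus predicted value."""
    return observed - predicted_premium(losses)


for state, losses, observed in rows:
    pred = predicted_premium(losses)
    res = residual(observed, losses)
    print(f"{state}: predicted {pred:.2f}, residual {res:.2f}")
```

A positive residual means the state's observed premium sits above the line (the line underpredicts); a negative residual means it sits below the line.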

A scatterplot with a line of best fit. The horizontal axis is labeled "losses" and the vertical axis is labeled "insurance_premiums." One of the points is labeled "state: New Jersey" and is located at approximately (159, 1300). The equation for the line of best fit is also given as y = 285 + 4.47x.

A scatterplot with a regression line whose equation is labeled as y = 285 + 4.47x. The horizontal axis is labeled "losses" and the vertical axis is labeled "insurance_premiums."

A large array of fruits, vegetables, and other fresh foods.

A residual plot with the x-axis labeled "Average Income in Zip Code ($)" and the y-axis labeled "Residual". The x-axis is numbered in increments of 20,000 starting at 40,000 and continuing up to 140,000. The y-axis is numbered in increments of 20, starting at -20 and going up to 40. The first four points are at approximately (37000, -14), (39000, -9), (41000, -10), (42000, -9).

Two graphs. On the left is a scatterplot with a line of best fit. On the right is a residual plot, where the dots appear randomly scattered around the horizontal line in the middle.

A scatterplot showing "Average Income in Zip Code ($)" on the horizontal axis and "Number of Organic Items Offered" on the vertical axis. The horizontal axis is numbered in increments of 20,000 from 40,000 to 140,000. The vertical axis is labeled in increments of 20 from 0 to 100. There is a line of best fit whose equation is labeled as y = -14.7 + 0.000959x.

A scatterplot with "Average Income in Zip Code ($)" on the x-axis and "Number of Organic Items Offered" on the y-axis. At the top, it reads "Regression Line: y = -6.82 + 0.000828x." The point furthest from the line is at approximately (125000, 53).

A residual plot with the x-axis labeled "Average Income in Zip Code ($)" and the y-axis labeled "Residual". The x-axis is numbered in increments of 20,000 starting at 40,000 and continuing up to 140,000. The y-axis is numbered in increments of 20, starting at -20 and going up to 40. The point furthest from the center line is at approximately (125000, -42).
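
As a rough check on the descriptions above, the residual of the point at approximately (125000, 53) can be computed under each of the two regression equations quoted for the organic-items data. This is only a sketch: the coordinates are approximate readings from the plots, and which dataset each line was fit to is not stated in the descriptions.

```python
# Approximate coordinates of the point farthest from the line, as read
# from the scatterplot description (an approximation, not exact data).
income = 125_000
observed_items = 53

# The two regression equations quoted in the plot descriptions,
# stored as (intercept, slope).
lines = {
    "y = -14.7 + 0.000959x": (-14.7, 0.000959),
    "y = -6.82 + 0.000828x": (-6.82, 0.000828),
}


def residual_at(intercept, slope, x, y):
    """Observed y minus the value the line predicts at x."""
    return y - (intercept + slope * x)


residuals = {
    label: residual_at(a, b, income, observed_items)
    for label, (a, b) in lines.items()
}

for label, res in residuals.items():
    print(f"{label}: residual {res:.2f}")
```

Under y = -6.82 + 0.000828x the residual comes out near -44, close to the value of about -42 read off the residual plot; under y = -14.7 + 0.000959x it is near -52.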

A scatterplot labeled "Internet Use (%)" on the horizontal axis, which is numbered in increments of 10 from 10 to 90, and "GDP in billions" on the vertical axis, which is numbered in increments of 2,000 from 0 to 16,000. The line of best fit has a positive slope and is also near the bottom of the graph.

A scatterplot labeled "Internet Use (%)" on the horizontal axis, which is numbered from 10 to 90 in increments of ten, and "GDP per Capita" on the vertical axis, which is numbered 0 to 70 by increments of 10. The line of best fit extends approximately from (22, 0) to (95, 50) and the points are clustered relatively closely around it.

A residual plot labeled "Internet Use (%)" on the x-axis and "Residual" on the y-axis. Most points are relatively close to the line. Some points are also clustered together. There are more points with higher x-values. The points with higher y-values tend to have high or low x-values, rather than moderate ones.

A scatterplot showing Gestational Period and Longevity for 21 Animals. The x-axis is labeled "Gestation (days)" and the y-axis is labeled "Longevity (years)." The regression line with an equation y = 6.29 + 0.0449x goes from the bottom left of the graph to the upper right. One of the points is labeled "Animal: Bear, Gestation (days): 220, Longevity (years): 22."

A scatterplot with a line of best fit. For lower x-values, the points are close to the line. As x increases, the vertical spread of the points around the line increases as well.

Skill or Concept: I can . . .	Questions to check your understanding	Rating from 1 to 5
Calculate and interpret residual errors.	1, 2, 3, 4
Identify violations of assumptions needed to perform linear regression.	5, 8
Discuss the effect of influential points on the regression line.	6, 7

influential point: an observation that does not fit the trend of the data.
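
One way to see the effect of such a point is to fit the least-squares line twice, once without the point and once with it, and compare the results. The sketch below uses made-up data (not from this lesson) and a hypothetical helper, `fit_line`, that applies the standard least-squares formulas.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = sum of cross-deviations divided by sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: five points lying exactly on y = 2x ...
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]

# ... plus one observation far to the right that falls well below that trend.
xs_with = xs + [15]
ys_with = ys + [5]

without_point = fit_line(xs, ys)
with_point = fit_line(xs_with, ys_with)

print("without the point: intercept %.2f, slope %.2f" % without_point)
print("with the point:    intercept %.2f, slope %.2f" % with_point)
```

In this made-up example the single added observation flattens the slope considerably, which is the kind of change to look for when comparing two fitted lines for the same scatterplot.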
