What to Know About Comparing Quantitative Distributions: 3E – 14

goals for this section

After completing this section, you should feel comfortable performing these skills.

When describing and summarizing data, you will frequently need to describe the shape of the distribution of a quantitative variable, compare the centers and spreads of several distributions, and determine whether a distribution contains outliers.

In Forming Connections in Applications of Histograms: 3D, we learned to describe the distribution of one variable at a time. Recall that the description of a distribution includes shape, center, spread, and the presence or absence of outliers. In the next activity, you will need to use technology to create and interpret histograms and dotplots for a quantitative variable compared across groups. To prepare for the activity, in this section you’ll practice describing the shape of distributions and identifying the presence of outliers while using graphs to compare the center and spread of several groups at once.

Let’s start by going to the data analysis tool to create a set of histograms for a variable in a dataset that contains several groups. Follow the directions below to create stacked histograms of the price of a room rental in New York City across different types of rooms.

For Questions 1–3 below:

Go to the Describing and Exploring Quantitative Variables tool at https://dcmathpathways.shinyapps.io/EDA_quantitative/. 

Step 1) Select the Several Groups tab at the top of the page.

Step 2) Locate the dropdown under Enter Data and select From Textbook.

Step 3) Locate the dropdown under Dataset and select Airbnb Price by Type of Room.

Step 4) Under Choose Type of Plot: select Histogram.

Step 5) Under Histogram Options: select stacked.

Use the displayed histograms of Airbnb rental prices in New York City (in $) to answer Questions 1–3 below.
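The tool draws the stacked histograms for you, but it can help to see the idea behind them: every group is counted with the same bins, so the plots share one horizontal axis and can be compared directly. Here is a minimal sketch of that idea in Python. The group names, the prices, the bin settings, and the helper `bin_counts` are all hypothetical and chosen only for illustration; they are not taken from the Airbnb dataset or from the tool.

```python
# Hypothetical prices in dollars for two made-up groups.
# These are NOT values from the Airbnb dataset.
groups = {
    "Group A": [40, 55, 60, 65, 70, 90, 120],
    "Group B": [80, 110, 130, 150, 160, 220, 310],
}


def bin_counts(values, start, width, n_bins):
    """Count the values that fall in each bin [start + i*width, start + (i+1)*width)."""
    counts = [0] * n_bins
    for v in values:
        i = int((v - start) // width)
        if 0 <= i < n_bins:  # ignore anything outside the chosen bins
            counts[i] += 1
    return counts


# Assumed bin settings: the same bins are used for every group,
# so the text histograms below line up on one shared axis.
START, WIDTH, N_BINS = 0, 50, 7

for name, values in groups.items():
    print(name)
    for i, count in enumerate(bin_counts(values, START, WIDTH, N_BINS)):
        low = START + i * WIDTH
        print(f"  {low:>3}-{low + WIDTH - 1:<3} | {'*' * count}")
```

Because both groups use the same bins, you can read shape, center, and spread straight down the page, just as you do with the stacked histograms in the tool.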

Recall

In the previous section, What to Know About Applications of Histograms: 3D, you learned how to visually assess center and spread. Do you recall those techniques?

question 1

Describe the shape of the distribution of Airbnb prices for renting an entire apartment. Select the best description.

a) Approximately symmetric

b) Right skewed

c) Left skewed

d) Bimodal

comparing centers and spread

[Perspective Video — a 3-instructor video that shows how to think about comparisons of centers and spread in a stacked histogram that displays a quantitative variable across more than one group]

Comparing centers

Now, instead of looking at just one of the distributions, let’s compare the centers of all three plots to answer a question. Recall that we think of the center of quantitative data as the location of the middle of the data (a “typical” value). Later, we’ll identify the numerical value of the center precisely using descriptive statistics. For now, just use the graphs to compare the centers of the different groups.
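As a preview of that later step, here is a minimal sketch of how a center could be computed for each group. It assumes the median as the measure of center, which is one common choice; this section does not say which statistic the course will use. The group names and prices are hypothetical and are not from the Airbnb dataset.

```python
from statistics import median

# Hypothetical prices in dollars for two made-up groups.
# These are NOT values from the Airbnb dataset.
groups = {
    "Group A": [40, 55, 60, 65, 70, 90, 120],
    "Group B": [80, 110, 130, 150, 160, 220, 310],
}

# The median is the middle value once the data are sorted,
# so it is one way to put a number on a "typical" value.
centers = {name: median(values) for name, values in groups.items()}

for name, center in centers.items():
    print(f"{name}: median = {center}")

# The group with the smallest median has the smallest typical value.
smallest = min(centers, key=centers.get)
print("Smallest typical value:", smallest)
```

On a graph, you make the same comparison by eye: look for where the middle of each distribution sits along the shared horizontal axis.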

Examine the stacked histograms that appear in the analysis tool to compare their centers. Use your observations to answer Question 2 below.

question 2

Which of the three Airbnb room types has the smallest typical price?

a) Shared room

b) Private room

c) Entire apartment

d) All three have similar centers

Comparing spread

Recall that spread is a measure of how much the values in a dataset tend to differ from one another. You saw in the previous section that one way to describe spread is to calculate the range of the data: the maximum value minus the minimum value.

We’ll see later that there are other ways to assign a numerical value to spread, but for now just use the graphs to visually compare the range of each distribution.
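If you would like to see the range calculation written out, here is a minimal sketch in Python. The helper `value_range`, the group names, and the prices are hypothetical and chosen only for illustration; they are not from the Airbnb dataset.

```python
# Hypothetical prices in dollars for two made-up groups.
# These are NOT values from the Airbnb dataset.
groups = {
    "Group A": [40, 55, 60, 65, 70, 90, 120],
    "Group B": [80, 110, 130, 150, 160, 220, 310],
}


def value_range(values):
    """Range = maximum value minus minimum value."""
    return max(values) - min(values)


ranges = {name: value_range(values) for name, values in groups.items()}

for name, r in ranges.items():
    print(f"{name}: range = {r}")

# The group with the largest range is the most spread out by this measure.
widest = max(ranges, key=ranges.get)
print("Greatest range:", widest)
```

On a histogram or dotplot, the range corresponds to the horizontal distance between the leftmost and rightmost data values, which is what you compare visually in the next question.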

question 3

Which of the three Airbnb room types has the greatest range?

a) Shared room

b) Private room

c) Entire apartment

d) All three have similar ranges

using technology to interpret distributions

[Worked example – a 3-instructor video works through an example like Questions 4–6 below]

For Questions 4–6, change the tool inputs to the following:

  • Dataset: CO2 Emissions by Continent
  • Choose Type of Plot: Dotplot
  • Select Dot Size: Choose an appropriate size to visualize the data efficiently

Use the displayed dotplot of per capita CO2 emissions (in metric tons) to answer the following questions.

question 4

Describe the shape of the distribution of per capita CO2 emissions for Europe. Select the best description.

a) Approximately symmetric

b) Right skewed

c) Left skewed

d) Bimodal

question 5

Are there any outliers in the distribution of per capita CO2 emissions for Central and South America? If so, what are the approximate values of the outliers?

question 6

Which of the two distributions of per capita CO2 emissions has the greater range of values?

a) Central and South America

b) Europe

c) The two distributions have similar ranges.

Summary

In this section, you’ve gained practice with distributions of quantitative variables by examining distributions of a single variable across several groups. You’ve described their shapes, identified centers and spread, and made comparisons between more than one distribution. You’ve also had practice in identifying outliers in a distribution. Let’s summarize what you’ve seen so far.

  • In Questions 1 and 4, you used technology to create a distribution as a histogram or a dotplot and described the shape of the distribution you created.
  • In Question 2, you compared centers of distributions of a quantitative variable.
  • In Questions 3 and 6, you compared the spread in distributions.
  • In Question 5, you determined the presence of outliers in the distribution of a quantitative variable.

Being able to describe the shape, center, and spread of a distribution and to detect outliers is essential to completing the next Forming Connections activity. If you are ready, then it’s time to move on!