Corequisite Support Activity for Calculating Mean and Median of a Dataset: 4A – 16

What you’ll need to know

In this support activity you’ll become familiar with the following:

You will also have an opportunity to refresh the following skills:

In the next section of the course material you will refresh your knowledge of mean and median by calculating them for a small dataset. In the following activity, you’ll use technology to calculate them for larger datasets in order to read, interpret, and make comparisons of centers between histograms. In this support activity, you’ll review how to read and interpret two kinds of graphs that display the distribution of quantitative data: dotplots and histograms.

Graphical Displays Illustrating Frequency

Let’s begin by revisiting data from a sleep study [1] of college students that we saw in Forming Connections in Displaying Categorical Data: 3A. We’ll explore and compare the distributions of a few of the numerical variables from the study, including alcoholic drinks consumed per week, hours of sleep per night on the weekends, and classes missed in a semester.

Read and interpret a dotplot

Below we are given a dotplot for the variable Alcoholic Drinks Per Week. Recall that a dotplot is used to display the frequency and distribution of a quantitative variable. Use this dotplot to answer Questions 1-3.

Recall

You may wish to refresh your understanding of how data is represented in a dotplot.

A dot plot showing the average number of Alcoholic drinks consumed per week by college students. A count of 33 students drank 0. A count of 9 students drank 1. A count of 16 students drank 2. A count of 30 students drank 3. A count of 18 students drank 4. A count of 31 students drank 5. A count of 23 students drank 6. A count of 22 students drank 7. A count of 14 students drank 8. A count of 11 students drank 9. A count of 26 students drank 10. Zero students drank 11. A count of 9 students drank 12. A count of 3 students drank 13. One student drank 14. A count of 3 students drank 15. One student drank 18. A count of 2 students drank 20. One student drank 24.
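It can help to think of a dotplot as a table that pairs each value on the horizontal axis with a count of dots. The short Python sketch below stores the counts from the description above in a dictionary and prints a rough, sideways text version of the dotplot with one asterisk per student. The names drink_counts and text_dotplot are our own choices for illustration and are not part of the study or the course materials.

```python
# Counts read from the dotplot description above:
# key = alcoholic drinks per week, value = number of students (dots).
drink_counts = {
    0: 33, 1: 9, 2: 16, 3: 30, 4: 18, 5: 31, 6: 23, 7: 22, 8: 14, 9: 11,
    10: 26, 11: 0, 12: 9, 13: 3, 14: 1, 15: 3, 18: 1, 20: 2, 24: 1,
}

def text_dotplot(counts):
    """Return one line per value, with one '*' per observation."""
    lines = []
    for value in sorted(counts):
        lines.append(f"{value:>2} | " + "*" * counts[value])
    return "\n".join(lines)

# Each dot is one student, so the number of dots is the number of students shown.
total_students = sum(drink_counts.values())

print(text_dotplot(drink_counts))
print("Total students shown:", total_students)
```

Because every dot stands for one student, adding up the counts gives the number of students displayed in the dotplot.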

In order to use a graphical display to answer questions about the dataset, it helps to first ask yourself a question or two to become familiar with the visualization. We’d like to know what information this dotplot conveys about the participating students in the study. Then we can use it to answer questions about the data. Question 1 will help orient you to the information presented in the dotplot. Questions 2 and 3 ask specifically about the data.

question 1

What is the variable of interest represented in the dotplot? Is it categorical or quantitative? What does each dot in the dotplot represent? Discuss the most and least frequent responses. Are there any unusual observations?

Now that you are familiar with the information presented in the display, you can use it to answer questions about the data.

question 2

How many students in this study stated that they consume 9 alcoholic drinks per week?

question 3

According to the dotplot, which group is larger: the number of students who consume 5 alcoholic drinks per week or the number of students who consume 10 alcoholic drinks per week? How do you know?

Next, we’ll see how a histogram presents information about the distribution of a quantitative variable.

Read and interpret a histogram

Below we are given a histogram for the variable Hours of Sleep Per Night (Weekend). Recall that a histogram also displays the distribution of a quantitative variable. Unlike a dotplot, in which each observation appears as a dot stacked above its value, a histogram groups the observations into intervals (bins) and shows the number of observations in each bin as the height of a bar.
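To see how observations are gathered into bars, consider the following minimal Python sketch. The sleep times in it are made up for illustration and are not taken from the study, and the function name bin_counts is our own. The sketch assumes bins that are one hour wide and that include their left edge but not their right edge.

```python
import math
from collections import Counter

BIN_WIDTH = 1  # assumed bin width, in hours

# Hypothetical sleep times in hours (made up for illustration, not study data).
sleep_hours = [6.5, 7.0, 7.25, 7.8, 8.0, 8.1, 8.4, 8.9, 9.5, 10.2]

def bin_counts(values, width=BIN_WIDTH):
    """Count how many values fall in each bin [start, start + width)."""
    counts = Counter()
    for v in values:
        start = math.floor(v / width) * width  # left edge of the bin holding v
        counts[start] += 1
    return dict(sorted(counts.items()))

hist = bin_counts(sleep_hours)

# Print one bar per bin; the number of '#' marks is the bar's height.
for start, count in hist.items():
    print(f"{start}-{start + BIN_WIDTH}: {'#' * count}")
```

Notice that the individual values are no longer visible once they are grouped. A value that sits exactly on an edge, such as 8.0, is counted in the bin that starts at that edge under the convention assumed here.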

Recall

You may wish to refresh your understanding of how data is represented in a histogram.

Use the following histogram to address Questions 4-7.

A histogram labeled "Hours of Sleep Per Night (Weekend)" on the x-axis and "Count" on the y-axis. For 4-5, the count is approximately 4. For 5-6, the count is approximately 8. For 6-7, the count is approximately 33. For 7-8, the count is approximately 57. For 8-9, the count is approximately 70. For 9-10, the count is approximately 54. For 10-11, the count is approximately 23. For 11-12, the count is approximately 4. For 12-13, the count is approximately 2.

As you did for the dotplot above, first orient yourself to the information conveyed in the histogram by answering Question 4. Then, compare and contrast the histogram to the dotplot in Question 5. Finally, read and interpret the histogram to answer Questions 6 and 7.

question 4

What is the variable of interest displayed in the histogram? Is it categorical or quantitative? What does the height of each bar in the histogram represent? What appears to be the tendency of responses for the variable of interest?

Now that you are familiar with the information presented in the histogram, look back at the dotplot and consider general differences and similarities in the two types of displays.

question 5

Compare and contrast a histogram and dotplot. In what ways are they similar? In what ways are they different?

Histograms are more commonly encountered than dotplots as a visualization of quantitative data since they can more concisely display large datasets. Dotplots are more appropriate for smaller sets of data in which the observations (the dots) are not overwhelmingly numerous.

Now use the histogram to answer questions about the variable Hours of Sleep Per Night (Weekend).

question 6

According to the histogram, approximately how many college students in this study get between 9 and 10 hours of sleep per night on the weekends?

Note: The bin 9–10 technically counts students that slept anywhere from 9 hours up to, but not including, 10 hours.

question 7

According to the histogram, do the college students in this study tend to get 7 or more hours of sleep per night on the weekends or fewer than 7 hours of sleep? Explain.

Let’s examine the distribution of a variable from another dataset.

The histogram below displays the distribution of sodium content per serving for 20 different varieties of cereal.[2]

A histogram labeled "Cereal Sodium Content (mg)" on the horizontal axis, which is numbered in increments of 50 from 0 to 100, and labeled "Count" on the vertical axis. For 0-25, the count is 1. For 50-75, the count is 2. For 100-125, the count is 1. For 125-150, the count is 3. For 150-175, the count is 2. For 175-200, the count is 4. For 200-225, the count is 5. For 275-300, the count is 1. For 325-350, the count is 1.
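Before answering questions from a histogram like this one, it can be useful to check two things: that the bar heights add up to the number of observations, and where the gaps (bins with no bar) are. The Python sketch below does both, using the bar heights from the description above and the 25 mg bin width implied by the bin labels. The variable names are our own and are only for illustration.

```python
# Bar heights read from the histogram description above:
# key = left edge of a 25 mg bin, value = number of cereals in that bin.
sodium_counts = {0: 1, 50: 2, 100: 1, 125: 3, 150: 2, 175: 4, 200: 5, 275: 1, 325: 1}
BIN_WIDTH = 25

# The bar heights should add up to the number of cereals in the dataset.
total_cereals = sum(sodium_counts.values())

# Bins between the lowest and highest bar that have no bar at all (count 0).
empty_bins = [
    (left, left + BIN_WIDTH)
    for left in range(min(sodium_counts), max(sodium_counts) + BIN_WIDTH, BIN_WIDTH)
    if left not in sodium_counts
]

print("Total cereals:", total_cereals)
print("Empty bins:", empty_bins)
```

A gap in a histogram is still information: it tells you that no cereal in the dataset has a sodium content in that interval.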

question 8

How many cereals have a sodium content of at least 200 milligrams (mg)?

question 9

According to the histogram, how many cereals have less than 200 mg?

Compare a single variable across groups using dotplots

Now let’s make comparisons of a single variable across two groups. The following dotplots compare the numbers of classes missed between two groups of students: those who abstain from drinking alcohol (“Abstain”) and those who consume large amounts of alcohol each week (“Heavy”). Note that “Moderate” and “Light” drinkers are excluded here.

Two dot plots are shown. The first shows the number of classes missed by students who abstain from drinking alcohol. 16 students missed 0 classes. 10 students missed 1 class. 4 students missed 2 classes. One student missed 4 classes. One student missed 14 classes. One student missed 15 classes. One student missed 20 classes. The second shows the number of classes missed by students who drink heavily. 4 students missed 0 classes. 2 students missed 1 class. 3 students missed 2 classes. 4 students missed 3 classes. 1 student missed 4 classes. 1 student missed 5 classes. 1 student missed 6 classes.
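Questions about a dotplot usually come down to adding up the dots that satisfy some condition. The Python sketch below shows one way to do that for both groups at once, using the counts from the description above. The helper count_where and the example condition are our own illustrations; the condition is deliberately not one of the questions in this activity.

```python
# Counts read from the two dotplot descriptions above:
# key = classes missed, value = number of students.
abstain = {0: 16, 1: 10, 2: 4, 4: 1, 14: 1, 15: 1, 20: 1}
heavy = {0: 4, 1: 2, 2: 3, 3: 4, 4: 1, 5: 1, 6: 1}

def count_where(counts, condition):
    """Add up the dots whose value satisfies the given condition."""
    return sum(n for value, n in counts.items() if condition(value))

# Example condition (not one of the activity's questions): missed at least 3 classes.
def at_least_three(value):
    return value >= 3

for name, group in [("Abstain", abstain), ("Heavy", heavy)]:
    print(name, "students who missed at least 3 classes:",
          count_where(group, at_least_three))
```

You can adapt the condition to match the wording of a question, paying attention to phrases such as "more than" versus "at least".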

question 10

According to this graph, which group is larger: the number of students who abstain from drinking or the number of students who identify as “Heavy” drinkers?

question 11

Out of those students who abstain from drinking, how many have missed more than 5 classes?

question 12

How many students in total didn’t miss a single class?

Compare a single variable across groups using histograms

The following graph compares the distribution of per capita CO2 emissions between two groups: countries in Central and South America and countries in Europe.

Two histograms are shown. The first shows per capita CO2 emissions in metric tons (x-axis) and the number of Central and South American countries that produce those amounts of emissions. 14 countries emitted between 0 and 2 metric tons. 9 countries emitted between 2 and 4 metric tons. 8 countries emitted between 4 and 6 metric tons. 4 countries emitted between 6 and 8 metric tons. 1 country emitted between 8 and 10 metric tons. 2 countries emitted between 10 and 12 metric tons. The second shows per capita CO2 emissions in metric tons (x-axis) and the number of European countries that produce those amounts of emissions. 1 country emitted between 0 and 2 metric tons. 4 countries emitted between 2 and 4 metric tons. 7 countries emitted between 4 and 6 metric tons. 7 countries emitted between 6 and 8 metric tons. 9 countries emitted between 8 and 10 metric tons. 1 country emitted between 10 and 12 metric tons. 2 countries emitted 12 or more metric tons.
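With histograms, you answer "at least" questions by adding the heights of every bar at or to the right of a cutoff, and the cutoff has to sit on a bin edge. The Python sketch below illustrates this with the bar heights from the description above. The function count_at_least is our own helper, and the example cutoff of 10 metric tons is deliberately not one of the questions in this activity.

```python
# Bar heights read from the two histogram descriptions above:
# key = left edge of a 2-metric-ton bin, value = number of countries.
# The last European bar is described as "12 or more", so 12 is only its left edge.
central_south_america = {0: 14, 2: 9, 4: 8, 6: 4, 8: 1, 10: 2}
europe = {0: 1, 2: 4, 4: 7, 6: 7, 8: 9, 10: 1, 12: 2}

def count_at_least(bars, threshold):
    """Count observations in bins whose left edge is at or above threshold.

    The threshold must be a bin edge, because a histogram does not show
    how the observations are spread inside a single bar.
    """
    if threshold % 2 != 0:
        raise ValueError("threshold must fall on a bin edge (a multiple of 2)")
    return sum(n for left, n in bars.items() if left >= threshold)

# Example cutoff (not one of the activity's questions): at least 10 metric tons.
print("Central and South America:", count_at_least(central_south_america, 10))
print("Europe:", count_at_least(europe, 10))
```

A cutoff such as 5 metric tons cannot be answered exactly from these histograms, which is why the sketch rejects it.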

question 13

How many European countries had per capita CO2 emissions of less than 4 metric tons?

question 14

Which group had more countries with per capita CO2 emissions of at least 6 metric tons: Central and South America or Europe?

If you feel comfortable reading, interpreting, and comparing histograms, please move on to the next section and activity.


  1. Onyper, S. V., Thacher, P. V., Gilbert, J. W., & Gradess, S. G. (2012). Class start times, sleep, and academic performance in college: A path analysis. Chronobiology International, 29(3), 318-335.
  2. Agresti, A., Franklin, C. A., & Klingenberg, B. (2021). Statistics: The art and science of learning from data, 5th edition. Pearson. https://www.pearson.com/us/higher-education/program/Agresti-My-Lab-Statistics-with-Pearson-e-Text-Access-Card-for-Statistics-The-Art-and-Science-of-Learning-from-Data-18-Weeks-5th-Edition/PGM2788191.html