Applications of Bar Graphs: What to Know

Learning Goals

After completing this section, you should feel comfortable performing these skills.


A generic contingency (or two-way) table between "Snack of Choice" (Pretzels, Skittles, Cookies, M&Ms, Twizzlers) and "Board Game of Choice" (Monopoly, Clue, Life, Scrabble).     A generic stacked bar graph between "Hours Studied" (0-7 in increments of 1) and "Course" (Math, Science, History, English).     A generic side-by-side bar graph of "Handedness" (Left or right) amongst Faculty and students, and "Frequency (Count)" from 0-100 in increments of 10.

In the upcoming activity, you will need a basic understanding of how contingency tables, stacked bar charts, and side-by-side bar charts are used to describe and analyze data on a single categorical variable for multiple populations or groups. To develop this understanding, let’s begin by recalling how we use a more familiar graphical display (a pie chart) to represent a categorical variable for a single population by analyzing percentages of votes cast for presidential candidates.

Visualizing a Categorical Variable for One Population

recall

Which types of displays are appropriate for a categorical variable?

Core skill:

Example

Recall the techniques you used in What to Know About Displaying Categorical Data: 3A to read and interpret the following chart, which describes how people in America voted in the 2016 presidential election.[1] Take a moment to familiarize yourself with the chart, then answer parts a) through c) below.

A pie chart of How America participated in the election. Data from U.S. Election Project, Dave Wasserman, Census Bureau. The "Ineligible to vote" section is 28.6%, the "Didn't vote" section is 29.9%, the "Voted Trump" section is 19.5%, the "Voted Clinton" section is 19.8%, and the "Voted other" section is 2.2%.

a) According to the chart above, what percentage of people living in the United States did not participate in the 2016 presidential election?


 

 

b) We can see from the chart that more people did not participate than those who did. A large percentage of those were deemed “ineligible to vote.” 

Use the Internet to find out what it means to be “ineligible to vote” in a U.S. presidential election. Select all groups from the list below that can be deemed “ineligible” within the United States.

  1. American adults living in Puerto Rico
  2. American adults living in Guam
  3. American adults who at one time were convicted of felony crimes
  4. Americans under the age of 18
  5. American adults who are deemed mentally incapacitated
  6. Non-citizens and Dreamers (people living in the United States under DACA)
  7. All of the above


 

c) The variable of interest shown in the chart could be defined as “Voter Choice,” with five possible values:

Clinton, Trump, Other, Ineligible to vote, and Chose not to vote.

What is the best description of the population(s) of interest? There is only one correct answer.

    1. Three populations of interest: Republicans, Democrats, and Other
    2. Fifty populations of interest: One for every state that makes up our electoral college
    3. One population of interest: All people living in the United States

We’ve seen that a pie chart is a good visual representation of a categorical variable (Voter Choice) from a single population or group (people living in the United States). But what can we do if we want to compare a categorical variable across multiple groups?

Let’s use the variable from the data above, but instead of grouping all Americans together as a single population of interest, we’ll focus on just the voters in a presidential election.

Displaying a Categorical Variable Across Multiple Populations or Groups

In this example, we’ll explore how to display and interpret changes in a categorical variable of interest (Voter Choice) when comparing multiple populations or groups of interest (Black, White, Latinx, Asian, and Other). We will then convert tables of data called contingency tables (or two-way tables) into stacked bar charts and side-by-side bar charts and make comparisons.

The 2016 presidential race was very different from the one in 2020. In 2016, fewer people turned out to vote,[2] more people were deemed ineligible because of felony convictions ([latex]6[/latex] million in 2016[3] compared to [latex]5.1[/latex] million in 2020),[4] and the election results were much closer. In 2016, Hillary Clinton won the popular vote, and fewer than [latex]80,000[/latex] votes out of [latex]137[/latex] million votes cast decided the outcome in Donald Trump’s favor.[5]

Looking to our future, one question might be “If we increase legitimate voter participation, will one party benefit?” We can better answer this question if we study the voting patterns of different groups within the United States.

Contingency Tables (Two-Way Tables)

CNN used an exit poll to estimate the presidential 2020 voting patterns by race.[6] The following is a table of the results, where the rows describe the different groups of people of interest (White, Black, Latinx, Asian, and Other) and the columns represent the vote choices (Biden, Trump, or Other).

reading a contingency table

[Worked Example Video: a three-instructor video illustrating how to read a contingency table (how to see a single categorical variable measured on different subgroups of a larger population) and how the data in the table are distributed into stacked and side-by-side bar charts]

Presidential 2020 Voting Patterns by Race
Race      Biden (%)            Trump (%)            Other (%)
White     [latex]41[/latex]    [latex]58[/latex]    [latex]1[/latex]
Black     [latex]87[/latex]    [latex]12[/latex]    [latex]1[/latex]
Latinx    [latex]65[/latex]    [latex]32[/latex]    [latex]3[/latex]
Asian     [latex]61[/latex]    [latex]34[/latex]    [latex]5[/latex]
Other     [latex]55[/latex]    [latex]41[/latex]    [latex]4[/latex]

Among Asians, for example, [latex]61[/latex]% voted for Biden, [latex]34[/latex]% voted for Trump, and the remaining [latex]5[/latex]% voted for someone else.

question 1

Because this table displays the results of two categorical variables simultaneously, it is called a two-way table. It is also called a contingency table. The advantage of a contingency table is that you can see each precise percentage of responses (or count of responses). A disadvantage is that the table does not present a strong visual comparison between the groups. Distributing the data from a contingency table into a stacked bar chart or side-by-side bar chart can help us visually compare the groups.
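To make the structure of a two-way table concrete, here is a minimal Python sketch that stores the exit-poll percentages from the table above as a nested dictionary, with one entry per group and one value per vote choice. The names `votes_by_race` and `row_total` are illustrative choices, not part of the original activity.

```python
# Exit-poll percentages from the table above: one row per group,
# one column per vote choice. All names here are illustrative only.
votes_by_race = {
    "White":  {"Biden": 41, "Trump": 58, "Other": 1},
    "Black":  {"Biden": 87, "Trump": 12, "Other": 1},
    "Latinx": {"Biden": 65, "Trump": 32, "Other": 3},
    "Asian":  {"Biden": 61, "Trump": 34, "Other": 5},
    "Other":  {"Biden": 55, "Trump": 41, "Other": 4},
}


def row_total(table, group):
    """Add up every percentage in one row (one group) of the table."""
    return sum(table[group].values())


# Reading one cell: the percentage of Asian voters who chose Biden.
print(votes_by_race["Asian"]["Biden"])  # 61

# Each row accounts for 100% of the responses within that group.
for group in votes_by_race:
    print(group, row_total(votes_by_race, group))
```

Looking up a single cell mirrors reading across a row and down a column of the printed table, and the row totals confirm that the percentages are computed within each group.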

Side-by-side Bar Graphs

Side-by-side bar graphs present a categorical variable for more than one group by drawing a cluster of bars for each group, one bar for each value of the variable. See the interactive example below for a demonstration.

Example

Say a sample of the members of four student organizations at your college were asked whether they preferred chocolate ice cream or vanilla. Here is a contingency table containing a summary of their responses.

Student Organization    Chocolate    Vanilla
A                       23           12
B                       13           15
C                       9            21
D                       17           14

 

The side-by-side bar chart below contains the same data as the two-way table above. Each of the four groups is represented along the horizontal axis with two vertical bars indicating the frequency of its responses, one for chocolate preference and one for vanilla. The key to the right of the chart identifies which bar is which by color.

A graph displays two vertical bars labeled chocolate and vanilla for each of the groups A, B, C, and D on the horizontal axis. The chocolate bar for A rises above 20 and the vanilla bar rises above 10. The chocolate bar for B rises above 10 and the vanilla bar rises to 15. The chocolate bar for C rises to just below 10 and the vanilla bar rises above 20. The chocolate bar for D rises above 15 and the vanilla bar rises to just below 15.

  1. Which organization shows a clear preference for chocolate?
  2. Which organization shows a clear preference for vanilla?
  3. Which display, the table or the chart, is easier for understanding precise counts for each variable? Which gives a strong visual comparison between the groups?
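As a rough illustration of how the table maps onto a side-by-side chart, the following Python sketch draws one bar per organization and response, keeping each organization's bars next to each other. It is a text-only sketch: the bars run horizontally and are built from `#` characters, and the names `ice_cream` and `side_by_side_rows` are assumptions made for this example.

```python
# Counts from the ice cream table above (organization -> response -> count).
ice_cream = {
    "A": {"Chocolate": 23, "Vanilla": 12},
    "B": {"Chocolate": 13, "Vanilla": 15},
    "C": {"Chocolate": 9, "Vanilla": 21},
    "D": {"Chocolate": 17, "Vanilla": 14},
}


def side_by_side_rows(table):
    """Return one text line per bar: group, response, a bar of '#', and the count."""
    rows = []
    for group, responses in table.items():
        for response, count in responses.items():
            # The bar length equals the count, so longer bars mean more responses.
            rows.append(f"{group} {response:<9} {'#' * count} {count}")
    return rows


for line in side_by_side_rows(ice_cream):
    print(line)
```

Because every cell of the table becomes its own bar, the chart carries the same information as the table, only in visual form.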

 

See the video below for a perspective on reading a side-by-side bar graph.

Reading a side-by-side Bar Graph

[Video placeholder: a short demonstration of how to read this graph; suggested clip from 1:38 to 2:10.]

Now let’s turn back to the table of voting patterns we looked at above and compare it to a side-by-side graph containing the same information. 

For Questions 2–5, refer to the standard side-by-side bar chart below, which contains the exact same information about 2020 voting patterns as the two-way table above.

A side-by-side bar chart titled "How America Voted in 2020 (Estimated using a CNN exit poll)." On the right is a legend titled "Vote" showing that blue indicates Biden, red indicates Trump, and yellow indicates Other. The vertical axis is labeled "Percent (%)" with tick marks from 0 to 80 in increments of 20, and the horizontal axis is labeled "Race" and lists White, Black, Latinx, Asian, and Other. For the White group, the blue bar reaches approximately 40%, the red bar reaches almost 60%, and the yellow bar is slightly above zero. For the Black group, the blue bar reaches above 80%, the red bar reaches about two thirds of the way to 20%, and the yellow bar is slightly above zero. For the Latinx group, the blue bar reaches slightly above 60%, the red bar reaches approximately halfway between 20% and 40%, and the yellow bar reaches about one fifth of the way to 20%. For the Asian group, the blue bar reaches approximately 60%, the red bar reaches approximately two thirds of the way between 20% and 40%, and the yellow bar reaches about one third of the way to 20%. For the Other group, the blue bar reaches almost 60%, the red bar reaches approximately 40%, and the yellow bar reaches approximately one fourth of the way to 20%.

The groups of interest (White, Black, Latinx, Asian, and Other) are listed on the horizontal axis, and the percentages associated with each voter choice are on the vertical axis. Note: within each group, the heights of the three bars sum to [latex]100[/latex], representing [latex]100[/latex]% of all responses within that group.

Also, because this side-by-side bar chart represents percentages within groups (as opposed to the numbers of ballots cast within groups), you cannot draw conclusions about counts of votes; you can only draw conclusions about relative proportions or percentages within each group.
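A short Python sketch can show why percentages alone say nothing about counts. The Biden percentages below come from the exit-poll table, but the group sizes are hypothetical numbers invented purely for illustration; the chart and table do not report them. The function name `estimated_count` is likewise an assumption of this sketch.

```python
# Percent voting for Biden within two groups, from the exit-poll table.
biden_percent = {"White": 41, "Black": 87}

# HYPOTHETICAL numbers of voters in each group, invented for illustration.
# The chart does not report group sizes, which is why counts cannot be read from it.
hypothetical_voters = {"White": 1000, "Black": 200}


def estimated_count(percent, group_size):
    """Convert a within-group percentage into a count of people."""
    return percent * group_size / 100


for group in biden_percent:
    count = estimated_count(biden_percent[group], hypothetical_voters[group])
    print(group, count)
# White 410.0
# Black 174.0
```

With these made-up group sizes, the group with the lower percentage contributes more votes, so a taller bar on a percentage chart does not by itself mean more ballots.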

question 2

question 3

question 4

question 5

Stacked Bar Graphs


Stacked bar graphs display the same type of data as a contingency table (two-way table) and a side-by-side bar graph, but they offer a different visual comparison between the groups: each group gets a single bar, divided into one segment per response. See the interactive example below for a demonstration.

Example

Recall the contingency table summarizing the responses of a sample of members of four student organizations at your college who were asked whether they preferred chocolate ice cream or vanilla.

Student Organization    Chocolate    Vanilla
A                       23           12
B                       13           15
C                       9            21
D                       17           14

Four bars, labeled A, B, C, and D, are arranged along a horizontal axis. Each bar contains two shades, one for chocolate and one for vanilla. The vertical axis is labeled "Count." The bar labeled A contains chocolate shading from the bottom to a point above 20, then vanilla shading to a point above 30. The bar labeled B contains chocolate shading to a point above 10 and vanilla shading from that point to just beneath 30. The bar labeled C contains chocolate shading to a point just below 10 and vanilla shading from that point to just above 30. The bar labeled D contains chocolate shading to a point at approximately 15 and vanilla shading from that point to just above 30.

  1. True or false: the stacked bar chart shows that more students in organization C preferred chocolate than students in organization A.
  2. Which organization does the graph indicate has the greatest preference for chocolate ice cream?
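The stacking itself is simple arithmetic: each segment starts where the previous one ends. The Python sketch below computes where each segment of a stacked bar begins and ends for the ice cream counts. The names `ice_cream` and `stacked_segments` are illustrative assumptions, not part of the original activity.

```python
# Counts from the ice cream table (organization -> response -> count).
ice_cream = {
    "A": {"Chocolate": 23, "Vanilla": 12},
    "B": {"Chocolate": 13, "Vanilla": 15},
    "C": {"Chocolate": 9, "Vanilla": 21},
    "D": {"Chocolate": 17, "Vanilla": 14},
}


def stacked_segments(responses):
    """Return (response, bottom, top) for each segment of one stacked bar."""
    segments = []
    bottom = 0
    for response, count in responses.items():
        top = bottom + count
        segments.append((response, bottom, top))
        bottom = top  # the next segment starts where this one ends
    return segments


for organization, responses in ice_cream.items():
    print(organization, stacked_segments(responses))
```

The top of the last segment is the total number of responses in that organization, which is why the full height of a stacked bar of counts shows the group size.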

 

Reading a stacked Bar Graph

[Video placeholder: a short demonstration of how to read this chart; suggested clip from 0:18 to 2:14.]

For Questions 6 and 7, consider the following standard stacked bar chart showing the exact same information as the previous table and side-by-side bar chart.

A stacked bar chart of How America Voted in 2020 estimated using a CNN exit poll. The vertical axis is labeled "Percent (%)" and the horizontal axis is labeled "Race." There is a legend on the right side labeled "Vote" showing that yellow indicates "Other," red indicates "Trump," and blue indicates "Biden." For the White group, the blue section of the bar extends approximately to 40%, the red section extends from there nearly to 100%, and the yellow section extends the rest of the way to 100%. For the Black group, the blue section extends to approximately one third of the way between 80% and 100%, the red section extends from there nearly to 100%, and the yellow section extends the rest of the way to 100%. For the Latinx group, the blue section extends to approximately one quarter of the way between 60% and 80%, the red section extends from there to approximately four fifths of the way between 80% and 100%, and the yellow section extends the rest of the way to 100%. For the Asian group, the blue section extends to approximately 60%, the red section extends from there to about two thirds of the way between 80% and 100%, and the yellow section extends the rest of the way to 100%. For the Other group, the blue section extends to approximately two thirds of the way between 40% and 60%, the red section extends from there to approximately three quarters of the way between 80% and 100%, and the yellow section extends the rest of the way to 100%.

In this stacked bar chart, each bar represents the responses of one group. The height of each color within that bar represents the percentage of a particular response, and the combination of all colors represents the total ([latex]100[/latex]%) of all responses within that group. As with the side-by-side bar chart above, percentage is plotted along the vertical axis, so you cannot draw conclusions or make comparisons regarding the absolute counts of responses within or between groups.

Note: A single bar in a stacked bar chart is very similar to a pie chart, but it uses rectangular regions rather than pie slices to represent each category.
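The stacked chart above plots percentages, so every bar reaches 100%. When data arrive as counts, like the ice cream responses earlier in this section, each group's counts can be rescaled to percentages of that group's total before stacking. The following Python sketch shows one way to do this; the function name `to_percentages` is an assumption made for this example.

```python
def to_percentages(responses):
    """Rescale one group's counts so that they sum to 100 (percent of the group)."""
    total = sum(responses.values())
    return {response: 100 * count / total for response, count in responses.items()}


# Counts for organization A from the ice cream table: 23 chocolate, 12 vanilla.
organization_a = {"Chocolate": 23, "Vanilla": 12}

percentages = to_percentages(organization_a)
for response, percent in percentages.items():
    print(response, round(percent, 1))
# Chocolate 65.7
# Vanilla 34.3
```

After rescaling, the segments of each bar play the same role as the slices of a pie chart: they show how the whole group divides among the responses, while the group's size is no longer visible.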

question 6

question 7

When to use Side-by-Side vs. Stacked Bar Graphs

Note the difference between a side-by-side bar graph and a stacked bar graph displaying the same information. Each is useful for displaying a categorical variable across multiple groups; they differ only in the perspective on the information you wish to present. A side-by-side bar graph is similar to an ordinary bar graph. If you feel a bar graph would best display your data, but you don’t want to use a separate bar graph for each group, use a side-by-side bar graph to combine the two-way data into a single graph. If you feel a pie chart would best display your data, but you don’t want to use a separate pie chart for each group, use a stacked bar graph to combine all of the groups into one graph.

Example

A sample of members from four student organizations were asked whether they prefer chocolate or vanilla ice cream.

Their responses are shown below in both a side-by-side bar chart and a stacked bar chart.

 

A graph displays two vertical bars labeled chocolate and vanilla for each of the groups A, B, C, and D on the horizontal axis. The chocolate bar for A rises above 20 and the vanilla bar rises above 10. The chocolate bar for B rises above 10 and the vanilla bar rises to 15. The chocolate bar for C rises to just below 10 and the vanilla bar rises above 20. The chocolate bar for D rises above 15 and the vanilla bar rises to just below 15.

Four bars, labeled A, B, C, and D, are arranged along a horizontal axis. Each bar contains two shades, one for chocolate and one for vanilla. The vertical axis is labeled "Count." The bar labeled A contains chocolate shading from the bottom to a point above 20, then vanilla shading to a point above 30. The bar labeled B contains chocolate shading to a point above 10 and vanilla shading from that point to just beneath 30. The bar labeled C contains chocolate shading to a point just below 10 and vanilla shading from that point to just above 30. The bar labeled D contains chocolate shading to a point at approximately 15 and vanilla shading from that point to just above 30.

  1. Which type of graph is more like a set of pie charts?
  2. Which type of graph allows you to represent a collection of bar graphs all in the same display?

stacked versus side-by-side bar chart

[Perspective video: a three-instructor video showing how to decide which kind of display to use in which situation (advantages and disadvantages): stacked vs. side-by-side bar chart.]

question 8

question 9

Summary

In this section, you’ve seen representations of voter patterns by race in the 2020 presidential election. In the following Forming Connections activity, we’ll explore the possibility of making predictions about future election outcomes by asking a research question about racial composition in the United States. Let’s summarize all the skills and tasks you’ve applied so far before you dive into the next activity.

  • In parts a) – c) of the opening example, you read and interpreted information from a pie chart.
  • In Question 1, you read and interpreted information from a two-way (contingency) table.
  • In Questions 2 – 5, you read and interpreted a side-by-side bar chart.
  • In Questions 6 – 7, you read and interpreted a stacked bar chart.
  • In Questions 8 – 9, you explained the differences between side-by-side charts and stacked bar charts.

Pie charts are good tools for visualizing a single categorical variable for a single population or group. When we want to display and interpret changes in a categorical variable of interest while comparing multiple populations or groups, we can organize the data into a contingency table (two-way table), which we can then convert into side-by-side bar charts or stacked bar charts. These kinds of charts provide a stronger visual comparison between the groups than the two-way table does.

If you feel comfortable with these ideas, it’s time to move on to Forming Connections in the next activity!


  1. Bump, P. (2016, November 16). A lot of nonvoters are mad at the election results. If only there were something they could have done. The Washington Post. https://www.washingtonpost.com/news/the-fix/wp/2016/11/16/a-lot-of-non-voters-are-mad-at-the-election-results-if-only-there-was-something-they-could-have-done/
  2. Schaul, K., Rabinowitz, K., & Mellnik, T. (2020, December 28). 2020 turnout is the highest in over a century. The Washington Post. https://www.washingtonpost.com/graphics/2020/elections/voter-turnout/
  3. Uggen, C., Larson, R., & Shannon, S. (2016, October 16). 6 million lost voters: State-level estimates of felony disenfranchisement, 2016. The Sentencing Project. https://www.sentencingproject.org/publications/6-million-lost-voters-state-level-estimates-felony-disenfranchisement-2016/
  4. Maxouris, C. (2020, October 15). More than 5 million people with felony convictions can’t vote in this year’s election, advocacy group finds. CNN. https://www.cnn.com/2020/10/15/us/felony-convictions-voting-sentencing-project-study/index.html
  5. Why voting matters: Supreme Court edition. (2018, June 28). Axios. https://www.axios.com/hillary-clinton-2016-election-votes-supreme-court-liberal-justice-1b4bc4fc-9fad-44b4-ab54-9ef86aa9c1f1.html
  6. Exit polls. (2020). CNN Politics. https://www.cnn.com/election/2020/exit-polls/president/national-results