{"id":46,"date":"2022-05-20T16:59:05","date_gmt":"2022-05-20T16:59:05","guid":{"rendered":"https:\/\/courses.lumenlearning.com\/alphamodule\/chapter\/five-number-summary-in-box-plots-and-datasets-what-to-know\/"},"modified":"2022-07-11T20:46:11","modified_gmt":"2022-07-11T20:46:11","slug":"five-number-summary-in-box-plots-and-datasets-what-to-know","status":"publish","type":"chapter","link":"https:\/\/courses.lumenlearning.com\/alphamodule\/chapter\/five-number-summary-in-box-plots-and-datasets-what-to-know\/","title":{"raw":"Five Number Summary in Boxplots and Data Sets: Learn It 1","rendered":"Five Number Summary in Boxplots and Data Sets: Learn It 1"},"content":{"raw":"<div class=\"textbox learning-objectives\">\r\n<h3>Learning Goals<\/h3>\r\nAfter completing this section, you should feel comfortable performing these skills.\r\n<ul>\r\n \t<li><a href=\"#5NumberSummary\">Define the terms: first quartile, third quartile, interquartile range, and five-number summary.<\/a><\/li>\r\n \t<li><a href=\"#featboxplot\">Identify the features of a boxplot<\/a><\/li>\r\n \t<li><a href=\"#outlier\">Calculate interquartile range for a data set.<\/a><\/li>\r\n \t<li><a href=\"#outlier\">Calculate the range of observations characterized as upper outliers or lower outliers.<\/a><\/li>\r\n \t<li><a href=\"#interpfeat_bxplt\">Interpret the features of a boxplot.<\/a><\/li>\r\n \t<li><a href=\"#identshape_bxplt\">Use a boxplot of a data set to identify whether the shape of its distribution is left-skewed, symmetric, or right-skewed.<\/a><\/li>\r\n<\/ul>\r\nClick on a skill above to jump to its location in this section.\r\n\r\n<\/div>\r\nBoxplots are helpful for visualizing the distribution of a quantitative variable. A boxplot clearly shows the median of the data and provides a summary at a glance of the bulk of the data and the presence of outliers. 
In the next activity, you will need to be able to identify and interpret the features of a boxplot, identify outliers in a data set, and relate a boxplot of a quantitative variable to its distribution. In this section, you'll learn to identify the key pieces of information needed to accomplish these tasks.\r\n\r\n<img class=\"aligncenter size-full wp-image-932\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5772\/2022\/02\/08182823\/Screen-Shot-2022-02-11-at-1.08.00-PM.png\" alt=\"an image of a generic boxplot labeled with outliers, minimum, Q1, median, Q3, maximum, and interquartile range (IQR)\" width=\"586\" height=\"330\">\r\n<h2>Boxplots<\/h2>\r\nTo interpret boxplots, you will need to identify the minimum, maximum, and median of a quantitative variable. You've done this in previous activities. See the Recall box below if you need a refresher. As its measure of center, a boxplot shows the median of the data set, not the mean.\r\n<div class=\"textbox examples\">\r\n<h3>recall<\/h3>\r\nCore skill:\r\n[reveal-answer q=\"952167\"]Identify the minimum value, maximum value, and median of a quantitative variable.[\/reveal-answer]\r\n[hidden-answer a=\"952167\"]\r\n\r\nPlace the observed values in order from least to greatest to identify the minimum and maximum values. 
The median will be the middle number in the list (or the mean of the middle two numbers in a list with an even number of values).\r\n\r\n[\/hidden-answer]\r\n\r\n<\/div>\r\nYou will also need to know the following definitions:\r\n<ul>\r\n \t<li style=\"font-weight: 400;\" aria-level=\"1\">the <strong>first quartile<\/strong> of a quantitative variable (sometimes denoted <strong>Q1<\/strong>) is the value below which one quarter of the data lie, and the first quartile is also equal to the&nbsp;[latex]25[\/latex]<sup>th<\/sup> percentile;<\/li>\r\n \t<li style=\"font-weight: 400;\" aria-level=\"1\">the <strong>third quartile<\/strong> of a quantitative variable (sometimes denoted <strong>Q3<\/strong>) is the value below which three quarters of the data lie, and the third quartile is also equal to the&nbsp;[latex]75[\/latex]<sup>th<\/sup> percentile; and<\/li>\r\n \t<li style=\"font-weight: 400;\" aria-level=\"1\">the <strong>interquartile range<\/strong> (sometimes denoted <strong>IQR<\/strong>) of a quantitative variable is the quantity Q3 \u2013 Q1.<\/li>\r\n<\/ul>\r\nTogether, the minimum, first quartile, median, third quartile, and maximum form the <strong>five-number summary<\/strong> of the variable.\r\n<div class=\"textbox tryit\">\r\n<h3>first and third quartiles<\/h3>\r\n<span style=\"background-color: #ffff00;\"><span style=\"background-color: #99cc00;\">[Perspective video -- a 3-instructor video showing how to understand Q1 and Q3 as percentiles and\/or quarters of data. 
See below for the idea:]<\/span><\/span>\r\n<ul>\r\n \t<li><span style=\"background-color: #99cc00;\">the location of the Q1\/25th percentile and Q3\/75th percentile on a number line along with other percentile locations such as 10th and 98th along with three ways to think about it: <\/span>\r\n<ul>\r\n \t<li><span style=\"background-color: #99cc00;\">1)&nbsp; \"if a student scores in the 10th percentile of a test like the SAT, they have scored higher than only 10% of all the test takers but if they score in the 98th percentile, then their score is higher than 98% of all the test takers.\" and <\/span><\/li>\r\n \t<li><span style=\"background-color: #99cc00;\">2) \"percentiles divide data into two parts -- the lower part (she scored higher than 98% of the test takers) and the higher part (2% of the test takers scored higher than she did)\" and 3) \"the 25th percentile (first quartile) splits the data into the lower 25% and the 75% above that; the 50th percentile (2nd quartile) splits the data in half (marked by the median); the 75th percentile (3rd quartile) splits the data into the lower 75% and the 25% of the data above that.\" <\/span><\/li>\r\n \t<li><span style=\"background-color: #99cc00;\">3) Subtracting the value of Q1 from the value of Q3 gives the IQR (the distance between the 25th percentile and the 75th percentile)<\/span><\/li>\r\n<\/ul>\r\n<\/li>\r\n \t<li><span style=\"background-color: #99cc00;\">(critics may point out that students will have seen all of this before, which is true but doesn't acknowledge that students also need a brief refresher at this point.)<\/span><\/li>\r\n<\/ul>\r\n<\/div>\r\n<h3 id=\"featboxplot\">Features of a Boxplot<\/h3>\r\nThe features of a boxplot include the five-number summary (minimum, Q1, median, Q3, and maximum) together with the interquartile range (IQR) and any outliers. 
See the interactive example below for a demonstration of how to find and interpret the five-number summary, calculate the IQR, and discuss the presence of outliers.\r\n<div class=\"textbox exercises\">\r\n<h3>interactive example<\/h3>\r\nYou may recognize the descriptive statistics below as a description of the Sleep Study you explored in <a href=\"https:\/\/courses.lumenlearning.com\/exemplarstatistics\/chapter\/calculating-mean-and-median-of-a-data set-what-to-know\/\"><em>Calculating the Mean and Median of a Data Set: What to Know<\/em><\/a>.\r\n\r\nYou can use the quantitative data analysis tool at&nbsp;<a href=\"https:\/\/dcmathpathways.shinyapps.io\/EDA_quantitative\/\">https:\/\/dcmathpathways.shinyapps.io\/EDA_quantitative\/<\/a> to display the descriptive statistics and boxplot by choosing the data set Sleep Study: Average Sleep and Type of Plot: Boxplot in the tool. But these are also reproduced for you below.\r\n\r\nRecall that this data set contains the average number of hours of sleep per night for each of the&nbsp;253&nbsp;students in the sleep study.\r\n<h2>Sleep Study: Average Sleep<\/h2>\r\n<img class=\"aligncenter size-full wp-image-1046\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5772\/2022\/02\/12220201\/Boxplot_AvgSleep_DescriptStat.jpg\" alt=\"Descriptive Statistics: Sample Size 253, Mean 7.97, Standard Deviation 0.965, Minimum 4.95, Q1 7.42, Median 8, Q3 8.59, Maximum 10.6 and IQR 1.17\" width=\"700\" height=\"127\">\r\n\r\n&nbsp;\r\n\r\n<img class=\"aligncenter size-full wp-image-1045\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5772\/2022\/02\/12215659\/Boxplot_AvgSleep.png\" alt=\"A boxplot with 2 outliers at 5 and approximately 5.75 on the left, and two above 10 on the right. The whiskers extend from the box ranging from approximately 5.75 to 10. The box extends from 7.42 to 8.59 and shows the median at 8. 
The horizontal axis is labeled Average Sleep (Hours)\" width=\"700\" height=\"160\">\r\n\r\n<em>Note that the boxplot produced here is presented along a horizontal axis, from left to right. It is also common to see boxplots displayed along a vertical axis, from bottom to top, least to greatest. In fact, the graph you'll use to answer the questions later in the text will be displayed vertically.<\/em>\r\n\r\nUse the descriptive statistics and boxplot given here to answer the following for the Sleep Study: Average Sleep data set.\r\n<ol>\r\n \t<li style=\"list-style-type: none;\">\r\n<ol>\r\n \t<li style=\"list-style-type: none;\">\r\n<ol>\r\n \t<li>Locate the Minimum, First Quartile (Q1), Median, Third Quartile (Q3), and Maximum data values in the list of Descriptive Statistics presented above and identify them on the graph.<\/li>\r\n \t<li>The plot indicates that about half the students reported getting fewer than _______ hours of sleep per night and half got more than that.<\/li>\r\n \t<li>About a quarter of the students got no more than _______ hours of sleep per night.<\/li>\r\n \t<li>About three-quarters of the students reported sleeping up to _____ hours per night.<\/li>\r\n \t<li>About __________ of the students reported sleeping more than 8.59 hours per night.<\/li>\r\n \t<li>What is the interquartile range of this data set?<\/li>\r\n \t<li>The range of numbers considered upper and lower outliers can be found by calculating [latex]1.5\\times\\text{IQR}[\/latex] and then locating the values that lie more than that distance below Q1 or above Q3.\r\n<ul>\r\n \t<li style=\"list-style-type: none;\">\r\n<ul>\r\n \t<li>Upper outliers are the observations greater than&nbsp;[latex]\\text{Q3}+1.5\\times\\left(\\text{IQR}\\right)[\/latex].<\/li>\r\n \t<li>Lower outliers are the observations less than [latex]\\text{Q1}-1.5\\times\\left(\\text{IQR}\\right)[\/latex].<\/li>\r\n<\/ul>\r\n<\/li>\r\n<\/ul>\r\nUse these formulas to identify the outliers in the data set. 
That is, more than _______ hours of sleep per night would be considered an upper outlier, and fewer than ________ hours of sleep per night would be considered a lower outlier.<\/li>\r\n<\/ol>\r\n[reveal-answer q=\"413831\"]Show Answer[\/reveal-answer]\r\n[hidden-answer a=\"413831\"]\r\n<ol>\r\n \t<li>The five-number summary is indicated in the descriptive statistics and labeled on the boxplot below.<img class=\"aligncenter wp-image-1048 size-full\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5772\/2022\/02\/12224143\/Boxplot_AvgSleep_DescriptStat_labeld1.jpg\" alt=\"Descriptive statistics with the five-number summary values circled.\" width=\"700\" height=\"120\"><img class=\"aligncenter size-full wp-image-1049\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5772\/2022\/02\/12224310\/Boxplot_AvgSleep_labeled.jpg\" alt=\"The boxplot labeled with a, b, c, d, e to correspond with the five-number summary min, Q1, median, Q3, and max.\" width=\"700\" height=\"176\"><\/li>\r\n \t<li>8; the median splits the data in half. About half of the reported sleep hours lie below 8 and about half lie above 8.<\/li>\r\n \t<li>7.42; Q1 is the value below which about 25% of the data lie. About a quarter of the observations were below 7.42 hours.<\/li>\r\n \t<li>8.59; Q3 is the value below which about 75% of the data lie. About three-quarters of the students reported sleeping up to 8.59 hours per night.<\/li>\r\n \t<li>25%; since about 75% of the data lie below Q3, the remaining 25% lie above it. About a quarter of the students reported sleeping more than 8.59 hours per night.<\/li>\r\n \t<li>IQR = 1.17. 
This is given in the descriptive statistics, but can be calculated as Q3 - Q1, or 8.59 - 7.42 = 1.17.<\/li>\r\n \t<li>Since [latex]1.5\\times\\left(\\text{IQR}\\right) = (1.5)(1.17) = 1.755[\/latex], then\r\n<ul>\r\n \t<li>Upper outliers: More than [latex]\\text{Q3} + 1.755 = 8.59 + 1.755 = 10.345[\/latex].<\/li>\r\n \t<li>Lower outliers: Less than [latex]\\text{Q1} - 1.755 = 7.42 - 1.755 = 5.665[\/latex].<\/li>\r\n \t<li>More than [latex]10.345[\/latex] hours per night would be considered an upper outlier.<\/li>\r\n \t<li>Fewer than [latex]5.665[\/latex] hours per night would be considered a lower outlier.[\/hidden-answer]<\/li>\r\n<\/ul>\r\n<\/li>\r\n<\/ol>\r\n<\/li>\r\n<\/ol>\r\n<\/li>\r\n<\/ol>\r\n<\/div>\r\nNow it's your turn to practice calculating and interpreting the features of a boxplot using a real data set.\r\n<h3>Five-Number Summary<\/h3>\r\nAs we explore the features of boxplots, we will work with part of a data set that reports information about whether drivers involved in a fatal crash were impaired by alcohol.[footnote]Chalabi, M. (2014, October 24). <em>Dear Mona, which state has the worst drivers?<\/em> FiveThirtyEight. 
https:\/\/fivethirtyeight.com\/features\/which-state-has-the-worst-drivers\/[\/footnote] The data set contains&nbsp;[latex]51[\/latex] entries corresponding to all&nbsp;[latex]50[\/latex] states, as well as Washington, DC.\r\n\r\nThe following table gives the five-number summary for the percentage of drivers involved in fatal collisions who were alcohol-impaired in all&nbsp;[latex]50[\/latex] states and Washington, DC.\r\n<table class=\"aligncenter\" style=\"border-collapse: collapse; width: 63.8889%; height: 68px;\" border=\"1\">\r\n<tbody>\r\n<tr>\r\n<td style=\"width: 155.891px; text-align: center; vertical-align: middle;\"><strong>Minimum<\/strong><\/td>\r\n<td style=\"width: 155.891px; text-align: center; vertical-align: middle;\"><strong>First Quartile<\/strong><\/td>\r\n<td style=\"width: 155.891px; text-align: center; vertical-align: middle;\"><strong>Median<\/strong><\/td>\r\n<td style=\"width: 155.891px; text-align: center; vertical-align: middle;\"><strong>Third Quartile<\/strong><\/td>\r\n<td style=\"width: 155.938px; text-align: center; vertical-align: middle;\"><strong>Maximum<\/strong><\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"width: 155.891px; text-align: center; vertical-align: middle;\">[latex]16[\/latex]<\/td>\r\n<td style=\"width: 155.891px; text-align: center; vertical-align: middle;\">[latex]28[\/latex]<\/td>\r\n<td style=\"width: 155.891px; text-align: center; vertical-align: middle;\">[latex]30[\/latex]<\/td>\r\n<td style=\"width: 155.891px; text-align: center; vertical-align: middle;\">[latex]33[\/latex]<\/td>\r\n<td style=\"width: 155.938px; text-align: center; vertical-align: middle;\">[latex]44[\/latex]<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\nOne of the ways to visualize the data using the five-number summary is by creating a <strong>boxplot<\/strong>.&nbsp;For questions 1 - 4, refer to the following boxplot, which depicts data about the percentage of drivers involved in fatal collisions who were alcohol-impaired in 
all&nbsp;[latex]50[\/latex] states and Washington, DC. The letters A - G are superimposed on the boxplot to label different features of the plot.\r\n\r\n<strong><img class=\"wp-image-1029 aligncenter\" style=\"font-size: 1rem; text-align: initial;\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2022\/01\/11223308\/Picture55-267x300.png\" alt=\"A vertical boxplot titled &quot;Percentage of drivers involved in fatal collisions who were alcohol-impaired.&quot; The vertical axis is numbered by increments of 5 from 15 to 50. On the graph, there are points at 16, 41, 42, and 44. The point at 44 is labeled &quot;A.&quot; The high point of the box plot is at 38 and labeled &quot;B,&quot; while the low point is at 23 and labeled &quot;F.&quot; The high end of the box is at 33 and labeled &quot;C&quot; while the low end is at 28 and labeled &quot;E.&quot; The middle line is at 30 and labeled &quot;D.&quot;\" width=\"454\" height=\"510\"><\/strong>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 1<\/h3>\r\n[ohm_question hide_question_numbers=1]241141[\/ohm_question]\r\n\r\n[reveal-answer q=\"102884\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"102884\"]Look back at the table showing the five-number summary, and enter the letter that corresponds to the indicated term.[\/hidden-answer]\r\n\r\n<\/div>\r\nFor questions 2 - 4, complete each sentence using information from the boxplot above.\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 2<\/h3>\r\n[ohm_question hide_question_numbers=1]241142[\/ohm_question]\r\n\r\n[reveal-answer q=\"519981\"]Hint[\/reveal-answer]\r\n\r\n[hidden-answer a=\"519981\"]This question involves about half of the states, so look for the median.[\/hidden-answer]\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 3<\/h3>\r\n[ohm_question hide_question_numbers=1]241143[\/ohm_question]\r\n\r\n[reveal-answer q=\"16620\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"16620\"]Which 
number in the five-number summary is related to a quarter of the states?[\/hidden-answer]\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 4<\/h3>\r\n[ohm_question hide_question_numbers=1]241144[\/ohm_question]\r\n\r\n[reveal-answer q=\"735025\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"735025\"]What do <em>you<\/em> think?[\/hidden-answer]\r\n\r\n<\/div>\r\n<h3 id=\"outlier\">Interquartile Range and Outliers<\/h3>\r\nNow, let\u2019s define the idea of an outlier more precisely. Previously, we saw that an outlier is a value that is unusual, given the other values in a data set. But what does \u201cunusual\u201d mean? To be more precise, for data with only one variable, let's define the following:\r\n<ul>\r\n \t<li style=\"font-weight: 400;\" aria-level=\"1\"><strong>upper outlier<\/strong> as an observation that is greater than Q3 +&nbsp;[latex]1.5[\/latex] \u00d7 (IQR); and<\/li>\r\n \t<li style=\"font-weight: 400;\" aria-level=\"1\"><strong>lower outlier<\/strong> as an observation that is less than Q1 -&nbsp;[latex]1.5[\/latex] \u00d7 (IQR).<\/li>\r\n<\/ul>\r\nUse these definitions with the boxplot above question 1 to complete the sentences in questions 5 and 6.\r\n<div class=\"textbox tryit\">\r\n<h3>identifying features of a boxplot<\/h3>\r\n<span style=\"background-color: #99cc00;\">[Worked example - a 3-instructor video showing a worked example similar to Questions 5 - 7]<\/span>\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 5<\/h3>\r\n[ohm_question hide_question_numbers=1]241146[\/ohm_question]\r\n\r\n[reveal-answer q=\"587526\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"587526\"]IQR = Q3 \u2013 Q1; use the table above Question 1 to evaluate.[\/hidden-answer]\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 6<\/h3>\r\n[ohm_question hide_question_numbers=1]241147[\/ohm_question]\r\n\r\n[reveal-answer q=\"206360\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"206360\"]Use the 
five-number summary and your answer to Question 5 to evaluate. What is&nbsp;[latex]1.5[\/latex] \u00d7 (IQR)?[\/hidden-answer]\r\n\r\n<\/div>\r\nLook again at the boxplot above Question 1. We saw previously how some of the boxplot\u2019s features relate to the five-number summary, but when outliers are present, the boxplot is modified as shown below.\r\n\r\n<strong><img class=\"wp-image-1030 aligncenter\" style=\"font-size: 1rem; text-align: initial;\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2022\/01\/11223314\/Picture56-252x300.png\" alt=\"A vertical boxplot titled &quot;Percentage of drivers involved in fatal collisions who were alcohol-impaired.&quot; The vertical axis is numbered by increments of 5 from 15 to 50. On the graph, there is a point at 16 labeled &quot;D.&quot; There are also points at 41, 42, and 44 collectively labeled &quot;A.&quot; The high point of the box plot is at 38 and labeled &quot;B,&quot; while the low point is at 23 and labeled &quot;C.&quot; The high end of the box is at 33 while the low end is at 28. The middle line is at 30.\" width=\"463\" height=\"551\"><\/strong>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 7<\/h3>\r\n<p style=\"text-align: left;\"><span style=\"font-size: 1rem; text-align: initial;\">[ohm_question hide_question_numbers=1]241149[\/ohm_question]<\/span><\/p>\r\n<p style=\"text-align: left;\"><span style=\"font-size: 1rem; text-align: initial;\">[reveal-answer q=\"496986\"]Hint[\/reveal-answer]<\/span><\/p>\r\n<p style=\"text-align: left;\">[hidden-answer a=\"496986\"]What do <em>you<\/em> think?[\/hidden-answer]<\/p>\r\n\r\n<\/div>\r\nThe following table lists each state in the data set, along with the corresponding percentage of drivers involved in fatal crashes who were impaired by alcohol, in order from lowest percentage to highest percentage. 
Use this table and the definition of <em>outlier<\/em> to answer questions 8 - 9.\r\n<div align=\"center\">\r\n<table>\r\n<tbody>\r\n<tr>\r\n<td style=\"text-align: center;\" colspan=\"4\"><strong>Drivers Involved in Fatal Crashes by State<\/strong><\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center;\"><strong>State<\/strong><\/td>\r\n<td style=\"text-align: center;\"><strong>Percentage of Drivers Involved in Fatal Crashes Who Were Alcohol-Impaired<\/strong><\/td>\r\n<td style=\"text-align: center;\"><strong>State<\/strong><\/td>\r\n<td style=\"text-align: center;\"><strong>Percentage of Drivers Involved in Fatal Crashes Who Were Alcohol-Impaired<\/strong><\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center;\"><strong>Utah<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]16[\/latex]<\/td>\r\n<td style=\"text-align: center;\"><strong>Maine<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]30[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center;\"><strong>Kentucky<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]23[\/latex]<\/td>\r\n<td style=\"text-align: center;\"><strong>New Hampshire<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]30[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center;\"><strong>Kansas<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]24[\/latex]<\/td>\r\n<td style=\"text-align: center;\"><strong>Vermont<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]30[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center;\"><strong>Alaska<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]25[\/latex]<\/td>\r\n<td style=\"text-align: center;\"><strong>Mississippi<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]31[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center;\"><strong>Georgia<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]25[\/latex]<\/td>\r\n<td style=\"text-align: center;\"><strong>North 
Carolina<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]31[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center;\"><strong>Iowa<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]25[\/latex]<\/td>\r\n<td style=\"text-align: center;\"><strong>Pennsylvania<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]31[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center;\"><strong>Arkansas<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]26[\/latex]<\/td>\r\n<td style=\"text-align: center;\"><strong>Maryland<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]32[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center;\"><strong>Oregon<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]26[\/latex]<\/td>\r\n<td style=\"text-align: center;\"><strong>Nevada<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]32[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center;\"><strong>District of Columbia<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]27[\/latex]<\/td>\r\n<td style=\"text-align: center;\"><strong>Wyoming<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]32[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center;\"><strong>New Mexico<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]27[\/latex]<\/td>\r\n<td style=\"text-align: center;\"><strong>Louisiana<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]33[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center;\"><strong>Virginia<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]27[\/latex]<\/td>\r\n<td style=\"text-align: center;\"><strong>South Dakota<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]33[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center;\"><strong>Arizona<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]28[\/latex]<\/td>\r\n<td style=\"text-align: 
center;\"><strong>Washington<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]33[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center;\"><strong>California<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]28[\/latex]<\/td>\r\n<td style=\"text-align: center;\"><strong>Wisconsin<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]33[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center;\"><strong>Colorado<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]28[\/latex]<\/td>\r\n<td style=\"text-align: center;\"><strong>Illinois<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]34[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center;\"><strong>Michigan<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]28[\/latex]<\/td>\r\n<td style=\"text-align: center;\"><strong>Missouri<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]34[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center;\"><strong>New Jersey<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]28[\/latex]<\/td>\r\n<td style=\"text-align: center;\"><strong>Ohio<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]34[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center;\"><strong>West Virginia<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]28[\/latex]<\/td>\r\n<td style=\"text-align: center;\"><strong>Massachusetts<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]35[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center;\"><strong>Florida<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]29[\/latex]<\/td>\r\n<td style=\"text-align: center;\"><strong>Nebraska<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]35[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center;\"><strong>Idaho<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]29[\/latex]<\/td>\r\n<td style=\"text-align: 
center;\"><strong>Connecticut<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]36[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center;\"><strong>Indiana<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]29[\/latex]<\/td>\r\n<td style=\"text-align: center;\"><strong>Rhode Island<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]38[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center;\"><strong>Minnesota<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]29[\/latex]<\/td>\r\n<td style=\"text-align: center;\"><strong>Texas<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]38[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center;\"><strong>New York<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]29[\/latex]<\/td>\r\n<td style=\"text-align: center;\"><strong>Hawaii<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]41[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center;\"><strong>Oklahoma<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]29[\/latex]<\/td>\r\n<td style=\"text-align: center;\"><strong>South Carolina<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]41[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center;\"><strong>Tennessee<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]29[\/latex]<\/td>\r\n<td style=\"text-align: center;\"><strong>North Dakota<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]42[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center;\"><strong>Alabama<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]30[\/latex]<\/td>\r\n<td style=\"text-align: center;\"><strong>Montana<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]44[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center;\"><strong>Delaware<\/strong><\/td>\r\n<td style=\"text-align: center;\">[latex]30[\/latex]<\/td>\r\n<td style=\"text-align: 
center;\"><\/td>\r\n<td style=\"text-align: center;\"><\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 8<\/h3>\r\n[ohm_question hide_question_numbers=1]241150[\/ohm_question]\r\n\r\n[reveal-answer q=\"609285\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"609285\"]Use the IQR and the definition of <em>outlier<\/em>.[\/hidden-answer]\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 9<\/h3>\r\n[ohm_question hide_question_numbers=1]241151[\/ohm_question]\r\n\r\n[reveal-answer q=\"96884\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"96884\"]Make sure to include all states that are considered to be upper outliers.[\/hidden-answer]\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 10<\/h3>\r\n[ohm_question hide_question_numbers=1]241152[\/ohm_question]\r\n\r\n[reveal-answer q=\"633143\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"633143\"]Imagine how the outliers might appear as tails on a histogram or dotplot and the symmetry (or lack thereof) of the IQR about the median.[\/hidden-answer]\r\n\r\n<\/div>\r\nNow, let's calculate the mean using technology and compare it to the median.\r\n<div class=\"textbox\">\r\n\r\nGo to the Describing and Exploring Quantitative Variables tool at <a href=\"https:\/\/dcmathpathways.shinyapps.io\/EDA_quantitative\/\" target=\"_blank\" rel=\"noopener\">https:\/\/dcmathpathways.shinyapps.io\/EDA_quantitative\/<\/a>.\r\n<p style=\"padding-left: 30px;\">Step 1) Select the <strong>Single Group<\/strong> tab.<\/p>\r\n<p style=\"padding-left: 30px;\">Step 2) Locate the dropdown under <strong>Enter Data<\/strong> and select <strong>From Textbook<\/strong>.<\/p>\r\n<p style=\"padding-left: 30px;\">Step 3) Locate the drop-down menu under <strong>Data Set<\/strong> and select <strong>Bad Drivers (alcohol)<\/strong>.<\/p>\r\n<p style=\"padding-left: 30px;\">Step 4) Use the tool to compute the mean percentage of drivers involved in fatal collisions 
who were alcohol-impaired.<\/p>\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 11<\/h3>\r\n[ohm_question hide_question_numbers=1]241153[\/ohm_question]\r\n\r\n[reveal-answer q=\"363793\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"363793\"]The mean will appear in the Descriptive Statistics.[\/hidden-answer]\r\n\r\n<\/div>\r\n<h3>Outliers and Shape<\/h3>\r\nWere you surprised by the actual difference between the mean and median in the data set Bad Drivers (alcohol)? Or did the tool only confirm your suspicion that the data set was roughly symmetrical? Boxplots, like histograms and dotplots, can also tell us about the shape of a distribution.\r\n<div class=\"textbox exercises\">\r\n<h3>interactive example<\/h3>\r\nRecall the effect that skew has on the relationship between the mean and median in a data set. In a right-skewed data set, the mean is pulled to the right of the median, while in a left-skewed data set, the mean is pulled to the left. We can use visual clues to observe the skew in a boxplot in the same way that we can in a histogram or a dotplot.\r\n\r\nThe descriptive statistics and graphs below describe the data set Oscars: Age, which you explored in&nbsp;<em><a href=\"https:\/\/courses.lumenlearning.com\/exemplarstatistics\/chapter\/visualizing-quantitative-data-what-to-know\/\">Visualizing Quantitative Data: What to Know<\/a><\/em>. 
Let's use these to understand how to see the shape of a data set from a boxplot.\r\n\r\n&nbsp;\r\n\r\n<img class=\"aligncenter size-full wp-image-1057\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5772\/2022\/02\/12234214\/Skew_OscarsAge_smaller.png\" alt=\"Descriptive statistics (mean 40, median 38), and a histogram with a tail to the right, and a boxplot with three outliers to the right.\" width=\"700\" height=\"551\">\r\n<ol>\r\n \t<li>Do you notice any skew in the histogram of this data set?<\/li>\r\n \t<li>Can you point out the corresponding outliers in the boxplot of the data?<\/li>\r\n \t<li>What is the relationship between the mean and median of the data? Is the mean less than, greater than, or roughly similar to the median?<\/li>\r\n \t<li>What can you conclude about the shape of the data?<\/li>\r\n \t<li>What visual clue in the boxplot led to your conclusion?<\/li>\r\n<\/ol>\r\n[reveal-answer q=\"321107\"]Show Answer[\/reveal-answer]\r\n[hidden-answer a=\"321107\"]\r\n<ol>\r\n \t<li>The histogram appears to have a pronounced right tail, which would indicate a right skew.<\/li>\r\n \t<li>There are three distinct outliers to the right of the bulk of the data.<\/li>\r\n \t<li>The descriptive statistics give the median as 38 and the mean as 40. The mean is greater than the median.<\/li>\r\n \t<li>The data is right skewed. 
The extreme values greater than the bulk of the data have pulled the mean to the right of the median.<\/li>\r\n \t<li>The skew can be seen in the boxplot by observing outliers only to the right of the bulk of the data, with no corresponding, symmetrical outliers to the left.<\/li>\r\n<\/ol>\r\n[\/hidden-answer]\r\n\r\n<\/div>\r\nNow you try identifying the shape of the data sets represented by the boxplots in Question 12 below.\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 12<\/h3>\r\n[ohm_question hide_question_numbers=1]241154[\/ohm_question]\r\n\r\n[reveal-answer q=\"41394\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"41394\"]What effect do outliers have on the skew of data?[\/hidden-answer]\r\n\r\n<\/div>\r\n<h2>Summary<\/h2>\r\nIn this section, you've learned about boxplots: how to calculate the five-number summary, how to read these numbers from a boxplot, and how to identify outliers in a data set using the interquartile range. Let's summarize where these skills showed up in the material.\r\n<ul>\r\n \t<li>In Questions 1 and 7, you identified the features of a boxplot.<\/li>\r\n \t<li>In Questions 2 - 4, you interpreted the features of a boxplot.<\/li>\r\n \t<li>In Questions 5, 6, 8, and 9, you identified outliers in a data set.<\/li>\r\n \t<li>In Questions 10 - 12, you related the boxplot of a quantitative variable to its distribution.<\/li>\r\n<\/ul>\r\nBeing able to calculate and identify features of a boxplot and relate the boxplot and distribution of a quantitative variable are necessary statistical skills and will be used in the next activity. 
If you feel comfortable with these skills, please move on!\r\n","rendered":"<div class=\"textbox learning-objectives\">\n<h3>Learning Goals<\/h3>\n<p>After completing this section, you should feel comfortable performing these skills.<\/p>\n<ul>\n<li><a href=\"#5NumberSummary\">Define the terms: first quartile, third quartile, interquartile range, and five-number summary.<\/a><\/li>\n<li><a href=\"#featboxplot\">Identify the features of a boxplot.<\/a><\/li>\n<li><a href=\"#outlier\">Calculate the interquartile range for a data set.<\/a><\/li>\n<li><a href=\"#outlier\">Calculate the range of observations characterized as upper outliers or lower outliers.<\/a><\/li>\n<li><a href=\"#interpfeat_bxplt\">Interpret the features of a boxplot.<\/a><\/li>\n<li><a href=\"#identshape_bxplt\">Use a boxplot of a data set to identify whether the shape of its distribution is left-skewed, symmetric, or right-skewed.<\/a><\/li>\n<\/ul>\n<p>Click on a skill above to jump to its location in this section.<\/p>\n<\/div>\n<p>Boxplots are helpful for visualizing the distribution of a quantitative variable. A boxplot clearly shows the median of the data and provides a summary at a glance of the bulk of the data and the presence of outliers. In the next activity, you will need to be able to identify and interpret the features of a boxplot, identify outliers in a data set, and relate a boxplot of a quantitative variable to its distribution. 
In this section, you&#8217;ll learn to identify the key pieces of information needed to accomplish these tasks.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-932\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5772\/2022\/02\/08182823\/Screen-Shot-2022-02-11-at-1.08.00-PM.png\" alt=\"an image of a generic boxplot labeled with outliers, minimum, Q1, median, Q3, maximum, and interquartile range (IQR)\" width=\"586\" height=\"330\" \/><\/p>\n<h2>Boxplots<\/h2>\n<p>In order to interpret boxplots, you will need to identify the minimum, maximum, and median of a quantitative variable. You&#8217;ve done this in previous activities. See the Recall box below if you need a refresher. A boxplot captures only the median of the data set, not the mean, as a measure of center.<\/p>\n<div class=\"textbox examples\">\n<h3>recall<\/h3>\n<p>Core skill:<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q952167\">Identify the minimum value, maximum value, and median of a quantitative variable.<\/span><\/p>\n<div id=\"q952167\" class=\"hidden-answer\" style=\"display: none\">\n<p>Place the observed values in order to identify the minimum and maximum values. 
The median will be the middle number in the list (or the mean of the middle two numbers in a list with an even number of values).<\/p>\n<\/div>\n<\/div>\n<\/div>\n<p>You&nbsp;will also need to know the following definitions:<\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\">the <strong>first quartile<\/strong> of a quantitative variable (sometimes denoted <strong>Q1<\/strong>) is the value below which one quarter of the data lie, and the first quartile is also equal to the&nbsp;[latex]25[\/latex]<sup>th<\/sup> percentile;<\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\">the <strong>third quartile<\/strong> of a quantitative variable (sometimes denoted <strong>Q3<\/strong>) is the value below which three quarters of the data lie, and the third quartile is also equal to the&nbsp;[latex]75[\/latex]<sup>th<\/sup> percentile; and<\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\">the <strong>interquartile range<\/strong> (sometimes denoted <strong>IQR<\/strong>) of a quantitative variable is the quantity Q3 \u2013 Q1.<\/li>\n<\/ul>\n<p>Together, the minimum, first quartile, median, third quartile, and maximum form the <strong>five-number summary<\/strong> of the variable.<\/p>\n<div class=\"textbox tryit\">\n<h3>first and third quartiles<\/h3>\n<p><span style=\"background-color: #ffff00;\"><span style=\"background-color: #99cc00;\">[Perspective video &#8212; a 3-instructor video showing how to understand Q1 and Q3 as percentiles and\/or quarters of data. 
See below for the idea:]<\/span><\/span><\/p>\n<ul>\n<li><span style=\"background-color: #99cc00;\">the location of the Q1\/25th percentile and Q3\/75th percentile on a number line along with other percentile locations such as 10th and 98th along with three ways to think about it: <\/span>\n<ul>\n<li><span style=\"background-color: #99cc00;\">1)&nbsp; &#8220;if a student scores in the 10th percentile of a test like the SAT, they have scored higher than only 10% of all the test takers but if they score in the 98th percentile, then their score is higher than 98% of all the test takers.&#8221; and <\/span><\/li>\n<li><span style=\"background-color: #99cc00;\">2) &#8220;percentiles divide data into two parts &#8212; the lower part (she scored higher than 98% of the test takers) and the higher part (2% of the test takers scored higher than she did)&#8221; and 3) &#8220;the 25th percentile (first quartile) splits the data into the lower 25% and the 75% above that; the 50th percentile (2nd quartile) splits the data in half (marked by the median); the 75th percentile (3rd quartile) splits the data into the lower 75% and the 25% of the data above that.&#8221; <\/span><\/li>\n<li><span style=\"background-color: #99cc00;\">3) Subtracting the value of Q1 from the value of Q3 gives the IQR (the distance between the 25th percentile and the 75th percentile)<\/span><\/li>\n<\/ul>\n<\/li>\n<li><span style=\"background-color: #99cc00;\">(critics may point out that students will have seen all of this before, which is true but doesn&#8217;t acknowledge that students also need a brief refresher at this point.)<\/span><\/li>\n<\/ul>\n<\/div>\n<h3 id=\"featboxplot\">Features of a Boxplot<\/h3>\n<p>The features of a boxplot include the five-number summary (minimum, Q1, median, Q3, and maximum) together with the interquartile range (IQR) and any outliers. 
See the interactive example below for a demonstration of how to find and interpret the five-number summary, calculate the IQR, and discuss the presence of outliers.<\/p>\n<div class=\"textbox exercises\">\n<h3>interactive example<\/h3>\n<p>You may recognize the descriptive statistics below as a description of the Sleep Study you explored in <a href=\"https:\/\/courses.lumenlearning.com\/exemplarstatistics\/chapter\/calculating-mean-and-median-of-a-data set-what-to-know\/\"><em>Calculating the Mean and Median of a Data Set: What to Know<\/em><\/a>.<\/p>\n<p>You can use the quantitative data analysis tool at&nbsp;<a href=\"https:\/\/dcmathpathways.shinyapps.io\/EDA_quantitative\/\">https:\/\/dcmathpathways.shinyapps.io\/EDA_quantitative\/<\/a> to display the descriptive statistics and boxplot by choosing the data set Sleep Study: Average Sleep and Type of Plot: Boxplot in the tool. But these are also reproduced for you below.<\/p>\n<p>Recall that this data set contains the average number of hours of sleep per night for each of the&nbsp;253&nbsp;students in the sleep study.<\/p>\n<h2>Sleep Study: Average Sleep<\/h2>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-1046\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5772\/2022\/02\/12220201\/Boxplot_AvgSleep_DescriptStat.jpg\" alt=\"Descriptive Statistics: Sample Size 253, Mean 7.97, Standard Deviation 0.965, Minimum 4.95, Q1 7.42, Median 8, Q3 8.59, Maximum 10.6 and IQR 1.17\" width=\"700\" height=\"127\" \/><\/p>\n<p>&nbsp;<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-1045\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5772\/2022\/02\/12215659\/Boxplot_AvgSleep.png\" alt=\"A boxplot with 2 outliers at 5 and approximately 5.75 on the left, and two above 10 on the right. The whiskers extend from the box ranging from approximately 5.75 to 10. 
The box extends from 7.42 to 8.59 and shows the median at 8. The horizontal axis is labeled Average Sleep (Hours)\" width=\"700\" height=\"160\" \/><\/p>\n<p><em>Note that the boxplot produced here is presented along a horizontal axis, from left to right. It is also common to see boxplots displayed along a vertical axis, from bottom to top, least to greatest. In fact, the graph you&#8217;ll use to answer the questions later in the text will be displayed vertically.<\/em><\/p>\n<p>Use the descriptive statistics and boxplot given here to answer the following for the Sleep Study: Average Sleep data set.<\/p>\n<ol>\n<li style=\"list-style-type: none;\">\n<ol>\n<li style=\"list-style-type: none;\">\n<ol>\n<li>Locate the Minimum, First Quartile (Q1), Median, Third Quartile (Q3), and Maximum data values in the list of Descriptive Statistics presented above and identify them on the graph.<\/li>\n<li>The plot indicates that about half the students reported getting fewer than _______ hours of sleep per night and half got more than that.<\/li>\n<li>About a quarter of the students got no more than _______ hours of sleep per night.<\/li>\n<li>About three-quarters of the students reported sleeping up to _____ hours per night.<\/li>\n<li>About __________ of the students reported sleeping more than 8.59 hours per night.<\/li>\n<li>What is the interquartile range of this data set?<\/li>\n<li>The range of numbers considered upper and lower outliers can be found by calculating [latex]1.5\\times\\text{IQR}[\/latex] and then locating the values that lie more than that distance below Q1 or above Q3.\n<ul>\n<li style=\"list-style-type: none;\">\n<ul>\n<li>Upper outliers are the observations greater than&nbsp;[latex]\\text{Q3}+1.5\\times\\left(\\text{IQR}\\right)[\/latex].<\/li>\n<li>Lower outliers are the observations less than [latex]\\text{Q1}-1.5\\times\\left(\\text{IQR}\\right)[\/latex].<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p>Use these formulas to identify the outliers in the data set. 
That is, more than _______ hours of sleep per night would be considered an upper outlier, and fewer than ________ hours of sleep per night would be considered a lower outlier.<\/li>\n<\/ol>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q413831\">Show Answer<\/span><\/p>\n<div id=\"q413831\" class=\"hidden-answer\" style=\"display: none\">\n<ol>\n<li>The five-number summary is indicated in the descriptive statistics and labeled on the boxplot below.<img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-1048 size-full\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5772\/2022\/02\/12224143\/Boxplot_AvgSleep_DescriptStat_labeld1.jpg\" alt=\"Descriptive statistics with the five-number summary values circled.\" width=\"700\" height=\"120\" \/><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-1049\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5772\/2022\/02\/12224310\/Boxplot_AvgSleep_labeled.jpg\" alt=\"The boxplot labeled with a, b, c, d, e to correspond with the five-number summary min, Q1, median, Q3, and max.\" width=\"700\" height=\"176\" \/><\/li>\n<li>8; the median splits the data in half. Half of the reported sleep hours lie below 8 and half lie above 8.<\/li>\n<li>7.42; Q1 is the value below which about 25% of the data lie. About a quarter of the observations were below 7.42 hours.<\/li>\n<li>8.59; Q3 is the value below which about 75% of the data lie. About three-quarters of the students reported sleeping up to 8.59 hours per night.<\/li>\n<li>25%; since 75% of the data lie below Q3, the remaining 25% lie above it. About a quarter of the students reported sleeping more than 8.59 hours per night.<\/li>\n<li>IQR = 1.17. 
This is given in the descriptive statistics, but can be calculated as Q3 &#8211; Q1, or 8.59 &#8211; 7.42 = 1.17.<\/li>\n<li>Since [latex]1.5\\times\\left(\\text{IQR}\\right) = (1.5)(1.17) = 1.755[\/latex], the outlier boundaries are:\n<ul>\n<li>Upper outliers: More than [latex]\\text{Q3} + 1.755 = 8.59 + 1.755 = 10.345[\/latex].<\/li>\n<li>Lower outliers: Less than [latex]\\text{Q1} - 1.755 = 7.42 - 1.755 = 5.665[\/latex].<\/li>\n<li>More than [latex]10.345[\/latex] hours per night would be considered an upper outlier.<\/li>\n<li>Fewer than [latex]5.665[\/latex] hours per night would be considered a lower outlier.<\/div>\n<\/div>\n<\/li>\n<\/ul>\n<\/li>\n<\/ol>\n<\/li>\n<\/ol>\n<\/li>\n<\/ol>\n<\/div>\n<p>Now it&#8217;s your turn to practice calculating and interpreting the features of a boxplot using a real data set.<\/p>\n<h3>Five-Number Summary<\/h3>\n<p>As we explore the features of boxplots, we will work with part of a data set that reports information about whether drivers involved in a fatal crash were impaired by alcohol.<a class=\"footnote\" title=\"Chalabi, M. (2014, October 24). Dear Mona, which state has the worst drivers? FiveThirtyEight. 
https:\/\/fivethirtyeight.com\/features\/which-state-has-the-worst-drivers\/\" id=\"return-footnote-46-1\" href=\"#footnote-46-1\" aria-label=\"Footnote 1\"><sup class=\"footnote\">[1]<\/sup><\/a> The data set contains&nbsp;[latex]51[\/latex] entries corresponding to all&nbsp;[latex]50[\/latex] states, as well as Washington, DC.<\/p>\n<p>The following table gives the five-number summary for the percentage of drivers involved in fatal collisions who were alcohol-impaired in all&nbsp;[latex]50[\/latex] states and Washington, DC.<\/p>\n<table class=\"aligncenter\" style=\"border-collapse: collapse; width: 63.8889%; height: 68px;\">\n<tbody>\n<tr>\n<td style=\"width: 155.891px; text-align: center; vertical-align: middle;\"><strong>Minimum<\/strong><\/td>\n<td style=\"width: 155.891px; text-align: center; vertical-align: middle;\"><strong>First Quartile<\/strong><\/td>\n<td style=\"width: 155.891px; text-align: center; vertical-align: middle;\"><strong>Median<\/strong><\/td>\n<td style=\"width: 155.891px; text-align: center; vertical-align: middle;\"><strong>Third Quartile<\/strong><\/td>\n<td style=\"width: 155.938px; text-align: center; vertical-align: middle;\"><strong>Maximum<\/strong><\/td>\n<\/tr>\n<tr>\n<td style=\"width: 155.891px; text-align: center; vertical-align: middle;\">[latex]16[\/latex]<\/td>\n<td style=\"width: 155.891px; text-align: center; vertical-align: middle;\">[latex]28[\/latex]<\/td>\n<td style=\"width: 155.891px; text-align: center; vertical-align: middle;\">[latex]30[\/latex]<\/td>\n<td style=\"width: 155.891px; text-align: center; vertical-align: middle;\">[latex]33[\/latex]<\/td>\n<td style=\"width: 155.938px; text-align: center; vertical-align: middle;\">[latex]44[\/latex]<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>One of the ways to visualize the data using the five-number summary is by creating a <strong>boxplot<\/strong>.&nbsp;For questions 1 &#8211; 4, refer to the following boxplot, which depicts data about the percentage of drivers 
involved in fatal collisions who were alcohol-impaired in all&nbsp;[latex]50[\/latex] states and Washington, DC. The letters A &#8211; G are superimposed on the boxplot to label different features of the plot.<\/p>\n<p><strong><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-1029 aligncenter\" style=\"font-size: 1rem; text-align: initial;\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2022\/01\/11223308\/Picture55-267x300.png\" alt=\"A vertical boxplot titled &quot;Percentage of drivers involved in fatal collisions who were alcohol-impaired.&quot; The vertical axis is numbered by increments of 5 from 15 to 50. On the graph, there are points at 16, 41, 42, and 44. The point at 44 is labeled &quot;A.&quot; The high point of the box plot is at 38 and labeled &quot;B,&quot; while the low point is at 23 and labeled &quot;F.&quot; The high end of the box is at 33 and labeled &quot;C&quot; while the low end is at 28 and labeled &quot;E.&quot; The middle line is at 30 and labeled &quot;D.&quot;\" width=\"454\" height=\"510\" \/><\/strong><\/p>\n<div class=\"textbox key-takeaways\">\n<h3>question 1<\/h3>\n<p><iframe loading=\"lazy\" id=\"ohm241141\" class=\"resizable\" src=\"https:\/\/ohm.lumenlearning.com\/multiembedq.php?id=241141&theme=oea&iframe_resize_id=ohm241141\" width=\"100%\" height=\"150\"><\/iframe><\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q102884\">Hint<\/span><\/p>\n<div id=\"q102884\" class=\"hidden-answer\" style=\"display: none\">Look back at the table showing the five-number summary, and enter the letter that corresponds to the indicated term.<\/div>\n<\/div>\n<\/div>\n<p>For questions 2 &#8211; 4, complete each sentence using information from the boxplot above.<\/p>\n<div class=\"textbox key-takeaways\">\n<h3>question 2<\/h3>\n<p><iframe loading=\"lazy\" id=\"ohm241142\" class=\"resizable\" 
src=\"https:\/\/ohm.lumenlearning.com\/multiembedq.php?id=241142&theme=oea&iframe_resize_id=ohm241142\" width=\"100%\" height=\"150\"><\/iframe><\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q519981\">Hint<\/span><\/p>\n<div id=\"q519981\" class=\"hidden-answer\" style=\"display: none\">This question involves about half of the states, so look for the median.<\/div>\n<\/div>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>question 3<\/h3>\n<p><iframe loading=\"lazy\" id=\"ohm241143\" class=\"resizable\" src=\"https:\/\/ohm.lumenlearning.com\/multiembedq.php?id=241143&theme=oea&iframe_resize_id=ohm241143\" width=\"100%\" height=\"150\"><\/iframe><\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q16620\">Hint<\/span><\/p>\n<div id=\"q16620\" class=\"hidden-answer\" style=\"display: none\">Which number in the five-number summary is related to a quarter of the states?<\/div>\n<\/div>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>question 4<\/h3>\n<p><iframe loading=\"lazy\" id=\"ohm241144\" class=\"resizable\" src=\"https:\/\/ohm.lumenlearning.com\/multiembedq.php?id=241144&theme=oea&iframe_resize_id=ohm241144\" width=\"100%\" height=\"150\"><\/iframe><\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q735025\">Hint<\/span><\/p>\n<div id=\"q735025\" class=\"hidden-answer\" style=\"display: none\">What do <em>you<\/em> think?<\/div>\n<\/div>\n<\/div>\n<h3 id=\"outlier\">Interquartile Range and Outliers<\/h3>\n<p>Now, let\u2019s define the idea of an outlier more precisely. Previously, we&#8217;ve seen that an outlier is a value that is unusual, given the other values in a data set. But what does \u201cunusual\u201d mean? 
To be more precise, for data with only one variable, let&#8217;s define the following:<\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\">an <strong>upper outlier<\/strong> as an observation that is greater than Q3 +&nbsp;[latex]1.5[\/latex] \u00d7 (IQR); and<\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\">a <strong>lower outlier<\/strong> as an observation that is less than Q1 &#8211;&nbsp;[latex]1.5[\/latex] \u00d7 (IQR).<\/li>\n<\/ul>\n<p>Use these definitions with the boxplot above Question 1 to complete the sentences in Questions 5 and 6.<\/p>\n<div class=\"textbox tryit\">\n<h3>identifying features of a boxplot<\/h3>\n<p><span style=\"background-color: #99cc00;\">[Worked example &#8211; a 3-instructor video showing a worked example similar to Questions 5 &#8211; 7]<\/span><\/p>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>question 5<\/h3>\n<p><iframe loading=\"lazy\" id=\"ohm241146\" class=\"resizable\" src=\"https:\/\/ohm.lumenlearning.com\/multiembedq.php?id=241146&theme=oea&iframe_resize_id=ohm241146\" width=\"100%\" height=\"150\"><\/iframe><\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q587526\">Hint<\/span><\/p>\n<div id=\"q587526\" class=\"hidden-answer\" style=\"display: none\">IQR = Q3 \u2013 Q1; use the table above Question 1 to evaluate.<\/div>\n<\/div>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>question 6<\/h3>\n<p><iframe loading=\"lazy\" id=\"ohm241147\" class=\"resizable\" src=\"https:\/\/ohm.lumenlearning.com\/multiembedq.php?id=241147&theme=oea&iframe_resize_id=ohm241147\" width=\"100%\" height=\"150\"><\/iframe><\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q206360\">Hint<\/span><\/p>\n<div id=\"q206360\" class=\"hidden-answer\" style=\"display: none\">Use the five-number summary and your answer to Question 3 to evaluate. 
What is&nbsp;[latex]1.5[\/latex] \u00d7 (IQR)?<\/div>\n<\/div>\n<\/div>\n<p>Again, referring to the boxplot above Question 1, we saw previously how some of the boxplot\u2019s features relate to the five-number summary, but when outliers are present, the boxplot is modified as shown below.<\/p>\n<p><strong><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-1030 aligncenter\" style=\"font-size: 1rem; text-align: initial;\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2022\/01\/11223314\/Picture56-252x300.png\" alt=\"A vertical boxplot titled &quot;Percentage of drivers involved in fatal collisions who were alcohol-impaired.&quot; The vertical axis is numbered by increments of 5 from 15 to 50. On the graph, there is a point at 16 labeled &quot;D.&quot; There are also points at 41, 42, and 44 collectively labeled &quot;A.&quot; The high point of the box plot is at 38 and labeled &quot;B,&quot; while the low point is at 23 and labeled &quot;C.&quot; The high end of the box is at 33 while the low end is at 28. 
The middle line is at 30.\" width=\"463\" height=\"551\" \/><\/strong><\/p>\n<div class=\"textbox key-takeaways\">\n<h3>question 7<\/h3>\n<p style=\"text-align: left;\"><span style=\"font-size: 1rem; text-align: initial;\"><iframe loading=\"lazy\" id=\"ohm241149\" class=\"resizable\" src=\"https:\/\/ohm.lumenlearning.com\/multiembedq.php?id=241149&theme=oea&iframe_resize_id=ohm241149\" width=\"100%\" height=\"150\"><\/iframe><\/span><\/p>\n<p style=\"text-align: left;\"><span style=\"font-size: 1rem; text-align: initial;\"><\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q496986\">Hint<\/span><\/span><\/p>\n<p style=\"text-align: left;\">\n<div id=\"q496986\" class=\"hidden-answer\" style=\"display: none\">What do <em>you<\/em> think?<\/div>\n<\/div>\n<\/div>\n<p>The following table lists each state in the data set, along with the corresponding percentages of drivers involved in fatal crashes who were impaired by alcohol, in order from lowest percentage to highest percentage. 
Use this table and the definition of <em>outlier<\/em> to answer questions 8 &#8211; 9.<\/p>\n<div style=\"margin: auto;\">\n<table>\n<tbody>\n<tr>\n<td style=\"text-align: center;\" colspan=\"4\"><strong>Drivers Involved in Fatal Crashes by State<\/strong><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><strong>State<\/strong><\/td>\n<td style=\"text-align: center;\"><strong>Percentage of Drivers Involved in Fatal Crashes and Impaired by Alcohol<\/strong><\/td>\n<td style=\"text-align: center;\"><strong>State<\/strong><\/td>\n<td style=\"text-align: center;\"><strong>Percentage of Drivers Involved in Fatal Crashes and Impaired by Alcohol<\/strong><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><strong>Utah<\/strong><\/td>\n<td style=\"text-align: center;\">[latex]16[\/latex]<\/td>\n<td style=\"text-align: center;\"><strong>Maine<\/strong><\/td>\n<td style=\"text-align: center;\">[latex]30[\/latex]<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><strong>Kentucky<\/strong><\/td>\n<td style=\"text-align: center;\">[latex]23[\/latex]<\/td>\n<td style=\"text-align: center;\"><strong>New Hampshire<\/strong><\/td>\n<td style=\"text-align: center;\">[latex]30[\/latex]<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><strong>Kansas<\/strong><\/td>\n<td style=\"text-align: center;\">[latex]24[\/latex]<\/td>\n<td style=\"text-align: center;\"><strong>Vermont<\/strong><\/td>\n<td style=\"text-align: center;\">[latex]30[\/latex]<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><strong>Alaska<\/strong><\/td>\n<td style=\"text-align: center;\">[latex]25[\/latex]<\/td>\n<td style=\"text-align: center;\"><strong>Mississippi<\/strong><\/td>\n<td style=\"text-align: center;\">[latex]31[\/latex]<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><strong>Georgia<\/strong><\/td>\n<td style=\"text-align: center;\">[latex]25[\/latex]<\/td>\n<td style=\"text-align: center;\"><strong>North Carolina<\/strong><\/td>\n<td style=\"text-align: 
center;\">[latex]31[\/latex]<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><strong>Iowa<\/strong><\/td>\n<td style=\"text-align: center;\">[latex]25[\/latex]<\/td>\n<td style=\"text-align: center;\"><strong>Pennsylvania<\/strong><\/td>\n<td style=\"text-align: center;\">[latex]31[\/latex]<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><strong>Arkansas<\/strong><\/td>\n<td style=\"text-align: center;\">[latex]26[\/latex]<\/td>\n<td style=\"text-align: center;\"><strong>Maryland<\/strong><\/td>\n<td style=\"text-align: center;\">[latex]32[\/latex]<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><strong>Oregon<\/strong><\/td>\n<td style=\"text-align: center;\">[latex]26[\/latex]<\/td>\n<td style=\"text-align: center;\"><strong>Nevada<\/strong><\/td>\n<td style=\"text-align: center;\">[latex]32[\/latex]<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><strong>District of Columbia<\/strong><\/td>\n<td style=\"text-align: center;\">[latex]27[\/latex]<\/td>\n<td style=\"text-align: center;\"><strong>Wyoming<\/strong><\/td>\n<td style=\"text-align: center;\">[latex]32[\/latex]<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><strong>New Mexico<\/strong><\/td>\n<td style=\"text-align: center;\">[latex]27[\/latex]<\/td>\n<td style=\"text-align: center;\"><strong>Louisiana<\/strong><\/td>\n<td style=\"text-align: center;\">[latex]33[\/latex]<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><strong>Virginia<\/strong><\/td>\n<td style=\"text-align: center;\">[latex]27[\/latex]<\/td>\n<td style=\"text-align: center;\"><strong>South Dakota<\/strong><\/td>\n<td style=\"text-align: center;\">[latex]33[\/latex]<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><strong>Arizona<\/strong><\/td>\n<td style=\"text-align: center;\">[latex]28[\/latex]<\/td>\n<td style=\"text-align: center;\"><strong>Washington<\/strong><\/td>\n<td style=\"text-align: center;\">[latex]33[\/latex]<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: 
center;\"><strong>California<\/strong><\/td>\n<td style=\"text-align: center;\">[latex]28[\/latex]<\/td>\n<td style=\"text-align: center;\"><strong>Wisconsin<\/strong><\/td>\n<td style=\"text-align: center;\">[latex]33[\/latex]<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><strong>Colorado<\/strong><\/td>\n<td style=\"text-align: center;\">[latex]28[\/latex]<\/td>\n<td style=\"text-align: center;\"><strong>Illinois<\/strong><\/td>\n<td style=\"text-align: center;\">[latex]34[\/latex]<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><strong>Michigan<\/strong><\/td>\n<td style=\"text-align: center;\">[latex]28[\/latex]<\/td>\n<td style=\"text-align: center;\"><strong>Missouri<\/strong><\/td>\n<td style=\"text-align: center;\">[latex]34[\/latex]<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><strong>New Jersey<\/strong><\/td>\n<td style=\"text-align: center;\">[latex]28[\/latex]<\/td>\n<td style=\"text-align: center;\"><strong>Ohio<\/strong><\/td>\n<td style=\"text-align: center;\">[latex]34[\/latex]<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><strong>West Virginia<\/strong><\/td>\n<td style=\"text-align: center;\">[latex]28[\/latex]<\/td>\n<td style=\"text-align: center;\"><strong>Massachusetts<\/strong><\/td>\n<td style=\"text-align: center;\">[latex]35[\/latex]<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><strong>Florida<\/strong><\/td>\n<td style=\"text-align: center;\">[latex]29[\/latex]<\/td>\n<td style=\"text-align: center;\"><strong>Nebraska<\/strong><\/td>\n<td style=\"text-align: center;\">[latex]35[\/latex]<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><strong>Idaho<\/strong><\/td>\n<td style=\"text-align: center;\">[latex]29[\/latex]<\/td>\n<td style=\"text-align: center;\"><strong>Connecticut<\/strong><\/td>\n<td style=\"text-align: center;\">[latex]36[\/latex]<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><strong>Indiana<\/strong><\/td>\n<td style=\"text-align: 
center;\">[latex]29[\/latex]<\/td>\n<td style=\"text-align: center;\"><strong>Rhode Island<\/strong><\/td>\n<td style=\"text-align: center;\">[latex]38[\/latex]<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><strong>Minnesota<\/strong><\/td>\n<td style=\"text-align: center;\">[latex]29[\/latex]<\/td>\n<td style=\"text-align: center;\"><strong>Texas<\/strong><\/td>\n<td style=\"text-align: center;\">[latex]38[\/latex]<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><strong>New York<\/strong><\/td>\n<td style=\"text-align: center;\">[latex]29[\/latex]<\/td>\n<td style=\"text-align: center;\"><strong>Hawaii<\/strong><\/td>\n<td style=\"text-align: center;\">[latex]41[\/latex]<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><strong>Oklahoma<\/strong><\/td>\n<td style=\"text-align: center;\">[latex]29[\/latex]<\/td>\n<td style=\"text-align: center;\"><strong>South Carolina<\/strong><\/td>\n<td style=\"text-align: center;\">[latex]41[\/latex]<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><strong>Tennessee<\/strong><\/td>\n<td style=\"text-align: center;\">[latex]29[\/latex]<\/td>\n<td style=\"text-align: center;\"><strong>North Dakota<\/strong><\/td>\n<td style=\"text-align: center;\">[latex]42[\/latex]<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><strong>Alabama<\/strong><\/td>\n<td style=\"text-align: center;\">[latex]30[\/latex]<\/td>\n<td style=\"text-align: center;\"><strong>Montana<\/strong><\/td>\n<td style=\"text-align: center;\">[latex]44[\/latex]<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><strong>Delaware<\/strong><\/td>\n<td style=\"text-align: center;\">[latex]30[\/latex]<\/td>\n<td style=\"text-align: center;\"><\/td>\n<td style=\"text-align: center;\"><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>question 8<\/h3>\n<p><iframe loading=\"lazy\" id=\"ohm241150\" class=\"resizable\" 
src=\"https:\/\/ohm.lumenlearning.com\/multiembedq.php?id=241150&theme=oea&iframe_resize_id=ohm241150\" width=\"100%\" height=\"150\"><\/iframe><\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q609285\">Hint<\/span><\/p>\n<div id=\"q609285\" class=\"hidden-answer\" style=\"display: none\">Use the IQR and the definition of <em>outlier<\/em>.<\/div>\n<\/div>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>question 9<\/h3>\n<p><iframe loading=\"lazy\" id=\"ohm241151\" class=\"resizable\" src=\"https:\/\/ohm.lumenlearning.com\/multiembedq.php?id=241151&theme=oea&iframe_resize_id=ohm241151\" width=\"100%\" height=\"150\"><\/iframe><\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q96884\">Hint<\/span><\/p>\n<div id=\"q96884\" class=\"hidden-answer\" style=\"display: none\">Make sure to include all states that are considered to be upper outliers.<\/div>\n<\/div>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>question 10<\/h3>\n<p><iframe loading=\"lazy\" id=\"ohm241152\" class=\"resizable\" src=\"https:\/\/ohm.lumenlearning.com\/multiembedq.php?id=241152&theme=oea&iframe_resize_id=ohm241152\" width=\"100%\" height=\"150\"><\/iframe><\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q633143\">Hint<\/span><\/p>\n<div id=\"q633143\" class=\"hidden-answer\" style=\"display: none\">Imagine how the outliers might appear as tails on a histogram or dotplot, and consider the symmetry (or lack thereof) of the IQR about the median.<\/div>\n<\/div>\n<\/div>\n<p>Now, let&#8217;s calculate the mean using technology and compare it to the median.<\/p>\n<div class=\"textbox\">\n<p>Go to the Describing and Exploring Quantitative Variables tool at <a href=\"https:\/\/dcmathpathways.shinyapps.io\/EDA_quantitative\/\" target=\"_blank\" 
rel=\"noopener\">https:\/\/dcmathpathways.shinyapps.io\/EDA_quantitative\/<\/a>.<\/p>\n<p style=\"padding-left: 30px;\">Step 1) Select the <strong>Single Group<\/strong> tab.<\/p>\n<p style=\"padding-left: 30px;\">Step 2) Locate the drop-down menu under <strong>Enter Data<\/strong> and select <strong>From Textbook<\/strong>.<\/p>\n<p style=\"padding-left: 30px;\">Step 3) Locate the drop-down menu under <strong>Data Set<\/strong> and select <strong>Bad Drivers (alcohol)<\/strong>.<\/p>\n<p style=\"padding-left: 30px;\">Step 4) Use the tool to compute the mean percentage of drivers involved in fatal collisions who were alcohol-impaired.<\/p>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>question 11<\/h3>\n<p><iframe loading=\"lazy\" id=\"ohm241153\" class=\"resizable\" src=\"https:\/\/ohm.lumenlearning.com\/multiembedq.php?id=241153&theme=oea&iframe_resize_id=ohm241153\" width=\"100%\" height=\"150\"><\/iframe><\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q363793\">Hint<\/span><\/p>\n<div id=\"q363793\" class=\"hidden-answer\" style=\"display: none\">The mean will appear in the Descriptive Statistics.<\/div>\n<\/div>\n<\/div>\n<h3>Outliers and Shape<\/h3>\n<p>Were you surprised by the actual difference between the mean and median of the data set Bad Drivers (alcohol)? Or did the tool only confirm your suspicion that the data set was roughly symmetrical? Boxplots, like histograms and dotplots, can also tell us about the shape of a distribution.<\/p>\n<div class=\"textbox exercises\">\n<h3>Interactive example<\/h3>\n<p>Recall the effect that skew has on the relationship between the mean and median in a data set. In a right-skewed data set, the mean is pulled to the right of the median, while in a left-skewed data set, the mean is pulled to the left. 
We can use visual clues to observe the skew in a boxplot in the same way that we can in a histogram or a dotplot.<\/p>\n<p>The descriptive statistics and graphs below describe the data set Oscars: Age, which you explored in <em><a href=\"https:\/\/courses.lumenlearning.com\/exemplarstatistics\/chapter\/visualizing-quantitative-data-what-to-know\/\">Visualizing Quantitative Data: What to Know<\/a><\/em>. Let&#8217;s use these to understand how to see the shape of a data set from a boxplot.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-1057\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5772\/2022\/02\/12234214\/Skew_OscarsAge_smaller.png\" alt=\"Descriptive statistics (mean 40, median 38), a histogram with a tail to the right, and a boxplot with three outliers to the right.\" width=\"700\" height=\"551\" \/><\/p>\n<ol>\n<li>Do you notice any skew in the histogram of this data set?<\/li>\n<li>Can you point out the corresponding outliers in the boxplot of the data?<\/li>\n<li>What is the relationship between the mean and median of the data? Is the mean less than, greater than, or roughly equal to the median?<\/li>\n<li>What can you conclude about the shape of the data?<\/li>\n<li>What visual clue in the boxplot led to your conclusion?<\/li>\n<\/ol>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q321107\">Show Answer<\/span><\/p>\n<div id=\"q321107\" class=\"hidden-answer\" style=\"display: none\">\n<ol>\n<li>The histogram has a pronounced right tail, which indicates a right skew.<\/li>\n<li>There are three distinct outliers to the right of the bulk of the data.<\/li>\n<li>The descriptive statistics give the median as 38 and the mean as 40. The mean is greater than the median.<\/li>\n<li>The data is right-skewed. 
The extreme values greater than the bulk of the data have pulled the mean to the right of the median.<\/li>\n<li>The skew can be seen in the boxplot by observing outliers only to the right of the bulk of the data, with no corresponding, symmetrical outliers to the left.<\/li>\n<\/ol>\n<\/div>\n<\/div>\n<\/div>\n<p>Now, try identifying the shape of the data sets represented by the boxplots in Question 12 below.<\/p>\n<div class=\"textbox key-takeaways\">\n<h3>question 12<\/h3>\n<p><iframe loading=\"lazy\" id=\"ohm241154\" class=\"resizable\" src=\"https:\/\/ohm.lumenlearning.com\/multiembedq.php?id=241154&theme=oea&iframe_resize_id=ohm241154\" width=\"100%\" height=\"150\"><\/iframe><\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q41394\">Hint<\/span><\/p>\n<div id=\"q41394\" class=\"hidden-answer\" style=\"display: none\">What effect do outliers have on the skew of data?<\/div>\n<\/div>\n<\/div>\n<h2>Summary<\/h2>\n<p>In this section, you&#8217;ve learned about boxplots: how to calculate the five-number summary, how to read these numbers from a boxplot, and how to identify outliers in a data set using the interquartile range. Let&#8217;s summarize where these skills showed up in the material.<\/p>\n<ul>\n<li>In Questions 1 and 7, you identified the features of a boxplot.<\/li>\n<li>In Questions 2 &#8211; 4, you interpreted the features of a boxplot.<\/li>\n<li>In Questions 5, 6, 8, and 9, you identified outliers in a data set.<\/li>\n<li>In Questions 10 &#8211; 12, you related the boxplot of a quantitative variable to its distribution.<\/li>\n<\/ul>\n<p>Calculating and identifying the features of a boxplot, and relating a boxplot to the distribution of a quantitative variable, are necessary statistical skills that you will use in the next activity. 
If you feel comfortable with these skills, please move on!<\/p>\n<hr class=\"before-footnotes clear\" \/><div class=\"footnotes\"><ol><li id=\"footnote-46-1\">Chalabi, M. (2014, October 24). <em>Dear Mona, which state has the worst drivers?<\/em> FiveThirtyEight. https:\/\/fivethirtyeight.com\/features\/which-state-has-the-worst-drivers\/ <a href=\"#return-footnote-46-1\" class=\"return-footnote\" aria-label=\"Return to footnote 1\">&crarr;<\/a><\/li><\/ol><\/div>","protected":false},"author":17533,"menu_order":40,"template":"","meta":{"_candela_citation":"[]","CANDELA_OUTCOMES_GUID":"","pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[],"contributor":[],"license":[],"class_list":["post-46","chapter","type-chapter","status-publish","hentry"],"part":20,"_links":{"self":[{"href":"https:\/\/courses.lumenlearning.com\/alphamodule\/wp-json\/pressbooks\/v2\/chapters\/46","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/courses.lumenlearning.com\/alphamodule\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/courses.lumenlearning.com\/alphamodule\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/alphamodule\/wp-json\/wp\/v2\/users\/17533"}],"version-history":[{"count":2,"href":"https:\/\/courses.lumenlearning.com\/alphamodule\/wp-json\/pressbooks\/v2\/chapters\/46\/revisions"}],"predecessor-version":[{"id":535,"href":"https:\/\/courses.lumenlearning.com\/alphamodule\/wp-json\/pressbooks\/v2\/chapters\/46\/revisions\/535"}],"part":[{"href":"https:\/\/courses.lumenlearning.com\/alphamodule\/wp-json\/pressbooks\/v2\/parts\/20"}],"metadata":[{"href":"https:\/\/courses.lumenlearning.com\/alphamodule\/wp-json\/pressbooks\/v2\/chapters\/46\/metadata\/"}],"wp:attachment":[{"href":"https:\/\/courses.lumenlearning.com\/alphamodule\/wp-json\/wp\/v2\/media?parent=46"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href"
:"https:\/\/courses.lumenlearning.com\/alphamodule\/wp-json\/pressbooks\/v2\/chapter-type?post=46"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/alphamodule\/wp-json\/wp\/v2\/contributor?post=46"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/alphamodule\/wp-json\/wp\/v2\/license?post=46"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}