{"id":98,"date":"2017-04-15T03:16:34","date_gmt":"2017-04-15T03:16:34","guid":{"rendered":"https:\/\/courses.lumenlearning.com\/conceptstest1\/chapter\/interquartile-range-and-boxplots-2-of-3\/"},"modified":"2017-05-28T00:30:47","modified_gmt":"2017-05-28T00:30:47","slug":"interquartile-range-and-boxplots-2-of-3","status":"web-only","type":"chapter","link":"https:\/\/courses.lumenlearning.com\/suny-hccc-wm-concepts-statistics\/chapter\/interquartile-range-and-boxplots-2-of-3\/","title":{"raw":"Interquartile Range and Boxplots (2 of 3)","rendered":"Interquartile Range and Boxplots (2 of 3)"},"content":{"raw":"&nbsp;\r\n<div class=\"textbox learning-objectives\">\r\n<h3>Learning Objectives<\/h3>\r\n<ul>\r\n \t<li>Use a five-number summary and a boxplot to describe a distribution.<\/li>\r\n<\/ul>\r\n<\/div>\r\n<h3>Introduction<\/h3>\r\nOn the previous page, we learned about the five-number summary. At this point, you should know the following:\r\n<ul>\r\n \t<li>The five-number summary uses quartiles to identify the center and spread of a distribution.<\/li>\r\n \t<li>The median (which is Q2) is a measure of center. We also view the median as a typical value that represents the distribution.<\/li>\r\n \t<li>The values between Q1 and Q3 give a typical range of values.<\/li>\r\n \t<li>The IQR is a way to measure the variability about the median.<\/li>\r\n<\/ul>\r\nNow we use the five-number summary to make a new type of graph, the <strong>boxplot<\/strong>. Boxplots are commonly used to summarize a distribution of a quantitative variable.\r\n<div class=\"textbox examples\">\r\n<h3>Example<\/h3>\r\n<h2>Boxplots for Exam Scores<\/h2>\r\nHere are the two sets of exam scores from the previous example. Recall that we divided the data into quartiles. In a data set, each quartile contains the same number of scores. 
In other words, each quartile contains 25% of the data.\r\n\r\nHere is the five-number summary for these two distributions:\r\n<ul>\r\n \t<li>Class A: Min: 40 Q1: 71 Q2: 74.5 Q3: 78.5 Max: 95<\/li>\r\n \t<li>Class B: Min: 40 Q1: 61 Q2: 74.5 Q3: 89 Max: 95<\/li>\r\n<\/ul>\r\nTo create the boxplot for each distribution,\r\n<ul>\r\n \t<li>Draw a box from Q1 to Q3.<\/li>\r\n \t<li>Draw a vertical line in the box at the median.<\/li>\r\n \t<li>Extend a tail from Q1 to the smallest value that is not an outlier and from Q3 to the largest value that is not an outlier.<\/li>\r\n \t<li>Indicate outliers with asterisks (*).<\/li>\r\n<\/ul>\r\n<img class=\"alignnone\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1729\/2017\/04\/15031629\/m2_summarizing_data_topic_2_3_Topic2_3IQRandBoxplots2of2_image1.png\" alt=\"Boxplot of exam scores. Class A's scores are concentrated in the seventy to eighty percent range. Class B's scores are spread out along the graph.\" width=\"427\" height=\"186\" \/>\r\n\r\nNotice: A long box in the boxplot indicates a large IQR, so the middle half of the data has a lot of variability. A short box in the boxplot indicates a small IQR. In this case, the middle half of the data has little variability.\r\n\r\nFrequently, side-by-side boxplots are drawn vertically. Here we drew vertical dotplots with their boxplots for the exam scores from the two classes.\r\n\r\n<img class=\"alignnone\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1729\/2017\/04\/15031631\/m2_summarizing_data_topic_2_3_Topic2_3IQRandBoxplots2of2_image2.png\" alt=\"Vertical, side-by-side boxplots of exam scores from two classes. Class A's scores are mostly in the seventy to eighty percent range. Class B's scores are spread out along the graph.\" width=\"294\" height=\"455\" \/>\r\n\r\nNote: Some statistical packages offer two options: a boxplot and a modified boxplot. We drew modified boxplots in this example. 
In a modified boxplot, outliers are marked with an asterisk (*). For a boxplot that is not modified, the tails extend to the minimum and maximum values. In this type of boxplot, we cannot see outliers.\r\n\r\n<\/div>\r\n<h3>Making a Boxplot:<\/h3>\r\nNow we walk through the steps for making a modified boxplot using the distribution of ages for winners of the Oscar Award for Best Actress from 1970 to 2001. The five-number summary for this distribution is\r\n<ul>\r\n \t<li>Min: 21 Q1: 32 Median: 35 Q3: 41.5 Max: 80<\/li>\r\n<\/ul>\r\nUsing the IQR definition of an outlier, there are three outliers: 61, 74, and 80.\r\n\r\nhttps:\/\/youtu.be\/qSsXqz67qHU\r\n<div class=\"textbox exercises\">\r\n<h3>Learn By Doing<\/h3>\r\nhttps:\/\/assessments.lumenlearning.com\/assessments\/3447\r\n\r\nhttps:\/\/assessments.lumenlearning.com\/assessments\/3448\r\n\r\n<\/div>\r\n&nbsp;\r\n\r\nAt this point, you should know how to\r\n<ul>\r\n \t<li>Create a boxplot from a five-number summary.<\/li>\r\n \t<li>Use a boxplot to identify and interpret quartiles.<\/li>\r\n \t<li>Identify the median and the IQR of a distribution from a boxplot.<\/li>\r\n<\/ul>\r\nNow we want to focus on what a boxplot does <em>not<\/em> tell us. A boxplot does not give us information about the following:\r\n<ul>\r\n \t<li>The number of data points in the data set.<\/li>\r\n \t<li>The number of data points within each quartile (though each quartile contains the same number of data points).<\/li>\r\n \t<li>The pattern of the data within each quartile.<\/li>\r\n<\/ul>\r\nHere are four data sets that illustrate these ideas.\r\n\r\n<img class=\"alignnone\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1729\/2017\/04\/15031633\/m2_summarizing_data_topic_2_3_Topic2_3IQRandBoxplots2of2_image4.png\" alt=\"Four data sets with the same boxplots but different data points and distribution shapes\" width=\"539\" height=\"215\" \/>\r\n\r\nHow are these data sets <em>similar<\/em>? 
Notice that the four data sets have the same boxplot. This is because the five-number summary is the same for each data set. The data sets have identical minimum value, maximum value, and quartile marks, so we could say that these data sets have the same center and spread.\r\n<ul>\r\n \t<li>Center: Each data set has a median of 10.<\/li>\r\n \t<li>Spread: In each data set, the middle half of the data varies from 7 to 14, so the IQR is 7. In each data set, the data varies from 4 to 19, so the overall range is 15.<\/li>\r\n<\/ul>\r\nHow are these data sets <em>different<\/em>? The data sets do not have the same number of data points. Also, the shape of each distribution is different.\r\n\r\nThe goal of the next Learn By Doing activity is to develop a deeper understanding of how the interquartile range (IQR) measures variability about the median. Use the simulation below for the next activity. You have used a similar simulation before. Recall the instructions for adding or removing data points:\r\n<ul>\r\n \t<li>To add a point, move the slider to the value you want, then click <strong>Add<\/strong>.<\/li>\r\n \t<li>To remove a point, move the slider to the value you want, then click <strong>Minus<\/strong>.<\/li>\r\n \t<li>To reset the simulation, click the button in the upper left corner that says <strong>Reset<\/strong>.<\/li>\r\n<\/ul>\r\n<a href=\"https:\/\/s3-us-west-2.amazonaws.com\/oerfiles\/Concepts+in+Statistics\/interactives\/interquartilerangeboxplots\/InterquartileRangeBoxplots.html\" target=\"new\">Click here to open this simulation in its own window.<\/a>\r\n\r\n<iframe id=\"_i_2d\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/oerfiles\/Concepts+in+Statistics\/interactives\/interquartilerangeboxplots\/InterquartileRangeBoxplots.html\" width=\"750\" height=\"500\"><\/iframe>\r\n<div class=\"textbox exercises\">\r\n<h3>Learn By 
Doing<\/h3>\r\nhttps:\/\/assessments.lumenlearning.com\/assessments\/3842\r\n\r\nhttps:\/\/assessments.lumenlearning.com\/assessments\/3843\r\n\r\n<\/div>","rendered":"<p>&nbsp;<\/p>\n<div class=\"textbox learning-objectives\">\n<h3>Learning Objectives<\/h3>\n<ul>\n<li>Use a five-number summary and a boxplot to describe a distribution.<\/li>\n<\/ul>\n<\/div>\n<h3>Introduction<\/h3>\n<p>On the previous page, we learned about the five-number summary. At this point, you should know the following:<\/p>\n<ul>\n<li>The five-number summary uses quartiles to identify the center and spread of a distribution.<\/li>\n<li>The median (which is Q2) is a measure of center. We also view the median as a typical value that represents the distribution.<\/li>\n<li>The values between Q1 and Q3 give a typical range of values.<\/li>\n<li>The IQR is a way to measure the variability about the median.<\/li>\n<\/ul>\n<p>Now we use the five-number summary to make a new type of graph, the <strong>boxplot<\/strong>. Boxplots are commonly used to summarize a distribution of a quantitative variable.<\/p>\n<div class=\"textbox examples\">\n<h3>Example<\/h3>\n<h2>Boxplots for Exam Scores<\/h2>\n<p>Here are the two sets of exam scores from the previous example. Recall that we divided the data into quartiles. In a data set, each quartile contains the same number of scores. 
In other words, each quartile contains 25% of the data.<\/p>\n<p>Here is the five-number summary for these two distributions:<\/p>\n<ul>\n<li>Class A: Min: 40 Q1: 71 Q2: 74.5 Q3: 78.5 Max: 95<\/li>\n<li>Class B: Min: 40 Q1: 61 Q2: 74.5 Q3: 89 Max: 95<\/li>\n<\/ul>\n<p>To create the boxplot for each distribution,<\/p>\n<ul>\n<li>Draw a box from Q1 to Q3.<\/li>\n<li>Draw a vertical line in the box at the median.<\/li>\n<li>Extend a tail from Q1 to the smallest value that is not an outlier and from Q3 to the largest value that is not an outlier.<\/li>\n<li>Indicate outliers with asterisks (*).<\/li>\n<\/ul>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1729\/2017\/04\/15031629\/m2_summarizing_data_topic_2_3_Topic2_3IQRandBoxplots2of2_image1.png\" alt=\"Boxplot of exam scores. Class A's scores are concentrated in the seventy to eighty percent range. Class B's scores are spread out along the graph.\" width=\"427\" height=\"186\" \/><\/p>\n<p>Notice: A long box in the boxplot indicates a large IQR, so the middle half of the data has a lot of variability. A short box in the boxplot indicates a small IQR. In this case, the middle half of the data has little variability.<\/p>\n<p>Frequently, side-by-side boxplots are drawn vertically. Here we drew vertical dotplots with their boxplots for the exam scores from the two classes.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1729\/2017\/04\/15031631\/m2_summarizing_data_topic_2_3_Topic2_3IQRandBoxplots2of2_image2.png\" alt=\"Vertical, side-by-side boxplots of exam scores from two classes. Class A's scores are mostly in the seventy to eighty percent range. 
Class B's scores are spread out along the graph.\" width=\"294\" height=\"455\" \/><\/p>\n<p>Note: Some statistical packages offer two options: a boxplot and a modified boxplot. We drew modified boxplots in this example. In a modified boxplot, outliers are marked with an asterisk (*). For a boxplot that is not modified, the tails extend to the minimum and maximum values. In this type of boxplot, we cannot see outliers.<\/p>\n<\/div>\n<h3>Making a Boxplot:<\/h3>\n<p>Now we walk through the steps for making a modified boxplot using the distribution of ages for winners of the Oscar Award for Best Actress from 1970 to 2001. The five-number summary for this distribution is<\/p>\n<ul>\n<li>Min: 21 Q1: 32 Median: 35 Q3: 41.5 Max: 80<\/li>\n<\/ul>\n<p>Using the IQR definition of an outlier, there are three outliers: 61, 74, and 80.<\/p>\n<p><iframe loading=\"lazy\" id=\"oembed-1\" title=\"Constructing a Boxplot\" width=\"500\" height=\"375\" src=\"https:\/\/www.youtube.com\/embed\/qSsXqz67qHU?feature=oembed&#38;rel=0\" frameborder=\"0\" allowfullscreen=\"allowfullscreen\"><\/iframe><\/p>\n<div class=\"textbox exercises\">\n<h3>Learn By Doing<\/h3>\n<p>\t<iframe id=\"lumen_assessment_3447\" class=\"resizable\" src=\"https:\/\/assessments.lumenlearning.com\/assessments\/load?assessment_id=3447&#38;embed=1&#38;external_user_id=&#38;external_context_id=&#38;iframe_resize_id=lumen_assessment_3447\" frameborder=\"0\" style=\"border:none;width:100%;height:100%;min-height:400px;\"><br \/>\n\t<\/iframe><\/p>\n<p>\t<iframe id=\"lumen_assessment_3448\" class=\"resizable\" src=\"https:\/\/assessments.lumenlearning.com\/assessments\/load?assessment_id=3448&#38;embed=1&#38;external_user_id=&#38;external_context_id=&#38;iframe_resize_id=lumen_assessment_3448\" frameborder=\"0\" style=\"border:none;width:100%;height:100%;min-height:400px;\"><br \/>\n\t<\/iframe><\/p>\n<\/div>\n<p>&nbsp;<\/p>\n<p>At this point, you should know how to<\/p>\n<ul>\n<li>Create a boxplot from a five-number 
summary.<\/li>\n<li>Use a boxplot to identify and interpret quartiles.<\/li>\n<li>Identify the median and the IQR of a distribution from a boxplot.<\/li>\n<\/ul>\n<p>Now we want to focus on what a boxplot does <em>not<\/em> tell us. A boxplot does not give us information about the following:<\/p>\n<ul>\n<li>The number of data points in the data set.<\/li>\n<li>The number of data points within each quartile (though each quartile contains the same number of data points).<\/li>\n<li>The pattern of the data within each quartile.<\/li>\n<\/ul>\n<p>Here are four data sets that illustrate these ideas.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1729\/2017\/04\/15031633\/m2_summarizing_data_topic_2_3_Topic2_3IQRandBoxplots2of2_image4.png\" alt=\"Four data sets with the same boxplots but different data points and distribution shapes\" width=\"539\" height=\"215\" \/><\/p>\n<p>How are these data sets <em>similar<\/em>? Notice that the four data sets have the same boxplot. This is because the five-number summary is the same for each data set. The data sets have identical minimum value, maximum value, and quartile marks, so we could say that these data sets have the same center and spread.<\/p>\n<ul>\n<li>Center: Each data set has a median of 10.<\/li>\n<li>Spread: In each data set, the middle half of the data varies from 7 to 14, so the IQR is 7. In each data set, the data varies from 4 to 19, so the overall range is 15.<\/li>\n<\/ul>\n<p>How are these data sets <em>different<\/em>? The data sets do not have the same number of data points. Also, the shape of each distribution is different.<\/p>\n<p>The goal of the next Learn By Doing activity is to develop a deeper understanding of how the interquartile range (IQR) measures variability about the median. Use the simulation below for the next activity. You have used a similar simulation before. 
Recall the instructions for adding or removing data points:<\/p>\n<ul>\n<li>To add a point, move the slider to the value you want, then click <strong>Add<\/strong>.<\/li>\n<li>To remove a point, move the slider to the value you want, then click <strong>Minus<\/strong>.<\/li>\n<li>To reset the simulation, click the button in the upper left corner that says <strong>Reset<\/strong>.<\/li>\n<\/ul>\n<p><a href=\"https:\/\/s3-us-west-2.amazonaws.com\/oerfiles\/Concepts+in+Statistics\/interactives\/interquartilerangeboxplots\/InterquartileRangeBoxplots.html\" target=\"new\">Click here to open this simulation in its own window.<\/a><\/p>\n<p><iframe loading=\"lazy\" id=\"_i_2d\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/oerfiles\/Concepts+in+Statistics\/interactives\/interquartilerangeboxplots\/InterquartileRangeBoxplots.html\" width=\"750\" height=\"500\"><\/iframe><\/p>\n<div class=\"textbox exercises\">\n<h3>Learn By Doing<\/h3>\n<p>\t<iframe id=\"lumen_assessment_3842\" class=\"resizable\" src=\"https:\/\/assessments.lumenlearning.com\/assessments\/load?assessment_id=3842&#38;embed=1&#38;external_user_id=&#38;external_context_id=&#38;iframe_resize_id=lumen_assessment_3842\" frameborder=\"0\" style=\"border:none;width:100%;height:100%;min-height:400px;\"><br \/>\n\t<\/iframe><\/p>\n<p>\t<iframe id=\"lumen_assessment_3843\" class=\"resizable\" src=\"https:\/\/assessments.lumenlearning.com\/assessments\/load?assessment_id=3843&#38;embed=1&#38;external_user_id=&#38;external_context_id=&#38;iframe_resize_id=lumen_assessment_3843\" frameborder=\"0\" style=\"border:none;width:100%;height:100%;min-height:400px;\"><br \/>\n\t<\/iframe><\/p>\n<\/div>\n\n\t\t\t <section class=\"citations-section\" role=\"contentinfo\">\n\t\t\t <h3>Candela Citations<\/h3>\n\t\t\t\t\t <div>\n\t\t\t\t\t\t <div id=\"citation-list-98\">\n\t\t\t\t\t\t\t <div class=\"licensing\"><div class=\"license-attribution-dropdown-subheading\">CC licensed content, Shared previously<\/div><ul 
class=\"citation-list\"><li>Concepts in Statistics. <strong>Provided by<\/strong>: Open Learning Initiative. <strong>Located at<\/strong>: <a target=\"_blank\" href=\"http:\/\/oli.cmu.edu\">http:\/\/oli.cmu.edu<\/a>. <strong>License<\/strong>: <em><a target=\"_blank\" rel=\"license\" href=\"https:\/\/creativecommons.org\/licenses\/by\/4.0\/\">CC BY: Attribution<\/a><\/em><\/li><\/ul><\/div>\n\t\t\t\t\t\t <\/div>\n\t\t\t\t\t <\/div>\n\t\t\t <\/section>","protected":false},"author":163,"menu_order":17,"template":"","meta":{"_candela_citation":"[{\"type\":\"cc\",\"description\":\"Concepts in Statistics\",\"author\":\"\",\"organization\":\"Open Learning Initiative\",\"url\":\"http:\/\/oli.cmu.edu\",\"project\":\"\",\"license\":\"cc-by\",\"license_terms\":\"\"}]","CANDELA_OUTCOMES_GUID":"8cdda8d8-5533-43a7-9f34-87a8300df5aa","pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[],"contributor":[],"license":[],"class_list":["post-98","chapter","type-chapter","status-web-only","hentry"],"part":43,"_links":{"self":[{"href":"https:\/\/courses.lumenlearning.com\/suny-hccc-wm-concepts-statistics\/wp-json\/pressbooks\/v2\/chapters\/98","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/courses.lumenlearning.com\/suny-hccc-wm-concepts-statistics\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/courses.lumenlearning.com\/suny-hccc-wm-concepts-statistics\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/suny-hccc-wm-concepts-statistics\/wp-json\/wp\/v2\/users\/163"}],"version-history":[{"count":5,"href":"https:\/\/courses.lumenlearning.com\/suny-hccc-wm-concepts-statistics\/wp-json\/pressbooks\/v2\/chapters\/98\/revisions"}],"predecessor-version":[{"id":1325,"href":"https:\/\/courses.lumenlearning.com\/suny-hccc-wm-concepts-statistics\/wp-json\/pressbooks\/v2\/chapters\/98\/revisions\/1325"}],"part":[{"href":"https:\/\/courses.lu
menlearning.com\/suny-hccc-wm-concepts-statistics\/wp-json\/pressbooks\/v2\/parts\/43"}],"metadata":[{"href":"https:\/\/courses.lumenlearning.com\/suny-hccc-wm-concepts-statistics\/wp-json\/pressbooks\/v2\/chapters\/98\/metadata\/"}],"wp:attachment":[{"href":"https:\/\/courses.lumenlearning.com\/suny-hccc-wm-concepts-statistics\/wp-json\/wp\/v2\/media?parent=98"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/suny-hccc-wm-concepts-statistics\/wp-json\/pressbooks\/v2\/chapter-type?post=98"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/suny-hccc-wm-concepts-statistics\/wp-json\/wp\/v2\/contributor?post=98"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/suny-hccc-wm-concepts-statistics\/wp-json\/wp\/v2\/license?post=98"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}