{"id":45,"date":"2022-05-20T16:59:05","date_gmt":"2022-05-20T16:59:05","guid":{"rendered":"https:\/\/courses.lumenlearning.com\/alphamodule\/chapter\/five-number-summary-in-box-plots-and-datasets-corequisite-support-activity\/"},"modified":"2022-07-11T19:25:10","modified_gmt":"2022-07-11T19:25:10","slug":"five-number-summary-in-box-plots-and-datasets-corequisite-support-activity","status":"publish","type":"chapter","link":"https:\/\/courses.lumenlearning.com\/alphamodule\/chapter\/five-number-summary-in-box-plots-and-datasets-corequisite-support-activity\/","title":{"raw":"Five Number Summary in Boxplots and Data Sets: Background You'll Need 1","rendered":"Five Number Summary in Boxplots and Data Sets: Background You&#8217;ll Need 1"},"content":{"raw":"<div class=\"textbox learning-objectives\">\r\n<h3>Learning Goals<\/h3>\r\nIn this support activity you'll become familiar with the following:\r\n<ul>\r\n \t<li><a href=\"#MinMax\">Identify the minimum and maximum values of a data set.<\/a><\/li>\r\n \t<li><a href=\"#Median\">Calculate and interpret the median.<\/a><\/li>\r\n \t<li><a href=\"#Q1\">Calculate the first quartile (Q1).<\/a><\/li>\r\n \t<li><a href=\"#Q3\">Calculate the third quartile (Q3).<\/a><\/li>\r\n \t<li><a href=\"#Five-Number\">List the five-number-summary for a quantitative variable.<\/a><\/li>\r\n \t<li><a href=\"#IQR\">Calculate the interquartile range (IQR) for a quantitative variable.<\/a><\/li>\r\n \t<li><a href=\"#Outlier\">Determine whether or not a value is an outlier.<\/a><\/li>\r\n<\/ul>\r\n<\/div>\r\nThe upcoming section of material and following activity will introduce a new graph for displaying quantitative data called a boxplot. The image below shows a boxplot labeled with the five-number-summary and interquartile range.\r\n\r\nWe'll explore boxplots in detail soon. 
The focus of this support activity is to help you become familiar with the characteristics of a boxplot: minimum and maximum values, median, first quartile, third quartile, and interquartile range.\r\n\r\n<img class=\"aligncenter wp-image-3078 size-full\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2021\/12\/11181025\/Screen-Shot-2022-02-11-at-1.09.31-PM.png\" alt=\"A general horizontal boxplot displaying the following features from left to right: lower outliers, minimum, Q1, median, Q3, maximum, and upper outliers. The Interquartile Range (IQR) is shown at the top of the boxplot.\" width=\"575\" height=\"319\" \/>\r\n\r\nA boxplot is a graphical visualization of a quantitative variable that shows median, spread, skew, and outliers by illustrating a set of numbers called the <strong>five-number summary<\/strong>.\u00a0In the next section of the course material, you will need to be able to relate the features of a boxplot to the data set it comes from. In the following activity, you will need to be able to interpret and compare boxplots. 
Begin to familiarize yourself with boxplots in this corequisite support activity during which you'll build up an understanding of the parts of the five-number summary and how to determine whether a data value is \"unusual enough\" to qualify as an outlier.\r\n\r\nTo introduce this new quantitative graph, we'll use a data set that contains the gross domestic product per capita for the\u00a0[latex]10[\/latex] most populous countries.\r\n<h2>GDP of the World\u2019s Most Populous Countries<\/h2>\r\n<img class=\"wp-image-1025 aligncenter\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2022\/01\/11223247\/Picture51-300x190.jpg\" alt=\"a semispherical map of the world\" width=\"499\" height=\"316\" \/>\r\n\r\nThe following table lists data for the\u00a0[latex]10[\/latex] most populous countries in 2018, and it includes each country\u2019s population rank (we can see that China had the largest population in 2018, India had the second largest population, and so on) and each country\u2019s gross domestic product (GDP) per capita. [footnote] Bevins, V. (2020). The Jakarta method: Washington\u2019s anticommunist crusade and the mass murder program that shaped our world. PublicAffairs. [\/footnote] A country\u2019s GDP is the total monetary value of everything produced in that country over the year. 
The GDP per capita is a country\u2019s GDP divided by its population.\r\n\r\n&nbsp;\r\n<div align=\"center\">\r\n<table style=\"height: 154px;\">\r\n<tbody>\r\n<tr style=\"height: 14px;\">\r\n<td style=\"text-align: center; height: 14px; width: 112.141px;\"><strong>Country<\/strong><\/td>\r\n<td style=\"text-align: center; height: 14px; width: 137.5px;\"><strong>Population Rank<\/strong><\/td>\r\n<td style=\"text-align: center; height: 14px; width: 127.859px;\"><strong>GDP per Capita<\/strong><\/td>\r\n<\/tr>\r\n<tr style=\"height: 14px;\">\r\n<td style=\"text-align: center; height: 14px; width: 112.141px;\"><strong>China<\/strong><\/td>\r\n<td style=\"text-align: center; height: 14px; width: 137.5px;\"><strong>1<\/strong><\/td>\r\n<td style=\"text-align: center; height: 14px; width: 127.859px;\">$[latex]9,771[\/latex]<\/td>\r\n<\/tr>\r\n<tr style=\"height: 14px;\">\r\n<td style=\"text-align: center; height: 14px; width: 112.141px;\"><strong>India<\/strong><\/td>\r\n<td style=\"text-align: center; height: 14px; width: 137.5px;\"><strong>2<\/strong><\/td>\r\n<td style=\"text-align: center; height: 14px; width: 127.859px;\">$[latex]2,016[\/latex]<\/td>\r\n<\/tr>\r\n<tr style=\"height: 14px;\">\r\n<td style=\"text-align: center; height: 14px; width: 112.141px;\"><strong>United States<\/strong><\/td>\r\n<td style=\"text-align: center; height: 14px; width: 137.5px;\"><strong>3<\/strong><\/td>\r\n<td style=\"text-align: center; height: 14px; width: 127.859px;\">$[latex]62,641[\/latex]<\/td>\r\n<\/tr>\r\n<tr style=\"height: 14px;\">\r\n<td style=\"text-align: center; height: 14px; width: 112.141px;\"><strong>Indonesia<\/strong><\/td>\r\n<td style=\"text-align: center; height: 14px; width: 137.5px;\"><strong>4<\/strong><\/td>\r\n<td style=\"text-align: center; height: 14px; width: 127.859px;\">$[latex]3,894[\/latex]<\/td>\r\n<\/tr>\r\n<tr style=\"height: 14px;\">\r\n<td style=\"text-align: center; height: 14px; width: 112.141px;\"><strong>Pakistan<\/strong><\/td>\r\n<td 
style=\"text-align: center; height: 14px; width: 137.5px;\"><strong>5<\/strong><\/td>\r\n<td style=\"text-align: center; height: 14px; width: 127.859px;\">$[latex]1,473[\/latex]<\/td>\r\n<\/tr>\r\n<tr style=\"height: 14px;\">\r\n<td style=\"text-align: center; height: 14px; width: 112.141px;\"><strong>Brazil<\/strong><\/td>\r\n<td style=\"text-align: center; height: 14px; width: 137.5px;\"><strong>6<\/strong><\/td>\r\n<td style=\"text-align: center; height: 14px; width: 127.859px;\">$[latex]8,921[\/latex]<\/td>\r\n<\/tr>\r\n<tr style=\"height: 14px;\">\r\n<td style=\"text-align: center; height: 14px; width: 112.141px;\"><strong>Nigeria<\/strong><\/td>\r\n<td style=\"text-align: center; height: 14px; width: 137.5px;\"><strong>7<\/strong><\/td>\r\n<td style=\"text-align: center; height: 14px; width: 127.859px;\">$[latex]2,028[\/latex]<\/td>\r\n<\/tr>\r\n<tr style=\"height: 14px;\">\r\n<td style=\"text-align: center; height: 14px; width: 112.141px;\"><strong>Bangladesh<\/strong><\/td>\r\n<td style=\"text-align: center; height: 14px; width: 137.5px;\"><strong>8<\/strong><\/td>\r\n<td style=\"text-align: center; height: 14px; width: 127.859px;\">$[latex]1,698[\/latex]<\/td>\r\n<\/tr>\r\n<tr style=\"height: 14px;\">\r\n<td style=\"text-align: center; height: 14px; width: 112.141px;\"><strong>Russia<\/strong><\/td>\r\n<td style=\"text-align: center; height: 14px; width: 137.5px;\"><strong>9<\/strong><\/td>\r\n<td style=\"text-align: center; height: 14px; width: 127.859px;\">$[latex]11,289[\/latex]<\/td>\r\n<\/tr>\r\n<tr style=\"height: 14px;\">\r\n<td style=\"text-align: center; height: 14px; width: 112.141px;\"><strong>Japan<\/strong><\/td>\r\n<td style=\"text-align: center; height: 14px; width: 137.5px;\"><strong>10<\/strong><\/td>\r\n<td style=\"text-align: center; height: 14px; width: 127.859px;\">$[latex]39,287[\/latex]<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\nBefore we begin, take a look at the values in the table. 
Are all the values relatively close or do you see any that seem unusual compared to the others? Use your observations to answer Question 1.\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 1<\/h3>\r\n[ohm_question hide_question_numbers=1]241122[\/ohm_question]\r\n\r\n[reveal-answer q=\"773333\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"773333\"]What do <em>you<\/em> think?[\/hidden-answer]\r\n\r\n<\/div>\r\nDid you identify one or two observations in Question 1 as being unusual? It can be difficult sometimes to decide if a particular value really is an outlier. Keep this thought in mind as you work through this activity. We'll come back to this question again at the end.\r\n<h3 id=\"MinMax\">Minimum and Maximum<\/h3>\r\nTo calculate the minimum and maximum values in a dataset, find the least and the greatest. If the dataset is lengthy, use technology to do the work. Otherwise, just order the values from least to greatest. You'll need this ordering soon to locate the median as well.\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 2<\/h3>\r\n[ohm_question hide_question_numbers=1]241123[\/ohm_question]\r\n\r\n[reveal-answer q=\"55738\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"55738\"]See the data table above for the values.[\/hidden-answer]\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 3<\/h3>\r\n[ohm_question hide_question_numbers=1]241124[\/ohm_question]\r\n\r\n[reveal-answer q=\"462567\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"462567\"]Remember to put them in order first.[\/hidden-answer]\r\n\r\n<\/div>\r\n<h3 id=\"Median\">Median<\/h3>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 4<\/h3>\r\n[ohm_question hide_question_numbers=1]241125[\/ohm_question]\r\n\r\n[reveal-answer q=\"505520\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"505520\"]Recall how to calculate the median of a data set containing an even number of observations. 
See <a href=\"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/chapter\/corequisite-support-activity-4c\/\">Corequisite Support Activity for Interpreting the Mean and Median of a Data Set: 4C<\/a> for a refresher as needed. [\/hidden-answer]\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 5<\/h3>\r\n[ohm_question hide_question_numbers=1]241126[\/ohm_question]\r\n\r\n[reveal-answer q=\"968159\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"968159\"]Do you recall how the median splits the data? What percentile is identified by the median? [\/hidden-answer]\r\n\r\n<\/div>\r\n<h3 id=\"Q1\">First Quartile (Q1)<\/h3>\r\nTo find the first and third quartiles, first list the values that lie below the median and, separately, the values that lie above it. Then take the median of each list.\r\n<div class=\"textbox exercises\">\r\n<h3>Interactive example<\/h3>\r\n<strong>First and Third Quartiles<\/strong> The first quartile and third quartile, also known as Q1 and Q3, can be thought of as the median of the lower half of the data (Q1) and the median of the upper half of the data (Q3).\r\n\r\nLet's say you are given the following set of data values: [latex]2, 4, 5, 7, 8, 10, 11, 13, 14, 19, 20[\/latex]. The median of the set is [latex]10[\/latex] since [latex]10[\/latex] is the middle number in the ordered set. Use this information to answer the following questions.\r\n<p style=\"text-align: center;\">[latex]2\\quad 4\\quad 5\\quad 7\\quad 8\\quad[\/latex] [latex]\\mathbf{10}\\quad[\/latex] [latex]11\\quad 13\\quad 14\\quad 19\\quad 20[\/latex]<\/p>\r\n\r\n<ol>\r\n \t<li>What numbers in the dataset lie\u00a0<strong>below<\/strong> the median?\u00a0What is the median of those values? 
This value will be the first quartile, Q1.\r\n[reveal-answer q=\"166972\"]Show Answer[\/reveal-answer]\r\n[hidden-answer a=\"166972\"]\r\n<ul>\r\n \t<li>[latex]2, 4, 5, 7, 8[\/latex] lie below the median.<\/li>\r\n \t<li>The median of this list is [latex]5[\/latex].<\/li>\r\n \t<li><strong>Q1 is [latex]5[\/latex].<\/strong>[\/hidden-answer]<\/li>\r\n<\/ul>\r\n<\/li>\r\n \t<li>What numbers in the dataset lie\u00a0<strong>above\u00a0<\/strong>the median?\u00a0What is the median of those values? This value will be the third quartile, Q3.\r\n[reveal-answer q=\"987676\"]Show Answer[\/reveal-answer]\r\n[hidden-answer a=\"987676\"]\r\n<ul>\r\n \t<li>[latex]11, 13, 14, 19, 20[\/latex] lie above the median.<\/li>\r\n \t<li>The median of this list is [latex]14[\/latex].<\/li>\r\n \t<li><strong>Q3 is\u00a0[latex]14[\/latex].<\/strong>\u00a0[\/hidden-answer]<\/li>\r\n<\/ul>\r\n<\/li>\r\n<\/ol>\r\n[reveal-answer q=\"591598\"]Summary[\/reveal-answer]\r\n[hidden-answer a=\"591598\"]\r\n<ul>\r\n \t<li>Q1 = 5<\/li>\r\n \t<li>Median = 10<\/li>\r\n \t<li>Q3 = 14<\/li>\r\n<\/ul>\r\n[\/hidden-answer]\r\n\r\n<\/div>\r\nNow you try it with the GDP data.\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 6<\/h3>\r\n[ohm_question hide_question_numbers=1]241127[\/ohm_question]\r\n\r\n[reveal-answer q=\"803946\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"803946\"]Refer to the ordered list of data values you created in Question 2.[\/hidden-answer]\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 7<\/h3>\r\n[ohm_question hide_question_numbers=1]241130[\/ohm_question]\r\n\r\n[reveal-answer q=\"519716\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"519716\"]What do <em>you<\/em> think?[\/hidden-answer]\r\n\r\n<\/div>\r\nWe call this value the <strong>first quartile<\/strong>, and we sometimes denote it as <strong>Q1<\/strong>. It is the median of the values that lie below the median for the whole data set. 
It is also equal to the\u00a0[latex]25[\/latex]<sup>th<\/sup> percentile.\r\n<h3 id=\"Q3\">Third Quartile (Q3)<\/h3>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 8<\/h3>\r\n[ohm_question hide_question_numbers=1]241131[\/ohm_question]\r\n\r\n[reveal-answer q=\"286964\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"286964\"]Refer to the ordered list of data values you created in Question 2.[\/hidden-answer]\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 9<\/h3>\r\n[ohm_question hide_question_numbers=1]241132[\/ohm_question]\r\n\r\n[reveal-answer q=\"198445\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"198445\"]What do <em>you<\/em> think?[\/hidden-answer]\r\n\r\n<\/div>\r\nWe call this value the <strong>third quartile<\/strong>, and we sometimes denote it as <strong>Q3<\/strong>. It is the median of the values that lie above the median for the whole data set. It is also equal to the\u00a0[latex]75[\/latex]<sup>th<\/sup> percentile.\r\n<h3 id=\"Five-Number\">Five-Number Summary<\/h3>\r\nBefore you list the five-number summary for the GDP dataset, think about the word\u00a0<em>quartile<\/em>. We know what the first and third quartiles are. For your answer to Question 10, consider which values might represent the second and fourth quartiles. Hint: Would we need to split the data in another way to find <em>quintiles<\/em>, for example?\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 10<\/h3>\r\n[ohm_question hide_question_numbers=1]241133[\/ohm_question]\r\n\r\n[reveal-answer q=\"277023\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"277023\"]The median splits the data into two pieces (the values above the median and the values below the median). 
Into how many pieces do these numbers split the data?[\/hidden-answer]\r\n\r\n<\/div>\r\nNow you are ready to record the five-number summary for this dataset to answer Question 11.\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 11<\/h3>\r\n[ohm_question hide_question_numbers=1]241135[\/ohm_question]\r\n\r\n[reveal-answer q=\"729681\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"729681\"]Recall the components of the five-number summary and type them in their respective boxes.[\/hidden-answer]\r\n\r\n<\/div>\r\n<h3 id=\"IQR\">Interquartile Range (IQR)<\/h3>\r\n<div class=\"textbox exercises\">\r\n<h3>interactive example<\/h3>\r\n<strong>Interquartile Range\u00a0<\/strong>The interquartile range (sometimes denoted as IQR) is the quantity Q3 - Q1.\r\nRecall the list of values we saw in the interactive example above with a median of [latex]10[\/latex], Q1 of [latex]5[\/latex], and Q3 of [latex]14[\/latex].\r\n<p style=\"text-align: center;\">[latex]2\\quad 4\\quad \\mathbf{5}\\quad 7\\quad 8\\quad[\/latex] [latex]\\mathbf{10}\\quad[\/latex] [latex]11\\quad 13\\quad \\mathbf{14}\\quad 19\\quad 20[\/latex]<\/p>\r\n\r\n<ol>\r\n \t<li style=\"text-align: left;\">Calculate the IQR by finding the difference Q3 - Q1.<\/li>\r\n \t<li>About how much of the data are located within the IQR?<\/li>\r\n<\/ol>\r\n[reveal-answer q=\"208075\"]Show Answer[\/reveal-answer]\r\n[hidden-answer a=\"208075\"]\r\n<ol>\r\n \t<li>The IQR is [latex]9[\/latex].\r\n<ul>\r\n \t<li>[latex]\\text{Q3}-\\text{Q1}=\\text{IQR}[\/latex]<\/li>\r\n \t<li>[latex]14 - 5 = 9[\/latex]<\/li>\r\n<\/ul>\r\n<\/li>\r\n \t<li>About 50% of the data lie within the interquartile range, between Q1 and Q3.<\/li>\r\n<\/ol>\r\n<p style=\"padding-left: 30px;\"><span style=\"font-size: 1rem; text-align: initial;\">[\/hidden-answer]<\/span><\/p>\r\n\r\n<\/div>\r\nNow it's your turn to calculate the IQR of the GDP dataset.\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 12<\/h3>\r\n[ohm_question 
hide_question_numbers=1]241138[\/ohm_question]\r\n\r\n[reveal-answer q=\"928763\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"928763\"]Recall how pieces of data are defined by all four quartiles. How many pieces lie between Q3 and Q1?[\/hidden-answer]\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 13<\/h3>\r\n[ohm_question hide_question_numbers=1]241139[\/ohm_question]\r\n\r\n[reveal-answer q=\"808961\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"808961\"]Recall the numbers you identified as Q1 and Q3 then calculate Q3 - Q1. [\/hidden-answer]\r\n\r\n<\/div>\r\n<h3 id=\"Outlier\">Outliers<\/h3>\r\nSome outliers seem quite simple to spot (such as the GDP per capita of the United States), but others are harder to identify (such as Japan's GDP per capita). If you were to make up a rule for testing whether a value is \"unusual enough\" to be called an outlier, what would it be? Use your rule on the value of Japan's GDP per capita to decide whether or not it is an outlier. What did you decide?\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 14<\/h3>\r\n[ohm_question hide_question_numbers=1]241140[\/ohm_question]\r\n[reveal-answer q=\"394762\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"394762\"]What do <em>you<\/em> think? Apply any rule you feel is reasonable to answer this question.[\/hidden-answer]\r\n\r\n<\/div>\r\nIn the next section, you'll learn about an accepted method of determining whether a data value \"qualifies\" to be an outlier in a skewed distribution like this one. It's called the IQR method and states that if a data value is located more than\u00a0[latex]1.5[\/latex] times the IQR to the left of Q1 or to the right of Q3, then that value is \"unusual enough\" to be called an outlier. It's important to note that, while this method can be used to identify unusual observations in skewed distributions like this one, other methods, which you'll learn about in an upcoming section, are well suited for symmetrical distributions. 
In certain applications, it may be desirable to distinguish between \"mild outliers\" (using\u00a0[latex]1.5[\/latex] times IQR) and \"extreme outliers\" (using\u00a0[latex]3[\/latex] times IQR). We can really set the threshold for \"unusual\" values as far away as we'd like, depending on the application. But\u00a0[latex]1.5[\/latex] times IQR is commonly used, so we'll use it here and in the upcoming section.\r\n\r\nLet's apply the method to Japan's GDP per capita in the interactive example below.\r\n<div class=\"textbox exercises\">\r\n<h3>interactive Example<\/h3>\r\nRecall that Japan's GDP per capita from the data set is $[latex]39,287[\/latex]. We would like to know how unusual this value really is in comparison to the rest of the data values. We'll use the IQR method to make the determination.\r\n\r\nUnder this method, a data value is considered an outlier if it lies more than\u00a0[latex]1.5[\/latex] [latex]\\times[\/latex] (IQR) above Q3 or below Q1. Since\u00a0[latex]39,287[\/latex] is greater than the median, we'll test it to see if it exceeds Q3 +\u00a0[latex]1.5[\/latex] [latex]\\times[\/latex] (IQR). (If it were a very small number, we'd test to see if it were lower than Q1 -\u00a0[latex]1.5[\/latex] [latex]\\times[\/latex] (IQR).)\r\n\r\nRecall, for this data set: Q3 =\u00a0[latex]11,289[\/latex] and IQR =\u00a0[latex]9,273[\/latex].\r\n<p style=\"padding-left: 30px;\">Step 1) Calculate\u00a0[latex]1.5[\/latex] [latex]\\times[\/latex] (IQR).<\/p>\r\n<p style=\"padding-left: 30px;\">Step 2) Calculate Q3 +\u00a0[latex]1.5[\/latex] [latex]\\times[\/latex] (IQR).<\/p>\r\n<p style=\"padding-left: 30px;\">Step 3) Compare Japan's GDP per capita. If it exceeds Q3 +\u00a0[latex]1.5[\/latex] [latex]\\times[\/latex] (IQR), then it is an outlier.<\/p>\r\nWhat did you discover? 
Is Japan's GDP per capita an outlier in the data set?\r\n\r\n[reveal-answer q=\"266546\"]Show Solution[\/reveal-answer]\r\n[hidden-answer a=\"266546\"]\r\n<p style=\"padding-left: 30px;\">Step 1)\u00a0[latex]1.5[\/latex] [latex]\\times[\/latex] ([latex]9273[\/latex]) =\u00a0[latex]13909.50[\/latex]<\/p>\r\n<p style=\"padding-left: 30px;\">Step 2)\u00a0[latex]11289[\/latex] +\u00a0[latex]13909.50[\/latex]\u00a0=\u00a0[latex]25198.50[\/latex]<\/p>\r\n<p style=\"padding-left: 30px;\">Step 3) $[latex]39,287[\/latex] is greater than $[latex]25,198.50[\/latex], so it is an outlier.<\/p>\r\n[\/hidden-answer]\r\n\r\n<\/div>\r\nIn this support activity, you've seen how to calculate the five-number summary and interquartile range (IQR) by hand for a data set, and you've learned about a method to mathematically determine if an observation is an outlier. These make up the features of a\u00a0boxplot. It's time to move on to the next section where you'll use these skills as you explore boxplots for visualizing the distribution of a quantitative variable.","rendered":"<div class=\"textbox learning-objectives\">\n<h3>Learning Goals<\/h3>\n<p>In this support activity you&#8217;ll become familiar with the following:<\/p>\n<ul>\n<li><a href=\"#MinMax\">Identify the minimum and maximum values of a data set.<\/a><\/li>\n<li><a href=\"#Median\">Calculate and interpret the median.<\/a><\/li>\n<li><a href=\"#Q1\">Calculate the first quartile (Q1).<\/a><\/li>\n<li><a href=\"#Q3\">Calculate the third quartile (Q3).<\/a><\/li>\n<li><a href=\"#Five-Number\">List the five-number-summary for a quantitative variable.<\/a><\/li>\n<li><a href=\"#IQR\">Calculate the interquartile range (IQR) for a quantitative variable.<\/a><\/li>\n<li><a href=\"#Outlier\">Determine whether or not a value is an outlier.<\/a><\/li>\n<\/ul>\n<\/div>\n<p>The upcoming section of material and following activity will introduce a new graph for displaying quantitative data called a boxplot. 
The image below shows a boxplot labeled with the five-number-summary and interquartile range.<\/p>\n<p>We&#8217;ll explore boxplots in detail soon. The focus of this support activity is to help you become familiar with the characteristics of a boxplot: minimum and maximum values, median, first quartile, third quartile, and interquartile range.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-3078 size-full\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2021\/12\/11181025\/Screen-Shot-2022-02-11-at-1.09.31-PM.png\" alt=\"A general horizontal boxplot displaying the following features from left to right: lower outliers, minimum, Q1, median, Q3, maximum, and upper outliers. The Interquartile Range (IQR) is shown at the top of the boxplot.\" width=\"575\" height=\"319\" \/><\/p>\n<p>A boxplot is a graphical visualization of a quantitative variable that shows median, spread, skew, and outliers by illustrating a set of numbers called the <strong>five-number summary<\/strong>.\u00a0In the next section of the course material, you will need to be able to relate the features of a boxplot to the data set it comes from. In the following activity, you will need to be able to interpret and compare boxplots. 
Begin to familiarize yourself with boxplots in this corequisite support activity during which you&#8217;ll build up an understanding of the parts of the five-number summary and how to determine whether a data value is &#8220;unusual enough&#8221; to qualify as an outlier.<\/p>\n<p>To introduce this new quantitative graph, we&#8217;ll use a data set that contains the gross domestic product per capita for the\u00a0[latex]10[\/latex] most populous countries.<\/p>\n<h2>GDP of the World\u2019s Most Populous Countries<\/h2>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-1025 aligncenter\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2022\/01\/11223247\/Picture51-300x190.jpg\" alt=\"a semispherical map of the world\" width=\"499\" height=\"316\" \/><\/p>\n<p>The following table lists data for the\u00a0[latex]10[\/latex] most populous countries in 2018, and it includes each country\u2019s population rank (we can see that China had the largest population in 2018, India had the second largest population, and so on) and each country\u2019s gross domestic product (GDP) per capita. <a class=\"footnote\" title=\"Bevins, V. (2020). The Jakarta method: Washington\u2019s anticommunist crusade and the mass murder program that shaped our world. PublicAffairs.\" id=\"return-footnote-45-1\" href=\"#footnote-45-1\" aria-label=\"Footnote 1\"><sup class=\"footnote\">[1]<\/sup><\/a> A country\u2019s GDP is the total monetary value of everything produced in that country over the year. 
The GDP per capita is a country\u2019s GDP divided by its population.<\/p>\n<p>&nbsp;<\/p>\n<div style=\"margin: auto;\">\n<table style=\"height: 154px;\">\n<tbody>\n<tr style=\"height: 14px;\">\n<td style=\"text-align: center; height: 14px; width: 112.141px;\"><strong>Country<\/strong><\/td>\n<td style=\"text-align: center; height: 14px; width: 137.5px;\"><strong>Population Rank<\/strong><\/td>\n<td style=\"text-align: center; height: 14px; width: 127.859px;\"><strong>GDP per Capita<\/strong><\/td>\n<\/tr>\n<tr style=\"height: 14px;\">\n<td style=\"text-align: center; height: 14px; width: 112.141px;\"><strong>China<\/strong><\/td>\n<td style=\"text-align: center; height: 14px; width: 137.5px;\"><strong>1<\/strong><\/td>\n<td style=\"text-align: center; height: 14px; width: 127.859px;\">$[latex]9,771[\/latex]<\/td>\n<\/tr>\n<tr style=\"height: 14px;\">\n<td style=\"text-align: center; height: 14px; width: 112.141px;\"><strong>India<\/strong><\/td>\n<td style=\"text-align: center; height: 14px; width: 137.5px;\"><strong>2<\/strong><\/td>\n<td style=\"text-align: center; height: 14px; width: 127.859px;\">$[latex]2,016[\/latex]<\/td>\n<\/tr>\n<tr style=\"height: 14px;\">\n<td style=\"text-align: center; height: 14px; width: 112.141px;\"><strong>United States<\/strong><\/td>\n<td style=\"text-align: center; height: 14px; width: 137.5px;\"><strong>3<\/strong><\/td>\n<td style=\"text-align: center; height: 14px; width: 127.859px;\">$[latex]62,641[\/latex]<\/td>\n<\/tr>\n<tr style=\"height: 14px;\">\n<td style=\"text-align: center; height: 14px; width: 112.141px;\"><strong>Indonesia<\/strong><\/td>\n<td style=\"text-align: center; height: 14px; width: 137.5px;\"><strong>4<\/strong><\/td>\n<td style=\"text-align: center; height: 14px; width: 127.859px;\">$[latex]3,894[\/latex]<\/td>\n<\/tr>\n<tr style=\"height: 14px;\">\n<td style=\"text-align: center; height: 14px; width: 112.141px;\"><strong>Pakistan<\/strong><\/td>\n<td style=\"text-align: center; height: 14px; width: 
137.5px;\"><strong>5<\/strong><\/td>\n<td style=\"text-align: center; height: 14px; width: 127.859px;\">$[latex]1,473[\/latex]<\/td>\n<\/tr>\n<tr style=\"height: 14px;\">\n<td style=\"text-align: center; height: 14px; width: 112.141px;\"><strong>Brazil<\/strong><\/td>\n<td style=\"text-align: center; height: 14px; width: 137.5px;\"><strong>6<\/strong><\/td>\n<td style=\"text-align: center; height: 14px; width: 127.859px;\">$[latex]8,921[\/latex]<\/td>\n<\/tr>\n<tr style=\"height: 14px;\">\n<td style=\"text-align: center; height: 14px; width: 112.141px;\"><strong>Nigeria<\/strong><\/td>\n<td style=\"text-align: center; height: 14px; width: 137.5px;\"><strong>7<\/strong><\/td>\n<td style=\"text-align: center; height: 14px; width: 127.859px;\">$[latex]2,028[\/latex]<\/td>\n<\/tr>\n<tr style=\"height: 14px;\">\n<td style=\"text-align: center; height: 14px; width: 112.141px;\"><strong>Bangladesh<\/strong><\/td>\n<td style=\"text-align: center; height: 14px; width: 137.5px;\"><strong>8<\/strong><\/td>\n<td style=\"text-align: center; height: 14px; width: 127.859px;\">$[latex]1,698[\/latex]<\/td>\n<\/tr>\n<tr style=\"height: 14px;\">\n<td style=\"text-align: center; height: 14px; width: 112.141px;\"><strong>Russia<\/strong><\/td>\n<td style=\"text-align: center; height: 14px; width: 137.5px;\"><strong>9<\/strong><\/td>\n<td style=\"text-align: center; height: 14px; width: 127.859px;\">$[latex]11,289[\/latex]<\/td>\n<\/tr>\n<tr style=\"height: 14px;\">\n<td style=\"text-align: center; height: 14px; width: 112.141px;\"><strong>Japan<\/strong><\/td>\n<td style=\"text-align: center; height: 14px; width: 137.5px;\"><strong>10<\/strong><\/td>\n<td style=\"text-align: center; height: 14px; width: 127.859px;\">$[latex]39,287[\/latex]<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Before we begin, take a look at the values in the table. Are all the values relatively close or do you see any that seem unusual compared to the others? 
Use your observations to answer Question 1.<\/p>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>question 1<\/h3>\n<p><iframe loading=\"lazy\" id=\"ohm241122\" class=\"resizable\" src=\"https:\/\/ohm.lumenlearning.com\/multiembedq.php?id=241122&theme=oea&iframe_resize_id=ohm241122\" width=\"100%\" height=\"150\"><\/iframe><\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q773333\">Hint<\/span><\/p>\n<div id=\"q773333\" class=\"hidden-answer\" style=\"display: none\">What do <em>you<\/em> think?<\/div>\n<\/div>\n<\/div>\n<p>Did you identify one or two observations in Question 1 as being unusual? It can be difficult sometimes to decide if a particular value really is an outlier. Keep this thought in mind as you work through this activity. We&#8217;ll come back to this question again at the end.<\/p>\n<h3 id=\"MinMax\">Minimum and Maximum<\/h3>\n<p>To calculate the minimum and maximum values in a dataset, find the least and the greatest. If the dataset is lengthy, use technology to do the work. Otherwise, just order the values from least to greatest. 
You&#8217;ll need this ordering soon to locate the median as well.<\/p>\n<div class=\"textbox key-takeaways\">\n<h3>question 2<\/h3>\n<p><iframe loading=\"lazy\" id=\"ohm241123\" class=\"resizable\" src=\"https:\/\/ohm.lumenlearning.com\/multiembedq.php?id=241123&theme=oea&iframe_resize_id=ohm241123\" width=\"100%\" height=\"150\"><\/iframe><\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q55738\">Hint<\/span><\/p>\n<div id=\"q55738\" class=\"hidden-answer\" style=\"display: none\">See the data table above for the values.<\/div>\n<\/div>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>question 3<\/h3>\n<p><iframe loading=\"lazy\" id=\"ohm241124\" class=\"resizable\" src=\"https:\/\/ohm.lumenlearning.com\/multiembedq.php?id=241124&theme=oea&iframe_resize_id=ohm241124\" width=\"100%\" height=\"150\"><\/iframe><\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q462567\">Hint<\/span><\/p>\n<div id=\"q462567\" class=\"hidden-answer\" style=\"display: none\">Remember to put them in order first.<\/div>\n<\/div>\n<\/div>\n<h3 id=\"Median\">Median<\/h3>\n<div class=\"textbox key-takeaways\">\n<h3>question 4<\/h3>\n<p><iframe loading=\"lazy\" id=\"ohm241125\" class=\"resizable\" src=\"https:\/\/ohm.lumenlearning.com\/multiembedq.php?id=241125&theme=oea&iframe_resize_id=ohm241125\" width=\"100%\" height=\"150\"><\/iframe><\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q505520\">Hint<\/span><\/p>\n<div id=\"q505520\" class=\"hidden-answer\" style=\"display: none\">Recall how to calculate the median of a data set containing an even number of observations. 
See <a href=\"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/chapter\/corequisite-support-activity-4c\/\">Corequisite Support Activity for Interpreting the Mean and Median of a Data Set: 4C<\/a> for a refresher as needed. <\/div>\n<\/div>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>question 5<\/h3>\n<p><iframe loading=\"lazy\" id=\"ohm241126\" class=\"resizable\" src=\"https:\/\/ohm.lumenlearning.com\/multiembedq.php?id=241126&theme=oea&iframe_resize_id=ohm241126\" width=\"100%\" height=\"150\"><\/iframe><\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q968159\">Hint<\/span><\/p>\n<div id=\"q968159\" class=\"hidden-answer\" style=\"display: none\">Do you recall how the median splits the data? What percentile is identified by the median? <\/div>\n<\/div>\n<\/div>\n<h3 id=\"Q1\">First Quartile (Q1)<\/h3>\n<p>To find the first and third quartiles, first split the ordered data into two lists: the values that lie below the median and the values that lie above it. Then, take the median of each list.<\/p>\n<div class=\"textbox exercises\">\n<h3>Interactive example<\/h3>\n<p><strong>First and Third Quartiles<\/strong> The first quartile and third quartile, also known as Q1 and Q3, can be thought of as the median of the lower half of the data (Q1) and the median of the upper half of the data (Q3).<\/p>\n<p>Let&#8217;s say you are given the following set of data values: [latex]2, 4, 5, 7, 8, 10, 11, 13, 14, 19, 20[\/latex]. The median of the set is [latex]10[\/latex] since [latex]10[\/latex] is the middle value of the ordered list. Use this information to answer the following questions.<\/p>\n<p style=\"text-align: center;\">[latex]2\\quad 4\\quad 5\\quad 7\\quad 8\\quad[\/latex] [latex]\\mathbf{10}\\quad[\/latex] [latex]11\\quad 13\\quad 14\\quad 19\\quad 20[\/latex]<\/p>\n<ol>\n<li>What numbers in the dataset lie\u00a0<strong>below<\/strong> the median?\u00a0What is the median of those values? 
This value will be the first quartile, Q1.\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q166972\">Show Answer<\/span><\/p>\n<div id=\"q166972\" class=\"hidden-answer\" style=\"display: none\">\n<ul>\n<li>[latex]2, 4, 5, 7, 8[\/latex] lie below the median.<\/li>\n<li>The median of this list is [latex]5[\/latex].<\/li>\n<li><strong>Q1 is [latex]5[\/latex].<\/strong><\/div>\n<\/div>\n<\/li>\n<\/ul>\n<\/li>\n<li>What numbers in the dataset lie\u00a0<strong>above\u00a0<\/strong>the median?\u00a0What is the median of those values? This value will be the third quartile, Q3.\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q987676\">Show Answer<\/span><\/p>\n<div id=\"q987676\" class=\"hidden-answer\" style=\"display: none\">\n<ul>\n<li>[latex]11, 13, 14, 19, 20[\/latex] lie above the median.<\/li>\n<li>The median of this list is [latex]14[\/latex].<\/li>\n<li><strong>Q3 is\u00a0[latex]14[\/latex].<\/strong>\u00a0<\/div>\n<\/div>\n<\/li>\n<\/ul>\n<\/li>\n<\/ol>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q591598\">Summary<\/span><\/p>\n<div id=\"q591598\" class=\"hidden-answer\" style=\"display: none\">\n<ul>\n<li>Q1 = 5<\/li>\n<li>Median = 10<\/li>\n<li>Q3 = 14<\/li>\n<\/ul>\n<\/div>\n<\/div>\n<\/div>\n<p>Now you try it with the GDP data.<\/p>\n<div class=\"textbox key-takeaways\">\n<h3>question 6<\/h3>\n<p><iframe loading=\"lazy\" id=\"ohm241127\" class=\"resizable\" src=\"https:\/\/ohm.lumenlearning.com\/multiembedq.php?id=241127&theme=oea&iframe_resize_id=ohm241127\" width=\"100%\" height=\"150\"><\/iframe><\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q803946\">Hint<\/span><\/p>\n<div id=\"q803946\" class=\"hidden-answer\" 
style=\"display: none\">Refer to the ordered list of data values you created in Question 2.<\/div>\n<\/div>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>question 7<\/h3>\n<p><iframe loading=\"lazy\" id=\"ohm241130\" class=\"resizable\" src=\"https:\/\/ohm.lumenlearning.com\/multiembedq.php?id=241130&theme=oea&iframe_resize_id=ohm241130\" width=\"100%\" height=\"150\"><\/iframe><\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q519716\">Hint<\/span><\/p>\n<div id=\"q519716\" class=\"hidden-answer\" style=\"display: none\">What do <em>you<\/em> think?<\/div>\n<\/div>\n<\/div>\n<p>We call this value the <strong>first quartile<\/strong>, and we sometimes denote it as <strong>Q1<\/strong>. It is the median of the values that lie below the median for the whole data set. It is also equal to the\u00a0[latex]25[\/latex]<sup>th<\/sup> percentile.<\/p>\n<h3 id=\"Q3\">Third Quartile (Q3)<\/h3>\n<div class=\"textbox key-takeaways\">\n<h3>question 8<\/h3>\n<p><iframe loading=\"lazy\" id=\"ohm241131\" class=\"resizable\" src=\"https:\/\/ohm.lumenlearning.com\/multiembedq.php?id=241131&theme=oea&iframe_resize_id=ohm241131\" width=\"100%\" height=\"150\"><\/iframe><\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q286964\">Hint<\/span><\/p>\n<div id=\"q286964\" class=\"hidden-answer\" style=\"display: none\">Refer to the ordered list of data values you created in Question 2<\/div>\n<\/div>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>question 9<\/h3>\n<p><iframe loading=\"lazy\" id=\"ohm241132\" class=\"resizable\" src=\"https:\/\/ohm.lumenlearning.com\/multiembedq.php?id=241132&theme=oea&iframe_resize_id=ohm241132\" width=\"100%\" height=\"150\"><\/iframe><\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" 
data-target=\"q198445\">Hint<\/span><\/p>\n<div id=\"q198445\" class=\"hidden-answer\" style=\"display: none\">What do <em>you<\/em> think?<\/div>\n<\/div>\n<\/div>\n<p>We call this value the <strong>third quartile<\/strong>, and we sometimes denote it as <strong>Q3<\/strong>. It is the median of the values that lie above the median for the whole data set. It is also equal to the\u00a0[latex]75[\/latex]<sup>th<\/sup> percentile.<\/p>\n<h3 id=\"Five-Number\">Five-Number Summary<\/h3>\n<p>Before you list the five-number summary for the GDP dataset, think about the word\u00a0<em>quartile<\/em>. We know what the first and third quartiles are. For your answer to Question 10, consider which values might represent the second and fourth quartiles. Hint: Would we need to split the data in another way to find <em>quintiles<\/em>, for example?<\/p>\n<div class=\"textbox key-takeaways\">\n<h3>question 10<\/h3>\n<p><iframe loading=\"lazy\" id=\"ohm241133\" class=\"resizable\" src=\"https:\/\/ohm.lumenlearning.com\/multiembedq.php?id=241133&theme=oea&iframe_resize_id=ohm241133\" width=\"100%\" height=\"150\"><\/iframe><\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q277023\">Hint<\/span><\/p>\n<div id=\"q277023\" class=\"hidden-answer\" style=\"display: none\">The median splits the data into two pieces (the values above the median and the values below the median). 
Into how many pieces do these numbers split the data?<\/div>\n<\/div>\n<\/div>\n<p>Now you are ready to record the five-number summary for this dataset to answer Question 11.<\/p>\n<div class=\"textbox key-takeaways\">\n<h3>question 11<\/h3>\n<p><iframe loading=\"lazy\" id=\"ohm241135\" class=\"resizable\" src=\"https:\/\/ohm.lumenlearning.com\/multiembedq.php?id=241135&theme=oea&iframe_resize_id=ohm241135\" width=\"100%\" height=\"150\"><\/iframe><\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q729681\">Hint<\/span><\/p>\n<div id=\"q729681\" class=\"hidden-answer\" style=\"display: none\">Recall the components of the five-number summary and type them in their respective boxes.<\/div>\n<\/div>\n<\/div>\n<h3 id=\"IQR\">Interquartile Range (IQR)<\/h3>\n<div class=\"textbox exercises\">\n<h3>interactive example<\/h3>\n<p><strong>Interquartile Range\u00a0<\/strong>The interquartile range (sometimes denoted as IQR) is the quantity Q3 &#8211; Q1.<br \/>\nRecall the list of values we saw in the interactive example above with a median of [latex]10[\/latex], Q1 of [latex]5[\/latex], and Q3 of [latex]14[\/latex].<\/p>\n<p style=\"text-align: center;\">[latex]2\\quad 4\\quad \\mathbf{5}\\quad 7\\quad 8\\quad[\/latex] [latex]\\mathbf{10}\\quad[\/latex] [latex]11\\quad 13\\quad \\mathbf{14}\\quad 19\\quad 20[\/latex]<\/p>\n<ol>\n<li style=\"text-align: left;\">Calculate the IQR by finding the difference Q3 &#8211; Q1.<\/li>\n<li>About how much of the data are located within the IQR?<\/li>\n<\/ol>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q208075\">Show Answer<\/span><\/p>\n<div id=\"q208075\" class=\"hidden-answer\" style=\"display: none\">\n<ol>\n<li>The IQR is [latex]9[\/latex]\n<ul>\n<li>[latex]\\text{Q3}-\\text{Q1}=\\text{IQR}[\/latex]<\/li>\n<li>[latex]14 - 5 = 9[\/latex]<\/li>\n<\/ul>\n<\/li>\n<li>About 
50% of the data lie within the interquartile range between Q3 and Q1.<\/li>\n<\/ol>\n<p style=\"padding-left: 30px;\"><span style=\"font-size: 1rem; text-align: initial;\"><\/div>\n<\/div>\n<p><\/span><\/p>\n<\/div>\n<p>Now it&#8217;s your turn to calculate the IQR of the GDP dataset.<\/p>\n<div class=\"textbox key-takeaways\">\n<h3>question 12<\/h3>\n<p><iframe loading=\"lazy\" id=\"ohm241138\" class=\"resizable\" src=\"https:\/\/ohm.lumenlearning.com\/multiembedq.php?id=241138&theme=oea&iframe_resize_id=ohm241138\" width=\"100%\" height=\"150\"><\/iframe><\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q928763\">Hint<\/span><\/p>\n<div id=\"q928763\" class=\"hidden-answer\" style=\"display: none\">Recall how pieces of data are defined by all four quartiles. How many pieces lie between Q3 and Q1?<\/div>\n<\/div>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>question 13<\/h3>\n<p><iframe loading=\"lazy\" id=\"ohm241139\" class=\"resizable\" src=\"https:\/\/ohm.lumenlearning.com\/multiembedq.php?id=241139&theme=oea&iframe_resize_id=ohm241139\" width=\"100%\" height=\"150\"><\/iframe><\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q808961\">Hint<\/span><\/p>\n<div id=\"q808961\" class=\"hidden-answer\" style=\"display: none\">Recall the numbers you identified as Q1 and Q3 then calculate Q3 &#8211; Q1. <\/div>\n<\/div>\n<\/div>\n<h3 id=\"Outlier\">Outliers<\/h3>\n<p>Some outliers seem quite simple to spot (such as the GDP per capita of the United States), but others are harder to identify (such as Japan&#8217;s GDP per capita). If you were to make up a rule for testing whether a value is &#8220;unusual enough&#8221; to be called an outlier, what would it be? Use your rule on the value of Japan&#8217;s GDP per capita to decide whether or not it is an outlier. 
What did you decide?<\/p>\n<div class=\"textbox key-takeaways\">\n<h3>question 14<\/h3>\n<p><iframe loading=\"lazy\" id=\"ohm241140\" class=\"resizable\" src=\"https:\/\/ohm.lumenlearning.com\/multiembedq.php?id=241140&theme=oea&iframe_resize_id=ohm241140\" width=\"100%\" height=\"150\"><\/iframe><\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q394762\">Hint<\/span><\/p>\n<div id=\"q394762\" class=\"hidden-answer\" style=\"display: none\">What do <em>you<\/em> think? Apply any rule you feel is reasonable to answer this question.<\/div>\n<\/div>\n<\/div>\n<p>In the next section, you&#8217;ll learn about an accepted method of determining whether a data value &#8220;qualifies&#8221; to be an outlier in a skewed distribution like this one. It&#8217;s called the IQR method and states that if a data value is located more than\u00a0[latex]1.5[\/latex] times the IQR to the left of Q1 or to the right of Q3, then that value is &#8220;unusual enough&#8221; to be called an outlier. It&#8217;s important to note that, while this method can be used to identify unusual observations in skewed distributions like this one, other methods, which you&#8217;ll learn about in an upcoming section, are well suited for symmetrical distributions. In certain applications, it may be desirable to distinguish between &#8220;mild outliers&#8221; (using\u00a0[latex]1.5[\/latex] times IQR) and &#8220;extreme outliers&#8221; (using\u00a0[latex]3[\/latex] times IQR). We can really set the threshold for &#8220;unusual&#8221; values as far away as we&#8217;d like, depending on the application. 
But\u00a0[latex]1.5[\/latex] times IQR is commonly used, so we&#8217;ll use it here and in the upcoming section.<\/p>\n<p>Let&#8217;s apply the method to Japan&#8217;s GDP per capita in the interactive example below.<\/p>\n<div class=\"textbox exercises\">\n<h3>interactive example<\/h3>\n<p>Recall that Japan&#8217;s GDP per capita from the data set is $[latex]39,287[\/latex]. We would like to know how unusual this value really is in comparison to the rest of the data values. We&#8217;ll use the IQR method to make the determination.<\/p>\n<p>Under this method, a data value is considered an outlier if it lies more than\u00a0[latex]1.5[\/latex] [latex]\\times[\/latex] (IQR) above Q3 or below Q1. Since\u00a0[latex]39,287[\/latex] is greater than the median, we&#8217;ll test it to see if it exceeds Q3 +\u00a0[latex]1.5[\/latex] [latex]\\times[\/latex] (IQR). (If it were a very small number, we&#8217;d test to see if it were lower than Q1 &#8211;\u00a0[latex]1.5[\/latex] [latex]\\times[\/latex] (IQR).)<\/p>\n<p>Recall, for this data set: Q3 =\u00a0[latex]11,289[\/latex] and IQR =\u00a0[latex]9,273[\/latex].<\/p>\n<p style=\"padding-left: 30px;\">Step 1) Calculate\u00a0[latex]1.5[\/latex] [latex]\\times[\/latex] (IQR).<\/p>\n<p style=\"padding-left: 30px;\">Step 2) Calculate Q3 +\u00a0[latex]1.5[\/latex] [latex]\\times[\/latex] (IQR).<\/p>\n<p style=\"padding-left: 30px;\">Step 3) Compare Japan&#8217;s GDP per capita. If it exceeds Q3 +\u00a0[latex]1.5[\/latex] [latex]\\times[\/latex] (IQR), then it is an outlier.<\/p>\n<p>What did you discover? 
Is Japan&#8217;s GDP per capita an outlier in the data set?<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q266546\">Show Solution<\/span><\/p>\n<div id=\"q266546\" class=\"hidden-answer\" style=\"display: none\">\n<p style=\"padding-left: 30px;\">Step 1)\u00a0[latex]1.5[\/latex] [latex]\\times[\/latex] ([latex]9273[\/latex]) =\u00a0[latex]13909.50[\/latex]<\/p>\n<p style=\"padding-left: 30px;\">Step 2)\u00a0[latex]11289[\/latex] +\u00a0[latex]13909.50[\/latex]\u00a0=\u00a0[latex]25198.50[\/latex]<\/p>\n<p style=\"padding-left: 30px;\">Step 3) $[latex]39,287[\/latex] is greater than $[latex]25,198.50[\/latex], therefore it is an outlier.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<p>In this support activity, you&#8217;ve seen how to calculate the five-number summary and interquartile range (IQR) by hand for a data set, and you&#8217;ve learned about a method to mathematically determine if an observation is an outlier. These make up the features of a boxplot. It&#8217;s time to move on to the next section where you&#8217;ll use these skills as you explore boxplots for visualizing the distribution of a quantitative variable.<\/p>\n<hr class=\"before-footnotes clear\" \/><div class=\"footnotes\"><ol><li id=\"footnote-45-1\"> Bevins, V. (2020). The Jakarta method: Washington\u2019s anticommunist crusade and the mass murder program that shaped our world. PublicAffairs.  
<a href=\"#return-footnote-45-1\" class=\"return-footnote\" aria-label=\"Return to footnote 1\">&crarr;<\/a><\/li><\/ol><\/div>","protected":false},"author":17533,"menu_order":39,"template":"","meta":{"_candela_citation":"[]","CANDELA_OUTCOMES_GUID":"","pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[],"contributor":[],"license":[],"class_list":["post-45","chapter","type-chapter","status-publish","hentry"],"part":20,"_links":{"self":[{"href":"https:\/\/courses.lumenlearning.com\/alphamodule\/wp-json\/pressbooks\/v2\/chapters\/45","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/courses.lumenlearning.com\/alphamodule\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/courses.lumenlearning.com\/alphamodule\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/alphamodule\/wp-json\/wp\/v2\/users\/17533"}],"version-history":[{"count":1,"href":"https:\/\/courses.lumenlearning.com\/alphamodule\/wp-json\/pressbooks\/v2\/chapters\/45\/revisions"}],"predecessor-version":[{"id":499,"href":"https:\/\/courses.lumenlearning.com\/alphamodule\/wp-json\/pressbooks\/v2\/chapters\/45\/revisions\/499"}],"part":[{"href":"https:\/\/courses.lumenlearning.com\/alphamodule\/wp-json\/pressbooks\/v2\/parts\/20"}],"metadata":[{"href":"https:\/\/courses.lumenlearning.com\/alphamodule\/wp-json\/pressbooks\/v2\/chapters\/45\/metadata\/"}],"wp:attachment":[{"href":"https:\/\/courses.lumenlearning.com\/alphamodule\/wp-json\/wp\/v2\/media?parent=45"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/alphamodule\/wp-json\/pressbooks\/v2\/chapter-type?post=45"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/alphamodule\/wp-json\/wp\/v2\/contributor?post=45"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/alphamodule\/w
p-json\/wp\/v2\/license?post=45"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}