{"id":490,"date":"2021-12-20T14:49:46","date_gmt":"2021-12-20T14:49:46","guid":{"rendered":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/?post_type=chapter&#038;p=490"},"modified":"2022-02-17T20:13:13","modified_gmt":"2022-02-17T20:13:13","slug":"corequisite-support-activity-4d","status":"publish","type":"chapter","link":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/chapter\/corequisite-support-activity-4d\/","title":{"raw":"Corequisite Support Activity for Five Number Summary in Box Plots and Datasets: 4D - 25","rendered":"Corequisite Support Activity for Five Number Summary in Box Plots and Datasets: 4D &#8211; 25"},"content":{"raw":"<div class=\"textbox learning-objectives\">\r\n<h3>What you'll need to know<\/h3>\r\nIn this support activity you'll become familiar with the following:\r\n<ul>\r\n \t<li style=\"list-style-type: none;\">\r\n<ul>\r\n \t<li><a href=\"#MinMax\">Identify the minimum and maximum values of a dataset.<\/a><\/li>\r\n \t<li><a href=\"#Median\">Calculate and interpret the median.<\/a><\/li>\r\n \t<li><a href=\"#Q1\">Calculate the first quartile (Q1).<\/a><\/li>\r\n \t<li><a href=\"#Q3\">Calculate the third quartile (Q3).<\/a><\/li>\r\n \t<li><a href=\"#Five-Number\">List the five-number-summary for a quantitative variable.<\/a><\/li>\r\n \t<li><a href=\"#IQR\">Calculate the interquartile range (IQR) for a quantitative variable.<\/a><\/li>\r\n \t<li><a href=\"#Outlier\">Determine whether or not a value is an outlier.<\/a><\/li>\r\n<\/ul>\r\n<\/li>\r\n<\/ul>\r\n<\/div>\r\nThe upcoming section of material and following activity will introduce a new graph for displaying quantitative data called a boxplot. The image below shows a boxplot labeled with the five-number-summary and interquartile range. We'll explore boxplots in detail soon. 
The focus of this support activity is to help you become familiar with the characteristics of a boxplot: minimum and maximum values, median, first quartile, third quartile, and interquartile range.\r\n\r\n<img class=\"aligncenter wp-image-3078 size-full\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2021\/12\/11181025\/Screen-Shot-2022-02-11-at-1.09.31-PM.png\" alt=\"A general horizontal boxplot displaying the following features from left to right: lower outliers, minimum, Q1, median, Q3, maximum, and upper outliers. The Interquartile Range (IQR) is shown at the top of the boxplot.\" width=\"575\" height=\"319\" \/>\r\n\r\nA boxplot is a graphical visualization of a quantitative variable that shows median, spread, skew, and outliers by illustrating a set of numbers called the <strong>five-number summary<\/strong>.\u00a0In the next section of the course material, you will need to be able to relate the features of a boxplot to the dataset it comes from. In the following activity, you will need to be able to interpret and compare boxplots. 
Begin to familiarize yourself with boxplots in this corequisite support activity during which you'll build up an understanding of the parts of the five-number summary and how to determine whether a data value is \"unusual enough\" to qualify as an outlier.\r\n\r\nTo introduce this new quantitative graph, we'll use a dataset that contains the gross domestic product per capita for the 10 most populous countries.\r\n<h2>GDP of the World\u2019s Most Populous Countries<\/h2>\r\n<img class=\"wp-image-1025 aligncenter\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2022\/01\/11223247\/Picture51-300x190.jpg\" alt=\"a semispherical map of the world\" width=\"499\" height=\"316\" \/>\r\n\r\nThe following table lists data for the 10 most populous countries in 2018, and it includes each country\u2019s population rank (we can see that China had the largest population in 2018, India had the second largest population, and so on) and each country\u2019s gross domestic product (GDP) per capita. [footnote] Bevins, V. (2020). The Jakarta method: Washington\u2019s anticommunist crusade and the mass murder program that shaped our world. PublicAffairs. [\/footnote] A country\u2019s GDP is the total monetary value of everything produced in that country over the year. 
The GDP per capita is a country\u2019s GDP divided by its population.\r\n\r\n&nbsp;\r\n<div align=\"center\">\r\n<table>\r\n<tbody>\r\n<tr>\r\n<td style=\"text-align: center;\"><strong>Country<\/strong><\/td>\r\n<td style=\"text-align: center;\"><strong>Population Rank<\/strong><\/td>\r\n<td style=\"text-align: center;\"><strong>GDP per Capita<\/strong><\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center;\"><strong>China<\/strong><\/td>\r\n<td style=\"text-align: center;\">1<\/td>\r\n<td style=\"text-align: center;\">$9,771<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center;\"><strong>India<\/strong><\/td>\r\n<td style=\"text-align: center;\">2<\/td>\r\n<td style=\"text-align: center;\">$2,016<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center;\"><strong>United States<\/strong><\/td>\r\n<td style=\"text-align: center;\">3<\/td>\r\n<td style=\"text-align: center;\">$62,641<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center;\"><strong>Indonesia<\/strong><\/td>\r\n<td style=\"text-align: center;\">4<\/td>\r\n<td style=\"text-align: center;\">$3,894<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center;\"><strong>Pakistan<\/strong><\/td>\r\n<td style=\"text-align: center;\">5<\/td>\r\n<td style=\"text-align: center;\">$1,473<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center;\"><strong>Brazil<\/strong><\/td>\r\n<td style=\"text-align: center;\">6<\/td>\r\n<td style=\"text-align: center;\">$8,921<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center;\"><strong>Nigeria<\/strong><\/td>\r\n<td style=\"text-align: center;\">7<\/td>\r\n<td style=\"text-align: center;\">$2,028<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center;\"><strong>Bangladesh<\/strong><\/td>\r\n<td style=\"text-align: center;\">8<\/td>\r\n<td style=\"text-align: center;\">$1,698<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center;\"><strong>Russia<\/strong><\/td>\r\n<td style=\"text-align: center;\">9<\/td>\r\n<td style=\"text-align: 
center;\">$11,289<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center;\"><strong>Japan<\/strong><\/td>\r\n<td style=\"text-align: center;\">10<\/td>\r\n<td style=\"text-align: center;\">$39,287<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 1<\/h3>\r\nAre there any observations that seem unusual compared to the other entries?\r\n\r\n[reveal-answer q=\"773333\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"773333\"]What do <em>you<\/em> think?[\/hidden-answer]\r\n\r\n<\/div>\r\nDid you identify one or two observations in Question 1 as being unusual? It can be difficult sometimes to decide if a particular value really is an outlier. Keep this thought in mind as you work through this activity. We'll come back to this question again at the end.\r\n<h3 id=\"MinMax\">Identify minimum and maximum data values<\/h3>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 2<\/h3>\r\nList the 10 most populous countries\u2019 GDP per capita in 2018 from least to greatest.\r\n\r\n[reveal-answer q=\"55738\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"55738\"]See the data table above for the values.[\/hidden-answer]\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 3<\/h3>\r\nWhat are the minimum and maximum values of this dataset?\r\n\r\n[reveal-answer q=\"462567\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"462567\"]Remember to put them in order first.[\/hidden-answer]\r\n\r\n<\/div>\r\n<h3 id=\"Median\">Calculate and interpret the median<\/h3>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 4<\/h3>\r\nIn 2018, what was the median GDP per capita among the 10 most populous countries?\r\n\r\n[reveal-answer q=\"505520\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"505520\"]Recall how to calculate the median of a dataset containing an even number of observations. 
See <a href=\"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/chapter\/corequisite-support-activity-4c\/\">Corequisite Support Activity for Interpreting the Mean and Median of a Dataset: 4C<\/a> for a refresher as needed. [\/hidden-answer]\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 5<\/h3>\r\nInterpret the median.\r\n\r\n[reveal-answer q=\"968159\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"968159\"]Do you recall how the median splits the data? What percentile is identified by the median? [\/hidden-answer]\r\n\r\n<\/div>\r\n<h3 id=\"Q1\">Calculate the first quartile (Q1)<\/h3>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 6<\/h3>\r\nWhat values lie below the median? List them in order from least to greatest.\r\n\r\n[reveal-answer q=\"803946\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"803946\"]Refer to the ordered list of data values you created in Question 2.[\/hidden-answer]\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 7<\/h3>\r\nWhat is the median of the list you generated in Question 6 above?\r\n\r\n[reveal-answer q=\"519716\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"519716\"]What do <em>you<\/em> think?[\/hidden-answer]\r\n\r\n<\/div>\r\nWe call this value the <strong>first quartile<\/strong>, and we sometimes denote it as <strong>Q1<\/strong>. It is the median of the values that lie below the median for the whole dataset. 
It is also equal to the 25th percentile.\r\n<h3 id=\"Q3\">Calculate the third quartile (Q3)<\/h3>\r\nDetermine the values that lie above the median for the whole dataset.\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 8<\/h3>\r\nList the values above the median in order from least to greatest.\r\n\r\n[reveal-answer q=\"286964\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"286964\"]Refer to the ordered list of data values you created in Question 2.[\/hidden-answer]\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 9<\/h3>\r\nWhat is the median of the list above?\r\n\r\n[reveal-answer q=\"198445\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"198445\"]What do <em>you<\/em> think?[\/hidden-answer]\r\n\r\n<\/div>\r\nWe call this value the <strong>third quartile<\/strong>, and we sometimes denote it as <strong>Q3<\/strong>. It is the median of the values that lie above the median for the whole dataset. It is also equal to the 75th percentile.\r\n<h3 id=\"Five-Number\">List the five-number summary<\/h3>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 10<\/h3>\r\nWe have identified the first quartile (Q1) and the third quartile (Q3). What do you think the second and fourth quartiles are? Why do you think we call these values \u201cquartiles?\u201d\r\n\r\n[reveal-answer q=\"277023\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"277023\"]The median splits the data into two pieces (the values above the median and the values below the median). Into how many pieces do these numbers split the data?[\/hidden-answer]\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 11<\/h3>\r\nThe collection of the minimum, first quartile, median, third quartile, and maximum form the five-number summary of the data. 
Record the five-number summary for this dataset.\r\n\r\n[reveal-answer q=\"729681\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"729681\"]List them in a row, separated by commas.[\/hidden-answer]\r\n\r\n<\/div>\r\n<h3 id=\"IQR\">Calculate the interquartile range (IQR)<\/h3>\r\nThe interquartile range (sometimes denoted as IQR) is the quantity Q3 - Q1.\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 12<\/h3>\r\nAbout how much of the data lie between Q3 and Q1?\r\n\r\n[reveal-answer q=\"928763\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"928763\"]Recall how pieces of data are defined by all four quartiles. How many pieces lie between Q3 and Q1?[\/hidden-answer]\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 13<\/h3>\r\nWhat is the IQR for the GDP per capita of the 10 most populous countries in 2018?\r\n\r\n[reveal-answer q=\"808961\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"808961\"]Recall the numbers you identified as Q1 and Q3 then calculate Q3 - Q1. [\/hidden-answer]\r\n\r\n<\/div>\r\n<h3 id=\"Outlier\">Determine if a value is an outlier<\/h3>\r\nSome outliers seem quite simple to spot (such as the GDP per capita of the United States), but others are harder to identify (such as Japan's GDP per capita). If you were to make up a rule for testing whether a value is \"unusual enough\" to be called an outlier, what would it be? Use your rule on the value of Japan's GDP per capita to decide whether or not it is an outlier. What did you decide?\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 14<\/h3>\r\nReturn to your answer to Question 1. How can you decide whether an entry is unusual or not? In other words, which entries did you decide are outliers in this dataset? Explain your reasoning.\r\n[reveal-answer q=\"394762\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"394762\"]What do <em>you<\/em> think? 
Apply any rule you feel is reasonable to answer this question.[\/hidden-answer]\r\n\r\n<\/div>\r\nIn the next section, you'll learn about an accepted method of determining whether a data value \"qualifies\" to be an outlier in a skewed distribution like this one. It's called the IQR method and states that if a data value is located more than 1.5 times the IQR to the left of Q1 or to the right of Q3, then that value is \"unusual enough\" to be called an outlier. It's important to note that, while this method can be used to identify unusual observations in skewed distributions like this one, other methods, which you'll learn about in an upcoming section, are well suited for symmetrical distributions. In certain applications, it may be desirable to distinguish between \"mild outliers\" (using 1.5 times IQR) and \"extreme outliers\" (using 3 times IQR). We can really set the threshold for \"unusual\" values as far away as we'd like, depending on the application. But 1.5 times IQR is commonly used, so we'll use it here and in the upcoming section.\r\n\r\nLet's apply the method to Japan's GDP per capita in the interactive example below.\r\n<div class=\"textbox exercises\">\r\n<h3>Interactive example<\/h3>\r\nRecall that Japan's GDP per capita from the dataset is $39,287. We would like to know how unusual this value really is in comparison to the rest of the data values. We'll use the IQR method to make the determination.\r\n\r\nUnder this method, a data value is considered an outlier if it lies more than 1.5 [latex]\\times[\/latex] (IQR) above Q3 or below Q1. Since 39,287 is greater than the median, we'll test it to see if it exceeds Q3 + 1.5 [latex]\\times[\/latex] (IQR). 
(If it were a very small number, we'd test to see if it were lower than Q1 - 1.5 [latex]\\times[\/latex] (IQR).)\r\n\r\nRecall, for this dataset: Q3 = 11,289 and IQR = 9,273.\r\n<p style=\"padding-left: 30px;\">Step 1) Calculate 1.5 [latex]\\times[\/latex] (IQR).<\/p>\r\n<p style=\"padding-left: 30px;\">Step 2) Calculate Q3 + 1.5 [latex]\\times[\/latex] (IQR).<\/p>\r\n<p style=\"padding-left: 30px;\">Step 3) Compare Japan's GDP per capita to the result of Step 2. If it exceeds Q3 + 1.5 [latex]\\times[\/latex] (IQR), then it is an outlier.<\/p>\r\nWhat did you discover? Is Japan's GDP per capita an outlier in the dataset?\r\n\r\n[reveal-answer q=\"266546\"]Show Answer[\/reveal-answer]\r\n[hidden-answer a=\"266546\"]\r\n<p style=\"padding-left: 30px;\">Step 1) 1.5 [latex]\\times[\/latex] 9,273 = 13,909.50<\/p>\r\n<p style=\"padding-left: 30px;\">Step 2) 11,289 + 13,909.50 = 25,198.50<\/p>\r\n<p style=\"padding-left: 30px;\">Step 3) $39,287 is greater than $25,198.50; therefore, it is an outlier.<\/p>\r\n[\/hidden-answer]\r\n\r\n<\/div>\r\nIn this support activity, you've seen how to calculate the five-number summary and interquartile range (IQR) by hand for a dataset, and you've learned about a method to mathematically determine if an observation is an outlier. These make up the features of a boxplot. 
It's time to move on to the next section where you'll use these skills as you explore boxplots for visualizing the distribution of a quantitative variable.","rendered":"<div class=\"textbox learning-objectives\">\n<h3>What you&#8217;ll need to know<\/h3>\n<p>In this support activity you&#8217;ll become familiar with the following:<\/p>\n<ul>\n<li style=\"list-style-type: none;\">\n<ul>\n<li><a href=\"#MinMax\">Identify the minimum and maximum values of a dataset.<\/a><\/li>\n<li><a href=\"#Median\">Calculate and interpret the median.<\/a><\/li>\n<li><a href=\"#Q1\">Calculate the first quartile (Q1).<\/a><\/li>\n<li><a href=\"#Q3\">Calculate the third quartile (Q3).<\/a><\/li>\n<li><a href=\"#Five-Number\">List the five-number-summary for a quantitative variable.<\/a><\/li>\n<li><a href=\"#IQR\">Calculate the interquartile range (IQR) for a quantitative variable.<\/a><\/li>\n<li><a href=\"#Outlier\">Determine whether or not a value is an outlier.<\/a><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/div>\n<p>The upcoming section of material and following activity will introduce a new graph for displaying quantitative data called a boxplot. The image below shows a boxplot labeled with the five-number-summary and interquartile range. We&#8217;ll explore boxplots in detail soon. The focus of this support activity is to help you become familiar with the characteristics of a boxplot: minimum and maximum values, median, first quartile, third quartile, and interquartile range.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-3078 size-full\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2021\/12\/11181025\/Screen-Shot-2022-02-11-at-1.09.31-PM.png\" alt=\"A general horizontal boxplot displaying the following features from left to right: lower outliers, minimum, Q1, median, Q3, maximum, and upper outliers. 
The Interquartile Range (IQR) is shown at the top of the boxplot.\" width=\"575\" height=\"319\" \/><\/p>\n<p>A boxplot is a graphical visualization of a quantitative variable that shows median, spread, skew, and outliers by illustrating a set of numbers called the <strong>five-number summary<\/strong>.\u00a0In the next section of the course material, you will need to be able to relate the features of a boxplot to the dataset it comes from. In the following activity, you will need to be able to interpret and compare boxplots. Begin to familiarize yourself with boxplots in this corequisite support activity during which you&#8217;ll build up an understanding of the parts of the five-number summary and how to determine whether a data value is &#8220;unusual enough&#8221; to qualify as an outlier.<\/p>\n<p>To introduce this new quantitative graph, we&#8217;ll use a dataset that contains the gross domestic product per capita for the 10 most populous countries.<\/p>\n<h2>GDP of the World\u2019s Most Populous Countries<\/h2>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-1025 aligncenter\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2022\/01\/11223247\/Picture51-300x190.jpg\" alt=\"a semispherical map of the world\" width=\"499\" height=\"316\" \/><\/p>\n<p>The following table lists data for the 10 most populous countries in 2018, and it includes each country\u2019s population rank (we can see that China had the largest population in 2018, India had the second largest population, and so on) and each country\u2019s gross domestic product (GDP) per capita. <a class=\"footnote\" title=\"Bevins, V. (2020). The Jakarta method: Washington\u2019s anticommunist crusade and the mass murder program that shaped our world. 
PublicAffairs.\" id=\"return-footnote-490-1\" href=\"#footnote-490-1\" aria-label=\"Footnote 1\"><sup class=\"footnote\">[1]<\/sup><\/a> A country\u2019s GDP is the total monetary value of everything produced in that country over the year. The GDP per capita is a country\u2019s GDP divided by its population.<\/p>\n<p>&nbsp;<\/p>\n<div style=\"margin: auto;\">\n<table>\n<tbody>\n<tr>\n<td style=\"text-align: center;\"><strong>Country<\/strong><\/td>\n<td style=\"text-align: center;\"><strong>Population Rank<\/strong><\/td>\n<td style=\"text-align: center;\"><strong>GDP per Capita<\/strong><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><strong>China<\/strong><\/td>\n<td style=\"text-align: center;\">1<\/td>\n<td style=\"text-align: center;\">$9,771<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><strong>India<\/strong><\/td>\n<td style=\"text-align: center;\">2<\/td>\n<td style=\"text-align: center;\">$2,016<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><strong>United States<\/strong><\/td>\n<td style=\"text-align: center;\">3<\/td>\n<td style=\"text-align: center;\">$62,641<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><strong>Indonesia<\/strong><\/td>\n<td style=\"text-align: center;\">4<\/td>\n<td style=\"text-align: center;\">$3,894<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><strong>Pakistan<\/strong><\/td>\n<td style=\"text-align: center;\">5<\/td>\n<td style=\"text-align: center;\">$1,473<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><strong>Brazil<\/strong><\/td>\n<td style=\"text-align: center;\">6<\/td>\n<td style=\"text-align: center;\">$8,921<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><strong>Nigeria<\/strong><\/td>\n<td style=\"text-align: center;\">7<\/td>\n<td style=\"text-align: center;\">$2,028<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><strong>Bangladesh<\/strong><\/td>\n<td style=\"text-align: center;\">8<\/td>\n<td style=\"text-align: 
center;\">$1,698<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><strong>Russia<\/strong><\/td>\n<td style=\"text-align: center;\">9<\/td>\n<td style=\"text-align: center;\">$11,289<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><strong>Japan<\/strong><\/td>\n<td style=\"text-align: center;\">10<\/td>\n<td style=\"text-align: center;\">$39,287<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>question 1<\/h3>\n<p>Are there any observations that seem unusual compared to the other entries?<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q773333\">Hint<\/span><\/p>\n<div id=\"q773333\" class=\"hidden-answer\" style=\"display: none\">What do <em>you<\/em> think?<\/div>\n<\/div>\n<\/div>\n<p>Did you identify one or two observations in Question 1 as being unusual? It can be difficult sometimes to decide if a particular value really is an outlier. Keep this thought in mind as you work through this activity. 
We&#8217;ll come back to this question again at the end.<\/p>\n<h3 id=\"MinMax\">Identify minimum and maximum data values<\/h3>\n<div class=\"textbox key-takeaways\">\n<h3>question 2<\/h3>\n<p>List the 10 most populous countries\u2019 GDP per capita in 2018 from least to greatest.<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q55738\">Hint<\/span><\/p>\n<div id=\"q55738\" class=\"hidden-answer\" style=\"display: none\">See the data table above for the values.<\/div>\n<\/div>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>question 3<\/h3>\n<p>What are the minimum and maximum values of this dataset?<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q462567\">Hint<\/span><\/p>\n<div id=\"q462567\" class=\"hidden-answer\" style=\"display: none\">Remember to put them in order first.<\/div>\n<\/div>\n<\/div>\n<h3 id=\"Median\">Calculate and interpret the median<\/h3>\n<div class=\"textbox key-takeaways\">\n<h3>question 4<\/h3>\n<p>In 2018, what was the median GDP per capita among the 10 most populous countries?<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q505520\">Hint<\/span><\/p>\n<div id=\"q505520\" class=\"hidden-answer\" style=\"display: none\">Recall how to calculate the median of a dataset containing an even number of observations. See <a href=\"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/chapter\/corequisite-support-activity-4c\/\">Corequisite Support Activity for Interpreting the Mean and Median of a Dataset: 4C<\/a> for a refresher as needed. 
<\/div>\n<\/div>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>question 5<\/h3>\n<p>Interpret the median.<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q968159\">Hint<\/span><\/p>\n<div id=\"q968159\" class=\"hidden-answer\" style=\"display: none\">Do you recall how the median splits the data? What percentile is identified by the median? <\/div>\n<\/div>\n<\/div>\n<h3 id=\"Q1\">Calculate the first quartile (Q1)<\/h3>\n<div class=\"textbox key-takeaways\">\n<h3>question 6<\/h3>\n<p>What values lie below the median? List them in order from least to greatest.<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q803946\">Hint<\/span><\/p>\n<div id=\"q803946\" class=\"hidden-answer\" style=\"display: none\">Refer to the ordered list of data values you created in Question 2.<\/div>\n<\/div>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>question 7<\/h3>\n<p>What is the median of the list you generated in Question 6 above?<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q519716\">Hint<\/span><\/p>\n<div id=\"q519716\" class=\"hidden-answer\" style=\"display: none\">What do <em>you<\/em> think?<\/div>\n<\/div>\n<\/div>\n<p>We call this value the <strong>first quartile<\/strong>, and we sometimes denote it as <strong>Q1<\/strong>. It is the median of the values that lie below the median for the whole dataset. 
It is also equal to the 25th percentile.<\/p>\n<h3 id=\"Q3\">Calculate the third quartile (Q3)<\/h3>\n<p>Determine the values that lie above the median for the whole dataset.<\/p>\n<div class=\"textbox key-takeaways\">\n<h3>question 8<\/h3>\n<p>List the values above the median in order from least to greatest.<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q286964\">Hint<\/span><\/p>\n<div id=\"q286964\" class=\"hidden-answer\" style=\"display: none\">Refer to the ordered list of data values you created in Question 2.<\/div>\n<\/div>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>question 9<\/h3>\n<p>What is the median of the list above?<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q198445\">Hint<\/span><\/p>\n<div id=\"q198445\" class=\"hidden-answer\" style=\"display: none\">What do <em>you<\/em> think?<\/div>\n<\/div>\n<\/div>\n<p>We call this value the <strong>third quartile<\/strong>, and we sometimes denote it as <strong>Q3<\/strong>. It is the median of the values that lie above the median for the whole dataset. It is also equal to the 75th percentile.<\/p>\n<h3 id=\"Five-Number\">List the five-number summary<\/h3>\n<div class=\"textbox key-takeaways\">\n<h3>question 10<\/h3>\n<p>We have identified the first quartile (Q1) and the third quartile (Q3). What do you think the second and fourth quartiles are? Why do you think we call these values \u201cquartiles?\u201d<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q277023\">Hint<\/span><\/p>\n<div id=\"q277023\" class=\"hidden-answer\" style=\"display: none\">The median splits the data into two pieces (the values above the median and the values below the median). 
Into how many pieces do these numbers split the data?<\/div>\n<\/div>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>question 11<\/h3>\n<p>The collection of the minimum, first quartile, median, third quartile, and maximum form the five-number summary of the data. Record the five-number summary for this dataset.<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q729681\">Hint<\/span><\/p>\n<div id=\"q729681\" class=\"hidden-answer\" style=\"display: none\">List them in a row, separated by commas.<\/div>\n<\/div>\n<\/div>\n<h3 id=\"IQR\">Calculate the interquartile range (IQR)<\/h3>\n<p>The interquartile range (sometimes denoted as IQR) is the quantity Q3 &#8211; Q1.<\/p>\n<div class=\"textbox key-takeaways\">\n<h3>question 12<\/h3>\n<p>About how much of the data lie between Q3 and Q1?<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q928763\">Hint<\/span><\/p>\n<div id=\"q928763\" class=\"hidden-answer\" style=\"display: none\">Recall how pieces of data are defined by all four quartiles. How many pieces lie between Q3 and Q1?<\/div>\n<\/div>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>question 13<\/h3>\n<p>What is the IQR for the GDP per capita of the 10 most populous countries in 2018?<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q808961\">Hint<\/span><\/p>\n<div id=\"q808961\" class=\"hidden-answer\" style=\"display: none\">Recall the numbers you identified as Q1 and Q3 then calculate Q3 &#8211; Q1. <\/div>\n<\/div>\n<\/div>\n<h3 id=\"Outlier\">Determine if a value is an outlier<\/h3>\n<p>Some outliers seem quite simple to spot (such as the GDP per capita of the United States), but others are harder to identify (such as Japan&#8217;s GDP per capita). 
If you were to make up a rule for testing whether a value is &#8220;unusual enough&#8221; to be called an outlier, what would it be? Use your rule on the value of Japan&#8217;s GDP per capita to decide whether or not it is an outlier. What did you decide?<\/p>\n<div class=\"textbox key-takeaways\">\n<h3>question 14<\/h3>\n<p>Return to your answer to Question 1. How can you decide whether an entry is unusual or not? In other words, which entries did you decide are outliers in this dataset? Explain your reasoning.<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q394762\">Hint<\/span><\/p>\n<div id=\"q394762\" class=\"hidden-answer\" style=\"display: none\">What do <em>you<\/em> think? Apply any rule you feel is reasonable to answer this question.<\/div>\n<\/div>\n<\/div>\n<p>In the next section, you&#8217;ll learn about an accepted method of determining whether a data value &#8220;qualifies&#8221; to be an outlier in a skewed distribution like this one. It&#8217;s called the IQR method and states that if a data value is located more than 1.5 times the IQR to the left of Q1 or to the right of Q3, then that value is &#8220;unusual enough&#8221; to be called an outlier. It&#8217;s important to note that, while this method can be used to identify unusual observations in skewed distributions like this one, other methods, which you&#8217;ll learn about in an upcoming section, are well suited for symmetrical distributions. In certain applications, it may be desirable to distinguish between &#8220;mild outliers&#8221; (using 1.5 times IQR) and &#8220;extreme outliers&#8221; (using 3 times IQR). We can really set the threshold for &#8220;unusual&#8221; values as far away as we&#8217;d like, depending on the application. 
But 1.5 times IQR is commonly used, so we&#8217;ll use it here and in the upcoming section.<\/p>\n<p>Let&#8217;s apply the method to Japan&#8217;s GDP per capita in the interactive example below.<\/p>\n<div class=\"textbox exercises\">\n<h3>Interactive example<\/h3>\n<p>Recall that Japan&#8217;s GDP per capita from the dataset is $39,287. We would like to know how unusual this value really is in comparison to the rest of the data values. We&#8217;ll use the IQR method to make the determination.<\/p>\n<p>Under this method, a data value is considered an outlier if it lies more than 1.5 [latex]\\times[\/latex] (IQR) above Q3 or below Q1. Since 39,287 is greater than the median, we&#8217;ll test it to see if it exceeds Q3 + 1.5 [latex]\\times[\/latex] (IQR). (If it were a very small number, we&#8217;d test to see if it were lower than Q1 &#8211; 1.5 [latex]\\times[\/latex] (IQR).)<\/p>\n<p>Recall, for this dataset: Q3 = 11,289 and IQR = 9,273.<\/p>\n<p style=\"padding-left: 30px;\">Step 1) Calculate 1.5 [latex]\\times[\/latex] (IQR).<\/p>\n<p style=\"padding-left: 30px;\">Step 2) Calculate Q3 + 1.5 [latex]\\times[\/latex] (IQR).<\/p>\n<p style=\"padding-left: 30px;\">Step 3) Compare Japan&#8217;s GDP per capita to the result of Step 2. If it exceeds Q3 + 1.5 [latex]\\times[\/latex] (IQR), then it is an outlier.<\/p>\n<p>What did you discover? 
Is Japan&#8217;s GDP per capita an outlier in the dataset?<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q266546\">Show Answer<\/span><\/p>\n<div id=\"q266546\" class=\"hidden-answer\" style=\"display: none\">\n<p style=\"padding-left: 30px;\">Step 1) 1.5 [latex]\\times[\/latex] (9,273) = 13,909.50<\/p>\n<p style=\"padding-left: 30px;\">Step 2) 11,289 + 13,909.50 = 25,198.50<\/p>\n<p style=\"padding-left: 30px;\">Step 3) $39,287 is greater than $25,198.50, so Japan&#8217;s GDP per capita is an outlier.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<p>In this support activity, you&#8217;ve seen how to calculate the five-number summary and interquartile range (IQR) by hand for a dataset, and you&#8217;ve learned about a method to mathematically determine if an observation is an outlier. These make up the features of a boxplot. It&#8217;s time to move on to the next section, where you&#8217;ll use these skills as you explore boxplots for visualizing the distribution of a quantitative variable.<\/p>\n<hr class=\"before-footnotes clear\" \/><div class=\"footnotes\"><ol><li id=\"footnote-490-1\"> Bevins, V. (2020). The Jakarta method: Washington\u2019s anticommunist crusade and the mass murder program that shaped our world. PublicAffairs.  
<a href=\"#return-footnote-490-1\" class=\"return-footnote\" aria-label=\"Return to footnote 1\">&crarr;<\/a><\/li><\/ol><\/div>","protected":false},"author":25777,"menu_order":23,"template":"","meta":{"_candela_citation":"[]","CANDELA_OUTCOMES_GUID":"","pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[],"contributor":[],"license":[],"class_list":["post-490","chapter","type-chapter","status-publish","hentry"],"part":621,"_links":{"self":[{"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/chapters\/490","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/wp\/v2\/users\/25777"}],"version-history":[{"count":29,"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/chapters\/490\/revisions"}],"predecessor-version":[{"id":3322,"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/chapters\/490\/revisions\/3322"}],"part":[{"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/parts\/621"}],"metadata":[{"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/chapters\/490\/metadata\/"}],"wp:attachment":[{"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/wp\/v2\/media?parent=490"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/chapter-type?post=490"},{"taxonomy":"contributor","embeddable":true,"href":"https
:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/wp\/v2\/contributor?post=490"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/wp\/v2\/license?post=490"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}