{"id":1468,"date":"2017-02-09T00:46:29","date_gmt":"2017-02-09T00:46:29","guid":{"rendered":"https:\/\/courses.lumenlearning.com\/waymakermath4libarts\/?post_type=chapter&#038;p=1468"},"modified":"2019-10-03T21:03:40","modified_gmt":"2019-10-03T21:03:40","slug":"introduction-describing-mathematical-characteristics-of-a-data-set","status":"publish","type":"chapter","link":"https:\/\/courses.lumenlearning.com\/wmopen-mathforliberalarts\/chapter\/introduction-describing-mathematical-characteristics-of-a-data-set\/","title":{"raw":"Numerical Summaries of Data","rendered":"Numerical Summaries of Data"},"content":{"raw":"<div class=\"textbox learning-objectives\">\r\n<h3>Learning Outcomes<\/h3>\r\n<ul>\r\n \t<li>Calculate the mean, median, and mode of a set of data<\/li>\r\n \t<li>Calculate the range of a data set, and recognize its limitations in fully describing the behavior of a data set<\/li>\r\n \t<li>Calculate the standard deviation for a data set, and determine its units<\/li>\r\n \t<li>Identify the difference between population variance and sample variance<\/li>\r\n \t<li>Identify the quartiles for a data set, and the calculations used to define them<\/li>\r\n \t<li>Identify the parts of a five number summary for a set of data, and create a box plot using it<\/li>\r\n<\/ul>\r\n<\/div>\r\nIt is often desirable to use a few numbers to summarize a data set. One important aspect of a set of data\u00a0is where its center is located. In this lesson, measures of central tendency are discussed first. A second aspect of a distribution is how spread out it is; in other words, how much the data in the distribution vary from one another.
The second section of this lesson describes measures of variability.\r\n<h2>Measures of Central Tendency<\/h2>\r\n<h3>Mean, Median, and Mode<\/h3>\r\n<a href=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1141\/2016\/12\/22190716\/8027247398_96f7013a1f_z.jpg\"><img class=\"aligncenter size-full wp-image-955\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1141\/2016\/12\/22190716\/8027247398_96f7013a1f_z.jpg\" alt=\"Silver sphere which has red smaller spheres clustered on its left side, with magnetic attraction\" width=\"640\" height=\"427\" \/><\/a>\r\n\r\nLet's begin by trying to find the most \"typical\" value of a data set.\r\n\r\nNote that we just used the word \"typical\" although in many cases you might think of using the word \"average.\" We need to be careful with the word \"average\" as it means different things to different people in different contexts.\u00a0 One of the most common uses of the word \"average\" is what mathematicians and statisticians call the <strong>arithmetic mean<\/strong>, or just plain old <strong>mean<\/strong> for short.\u00a0 \"Arithmetic mean\" sounds rather fancy, but you have likely calculated a mean many times without realizing it; the mean is what most people think of when they use the word \"average.\"\r\n<div class=\"textbox\">\r\n<h3>Mean<\/h3>\r\nThe <strong>mean<\/strong> of a set of data is the sum of the data values divided by the number of values.\r\n\r\n<\/div>\r\n<div class=\"textbox exercises\">\r\n<h3>examples<\/h3>\r\nMarci\u2019s exam scores for her last math class were 79, 86, 82, and 94. What would the mean of these values be?\r\n[reveal-answer q=\"162631\"]Show Solution[\/reveal-answer]\r\n[hidden-answer a=\"162631\"][latex]\\frac{79+86+82+94}{4}=85.25[\/latex]. Typically we round means to one more decimal place than the original data had.
In this case, we would round 85.25 to 85.3.[\/hidden-answer]\r\n\r\n<hr \/>\r\n\r\nThe number of touchdown (TD) passes thrown by each of the 31 teams in the National Football League in the 2000 season is shown below.\r\n\r\n37 33 33 32 29 28 28 23 22 22 22 21 21 21 20\r\n\r\n20 19 19 18 18 18 18 16 15 14 14 14 12 12 9 6\r\n\r\nWhat is the mean number of TD passes?\r\n[reveal-answer q=\"244479\"]Show Solution[\/reveal-answer]\r\n[hidden-answer a=\"244479\"]\r\n\r\nAdding these values, we get 634 total TDs. Dividing by 31, the number of data values, we get 634\/31 = 20.4516. It would be appropriate to round this to 20.5.\r\n\r\nIt would be most correct for us to report that \u201cThe mean number of touchdown passes thrown in the NFL in the 2000 season was 20.5 passes,\u201d but it is not uncommon to see the more casual word \u201caverage\u201d used in place of \u201cmean.\u201d\r\n\r\n[\/hidden-answer]\r\n\r\nBoth examples are described further in the following video.\r\n\r\nhttps:\/\/youtu.be\/3if9Le2sO0c\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>Try It<\/h3>\r\nThe price of a jar of peanut butter at 5 stores was $3.29, $3.59, $3.79, $3.75, and $3.99. Find the mean price.\r\n\r\n<\/div>\r\n<div class=\"textbox exercises\">\r\n<h3>examples<\/h3>\r\nThe one hundred families in a particular neighborhood are asked their annual household income, to the nearest $5 thousand.
The results are summarized in a frequency table below.\r\n<table>\r\n<tbody>\r\n<tr>\r\n<td><strong>Income (thousands of dollars)<\/strong><\/td>\r\n<td><strong>Frequency<\/strong><\/td>\r\n<\/tr>\r\n<tr>\r\n<td>15<\/td>\r\n<td>6<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>20<\/td>\r\n<td>8<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>25<\/td>\r\n<td>11<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>30<\/td>\r\n<td>17<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>35<\/td>\r\n<td>19<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>40<\/td>\r\n<td>20<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>45<\/td>\r\n<td>12<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>50<\/td>\r\n<td>7<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\nWhat is the mean income in this neighborhood?\r\n[reveal-answer q=\"231491\"]Show Solution[\/reveal-answer]\r\n[hidden-answer a=\"231491\"]\r\n\r\nCalculating the mean by hand could get tricky if we try to type in all 100 values:\r\n\r\n[latex]\\frac{\\overbrace{15+\\cdots+15}^{\\text{6 terms}}+\\overbrace{20+\\cdots+20}^{\\text{8 terms}}+\\overbrace{25+\\cdots+25}^{\\text{11 terms}}+\\cdots}{\\text{100}}[\/latex]\r\n\r\nWe could calculate this more easily by noticing that adding six 15s together is the same as computing [latex]15\\cdot6=90[\/latex].
Using this simplification, we get\r\n\r\n[latex]\\frac{15\\cdot6+20\\cdot8+25\\cdot11+30\\cdot17+35\\cdot19+40\\cdot20+45\\cdot12+50\\cdot7}{\\text{100}}=\\frac{3390}{100}=33.9[\/latex]\r\n\r\nThe mean household income of our sample is 33.9 thousand dollars ($33,900).\r\n\r\n[\/hidden-answer]\r\n\r\n<hr \/>\r\n\r\nExtending the last example, suppose a new family with a household income of $5 million ($5000 thousand) moves into the neighborhood.\r\n\r\nWhat is the new mean of this neighborhood's income?\r\n[reveal-answer q=\"820039\"]Show Solution[\/reveal-answer]\r\n[hidden-answer a=\"820039\"]\r\n\r\nAdding this to our sample, our mean is now:\r\n\r\n[latex]\\frac{15\\cdot6+20\\cdot8+25\\cdot11+30\\cdot17+35\\cdot19+40\\cdot20+45\\cdot12+50\\cdot7+5000\\cdot1}{\\text{101}}=\\frac{8390}{101}=83.069[\/latex]\r\n\r\n[\/hidden-answer]\r\n\r\nBoth situations are explained further in this video.\r\n\r\nhttps:\/\/youtu.be\/1_4Hxcq8DpQ\r\n\r\n<\/div>\r\nWhile 83.1 thousand dollars ($83,069) is the correct mean household income, it no longer represents a \u201ctypical\u201d value.\r\n\r\nImagine the data values on a see-saw or balance scale. The mean is the value that keeps the data in balance, like in the picture below.\r\n\r\n<img class=\"aligncenter wp-image-423\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/276\/2016\/10\/12182802\/balance1.png\" alt=\"Drawing of a balance bar. A large blue block is on left end, and two smaller blue rectangles are on right end of balance point. One is close to the balance, one is further away. \" width=\"349\" height=\"61\" \/>\r\n\r\nIf we graph our household data, the $5 million data value is so far out to the right that the mean has to adjust up to keep things in balance.\r\n\r\n<img class=\"aligncenter wp-image-424\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/276\/2016\/10\/12182831\/balance2.png\" alt=\"Drawing of a balance bar.
A large blue block is on left end and two smaller blue rectangles are also on left of balance point. On right, a small blue rectangle is significantly far away from the balance point.\" width=\"1256\" height=\"100\" \/>\r\n\r\nFor this reason, when working with data that have <strong>outliers<\/strong> \u2013 values far outside the primary grouping \u2013 it is common to use a different measure of center, the <strong>median<\/strong>.\r\n<div class=\"textbox\">\r\n<h3>Median<\/h3>\r\nThe <strong>median<\/strong> of a set of data is the value in the middle when the data is in order.\r\n<ul>\r\n \t<li>To find the median, begin by listing the data in order from smallest to largest, or largest to smallest.<\/li>\r\n \t<li>If the number of data values, <em>N<\/em>, is odd, then the median is the middle data value. This value can be found by rounding <em>N<\/em>\/2 up to the next whole number.<\/li>\r\n \t<li>If the number of data values is even, there is no one middle value, so we find the mean of the two middle values (values <em>N<\/em>\/2 and <em>N<\/em>\/2 + 1)<\/li>\r\n<\/ul>\r\n<\/div>\r\n<div class=\"textbox exercises\">\r\n<h3>example<\/h3>\r\nReturning to the football touchdown data, we would start by listing the data in order. Luckily, it was already in decreasing order, so we can work with it without needing to reorder it first.\r\n\r\n37 33 33 32 29 28 28 23 22 22 22 21 21 21 20\r\n\r\n20 19 19 18 18 18 18 16 15 14 14 14 12 12 9 6\r\n\r\nWhat is the median TD value?\r\n[reveal-answer q=\"866224\"]Show Solution[\/reveal-answer]\r\n[hidden-answer a=\"866224\"]Since there are 31 data values, an odd number, the median will be the middle number, the 16th data value (31\/2 = 15.5, round up to 16, leaving 15 values below and 15 above). The 16th data value is 20, so the median number of touchdown passes in the 2000 season was 20 passes.
Notice that for this data, the median is fairly close to the mean we calculated earlier, 20.5.[\/hidden-answer]\r\n\r\n<hr \/>\r\n\r\nFind the median of these quiz scores: 5 10 8 6 4 8 2 5 7 7\r\n[reveal-answer q=\"282623\"]Show Solution[\/reveal-answer]\r\n[hidden-answer a=\"282623\"]\r\n\r\nWe start by listing the data in order: 2 4 5 5 6 7 7 8 8 10\r\n\r\nSince there are 10 data values, an even number, there is no one middle number. So we find the mean of the two middle numbers, 6 and 7, and get (6+7)\/2 = 6.5.\r\n\r\nThe median quiz score was 6.5.\r\n\r\n[\/hidden-answer]\r\n\r\nLearn more about these median examples in this video.\r\n\r\nhttps:\/\/youtu.be\/WEdr_rSRObk\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>Try It<\/h3>\r\nThe price of a jar of peanut butter at 5 stores was\u00a0$3.29, $3.59, $3.79, $3.75, and $3.99. Find the median price.\r\n\r\n<\/div>\r\n<div class=\"textbox exercises\">\r\n<h3>Example<\/h3>\r\nLet us return now to our original household income data\r\n<table>\r\n<tbody>\r\n<tr>\r\n<td><strong>Income (thousands of dollars)<\/strong><\/td>\r\n<td><strong>Frequency<\/strong><\/td>\r\n<\/tr>\r\n<tr>\r\n<td>15<\/td>\r\n<td>6<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>20<\/td>\r\n<td>8<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>25<\/td>\r\n<td>11<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>30<\/td>\r\n<td>17<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>35<\/td>\r\n<td>19<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>40<\/td>\r\n<td>20<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>45<\/td>\r\n<td>12<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>50<\/td>\r\n<td>7<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\nWhat is the median of this neighborhood's household income?\r\n[reveal-answer q=\"941761\"]Show Solution[\/reveal-answer]\r\n[hidden-answer a=\"941761\"]\r\n\r\nHere we have 100 data values. If we didn\u2019t already know that, we could find it by adding the frequencies. Since 100 is an even number, we need to find the mean of the middle two data values - the 50th and 51st data values.
To find these, we start counting up from the bottom:\r\n\r\nThere are 6 data values of $15, so \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Values 1 to 6 are $15 thousand\r\n\r\nThe next 8 data values are $20, so \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Values 7 to (6+8)=14 are $20 thousand\r\n\r\nThe next 11 data values are $25, so \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Values 15 to (14+11)=25 are $25 thousand\r\n\r\nThe next 17 data values are $30, so \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Values 26 to (25+17)=42 are $30 thousand\r\n\r\nThe next 19 data values are $35, so \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Values 43 to (42+19)=61 are $35 thousand\r\n\r\nFrom this we can tell that values 50 and 51 will be $35 thousand, and the mean of these two values is $35 thousand. The median income in this neighborhood is $35 thousand.\r\n\r\n[\/hidden-answer]\r\n\r\n<hr \/>\r\n\r\nIf we add in the new neighbor with a $5 million household income, then there will be 101 data values, and the 51st value will be the median. As we discovered in the last example, the 51st value is $35 thousand. Notice that the new neighbor did not affect the median in this case. 
The median is not swayed as much by outliers as the mean is.\r\n\r\nView more about the median of this neighborhood's household incomes here.\r\n\r\nhttps:\/\/youtu.be\/kqEu9EDkmfU\r\n\r\n<\/div>\r\n&nbsp;\r\n\r\nIn addition to the mean and the median, there is one other common measurement of the \"typical\" value of a data set: the <strong>mode<\/strong>.\r\n<div class=\"textbox\">\r\n<h3>Mode<\/h3>\r\nThe <strong>mode<\/strong> is the element of the data set that occurs most frequently.\r\n\r\n<\/div>\r\nThe mode is fairly useless with data like weights or heights where there are a large number of possible values. The mode is most commonly used for categorical data, for which median and mean cannot be computed.\r\n<div class=\"textbox exercises\">\r\n<h3>Example<\/h3>\r\nIn our vehicle color survey earlier in this section, we collected the data\r\n<table>\r\n<tbody>\r\n<tr>\r\n<th>Color<\/th>\r\n<th>Frequency<\/th>\r\n<\/tr>\r\n<tr>\r\n<td>Blue<\/td>\r\n<td>3<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>Green<\/td>\r\n<td>5<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>Red<\/td>\r\n<td>4<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>White<\/td>\r\n<td>3<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>Black<\/td>\r\n<td>2<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>Grey<\/td>\r\n<td>3<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\nWhich color is the mode?\r\n[reveal-answer q=\"638793\"]Show Solution[\/reveal-answer]\r\n[hidden-answer a=\"638793\"]For this data, Green is the mode, since it is the data value that occurred the most frequently.[\/hidden-answer]\r\n\r\nMode in this example is explained by the video here.\r\n\r\nhttps:\/\/youtu.be\/pFpkWrib3Jk\r\n\r\n<\/div>\r\nIt is possible for a data set to have more than one mode if several categories have the same frequency, or no modes if every category occurs only once.\r\n<div class=\"textbox key-takeaways\">\r\n<h3>Try It<\/h3>\r\nReviewers were asked to rate a product on a scale of 1 to 5.
Find\r\n<ol>\r\n \t<li>The mean rating<\/li>\r\n \t<li>The median rating<\/li>\r\n \t<li>The mode rating<\/li>\r\n<\/ol>\r\n<table>\r\n<tbody>\r\n<tr>\r\n<td><strong>Rating<\/strong><\/td>\r\n<td><strong>Frequency<\/strong><\/td>\r\n<\/tr>\r\n<tr>\r\n<td>1<\/td>\r\n<td>4<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>2<\/td>\r\n<td>8<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>3<\/td>\r\n<td>7<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>4<\/td>\r\n<td>3<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>5<\/td>\r\n<td>1<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<\/div>\r\n<h2>Measures of Variation<\/h2>\r\n<h3>Range and Standard Deviation<\/h3>\r\nConsider these three sets of quiz scores:\r\n<p style=\"padding-left: 120px;\">Section A: 5 5 5 5 5 5 5 5 5 5<\/p>\r\n<p style=\"padding-left: 120px;\">Section B: 0 0 0 0 0 10 10 10 10 10<\/p>\r\n<p style=\"padding-left: 120px;\">Section C: 4 4 4 5 5 5 5 6 6 6<\/p>\r\nAll three of these sets of data have a mean of 5 and median of 5, yet the sets of scores are clearly quite different. In section A, everyone had the same score; in section B half the class got no points and the other half got a perfect score, assuming this was a 10-point quiz.\u00a0 Section C was not as consistent as section A, but not as widely varied as section B.\r\n\r\nIn addition to the mean and median, which are measures of the \"typical\" or \"middle\" value, we also need a measure of how \"spread out\" or varied each data set is.<a href=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1141\/2016\/12\/22194649\/299549598_e1007533c6_z.jpg\"><img class=\"aligncenter size-full wp-image-960\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1141\/2016\/12\/22194649\/299549598_e1007533c6_z.jpg\" alt=\"Collage of photos of trees around a central octagonal park space\" width=\"640\" height=\"481\" \/><\/a>\r\n\r\nThere are several ways to measure this \"spread\" of the data. 
The first is the simplest and is called the <strong>range<\/strong>.\r\n<div class=\"textbox\">\r\n<h3>Range<\/h3>\r\nThe range is the difference between the maximum value and the minimum value of the data set.\r\n\r\n<\/div>\r\n<div class=\"textbox exercises\">\r\n<h3>example<\/h3>\r\nUsing the quiz scores from above,\r\n\r\nFor section A, the range is 0 since both maximum and minimum are 5 and 5 \u2013 5 =\u00a00\r\n\r\nFor section B, the range is 10 since 10 \u2013 0\u00a0=\u00a010\r\n\r\nFor section C, the range is 2 since 6 \u2013 4 = 2\r\n\r\nIn the last example, the range seems to be revealing how spread out the data is. However, suppose we add a fourth section, Section D, with scores 0 5 5 5 5 5 5 5 5 10.\r\n\r\nThis section also has a mean and median of 5. The range is 10, yet this data set is quite different from Section B. To better illuminate the differences, we\u2019ll have to turn to more sophisticated measures of variation.\r\n\r\nThe range of this example is explained in the following video.\r\n\r\nhttps:\/\/youtu.be\/b3ofWalrHgQ\r\n\r\n<\/div>\r\n<div class=\"textbox\">\r\n<h3>Standard deviation<\/h3>\r\nThe standard deviation is a measure of variation based on measuring how far each data value deviates, or is different, from the mean. A few important characteristics:\r\n<ul>\r\n \t<li>Standard deviation is never negative.
Standard deviation will be zero if all the data values are equal, and will get larger as the data spreads out.<\/li>\r\n \t<li>Standard deviation has the same units as the original data.<\/li>\r\n \t<li>Standard deviation, like the mean, can be highly influenced by outliers.<\/li>\r\n<\/ul>\r\n<\/div>\r\nUsing the data from section D, we could compute for each data value the difference between the data value and the mean:\r\n<table>\r\n<tbody>\r\n<tr>\r\n<th>data value<\/th>\r\n<th>deviation: data value - mean<\/th>\r\n<\/tr>\r\n<tr>\r\n<td>0<\/td>\r\n<td>0-5 = -5<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>5<\/td>\r\n<td>5-5 = 0<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>5<\/td>\r\n<td>5-5 = 0<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>5<\/td>\r\n<td>5-5 = 0<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>5<\/td>\r\n<td>5-5 = 0<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>5<\/td>\r\n<td>5-5 = 0<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>5<\/td>\r\n<td>5-5 = 0<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>5<\/td>\r\n<td>5-5 = 0<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>5<\/td>\r\n<td>5-5 = 0<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>10<\/td>\r\n<td>10-5 = 5<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\nWe would like to get an idea of the \"average\" deviation from the mean, but if we find the average of the values in the second column the negative and positive values cancel each other out (this will always happen), so to prevent this we square every value in the second column:\r\n<table>\r\n<tbody>\r\n<tr>\r\n<th>data value<\/th>\r\n<th>deviation: data value - mean<\/th>\r\n<th>deviation squared<\/th>\r\n<\/tr>\r\n<tr>\r\n<td>0<\/td>\r\n<td>0-5 = -5<\/td>\r\n<td>(-5)<sup>2<\/sup> = 25<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>5<\/td>\r\n<td>5-5 = 0<\/td>\r\n<td>0<sup>2<\/sup> = 0<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>5<\/td>\r\n<td>5-5 = 0<\/td>\r\n<td>0<sup>2<\/sup> = 0<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>5<\/td>\r\n<td>5-5 = 0<\/td>\r\n<td>0<sup>2<\/sup> = 0<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>5<\/td>\r\n<td>5-5 = 0<\/td>\r\n<td>0<sup>2<\/sup> = 0<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>5<\/td>\r\n<td>5-5 = 
0<\/td>\r\n<td>0<sup>2<\/sup> = 0<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>5<\/td>\r\n<td>5-5 = 0<\/td>\r\n<td>0<sup>2<\/sup> = 0<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>5<\/td>\r\n<td>5-5 = 0<\/td>\r\n<td>0<sup>2<\/sup> = 0<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>5<\/td>\r\n<td>5-5 = 0<\/td>\r\n<td>0<sup>2<\/sup> = 0<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>10<\/td>\r\n<td>10-5 = 5<\/td>\r\n<td>(5)<sup>2<\/sup> = 25<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\nWe then add the squared deviations up to get 25 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 25 = 50.\u00a0 Ordinarily we would then divide by the number of scores,<em> n<\/em>, (in this case, 10) to find the mean of the squared deviations.\u00a0 But we only do this if the data set represents a population; if the data set represents a sample (as it almost always does), we instead divide by <em>n<\/em> - 1 (in this case, 10 - 1 = 9).[footnote]The reason we do this is highly technical, but we can see how it might be useful by considering the case of a small sample from a population that contains an outlier, which would increase the average deviation: the outlier very likely won't be included in the sample, so the mean deviation of the sample would underestimate the mean deviation of the population; thus we divide by a slightly smaller number to get a slightly bigger average deviation.[\/footnote]\r\n\r\nSo in our example, we would have 50\/10 = 5 if section D represents a population and 50\/9 = about 5.56 if section D represents a sample.\u00a0 These values (5 and 5.56) are called, respectively, the <strong>population variance<\/strong> and the <strong>sample variance<\/strong> for section D.\r\n\r\nVariance can be a useful statistical concept, but note that the units of variance in this instance would be points-squared since we squared all of the deviations. What are points-squared? Good question.
We would rather deal with the units we started with (points in this case), so to convert back we take the square root and get:\r\n<p style=\"text-align: center;\">[latex]\\begin{align}&amp;\\text{population standard deviation}=\\sqrt{\\frac{50}{10}}=\\sqrt{5}\\approx2.2\\\\&amp;\\text{or}\\\\&amp;\\text{sample standard deviation}=\\sqrt{\\frac{50}{9}}\\approx2.4\\\\\\end{align}[\/latex]<\/p>\r\nIf we are unsure whether the data set is a sample or a population, we will usually assume it is a sample, and we will round answers to one more decimal place than the original data, as we have done above.\r\n<div class=\"textbox\">\r\n<h3>To compute standard deviation<\/h3>\r\n<ol>\r\n \t<li>Find the deviation of each data value from the mean. In other words, subtract the mean from the data value.<\/li>\r\n \t<li>Square each deviation.<\/li>\r\n \t<li>Add the squared deviations.<\/li>\r\n \t<li>Divide by <em>n<\/em>, the number of data values, if the data represents a whole population; divide by <em>n<\/em> \u2013 1 if the data is from a sample.<\/li>\r\n \t<li>Compute the square root of the result.<\/li>\r\n<\/ol>\r\n<\/div>\r\n<div class=\"textbox exercises\">\r\n<h3>example<\/h3>\r\nComputing the standard deviation for Section B above, we first calculate that the mean is 5.
Using a table can help keep track of your computations for the standard deviation:\r\n<table>\r\n<tbody>\r\n<tr>\r\n<th>data value<\/th>\r\n<th>deviation: data value - mean<\/th>\r\n<th>deviation squared<\/th>\r\n<\/tr>\r\n<tr>\r\n<td>0<\/td>\r\n<td>0-5 = -5<\/td>\r\n<td>(-5)<sup>2<\/sup> = 25<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>0<\/td>\r\n<td>0-5 = -5<\/td>\r\n<td>(-5)<sup>2<\/sup> = 25<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>0<\/td>\r\n<td>0-5 = -5<\/td>\r\n<td>(-5)<sup>2<\/sup> = 25<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>0<\/td>\r\n<td>0-5 = -5<\/td>\r\n<td>(-5)<sup>2<\/sup> = 25<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>0<\/td>\r\n<td>0-5 = -5<\/td>\r\n<td>(-5)<sup>2<\/sup> = 25<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>10<\/td>\r\n<td>10-5 = 5<\/td>\r\n<td>(5)<sup>2<\/sup> = 25<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>10<\/td>\r\n<td>10-5 = 5<\/td>\r\n<td>(5)<sup>2<\/sup> = 25<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>10<\/td>\r\n<td>10-5 = 5<\/td>\r\n<td>(5)<sup>2<\/sup> = 25<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>10<\/td>\r\n<td>10-5 = 5<\/td>\r\n<td>(5)<sup>2<\/sup> = 25<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>10<\/td>\r\n<td>10-5 = 5<\/td>\r\n<td>(5)<sup>2<\/sup> = 25<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\nAssuming this data represents a population, we will add the squared deviations, divide by 10, the number of data values, and compute the square root:\r\n<p style=\"text-align: center;\">[latex]\\sqrt{\\frac{25+25+25+25+25+25+25+25+25+25}{10}}=\\sqrt{\\frac{250}{10}}=5[\/latex]<\/p>\r\nNotice that the standard deviation of this data set is much larger than that of section D since the data in this set is more spread out.\r\n\r\nFor comparison, the standard deviations of all four sections are:\r\n<table>\r\n<tbody>\r\n<tr>\r\n<td>Section A: 5 5 5 5 5 5 5 5 5 5<\/td>\r\n<td>Standard deviation: 0<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>Section B: 0 0 0 0 0 10 10 10 10 10<\/td>\r\n<td>Standard deviation: 5<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>Section C: 4 4 4 5 5 5 5 6 6 6<\/td>\r\n<td>Standard deviation: 0.8<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>Section 
D: 0 5 5 5 5 5 5 5 5 10<\/td>\r\n<td>Standard deviation: 2.2<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\nSee the following video for more about calculating the standard deviation in this example.\r\n\r\nhttps:\/\/youtu.be\/wS8z90f04OU\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>Try It<\/h3>\r\nThe price of a jar of peanut butter at 5 stores was\u00a0$3.29, $3.59, $3.79, $3.75, and $3.99. Find the standard deviation of the prices.\r\n\r\n<\/div>\r\n&nbsp;\r\n\r\nWhere standard deviation is a measure of variation based on the mean, <strong>quartiles<\/strong> are based on the median.\r\n<div class=\"textbox\">\r\n<h3>Quartiles<\/h3>\r\nQuartiles are values that divide the data in quarters.\r\n\r\nThe first quartile (Q1) is the value so that 25% of the data values are below it; the third quartile (Q3) is the value so that 75% of the data values are below it. You may have guessed that the second quartile is the same as the median, since the median is the value so that 50% of the data values are below it.\r\n\r\nThis divides the data into quarters; 25% of the data is between the minimum and Q1, 25% is between Q1 and the median, 25% is between the median and Q3, and 25% is between Q3 and the maximum value.\r\n\r\n<\/div>\r\nWhile quartiles are not a 1-number summary of variation like standard deviation, the quartiles are used with the median, minimum, and maximum values to form a <strong>5 number summary<\/strong> of the data.\r\n<div class=\"textbox\">\r\n<h3>Five number summary<\/h3>\r\nThe five number summary takes this form:\r\n<p style=\"text-align: center;\">Minimum, Q1, Median, Q3, Maximum<\/p>\r\n\r\n<\/div>\r\n<span style=\"font-size: 16px;\">To find the first quartile, we need to find the data value so that 25% of the data is below it. 
If <\/span><em style=\"font-size: 16px;\">n<\/em><span style=\"font-size: 16px;\"> is the number of data values, we compute a locator by finding 25% of <\/span><em style=\"font-size: 16px;\">n<\/em><span style=\"font-size: 16px;\">. If this locator is a decimal value, we round up, and find the data value in that position. If the locator is a whole number, we find the mean of the data value in that position and the next data value. This is identical to the process we used to find the median, except we use 25% of the data values rather than half the data values as the locator.<\/span>\r\n<div class=\"textbox\">\r\n<h3>To find the first quartile, Q1<\/h3>\r\n<ol>\r\n \t<li>Begin by ordering the data from smallest to largest<\/li>\r\n \t<li>Compute the locator: <em>L<\/em> = 0.25<em>n<\/em><\/li>\r\n \t<li>If <em>L<\/em> is a decimal value:\r\n<ul>\r\n \t<li>Round up to <em>L+<\/em><\/li>\r\n \t<li>Use the data value in the <em>L+<\/em>th position<\/li>\r\n<\/ul>\r\n<\/li>\r\n \t<li>If <em>L<\/em> is a whole number:\r\n<ul>\r\n \t<li>Find the mean of the data values in the <em>L<\/em>th and <em>L<\/em>+1th positions.<\/li>\r\n<\/ul>\r\n<\/li>\r\n<\/ol>\r\n<\/div>\r\n<div class=\"textbox\">\r\n<h3>To find the third quartile, Q3<\/h3>\r\nUse the same procedure as for Q1, but with locator: <em>L<\/em> = 0.75<em>n<\/em>\r\n\r\nExamples should help make this clearer.\r\n\r\n<\/div>\r\n<div class=\"textbox exercises\">\r\n<h3>examples<\/h3>\r\nSuppose we have measured 9 females, and their heights (in inches) sorted from smallest to largest are:\r\n\r\n59 60 62 64 66 67 69 70 72\r\n\r\nWhat are the first and third quartiles?\r\n[reveal-answer q=\"450713\"]Show Solution[\/reveal-answer]\r\n[hidden-answer a=\"450713\"]\r\n\r\nTo find the first quartile we first compute the locator: 25% of 9 is <em>L<\/em> = 0.25(9) = 2.25. Since this value is not a whole number, we round up to 3. 
The first quartile will be the third data value: 62 inches.\r\n\r\nTo find the third quartile, we again compute the locator: 75% of 9 is 0.75(9) = 6.75. Since this value is not a whole number, we round up to 7. The third quartile will be the seventh data value: 69 inches.\r\n\r\n[\/hidden-answer]\r\n\r\n<hr \/>\r\n\r\nSuppose we had measured 8 females, and their heights (in inches) sorted from smallest to largest are:\r\n\r\n59 60 62 64 66 67 69 70\r\n\r\nWhat are the first and third quartiles? What is the 5 number summary?\r\n[reveal-answer q=\"699335\"]Show Solution[\/reveal-answer]\r\n[hidden-answer a=\"699335\"]\r\n\r\nTo find the first quartile we first compute the locator: 25% of 8 is <em>L<\/em> = 0.25(8) = 2. Since this value <em>is<\/em> a whole number, we will find the mean of the 2nd and 3rd data values: (60+62)\/2 = 61, so the first quartile is 61 inches.\r\n\r\nThe third quartile is computed similarly, using 75% instead of 25%. <em>L<\/em> = 0.75(8) = 6. This is a whole number, so we will find the mean of the 6th and 7th data values: (67+69)\/2 = 68, so Q3 is 68.\r\n\r\nNote that the median could be computed the same way, using 50%.\r\n\r\n[\/hidden-answer]\r\n\r\n<hr \/>\r\n\r\nThe 5-number summary combines the first and third quartile with the minimum, median, and maximum values.\r\n\r\nWhat are the 5-number summaries for each of the previous 2 examples?\r\n[reveal-answer q=\"190147\"]Show Solution[\/reveal-answer]\r\n[hidden-answer a=\"190147\"]\r\n\r\nFor the 9 female sample, the median is 66, the minimum is 59, and the maximum is 72. 
The 5 number summary is: 59, 62, 66, 69, 72.\r\n\r\nFor the 8 female sample, the median is 65, the minimum is 59, and the maximum is 70, so the 5 number summary would be: 59, 61, 65, 68, 70.\r\n\r\n[\/hidden-answer]\r\n\r\nMore about each set of women's heights is in the following videos.\r\n\r\nhttps:\/\/youtu.be\/00iQvPOOUu4\r\n\r\nhttps:\/\/youtu.be\/x73G2Nep05g\r\n\r\n<hr \/>\r\n\r\nReturning to our quiz score data:\u00a0in each case, the first quartile locator is 0.25(10) = 2.5, so the first quartile will be the 3rd data value, and the third quartile will be the 8th data value. Creating the five-number summaries:\r\n<table>\r\n<tbody>\r\n<tr>\r\n<td>Section and data<\/td>\r\n<td>5-number summary<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>Section A: 5 5 5 5 5 5 5 5 5 5<\/td>\r\n<td>5, 5, 5, 5, 5<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>Section B: 0 0 0 0 0 10 10 10 10 10<\/td>\r\n<td>0, 0, 5, 10, 10<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>Section C: 4 4 4 5 5 5 5 6 6 6<\/td>\r\n<td>4, 4, 5, 6, 6<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>Section D: 0 5 5 5 5 5 5 5 5 10<\/td>\r\n<td>0, 5, 5, 5, 10<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\nOf course, with a relatively small data set, finding a five-number summary is a bit silly, since the summary contains almost as many values as the original data.\r\n\r\nA video walkthrough of this example is available below.\r\n\r\nhttps:\/\/youtu.be\/uifLbZKPUDU\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>Try It<\/h3>\r\nThe total cost of textbooks for the term was collected from 36 students. 
Find the 5 number summary of this data.\r\n\r\n$140\u00a0\u00a0\u00a0 $160\u00a0\u00a0\u00a0 $160\u00a0\u00a0\u00a0 $165\u00a0\u00a0\u00a0 $180\u00a0\u00a0\u00a0 $220\u00a0\u00a0\u00a0 $235\u00a0\u00a0\u00a0 $240\u00a0\u00a0\u00a0 $250\u00a0\u00a0\u00a0 $260\u00a0\u00a0\u00a0 $280\u00a0\u00a0\u00a0 $285\r\n\r\n$285\u00a0\u00a0\u00a0 $285\u00a0\u00a0\u00a0 $290\u00a0\u00a0\u00a0 $300\u00a0\u00a0\u00a0 $300\u00a0\u00a0\u00a0 $305\u00a0\u00a0\u00a0 $310\u00a0\u00a0\u00a0 $310\u00a0\u00a0\u00a0 $315\u00a0\u00a0\u00a0 $315\u00a0\u00a0\u00a0 $320\u00a0\u00a0\u00a0 $320\r\n\r\n$330\u00a0\u00a0\u00a0 $340\u00a0\u00a0\u00a0 $345\u00a0\u00a0\u00a0 $350\u00a0\u00a0\u00a0 $355\u00a0\u00a0\u00a0 $360\u00a0\u00a0\u00a0 $360\u00a0\u00a0\u00a0 $380\u00a0\u00a0\u00a0 $395\u00a0\u00a0\u00a0 $420\u00a0\u00a0\u00a0 $460\u00a0\u00a0\u00a0 $460\r\n\r\n<\/div>\r\n<div class=\"textbox exercises\">\r\n<h3>Example<\/h3>\r\nReturning to the household income data from earlier in the section, create the five-number summary.\r\n<table>\r\n<tbody>\r\n<tr>\r\n<td><strong>Income (thousands of dollars)<\/strong><\/td>\r\n<td><strong>Frequency<\/strong><\/td>\r\n<\/tr>\r\n<tr>\r\n<td>15<\/td>\r\n<td>6<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>20<\/td>\r\n<td>8<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>25<\/td>\r\n<td>11<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>30<\/td>\r\n<td>17<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>35<\/td>\r\n<td>19<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>40<\/td>\r\n<td>20<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>45<\/td>\r\n<td>12<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>50<\/td>\r\n<td>7<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n[reveal-answer q=\"78296\"]Show Solution[\/reveal-answer]\r\n[hidden-answer a=\"78296\"]\r\n\r\nBy adding the frequencies, we can see there are 100 data values represented in the table. Earlier in this section, we found that the median was $35 thousand. We can see in the table that the minimum income is $15 thousand, and the maximum is $50 thousand.\r\n\r\nTo find Q1, we calculate the locator: <em>L<\/em> = 0.25(100) = 25. 
This is a whole number, so Q1 will be the mean of the 25th and 26th data values.\r\n\r\nCounting up in the data as we did before,\r\n\r\nThere are 6 data values of $15, so \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Values 1 to 6 are $15 thousand\r\n\r\nThe next 8 data values are $20, so \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Values 7 to (6+8)=14 are $20 thousand\r\n\r\nThe next 11 data values are $25, so \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Values 15 to (14+11)=25 are $25 thousand\r\n\r\nThe next 17 data values are $30, so \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Values 26 to (25+17)=42 are $30 thousand\r\n\r\nThe 25th data value is $25 thousand, and the 26th data value is $30 thousand, so Q1 will be the mean of these: (25 + 30)\/2 = $27.5 thousand.\r\n\r\nTo find Q3, we calculate the locator: <em>L<\/em> = 0.75(100) = 75. This is a whole number, so Q3 will be the mean of the 75th and 76th data values. Continuing our counting from earlier,\r\n\r\nThe next 19 data values are $35, so \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Values 43 to (42+19)=61 are $35 thousand\r\n\r\nThe next 20 data values are $40, so \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Values 62 to (61+20)=81 are $40 thousand\r\n\r\nBoth the 75th and 76th data values lie in this group, so Q3 will be $40 thousand.\r\n\r\nPutting these values together into a five-number summary, we get: 15, 27.5, 35, 40, 50\r\n\r\n[\/hidden-answer]\r\n\r\nThis example is demonstrated in this video.\r\n\r\nhttps:\/\/youtu.be\/ECOeeDrUxpo\r\n\r\n<\/div>\r\nNote that the 5 number summary divides the data into four intervals, each of which will contain about 25% of the data. 
In the previous example, that means about 25% of households have income between $40 thousand and $50 thousand.\r\n\r\nFor visualizing data, there is a graphical representation of a 5-number summary called a <strong>box plot<\/strong>, or box and whisker graph.\r\n<div class=\"textbox\">\r\n<h3>Box plot<\/h3>\r\nA <strong>box plot<\/strong> is a graphical representation of a five-number summary.\r\n\r\n<\/div>\r\nTo create a box plot, a number line is first drawn. A box is drawn from the first quartile to the third quartile, and a line is drawn through the box at the median. \u201cWhiskers\u201d are extended out to the minimum and maximum values.\r\n<div class=\"textbox exercises\">\r\n<h3>examples<\/h3>\r\nThe box plot below is based on the 9 female height data with 5 number summary:\r\n\r\n59, 62, 66, 69, 72.\r\n\r\n<img class=\"aligncenter wp-image-428\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/276\/2016\/10\/12201259\/5numbersummary.png\" alt=\"Number line titled Heights (inches), in increments of 1 from 55-75. Above this, a vertical line indicates 59. A horizontal line connects this to the next vertical line, 62. This line forms the left side of a rectangle; a line at 66 is its right side. The line at 66 also serves as the left side of another rectangle, with a line at 69 as its right side. This line at 69 connects with a horizontal line to a final vertical line at 72. \" width=\"498\" height=\"123\" \/>\r\n\r\n<hr \/>\r\n\r\nThe box plot below is based on the household income data with 5 number summary:\r\n\r\n15, 27.5, 35, 40, 50<img class=\"aligncenter wp-image-429\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/276\/2016\/10\/12201335\/5numbersummary-2.png\" alt=\"Number line titled Thousands of Dollars, in increments of 5 from 0-55. Above this, a vertical line indicates 15. A horizontal line connects this to the next vertical line, 27.5. 
This line forms the left side of a rectangle; a line at 35 is its right side. The line at 35 also serves as the left side of another rectangle, with a line at 40 as its right side. This line at 40 connects with a horizontal line to a final vertical line at 50. \" width=\"501\" height=\"137\" \/>\r\n\r\nBox plot creation is described further here.\r\n\r\nhttps:\/\/youtu.be\/s4SPGFlMBMU\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>Try It<\/h3>\r\nCreate a box plot based on the textbook price data from the last Try It.\r\n\r\n<\/div>\r\nBox plots are particularly useful for comparing data from two populations.\r\n<div class=\"textbox exercises\">\r\n<h3>examples<\/h3>\r\nThe box plot of service times for two fast-food restaurants is shown below.\r\n\r\n<img class=\"aligncenter wp-image-430\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/276\/2016\/10\/12201425\/store1store2.png\" alt=\"Number line titled Service Time (minutes), in increments of 1 from 0-10. Two box plots are above it. The top one is labeled Store 1. A vertical line indicates 0.7. A horizontal line connects this to the next vertical line, 1.8. This line forms the left side of a rectangle; a line at 2.3 is its right side. The line at 2.3 also serves as the left side of another rectangle, with a line at 2.9 as its right side. This line at 2.9 connects with a horizontal line to a final vertical line at 6.3. The bottom box plot is labeled Store 2. A vertical line indicates 0.5. A horizontal line connects this to the next vertical line, 1.1. This line forms the left side of a rectangle; a line at 2.1 is its right side. The line at 2.1 also serves as the left side of another rectangle, with a line at 5.7 as its right side. This line at 5.7 connects with a horizontal line to a final vertical line at 9.6.\" width=\"501\" height=\"228\" \/>\r\n\r\nWhile store 2 had a slightly shorter median service time (2.1 minutes vs. 
2.3 minutes), store 2 is less consistent, with a wider spread of the data.\r\n\r\nAt store 1, 75% of customers were served within 2.9 minutes, while at store 2, 75% of customers were served within 5.7 minutes.\r\n\r\nWhich store should you go to in a hurry?\r\n[reveal-answer q=\"770439\"]Show Solution[\/reveal-answer]\r\n[hidden-answer a=\"770439\"]That depends upon your opinions about luck \u2013 25% of customers at store 2 had to wait between 5.7 and 9.6 minutes.[\/hidden-answer]\r\n\r\n<hr \/>\r\n\r\nThe box plot below is based on the birth weights of infants with severe idiopathic respiratory distress syndrome (SIRDS)[footnote]van Vliet, P.K. and Gupta, J.M. (1973) Sodium bicarbonate in idiopathic respiratory distress syndrome. Arch. Disease in Childhood, 48, 249\u2013255.\u00a0\u00a0 As quoted on <a href=\"http:\/\/openlearn.open.ac.uk\/mod\/oucontent\/view.php?id=398296&amp;section=1.1.3\" target=\"_blank\" rel=\"noopener\">http:\/\/openlearn.open.ac.uk\/mod\/oucontent\/view.php?id=398296&amp;section=1.1.3<\/a>[\/footnote]. The box plot is separated to show the birth weights of infants who survived and those that did not.<img class=\"aligncenter wp-image-431\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/276\/2016\/10\/12201506\/surviveddied.png\" alt=\"Number line titled Birth Weight (kg), in increments of 1 from 0-4. Two box plots are above it. The top one is labeled Survived. A vertical line indicates a little more than 1. A horizontal line connects this to the next vertical line, ~1.75. This line forms the left side of a rectangle; a line at ~2.2 is its right side. The line at ~2.2 also serves as the left side of another rectangle, with a line at ~2.8 as its right side. This line at ~2.8 connects with a horizontal line to a final vertical line at ~3.7. The bottom box plot is labeled Died. A vertical line indicates ~1.1. A horizontal line connects this to the next vertical line, ~1.25. 
This line forms the left side of a rectangle; a line at ~1.6 is its right side. The line at ~1.6 also serves as the left side of another rectangle, with a line at ~2.3 as its right side. This line at ~2.3 connects with a horizontal line to a final vertical line at ~2.75.\" width=\"500\" height=\"228\" \/>\r\n\r\nComparing the two groups, the box plot reveals that the birth weights of the infants that died appear to be, overall, smaller than the weights of infants that survived. In fact, we can see that the median birth weight of infants that survived is the same as the third quartile of the infants that died.\r\n\r\nSimilarly, we can see that the first quartile of the survivors is larger than the median weight of those that died, meaning that over 75% of the survivors had a birth weight larger than the median birth weight of those that died.\r\n\r\nLooking at the maximum value for those that died and the third quartile of the survivors, we can see that over 25% of the survivors had birth weights higher than the heaviest infant that died.\r\n\r\nThe box plot gives us a quick, albeit informal, way to determine that birth weight is quite likely linked to survival of infants with SIRDS.\r\n\r\nThe following video analyzes the examples above.\r\n\r\nhttps:\/\/youtu.be\/eUkgf-2NVO8\r\n\r\n<\/div>\r\n&nbsp;","rendered":"<div class=\"textbox learning-objectives\">\n<h3>Learning Outcomes<\/h3>\n<ul>\n<li>Calculate the mean, median, and mode of a set of data<\/li>\n<li>Calculate the range of a data set, and recognize its limitations in fully describing the behavior of a data set<\/li>\n<li>Calculate the standard deviation for a data set, and determine its units<\/li>\n<li>Identify the difference between population variance and sample variance<\/li>\n<li>Identify the quartiles for a data set, and the calculations used to define them<\/li>\n<li>Identify the parts of a five number summary for a set of data, and create a box plot using 
it<\/li>\n<\/ul>\n<\/div>\n<p>It is often desirable to use a few numbers to summarize a data set. One important aspect of a set of data\u00a0is where its center is located. In this lesson, measures of central tendency are discussed first. A second aspect of a distribution is how spread out it is. In other words, how much the data in the distribution vary from one another. The second section of this lesson describes measures of variability.<\/p>\n<h2>Measures of Central Tendency<\/h2>\n<h3>Mean, Median, and Mode<\/h3>\n<p><a href=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1141\/2016\/12\/22190716\/8027247398_96f7013a1f_z.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-955\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1141\/2016\/12\/22190716\/8027247398_96f7013a1f_z.jpg\" alt=\"Silver sphere which has red smaller spheres clustered on its left side, with magnetic attraction\" width=\"640\" height=\"427\" \/><\/a><\/p>\n<p>Let&#8217;s begin by trying to find the most &#8220;typical&#8221; value of a data set.<\/p>\n<p>Note that we just used the word &#8220;typical&#8221; although in many cases you might think of using the word &#8220;average.&#8221; We need to be careful with the word &#8220;average&#8221; as it means different things to different people in different contexts.\u00a0 One of the most common uses of the word &#8220;average&#8221; is what mathematicians and statisticians call the <strong>arithmetic mean<\/strong>, or just plain old <strong>mean<\/strong> for short.\u00a0 &#8220;Arithmetic mean&#8221; sounds rather fancy, but you have likely calculated a mean many times without realizing it; the mean is what most people think of when they use the word &#8220;average.&#8221;<\/p>\n<div class=\"textbox\">\n<h3>Mean<\/h3>\n<p>The <strong>mean<\/strong> of a set of data is the sum of the data values divided by the number of 
values.<\/p>\n<\/div>\n<div class=\"textbox exercises\">\n<h3>examples<\/h3>\n<p>Marci\u2019s exam scores for her last math class were 79, 86, 82, and 94. What would the mean of these values be?<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q162631\">Show Solution<\/span><\/p>\n<div id=\"q162631\" class=\"hidden-answer\" style=\"display: none\">[latex]\\frac{79+86+82+94}{4}=85.25[\/latex]. Typically we round means to one more decimal place than the original data had. In this case, we would round 85.25 to 85.3.<\/div>\n<\/div>\n<hr \/>\n<p>The number of touchdown (TD) passes thrown by each of the 31 teams in the National Football League in the 2000 season is shown below.<\/p>\n<p>37 33 33 32 29 28 28 23 22 22 22 21 21 21 20<\/p>\n<p>20 19 19 18 18 18 18 16 15 14 14 14 12 12 9 6<\/p>\n<p>What is the mean number of TD passes?<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q244479\">Show Solution<\/span><\/p>\n<div id=\"q244479\" class=\"hidden-answer\" style=\"display: none\">\n<p>Adding these values, we get 634 total TDs. Dividing by 31, the number of data values, we get 634\/31 = 20.4516. 
It would be appropriate to round this to 20.5.<\/p>\n<p>It would be most correct for us to report that \u201cThe mean number of touchdown passes thrown in the NFL in the 2000 season was 20.5 passes,\u201d but it is not uncommon to see the more casual word \u201caverage\u201d used in place of \u201cmean.\u201d<\/p>\n<\/div>\n<\/div>\n<p>Both examples are described further in the following video.<\/p>\n<p><iframe loading=\"lazy\" id=\"oembed-1\" title=\"Finding the mean of a data set\" width=\"500\" height=\"281\" src=\"https:\/\/www.youtube.com\/embed\/3if9Le2sO0c?feature=oembed&#38;rel=0\" frameborder=\"0\" allowfullscreen=\"allowfullscreen\"><\/iframe><\/p>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>Try It<\/h3>\n<p>The price of a jar of peanut butter at 5 stores was $3.29, $3.59, $3.79, $3.75, and $3.99. Find the mean price.<\/p>\n<\/div>\n<div class=\"textbox exercises\">\n<h3>examples<\/h3>\n<p>The one hundred families in a particular neighborhood are asked their annual household income, to the nearest $5 thousand. 
The results are summarized in a frequency table below.<\/p>\n<table>\n<tbody>\n<tr>\n<td><strong>Income (thousands of dollars)<\/strong><\/td>\n<td><strong>Frequency<\/strong><\/td>\n<\/tr>\n<tr>\n<td>15<\/td>\n<td>6<\/td>\n<\/tr>\n<tr>\n<td>20<\/td>\n<td>8<\/td>\n<\/tr>\n<tr>\n<td>25<\/td>\n<td>11<\/td>\n<\/tr>\n<tr>\n<td>30<\/td>\n<td>17<\/td>\n<\/tr>\n<tr>\n<td>35<\/td>\n<td>19<\/td>\n<\/tr>\n<tr>\n<td>40<\/td>\n<td>20<\/td>\n<\/tr>\n<tr>\n<td>45<\/td>\n<td>12<\/td>\n<\/tr>\n<tr>\n<td>50<\/td>\n<td>7<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>What is the mean household income in this neighborhood?<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q231491\">Show Solution<\/span><\/p>\n<div id=\"q231491\" class=\"hidden-answer\" style=\"display: none\">\n<p>Calculating the mean by hand could get tricky if we try to type in all 100 values:<\/p>\n<p>[latex]\\frac{\\overbrace{15+\\cdots+15}^{\\text{6 terms}}+\\overbrace{20+\\cdots+20}^{\\text{8 terms}}+\\overbrace{25+\\cdots+25}^{\\text{11 terms}}+\\cdots}{\\text{100}}[\/latex]<\/p>\n<p>We could calculate this more easily by noticing that adding 15 to itself six times is the same as [latex]15\\cdot6[\/latex] = 90. 
Using this simplification, we get<\/p>\n<p>[latex]\\frac{15\\cdot6+20\\cdot8+25\\cdot11+30\\cdot17+35\\cdot19+40\\cdot20+45\\cdot12+50\\cdot7}{\\text{100}}=\\frac{3390}{100}=33.9[\/latex]<\/p>\n<p>The mean household income of our sample is 33.9 thousand dollars ($33,900).<\/p>\n<\/div>\n<\/div>\n<hr \/>\n<p>Extending the last example, suppose a new family with a household income of $5 million ($5000 thousand) moves into the neighborhood.<\/p>\n<p>What is the new mean of this neighborhood&#8217;s income?<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q820039\">Show Solution<\/span><\/p>\n<div id=\"q820039\" class=\"hidden-answer\" style=\"display: none\">\n<p>Adding this to our sample, our mean is now:<\/p>\n<p>[latex]\\frac{15\\cdot6+20\\cdot8+25\\cdot11+30\\cdot17+35\\cdot19+40\\cdot20+45\\cdot12+50\\cdot7+5000\\cdot1}{\\text{101}}=\\frac{8390}{101}=83.069[\/latex]<\/p>\n<\/div>\n<\/div>\n<p>Both situations are explained further in this video.<\/p>\n<p><iframe loading=\"lazy\" id=\"oembed-2\" title=\"Mean from a frequency table\" width=\"500\" height=\"281\" src=\"https:\/\/www.youtube.com\/embed\/1_4Hxcq8DpQ?feature=oembed&#38;rel=0\" frameborder=\"0\" allowfullscreen=\"allowfullscreen\"><\/iframe><\/p>\n<\/div>\n<p>While 83.1 thousand dollars ($83,069) is the correct mean household income, it no longer represents a \u201ctypical\u201d value.<\/p>\n<p>Imagine the data values on a see-saw or balance scale. The mean is the value that keeps the data in balance, like in the picture below.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-423\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/276\/2016\/10\/12182802\/balance1.png\" alt=\"Drawing of a balance bar. A large blue block is on left end, and two smaller blue rectangles are on right end of balance point. 
One is close to the balance, one is further away.\" width=\"349\" height=\"61\" \/><\/p>\n<p>If we graph our household data, the $5 million data value is so far out to the right that the mean has to adjust up to keep things in balance.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-424\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/276\/2016\/10\/12182831\/balance2.png\" alt=\"Drawing of a balance bar. A large blue block is on left end and two smaller blue rectangles are also on the left of the balance point. On right, a small blue rectangle is significantly far away from the balance point.\" width=\"1256\" height=\"100\" \/><\/p>\n<p>For this reason, when working with data that have <strong>outliers<\/strong> \u2013 values far outside the primary grouping \u2013 it is common to use a different measure of center, the <strong>median<\/strong>.<\/p>\n<div class=\"textbox\">\n<h3>Median<\/h3>\n<p>The <strong>median<\/strong> of a set of data is the value in the middle when the data is in order.<\/p>\n<ul>\n<li>To find the median, begin by listing the data in order from smallest to largest, or largest to smallest.<\/li>\n<li>If the number of data values, <em>N<\/em>, is odd, then the median is the middle data value. This value can be found by rounding <em>N<\/em>\/2 up to the next whole number.<\/li>\n<li>If the number of data values is even, there is no one middle value, so we find the mean of the two middle values (values <em>N<\/em>\/2 and <em>N<\/em>\/2 + 1)<\/li>\n<\/ul>\n<\/div>\n<div class=\"textbox exercises\">\n<h3>example<\/h3>\n<p>Returning to the football touchdown data, we would start by listing the data in order. 
Luckily, it was already in decreasing order, so we can work with it without needing to reorder it first.<\/p>\n<p>37 33 33 32 29 28 28 23 22 22 22 21 21 21 20<\/p>\n<p>20 19 19 18 18 18 18 16 15 14 14 14 12 12 9 6<\/p>\n<p>What is the median TD value?<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q866224\">Show Solution<\/span><\/p>\n<div id=\"q866224\" class=\"hidden-answer\" style=\"display: none\">Since there are 31 data values, an odd number, the median will be the middle number, the 16th data value (31\/2 = 15.5, round up to 16, leaving 15 values below and 15 above). The 16th data value is 20, so the median number of touchdown passes in the 2000 season was 20 passes. Notice that for this data, the median is fairly close to the mean we calculated earlier, 20.5.<\/div>\n<\/div>\n<hr \/>\n<p>Find the median of these quiz scores: 5 10 8 6 4 8 2 5 7 7<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q282623\">Show Solution<\/span><\/p>\n<div id=\"q282623\" class=\"hidden-answer\" style=\"display: none\">\n<p>We start by listing the data in order: 2 4 5 5 6 7 7 8 8 10<\/p>\n<p>Since there are 10 data values, an even number, there is no one middle number. So we find the mean of the two middle numbers, 6 and 7, and get (6+7)\/2 = 6.5.<\/p>\n<p>The median quiz score was 6.5.<\/p>\n<\/div>\n<\/div>\n<p>Learn more about these median examples in this video.<\/p>\n<p><iframe loading=\"lazy\" id=\"oembed-3\" title=\"Median from a data list\" width=\"500\" height=\"281\" src=\"https:\/\/www.youtube.com\/embed\/WEdr_rSRObk?feature=oembed&#38;rel=0\" frameborder=\"0\" allowfullscreen=\"allowfullscreen\"><\/iframe><\/p>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>Try It<\/h3>\n<p>The price of a jar of peanut butter at 5 stores was\u00a0$3.29, $3.59, $3.79, $3.75, and $3.99. 
Find the median price.<\/p>\n<\/div>\n<div class=\"textbox exercises\">\n<h3>Example<\/h3>\n<p>Let us return now to our original household income data:<\/p>\n<table>\n<tbody>\n<tr>\n<td><strong>Income (thousands of dollars)<\/strong><\/td>\n<td><strong>Frequency<\/strong><\/td>\n<\/tr>\n<tr>\n<td>15<\/td>\n<td>6<\/td>\n<\/tr>\n<tr>\n<td>20<\/td>\n<td>8<\/td>\n<\/tr>\n<tr>\n<td>25<\/td>\n<td>11<\/td>\n<\/tr>\n<tr>\n<td>30<\/td>\n<td>17<\/td>\n<\/tr>\n<tr>\n<td>35<\/td>\n<td>19<\/td>\n<\/tr>\n<tr>\n<td>40<\/td>\n<td>20<\/td>\n<\/tr>\n<tr>\n<td>45<\/td>\n<td>12<\/td>\n<\/tr>\n<tr>\n<td>50<\/td>\n<td>7<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>What is the median of this neighborhood&#8217;s household income?<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q941761\">Show Solution<\/span><\/p>\n<div id=\"q941761\" class=\"hidden-answer\" style=\"display: none\">\n<p>Here we have 100 data values. If we didn\u2019t already know that, we could find it by adding the frequencies. Since 100 is an even number, we need to find the mean of the middle two data values &#8211; the 50th and 51st data values. 
To find these, we start counting up from the bottom:<\/p>\n<p>There are 6 data values of $15, so \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Values 1 to 6 are $15 thousand<\/p>\n<p>The next 8 data values are $20, so \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Values 7 to (6+8)=14 are $20 thousand<\/p>\n<p>The next 11 data values are $25, so \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Values 15 to (14+11)=25 are $25 thousand<\/p>\n<p>The next 17 data values are $30, so \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Values 26 to (25+17)=42 are $30 thousand<\/p>\n<p>The next 19 data values are $35, so \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Values 43 to (42+19)=61 are $35 thousand<\/p>\n<p>From this we can tell that values 50 and 51 will be $35 thousand, and the mean of these two values is $35 thousand. The median income in this neighborhood is $35 thousand.<\/p>\n<\/div>\n<\/div>\n<hr \/>\n<p>If we add in the new neighbor with a $5 million household income, then there will be 101 data values, and the 51st value will be the median. As we discovered in the last example, the 51st value is $35 thousand. Notice that the new neighbor did not affect the median in this case. 
The median is not swayed as much by outliers as the mean is.<\/p>\n<p>View more about the median of this neighborhood&#8217;s household incomes here.<\/p>\n<p><iframe loading=\"lazy\" id=\"oembed-4\" title=\"Median from a frequency table\" width=\"500\" height=\"281\" src=\"https:\/\/www.youtube.com\/embed\/kqEu9EDkmfU?feature=oembed&#38;rel=0\" frameborder=\"0\" allowfullscreen=\"allowfullscreen\"><\/iframe><\/p>\n<\/div>\n<p>&nbsp;<\/p>\n<p>In addition to the mean and the median, there is one other common measurement of the &#8220;typical&#8221; value of a data set: the <strong>mode<\/strong>.<\/p>\n<div class=\"textbox\">\n<h3>Mode<\/h3>\n<p>The <strong>mode<\/strong> is the element of the data set that occurs most frequently.<\/p>\n<\/div>\n<p>The mode is fairly useless with data like weights or heights where there are a large number of possible values. The mode is most commonly used for categorical data, for which median and mean cannot be computed.<\/p>\n<div class=\"textbox exercises\">\n<h3>Example<\/h3>\n<p>In our vehicle color survey earlier in this section, we collected the data<\/p>\n<table>\n<tbody>\n<tr>\n<th>Color<\/th>\n<th>Frequency<\/th>\n<\/tr>\n<tr>\n<td>Blue<\/td>\n<td>3<\/td>\n<\/tr>\n<tr>\n<td>Green<\/td>\n<td>5<\/td>\n<\/tr>\n<tr>\n<td>Red<\/td>\n<td>4<\/td>\n<\/tr>\n<tr>\n<td>White<\/td>\n<td>3<\/td>\n<\/tr>\n<tr>\n<td>Black<\/td>\n<td>2<\/td>\n<\/tr>\n<tr>\n<td>Grey<\/td>\n<td>3<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Which color is the mode?<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q638793\">Show Solution<\/span><\/p>\n<div id=\"q638793\" class=\"hidden-answer\" style=\"display: none\">For this data, Green is the mode, since it is the data value that occurred the most frequently.<\/div>\n<\/div>\n<p>Mode in this example is explained by the video here.<\/p>\n<p><iframe loading=\"lazy\" id=\"oembed-5\" title=\"Mode for categorical data\" 
width=\"500\" height=\"281\" src=\"https:\/\/www.youtube.com\/embed\/pFpkWrib3Jk?feature=oembed&#38;rel=0\" frameborder=\"0\" allowfullscreen=\"allowfullscreen\"><\/iframe><\/p>\n<\/div>\n<p>It is possible for a data set to have more than one mode if several categories have the same frequency, or no mode if every category occurs only once.<\/p>\n<div class=\"textbox key-takeaways\">\n<h3>Try It<\/h3>\n<p>Reviewers were asked to rate a product on a scale of 1 to 5. Find<\/p>\n<ol>\n<li>The mean rating<\/li>\n<li>The median rating<\/li>\n<li>The mode rating<\/li>\n<\/ol>\n<table>\n<tbody>\n<tr>\n<td><strong>Rating<\/strong><\/td>\n<td><strong>Frequency<\/strong><\/td>\n<\/tr>\n<tr>\n<td>1<\/td>\n<td>4<\/td>\n<\/tr>\n<tr>\n<td>2<\/td>\n<td>8<\/td>\n<\/tr>\n<tr>\n<td>3<\/td>\n<td>7<\/td>\n<\/tr>\n<tr>\n<td>4<\/td>\n<td>3<\/td>\n<\/tr>\n<tr>\n<td>5<\/td>\n<td>1<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<h2>Measures of Variation<\/h2>\n<h3>Range and Standard Deviation<\/h3>\n<p>Consider these three sets of quiz scores:<\/p>\n<p style=\"padding-left: 120px;\">Section A: 5 5 5 5 5 5 5 5 5 5<\/p>\n<p style=\"padding-left: 120px;\">Section B: 0 0 0 0 0 10 10 10 10 10<\/p>\n<p style=\"padding-left: 120px;\">Section C: 4 4 4 5 5 5 5 6 6 6<\/p>\n<p>All three of these sets of data have a mean of 5 and median of 5, yet the sets of scores are clearly quite different. 
In section A, everyone had the same score; in section B half the class got no points and the other half got a perfect score, assuming this was a 10-point quiz.\u00a0 Section C was not as consistent as section A, but not as widely varied as section B.<\/p>\n<p>In addition to the mean and median, which are measures of the &#8220;typical&#8221; or &#8220;middle&#8221; value, we also need a measure of how &#8220;spread out&#8221; or varied each data set is.<a href=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1141\/2016\/12\/22194649\/299549598_e1007533c6_z.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-960\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1141\/2016\/12\/22194649\/299549598_e1007533c6_z.jpg\" alt=\"Collage of photos of trees around a central octagonal park space\" width=\"640\" height=\"481\" \/><\/a><\/p>\n<p>There are several ways to measure this &#8220;spread&#8221; of the data. The first is the simplest and is called the <strong>range<\/strong>.<\/p>\n<div class=\"textbox\">\n<h3>Range<\/h3>\n<p>The range is the difference between the maximum value and the minimum value of the data set.<\/p>\n<\/div>\n<div class=\"textbox exercises\">\n<h3>example<\/h3>\n<p>Using the quiz scores from above,<\/p>\n<p>For section A, the range is 0 since both maximum and minimum are 5 and 5 \u2013 5 =\u00a00<\/p>\n<p>For section B, the range is 10 since 10 \u2013 0\u00a0=\u00a010<\/p>\n<p>For section C, the range is 2 since 6 \u2013 4 = 2<\/p>\n<p>In the last example, the range seems to be revealing how spread out the data is. However, suppose we add a fourth section, Section D, with scores 0 5 5 5 5 5 5 5 5 10.<\/p>\n<p>This section also has a mean and median of 5. The range is 10, yet this data set is quite different than Section B. 
To better illuminate the differences, we\u2019ll have to turn to more sophisticated measures of variation.<\/p>\n<p>The range of this example is explained in the following video.<\/p>\n<p><iframe loading=\"lazy\" id=\"oembed-6\" title=\"Finding range of a data set\" width=\"500\" height=\"281\" src=\"https:\/\/www.youtube.com\/embed\/b3ofWalrHgQ?feature=oembed&#38;rel=0\" frameborder=\"0\" allowfullscreen=\"allowfullscreen\"><\/iframe><\/p>\n<\/div>\n<div class=\"textbox\">\n<h3>Standard deviation<\/h3>\n<p>The standard deviation is a measure of variation based on measuring how far each data value deviates, or is different, from the mean. A few important characteristics:<\/p>\n<ul>\n<li>Standard deviation is never negative. It is zero if all the data values are equal, and gets larger as the data spreads out.<\/li>\n<li>Standard deviation has the same units as the original data.<\/li>\n<li>Standard deviation, like the mean, can be highly influenced by outliers.<\/li>\n<\/ul>\n<\/div>\n<p>Using the data from section D, we could compute for each data value the difference between the data value and the mean:<\/p>\n<table>\n<tbody>\n<tr>\n<th>data value<\/th>\n<th>deviation: data value &#8211; mean<\/th>\n<\/tr>\n<tr>\n<td>0<\/td>\n<td>0-5 = -5<\/td>\n<\/tr>\n<tr>\n<td>5<\/td>\n<td>5-5 = 0<\/td>\n<\/tr>\n<tr>\n<td>5<\/td>\n<td>5-5 = 0<\/td>\n<\/tr>\n<tr>\n<td>5<\/td>\n<td>5-5 = 0<\/td>\n<\/tr>\n<tr>\n<td>5<\/td>\n<td>5-5 = 0<\/td>\n<\/tr>\n<tr>\n<td>5<\/td>\n<td>5-5 = 0<\/td>\n<\/tr>\n<tr>\n<td>5<\/td>\n<td>5-5 = 0<\/td>\n<\/tr>\n<tr>\n<td>5<\/td>\n<td>5-5 = 0<\/td>\n<\/tr>\n<tr>\n<td>5<\/td>\n<td>5-5 = 0<\/td>\n<\/tr>\n<tr>\n<td>10<\/td>\n<td>10-5 = 5<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>We would like to get an idea of the &#8220;average&#8221; deviation from the mean, but if we find the average of the values in the second column the negative and positive values cancel each other out (this will always happen), so to prevent this we square 
every value in the second column:<\/p>\n<table>\n<tbody>\n<tr>\n<th>data value<\/th>\n<th>deviation: data value &#8211; mean<\/th>\n<th>deviation squared<\/th>\n<\/tr>\n<tr>\n<td>0<\/td>\n<td>0-5 = -5<\/td>\n<td>(-5)<sup>2<\/sup> = 25<\/td>\n<\/tr>\n<tr>\n<td>5<\/td>\n<td>5-5 = 0<\/td>\n<td>0<sup>2<\/sup> = 0<\/td>\n<\/tr>\n<tr>\n<td>5<\/td>\n<td>5-5 = 0<\/td>\n<td>0<sup>2<\/sup> = 0<\/td>\n<\/tr>\n<tr>\n<td>5<\/td>\n<td>5-5 = 0<\/td>\n<td>0<sup>2<\/sup> = 0<\/td>\n<\/tr>\n<tr>\n<td>5<\/td>\n<td>5-5 = 0<\/td>\n<td>0<sup>2<\/sup> = 0<\/td>\n<\/tr>\n<tr>\n<td>5<\/td>\n<td>5-5 = 0<\/td>\n<td>0<sup>2<\/sup> = 0<\/td>\n<\/tr>\n<tr>\n<td>5<\/td>\n<td>5-5 = 0<\/td>\n<td>0<sup>2<\/sup> = 0<\/td>\n<\/tr>\n<tr>\n<td>5<\/td>\n<td>5-5 = 0<\/td>\n<td>0<sup>2<\/sup> = 0<\/td>\n<\/tr>\n<tr>\n<td>5<\/td>\n<td>5-5 = 0<\/td>\n<td>0<sup>2<\/sup> = 0<\/td>\n<\/tr>\n<tr>\n<td>10<\/td>\n<td>10-5 = 5<\/td>\n<td>(5)<sup>2<\/sup> = 25<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>We then add the squared deviations up to get 25 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 0 + 25 = 50.\u00a0 Ordinarily we would then divide by the number of scores, <em>n<\/em> (in this case, 10), to find the mean of the squared deviations.\u00a0 But we only do this if the data set represents a population; if the data set represents a sample (as it almost always does), we instead divide by <em>n<\/em> &#8211; 1 (in this case, 10 &#8211; 1 = 9).<a class=\"footnote\" title=\"The reason we do this is highly technical, but we can see how it might be useful by considering the case of a small sample from a population that contains an outlier, which would increase the average deviation: the outlier very likely won't be included in the sample, so the mean deviation of the sample would underestimate the mean deviation of the population; thus we divide by a slightly smaller number to get a slightly bigger average deviation.\" id=\"return-footnote-1468-1\" href=\"#footnote-1468-1\" aria-label=\"Footnote 1\"><sup 
class=\"footnote\">[1]<\/sup><\/a><\/p>\n<p>So in our example, we would have 50\/10 = 5 if section D represents a population and 50\/9, which is about 5.56, if section D represents a sample.\u00a0 These values (5 and 5.56) are called, respectively, the <strong>population variance<\/strong> and the <strong>sample variance<\/strong> for section D.<\/p>\n<p>Variance can be a useful statistical concept, but note that the units of variance in this instance would be points-squared since we squared all of the deviations. What are points-squared? Good question. We would rather deal with the units we started with (points in this case), so to convert back we take the square root and get:<\/p>\n<p style=\"text-align: center;\">[latex]\\begin{align}&\\text{population standard deviation}=\\sqrt{\\frac{50}{10}}=\\sqrt{5}\\approx2.2\\\\&\\text{or}\\\\&\\text{sample standard deviation}=\\sqrt{\\frac{50}{9}}\\approx2.4\\\\\\end{align}[\/latex]<\/p>\n<p>If we are unsure whether the data set is a sample or a population, we will usually assume it is a sample, and we will round answers to one more decimal place than the original data, as we have done above.<\/p>\n<div class=\"textbox\">\n<h3>To compute standard deviation<\/h3>\n<ol>\n<li>Find the deviation of each data value from the mean. In other words, subtract the mean from the data value.<\/li>\n<li>Square each deviation.<\/li>\n<li>Add the squared deviations.<\/li>\n<li>Divide by <em>n<\/em>, the number of data values, if the data represents a whole population; divide by <em>n<\/em> \u2013 1 if the data is from a sample.<\/li>\n<li>Compute the square root of the result.<\/li>\n<\/ol>\n<\/div>\n<div class=\"textbox exercises\">\n<h3>example<\/h3>\n<p>Computing the standard deviation for Section B above, we first calculate that the mean is 5. 
Using a table can help keep track of your computations for the standard deviation:<\/p>\n<table>\n<tbody>\n<tr>\n<th>data value<\/th>\n<th>deviation: data value &#8211; mean<\/th>\n<th>deviation squared<\/th>\n<\/tr>\n<tr>\n<td>0<\/td>\n<td>0-5 = -5<\/td>\n<td>(-5)<sup>2<\/sup> = 25<\/td>\n<\/tr>\n<tr>\n<td>0<\/td>\n<td>0-5 = -5<\/td>\n<td>(-5)<sup>2<\/sup> = 25<\/td>\n<\/tr>\n<tr>\n<td>0<\/td>\n<td>0-5 = -5<\/td>\n<td>(-5)<sup>2<\/sup> = 25<\/td>\n<\/tr>\n<tr>\n<td>0<\/td>\n<td>0-5 = -5<\/td>\n<td>(-5)<sup>2<\/sup> = 25<\/td>\n<\/tr>\n<tr>\n<td>0<\/td>\n<td>0-5 = -5<\/td>\n<td>(-5)<sup>2<\/sup> = 25<\/td>\n<\/tr>\n<tr>\n<td>10<\/td>\n<td>10-5 = 5<\/td>\n<td>(5)<sup>2<\/sup> = 25<\/td>\n<\/tr>\n<tr>\n<td>10<\/td>\n<td>10-5 = 5<\/td>\n<td>(5)<sup>2<\/sup> = 25<\/td>\n<\/tr>\n<tr>\n<td>10<\/td>\n<td>10-5 = 5<\/td>\n<td>(5)<sup>2<\/sup> = 25<\/td>\n<\/tr>\n<tr>\n<td>10<\/td>\n<td>10-5 = 5<\/td>\n<td>(5)<sup>2<\/sup> = 25<\/td>\n<\/tr>\n<tr>\n<td>10<\/td>\n<td>10-5 = 5<\/td>\n<td>(5)<sup>2<\/sup> = 25<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Assuming this data represents a population, we will add the squared deviations, divide by 10, the number of data values, and compute the square root:<\/p>\n<p style=\"text-align: center;\">[latex]\\sqrt{\\frac{25+25+25+25+25+25+25+25+25+25}{10}}=\\sqrt{\\frac{250}{10}}=5[\/latex]<\/p>\n<p>Notice that the standard deviation of this data set is much larger than that of section D since the data in this set is more spread out.<\/p>\n<p>For comparison, the standard deviations of all four sections are:<\/p>\n<table>\n<tbody>\n<tr>\n<td>Section A: 5 5 5 5 5 5 5 5 5 5<\/td>\n<td>Standard deviation: 0<\/td>\n<\/tr>\n<tr>\n<td>Section B: 0 0 0 0 0 10 10 10 10 10<\/td>\n<td>Standard deviation: 5<\/td>\n<\/tr>\n<tr>\n<td>Section C: 4 4 4 5 5 5 5 6 6 6<\/td>\n<td>Standard deviation: 0.8<\/td>\n<\/tr>\n<tr>\n<td>Section D: 0 5 5 5 5 5 5 5 5 10<\/td>\n<td>Standard deviation: 2.2<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>See the following video for 
more about calculating the standard deviation in this example.<\/p>\n<p><iframe loading=\"lazy\" id=\"oembed-7\" title=\"Computing standard deviation 1\" width=\"500\" height=\"281\" src=\"https:\/\/www.youtube.com\/embed\/wS8z90f04OU?feature=oembed&#38;rel=0\" frameborder=\"0\" allowfullscreen=\"allowfullscreen\"><\/iframe><\/p>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>Try It<\/h3>\n<p>The price of a jar of peanut butter at 5 stores was\u00a0$3.29, $3.59, $3.79, $3.75, and $3.99. Find the standard deviation of the prices.<\/p>\n<\/div>\n<p>&nbsp;<\/p>\n<p>Where standard deviation is a measure of variation based on the mean, <strong>quartiles<\/strong> are based on the median.<\/p>\n<div class=\"textbox\">\n<h3>Quartiles<\/h3>\n<p>Quartiles are values that divide the data in quarters.<\/p>\n<p>The first quartile (Q1) is the value so that 25% of the data values are below it; the third quartile (Q3) is the value so that 75% of the data values are below it. You may have guessed that the second quartile is the same as the median, since the median is the value so that 50% of the data values are below it.<\/p>\n<p>This divides the data into quarters; 25% of the data is between the minimum and Q1, 25% is between Q1 and the median, 25% is between the median and Q3, and 25% is between Q3 and the maximum value.<\/p>\n<\/div>\n<p>While quartiles are not a 1-number summary of variation like standard deviation, the quartiles are used with the median, minimum, and maximum values to form a <strong>5 number summary<\/strong> of the data.<\/p>\n<div class=\"textbox\">\n<h3>Five number summary<\/h3>\n<p>The five number summary takes this form:<\/p>\n<p style=\"text-align: center;\">Minimum, Q1, Median, Q3, Maximum<\/p>\n<\/div>\n<p><span style=\"font-size: 16px;\">To find the first quartile, we need to find the data value so that 25% of the data is below it. 
If <\/span><em style=\"font-size: 16px;\">n<\/em><span style=\"font-size: 16px;\"> is the number of data values, we compute a locator by finding 25% of <\/span><em style=\"font-size: 16px;\">n<\/em><span style=\"font-size: 16px;\">. If this locator is not a whole number, we round up and use the data value in that position. If the locator is a whole number, we find the mean of the data value in that position and the next data value. This is identical to the process we used to find the median, except we use 25% of the data values rather than half the data values as the locator.<\/span><\/p>\n<div class=\"textbox\">\n<h3>To find the first quartile, Q1<\/h3>\n<ol>\n<li>Begin by ordering the data from smallest to largest<\/li>\n<li>Compute the locator: <em>L<\/em> = 0.25<em>n<\/em><\/li>\n<li>If <em>L<\/em> is not a whole number:\n<ul>\n<li>Round up to <em>L+<\/em><\/li>\n<li>Use the data value in the <em>L+<\/em>th position<\/li>\n<\/ul>\n<\/li>\n<li>If <em>L<\/em> is a whole number:\n<ul>\n<li>Find the mean of the data values in the <em>L<\/em>th and (<em>L<\/em>+1)th positions.<\/li>\n<\/ul>\n<\/li>\n<\/ol>\n<\/div>\n<div class=\"textbox\">\n<h3>To find the third quartile, Q3<\/h3>\n<p>Use the same procedure as for Q1, but with locator: <em>L<\/em> = 0.75<em>n<\/em><\/p>\n<p>Examples should help make this clearer.<\/p>\n<\/div>\n<div class=\"textbox exercises\">\n<h3>examples<\/h3>\n<p>Suppose we have measured 9 females, and their heights (in inches) sorted from smallest to largest are:<\/p>\n<p>59 60 62 64 66 67 69 70 72<\/p>\n<p>What are the first and third quartiles?<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q450713\">Show Solution<\/span><\/p>\n<div id=\"q450713\" class=\"hidden-answer\" style=\"display: none\">\n<p>To find the first quartile we first compute the locator: 25% of 9 is <em>L<\/em> = 0.25(9) = 2.25. Since this value is not a whole number, we round up to 3. 
The first quartile will be the third data value: 62 inches.<\/p>\n<p>To find the third quartile, we again compute the locator: 75% of 9 is 0.75(9) = 6.75. Since this value is not a whole number, we round up to 7. The third quartile will be the seventh data value: 69 inches.<\/p>\n<\/div>\n<\/div>\n<hr \/>\n<p>Suppose we had measured 8 females, and their heights (in inches) sorted from smallest to largest are:<\/p>\n<p>59 60 62 64 66 67 69 70<\/p>\n<p>What are the first and third quartiles? What is the 5 number summary?<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q699335\">Show Solution<\/span><\/p>\n<div id=\"q699335\" class=\"hidden-answer\" style=\"display: none\">\n<p>To find the first quartile we first compute the locator: 25% of 8 is <em>L<\/em> = 0.25(8) = 2. Since this value <em>is<\/em> a whole number, we will find the mean of the 2nd and 3rd data values: (60+62)\/2 = 61, so the first quartile is 61 inches.<\/p>\n<p>The third quartile is computed similarly, using 75% instead of 25%. <em>L<\/em> = 0.75(8) = 6. This is a whole number, so we will find the mean of the 6th and 7th data values: (67+69)\/2 = 68, so Q3 is 68.<\/p>\n<p>Note that the median could be computed the same way, using 50%.<\/p>\n<\/div>\n<\/div>\n<hr \/>\n<p>The 5-number summary combines the first and third quartile with the minimum, median, and maximum values.<\/p>\n<p>What are the 5-number summaries for each of the previous 2 examples?<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q190147\">Show Solution<\/span><\/p>\n<div id=\"q190147\" class=\"hidden-answer\" style=\"display: none\">\n<p>For the 9 female sample, the median is 66, the minimum is 59, and the maximum is 72. 
The 5 number summary is: 59, 62, 66, 69, 72.<\/p>\n<p>For the 8 female sample, the median is 65, the minimum is 59, and the maximum is 70, so the 5 number summary would be: 59, 61, 65, 68, 70.<\/p>\n<\/div>\n<\/div>\n<p>More about each set of women&#8217;s heights is in the following videos.<\/p>\n<p><iframe loading=\"lazy\" id=\"oembed-8\" title=\"Five number summary 1\" width=\"500\" height=\"281\" src=\"https:\/\/www.youtube.com\/embed\/00iQvPOOUu4?feature=oembed&#38;rel=0\" frameborder=\"0\" allowfullscreen=\"allowfullscreen\"><\/iframe><\/p>\n<p><iframe loading=\"lazy\" id=\"oembed-9\" title=\"Five number summary 2\" width=\"500\" height=\"281\" src=\"https:\/\/www.youtube.com\/embed\/x73G2Nep05g?feature=oembed&#38;rel=0\" frameborder=\"0\" allowfullscreen=\"allowfullscreen\"><\/iframe><\/p>\n<hr \/>\n<p>Returning to our quiz score data:\u00a0in each case, the first quartile locator is 0.25(10) = 2.5, so the first quartile will be the 3rd data value, and the third quartile will be the 8th data value. 
Creating the five-number summaries:<\/p>\n<table>\n<tbody>\n<tr>\n<td>Section and data<\/td>\n<td>5-number summary<\/td>\n<\/tr>\n<tr>\n<td>Section A: 5 5 5 5 5 5 5 5 5 5<\/td>\n<td>5, 5, 5, 5, 5<\/td>\n<\/tr>\n<tr>\n<td>Section B: 0 0 0 0 0 10 10 10 10 10<\/td>\n<td>0, 0, 5, 10, 10<\/td>\n<\/tr>\n<tr>\n<td>Section C: 4 4 4 5 5 5 5 6 6 6<\/td>\n<td>4, 4, 5, 6, 6<\/td>\n<\/tr>\n<tr>\n<td>Section D: 0 5 5 5 5 5 5 5 5 10<\/td>\n<td>0, 5, 5, 5, 10<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Of course, with a relatively small data set, finding a five-number summary is a bit silly, since the summary contains almost as many values as the original data.<\/p>\n<p>A video walkthrough of this example is available below.<\/p>\n<p><iframe loading=\"lazy\" id=\"oembed-10\" title=\"Five number summary 3\" width=\"500\" height=\"281\" src=\"https:\/\/www.youtube.com\/embed\/uifLbZKPUDU?feature=oembed&#38;rel=0\" frameborder=\"0\" allowfullscreen=\"allowfullscreen\"><\/iframe><\/p>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>Try It<\/h3>\n<p>The total cost of textbooks for the term was collected from 36 students. 
Find the 5 number summary of this data.<\/p>\n<p>$140\u00a0\u00a0\u00a0 $160\u00a0\u00a0\u00a0 $160\u00a0\u00a0\u00a0 $165\u00a0\u00a0\u00a0 $180\u00a0\u00a0\u00a0 $220\u00a0\u00a0\u00a0 $235\u00a0\u00a0\u00a0 $240\u00a0\u00a0\u00a0 $250\u00a0\u00a0\u00a0 $260\u00a0\u00a0\u00a0 $280\u00a0\u00a0\u00a0 $285<\/p>\n<p>$285\u00a0\u00a0\u00a0 $285\u00a0\u00a0\u00a0 $290\u00a0\u00a0\u00a0 $300\u00a0\u00a0\u00a0 $300\u00a0\u00a0\u00a0 $305\u00a0\u00a0\u00a0 $310\u00a0\u00a0\u00a0 $310\u00a0\u00a0\u00a0 $315\u00a0\u00a0\u00a0 $315\u00a0\u00a0\u00a0 $320\u00a0\u00a0\u00a0 $320<\/p>\n<p>$330\u00a0\u00a0\u00a0 $340\u00a0\u00a0\u00a0 $345\u00a0\u00a0\u00a0 $350\u00a0\u00a0\u00a0 $355\u00a0\u00a0\u00a0 $360\u00a0\u00a0\u00a0 $360\u00a0\u00a0\u00a0 $380\u00a0\u00a0\u00a0 $395\u00a0\u00a0\u00a0 $420\u00a0\u00a0\u00a0 $460\u00a0\u00a0\u00a0 $460<\/p>\n<\/div>\n<div class=\"textbox exercises\">\n<h3>Example<\/h3>\n<p>Returning to the household income data from earlier in the section, create the five-number summary.<\/p>\n<table>\n<tbody>\n<tr>\n<td><strong>Income (thousands of dollars)<\/strong><\/td>\n<td><strong>Frequency<\/strong><\/td>\n<\/tr>\n<tr>\n<td>15<\/td>\n<td>6<\/td>\n<\/tr>\n<tr>\n<td>20<\/td>\n<td>8<\/td>\n<\/tr>\n<tr>\n<td>25<\/td>\n<td>11<\/td>\n<\/tr>\n<tr>\n<td>30<\/td>\n<td>17<\/td>\n<\/tr>\n<tr>\n<td>35<\/td>\n<td>19<\/td>\n<\/tr>\n<tr>\n<td>40<\/td>\n<td>20<\/td>\n<\/tr>\n<tr>\n<td>45<\/td>\n<td>12<\/td>\n<\/tr>\n<tr>\n<td>50<\/td>\n<td>7<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q78296\">Show Solution<\/span><\/p>\n<div id=\"q78296\" class=\"hidden-answer\" style=\"display: none\">\n<p>By adding the frequencies, we can see there are 100 data values represented in the table. In Example 20, we found the median was $35 thousand. 
We can see in the table that the minimum income is $15 thousand, and the maximum is $50 thousand.<\/p>\n<p>To find Q1, we calculate the locator: <em>L<\/em> = 0.25(100) = 25. This is a whole number, so Q1 will be the mean of the 25th and 26th data values.<\/p>\n<p>Counting up in the data as we did before,<\/p>\n<p>There are 6 data values of $15, so \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Values 1 to 6 are $15 thousand<\/p>\n<p>The next 8 data values are $20, so \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Values 7 to (6+8)=14 are $20 thousand<\/p>\n<p>The next 11 data values are $25, so \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Values 15 to (14+11)=25 are $25 thousand<\/p>\n<p>The next 17 data values are $30, so \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Values 26 to (25+17)=42 are $30 thousand<\/p>\n<p>The 25th data value is $25 thousand, and the 26th data value is $30 thousand, so Q1 will be the mean of these: (25 + 30)\/2 = $27.5 thousand.<\/p>\n<p>To find Q3, we calculate the locator: <em>L<\/em> = 0.75(100) = 75. This is a whole number, so Q3 will be the mean of the 75th and 76th data values. 
Continuing our counting from earlier,<\/p>\n<p>The next 19 data values are $35, so \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Values 43 to (42+19)=61 are $35 thousand<\/p>\n<p>The next 20 data values are $40, so \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Values 62 to (61+20)=81 are $40 thousand<\/p>\n<p>Both the 75th and 76th data values lie in this group, so Q3 will be $40 thousand.<\/p>\n<p>Putting these values together into a five-number summary, we get: 15, 27.5, 35, 40, 50<\/p>\n<\/div>\n<\/div>\n<p>This example is demonstrated in this video.<\/p>\n<p><iframe loading=\"lazy\" id=\"oembed-11\" title=\"Five number summary from a frequency table\" width=\"500\" height=\"281\" src=\"https:\/\/www.youtube.com\/embed\/ECOeeDrUxpo?feature=oembed&#38;rel=0\" frameborder=\"0\" allowfullscreen=\"allowfullscreen\"><\/iframe><\/p>\n<\/div>\n<p>Note that the 5 number summary divides the data into four intervals, each of which will contain about 25% of the data. In the previous example, that means about 25% of households have income between $40 thousand and $50 thousand.<\/p>\n<p>For visualizing data, there is a graphical representation of a 5-number summary called a <strong>box plot<\/strong>, or box and whisker graph.<\/p>\n<div class=\"textbox\">\n<h3>Box plot<\/h3>\n<p>A <strong>box plot<\/strong> is a graphical representation of a five-number summary.<\/p>\n<\/div>\n<p>To create a box plot, a number line is first drawn. A box is drawn from the first quartile to the third quartile, and a line is drawn through the box at the median. 
\u201cWhiskers\u201d are extended out to the minimum and maximum values.<\/p>\n<div class=\"textbox exercises\">\n<h3>examples<\/h3>\n<p>The box plot below is based on the 9 female height data with 5 number summary:<\/p>\n<p>59, 62, 66, 69, 72.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-428\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/276\/2016\/10\/12201259\/5numbersummary.png\" alt=\"Number line titled Heights (inches), in increments of 1 from 55-75. Above this, a vertical line indicates 59. A horizontal line connects this to the next vertical line, 62. This line forms the left side of a rectangle; a line at 66 is its right side. The line at 66 also serves as the left side of another rectangle, with a line at 69 as its right side. This line at 69 connects with a horizontal line to a final vertical line at 72.\" width=\"498\" height=\"123\" \/><\/p>\n<hr \/>\n<p>The box plot below is based on the household income data with 5 number summary:<\/p>\n<p>15, 27.5, 35, 40, 50<img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-429\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/276\/2016\/10\/12201335\/5numbersummary-2.png\" alt=\"Number line titled Thousands of Dollars, in increments of 5 from 0-55. Above this, a vertical line indicates 15. A horizontal line connects this to the next vertical line, 27.5. This line forms the left side of a rectangle; a line at 35 is its right side. The line at 35 also serves as the left side of another rectangle, with a line at 40 as its right side. 
This line at 40 connects with a horizontal line to a final vertical line at 50.\" width=\"501\" height=\"137\" \/><\/p>\n<p>Box plot creation is described further here.<\/p>\n<p><iframe loading=\"lazy\" id=\"oembed-12\" title=\"Creating a boxplot\" width=\"500\" height=\"281\" src=\"https:\/\/www.youtube.com\/embed\/s4SPGFlMBMU?feature=oembed&#38;rel=0\" frameborder=\"0\" allowfullscreen=\"allowfullscreen\"><\/iframe><\/p>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>Try It<\/h3>\n<p>Create a box plot based on the textbook price data from the last Try It.<\/p>\n<\/div>\n<p>Box plots are particularly useful for comparing data from two populations.<\/p>\n<div class=\"textbox exercises\">\n<h3>examples<\/h3>\n<p>The box plot of service times for two fast-food restaurants is shown below.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-430\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/276\/2016\/10\/12201425\/store1store2.png\" alt=\"Number line titled Service Time (minutes), in increments of 1 from 0-10. Two box plots are above it. The top one is labeled Store 1. A vertical line indicates 0.7. A horizontal line connects this to the next vertical line, 1.8. This line forms the left side of a rectangle; a line at 2.3 is its right side. The line at 2.3 also serves as the left side of another rectangle, with a line at 2.9 as its right side. This line at 2.9 connects with a horizontal line to a final vertical line at 6.3. The bottom box plot is labeled Store 2. A vertical line indicates 0.5. A horizontal line connects this to the next vertical line, 1.1. This line forms the left side of a rectangle; a line at 2.1 is its right side. The line at 2.1 also serves as the left side of another rectangle, with a line at 5.7 as its right side. 
This line at 5.7 connects with a horizontal line to a final vertical line at 9.6.\" width=\"501\" height=\"228\" \/><\/p>\n<p>While store 2 had a slightly shorter median service time (2.1 minutes vs. 2.3 minutes), store 2 is less consistent, with a wider spread of the data.<\/p>\n<p>At store 1, 75% of customers were served within 2.9 minutes, while at store 2, 75% of customers were served within 5.7 minutes.<\/p>\n<p>Which store should you go to in a hurry?<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q770439\">Show Solution<\/span><\/p>\n<div id=\"q770439\" class=\"hidden-answer\" style=\"display: none\">That depends upon your opinions about luck \u2013 25% of customers at store 2 had to wait between 5.7 and 9.6 minutes.<\/div>\n<\/div>\n<hr \/>\n<p>The box plot below is based on the birth weights of infants with severe idiopathic respiratory distress syndrome (SIRDS)<a class=\"footnote\" title=\"van Vliet, P.K. and Gupta, J.M. (1973) Sodium bicarbonate in idiopathic respiratory distress syndrome. Arch. Disease in Childhood, 48, 249\u2013255.\u00a0\u00a0 As quoted on http:\/\/openlearn.open.ac.uk\/mod\/oucontent\/view.php?id=398296&amp;section=1.1.3\" id=\"return-footnote-1468-2\" href=\"#footnote-1468-2\" aria-label=\"Footnote 2\"><sup class=\"footnote\">[2]<\/sup><\/a>. The box plot is separated to show the birth weights of infants who survived and those that did not.<img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-431\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/276\/2016\/10\/12201506\/surviveddied.png\" alt=\"Number line titled Birth Weight (kg), in increments of 1 from 0-4. Two box plots are above it. The top one is labeled Survived. A vertical line indicates a little more than 1. A horizontal line connects this to the next vertical line, ~1.75. 
This line forms the left side of a rectangle; a line at ~2.2 is its right side. The line at ~2.2 also serves as the left side of another rectangle, with a line at ~2.8 as its right side. This line at ~2.8 connects with a horizontal line to a final vertical line at ~3.7. The bottom box plot is labeled Died. A vertical line indicates ~1.1. A horizontal line connects this to the next vertical line, ~1.25. This line forms the left side of a rectangle; a line at ~1.6 is its right side. The line at ~1.6 also serves as the left side of another rectangle, with a line at ~2.3 as its right side. This line at ~2.3 connects with a horizontal line to a final vertical line at ~2.75.\" width=\"500\" height=\"228\" \/><\/p>\n<p>Comparing the two groups, the box plot reveals that the birth weights of the infants that died appear to be, overall, smaller than the weights of infants that survived. In fact, we can see that the median birth weight of infants that survived is the same as the third quartile of the infants that died.<\/p>\n<p>Similarly, we can see that the first quartile of the survivors is larger than the median weight of those that died, meaning that over 75% of the survivors had a birth weight larger than the median birth weight of those that died.<\/p>\n<p>Looking at the maximum value for those that died and the third quartile of the survivors, we can see that over 25% of the survivors had birth weights higher than the heaviest infant that died.<\/p>\n<p>The box plot gives us a quick, albeit informal, way to determine that birth weight is quite likely linked to survival of infants with SIRDS.<\/p>\n<p>The following video analyzes the examples above.<\/p>\n<p><iframe loading=\"lazy\" id=\"oembed-13\" title=\"Comparing boxplots\" width=\"500\" height=\"281\" src=\"https:\/\/www.youtube.com\/embed\/eUkgf-2NVO8?feature=oembed&#38;rel=0\" frameborder=\"0\" allowfullscreen=\"allowfullscreen\"><\/iframe><\/p>\n<\/div>\n<p>&nbsp;<\/p>\n\n\t\t\t <section 
class=\"citations-section\" role=\"contentinfo\">\n\t\t\t <h3>Candela Citations<\/h3>\n\t\t\t\t\t <div>\n\t\t\t\t\t\t <div id=\"citation-list-1468\">\n\t\t\t\t\t\t\t <div class=\"licensing\"><div class=\"license-attribution-dropdown-subheading\">CC licensed content, Original<\/div><ul class=\"citation-list\"><li>Learning Objectives and Introduction. <strong>Provided by<\/strong>: Lumen Learning. <strong>License<\/strong>: <em><a target=\"_blank\" rel=\"license\" href=\"https:\/\/creativecommons.org\/licenses\/by\/4.0\/\">CC BY: Attribution<\/a><\/em><\/li><li>Revision and Adaptation. <strong>Provided by<\/strong>: Lumen Learning. <strong>License<\/strong>: <em><a target=\"_blank\" rel=\"license\" href=\"https:\/\/creativecommons.org\/licenses\/by\/4.0\/\">CC BY: Attribution<\/a><\/em><\/li><\/ul><div class=\"license-attribution-dropdown-subheading\">CC licensed content, Shared previously<\/div><ul class=\"citation-list\"><li>Math in Society. <strong>Authored by<\/strong>: David Lippman. <strong>Located at<\/strong>: <a target=\"_blank\" href=\"http:\/\/www.opentextbookstore.com\/mathinsociety\/\">http:\/\/www.opentextbookstore.com\/mathinsociety\/<\/a>. <strong>License<\/strong>: <em><a target=\"_blank\" rel=\"license\" href=\"https:\/\/creativecommons.org\/licenses\/by-sa\/4.0\/\">CC BY-SA: Attribution-ShareAlike<\/a><\/em><\/li><li>Magnetic. <strong>Authored by<\/strong>: Philippe Put. <strong>Located at<\/strong>: <a target=\"_blank\" href=\"https:\/\/flic.kr\/p\/dekJZd\">https:\/\/flic.kr\/p\/dekJZd<\/a>. <strong>License<\/strong>: <em><a target=\"_blank\" rel=\"license\" href=\"https:\/\/creativecommons.org\/licenses\/by\/4.0\/\">CC BY: Attribution<\/a><\/em><\/li><li>Finding the mean of a data set. <strong>Authored by<\/strong>: OCLPhase2&#039;s channel. <strong>Located at<\/strong>: <a target=\"_blank\" href=\"https:\/\/youtu.be\/3if9Le2sO0c\">https:\/\/youtu.be\/3if9Le2sO0c<\/a>. 
<strong>License<\/strong>: <em><a target=\"_blank\" rel=\"license\" href=\"https:\/\/creativecommons.org\/licenses\/by\/4.0\/\">CC BY: Attribution<\/a><\/em><\/li><li>Mean from a frequency table. <strong>Authored by<\/strong>: OCLPhase2&#039;s channel. <strong>Located at<\/strong>: <a target=\"_blank\" href=\"https:\/\/youtu.be\/1_4Hxcq8DpQ\">https:\/\/youtu.be\/1_4Hxcq8DpQ<\/a>. <strong>License<\/strong>: <em><a target=\"_blank\" rel=\"license\" href=\"https:\/\/creativecommons.org\/licenses\/by\/4.0\/\">CC BY: Attribution<\/a><\/em><\/li><li>Median from a data list. <strong>Authored by<\/strong>: OCLPhase2&#039;s channel. <strong>Located at<\/strong>: <a target=\"_blank\" href=\"https:\/\/youtu.be\/WEdr_rSRObk\">https:\/\/youtu.be\/WEdr_rSRObk<\/a>. <strong>License<\/strong>: <em><a target=\"_blank\" rel=\"license\" href=\"https:\/\/creativecommons.org\/licenses\/by\/4.0\/\">CC BY: Attribution<\/a><\/em><\/li><li>Median from a frequency table. <strong>Authored by<\/strong>: OCLPhase2&#039;s channel. <strong>Located at<\/strong>: <a target=\"_blank\" href=\"https:\/\/youtu.be\/kqEu9EDkmfU\">https:\/\/youtu.be\/kqEu9EDkmfU<\/a>. <strong>License<\/strong>: <em><a target=\"_blank\" rel=\"license\" href=\"https:\/\/creativecommons.org\/licenses\/by\/4.0\/\">CC BY: Attribution<\/a><\/em><\/li><li>Mode for categorical data. <strong>Authored by<\/strong>: OCLPhase2&#039;s channel. <strong>Located at<\/strong>: <a target=\"_blank\" href=\"https:\/\/youtu.be\/pFpkWrib3Jk\">https:\/\/youtu.be\/pFpkWrib3Jk<\/a>. <strong>License<\/strong>: <em><a target=\"_blank\" rel=\"license\" href=\"https:\/\/creativecommons.org\/licenses\/by\/4.0\/\">CC BY: Attribution<\/a><\/em><\/li><li>Butte aux canons. <strong>Authored by<\/strong>: Alexandre Duret-Lutz. <strong>Located at<\/strong>: <a target=\"_blank\" href=\"https:\/\/flic.kr\/p\/stgEf\">https:\/\/flic.kr\/p\/stgEf<\/a>. 
<strong>License<\/strong>: <em><a target=\"_blank\" rel=\"license\" href=\"https:\/\/creativecommons.org\/licenses\/by-sa\/4.0\/\">CC BY-SA: Attribution-ShareAlike<\/a><\/em><\/li><li>Finding range of a data set. <strong>Authored by<\/strong>: OCLPhase2&#039;s channel. <strong>Located at<\/strong>: <a target=\"_blank\" href=\"https:\/\/youtu.be\/b3ofWalrHgQ\">https:\/\/youtu.be\/b3ofWalrHgQ<\/a>. <strong>License<\/strong>: <em><a target=\"_blank\" rel=\"license\" href=\"https:\/\/creativecommons.org\/licenses\/by\/4.0\/\">CC BY: Attribution<\/a><\/em><\/li><li>Computing standard deviation 1. <strong>Authored by<\/strong>: OCLPhase2&#039;s channel. <strong>Located at<\/strong>: <a target=\"_blank\" href=\"https:\/\/youtu.be\/wS8z90f04OU\">https:\/\/youtu.be\/wS8z90f04OU<\/a>. <strong>License<\/strong>: <em><a target=\"_blank\" rel=\"license\" href=\"https:\/\/creativecommons.org\/licenses\/by\/4.0\/\">CC BY: Attribution<\/a><\/em><\/li><li>Five number summary 1. <strong>Authored by<\/strong>: OCLPhase2&#039;s channel. <strong>Located at<\/strong>: <a target=\"_blank\" href=\"https:\/\/youtu.be\/00iQvPOOUu4\">https:\/\/youtu.be\/00iQvPOOUu4<\/a>. <strong>License<\/strong>: <em><a target=\"_blank\" rel=\"license\" href=\"https:\/\/creativecommons.org\/licenses\/by\/4.0\/\">CC BY: Attribution<\/a><\/em><\/li><li>Five number summary 2. <strong>Authored by<\/strong>: OCLPhase2&#039;s channel. <strong>Located at<\/strong>: <a target=\"_blank\" href=\"https:\/\/youtu.be\/x73G2Nep05g\">https:\/\/youtu.be\/x73G2Nep05g<\/a>. <strong>License<\/strong>: <em><a target=\"_blank\" rel=\"license\" href=\"https:\/\/creativecommons.org\/licenses\/by\/4.0\/\">CC BY: Attribution<\/a><\/em><\/li><li>Five number summary 3. <strong>Authored by<\/strong>: OCLPhase2&#039;s channel. <strong>Located at<\/strong>: <a target=\"_blank\" href=\"https:\/\/youtu.be\/uifLbZKPUDU\">https:\/\/youtu.be\/uifLbZKPUDU<\/a>. 
<strong>License<\/strong>: <em><a target=\"_blank\" rel=\"license\" href=\"https:\/\/creativecommons.org\/licenses\/by\/4.0\/\">CC BY: Attribution<\/a><\/em><\/li><li>Five number summary from a frequency table. <strong>Authored by<\/strong>: OCLPhase2&#039;s channel. <strong>Located at<\/strong>: <a target=\"_blank\" href=\"https:\/\/youtu.be\/ECOeeDrUxpo\">https:\/\/youtu.be\/ECOeeDrUxpo<\/a>. <strong>License<\/strong>: <em><a target=\"_blank\" rel=\"license\" href=\"https:\/\/creativecommons.org\/licenses\/by\/4.0\/\">CC BY: Attribution<\/a><\/em><\/li><li>Creating a boxplot. <strong>Authored by<\/strong>: OCLPhase2&#039;s channel. <strong>Located at<\/strong>: <a target=\"_blank\" href=\"https:\/\/youtu.be\/s4SPGFlMBMU\">https:\/\/youtu.be\/s4SPGFlMBMU<\/a>. <strong>License<\/strong>: <em><a target=\"_blank\" rel=\"license\" href=\"https:\/\/creativecommons.org\/licenses\/by\/4.0\/\">CC BY: Attribution<\/a><\/em><\/li><li>Comparing boxplots. <strong>Authored by<\/strong>: OCLPhase2&#039;s channel. <strong>Located at<\/strong>: <a target=\"_blank\" href=\"https:\/\/youtu.be\/eUkgf-2NVO8\">https:\/\/youtu.be\/eUkgf-2NVO8<\/a>. <strong>License<\/strong>: <em><a target=\"_blank\" rel=\"license\" href=\"https:\/\/creativecommons.org\/licenses\/by\/4.0\/\">CC BY: Attribution<\/a><\/em><\/li><\/ul><\/div>\n\t\t\t\t\t\t <\/div>\n\t\t\t\t\t <\/div>\n\t\t\t <\/section><hr class=\"before-footnotes clear\" \/><div class=\"footnotes\"><ol><li id=\"footnote-1468-1\">The reason we do this is highly technical, but we can see how it might be useful by considering the case of a small sample from a population that contains an outlier, which would increase the average deviation: the outlier very likely won't be included in the sample, so the mean deviation of the sample would underestimate the mean deviation of the population; thus we divide by a slightly smaller number to get a slightly bigger average deviation. 
<a href=\"#return-footnote-1468-1\" class=\"return-footnote\" aria-label=\"Return to footnote 1\">&crarr;<\/a><\/li><li id=\"footnote-1468-2\">van Vliet, P.K. and Gupta, J.M. (1973) Sodium bicarbonate in idiopathic respiratory distress syndrome. Arch. Disease in Childhood, 48, 249\u2013255.\u00a0\u00a0 As quoted on <a href=\"http:\/\/openlearn.open.ac.uk\/mod\/oucontent\/view.php?id=398296&amp;section=1.1.3\" target=\"_blank\" rel=\"noopener\">http:\/\/openlearn.open.ac.uk\/mod\/oucontent\/view.php?id=398296&amp;section=1.1.3<\/a> <a href=\"#return-footnote-1468-2\" class=\"return-footnote\" aria-label=\"Return to footnote 2\">&crarr;<\/a><\/li><\/ol><\/div>","protected":false},"author":21,"menu_order":3,"template":"","meta":{"_candela_citation":"[{\"type\":\"original\",\"description\":\"Learning Objectives and Introduction\",\"author\":\"\",\"organization\":\"Lumen Learning\",\"url\":\"\",\"project\":\"\",\"license\":\"cc-by\",\"license_terms\":\"\"},{\"type\":\"cc\",\"description\":\"Math in Society\",\"author\":\"David Lippman\",\"organization\":\"\",\"url\":\"http:\/\/www.opentextbookstore.com\/mathinsociety\/\",\"project\":\"\",\"license\":\"cc-by-sa\",\"license_terms\":\"\"},{\"type\":\"cc\",\"description\":\"Magnetic\",\"author\":\"Philippe Put\",\"organization\":\"\",\"url\":\"https:\/\/flic.kr\/p\/dekJZd\",\"project\":\"\",\"license\":\"cc-by\",\"license_terms\":\"\"},{\"type\":\"cc\",\"description\":\"Finding the mean of a data set\",\"author\":\"OCLPhase2\\'s channel\",\"organization\":\"\",\"url\":\"https:\/\/youtu.be\/3if9Le2sO0c\",\"project\":\"\",\"license\":\"cc-by\",\"license_terms\":\"\"},{\"type\":\"cc\",\"description\":\"Mean from a frequency table\",\"author\":\"OCLPhase2\\'s channel\",\"organization\":\"\",\"url\":\"https:\/\/youtu.be\/1_4Hxcq8DpQ\",\"project\":\"\",\"license\":\"cc-by\",\"license_terms\":\"\"},{\"type\":\"cc\",\"description\":\"Median from a data list\",\"author\":\"OCLPhase2\\'s 
channel\",\"organization\":\"\",\"url\":\"https:\/\/youtu.be\/WEdr_rSRObk\",\"project\":\"\",\"license\":\"cc-by\",\"license_terms\":\"\"},{\"type\":\"cc\",\"description\":\"Median from a frequency table\",\"author\":\"OCLPhase2\\'s channel\",\"organization\":\"\",\"url\":\"https:\/\/youtu.be\/kqEu9EDkmfU\",\"project\":\"\",\"license\":\"cc-by\",\"license_terms\":\"\"},{\"type\":\"cc\",\"description\":\"Mode for categorical data\",\"author\":\"OCLPhase2\\'s channel\",\"organization\":\"\",\"url\":\"https:\/\/youtu.be\/pFpkWrib3Jk\",\"project\":\"\",\"license\":\"cc-by\",\"license_terms\":\"\"},{\"type\":\"original\",\"description\":\"Revision and Adaptation\",\"author\":\"\",\"organization\":\"Lumen Learning\",\"url\":\"\",\"project\":\"\",\"license\":\"cc-by\",\"license_terms\":\"\"},{\"type\":\"cc\",\"description\":\"Butte aux canons\",\"author\":\"Alexandre Duret-Lutz\",\"organization\":\"\",\"url\":\"https:\/\/flic.kr\/p\/stgEf\",\"project\":\"\",\"license\":\"cc-by-sa\",\"license_terms\":\"\"},{\"type\":\"cc\",\"description\":\"Finding range of a data set\",\"author\":\"OCLPhase2\\'s channel\",\"organization\":\"\",\"url\":\"https:\/\/youtu.be\/b3ofWalrHgQ\",\"project\":\"\",\"license\":\"cc-by\",\"license_terms\":\"\"},{\"type\":\"cc\",\"description\":\"Computing standard deviation 1\",\"author\":\"OCLPhase2\\'s channel\",\"organization\":\"\",\"url\":\"https:\/\/youtu.be\/wS8z90f04OU\",\"project\":\"\",\"license\":\"cc-by\",\"license_terms\":\"\"},{\"type\":\"cc\",\"description\":\"Five number summary 1\",\"author\":\"OCLPhase2\\'s channel\",\"organization\":\"\",\"url\":\"https:\/\/youtu.be\/00iQvPOOUu4\",\"project\":\"\",\"license\":\"cc-by\",\"license_terms\":\"\"},{\"type\":\"cc\",\"description\":\"Five number summary 2\",\"author\":\"OCLPhase2\\'s channel\",\"organization\":\"\",\"url\":\"https:\/\/youtu.be\/x73G2Nep05g\",\"project\":\"\",\"license\":\"cc-by\",\"license_terms\":\"\"},{\"type\":\"cc\",\"description\":\"Five number summary 
3\",\"author\":\"OCLPhase2\\'s channel\",\"organization\":\"\",\"url\":\"https:\/\/youtu.be\/uifLbZKPUDU\",\"project\":\"\",\"license\":\"cc-by\",\"license_terms\":\"\"},{\"type\":\"cc\",\"description\":\"Five number summary from a frequency table\",\"author\":\"OCLPhase2\\'s channel\",\"organization\":\"\",\"url\":\"https:\/\/youtu.be\/ECOeeDrUxpo\",\"project\":\"\",\"license\":\"cc-by\",\"license_terms\":\"\"},{\"type\":\"cc\",\"description\":\"Creating a boxplot\",\"author\":\"OCLPhase2\\'s channel\",\"organization\":\"\",\"url\":\"https:\/\/youtu.be\/s4SPGFlMBMU\",\"project\":\"\",\"license\":\"cc-by\",\"license_terms\":\"\"},{\"type\":\"cc\",\"description\":\"Comparing boxplots\",\"author\":\"OCLPhase2\\'s channel\",\"organization\":\"\",\"url\":\"https:\/\/youtu.be\/eUkgf-2NVO8\",\"project\":\"\",\"license\":\"cc-by\",\"license_terms\":\"\"}]","CANDELA_OUTCOMES_GUID":"c2327eec-6c24-4849-b88a-236011ee7e68","pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[],"contributor":[],"license":[],"class_list":["post-1468","chapter","type-chapter","status-publish","hentry"],"part":398,"_links":{"self":[{"href":"https:\/\/courses.lumenlearning.com\/wmopen-mathforliberalarts\/wp-json\/pressbooks\/v2\/chapters\/1468","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/courses.lumenlearning.com\/wmopen-mathforliberalarts\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/courses.lumenlearning.com\/wmopen-mathforliberalarts\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/wmopen-mathforliberalarts\/wp-json\/wp\/v2\/users\/21"}],"version-history":[{"count":11,"href":"https:\/\/courses.lumenlearning.com\/wmopen-mathforliberalarts\/wp-json\/pressbooks\/v2\/chapters\/1468\/revisions"}],"predecessor-version":[{"id":3176,"href":"https:\/\/courses.lumenlearning.com\/wmopen-mathforliberalarts\/wp-json\/pressbooks\/v2\/chapters\/146
8\/revisions\/3176"}],"part":[{"href":"https:\/\/courses.lumenlearning.com\/wmopen-mathforliberalarts\/wp-json\/pressbooks\/v2\/parts\/398"}],"metadata":[{"href":"https:\/\/courses.lumenlearning.com\/wmopen-mathforliberalarts\/wp-json\/pressbooks\/v2\/chapters\/1468\/metadata\/"}],"wp:attachment":[{"href":"https:\/\/courses.lumenlearning.com\/wmopen-mathforliberalarts\/wp-json\/wp\/v2\/media?parent=1468"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/wmopen-mathforliberalarts\/wp-json\/pressbooks\/v2\/chapter-type?post=1468"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/wmopen-mathforliberalarts\/wp-json\/wp\/v2\/contributor?post=1468"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/wmopen-mathforliberalarts\/wp-json\/wp\/v2\/license?post=1468"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}