{"id":90,"date":"2016-04-21T22:43:45","date_gmt":"2016-04-21T22:43:45","guid":{"rendered":"https:\/\/courses.lumenlearning.com\/introstats1xmaster\/?post_type=chapter&#038;p=90"},"modified":"2022-02-08T20:28:16","modified_gmt":"2022-02-08T20:28:16","slug":"measures-of-the-spread-of-data","status":"publish","type":"chapter","link":"https:\/\/courses.lumenlearning.com\/frontrange-introstats1\/chapter\/measures-of-the-spread-of-data\/","title":{"raw":"Measures of the Spread of Data","rendered":"Measures of the Spread of Data"},"content":{"raw":"<div class=\"textbox learning-objectives\">\r\n<h3>Learning Outcomes<\/h3>\r\n<ul id=\"list123523\">\r\n \t<li>Recognize, describe, and calculate the measures of the spread of data: variance, standard deviation, and range.<\/li>\r\n<\/ul>\r\n<\/div>\r\nAn important characteristic of any set of data is the variation in the data. In some data sets, the data values are concentrated closely near the mean; in other data sets, the data values are more widely spread out from the mean. The most common measure of variation, or spread, is the standard deviation. The <strong>standard deviation<\/strong> is a number that measures how far data values are from their mean.\r\n\r\nThe standard deviation provides a numerical measure of the overall amount of variation in a data set, and can be used to determine whether a particular data value is close to or far from the mean.\r\n<h4>The standard deviation provides a measure of the overall variation in a data set.<\/h4>\r\nThe standard deviation is always positive or zero. The standard deviation is small when the data are all concentrated close to the mean, exhibiting little variation or spread. The standard deviation is larger when the data values are more spread out from the mean, exhibiting more variation.\r\n\r\nSuppose that we are studying the amount of time customers wait in line at the checkout at supermarket [latex]A[\/latex] and supermarket [latex]B[\/latex]. 
The average wait time at both supermarkets is five minutes. At supermarket [latex]A[\/latex], the standard deviation for the wait time is two minutes; at supermarket [latex]B[\/latex] the standard deviation for the wait time is four minutes.\r\n\r\nBecause supermarket [latex]B[\/latex] has a higher standard deviation, we know that there is more variation in the wait times at supermarket [latex]B[\/latex]. Overall, wait times at supermarket [latex]B[\/latex] are more spread out from the average; wait times at supermarket [latex]A[\/latex] are more concentrated near the average.\r\n<h4>The standard deviation can be used to determine whether a data value is close to or far from the mean.<\/h4>\r\nSuppose that Rosa and Binh both shop at supermarket [latex]A[\/latex]. Rosa waits at the checkout counter for seven minutes and Binh waits for one minute. At supermarket [latex]A[\/latex], the mean waiting time is five minutes and the standard deviation is two minutes. The standard deviation can be used to determine whether a data value is close to or far from the mean.\r\n\r\nRosa waits for seven minutes:\r\n<ul>\r\n \t<li>Seven is two minutes longer than the average of five; two minutes is equal to one standard deviation.<\/li>\r\n \t<li>Rosa's wait time of seven minutes is <strong>two minutes longer than the average<\/strong> of five minutes.<\/li>\r\n \t<li>Rosa's wait time of seven minutes is <strong>one standard deviation above the average<\/strong> of five minutes.<\/li>\r\n<\/ul>\r\nBinh waits for one minute:\r\n<ul>\r\n \t<li>One is four minutes less than the average of five; four minutes is equal to two standard deviations.<\/li>\r\n \t<li>Binh's wait time of one minute is <strong>four minutes less than the average<\/strong> of five minutes.<\/li>\r\n \t<li>Binh's wait time of one minute is <strong>two standard deviations below the average<\/strong> of five minutes.<\/li>\r\n<\/ul>\r\nA data value that is two standard deviations from the average is just on the 
borderline for what many statisticians would consider to be far from the average. Considering data to be far from the mean if it is more than two standard deviations away is more of an approximate \"rule of thumb\" than a rigid rule. In general, the shape of the distribution of the data affects how much of the data is further away than two standard deviations. (You will learn more about this in later chapters.)\r\n\r\nThe number line may help you understand standard deviation. If we were to put five and seven on a number line, seven is to the right of five. We say, then, that seven is\r\n<strong>one<\/strong> standard deviation to the <strong>right<\/strong> of five because [latex]5 + (1)(2) = 7[\/latex].\r\n\r\nIf one were also part of the data set, then one is <strong>two<\/strong> standard deviations to the <strong>left<\/strong> of five because [latex]5 + (\u20132)(2) = 1[\/latex].\r\n\r\n<img class=\"aligncenter\" src=\"https:\/\/textimgs.s3.amazonaws.com\/DE\/stats\/1ofl-icac027i#fixme#fixme#fixme\" alt=\"This shows a number line in intervals of 1 from 0 to 7.\" \/>\r\n<ul>\r\n \t<li>In general, a <strong>value = mean + (#ofSTDEVs)(standard deviation)<\/strong><\/li>\r\n \t<li>where #ofSTDEVs = the number of standard deviations<\/li>\r\n \t<li>#ofSTDEVs does not need to be an integer<\/li>\r\n \t<li>One is <strong>two standard deviations less than the mean<\/strong> of five because: [latex]1 = 5 + (\u20132)(2)[\/latex].<\/li>\r\n<\/ul>\r\nThe equation <strong>value = mean + (#ofSTDEVs)(standard deviation)<\/strong> can be expressed for a sample and for a population.\r\n<ul>\r\n \t<li>Sample: [latex]\\displaystyle{x}=\\overline{{x}}+[\/latex](#ofSTDEVs)[latex]{({s})}[\/latex]<\/li>\r\n \t<li>Population: [latex]\\displaystyle{x}=\\mu+[\/latex](#ofSTDEVs)[latex]{(\\sigma)}[\/latex]<\/li>\r\n<\/ul>\r\nThe lower case letter [latex]s[\/latex] represents the sample standard deviation and the Greek letter [latex]\u03c3[\/latex] 
(sigma, lower case) represents the population standard deviation.\r\n\r\nThe symbol [latex]\\displaystyle\\overline{{x}}[\/latex] is the sample mean and the Greek symbol [latex]\u03bc[\/latex] is the population mean.\r\n<h2>Calculating the Standard Deviation<\/h2>\r\nIf [latex]x[\/latex] is a number, then the difference \"[latex]x[\/latex] \u2013 mean\" is called its <strong>deviation<\/strong>. In a data set, there are as many deviations as there are items in the data set. The deviations are used to calculate the standard deviation. If the numbers belong to a population, in symbols a deviation is [latex]x \u2013 \u03bc[\/latex]. For sample data, in symbols a deviation is [latex]\\displaystyle{x}-\\overline{{x}}[\/latex].\r\n\r\nThe procedure to calculate the standard deviation depends on whether the numbers are the entire population or are data from a sample. The calculations are similar, but not identical. Therefore the symbol used to represent the standard deviation depends on whether it is calculated from a population or a sample. The lower case letter [latex]s[\/latex] represents the sample standard deviation and the Greek letter [latex]\u03c3[\/latex] (sigma, lower case) represents the population standard deviation. If the sample has the same characteristics as the population, then [latex]s[\/latex] should be a good estimate of [latex]\u03c3[\/latex].\r\n\r\nTo calculate the standard deviation, we need to calculate the variance first. The\u00a0<strong>variance<\/strong> is the <strong>average of the squares of the deviations<\/strong> (the [latex]x[\/latex] \u2013 [latex]\\displaystyle\\overline{{x}}[\/latex] values for a sample, or the [latex]x \u2013 \u03bc[\/latex] values for a population). The symbol [latex]\u03c3^2[\/latex] represents the population variance; the population standard deviation [latex]\u03c3[\/latex] is the square root of the population variance. 
The symbol [latex]s^2[\/latex] represents the sample variance; the sample standard deviation [latex]s[\/latex] is the square root of the sample variance. You can think of the standard deviation as a special average of the deviations.\r\n\r\nIf the numbers come from a census of the entire <strong>population<\/strong> and not a sample, when we calculate the average of the squared deviations to find the variance, we divide by [latex]N[\/latex], the number of items in the population. If the data are from a <strong>sample<\/strong> rather than a population, when we calculate the average of the squared deviations, we divide by <strong>[latex]n \u2013 1[\/latex]<\/strong>, one less than the number of items in the sample.\r\n\r\nIn the following video an example of calculating the variance and standard deviation of a set of data is presented.\r\n\r\nhttps:\/\/youtu.be\/qqOyy_NjflU\r\n<h2>Formulas for the Sample Standard Deviation<\/h2>\r\n[latex]\\displaystyle{s}=\\sqrt{{\\frac{{\\sum{({x}-\\overline{{x}})}^{{2}}}}{{{n}-{1}}}}}{\\quad\\text{or}\\quad}{s}=\\sqrt{{\\frac{{\\sum{f{{({x}-\\overline{{x}})}}}^{{2}}}}{{{n}-{1}}}}}[\/latex]\r\n<p class=\"p1\">For the sample standard deviation, the denominator is [latex]n \u2013 1[\/latex], that is the sample size MINUS [latex]1[\/latex].<\/p>\r\n\r\n<h2>Formulas for the Population Standard Deviation<\/h2>\r\n[latex]\\displaystyle\\sigma=\\sqrt{{\\frac{{\\sum{({x}-\\mu)}^{{2}}}}{{{N}}}}}{\\quad\\text{or}\\quad}\\sigma=\\sqrt{{\\frac{{\\sum{f{{({x}-\\mu)}}}^{{2}}}}{{{N}}}}}[\/latex]\r\n\r\nFor the population standard deviation, the denominator is [latex]N[\/latex], the number of items in the population.\r\n\r\nIn these formulas, [latex]f[\/latex] represents the frequency with which a value appears. For example, if a value appears once, [latex]f[\/latex] is one. 
If a value appears three times in the data set or population, [latex]f[\/latex] is three.\r\n<h2>Sampling Variability of a Statistic<\/h2>\r\nHow much the statistic varies from one sample to another is known as the <strong>sampling variability of a statistic<\/strong>. You typically measure the sampling variability of a statistic by its standard error. The <strong>standard error of the mean<\/strong> is an example of a standard error. It is a special standard deviation and is known as the standard deviation of the sampling distribution of the mean. You will cover the standard error of the mean when you learn about The Central Limit Theorem (not now). The notation for the standard error of the mean is [latex]\\displaystyle\\frac{{\\sigma}}{{\\sqrt{n}}}[\/latex] where [latex]\u03c3[\/latex] is the standard deviation of the population and [latex]n[\/latex] is the size of the sample.\r\n<div class=\"textbox shaded\">\r\n<h3>Note<\/h3>\r\n<strong>In practice, use Excel or a calculator to calculate the standard deviation. <\/strong>\r\n\r\nIf you are using <strong>Excel<\/strong>, you will use the following functions:\r\n<ul>\r\n \t<li>STDEV.S for the standard deviation of sample data<\/li>\r\n \t<li>STDEV.P for the standard deviation of population data<\/li>\r\n \t<li>VAR.S for the variance of sample data<\/li>\r\n \t<li>VAR.P for the variance of population data<\/li>\r\n<\/ul>\r\nIf you are using a <strong>TI-83, 83+, 84+ calculator<\/strong>, you need to select the appropriate standard deviation [latex]\u03c3_x[\/latex] or [latex]s_x[\/latex] from the summary statistics.\r\n\r\nWe will concentrate on using and interpreting the information that the standard deviation gives us. 
However, you should study the following step-by-step example to help you understand how the standard deviation measures variation from the mean.\r\n\r\n<\/div>\r\n<div class=\"textbox exercises\">\r\n<h3>Example<\/h3>\r\nIn a fifth grade class, the teacher was interested in the average age and the sample standard deviation of the ages of her students. The following data are the ages for a sample of [latex]n = 20[\/latex] fifth grade students. The ages are rounded to the nearest half year:\r\n\r\n[latex]\\displaystyle {9; 9.5; 9.5; 10; 10; 10; 10; 10.5; 10.5; 10.5; 10.5; 11; 11; 11; 11; 11; 11; 11.5; 11.5; 11.5}[\/latex]\r\n\r\nSolution:\r\n\r\n[latex]\\displaystyle\\overline{x} = \\frac{9+9.5(2)+10(4)+10.5(4)+11(6)+11.5(3)}{20}={10.525}[\/latex]\r\nThe average age is [latex]10.53[\/latex] years, rounded to two decimal places.\r\n\r\nThe variance may be calculated by using a table. Then the standard deviation is calculated by taking the square root of the variance. We will explain the parts of the table after calculating [latex]s[\/latex].\r\n<table>\r\n<thead>\r\n<tr>\r\n<th>Data<\/th>\r\n<th>Freq.<\/th>\r\n<th>Deviations<\/th>\r\n<th>[latex]Deviations^2[\/latex]<\/th>\r\n<th>(Freq.)( [latex]Deviations^2[\/latex])<\/th>\r\n<\/tr>\r\n<\/thead>\r\n<tbody>\r\n<tr>\r\n<td>[latex]x[\/latex]<\/td>\r\n<td>[latex]f[\/latex]<\/td>\r\n<td>( [latex]x[\/latex] \u2013 [latex]\\displaystyle\\overline{x}[\/latex])<\/td>\r\n<td>( [latex]x[\/latex] \u2013[latex]\\displaystyle\\overline{x}[\/latex])<sup data-redactor-tag=\"sup\">2<\/sup><\/td>\r\n<td>( [latex]f[\/latex])([latex]x[\/latex] \u2013[latex]\\displaystyle\\overline{x}[\/latex])<sup data-redactor-tag=\"sup\">2<\/sup><\/td>\r\n<\/tr>\r\n<tr>\r\n<td>[latex]9[\/latex]<\/td>\r\n<td>[latex]1[\/latex]<\/td>\r\n<td>[latex]9 \u2013 10.525 = \u20131.525[\/latex]<\/td>\r\n<td>[latex](\u20131.525)^2 = 2.325625[\/latex]<\/td>\r\n<td>[latex]1 \u00d7 2.325625 = 
2.325625[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>[latex]9.5[\/latex]<\/td>\r\n<td>[latex]2[\/latex]<\/td>\r\n<td>[latex]9.5 \u2013 10.525 = \u20131.025[\/latex]<\/td>\r\n<td>[latex](\u20131.025)^2 = 1.050625[\/latex]<\/td>\r\n<td>[latex]2 \u00d7 1.050625 = 2.101250[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>[latex]10[\/latex]<\/td>\r\n<td>[latex]4[\/latex]<\/td>\r\n<td>[latex]10 \u2013 10.525 = \u20130.525[\/latex]<\/td>\r\n<td>[latex](\u20130.525)^2 = 0.275625[\/latex]<\/td>\r\n<td>[latex]4 \u00d7 0.275625 = 1.1025[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>[latex]10.5[\/latex]<\/td>\r\n<td>[latex]4[\/latex]<\/td>\r\n<td>[latex]10.5 \u2013 10.525 = \u20130.025[\/latex]<\/td>\r\n<td>[latex](\u20130.025)^2 = 0.000625[\/latex]<\/td>\r\n<td>[latex]4 \u00d7 0.000625 = 0.0025[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>[latex]11[\/latex]<\/td>\r\n<td>[latex]6[\/latex]<\/td>\r\n<td>[latex]11 \u2013 10.525 = 0.475[\/latex]<\/td>\r\n<td>[latex](0.475)^2 = 0.225625[\/latex]<\/td>\r\n<td>[latex]6 \u00d7 0.225625 = 1.35375[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>[latex]11.5[\/latex]<\/td>\r\n<td>[latex]3[\/latex]<\/td>\r\n<td>[latex]11.5 \u2013 10.525 = 0.975[\/latex]<\/td>\r\n<td>[latex](0.975)^2 = 0.950625[\/latex]<\/td>\r\n<td>[latex]3 \u00d7 0.950625 = 2.851875[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td><\/td>\r\n<td><\/td>\r\n<td><\/td>\r\n<td><\/td>\r\n<td>The total is [latex]9.7375[\/latex]<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\nThe sample variance, [latex]\\displaystyle{s}^{2}[\/latex], is equal to the sum of the last column [latex](9.7375)[\/latex] divided by the total number of data values minus one [latex](20 \u2013 1)[\/latex]:\r\n[latex]s^2 =\\frac{9.7375}{20-1} =0.5125[\/latex]\r\n\r\nThe <strong>sample standard deviation<\/strong> [latex]s[\/latex] is equal to the square root of the sample variance: [latex]s = \\sqrt{0.5125} = 0.715891[\/latex] which is rounded to two decimal places, [latex]s[\/latex] = 0.72.\r\n\r\n<strong data-redactor-tag=\"strong\">Typically, you 
do the calculation for the standard deviation on your calculator or computer.<\/strong> The intermediate results are not rounded. This is done for accuracy.\r\n<ul>\r\n \t<li>For the following problems, recall that <strong data-redactor-tag=\"strong\">value = mean + (#ofSTDEVs)(standard deviation)<\/strong>. Verify the mean and standard deviation on a calculator or computer.<\/li>\r\n \t<li>For a sample: [latex]x[\/latex] =[latex]\\displaystyle\\overline{x}[\/latex]\u00a0+ (#ofSTDEVs)([latex]s[\/latex])<\/li>\r\n \t<li>For a population: [latex]x[\/latex] = [latex]\u03bc[\/latex] + (#ofSTDEVs)([latex]\u03c3[\/latex])<\/li>\r\n \t<li>For this example, use [latex]x[\/latex] =[latex]\\displaystyle\\overline{x}[\/latex]\u00a0+ (#ofSTDEVs)([latex]s[\/latex]) because the data is from a sample.<\/li>\r\n<\/ul>\r\n<ol>\r\n \t<li>Verify the mean and standard deviation on your calculator or computer.<\/li>\r\n \t<li>Find the value that is one standard deviation above the mean. Find ([latex]\\displaystyle\\overline{x}[\/latex]+ [latex]1s[\/latex]).<\/li>\r\n \t<li>Find the value that is two standard deviations below the mean. Find ([latex]\\displaystyle\\overline{x}[\/latex]\u00a0\u2013 [latex]2s[\/latex]).<\/li>\r\n \t<li>Find the values that are [latex]1.5[\/latex] standard deviations <strong>from<\/strong> (below and above) the mean.<\/li>\r\n<\/ol>\r\n[reveal-answer q=\"124075\"]Show Solution[\/reveal-answer]\r\n[hidden-answer a=\"124075\"]\r\n\r\n1.\u00a0 Using EXCEL\r\n<ul>\r\n \t<li>Input the values into a column in Excel.<\/li>\r\n \t<li>In a blank cell, type =AVERAGE( and then select the cells containing the values, close the parentheses, and hit enter.\u00a0 For example, if your data is input into cells A1 through A20, type =AVERAGE(A1:A20). 
The result rounds to [latex]10.53[\/latex], so we verify\u00a0[latex]\\displaystyle\\overline{x} =10.53[\/latex].<\/li>\r\n \t<li>In a blank cell, type =STDEV.S( and then select the cells containing the values, close the parentheses, and hit enter.\u00a0 For example, if your data is input into cells A1 through A20, type =STDEV.S(A1:A20). We verify\u00a0[latex]s = 0.715891[\/latex].<\/li>\r\n<\/ul>\r\nor USING THE TI-83, 83+, 84, 84+ CALCULATOR\r\n<ul>\r\n \t<li>Clear lists L1 and L2. Press STAT 4:ClrList. Enter 2nd 1 for L1, the comma (,), and 2nd 2 for L2.<\/li>\r\n \t<li>Enter data into the list editor. Press STAT 1:EDIT. If necessary, clear the lists by arrowing up into the name. Press CLEAR and arrow down.<\/li>\r\n \t<li>Put the data values ([latex]9[\/latex], [latex]9.5[\/latex], [latex]10[\/latex], [latex]10.5[\/latex], [latex]11[\/latex], [latex]11.5[\/latex]) into list L1 and the frequencies ([latex]1[\/latex], [latex]2[\/latex], [latex]4[\/latex], [latex]4[\/latex], [latex]6[\/latex], [latex]3[\/latex]) into list L2. Use the arrow keys to move around.<\/li>\r\n \t<li>Press STAT and arrow to CALC. Press 1:1-VarStats and enter L1 (2nd 1), L2 (2nd 2). Do not forget the comma. Press ENTER.<\/li>\r\n \t<li>[latex]\\displaystyle\\overline{x}[\/latex]\u00a0= [latex]10.525[\/latex]<\/li>\r\n \t<li>Use Sx because this is sample data (not a population): Sx=[latex]0.715891[\/latex]<\/li>\r\n<\/ul>\r\n2. ([latex]\\displaystyle\\overline{x}+ 1s) = 10.53 + (1)(0.72) = 11.25[\/latex]\r\n\r\n3. ([latex]\\displaystyle\\overline{x}\u2013 2s) = 10.53 \u2013 (2)(0.72) = 9.09[\/latex]\r\n\r\n4. 
([latex]\\displaystyle\\overline{x}\u2013 1.5s) = 10.53 \u2013 (1.5)(0.72) = 9.45[\/latex]\r\n\r\n([latex]\\displaystyle\\overline{x}+ 1.5s) = 10.53 + (1.5)(0.72) = 11.61[\/latex]\r\n\r\n[\/hidden-answer]\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>Try It<\/h3>\r\nOn a baseball team, the ages of the players are as follows:\r\n\r\n[latex]\\displaystyle {21; 21; 22; 23; 24; 24; 25; 25; 28; 29; 29; 31; 32; 33; 33; 34; 35; 36; 36; 36; 36; 38; 38; 38; 40}[\/latex]\r\n\r\nUse your calculator or computer to find the mean and standard deviation. Then find the value that is two standard deviations above the mean.\r\n\r\n[reveal-answer q=\"124076\"]Show Solution[\/reveal-answer]\r\n[hidden-answer a=\"124076\"]\r\n[latex]\\displaystyle\\overline{x} = 30.68[\/latex]\r\n\r\n[latex]s = 6.09[\/latex]\r\n\r\n([latex]\\displaystyle\\overline{x}+ 2s) = 30.68 + (2)(6.09) = 42.86[\/latex].\r\n[\/hidden-answer]\r\n\r\n<\/div>\r\n<h2>Explanation of the standard deviation calculation shown in the table<\/h2>\r\nThe deviations show how spread out the data are about the mean. The data value [latex]11.5[\/latex] is farther from the mean than is the data value [latex]11[\/latex], as indicated by the deviations [latex]0.975[\/latex] and [latex]0.475[\/latex]. A positive deviation occurs when the data value is greater than the mean, whereas a negative deviation occurs when the data value is less than the mean. The deviation is [latex]\u20131.525[\/latex] for the data value nine. <strong data-redactor-tag=\"strong\">If you add the deviations, the sum is always zero.<\/strong> (For the fifth grade ages example above, there are [latex]n = 20[\/latex] deviations.) So you cannot simply add the deviations to get the spread of the data. By squaring the deviations, you make them positive numbers, and the sum will also be positive. The variance, then, is the average squared deviation.\r\n\r\nThe variance is a squared measure and does not have the same units as the data. 
Taking the square root solves the problem. The standard deviation measures the spread in the same units as the data.\r\n\r\nNotice that instead of dividing by [latex]n= 20[\/latex], the calculation divided by [latex]n \u2013 1 = 20 \u2013 1 = 19[\/latex] because the data is a sample. For the <strong data-redactor-tag=\"strong\">sample<\/strong> variance, we divide by the sample size minus one ([latex]n \u2013 1[\/latex]). Why not divide by [latex]n[\/latex]? The answer has to do with the population variance. <strong data-redactor-tag=\"strong\">The sample variance is an estimate of the population variance.<\/strong> Based on the theoretical mathematics that lies behind these calculations, dividing by ([latex]n \u2013 1[\/latex]) gives a better estimate of the population variance; dividing by [latex]n[\/latex] tends to underestimate it.\r\n<div class=\"textbox shaded\">\r\n<h3>Note<\/h3>\r\nYour concentration should be on what the standard deviation tells us about the data. The standard deviation is a number which measures how far the data are spread from the mean. Let a calculator or computer do the arithmetic.\r\n\r\n<\/div>\r\nThe standard deviation, [latex]s[\/latex] or [latex]\u03c3[\/latex], is either zero or larger than zero. When the standard deviation is zero, there is no spread; that is, all the data values are equal to each other. The standard deviation is small when the data are all concentrated close to the mean, and is larger when the data values show more variation from the mean. When the standard deviation is a lot larger than zero, the data values are very spread out about the mean; outliers can make [latex]s[\/latex] or [latex]\u03c3[\/latex] very large.\r\n\r\nThe standard deviation, when first presented, can seem unclear. By graphing your data, you can get a better \"feel\" for the deviations and the standard deviation. You will find that in symmetrical distributions, the standard deviation can be very helpful but in skewed distributions, the standard deviation may not be much help. 
The reason is that the two sides of a skewed distribution have different spreads. In a skewed distribution, it is better to look at the first quartile, the median, the third quartile, the smallest value, and the largest value. Because numbers can be confusing, <strong data-redactor-tag=\"strong\">always graph your data<\/strong>. Display your data in a histogram or a box plot.\r\n<div class=\"textbox exercises\">\r\n<h3>Example<\/h3>\r\nUse the following data (first exam scores) from Susan Dean's spring pre-calculus class:\r\n\r\n[latex]\\displaystyle {33; 42; 49; 49; 53; 55; 55; 61; 63; 67; 68; 68; 69; 69; 72; 73; 74; 78; 80; 83; 88; 88; 88; 90; 92; 94; 94; 94; 94; 96; 100}[\/latex]\r\n<ol>\r\n \t<li>Create a chart containing the data, frequencies, relative frequencies, and cumulative relative frequencies to three decimal places.<\/li>\r\n \t<li>Calculate the following to one decimal place using a calculator or computer:\r\n<ol>\r\n \t<li>The sample mean<\/li>\r\n \t<li>The sample standard deviation<\/li>\r\n \t<li>The median<\/li>\r\n \t<li>The first quartile<\/li>\r\n \t<li>The third quartile<\/li>\r\n \t<li>[latex]IQR[\/latex]<\/li>\r\n<\/ol>\r\n<\/li>\r\n \t<li>Construct a box plot and a histogram on the same set of axes. 
Make comments about the box plot, the histogram, and the chart.<\/li>\r\n<\/ol>\r\n[reveal-answer q=\"124077\"]Show Solution[\/reveal-answer]\r\n[hidden-answer a=\"124077\"]\r\n<ol>\r\n \t<li>\r\n<table>\r\n<thead>\r\n<tr>\r\n<th>Data<\/th>\r\n<th>Frequency<\/th>\r\n<th>Relative Frequency<\/th>\r\n<th>Cumulative Relative Frequency<\/th>\r\n<\/tr>\r\n<\/thead>\r\n<tbody>\r\n<tr>\r\n<td>[latex]33[\/latex]<\/td>\r\n<td>[latex]1[\/latex]<\/td>\r\n<td>[latex]0.032[\/latex]<\/td>\r\n<td>[latex]0.032[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>[latex]42[\/latex]<\/td>\r\n<td>[latex]1[\/latex]<\/td>\r\n<td>[latex]0.032[\/latex]<\/td>\r\n<td>[latex]0.064[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>[latex]49[\/latex]<\/td>\r\n<td>[latex]2[\/latex]<\/td>\r\n<td>[latex]0.065[\/latex]<\/td>\r\n<td>[latex]0.129[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>[latex]53[\/latex]<\/td>\r\n<td>[latex]1[\/latex]<\/td>\r\n<td>[latex]0.032[\/latex]<\/td>\r\n<td>[latex]0.161[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>[latex]55[\/latex]<\/td>\r\n<td>[latex]2[\/latex]<\/td>\r\n<td>[latex]0.065[\/latex]<\/td>\r\n<td>[latex]0.226[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>[latex]61[\/latex]<\/td>\r\n<td>[latex]1[\/latex]<\/td>\r\n<td>[latex]0.032[\/latex]<\/td>\r\n<td>[latex]0.258[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>[latex]63[\/latex]<\/td>\r\n<td>[latex]1[\/latex]<\/td>\r\n<td>[latex]0.032[\/latex]<\/td>\r\n<td>[latex]0.29[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>[latex]67[\/latex]<\/td>\r\n<td>[latex]1[\/latex]<\/td>\r\n<td>[latex]0.032[\/latex]<\/td>\r\n<td>[latex]0.322[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>[latex]68[\/latex]<\/td>\r\n<td>[latex]2[\/latex]<\/td>\r\n<td>[latex]0.065[\/latex]<\/td>\r\n<td>[latex]0.387[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>[latex]69[\/latex]<\/td>\r\n<td>[latex]2[\/latex]<\/td>\r\n<td>[latex]0.065[\/latex]<\/td>\r\n<td>[latex]0.452[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>[latex]72[\/latex]<\/td>\r\n<td>[latex]1[\/latex]<\/td>\r\n<td>[latex]0.032[\/latex]<\/td>\r\n<td>[late
x]0.484[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>[latex]73[\/latex]<\/td>\r\n<td>[latex]1[\/latex]<\/td>\r\n<td>[latex]0.032[\/latex]<\/td>\r\n<td>[latex]0.516[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>[latex]74[\/latex]<\/td>\r\n<td>[latex]1[\/latex]<\/td>\r\n<td>[latex]0.032[\/latex]<\/td>\r\n<td>[latex]0.548[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>[latex]78[\/latex]<\/td>\r\n<td>[latex]1[\/latex]<\/td>\r\n<td>[latex]0.032[\/latex]<\/td>\r\n<td>[latex]0.580[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>[latex]80[\/latex]<\/td>\r\n<td>[latex]1[\/latex]<\/td>\r\n<td>[latex]0.032[\/latex]<\/td>\r\n<td>[latex]0.612[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>[latex]83[\/latex]<\/td>\r\n<td>[latex]1[\/latex]<\/td>\r\n<td>[latex]0.032[\/latex]<\/td>\r\n<td>[latex]0.644[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>[latex]88[\/latex]<\/td>\r\n<td>[latex]3[\/latex]<\/td>\r\n<td>[latex]0.097[\/latex]<\/td>\r\n<td>[latex]0.741[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>[latex]90[\/latex]<\/td>\r\n<td>[latex]1[\/latex]<\/td>\r\n<td>[latex]0.032[\/latex]<\/td>\r\n<td>[latex]0.773[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>[latex]92[\/latex]<\/td>\r\n<td>[latex]1[\/latex]<\/td>\r\n<td>[latex]0.032[\/latex]<\/td>\r\n<td>[latex]0.805[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>[latex]94[\/latex]<\/td>\r\n<td>[latex]4[\/latex]<\/td>\r\n<td>[latex]0.129[\/latex]<\/td>\r\n<td>[latex]0.934[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>[latex]96[\/latex]<\/td>\r\n<td>[latex]1[\/latex]<\/td>\r\n<td>[latex]0.032[\/latex]<\/td>\r\n<td>[latex]0.966[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>[latex]100[\/latex]<\/td>\r\n<td>[latex]1[\/latex]<\/td>\r\n<td>[latex]0.032[\/latex]<\/td>\r\n<td>[latex]0.998[\/latex] (Why isn't this value [latex]1[\/latex]?)<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<ol>\r\n \t<li>The sample mean = [latex]73.5[\/latex]<\/li>\r\n \t<li>The sample standard deviation = [latex]17.9[\/latex]<\/li>\r\n \t<li>The median = [latex]73[\/latex]<\/li>\r\n \t<li>The first quartile = 
[latex]61[\/latex]<\/li>\r\n \t<li>The third quartile = [latex]90[\/latex]<\/li>\r\n \t<li>[latex]IQR = 90 \u2013 61 = 29[\/latex]<\/li>\r\n<\/ol>\r\n<\/li>\r\n \t<li>The [latex]x[\/latex]-axis goes from [latex]32.5[\/latex] to [latex]100.5[\/latex]; [latex]y[\/latex]-axis goes from [latex]\u20132.4[\/latex] to [latex]15[\/latex] for the histogram. The number of intervals is five, so the width of an interval is [latex](100.5 \u2013 32.5)[\/latex] divided by five, is equal to [latex]13.6[\/latex]. Endpoints of the intervals are as follows: the starting point is [latex]32.5, 32.5 + 13.6 = 46.1[\/latex], [latex]46.1 + 13.6 = 59.7[\/latex], [latex]59.7 + 13.6 = 73.3[\/latex], [latex]73.3 + 13.6 = 86.9[\/latex], [latex]86.9 + 13.6 = 100.5[\/latex] = the ending value; No data values fall on an interval boundary.\r\n<a href=\"https:\/\/courses.candelalearning.com\/introstats1xmaster\/wp-content\/uploads\/sites\/635\/2015\/06\/Screen-Shot-2015-06-07-at-5.19.14-PM.png\"><img class=\"aligncenter wp-image-496 size-full\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/132\/2016\/04\/21214248\/Screen-Shot-2015-06-07-at-5.19.14-PM.png\" alt=\"Box Plot\" width=\"511\" height=\"328\" \/><\/a><\/li>\r\n<\/ol>\r\nThe long left whisker in the box plot is reflected in the left side of the histogram. The spread of the exam scores in the lower [latex]50[\/latex]% is greater ([latex]73 \u2013 33 = 40[\/latex]) than the spread in the upper [latex]50[\/latex]% ([latex]100 \u2013 73 = 27[\/latex]). The histogram, box plot, and chart all reflect this. There are a substantial number of A and B grades ([latex]80[\/latex]s, [latex]90[\/latex]s, and [latex]100[\/latex]). The histogram clearly shows this. The box plot shows us that the middle [latex]50[\/latex]% of the exam scores ([latex]IQR[\/latex] = [latex]29[\/latex]) are Ds, Cs, and Bs. 
The box plot also shows us that the lower [latex]25[\/latex]% of the exam scores are Ds and Fs.\r\n\r\n[\/hidden-answer]\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>Try It<\/h3>\r\nThe following data show the number of different types of pet food that stores in the area carry.\r\n\r\n[latex]\\displaystyle {6; 6; 6; 6; 7; 7; 7; 7; 7; 8; 9; 9; 9; 9; 10; 10; 10; 10; 10; 11; 11; 11; 11; 12; 12; 12; 12; 12; 12}[\/latex]\r\nCalculate the sample mean and the sample standard deviation to one decimal place using a calculator or computer.\r\n\r\n[reveal-answer q=\"124078\"]Show Solution[\/reveal-answer]\r\n[hidden-answer a=\"124078\"]\r\n[latex]\\displaystyle\\overline{x} = 9.3[\/latex]\r\n\r\n[latex]s = 2.2[\/latex]\r\n[\/hidden-answer]\r\n\r\n<\/div>\r\n<h2>Comparing Values from Different Data Sets<\/h2>\r\nThe standard deviation is useful when comparing data values that come from different data sets. If the data sets have different means and standard deviations, then comparing the data values directly can be misleading.\r\n<ul>\r\n \t<li>For each data value, calculate how many standard deviations away from its mean the value is.<\/li>\r\n \t<li>Use the formula: value = mean + (#ofSTDEVs)(standard deviation); solve for #ofSTDEVs.<\/li>\r\n \t<li><em>#ofSTDEVs <\/em>= [latex]\\frac{value - mean}{standard deviation} [\/latex]<\/li>\r\n \t<li>Compare the results of this calculation.<\/li>\r\n<\/ul>\r\n#ofSTDEVs is often called a \"[latex]z[\/latex]-score\"; we can use the symbol [latex]z[\/latex]. 
In symbols, the formulas become:\r\n<table>\r\n<tbody>\r\n<tr>\r\n<td>Sample<\/td>\r\n<td>[latex]x=\\overline{x}+zs[\/latex]<\/td>\r\n<td>[latex]z = \\frac{x - \\overline{x}}{s}[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>Population<\/td>\r\n<td>[latex]x = \u03bc + z\u03c3[\/latex]<\/td>\r\n<td>[latex]z = \\frac{x - \u03bc}{\u03c3}[\/latex]<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<div class=\"textbox exercises\">\r\n<h3>Example<\/h3>\r\nTwo students, John and Ali, from different high schools, wanted to find out who had the highest GPA when compared to his school. Which student had the highest GPA when compared to his school?\r\n<table>\r\n<thead>\r\n<tr>\r\n<th>Student<\/th>\r\n<th>GPA<\/th>\r\n<th>School Mean GPA<\/th>\r\n<th>School Standard Deviation<\/th>\r\n<\/tr>\r\n<\/thead>\r\n<tbody>\r\n<tr>\r\n<td>John<\/td>\r\n<td>[latex]2.85[\/latex]<\/td>\r\n<td>[latex]3.0[\/latex]<\/td>\r\n<td>[latex]0.7[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>Ali<\/td>\r\n<td>[latex]77[\/latex]<\/td>\r\n<td>[latex]80[\/latex]<\/td>\r\n<td>[latex]10[\/latex]<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n[reveal-answer q=\"124079\"]Show Solution[\/reveal-answer]\r\n[hidden-answer a=\"124079\"]\r\nFor each student, determine how many standard deviations (#ofSTDEVs) his GPA is away from the average, for his school. 
Pay careful attention to signs when comparing and interpreting the answer.\r\n\r\n[latex]z[\/latex] = #ofSTDEVs = [latex]\\frac{\\text{value} - \\text{mean}}{\\text{standard deviation}}[\/latex] = [latex]\\frac{x-\u03bc}{\u03c3}[\/latex]\r\n\r\nFor John, [latex]z[\/latex] = #ofSTDEVs = [latex]\\displaystyle\\frac{{2.85 - 3.00}}{{0.7}}\\approx-0.21[\/latex]\r\n\r\nFor Ali, [latex]z[\/latex] = #ofSTDEVs = [latex]\\displaystyle\\frac{{77 - 80}}{{10}}=-0.3[\/latex]\r\n\r\nJohn has the better GPA when compared to his school because his GPA is [latex]0.21[\/latex] standard deviations <strong data-redactor-tag=\"strong\">below<\/strong> his school's mean while Ali's GPA is [latex]0.3[\/latex] standard deviations <strong data-redactor-tag=\"strong\">below<\/strong> his school's mean.\r\n\r\nJohn's [latex]z[\/latex]-score of [latex]\u20130.21[\/latex] is higher than Ali's [latex]z[\/latex]-score of [latex]\u20130.3[\/latex]. For GPA, higher values are better, so we conclude that John has the better GPA when compared to his school.\r\n[\/hidden-answer]\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>Try It<\/h3>\r\nTwo swimmers, Angie and Beth, from different teams, wanted to find out who had the fastest time for the 50 meter freestyle when compared to her team. 
Which swimmer had the fastest time when compared to her team?\r\n<table>\r\n<thead>\r\n<tr>\r\n<th>Swimmer<\/th>\r\n<th>Time (seconds)<\/th>\r\n<th>Team Mean Time<\/th>\r\n<th>Team Standard Deviation<\/th>\r\n<\/tr>\r\n<\/thead>\r\n<tbody>\r\n<tr>\r\n<td>Angie<\/td>\r\n<td>[latex]26.2[\/latex]<\/td>\r\n<td>[latex]27.2[\/latex]<\/td>\r\n<td>[latex]0.8[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>Beth<\/td>\r\n<td>[latex]27.3[\/latex]<\/td>\r\n<td>[latex]30.1[\/latex]<\/td>\r\n<td>[latex]1.4[\/latex]<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n[reveal-answer q=\"124080\"]Show Solution[\/reveal-answer]\r\n[hidden-answer a=\"124080\"]\r\nFor Angie: [latex]z=\\frac{\\left(26.2-27.2\\right)}{0.8}=-1.25[\/latex]\r\n\r\nFor Beth: [latex]z=\\frac{\\left(27.3-30.1\\right)}{1.4}=-2[\/latex]\r\n\r\nBeth had the fastest time compared to her team, because her\u00a0[latex]z[\/latex]-score was more negative, meaning quicker in relation to the average for her team.\r\n[\/hidden-answer]\r\n\r\n<\/div>\r\nThe following lists give a few facts that provide a little more insight into what the standard deviation tells us about the distribution of the data.\r\n\r\n<strong data-redactor-tag=\"strong\">For ANY data set, no matter what the distribution of the data is:<\/strong>\r\n<ul>\r\n \t<li>At least [latex]75[\/latex]% of the data is within two standard deviations of the mean.<\/li>\r\n \t<li>At least [latex]89[\/latex]% of the data is within three standard deviations of the mean.<\/li>\r\n \t<li>At least [latex]95[\/latex]% of the data is within [latex]4.5[\/latex] standard deviations of the mean.<\/li>\r\n \t<li>This is known as Chebyshev's Rule.<\/li>\r\n<\/ul>\r\n<strong data-redactor-tag=\"strong\">For data having a distribution that is BELL-SHAPED and SYMMETRIC:<\/strong>\r\n<ul>\r\n \t<li>Approximately [latex]68[\/latex]% of the data is within one standard deviation of the mean.<\/li>\r\n \t<li>Approximately [latex]95[\/latex]% of the data is within two standard deviations of the 
mean.<\/li>\r\n \t<li>More than [latex]99[\/latex]% of the data is within three standard deviations of the mean.<\/li>\r\n \t<li>This is known as the Empirical Rule.<\/li>\r\n \t<li>It is important to note that this rule only applies when the shape of the distribution of the data is bell-shaped and symmetric. We will learn more about this when studying the \"Normal\" or \"Gaussian\" probability distribution in later chapters.<\/li>\r\n<\/ul>\r\n<h2>Concept Review<\/h2>\r\nThe standard deviation can help you measure the spread of data. There are different equations to use depending on whether you are calculating the standard deviation of a sample or of a population.\r\n<ul>\r\n \t<li>The standard deviation allows us to compare individual data or classes to the data set mean numerically.<\/li>\r\n \t<li>[latex]\\displaystyle{s}_{x}=\\sqrt{{\\frac{{\\sum f{(x-\\overline{x})}^{2}}}{{n-1}}}}[\/latex]\u00a0is the formula for calculating the standard deviation of a sample.<\/li>\r\n \t<li>To calculate the standard deviation of a population, we would use the population mean, <em data-redactor-tag=\"em\">\u03bc<\/em>, and the formula [latex]\\displaystyle{\\sigma}=\\sqrt{{\\frac{{\\sum f{(x-\\mu)}^{2}}}{{N}}}}[\/latex].<\/li>\r\n<\/ul>\r\n<h2>Formula Review<\/h2>\r\n<h3>Sample Standard Deviation<\/h3>\r\n[latex]\\displaystyle{s}=\\sqrt{{\\frac{{\\sum{({x}-\\overline{{x}})}^{{2}}}}{{{n}-{1}}}}}{\\quad\\text{or}\\quad}{s}=\\sqrt{{\\frac{{\\sum{f{{({x}-\\overline{{x}})}}}^{{2}}}}{{{n}-{1}}}}}[\/latex]\r\n<p class=\"p1\">For the sample standard deviation, the denominator is [latex]n \u2013 1[\/latex], that is, the sample size MINUS [latex]1[\/latex].<\/p>\r\n\r\n<h3>Population Standard Deviation<\/h3>\r\n[latex]\\displaystyle\\sigma=\\sqrt{{\\frac{{\\sum{({x}-\\mu)}^{{2}}}}{{{N}}}}}{\\quad\\text{or}\\quad}\\sigma=\\sqrt{{\\frac{{\\sum{f{{({x}-\\mu)}}}^{{2}}}}{{{N}}}}}[\/latex]\r\n\r\nFor the population standard deviation, the denominator is [latex]N[\/latex], the number of items in the 
population.\r\n<h2>References<\/h2>\r\nData from Microsoft Bookshelf.\r\n\r\nKing, Bill. \"Graphically Speaking.\" Institutional Research, Lake Tahoe Community College. Available online at http:\/\/www.ltcc.edu\/web\/about\/institutional-research (accessed April 3, 2013).","rendered":"<div class=\"textbox learning-objectives\">\n<h3>Learning Outcomes<\/h3>\n<ul id=\"list123523\">\n<li>Recognize, describe, and calculate the measures of the spread of data: variance, standard deviation, and range.<\/li>\n<\/ul>\n<\/div>\n<p>An important characteristic of any set of data is the variation in the data. In some data sets, the data values are concentrated closely near the mean; in other data sets, the data values are more widely spread out from the mean. The most common measure of variation, or spread, is the standard deviation. The <strong>standard deviation<\/strong> is a number that measures how far data values are from their mean.<\/p>\n<p>The standard deviation provides a numerical measure of the overall amount of variation in a data set, and can be used to determine whether a particular data value is close to or far from the mean.<\/p>\n<h4>The standard deviation provides a measure of the overall variation in a data set.<\/h4>\n<p>The standard deviation is always positive or zero. The standard deviation is small when the data are all concentrated close to the mean, exhibiting little variation or spread. The standard deviation is larger when the data values are more spread out from the mean, exhibiting more variation.<\/p>\n<p>Suppose that we are studying the amount of time customers wait in line at the checkout at supermarket [latex]A[\/latex] and supermarket [latex]B[\/latex]. The average wait time at both supermarkets is five minutes. 
At supermarket [latex]A[\/latex], the standard deviation for the wait time is two minutes; at supermarket [latex]B[\/latex] the standard deviation for the wait time is four minutes.<\/p>\n<p>Because supermarket [latex]B[\/latex] has a higher standard deviation, we know that there is more variation in the wait times at supermarket [latex]B[\/latex]. Overall, wait times at supermarket [latex]B[\/latex] are more spread out from the average; wait times at supermarket [latex]A[\/latex] are more concentrated near the average.<\/p>\n<h4>The standard deviation can be used to determine whether a data value is close to or far from the mean.<\/h4>\n<p>Suppose that Rosa and Binh both shop at supermarket [latex]A[\/latex]. Rosa waits at the checkout counter for seven minutes and Binh waits for one minute. At supermarket [latex]A[\/latex], the mean waiting time is five minutes and the standard deviation is two minutes. The standard deviation can be used to determine whether a data value is close to or far from the mean.<\/p>\n<p>Rosa waits for seven minutes:<\/p>\n<ul>\n<li>Seven is two minutes longer than the average of five; two minutes is equal to one standard deviation.<\/li>\n<li>Rosa&#8217;s wait time of seven minutes is <strong>two minutes longer than the average<\/strong> of five minutes.<\/li>\n<li>Rosa&#8217;s wait time of seven minutes is <strong>one standard deviation above the average <\/strong>of five minutes.<\/li>\n<\/ul>\n<p>Binh waits for one minute.<\/p>\n<ul>\n<li>One is four minutes less than the average of five; four minutes is equal to two standard deviations.<\/li>\n<li>Binh&#8217;s wait time of one minute is <strong>four minutes less than the average<\/strong> of five minutes.<\/li>\n<li>Binh&#8217;s wait time of one minute is <strong>two standard deviations below the average<\/strong> of five minutes.<\/li>\n<\/ul>\n<p>A data value that is two standard deviations from the average is just on the borderline for what many statisticians would consider to be 
far from the average. Considering data to be far from the mean if it is more than two standard deviations away is more of an approximate &#8220;rule of thumb&#8221; than a rigid rule. In general, the shape of the distribution of the data affects how much of the data is farther away than two standard deviations. (You will learn more about this in later chapters.)<\/p>\n<p>The number line may help you understand standard deviation. If we were to put five and seven on a number line, seven is to the right of five. We say, then, that seven is <strong>one<\/strong> standard deviation to the <strong>right<\/strong> of five because [latex]5 + (1)(2) = 7[\/latex].<\/p>\n<p>If one were also part of the data set, then one is <strong>two<\/strong> standard deviations to the <strong>left<\/strong> of five because [latex]5 + (\u20132)(2) = 1[\/latex].<\/p>\n<p><img decoding=\"async\" class=\"aligncenter\" src=\"https:\/\/textimgs.s3.amazonaws.com\/DE\/stats\/1ofl-icac027i#fixme#fixme#fixme\" alt=\"This shows a number line in intervals of 1 from 0 to 7.\" \/><\/p>\n<ul>\n<li>In general, a <strong>value = mean + (#ofSTDEVs)(standard deviation)<\/strong><\/li>\n<li>where #ofSTDEVs = the number of standard deviations<\/li>\n<li>#ofSTDEVs does not need to be an integer<\/li>\n<li>One is <strong>two standard deviations less than the mean<\/strong> of five because: [latex]1 = 5 + (\u20132)(2)[\/latex].<\/li>\n<\/ul>\n<p>The equation <strong>value = mean + (#ofSTDEVs)(standard deviation)<\/strong> can be expressed for a sample and for a population.<\/p>\n<ul>\n<li>Sample: [latex]\\displaystyle{x}=\\overline{{x}}+[\/latex](#ofSTDEVs)[latex]{({s})}[\/latex]<\/li>\n<li>Population: [latex]\\displaystyle{x}=\\mu+[\/latex](#ofSTDEVs)[latex]{(\\sigma)}[\/latex]<\/li>\n<\/ul>\n<p>The lower case letter [latex]s[\/latex] represents the sample standard deviation and the Greek letter [latex]\u03c3[\/latex] (sigma, lower case) represents the population 
standard deviation.<\/p>\n<p>The symbol [latex]\\displaystyle\\overline{{x}}[\/latex] is the sample mean and the Greek symbol [latex]\u03bc[\/latex] is the population mean.<\/p>\n<h2>Calculating the Standard Deviation<\/h2>\n<p>If [latex]x[\/latex] is a number, then the difference &#8220;[latex]x[\/latex] \u2013 mean&#8221; is called its <strong>deviation<\/strong>. In a data set, there are as many deviations as there are items in the data set. The deviations are used to calculate the standard deviation. If the numbers belong to a population, in symbols a deviation is [latex]x \u2013 \u03bc[\/latex]. For sample data, in symbols a deviation is [latex]\\displaystyle{x}-\\overline{{x}}[\/latex].<\/p>\n<p>The procedure to calculate the standard deviation depends on whether the numbers are the entire population or are data from a sample. The calculations are similar, but not identical. Therefore the symbol used to represent the standard deviation depends on whether it is calculated from a population or a sample. The lower case letter [latex]s[\/latex] represents the sample standard deviation and the Greek letter [latex]\u03c3[\/latex] (sigma, lower case) represents the population standard deviation. If the sample has the same characteristics as the population, then [latex]s[\/latex] should be a good estimate of [latex]\u03c3[\/latex].<\/p>\n<p>To calculate the standard deviation, we need to calculate the variance first. The\u00a0<strong>variance<\/strong> is the <strong>average of the squares of the deviations<\/strong> (the [latex]x[\/latex] \u2013 [latex]\\displaystyle\\overline{{x}}[\/latex] values for a sample, or the [latex]x \u2013 \u03bc[\/latex] values for a population). The symbol [latex]\u03c3^2[\/latex] represents the population variance; the population standard deviation [latex]\u03c3[\/latex] is the square root of the population variance. 
The symbol [latex]s^2[\/latex] represents the sample variance; the sample standard deviation [latex]s[\/latex] is the square root of the sample variance. You can think of the standard deviation as a special average of the deviations.<\/p>\n<p>If the numbers come from a census of the entire <strong>population<\/strong> and not a sample, when we calculate the average of the squared deviations to find the variance, we divide by [latex]N[\/latex], the number of items in the population. If the data are from a <strong>sample<\/strong> rather than a population, when we calculate the average of the squared deviations, we divide by <strong>[latex]n \u2013 1[\/latex]<\/strong>, one less than the number of items in the sample.<\/p>\n<p>In the following video an example of calculating the variance and standard deviation of a set of data is presented.<\/p>\n<p><iframe loading=\"lazy\" id=\"oembed-1\" title=\"How to calculate Standard Deviation and Variance\" width=\"500\" height=\"281\" src=\"https:\/\/www.youtube.com\/embed\/qqOyy_NjflU?feature=oembed&#38;rel=0\" frameborder=\"0\" allowfullscreen=\"allowfullscreen\"><\/iframe><\/p>\n<h2>Formulas for the Sample Standard Deviation<\/h2>\n<p>[latex]\\displaystyle{s}=\\sqrt{{\\frac{{\\sum{({x}-\\overline{{x}})}^{{2}}}}{{{n}-{1}}}}}{\\quad\\text{or}\\quad}{s}=\\sqrt{{\\frac{{\\sum{f{{({x}-\\overline{{x}})}}}^{{2}}}}{{{n}-{1}}}}}[\/latex]<\/p>\n<p class=\"p1\">For the sample standard deviation, the denominator is [latex]n \u2013 1[\/latex], that is the sample size MINUS [latex]1[\/latex].<\/p>\n<h2>Formulas for the Population Standard Deviation<\/h2>\n<p>[latex]\\displaystyle\\sigma=\\sqrt{{\\frac{{\\sum{({x}-\\mu)}^{{2}}}}{{{N}}}}}{\\quad\\text{or}\\quad}\\sigma=\\sqrt{{\\frac{{\\sum{f{{({x}-\\mu)}}}^{{2}}}}{{{N}}}}}[\/latex]<\/p>\n<p>For the population standard deviation, the denominator is [latex]N[\/latex], the number of items in the population.<\/p>\n<p>In these formulas, [latex]f[\/latex] represents the frequency with which a 
value appears. For example, if a value appears once, [latex]f[\/latex] is one. If a value appears three times in the data set or population, [latex]f[\/latex] is three.<\/p>\n<h2>Sampling Variability of a Statistic<\/h2>\n<p>How much the statistic varies from one sample to another is known as the <strong>sampling variability of a statistic<\/strong>. You typically measure the sampling variability of a statistic by its standard error. The <strong>standard error of the mean<\/strong> is an example of a standard error. It is a special standard deviation and is known as the standard deviation of the sampling distribution of the mean. You will cover the standard error of the mean when you learn about The Central Limit Theorem (not now). The notation for the standard error of the mean is [latex]\\displaystyle\\frac{{\\sigma}}{{\\sqrt{n}}}[\/latex] where [latex]\u03c3[\/latex] is the standard deviation of the population and [latex]n[\/latex] is the size of the sample.<\/p>\n<div class=\"textbox shaded\">\n<h3>Note<\/h3>\n<p><strong>In practice, use Excel or a calculator to calculate the standard deviation. <\/strong><\/p>\n<p>If you are using <strong>Excel<\/strong>, you will use the following functions:<\/p>\n<ul>\n<li>STDEV.S for the standard deviation of sample data<\/li>\n<li>STDEV.P for the standard deviation of population data<\/li>\n<li>VAR.S for the variance of sample data<\/li>\n<li>VAR.P for the variance of population data<\/li>\n<\/ul>\n<p>If you are using a <strong>TI-83, 83+, 84+ calculator<\/strong>, you need to select the appropriate standard deviation [latex]\u03c3_x[\/latex] or [latex]s_x[\/latex] from the summary statistics.<\/p>\n<p>We will concentrate on using and interpreting the information that the standard deviation gives us. 
However, you should study the following step-by-step example to help you understand how the standard deviation measures variation from the mean.<\/p>\n<\/div>\n<div class=\"textbox exercises\">\n<h3>Example<\/h3>\n<p>In a fifth grade class, the teacher was interested in the average age and the sample standard deviation of the ages of her students. The following data are the ages for a sample of [latex]n = 20[\/latex] fifth grade students. The ages are rounded to the nearest half year:<\/p>\n<p>[latex]\\displaystyle {9; 9.5; 9.5; 10; 10; 10; 10; 10.5; 10.5; 10.5; 10.5; 11; 11; 11; 11; 11; 11; 11.5; 11.5; 11.5}[\/latex]<\/p>\n<p>Solution:<\/p>\n<p>[latex]\\displaystyle\\overline{x} = \\frac{9+9.5(2)+10(4)+10.5(4)+11(6)+11.5(3)}{20}=10.525[\/latex]<br \/>\nThe average age is [latex]10.53[\/latex] years, rounded to two decimal places.<\/p>\n<p>The variance may be calculated by using a table. Then the standard deviation is calculated by taking the square root of the variance. We will explain the parts of the table after calculating [latex]s[\/latex].<\/p>\n<table>\n<thead>\n<tr>\n<th>Data<\/th>\n<th>Freq.<\/th>\n<th>Deviations<\/th>\n<th>[latex]Deviations^2[\/latex]<\/th>\n<th>(Freq.)( [latex]Deviations^2[\/latex])<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>[latex]x[\/latex]<\/td>\n<td>[latex]f[\/latex]<\/td>\n<td>( [latex]x[\/latex] \u2013 [latex]\\displaystyle\\overline{x}[\/latex])<\/td>\n<td>( [latex]x[\/latex] \u2013[latex]\\displaystyle\\overline{x}[\/latex])<sup data-redactor-tag=\"sup\">2<\/sup><\/td>\n<td>( [latex]f[\/latex])([latex]x[\/latex] \u2013[latex]\\displaystyle\\overline{x}[\/latex])<sup data-redactor-tag=\"sup\">2<\/sup><\/td>\n<\/tr>\n<tr>\n<td>[latex]9[\/latex]<\/td>\n<td>[latex]1[\/latex]<\/td>\n<td>[latex]9 \u2013 10.525 = \u20131.525[\/latex]<\/td>\n<td>[latex](\u20131.525)^2 = 2.325625[\/latex]<\/td>\n<td>[latex]1 \u00d7 2.325625 = 2.325625[\/latex]<\/td>\n<\/tr>\n<tr>\n<td>[latex]9.5[\/latex]<\/td>\n<td>[latex]2[\/latex]<\/td>\n<td>[latex]9.5 
\u2013 10.525 = \u20131.025[\/latex]<\/td>\n<td>[latex](\u20131.025)^2 = 1.050625[\/latex]<\/td>\n<td>[latex]2 \u00d7 1.050625 = 2.101250[\/latex]<\/td>\n<\/tr>\n<tr>\n<td>[latex]10[\/latex]<\/td>\n<td>[latex]4[\/latex]<\/td>\n<td>[latex]10 \u2013 10.525 = \u20130.525[\/latex]<\/td>\n<td>[latex](\u20130.525)^2 = 0.275625[\/latex]<\/td>\n<td>[latex]4 \u00d7 0.275625 = 1.1025[\/latex]<\/td>\n<\/tr>\n<tr>\n<td>[latex]10.5[\/latex]<\/td>\n<td>[latex]4[\/latex]<\/td>\n<td>[latex]10.5 \u2013 10.525 = \u20130.025[\/latex]<\/td>\n<td>[latex](\u20130.025)^2 = 0.000625[\/latex]<\/td>\n<td>[latex]4 \u00d7 0.000625 = 0.0025[\/latex]<\/td>\n<\/tr>\n<tr>\n<td>[latex]11[\/latex]<\/td>\n<td>[latex]6[\/latex]<\/td>\n<td>[latex]11 \u2013 10.525 = 0.475[\/latex]<\/td>\n<td>[latex](0.475)^2 = 0.225625[\/latex]<\/td>\n<td>[latex]6 \u00d7 0.225625 = 1.35375[\/latex]<\/td>\n<\/tr>\n<tr>\n<td>[latex]11.5[\/latex]<\/td>\n<td>[latex]3[\/latex]<\/td>\n<td>[latex]11.5 \u2013 10.525 = 0.975[\/latex]<\/td>\n<td>[latex](0.975)^2 = 0.950625[\/latex]<\/td>\n<td>[latex]3 \u00d7 0.950625 = 2.851875[\/latex]<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<td>The total is [latex]9.7375[\/latex]<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>The sample variance, [latex]\\displaystyle{s}^{2}[\/latex], is equal to the sum of the last column [latex](9.7375)[\/latex] divided by the total number of data values minus one [latex](20 \u2013 1)[\/latex]:<br \/>\n[latex]s^2 =\\frac{9.7375}{20-1} =0.5125[\/latex]<\/p>\n<p>The <strong>sample standard deviation<\/strong> [latex]s[\/latex] is equal to the square root of the sample variance: [latex]s = \\sqrt{0.5125} = 0.715891[\/latex] which is rounded to two decimal places, [latex]s[\/latex] = 0.72.<\/p>\n<p><strong data-redactor-tag=\"strong\">Typically, you do the calculation for the standard deviation on your calculator or computer.<\/strong> The intermediate results are not rounded. 
This is done for accuracy.<\/p>\n<ul>\n<li>For the following problems, recall that <strong data-redactor-tag=\"strong\">value = mean + (#ofSTDEVs)(standard deviation)<\/strong>. Verify the mean and standard deviation on a calculator or computer.<\/li>\n<li>For a sample: [latex]x[\/latex] = [latex]\\displaystyle\\overline{x}[\/latex]\u00a0+ (#ofSTDEVs)([latex]s[\/latex])<\/li>\n<li>For a population: [latex]x[\/latex] = [latex]\u03bc[\/latex] + (#ofSTDEVs)([latex]\u03c3[\/latex])<\/li>\n<li>For this example, use [latex]x[\/latex] = [latex]\\displaystyle\\overline{x}[\/latex]\u00a0+ (#ofSTDEVs)([latex]s[\/latex]) because the data are from a sample.<\/li>\n<\/ul>\n<ol>\n<li>Verify the mean and standard deviation on your calculator or computer.<\/li>\n<li>Find the value that is one standard deviation above the mean. Find ([latex]\\displaystyle\\overline{x}[\/latex] + [latex]1s[\/latex]).<\/li>\n<li>Find the value that is two standard deviations below the mean. Find ([latex]\\displaystyle\\overline{x}[\/latex]\u00a0\u2013 [latex]2s[\/latex]).<\/li>\n<li>Find the values that are [latex]1.5[\/latex] standard deviations <strong>from<\/strong> (below and above) the mean.<\/li>\n<\/ol>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q124075\">Show Solution<\/span><\/p>\n<div id=\"q124075\" class=\"hidden-answer\" style=\"display: none\">\n<p>1. Using Excel<\/p>\n<ul>\n<li>Input the values into a column in Excel.<\/li>\n<li>In a blank cell, type =AVERAGE( and then select the cells containing the values, close the parenthesis, and press Enter. For example, if your data is input into cells A1 through A20, type =AVERAGE(A1:A20). 
Excel returns [latex]10.525[\/latex], which rounds to [latex]10.53[\/latex], so we verify\u00a0[latex]\\displaystyle\\overline{x} =10.53[\/latex].<\/li>\n<li>In a blank cell, type =STDEV.S( and then select the cells containing the values, close the parenthesis, and press Enter. For example, if your data is input into cells A1 through A20, type =STDEV.S(A1:A20). We verify\u00a0[latex]s = 0.715891[\/latex].<\/li>\n<\/ul>\n<p>Or, using the TI-83, 83+, 84, 84+ calculator<\/p>\n<ul>\n<li>Clear lists L1 and L2. Press STAT 4:ClrList. Enter 2nd 1 for L1, the comma (,), and 2nd 2 for L2.<\/li>\n<li>Enter data into the list editor. Press STAT 1:EDIT. If necessary, clear the lists by arrowing up into the name. Press CLEAR and arrow down.<\/li>\n<li>Put the data values ([latex]9[\/latex], [latex]9.5[\/latex], [latex]10[\/latex], [latex]10.5[\/latex], [latex]11[\/latex], [latex]11.5[\/latex]) into list L1 and the frequencies ([latex]1[\/latex], [latex]2[\/latex], [latex]4[\/latex], [latex]4[\/latex], [latex]6[\/latex], [latex]3[\/latex]) into list L2. Use the arrow keys to move around.<\/li>\n<li>Press STAT and arrow to CALC. Press 1:1-VarStats and enter L1 (2nd 1), L2 (2nd 2). Do not forget the comma. Press ENTER.<\/li>\n<li>[latex]\\displaystyle\\overline{x}[\/latex]\u00a0= [latex]10.525[\/latex]<\/li>\n<li>Use Sx because this is sample data (not a population): Sx = [latex]0.715891[\/latex]<\/li>\n<\/ul>\n<p>2. [latex]\\displaystyle(\\overline{x}+ 1s) = 10.53 + (1)(0.72) = 11.25[\/latex]<\/p>\n<p>3. [latex]\\displaystyle(\\overline{x}\u2013 2s) = 10.53 \u2013 (2)(0.72) = 9.09[\/latex]<\/p>\n<p>4. 
[latex]\\displaystyle(\\overline{x}\u2013 1.5s) = 10.53 \u2013 (1.5)(0.72) = 9.45[\/latex]<\/p>\n<p>[latex]\\displaystyle(\\overline{x}+ 1.5s) = 10.53 + (1.5)(0.72) = 11.61[\/latex]<\/p>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>Try It<\/h3>\n<p>On a baseball team, the ages of the players are as follows:<\/p>\n<p>[latex]\\displaystyle {21; 21; 22; 23; 24; 24; 25; 25; 28; 29; 29; 31; 32; 33; 33; 34; 35; 36; 36; 36; 36; 38; 38; 38; 40}[\/latex]<\/p>\n<p>Use your calculator or computer to find the mean and standard deviation. Then find the value that is two standard deviations above the mean.<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q124076\">Show Solution<\/span><\/p>\n<div id=\"q124076\" class=\"hidden-answer\" style=\"display: none\">\n[latex]\\displaystyle\\overline{x} = 30.68[\/latex]<\/p>\n<p>[latex]s = 6.09[\/latex]<\/p>\n<p>[latex]\\displaystyle(\\overline{x}+ 2s) = 30.68 + (2)(6.09) = 42.86[\/latex].\n<\/p><\/div>\n<\/div>\n<\/div>\n<h2>Explanation of the standard deviation calculation shown in the table<\/h2>\n<p>The deviations show how spread out the data are about the mean. The data value [latex]11.5[\/latex] is farther from the mean than is the data value [latex]11[\/latex], which is indicated by the deviations [latex]0.975[\/latex] and [latex]0.475[\/latex]. A positive deviation occurs when the data value is greater than the mean, whereas a negative deviation occurs when the data value is less than the mean. The deviation is [latex]\u20131.525[\/latex] for the data value nine. <strong data-redactor-tag=\"strong\">If you add the deviations, the sum is always zero. <\/strong>(For the fifth grade example, there are [latex]n = 20[\/latex] deviations.) So you cannot simply add the deviations to get the spread of the data. By squaring the deviations, you make them positive numbers, and the sum will also be positive. 
The variance, then, is the average squared deviation.<\/p>\n<p>The variance is a squared measure and does not have the same units as the data. Taking the square root solves the problem. The standard deviation measures the spread in the same units as the data.<\/p>\n<p>Notice that instead of dividing by [latex]n = 20[\/latex], the calculation divided by [latex]n \u2013 1 = 20 \u2013 1 = 19[\/latex] because the data are from a sample. For the <strong data-redactor-tag=\"strong\">sample<\/strong> variance, we divide by the sample size minus one ([latex]n \u2013 1[\/latex]). Why not divide by [latex]n[\/latex]? The answer has to do with the population variance. <strong data-redactor-tag=\"strong\">The sample variance is an estimate of the population variance.<\/strong> Based on the theoretical mathematics that lies behind these calculations, dividing by ([latex]n \u2013 1[\/latex]) gives a better estimate of the population variance.<\/p>\n<div class=\"textbox shaded\">\n<h3>Note<\/h3>\n<p>Your concentration should be on what the standard deviation tells us about the data. The standard deviation is a number which measures how far the data are spread from the mean. Let a calculator or computer do the arithmetic.<\/p>\n<\/div>\n<p>The standard deviation, [latex]s[\/latex] or [latex]\u03c3[\/latex], is either zero or larger than zero. When the standard deviation is zero, there is no spread; that is, all the data values are equal to each other. The standard deviation is small when the data are all concentrated close to the mean, and is larger when the data values show more variation from the mean. When the standard deviation is a lot larger than zero, the data values are very spread out about the mean; outliers can make [latex]s[\/latex] or [latex]\u03c3[\/latex] very large.<\/p>\n<p>The standard deviation, when first presented, can seem unclear. By graphing your data, you can get a better &#8220;feel&#8221; for the deviations and the standard deviation. 
You will find that in symmetrical distributions, the standard deviation can be very helpful but in skewed distributions, the standard deviation may not be much help. The reason is that the two sides of a skewed distribution have different spreads. In a skewed distribution, it is better to look at the first quartile, the median, the third quartile, the smallest value, and the largest value. Because numbers can be confusing, <strong data-redactor-tag=\"strong\">always graph your data<\/strong>. Display your data in a histogram or a box plot.<\/p>\n<div class=\"textbox exercises\">\n<h3>Example<\/h3>\n<p>Use the following data (first exam scores) from Susan Dean&#8217;s spring pre-calculus class:<\/p>\n<p>[latex]\\displaystyle {33; 42; 49; 49; 53; 55; 55; 61; 63; 67; 68; 68; 69; 69; 72; 73; 74; 78; 80; 83; 88; 88; 88; 90; 92; 94; 94; 94; 94; 96; 100}[\/latex]<\/p>\n<ol>\n<li>Create a chart containing the data, frequencies, relative frequencies, and cumulative relative frequencies to three decimal places.<\/li>\n<li>Calculate the following to one decimal place using a calculator or computer:\n<ol>\n<li>The sample mean<\/li>\n<li>The sample standard deviation<\/li>\n<li>The median<\/li>\n<li>The first quartile<\/li>\n<li>The third quartile<\/li>\n<li>[latex]IQR[\/latex]<\/li>\n<\/ol>\n<\/li>\n<li>Construct a box plot and a histogram on the same set of axes. 
Make comments about the box plot, the histogram, and the chart.<\/li>\n<\/ol>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q124077\">Show Solution<\/span><\/p>\n<div id=\"q124077\" class=\"hidden-answer\" style=\"display: none\">\n<ol>\n<li>\n<table>\n<thead>\n<tr>\n<th>Data<\/th>\n<th>Frequency<\/th>\n<th>Relative Frequency<\/th>\n<th>Cumulative Relative Frequency<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>[latex]33[\/latex]<\/td>\n<td>[latex]1[\/latex]<\/td>\n<td>[latex]0.032[\/latex]<\/td>\n<td>[latex]0.032[\/latex]<\/td>\n<\/tr>\n<tr>\n<td>[latex]42[\/latex]<\/td>\n<td>[latex]1[\/latex]<\/td>\n<td>[latex]0.032[\/latex]<\/td>\n<td>[latex]0.064[\/latex]<\/td>\n<\/tr>\n<tr>\n<td>[latex]49[\/latex]<\/td>\n<td>[latex]2[\/latex]<\/td>\n<td>[latex]0.065[\/latex]<\/td>\n<td>[latex]0.129[\/latex]<\/td>\n<\/tr>\n<tr>\n<td>[latex]53[\/latex]<\/td>\n<td>[latex]1[\/latex]<\/td>\n<td>[latex]0.032[\/latex]<\/td>\n<td>[latex]0.161[\/latex]<\/td>\n<\/tr>\n<tr>\n<td>[latex]55[\/latex]<\/td>\n<td>[latex]2[\/latex]<\/td>\n<td>[latex]0.065[\/latex]<\/td>\n<td>[latex]0.226[\/latex]<\/td>\n<\/tr>\n<tr>\n<td>[latex]61[\/latex]<\/td>\n<td>[latex]1[\/latex]<\/td>\n<td>[latex]0.032[\/latex]<\/td>\n<td>[latex]0.258[\/latex]<\/td>\n<\/tr>\n<tr>\n<td>[latex]63[\/latex]<\/td>\n<td>[latex]1[\/latex]<\/td>\n<td>[latex]0.032[\/latex]<\/td>\n<td>[latex]0.29[\/latex]<\/td>\n<\/tr>\n<tr>\n<td>[latex]67[\/latex]<\/td>\n<td>[latex]1[\/latex]<\/td>\n<td>[latex]0.032[\/latex]<\/td>\n<td>[latex]0.322[\/latex]<\/td>\n<\/tr>\n<tr>\n<td>[latex]68[\/latex]<\/td>\n<td>[latex]2[\/latex]<\/td>\n<td>[latex]0.065[\/latex]<\/td>\n<td>[latex]0.387[\/latex]<\/td>\n<\/tr>\n<tr>\n<td>[latex]69[\/latex]<\/td>\n<td>[latex]2[\/latex]<\/td>\n<td>[latex]0.065[\/latex]<\/td>\n<td>[latex]0.452[\/latex]<\/td>\n<\/tr>\n<tr>\n<td>[latex]72[\/latex]<\/td>\n<td>[latex]1[\/latex]<\/td>\n<td>[latex]0.032[\/latex]<\/td>\n<td>[latex]0.484[\/latex]
<\/td>\n<\/tr>\n<tr>\n<td>[latex]73[\/latex]<\/td>\n<td>[latex]1[\/latex]<\/td>\n<td>[latex]0.032[\/latex]<\/td>\n<td>[latex]0.516[\/latex]<\/td>\n<\/tr>\n<tr>\n<td>[latex]74[\/latex]<\/td>\n<td>[latex]1[\/latex]<\/td>\n<td>[latex]0.032[\/latex]<\/td>\n<td>[latex]0.548[\/latex]<\/td>\n<\/tr>\n<tr>\n<td>[latex]78[\/latex]<\/td>\n<td>[latex]1[\/latex]<\/td>\n<td>[latex]0.032[\/latex]<\/td>\n<td>[latex]0.580[\/latex]<\/td>\n<\/tr>\n<tr>\n<td>[latex]80[\/latex]<\/td>\n<td>[latex]1[\/latex]<\/td>\n<td>[latex]0.032[\/latex]<\/td>\n<td>[latex]0.612[\/latex]<\/td>\n<\/tr>\n<tr>\n<td>[latex]83[\/latex]<\/td>\n<td>[latex]1[\/latex]<\/td>\n<td>[latex]0.032[\/latex]<\/td>\n<td>[latex]0.644[\/latex]<\/td>\n<\/tr>\n<tr>\n<td>[latex]88[\/latex]<\/td>\n<td>[latex]3[\/latex]<\/td>\n<td>[latex]0.097[\/latex]<\/td>\n<td>[latex]0.741[\/latex]<\/td>\n<\/tr>\n<tr>\n<td>[latex]90[\/latex]<\/td>\n<td>[latex]1[\/latex]<\/td>\n<td>[latex]0.032[\/latex]<\/td>\n<td>[latex]0.773[\/latex]<\/td>\n<\/tr>\n<tr>\n<td>[latex]92[\/latex]<\/td>\n<td>[latex]1[\/latex]<\/td>\n<td>[latex]0.032[\/latex]<\/td>\n<td>[latex]0.805[\/latex]<\/td>\n<\/tr>\n<tr>\n<td>[latex]94[\/latex]<\/td>\n<td>[latex]4[\/latex]<\/td>\n<td>[latex]0.129[\/latex]<\/td>\n<td>[latex]0.934[\/latex]<\/td>\n<\/tr>\n<tr>\n<td>[latex]96[\/latex]<\/td>\n<td>[latex]1[\/latex]<\/td>\n<td>[latex]0.032[\/latex]<\/td>\n<td>[latex]0.966[\/latex]<\/td>\n<\/tr>\n<tr>\n<td>[latex]100[\/latex]<\/td>\n<td>[latex]1[\/latex]<\/td>\n<td>[latex]0.032[\/latex]<\/td>\n<td>[latex]0.998[\/latex] (Why isn&#8217;t this value [latex]1[\/latex]?)<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<ol>\n<li>The sample mean = [latex]73.5[\/latex]<\/li>\n<li>The sample standard deviation = [latex]17.9[\/latex]<\/li>\n<li>The median = [latex]73[\/latex]<\/li>\n<li>The first quartile = [latex]61[\/latex]<\/li>\n<li>The third quartile = [latex]90[\/latex]<\/li>\n<li>[latex]IQR = 90 \u2013 61 = 29[\/latex]<\/li>\n<\/ol>\n<\/li>\n<li>The [latex]x[\/latex]-axis goes from 
[latex]32.5[\/latex] to [latex]100.5[\/latex]; the [latex]y[\/latex]-axis goes from [latex]\u20132.4[\/latex] to [latex]15[\/latex] for the histogram. The number of intervals is five, so the width of an interval is [latex](100.5 \u2013 32.5)[\/latex] divided by five, which is equal to [latex]13.6[\/latex]. Endpoints of the intervals are as follows: the starting point is [latex]32.5[\/latex], then [latex]32.5 + 13.6 = 46.1[\/latex], [latex]46.1 + 13.6 = 59.7[\/latex], [latex]59.7 + 13.6 = 73.3[\/latex], [latex]73.3 + 13.6 = 86.9[\/latex], and [latex]86.9 + 13.6 = 100.5[\/latex], the ending value. No data values fall on an interval boundary.<br \/>\n<a href=\"https:\/\/courses.candelalearning.com\/introstats1xmaster\/wp-content\/uploads\/sites\/635\/2015\/06\/Screen-Shot-2015-06-07-at-5.19.14-PM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-496 size-full\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/132\/2016\/04\/21214248\/Screen-Shot-2015-06-07-at-5.19.14-PM.png\" alt=\"Box Plot\" width=\"511\" height=\"328\" \/><\/a><\/li>\n<\/ol>\n<p>The long left whisker in the box plot is reflected in the left side of the histogram. The spread of the exam scores in the lower [latex]50[\/latex]% is greater ([latex]73 \u2013 33 = 40[\/latex]) than the spread in the upper [latex]50[\/latex]% ([latex]100 \u2013 73 = 27[\/latex]). The histogram, box plot, and chart all reflect this. There are a substantial number of A and B grades ([latex]80[\/latex]s, [latex]90[\/latex]s, and [latex]100[\/latex]). The histogram clearly shows this. The box plot shows us that the middle [latex]50[\/latex]% of the exam scores ([latex]IQR[\/latex] = [latex]29[\/latex]) are Ds, Cs, and Bs. 
The box plot also shows us that the lower [latex]25[\/latex]% of the exam scores are Ds and Fs.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>Try It<\/h3>\n<p>The following data show the number of different types of pet food that stores in the area carry.<\/p>\n<p>[latex]\\displaystyle {6; 6; 6; 6; 7; 7; 7; 7; 7; 8; 9; 9; 9; 9; 10; 10; 10; 10; 10; 11; 11; 11; 11; 12; 12; 12; 12; 12; 12}[\/latex]<br \/>\nCalculate the sample mean and the sample standard deviation to one decimal place using a calculator or computer.<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q124078\">Show Solution<\/span><\/p>\n<div id=\"q124078\" class=\"hidden-answer\" style=\"display: none\">\n[latex]\\overline{x} = 9.3[\/latex]<\/p>\n<p>[latex]s = 2.2[\/latex]\n<\/p><\/div>\n<\/div>\n<\/div>\n<h2>Comparing Values from Different Data Sets<\/h2>\n<p>The standard deviation is useful when comparing data values that come from different data sets. If the data sets have different means and standard deviations, then comparing the data values directly can be misleading.<\/p>\n<ul>\n<li>For each data value, calculate how many standard deviations away from its mean the value is.<\/li>\n<li>Use the formula: value = mean + (#ofSTDEVs)(standard deviation); solve for #ofSTDEVs.<\/li>\n<li><em>#ofSTDEVs<\/em> = [latex]\\frac{\\text{value} - \\text{mean}}{\\text{standard deviation}}[\/latex]<\/li>\n<li>Compare the results of this calculation.<\/li>\n<\/ul>\n<p>#ofSTDEVs is often called a &#8220;[latex]z[\/latex]-score&#8221;; we can use the symbol [latex]z[\/latex]. 
In symbols, the formulas become:<\/p>\n<table>\n<tbody>\n<tr>\n<td>Sample<\/td>\n<td>[latex]x=\\overline{x}+zs[\/latex]<\/td>\n<td>[latex]z = \\frac{x - \\overline{x}}{s}[\/latex]<\/td>\n<\/tr>\n<tr>\n<td>Population<\/td>\n<td>[latex]x = \u03bc + z\u03c3[\/latex]<\/td>\n<td>[latex]z = \\frac{x - \u03bc}{\u03c3}[\/latex]<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<div class=\"textbox exercises\">\n<h3>Example<\/h3>\n<p>Two students, John and Ali, from different high schools, wanted to find out who had the highest GPA when compared to his school. Which student had the highest GPA when compared to his school?<\/p>\n<table>\n<thead>\n<tr>\n<th>Student<\/th>\n<th>GPA<\/th>\n<th>School Mean GPA<\/th>\n<th>School Standard Deviation<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>John<\/td>\n<td>[latex]2.85[\/latex]<\/td>\n<td>[latex]3.0[\/latex]<\/td>\n<td>[latex]0.7[\/latex]<\/td>\n<\/tr>\n<tr>\n<td>Ali<\/td>\n<td>[latex]77[\/latex]<\/td>\n<td>[latex]80[\/latex]<\/td>\n<td>[latex]10[\/latex]<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q124079\">Show Solution<\/span><\/p>\n<div id=\"q124079\" class=\"hidden-answer\" style=\"display: none\">\nFor each student, determine how many standard deviations (#ofSTDEVs) his GPA is away from the average, for his school. 
Pay careful attention to signs when comparing and interpreting the answer.<\/p>\n<p>[latex]z[\/latex] = #ofSTDEVs = [latex]\\frac{\\text{value} - \\text{mean}}{\\text{standard deviation}}[\/latex] = [latex]\\frac{x-\u03bc}{\u03c3}[\/latex]<\/p>\n<p>For John, [latex]z[\/latex] = #ofSTDEVs = [latex]\\displaystyle\\frac{{2.85 - 3.00}}{{0.7}}\\approx -{0.21}[\/latex]<\/p>\n<p>For Ali, [latex]z[\/latex] = #ofSTDEVs = [latex]\\displaystyle\\frac{{77 - 80}}{{10}}=-0.3[\/latex]<\/p>\n<p>John has the better GPA when compared to his school because his GPA is [latex]0.21[\/latex] standard deviations <strong data-redactor-tag=\"strong\">below<\/strong> his school&#8217;s mean while Ali&#8217;s GPA is [latex]0.3[\/latex] standard deviations <strong data-redactor-tag=\"strong\">below<\/strong> his school&#8217;s mean.<\/p>\n<p>John&#8217;s [latex]z[\/latex]-score of [latex]\u20130.21[\/latex] is higher than Ali&#8217;s [latex]z[\/latex]-score of [latex]\u20130.3[\/latex]. For GPA, higher values are better, so we conclude that John has the better GPA when compared to his school.\n<\/p><\/div>\n<\/div>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>Try It<\/h3>\n<p>Two swimmers, Angie and Beth, from different teams, wanted to find out who had the fastest time for the 50-meter freestyle when compared to her team. 
Which swimmer had the fastest time when compared to her team?<\/p>\n<table>\n<thead>\n<tr>\n<th>Swimmer<\/th>\n<th>Time (seconds)<\/th>\n<th>Team Mean Time<\/th>\n<th>Team Standard Deviation<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Angie<\/td>\n<td>[latex]26.2[\/latex]<\/td>\n<td>[latex]27.2[\/latex]<\/td>\n<td>[latex]0.8[\/latex]<\/td>\n<\/tr>\n<tr>\n<td>Beth<\/td>\n<td>[latex]27.3[\/latex]<\/td>\n<td>[latex]30.1[\/latex]<\/td>\n<td>[latex]1.4[\/latex]<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q124080\">Show Solution<\/span><\/p>\n<div id=\"q124080\" class=\"hidden-answer\" style=\"display: none\">\nFor Angie: [latex]z=\\frac{\\left(26.2-27.2\\right)}{0.8}=-1.25[\/latex]<\/p>\n<p>For Beth: [latex]z=\\frac{\\left(27.3-30.1\\right)}{1.4}=-2[\/latex]<\/p>\n<p>Beth had the fastest time compared to her team, because her\u00a0[latex]z[\/latex]-score was more negative, meaning quicker in relation to the average for her team.\n<\/p><\/div>\n<\/div>\n<\/div>\n<p>The following lists give a few facts that provide a little more insight into what the standard deviation tells us about the distribution of the data.<\/p>\n<p><strong data-redactor-tag=\"strong\">For ANY data set, no matter what the distribution of the data is:<\/strong><\/p>\n<ul>\n<li>At least [latex]75[\/latex]% of the data is within two standard deviations of the mean.<\/li>\n<li>At least [latex]89[\/latex]% of the data is within three standard deviations of the mean.<\/li>\n<li>At least [latex]95[\/latex]% of the data is within [latex]4.5[\/latex] standard deviations of the mean.<\/li>\n<li>This is known as Chebyshev&#8217;s Rule.<\/li>\n<\/ul>\n<p><strong data-redactor-tag=\"strong\">For data having a distribution that is BELL-SHAPED and SYMMETRIC:<\/strong><\/p>\n<ul>\n<li>Approximately [latex]68[\/latex]% of the data is within one standard deviation of the 
mean.<\/li>\n<li>Approximately [latex]95[\/latex]% of the data is within two standard deviations of the mean.<\/li>\n<li>More than [latex]99[\/latex]% of the data is within three standard deviations of the mean.<\/li>\n<li>This is known as the Empirical Rule.<\/li>\n<li>It is important to note that this rule only applies when the shape of the distribution of the data is bell-shaped and symmetric. We will learn more about this when studying the &#8220;Normal&#8221; or &#8220;Gaussian&#8221; probability distribution in later chapters.<\/li>\n<\/ul>\n<h2>Concept Review<\/h2>\n<p>The standard deviation helps you measure the spread of data. There are different equations to use depending on whether you are calculating the standard deviation of a sample or of a population.<\/p>\n<ul>\n<li>The standard deviation allows us to compare individual data or classes to the data set mean numerically.<\/li>\n<li>[latex]\\displaystyle{s}_{x}=\\sqrt{{\\frac{{\\sum f{(x-\\overline{x})}^{2}}}{{n-1}}}}[\/latex]\u00a0is the formula for calculating the standard deviation of a sample.<\/li>\n<li>To calculate the standard deviation of a population, we would use the population mean, <em data-redactor-tag=\"em\">\u03bc<\/em>, and the formula [latex]\\displaystyle{\\sigma}=\\sqrt{{\\frac{{\\sum f{(x-\\mu)}^{2}}}{{N}}}}[\/latex]<\/li>\n<\/ul>\n<h2>Formula Review<\/h2>\n<h3>Sample Standard Deviation<\/h3>\n<p>[latex]\\displaystyle{s}=\\sqrt{{\\frac{{\\sum{({x}-\\overline{{x}})}^{{2}}}}{{{n}-{1}}}}}{\\quad\\text{or}\\quad}{s}=\\sqrt{{\\frac{{\\sum{f{{({x}-\\overline{{x}})}}}^{{2}}}}{{{n}-{1}}}}}[\/latex]<\/p>\n<p class=\"p1\">For the sample standard deviation, the denominator is [latex]n \u2013 1[\/latex], that is, the sample size MINUS [latex]1[\/latex].<\/p>\n<h3>Population Standard Deviation<\/h3>\n<p>[latex]\\displaystyle\\sigma=\\sqrt{{\\frac{{\\sum{({x}-\\mu)}^{{2}}}}{{{N}}}}}{\\quad\\text{or}\\quad}\\sigma=\\sqrt{{\\frac{{\\sum{f{{({x}-\\mu)}}}^{{2}}}}{{{N}}}}}[\/latex]<\/p>\n<p>For the population standard deviation, 
the denominator is [latex]N[\/latex], the number of items in the population.<\/p>\n<h2>References<\/h2>\n<p>Data from Microsoft Bookshelf.<\/p>\n<p>King, Bill. &#8220;Graphically Speaking.&#8221; Institutional Research, Lake Tahoe Community College. Available online at http:\/\/www.ltcc.edu\/web\/about\/institutional-research (accessed April 3, 2013).<\/p>\n\n\t\t\t <section class=\"citations-section\" role=\"contentinfo\">\n\t\t\t <h3>Candela Citations<\/h3>\n\t\t\t\t\t <div>\n\t\t\t\t\t\t <div id=\"citation-list-90\">\n\t\t\t\t\t\t\t <div class=\"licensing\"><div class=\"license-attribution-dropdown-subheading\">CC licensed content, Shared previously<\/div><ul class=\"citation-list\"><li>OpenStax Statistics. <strong>Located at<\/strong>: <a target=\"_blank\" href=\"\"><\/a>. <strong>License<\/strong>: <em><a target=\"_blank\" rel=\"license\" href=\"https:\/\/creativecommons.org\/licenses\/by\/4.0\/\">CC BY: Attribution<\/a><\/em><\/li><li>Introductory Statistics. <strong>Authored by<\/strong>: Barbara Illowsky, Susan Dean. <strong>Provided by<\/strong>: Open Stax. <strong>Located at<\/strong>: <a target=\"_blank\" href=\"http:\/\/cnx.org\/contents\/30189442-6998-4686-ac05-ed152b91b9de@17.44\">http:\/\/cnx.org\/contents\/30189442-6998-4686-ac05-ed152b91b9de@17.44<\/a>. <strong>License<\/strong>: <em><a target=\"_blank\" rel=\"license\" href=\"https:\/\/creativecommons.org\/licenses\/by\/4.0\/\">CC BY: Attribution<\/a><\/em>. <strong>License Terms<\/strong>: Download for free at http:\/\/cnx.org\/contents\/30189442-6998-4686-ac05-ed152b91b9de@17.44<\/li><\/ul><div class=\"license-attribution-dropdown-subheading\">All rights reserved content<\/div><ul class=\"citation-list\"><li>How to calculate Standard Deviation and Variance. <strong>Authored by<\/strong>: statisticsfun. <strong>Located at<\/strong>: <a target=\"_blank\" href=\"https:\/\/youtu.be\/qqOyy_NjflU\">https:\/\/youtu.be\/qqOyy_NjflU<\/a>. <strong>License<\/strong>: <em>All Rights Reserved<\/em>. 
<strong>License Terms<\/strong>: Standard YouTube License<\/li><\/ul><\/div>\n\t\t\t\t\t\t <\/div>\n\t\t\t\t\t <\/div>\n\t\t\t <\/section>","protected":false},"author":21,"menu_order":11,"template":"","meta":{"_candela_citation":"[{\"type\":\"cc\",\"description\":\"OpenStax Statistics\",\"author\":\"\",\"organization\":\"\",\"url\":\"Download for free at http:\/\/cnx.org\/contents\/30189442-6998-4686-ac05-ed152b91b9de@17.44\",\"project\":\"\",\"license\":\"cc-by\",\"license_terms\":\"\"},{\"type\":\"copyrighted_video\",\"description\":\"How to calculate Standard Deviation and Variance\",\"author\":\"statisticsfun\",\"organization\":\"\",\"url\":\"https:\/\/youtu.be\/qqOyy_NjflU\",\"project\":\"\",\"license\":\"arr\",\"license_terms\":\"Standard YouTube License\"},{\"type\":\"cc\",\"description\":\"Introductory Statistics \",\"author\":\"Barbara Illowski, Susan Dean\",\"organization\":\"Open Stax\",\"url\":\"http:\/\/cnx.org\/contents\/30189442-6998-4686-ac05-ed152b91b9de@17.44\",\"project\":\"\",\"license\":\"cc-by\",\"license_terms\":\"Download for free at 
http:\/\/cnx.org\/contents\/30189442-6998-4686-ac05-ed152b91b9de@17.44\"}]","CANDELA_OUTCOMES_GUID":"","pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[],"contributor":[],"license":[],"class_list":["post-90","chapter","type-chapter","status-publish","hentry"],"part":69,"_links":{"self":[{"href":"https:\/\/courses.lumenlearning.com\/frontrange-introstats1\/wp-json\/pressbooks\/v2\/chapters\/90","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/courses.lumenlearning.com\/frontrange-introstats1\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/courses.lumenlearning.com\/frontrange-introstats1\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/frontrange-introstats1\/wp-json\/wp\/v2\/users\/21"}],"version-history":[{"count":29,"href":"https:\/\/courses.lumenlearning.com\/frontrange-introstats1\/wp-json\/pressbooks\/v2\/chapters\/90\/revisions"}],"predecessor-version":[{"id":2833,"href":"https:\/\/courses.lumenlearning.com\/frontrange-introstats1\/wp-json\/pressbooks\/v2\/chapters\/90\/revisions\/2833"}],"part":[{"href":"https:\/\/courses.lumenlearning.com\/frontrange-introstats1\/wp-json\/pressbooks\/v2\/parts\/69"}],"metadata":[{"href":"https:\/\/courses.lumenlearning.com\/frontrange-introstats1\/wp-json\/pressbooks\/v2\/chapters\/90\/metadata\/"}],"wp:attachment":[{"href":"https:\/\/courses.lumenlearning.com\/frontrange-introstats1\/wp-json\/wp\/v2\/media?parent=90"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/frontrange-introstats1\/wp-json\/pressbooks\/v2\/chapter-type?post=90"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/frontrange-introstats1\/wp-json\/wp\/v2\/contributor?post=90"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/frontrange-introstats1\/wp-json\/wp\/v2\/li
cense?post=90"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}