Measures of the Spread of Data

Learning Outcomes

Calculate and interpret z-scores

Comparing Values from Different Data Sets

The standard deviation is useful when comparing data values that come from different data sets. If the data sets have different means and standard deviations, then comparing the data values directly can be misleading.

For each data value, calculate how many standard deviations away from its mean the value is.
Use the formula: value = mean + (#ofSTDEVs)(standard deviation); solve for #ofSTDEVs.
#ofSTDEVs = [latex]\frac{\mathrm{value} - \mathrm{mean}}{\mathrm{standard \ deviation}}[/latex]
Compare the results of this calculation.

#ofSTDEVs is often called a “[latex]z[/latex]-score”; we can use the symbol [latex]z[/latex]. In symbols, the formulas become:

Sample	[latex]x=\overline{x}+zs[/latex]	[latex]z = \frac{x - \overline{x}}{s}[/latex]
Population	[latex]x = \mu + z\sigma[/latex]	[latex]z = \frac{x - \mu}{\sigma}[/latex]
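As a minimal sketch, the two forms of the formula can be written as a pair of Python functions. The names `z_score` and `value_from_z` and the sample numbers are illustrative assumptions, not part of the original text. The same code serves for a sample or a population, depending on which mean and standard deviation are passed in.

```python
def z_score(value, mean, std_dev):
    """Return how many standard deviations `value` lies from `mean`."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (value - mean) / std_dev


def value_from_z(z, mean, std_dev):
    """Invert the z-score formula: value = mean + z * std_dev."""
    return mean + z * std_dev


# Hypothetical numbers chosen only for illustration.
z = z_score(85, 70, 10)       # 85 is 1.5 standard deviations above 70
x = value_from_z(z, 70, 10)   # converting back recovers the value 85.0
print(z, x)
```

A positive z-score means the value lies above the mean, a negative z-score means it lies below, and a z-score of zero means the value equals the mean.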

Example

Two students, John and Eric, attend different high schools that report GPAs on different scales. Which student has the higher GPA relative to his own school?

Student	GPA	School Mean GPA	School Standard Deviation
John	[latex]2.85[/latex]	[latex]3.0[/latex]	[latex]0.7[/latex]
Eric	[latex]77[/latex]	[latex]80[/latex]	[latex]10[/latex]
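One way to work this example is to compute each student's z-score from the table and compare them. The sketch below uses only the numbers given above; the function and variable names are illustrative assumptions.

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations `value` lies from `mean`."""
    return (value - mean) / std_dev


# (GPA, school mean GPA, school standard deviation) from the table
students = {
    "John": (2.85, 3.0, 0.7),
    "Eric": (77, 80, 10),
}

z_scores = {name: z_score(*row) for name, row in students.items()}
for name, z in z_scores.items():
    print(f"{name}: z = {z:.2f}")

# A higher GPA is better, so the larger z-score wins.
best = max(z_scores, key=z_scores.get)
print("Higher GPA relative to school:", best)
```

Both z-scores are negative, so both GPAs fall below their school means. John's GPA is about 0.21 standard deviations below his school's mean, while Eric's is 0.30 standard deviations below his, so John has the higher GPA relative to his school.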

Try It

Two swimmers, Angie and Beth, from different teams, want to find out who has the faster time for the 50 meter freestyle relative to her own team. Which swimmer has the faster time when compared to her team?

Swimmer	Time (seconds)	Team Mean Time	Team Standard Deviation
Angie	[latex]26.2[/latex]	[latex]27.2[/latex]	[latex]0.8[/latex]
Beth	[latex]27.3[/latex]	[latex]30.1[/latex]	[latex]1.4[/latex]
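The same comparison can be sketched for the swimmers, with one difference: a lower time is a faster swim, so the more negative z-score is the better result. The names below are illustrative assumptions; the numbers come from the table.

```python
# (swimmer, time in seconds, team mean time, team standard deviation)
swimmers = [
    ("Angie", 26.2, 27.2, 0.8),
    ("Beth", 27.3, 30.1, 1.4),
]

results = []
for name, time, team_mean, team_sd in swimmers:
    z = (time - team_mean) / team_sd
    results.append((name, z))
    print(f"{name}: z = {z:.2f}")

# Lower times are faster, so pick the smallest (most negative) z-score.
fastest_name, fastest_z = min(results, key=lambda pair: pair[1])
print("Faster relative to her team:", fastest_name)
```

Angie's time is 1.25 standard deviations below her team's mean, and Beth's is 2 standard deviations below hers, so Beth has the faster time relative to her team even though her raw time is slower.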

The following lists give a few facts that provide a little more insight into what the standard deviation tells us about the distribution of the data.

For ANY data set, no matter what the distribution of the data is:

At least [latex]75[/latex]% of the data is within two standard deviations of the mean.
At least [latex]89[/latex]% of the data is within three standard deviations of the mean.
At least [latex]95[/latex]% of the data is within [latex]4.5[/latex] standard deviations of the mean.
This is known as Chebyshev’s Rule.
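The three percentages above are consistent with the standard statement of Chebyshev's inequality, which the text does not write out: for any k greater than 1, at least 1 - 1/k² of the data lies within k standard deviations of the mean. A minimal sketch, with a hypothetical function name:

```python
def chebyshev_lower_bound(k):
    """Minimum fraction of data within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k**2


for k in (2, 3, 4.5):
    print(f"k = {k}: at least {chebyshev_lower_bound(k):.1%} of the data")
```

The bounds come out to 75%, roughly 88.9%, and roughly 95.1%, which the list above rounds to 75%, 89%, and 95%. These are guaranteed minimums, so a particular data set may have a much larger share of its values inside each interval.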

For data having a distribution that is BELL-SHAPED and SYMMETRIC:

Approximately [latex]68[/latex]% of the data is within one standard deviation of the mean.
Approximately [latex]95[/latex]% of the data is within two standard deviations of the mean.
Approximately [latex]99.7[/latex]% of the data is within three standard deviations of the mean.
This is known as the Empirical Rule.
It is important to note that this rule only applies when the shape of the distribution of the data is bell-shaped and symmetric. We will learn more about this when studying the “Normal” or “Gaussian” probability distribution in later chapters.
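For readers curious where the Empirical Rule percentages come from, the sketch below assumes the data follow an exactly normal distribution (a topic the text defers to later chapters) and computes the proportion within k standard deviations of the mean using the error function from Python's standard library. The function name is a hypothetical choice.

```python
import math


def normal_within(k):
    """Proportion of a normal distribution within k standard deviations."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {normal_within(k):.1%}")
```

Under that assumption the proportions are roughly 68.3%, 95.4%, and 99.7%. Real bell-shaped data will match these figures only approximately.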
