Learning Goals
After completing this section, you should feel comfortable performing these skills.
- Define the standardized value, or z-score.
- Use technology to convert values into standardized scores.
- Use a dotplot and histogram to identify how many standard deviations certain observations lie from the mean.
- Calculate a value’s standardized score by hand to determine its location relative to the mean.
- Define the Empirical Rule.
In the next activity, you will need to be able to convert values into standardized values (also called standardized scores or z-scores) and use a value’s standardized value to determine whether the value is above, below, or equal to the mean. You will also need to be able to explain the Empirical Rule. In this section, we’ll use a data set to explore how to perform necessary calculations by hand and using technology.
Standardized Values
You learned in Comparing Variability of Data Sets: What to Know that the standard deviation is a measure of how spread out observations are from the mean.
A standardized value, or z-score, is the number of standard deviations an observation is away from the mean.
For example, in this section we will analyze runtimes (in minutes) of G-rated movies to learn how to calculate standardized values. Within this context, the standardized value, or z-score, is the number of standard deviations a particular movie runtime is from the mean.
It is important to note that the distance of a particular movie runtime from the mean is not measured in minutes; rather, it is measured in standard deviations. Thus, a z-score of [latex]-2.3[/latex] corresponds to an observation that is [latex]2.3[/latex] standard deviations below the mean, and a z-score of [latex]2.3[/latex] corresponds to an observation that is [latex]2.3[/latex] standard deviations above the mean. Z-scores themselves do not have units associated with them.
Z-Score Formula
The value of an observation is standardized using the formula [latex]z=\dfrac{x-\mu}{\sigma}[/latex], where [latex]x[/latex] represents the value of the observation, [latex]\mu[/latex] represents the population mean, [latex]\sigma[/latex] represents the population standard deviation, and [latex]z[/latex] represents the standardized value, or z-score.
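If you would like to see the formula at work outside the course tool, here is a minimal Python sketch. The function names `z_score` and `describe_position` and the five runtimes are made up for illustration; they are not taken from the Movie Runtimes data set. The sketch treats the list as a whole population, so it uses the population standard deviation.

```python
from statistics import mean, pstdev


def z_score(x, mu, sigma):
    """Return how many standard deviations x lies from the mean mu."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


def describe_position(z):
    """Say in words where an observation with z-score z sits."""
    if z == 0:
        return "equal to the mean"
    direction = "above" if z > 0 else "below"
    return f"{abs(z):g} standard deviations {direction} the mean"


# Hypothetical runtimes in minutes (illustrative only, not real data).
runtimes = [80, 85, 90, 95, 100]
mu = mean(runtimes)       # mean of the hypothetical list
sigma = pstdev(runtimes)  # population standard deviation of the list

for x in runtimes:
    z = round(z_score(x, mu, sigma), 3)
    print(x, z, describe_position(z))
```

A negative result means the runtime is below the mean, a positive result means it is above the mean, and zero means it equals the mean. Notice that the minutes cancel in the division, which is why the z-score has no units.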
Before we use the formula to convert values into standardized values, let’s recap our understanding of standard deviation. In Comparing Variability of Data Sets: What to Know, you learned to understand standard deviation as a measure of variability in a data set. You looked at the statistical components that went into the formulas for standard deviation and variance and saw that larger standard deviations indicate more variability, and vice versa. We’d like to shift that perspective now and look at a unit of standard deviation as a distance from the mean of a data set in a distribution.
Standard Deviation as a Unit of Distance
[Perspective video — a 3-instructor video showing how to think about standard deviation as a unit of distance in a distribution — i.e., illustrating values so many standard deviations above and below the mean of a bell-shaped, unimodal, symmetric distribution. Show how adding or subtracting std devs can obtain a certain value at that location in the distribution. Show that a value’s z-score (negative or positive) is that many std deviations away from the mean in that direction.]
See the example below for a demonstration, then try it out using the Movie Runtimes database to answer the questions below.
Interactive Example
Let’s return again to the data set Sleep Study: Average Sleep, which we used in Comparing Variability of Data Sets: What to Know to learn about standard deviation as a measure of the variability of a data set.
Open the tool at https://dcmathpathways.shinyapps.io/EDA_quantitative/ and select the Sleep Study: Average Sleep data set. Display a histogram and dotplot and make a note of the mean and standard deviation in the descriptive statistics. Round your final answers to the questions below to 3 decimal places, as needed.
- Describe the shape of the data set using the histogram and dotplot. For practice, display a boxplot as well and note the visual clues that you can use to determine the shape of the distribution from the boxplot.
- How does the relationship between the mean and median (given in descriptive statistics) help to support your analysis?
- What are the mean and standard deviation of the data set?
- What number of sleep hours lies one standard deviation above the mean? What value lies one standard deviation below?
- What numbers of sleep hours lie two standard deviations above and below the mean?
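The last two questions above come down to adding or subtracting a multiple of the standard deviation from the mean. The sketch below shows that arithmetic in Python. The helper name `value_at` and the mean and standard deviation used here are hypothetical placeholders; replace them with the descriptive statistics you read from the tool for the Sleep Study: Average Sleep data set.

```python
def value_at(mean, sd, k):
    """Return the value k standard deviations from the mean.

    Positive k moves above the mean; negative k moves below it.
    The result is rounded to 3 decimal places, as the exercise asks.
    """
    return round(mean + k * sd, 3)


# Hypothetical statistics (NOT the real Sleep Study values).
example_mean = 7.0  # hours
example_sd = 1.5    # hours

for k in (1, -1, 2, -2):
    print(k, value_at(example_mean, example_sd, k))
```

Each value produced this way has a z-score equal to the multiplier used to build it, so the value two standard deviations below the mean has a z-score of [latex]-2[/latex].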