Comparing Variability of Data Sets: What to Know 5

Variance

Variance is the standard deviation squared. We use the Greek letter [latex]\sigma^{2}[/latex] (sigma squared) to denote the variance of a population of observations, and we use [latex]s^{2}[/latex] to denote the variance of a sample of observations. The following formulas are used to calculate the variance of a population and a sample:

Variance of a population: [latex]\sigma^{2}=\dfrac{\sum\left(x-\mu\right)^{2}}{n}[/latex]

Variance of a sample: [latex]s^{2}=\dfrac{\sum\left(x-\bar{x}\right)^{2}}{n-1}[/latex]

Important: The Describing and Exploring Quantitative Variables tool does not calculate the variance, so you will need to use the tool to calculate the standard deviation and then square it by hand in order to get the variance.
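If you would like to check your work outside the tool, the two variance formulas can be sketched in a few lines of Python. This is only an illustration, not part of the course tool: the data set and all variable names are made up for the example, and the same numbers are treated first as a whole population and then as a sample so the two divisors can be compared. The last step mirrors the note above by squaring the sample standard deviation to recover the sample variance.

```python
import math
import statistics

# Hypothetical data set, chosen only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(data)
mean = sum(data) / n  # the mean of these eight values

# Numerator shared by both formulas: the sum of squared deviations from the mean.
squared_deviations = sum((x - mean) ** 2 for x in data)

# Treating the data as a population: divide by n.
population_variance = squared_deviations / n

# Treating the data as a sample: divide by n - 1.
sample_variance = squared_deviations / (n - 1)

# Squaring the sample standard deviation gives the sample variance.
sample_sd = statistics.stdev(data)
variance_from_sd = sample_sd ** 2

print(population_variance)        # 4.0
print(round(sample_variance, 3))  # 4.571
print(math.isclose(variance_from_sd, sample_variance))  # True
```

Dividing by n - 1 instead of n makes the sample variance slightly larger than the population variance computed from the same values.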

question 7

Range

The simplest way to measure the variability of a data set is with the range:

Range = maximum value – minimum value

or

Range = largest value – smallest value

Larger values of the range indicate more variability in the data. However, the range uses only two observations, the maximum and the minimum, so it ignores how the remaining values are distributed between them and is sensitive to extreme values. On its own, the range is therefore not an ideal measure of spread, but when used in combination with other measures of spread, it can help us gain a clearer understanding of the spread of a distribution.
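A short Python sketch can make this limitation concrete. The two data sets and the helper name `data_range` below are hypothetical, invented for this illustration. Both sets share the same minimum and maximum, so they have the same range, yet their values are spread very differently, which the sample standard deviation picks up.

```python
import statistics

def data_range(values):
    """Return the range: maximum value minus minimum value."""
    return max(values) - min(values)

# Two hypothetical data sets with the same smallest and largest values.
clustered = [0, 5, 5, 5, 10]     # most values sit at the center
spread_out = [0, 0, 5, 10, 10]   # most values sit at the extremes

print(data_range(clustered))   # 10
print(data_range(spread_out))  # 10

# The standard deviation uses every observation, so it tells the two sets apart.
print(statistics.stdev(clustered) < statistics.stdev(spread_out))  # True
```

This is why the range is most useful alongside another measure of spread, such as the standard deviation.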

question 8

Summary

In this section, you’ve learned about variability in a data set in preparation for exploring data via the measures of center and spread. Let’s summarize where these skills showed up in the material.

  • In Questions 1, 2, and 3, you visually assessed the differences in variability, given comparative histograms or dotplots.
  • In Questions 4 and 5, you gained experience using the summary statistics feature of the Describing and Exploring Quantitative Variables tool.
  • In Questions 6–8, you used technology to calculate measures of variability: standard deviation, variance, and range.

Key formulas

Standard deviation of a population: [latex]\sigma = \sqrt{\dfrac{\sum \left(x-\mu\right)^2}{n}}[/latex], where [latex]\mu[/latex] represents the population mean.

Standard deviation of a sample: [latex]s=\sqrt{\dfrac{\sum \left(x-\bar{x}\right)^2}{n-1}}[/latex], where [latex]\bar{x}[/latex] represents the sample mean.

Variance of a population: [latex]\sigma^{2}=\dfrac{\sum\left(x-\mu\right)^{2}}{n}[/latex]

Variance of a sample: [latex]s^{2}=\dfrac{\sum\left(x-\bar{x}\right)^{2}}{n-1}[/latex]

Range: Range = maximum value – minimum value

Exploring the measures of center and spread to describe data is a necessary skill for completing the next activity. If you feel comfortable with these skills, it’s time to move on!