5.1.3: Descriptive Statistics: Mean

Learning Outcomes

  • Calculate the mean of a set of numbers
  • Calculate the mean from a frequency table or a grouped frequency table

Key Words

  • Mean: the arithmetic average

Mean

So far we have studied two averages: the median and the mode. A third average is the mean, which is often called the arithmetic average. For a finite set of numbers, it is computed by dividing the sum of the data values by the number of data values. When people talk about the average, they are usually referring to the mean. Technically, the mean is the arithmetic average, while an average is any measure of central location, but the two terms are used interchangeably in common practice.

Mean

The mean of a set of [latex]n[/latex] numerical data points is the arithmetic average of the numbers.

[latex]\text{mean}={\frac{\text{sum of values in data set}}{n}}[/latex]

where [latex]n[/latex] is the number of data points.

Suppose Eva’s first three test scores were [latex]85,88,\text{and }94[/latex]. To find the mean score, we would add them and divide by [latex]3[/latex].

[latex]\begin{array}{rcl} {\frac{85+88+94}{3}} &=& {\frac{267}{3}}\\ &=& 89\end{array}[/latex]

The mean test score is [latex]89[/latex] points.

The mean of a set of data is denoted by either the Greek letter [latex]\mu[/latex] (mu) when representing the population mean, or [latex]\displaystyle\overline{{x}}[/latex] (read “[latex]x[/latex] bar”) when representing a sample mean. One of the requirements for the sample mean to be a good estimate of the population mean is that the sample be truly random.

Calculate the mean of a set of numbers.

  1. Write the formula for the mean
    [latex]\text{mean}={\frac{\text{sum of values in the data set}}{n}}[/latex]
  2. Find the sum of all the values in the set. Write the sum in the numerator.
  3. Count the number, [latex]n[/latex], of values in the set. Write this number in the denominator.
  4. Divide the numerator by the denominator.
  5. Check to see that the mean is reasonable. The mean is a measure of the center of the data set, so it can never be less than the least number or greater than the greatest number in the set (it equals them only when every value is the same).

Example

Find the mean of the numbers [latex]8,12,15, 9,\text{ and }6[/latex].

Solution

Write the formula for the mean: [latex]\text{mean}={\frac{\text{sum of all the numbers}}{n}}[/latex]
Write the sum of the numbers in the numerator. [latex]\text{mean}={\frac{8+12+15+9+6}{n}}[/latex]
Count how many numbers are in the set. There are [latex]5[/latex] numbers in the set, so [latex]n=5[/latex] . [latex]\text{mean}={\frac{8+12+15+9+6}{5}}[/latex]
Add the numbers in the numerator. [latex]\text{mean}={\frac{50}{5}}[/latex]
Then divide. [latex]\text{mean}=10[/latex]
Check to see that the mean is ‘typical’: [latex]10[/latex] is neither less than [latex]6[/latex] nor greater than [latex]15[/latex]. The mean is [latex]10[/latex].
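
The same steps can be sketched in code. The following is a minimal Python illustration rather than part of the lesson itself; the function name `mean` and the variable names are our own choices, and Python's standard `statistics.mean` could be used instead.

```python
def mean(values):
    """Return the arithmetic mean: the sum of the values divided by n."""
    n = len(values)                  # count the values (the denominator)
    if n == 0:
        raise ValueError("the mean of an empty data set is undefined")
    total = sum(values)              # add the values (the numerator)
    return total / n                 # divide the numerator by the denominator


data = [8, 12, 15, 9, 6]
result = mean(data)
print(result)                        # 10.0

# Reasonableness check: the mean lies between the least and greatest values.
print(min(data) <= result <= max(data))   # True
```

Calling `mean([85, 88, 94])` reproduces Eva's mean test score of 89.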


A table that shows the frequency of each data value in a sample is called a frequency table. It is useful when some values in the data set occur more than once, so that the frequency of each value can be shown. The mean can be calculated by multiplying each distinct value by its frequency, adding these products, and then dividing the sum by the total number of data values.
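
In symbols (the notation here is ours, introduced only for illustration), if the distinct values are [latex]x_1, x_2, \ldots, x_k[/latex] and their frequencies are [latex]f_1, f_2, \ldots, f_k[/latex], the calculation can be written as:

[latex]\displaystyle\text{mean}=\frac{f_1x_1+f_2x_2+\cdots+f_kx_k}{f_1+f_2+\cdots+f_k}[/latex]

For instance, for the data [latex]2, 2, 2, 5[/latex] (the value [latex]2[/latex] with frequency [latex]3[/latex] and the value [latex]5[/latex] with frequency [latex]1[/latex]), the mean is [latex]\displaystyle\frac{(3)(2)+(1)(5)}{3+1}=\frac{11}{4}=2.75[/latex].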

Example

AIDS data indicating the number of months a patient with AIDS lives after taking a new antibody drug are as follows:

Number of Months Frequency
[latex]3[/latex] [latex]1[/latex]
[latex]4[/latex] [latex]1[/latex]
[latex]8[/latex] [latex]2[/latex]
[latex]10[/latex] [latex]1[/latex]
[latex]11[/latex] [latex]1[/latex]
[latex]12[/latex] [latex]1[/latex]
[latex]13[/latex] [latex]1[/latex]
[latex]14[/latex] [latex]1[/latex]
[latex]15[/latex] [latex]2[/latex]
[latex]16[/latex] [latex]2[/latex]
[latex]17[/latex] [latex]2[/latex]
[latex]18[/latex] [latex]1[/latex]
[latex]21[/latex] [latex]1[/latex]
[latex]22[/latex] [latex]2[/latex]
[latex]24[/latex] [latex]2[/latex]
[latex]25[/latex] [latex]1[/latex]
[latex]26[/latex] [latex]2[/latex]
[latex]27[/latex] [latex]2[/latex]
[latex]29[/latex] [latex]2[/latex]
[latex]31[/latex] [latex]1[/latex]
[latex]32[/latex] [latex]1[/latex]
[latex]33[/latex] [latex]2[/latex]
[latex]34[/latex] [latex]2[/latex]
[latex]35[/latex] [latex]1[/latex]
[latex]37[/latex] [latex]1[/latex]
[latex]40[/latex] [latex]1[/latex]
[latex]44[/latex] [latex]2[/latex]
[latex]47[/latex] [latex]1[/latex]

Calculate the mean.

Solution

The calculation for the mean is:

[latex]\displaystyle\overline{x}=\frac{3+4+(8)(2)+10+11+12+13+14+(15)(2)+(16)(2)+\ldots+35+37+40+(44)(2)+47}{40}=\frac{943}{40}\approx 23.6\text{ months}[/latex]
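
Because the formula above abbreviates the middle terms, it can help to check the full sum by machine. The sketch below is a hedged Python illustration, not part of the original solution: the list `months` holds the (value, frequency) pairs copied from the table, and the helper name `frequency_table_mean` is hypothetical.

```python
# (number of months, frequency) pairs copied from the frequency table
months = [
    (3, 1), (4, 1), (8, 2), (10, 1), (11, 1), (12, 1), (13, 1),
    (14, 1), (15, 2), (16, 2), (17, 2), (18, 1), (21, 1), (22, 2),
    (24, 2), (25, 1), (26, 2), (27, 2), (29, 2), (31, 1), (32, 1),
    (33, 2), (34, 2), (35, 1), (37, 1), (40, 1), (44, 2), (47, 1),
]


def frequency_table_mean(table):
    """Mean of a frequency table: sum of value*frequency over sum of frequencies."""
    weighted_sum = sum(value * freq for value, freq in table)
    n = sum(freq for _, freq in table)   # total number of data values
    return weighted_sum / n


n = sum(freq for _, freq in months)
x_bar = frequency_table_mean(months)
print(n)       # 40
print(x_bar)   # 23.575, which rounds to 23.6
```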

Try It

The following data show the number of months patients typically wait on a transplant list before getting surgery. The data are ordered from smallest to largest. Calculate the mean, median, and mode.

Number of Months Frequency
[latex]3[/latex] [latex]1[/latex]
[latex]4[/latex] [latex]1[/latex]
[latex]5[/latex] [latex]1[/latex]
[latex]7[/latex] [latex]4[/latex]
[latex]8[/latex] [latex]2[/latex]
[latex]9[/latex] [latex]2[/latex]
[latex]10[/latex] [latex]5[/latex]
[latex]11[/latex] [latex]1[/latex]
[latex]12[/latex] [latex]2[/latex]
[latex]13[/latex] [latex]1[/latex]
[latex]14[/latex] [latex]2[/latex]
[latex]15[/latex] [latex]2[/latex]
[latex]17[/latex] [latex]2[/latex]
[latex]18[/latex] [latex]1[/latex]
[latex]19[/latex] [latex]3[/latex]
[latex]21[/latex] [latex]2[/latex]
[latex]22[/latex] [latex]2[/latex]
[latex]23[/latex] [latex]1[/latex]
[latex]24[/latex] [latex]4[/latex]

Notice that the mean is [latex]13.95[/latex] months, the median is [latex]13[/latex] months, and the mode is [latex]10[/latex] months. Although the three values are different, each is a measure of average.

Mean and Median. What’s the Difference?

Since the mean and the median are both measures of the center of a data set, how do we know which one is more representative of the data? The next example considers this question.

Example

Suppose that in a small town of [latex]50[/latex] people, one person earns $[latex]5,000,000[/latex] per year and the other [latex]49[/latex] each earn $[latex]30,000 [/latex]. Which is the better measure of the “center”: the mean or the median?
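
One way to explore this question is to compute both measures. The following is a minimal Python sketch using the standard library's `statistics` module; the variable name `incomes` is our own, and the code only illustrates the comparison.

```python
from statistics import mean, median

# One person earns $5,000,000; the other 49 each earn $30,000.
incomes = [5_000_000] + [30_000] * 49

print(f"mean   = ${mean(incomes):,.0f}")     # mean   = $129,400
print(f"median = ${median(incomes):,.0f}")   # median = $30,000
```

The single very large income pulls the mean up to $129,400, far more than 49 of the 50 residents earn, while the median stays at $30,000.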

 

Try It

In a sample of [latex]61[/latex] households, one house is worth $[latex]2,500,000[/latex]. Half of the rest (30) are worth $[latex]280,000[/latex], and all the others (30) are worth $[latex]315,000[/latex]. Which is the better measure of the “center”: the mean or the median?

When the values in a data set are roughly symmetric about the center, the mean and the median are close together. When the values are skewed to the left or right of center, or when the data set contains outliers, the median is the better choice.

In addition, the Law of Large Numbers says that if we take samples of larger and larger size from any population, then the mean, [latex]\displaystyle\overline{{x}}[/latex], of the sample is very likely to get closer and closer to the population mean, [latex]\mu[/latex].
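
The Law of Large Numbers can be explored with a small simulation. The sketch below is an illustration under our own assumptions, not an example from the lesson: the population is the set of outcomes of a fair six-sided die, whose population mean is [latex]3.5[/latex], and the function name `sample_mean_of_die_rolls` is hypothetical.

```python
import random

random.seed(0)  # fixed seed so that repeated runs give the same rolls

POPULATION_MEAN = 3.5  # (1 + 2 + 3 + 4 + 5 + 6) / 6 for a fair die


def sample_mean_of_die_rolls(n):
    """Roll a simulated fair die n times and return the sample mean."""
    rolls = [random.randint(1, 6) for _ in range(n)]
    return sum(rolls) / n


# Print the sample size, the sample mean, and its distance from 3.5.
for n in (10, 100, 10_000, 100_000):
    x_bar = sample_mean_of_die_rolls(n)
    print(n, x_bar, abs(x_bar - POPULATION_MEAN))
```

Because the law says only that the sample mean is very likely to be close for large samples, the distance need not shrink at every step, but for the largest samples it is typically very small.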

Calculating the Mean of Grouped Frequency Tables

When only grouped data is available, we do not know the individual data values (we know only the classes and the class frequencies), so we cannot compute an exact mean for the data set. Instead, we estimate the mean from the grouped frequency table. Remember that a grouped frequency table displays classes of data values along with the corresponding frequencies. To calculate the mean from a grouped frequency table, we can apply the basic definition of the mean:
[latex]\displaystyle\text{mean}=\frac{{\text{data sum}}}{{\text{number of data values}}}[/latex]. We simply need to modify the definition to fit within the restrictions of a frequency table.

Since we do not know the individual data values, we instead use the midpoint of each class to stand in for every value in that class. The midpoint is the mean of the lower and upper class boundaries: [latex]\displaystyle\frac{{\text{lower boundary } + \text{upper boundary}}}{{2}}[/latex]. Then, the best estimate of the mean is:

Grouped Frequency Mean

The best estimate of the mean of a grouped frequency table is:

[latex]\text{mean}={\frac{\text{sum of the products of each class frequency and its midpoint}}{\text{sum of the frequencies}}}[/latex]

Example

A frequency table displaying the scores on Professor Payne’s last math test is shown. Find the best estimate of the class mean.

Grade Interval Number of Students
[latex]50–56.5[/latex] [latex]1[/latex]
[latex]56.5–62.5[/latex] [latex]0[/latex]
[latex]62.5–68.5[/latex] [latex]4[/latex]
[latex]68.5–74.5[/latex] [latex]4[/latex]
[latex]74.5–80.5[/latex] [latex]2[/latex]
[latex]80.5–86.5[/latex] [latex]3[/latex]
[latex]86.5–92.5[/latex] [latex]4[/latex]
[latex]92.5–98.5[/latex] [latex]1[/latex]

Solution

Start by finding the midpoints for all the classes:

Grade Interval Midpoint
[latex]50–56.5[/latex] [latex]53.25[/latex]
[latex]56.5–62.5[/latex] [latex]59.5[/latex]
[latex]62.5–68.5[/latex] [latex]65.5[/latex]
[latex]68.5–74.5[/latex] [latex]71.5[/latex]
[latex]74.5–80.5[/latex] [latex]77.5[/latex]
[latex]80.5–86.5[/latex] [latex]83.5[/latex]
[latex]86.5–92.5[/latex] [latex]89.5[/latex]
[latex]92.5–98.5[/latex] [latex]95.5[/latex]

Then calculate the sum of the products of each interval frequency and its midpoint:

[latex]53.25(1) + 59.5(0) + 65.5(4) + 71.5(4) + 77.5(2) + 83.5(3) + 89.5(4) + 95.5(1) = 1460.25[/latex]
 

Then divide by the total frequency, [latex]19[/latex]:

[latex]\displaystyle\mu=\frac{1460.25}{19}\approx 76.86[/latex]

NOTE: The test scores represent a population (Professor Payne’s class), so the mean is denoted by [latex]\mu[/latex].

When carrying out this calculation, use a calculator to find the numerator and then immediately divide by the denominator. Do not round the numerator before dividing, as this introduces rounding error.
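
The grouped calculation can also be sketched in Python. This is a minimal illustration, not the lesson's own method of computation: the list `classes` holds the (lower boundary, upper boundary, frequency) triples from the grade table, and the function name `grouped_mean` is hypothetical. The code keeps full precision and rounds only when printing the final result.

```python
# (lower boundary, upper boundary, number of students) from the grade table
classes = [
    (50, 56.5, 1), (56.5, 62.5, 0), (62.5, 68.5, 4), (68.5, 74.5, 4),
    (74.5, 80.5, 2), (80.5, 86.5, 3), (86.5, 92.5, 4), (92.5, 98.5, 1),
]


def grouped_mean(classes):
    """Estimate the mean of grouped data from class midpoints and frequencies."""
    # Sum of (midpoint * frequency) over all classes
    weighted_sum = sum(((low + high) / 2) * freq for low, high, freq in classes)
    # Total number of data values
    total_frequency = sum(freq for _, _, freq in classes)
    return weighted_sum / total_frequency


estimate = grouped_mean(classes)
print(round(estimate, 2))   # 76.86
```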

Try It

Maris conducted a study on the effect that playing video games has on memory recall. As part of her study, she compiled the following data:

Hours Teenagers Spend on Video Games Number of Teenagers
[latex]0–3.5[/latex] [latex]3[/latex]
[latex]3.5–7.5[/latex] [latex]7[/latex]
[latex]7.5–11.5[/latex]  [latex]12[/latex]
[latex]11.5–15.5[/latex] [latex]7[/latex]
[latex]15.5–19.5[/latex] [latex]9[/latex]

What is the best estimate for the mean number of hours spent playing video games?