The Law of Large Numbers and the Mean

Learning Outcomes

Calculate a mean from a grouped frequency table

The Law of Large Numbers says that if you take samples of larger and larger size from any population, then the mean [latex]\displaystyle\overline{{x}}[/latex] of the sample is very likely to get closer and closer to [latex]µ[/latex]. This is discussed in more detail later in the text.

Sampling Distributions and Statistic of a Sampling Distribution

You can think of a sampling distribution as a relative frequency distribution with a great many samples. Suppose thirty randomly selected students were asked the number of movies they watched the previous week. The results are in the relative frequency table shown below.

# of movies	Relative Frequency
[latex]0[/latex]	[latex]\displaystyle\frac{{5}}{{30}}[/latex]
[latex]1[/latex]	[latex]\displaystyle\frac{{15}}{{30}}[/latex]
[latex]2[/latex]	[latex]\displaystyle\frac{{6}}{{30}}[/latex]
[latex]3[/latex]	[latex]\displaystyle\frac{{3}}{{30}}[/latex]
[latex]4[/latex]	[latex]\displaystyle\frac{{1}}{{30}}[/latex]

If you let the number of samples get very large (say, 300 million or more), the relative frequency table becomes a relative frequency distribution.

A statistic is a number calculated from a sample. Statistic examples include the mean, the median, and the mode as well as others. The sample mean [latex]\displaystyle\overline{{x}}[/latex] is an example of a statistic which estimates the population mean [latex]μ[/latex].

Calculating the Mean of Grouped Frequency Tables

When only grouped data is available, you do not know the individual data values (we only know intervals and interval frequencies); therefore, you cannot compute an exact mean for the data set. What we must do is estimate the actual mean by calculating the mean of a frequency table. A frequency table is a data representation in which grouped data is displayed along with the corresponding frequencies. To calculate the mean from a grouped frequency table we can apply the basic definition of mean:
[latex]\displaystyle\text{mean}=\frac{{\text{data sum}}}{{\text{number of data values}}}[/latex]. We simply need to modify the definition to fit within the restrictions of a frequency table.

Since we do not know the individual data values we can instead find the midpoint of each interval. The midpoint is [latex]\displaystyle\frac{{\text{lower boundary } + \text{ upper boundary}}}{{2}}[/latex] We can now modify the mean definition to be [latex]\displaystyle\text{Mean of Frequency Table} = \frac{\sum\nolimits{fm}}{\sum\nolimits{f}}[/latex] where [latex]f[/latex] = the frequency of the interval and [latex]m[/latex] = the midpoint of the interval.

example

A frequency table displaying professor Blount’s last statistic test is shown. Find the best estimate of the class mean.

Grade Interval	Number of Students
[latex]50–56.5[/latex]	[latex]1[/latex]
[latex]56.5–62.5[/latex]	[latex]0[/latex]
[latex]62.5–68.5[/latex]	[latex]4[/latex]
[latex]68.5–74.5[/latex]	[latex]4[/latex]
[latex]74.5–80.5[/latex]	[latex]2[/latex]
[latex]80.5–86.5[/latex]	[latex]3[/latex]
[latex]86.5–92.5[/latex]	[latex]4[/latex]
[latex]92.5–98.5[/latex]	[latex]1[/latex]

Show Solution

Try It

Maris conducted a study on the effect that playing video games has on memory recall. As part of her study, she compiled the following data:

Hours Teenagers Spend on Video Games	Number of Teenagers
[latex]0–3.5[/latex]	[latex]3[/latex]
[latex]3.5–7.5[/latex]	[latex]7[/latex]
[latex]7.5–11.5[/latex]	[latex]12[/latex]
[latex]11.5–15.5[/latex]	[latex]7[/latex]
[latex]15.5–19.5[/latex]	[latex]9[/latex]

What is the best estimate for the mean number of hours spent playing video games?

Show Solution

Module 2: Descriptive Statistics