Distinguishing between Population Parameters and Sample Statistics

Learning Outcomes

  • Distinguish between a population and a sample
  • Distinguish between a parameter and a statistic

Recall that a population is the entire group of individuals or objects that we want to study. Usually, it is not possible to study the whole population, so we collect data from a part of the population, called a sample. We use the sample to draw conclusions about the population.

For example, suppose our research question is “What is the average amount of money spent on textbooks per semester by full-time students at Seattle Central?” We cannot interview every full-time student at Seattle Central because it would take too much time and cost too much money. We therefore carefully select a sample of full-time students at Seattle Central to represent the population of all full-time students at that college. Then we collect data from the sample to estimate the average amount spent on textbooks.

This example illustrates how the research question guides the investigation. A well-stated research question contains information about:

  • The population (full-time students at Seattle Central).
  • The information we will collect from each individual in the sample. We call this the variable. The variable is what we plan to measure (amount of money spent on textbooks per semester), and is often represented by X.
  • A numerical characteristic about the population related to this variable (the average amount of money spent on textbooks per semester).

A numerical characteristic of a population is called a parameter. In the example above, we are interested in the average, or mean, amount of money that all full-time students spend on textbooks per semester. A population mean is represented by [latex]\mu[/latex].

We use information from the sample to estimate the population mean. A numerical characteristic of a sample is called a sample statistic. A natural first estimate of the population mean, [latex]\mu[/latex], is the sample mean, [latex]\overline{x}[/latex]. If a statement refers to a mean or average, it indicates that [latex]\mu[/latex] or [latex]\overline{x}[/latex] is being considered.
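
As a rough illustration of how a sample mean estimates a population mean, the Python sketch below builds a made-up population of textbook spending amounts, computes its mean (the parameter), then draws a random sample and computes the sample mean (the statistic). The spending values, the population size of 5,000, and the sample size of 100 are all assumptions chosen for illustration; they are not real data from Seattle Central.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: spending (in dollars) for 5,000 full-time students.
# These values are simulated, not real data.
population = [max(0.0, random.gauss(400, 120)) for _ in range(5000)]

# Parameter: computed from every member of the population.
mu = statistics.mean(population)

# Statistic: computed from a random sample of 100 students.
sample = random.sample(population, 100)
x_bar = statistics.mean(sample)

print(f"population mean (mu): {mu:.2f}")
print(f"sample mean (x-bar):  {x_bar:.2f}")
```

In practice we can only compute the sample mean, because the full population is not available. The simulation computes both values only to show that the two are typically close but not identical.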

Another type of research question we might ask about a population is whether the majority of students qualify for federal student loans. This question contains the same three pieces of information:

  • The population (full-time students at Seattle Central).
  • The information we will collect from each individual in the sample. This is the variable: whether or not each student in the sample qualifies for federal student loans.
  • A numerical characteristic of the population related to this variable (the proportion of all full-time students at Seattle Central who qualify for federal student loans).

A population proportion is represented by [latex]p[/latex]. We estimate a population proportion with a sample proportion, which is represented by [latex]\hat{p}[/latex]. If a statement quotes a percentage or a fraction, a proportion is being considered. For example, if 3 out of 5 people in our sample agree with a statement, the sample proportion is [latex]\hat{p} = \frac{3}{5} = 0.6 = 60\%[/latex].
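
The arithmetic behind a sample proportion can be sketched in a few lines of Python. The list of responses below is hypothetical, chosen so that 3 of the 5 people agree; the name `p_hat` is simply a stand-in for the sample proportion.

```python
# Hypothetical responses from a sample of 5 people (True = agrees).
responses = [True, False, True, True, False]

# Sample proportion: number who agree divided by the sample size.
# sum() counts the True values because True is treated as 1.
p_hat = sum(responses) / len(responses)

print(f"p-hat = {p_hat} = {p_hat:.0%}")
```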

Some examples of research questions and the corresponding population parameters are given in the following table.

Research Question | Population Parameter and Notation
What is the average number of hours students work per week? | Mean, [latex]\mu[/latex]
What percentage of students commute? | Proportion, [latex]p[/latex]
Do the majority of students participate in athletics? | Proportion, [latex]p[/latex]
Do athletes have a higher grade point average than non-athletes? | Means, [latex]\mu_{\mathrm{athletes}} \ \mathrm{and} \ \mu_{\mathrm{nonathletes}}[/latex]

In deciding whether information refers to a sample or a population, consider whether all members of the population have been included. If so, a numerical characteristic refers to a population parameter. The word “every” or “each” in a statement can be a clue that an entire population has been included.

If only a portion of the population has been included, a numerical characteristic refers to a sample statistic. If a statement refers to a relatively small number of what would be a large population, the information likely refers to a sample.
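
The decision rule described above can be written as a small Python sketch. The function name `classify` and the counts in the usage lines are hypothetical, and the sketch assumes we know both how many individuals were included and how large the population is, which real statements often leave to the reader to infer.

```python
def classify(n_included: int, population_size: int) -> str:
    """Label a numerical summary by how much of the population it covers."""
    if n_included <= 0 or n_included > population_size:
        raise ValueError("n_included must be between 1 and population_size")
    if n_included == population_size:
        # Every member was included, so the number describes the population.
        return "population parameter"
    # Only part of the population was included.
    return "sample statistic"


# Hypothetical scenarios (counts invented for illustration).
print(classify(40, 40))      # every member of a 40-person group was measured
print(classify(200, 12000))  # 200 members of a 12,000-person group were measured
```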

Example

Determine whether the number refers to a population parameter or a sample statistic.

  1. Based on records of all employees at a small company, the average annual salary is $52,539.
  2. In a survey of 50 athletes at a large university, 78% said they were happy they chose to continue competing at the collegiate level.
  3. In a study of 63 law firms in the United States, the average hourly billing rate was $279.
  4. 706 of the 2,223 passengers on the Titanic survived.
