Learning Outcomes
- Describe the sampling distribution for sample proportions and use it to identify unusual (and more common) sample results.
The simulations on the previous page reinforce what we have observed about patterns in random sampling.
- Proportions from random samples approximate the population proportion, p, so sample proportions average out to the population proportion.
- Larger random samples better approximate the population proportion, so large samples have sample proportions closer to p. In other words, a sampling distribution for large samples has less variability.
- The distribution of sample proportions appears normal (at least for the examples we have investigated).
We can describe the sampling distribution with a mathematical model that has these same features.
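The three patterns listed above can be checked with a small simulation. The sketch below is illustrative only: the helper name `simulate_sample_proportions`, the sample sizes 25, 100, and 400, the choice of 2,000 samples per size, and the random seed are all assumptions made for this example, not values from the earlier simulations.

```python
import random
import statistics

def simulate_sample_proportions(p, n, num_samples, rng):
    """Draw num_samples random samples of size n; return each sample's proportion of successes."""
    proportions = []
    for _ in range(num_samples):
        # Each individual is a success with probability p.
        successes = sum(1 for _ in range(n) if rng.random() < p)
        proportions.append(successes / n)
    return proportions

rng = random.Random(0)  # fixed seed (an arbitrary choice) so the run is repeatable
p = 0.60                # population proportion from the part-time student example

results = {}
for n in (25, 100, 400):  # hypothetical sample sizes chosen for illustration
    props = simulate_sample_proportions(p, n, num_samples=2000, rng=rng)
    results[n] = (statistics.mean(props), statistics.pstdev(props))
    print(f"n={n:>3}: mean={results[n][0]:.3f}, standard deviation={results[n][1]:.3f}")
```

In a run like this, the means should all land near 0.60, and the standard deviation should shrink as n grows, which is the "less variability for large samples" pattern.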
Sampling Distribution of Sample Proportions
For a categorical variable, imagine a population with a proportion p of successes. (For example, for the variable gender, imagine a population of part-time college students with p = 0.60 female. Note that a success is the category of interest. It is what we are counting. Here a success is a female.) We create a mathematical model that describes the sample proportions from all possible random samples of size n from this population. The model has the following center, spread, and shape.
Center: Mean of the sample proportions is p, the population proportion.
Spread: Standard deviation of the sample proportions is [latex]\sqrt{\frac{p(1-p)}{n}}[/latex]. The standard deviation of the sampling distribution is also called the standard error.
Shape: A normal model is a good fit if the expected number of successes and the expected number of failures are each at least 10. We can translate these conditions into formulas: [latex]np\geq 10\text{ and }n(1-p)\geq 10[/latex].
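As a minimal sketch, the center, spread, and shape rules can be collected into one small function. The name `sampling_distribution_model` and the tiny floating-point tolerance are assumptions introduced for this illustration; the formulas themselves are the ones stated above.

```python
import math

def sampling_distribution_model(p, n):
    """Return (mean, standard error, normal-fit check) for sample proportions."""
    mean = p                                     # center: the population proportion
    standard_error = math.sqrt(p * (1 - p) / n)  # spread: sqrt(p(1-p)/n)
    # Shape: expected successes and expected failures must each be at least 10.
    # The small tolerance guards against floating-point round-off at exactly 10.
    tolerance = 1e-9
    normal_ok = (n * p >= 10 - tolerance) and (n * (1 - p) >= 10 - tolerance)
    return mean, standard_error, normal_ok

# Part-time college students: p = 0.60 female, samples of size 25
mean, se, normal_ok = sampling_distribution_model(p=0.60, n=25)
print(f"mean = {mean}, standard error = {se:.3f}, normal model fits: {normal_ok}")
```

Calling the function with a smaller n, such as `sampling_distribution_model(0.60, 20)`, gives only 8 expected failures, so the normal-fit check returns `False`.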
Comment
The distribution of sample proportions for ALL samples of the same size is called the sampling distribution of sample proportions.
In a simulation, we collect thousands of random samples to examine the distribution of sample proportions. But when we model this distribution, our model describes the sampling distribution that comes from ALL possible random samples of the same size.
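To see what "ALL possible random samples" means in practice, one option is to skip the simulation and compute the distribution directly. The sketch below assumes each individual is sampled independently with success probability p (reasonable when the population is much larger than the sample), so the number of successes follows a binomial distribution. That binomial assumption and the function name `exact_sampling_distribution` are additions for illustration, not part of the model as stated above.

```python
import math

def exact_sampling_distribution(p, n):
    """Map each possible sample proportion k/n to its probability (binomial assumption)."""
    return {
        k / n: math.comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n + 1)
    }

dist = exact_sampling_distribution(p=0.60, n=25)

# Mean and standard deviation taken over every possible sample result
exact_mean = sum(x * prob for x, prob in dist.items())
exact_sd = math.sqrt(sum((x - exact_mean) ** 2 * prob for x, prob in dist.items()))

print(f"mean = {exact_mean:.4f}, standard deviation = {exact_sd:.4f}")
```

Under this assumption, the computed mean and standard deviation agree with p and [latex]\sqrt{\frac{p(1-p)}{n}}[/latex], with no sampling noise, because nothing is left to chance.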
Example
Applying the Model for the Sampling Distribution
Let’s apply this model to our previous example about the population of part-time college students to see how it compares to our simulation. Recall that we assumed the population of part-time college students is 60% female. We selected samples of 25 part-time college students and calculated the proportion of females in each sample.
Feature | Simulation: Thousands of random samples, each with 25 individuals | Mathematical Model: ALL possible samples, each with 25 individuals |
---|---|---|
Mean of sample proportions | 0.6 | 0.6 |
Standard deviation of sample proportions (standard error) | 0.097 | [latex]\sqrt{\dfrac{0.6\left(1-0.6\right)}{25}}\approx 0.098[/latex] |
Shape of distribution of sample proportions | Approximately normal | Normal because conditions are met: [latex]\begin{array}{rcl}np&=&25\left(0.60\right)=15\\n\left(1-p\right)&=&25\left(0.40\right)=10\end{array}[/latex] |
Compare the mean and standard deviation we observed in the simulation to the mathematical model. Notice that the conditions are met, so a normal model is a good fit. We see that the model is a good description of the center, spread, and shape we observed in the simulation.
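The comparison can be reproduced in a few lines. This is a sketch under stated assumptions: it uses 5,000 simulated samples and an arbitrary random seed, neither of which comes from the original simulation, so the simulated values will differ slightly from those reported for that simulation.

```python
import math
import random
import statistics

rng = random.Random(1)               # arbitrary seed so the run is repeatable
p, n, num_samples = 0.60, 25, 5000   # num_samples is an assumed value

# Simulation: proportion of females in each random sample of 25 students
sample_proportions = [
    sum(rng.random() < p for _ in range(n)) / n
    for _ in range(num_samples)
]
sim_mean = statistics.mean(sample_proportions)
sim_sd = statistics.pstdev(sample_proportions)

# Mathematical model: mean p and standard error sqrt(p(1-p)/n)
model_mean = p
model_se = math.sqrt(p * (1 - p) / n)

print(f"simulation: mean = {sim_mean:.3f}, standard deviation = {sim_sd:.3f}")
print(f"model:      mean = {model_mean:.3f}, standard error    = {model_se:.3f}")
```

The two lines of output should nearly match, and the small gap that remains reflects the fact that a simulation uses only some of the possible samples.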
Try It
According to the National Postsecondary Student Aid Study conducted by the U.S. Department of Education in 2008, 62% of graduates from public universities had student loans.