In-Class Activity 12.A

Airbnb is a website that connects people who are renting out rooms or homes with people looking for accommodations in the same area. In this in-class activity, we are going to look at the prices of Airbnb listings under $500 per night in New York City and examine the behavior of sample mean prices for random samples of listings.

Question 1

1) The average Airbnb listing price in New York City is $130 per night.[1] Which of the following would be more surprising: a single randomly selected listing price of $200 per night or an average listing price of $200 per night in a random sample of 25 listings? Explain.

A woman standing with a suitcase looking out over a river in front of a city.

Credit: iStock/FillippoBacci

In-Class Activities 9.B and 9.C introduced you to sampling variability. A sampling  distribution is the probability distribution of a sample statistic, such as a sample mean  or sample proportion, as it varies from sample to sample. In-Class Activity 9.B  introduced you to the sampling distribution of a sample proportion.

In this activity, you will explore the sampling distribution of the sample mean for varying  sample sizes and discover the Central Limit Theorem as it applies to means.

Go to the DCMP Sampling Distribution of the Sample Mean (Continuous Population) tool at https://dcmathpathways.shinyapps.io/SampDist_cont/.

You will use this tool to simulate random samples of different sizes from all Airbnb listings under $500 in New York City (NYC), where the sample mean price (in USD) is  calculated in each sample. Enter the following inputs:

  • Select Population Distribution: Real Population Data
  • Select Example: New York Airbnb Prices

Question 2

2) Examine the population distribution of the NYC Airbnb listing prices displayed in the  data analysis tool. What shape is this distribution?

(a) Symmetric

(b) Skewed right

(c) Skewed left

(d) Normal

Question 3

3) Before using the tool to generate random samples, let’s make a few predictions.

a) If you were to take 1,000 random samples of [latex]n=2[/latex] NYC Airbnb listings and  calculate the sample mean price for each sample, what shape would you  expect the distribution of the sample means to have?

b) If you were to take 1,000 random samples of [latex]n=10[/latex] NYC Airbnb listings and  calculate the sample mean price for each sample, what shape would you  expect the distribution of the sample means to have?

c) If you were to take 1,000 random samples of [latex]n=50[/latex] NYC Airbnb listings and  calculate the sample mean price for each sample, what shape would you  expect the distribution of the sample means to have?

d) How would you expect the variability in sample mean listing prices to change  as the sample size increases?

e) How would you expect the average of the sample mean listing prices to  change as the sample size increases?

Question 4

4) Use the tool to generate 1,000 random samples for each of the following sample  sizes. Select the “Show Normal Approximation” box to overlay a normal distribution  on each simulated sampling distribution. For each sample size:

  • Sketch the graph of the resulting sampling distribution of the sample mean listing prices, including the x-axis label and scale (use the same x-axis scale for all three plots) and the overlaid normal distribution.
  • Write down the mean of the sampling distribution of the sample mean listing  prices.
  • Write down the standard deviation of the sampling distribution of the sample  mean listing prices.

a) [latex]n= 2[/latex]

b) [latex]n= 10[/latex]

c) [latex]n= 50[/latex]
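
If you would like to see what the tool is doing behind the scenes, the repeated-sampling process can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population below is a hypothetical right-skewed stand-in generated from a gamma distribution, not the real NYC Airbnb data, and the function name `simulate_sample_means` is made up for this sketch. Samples are drawn with replacement, which makes little difference when the population is large.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical stand-in population (NOT the real Airbnb data):
# right-skewed "prices", keeping only values under $500.
population = [p for p in (rng.gammavariate(2.0, 65.0) for _ in range(40_000)) if p < 500]

def simulate_sample_means(population, n, num_samples=1_000):
    """Return the mean of each of num_samples random samples of size n."""
    return [statistics.fmean(rng.choices(population, k=n)) for _ in range(num_samples)]

results = {}
for n in (2, 10, 50):
    means = simulate_sample_means(population, n)
    # Store (mean of the sample means, standard deviation of the sample means).
    results[n] = (statistics.fmean(means), statistics.stdev(means))
    print(f"n={n:>2}: mean of sample means = {results[n][0]:.2f}, "
          f"SD of sample means = {results[n][1]:.2f}")
```

Because the population here is invented, the printed values will not match the tool's output. Use the tool's values for your answers.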

Question 5

5) Now, you’ll compare the results in Question 4 to your predictions from Question 3.

a) As the sample size increases, how does the shape of the sampling  distribution of the mean listing prices change? Does this match the pattern in  your predicted shapes from Question 3, Parts A, B, and C?

b) As the sample size increases, how does the standard deviation of the  sampling distribution of the mean listing prices change? Does this match  your prediction from Question 3, Part D?

c) As the sample size increases, how does the mean of the sampling distribution of the mean listing prices change? Does this match your  prediction from Question 3, Part E?

We can find the mean and standard deviation of the sample means either through simulation, as in Question 4, or through mathematical formulas. Suppose the population has mean [latex]\mu[/latex] and standard deviation [latex]\sigma[/latex]. Then the mean of the sample means equals the population mean, [latex]\mu[/latex], while the standard deviation of the sample means decreases as the sample size [latex]n[/latex] increases. Specifically, the standard deviation of the sample means is [latex]\frac{\sigma}{\sqrt{n}}[/latex].

Sampling Distribution of the Sample Mean

When taking many random samples of size [latex]n[/latex] from a population distribution with mean [latex]\mu[/latex] and standard deviation [latex]\sigma[/latex]:

The mean of the distribution of the sample means is [latex]\mu[/latex].

The standard deviation of the distribution of the sample means is [latex]\frac{\sigma}{\sqrt{n}}[/latex].
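
The two formulas above can be tried out with a short Python sketch. The population values (a mean of 100 and a standard deviation of 40) and the sample sizes are hypothetical numbers chosen for easy arithmetic, not the NYC Airbnb values, and the function name is invented for this illustration.

```python
import math

def sampling_distribution_of_mean(mu, sigma, n):
    """Return (mean, standard deviation) of the sample mean for samples of size n."""
    return mu, sigma / math.sqrt(n)

# Hypothetical population values chosen for easy arithmetic.
mu, sigma = 100.0, 40.0
for n in (4, 16, 100):
    mean, sd = sampling_distribution_of_mean(mu, sigma, n)
    print(f"n={n:>3}: mean = {mean:.1f}, SD = {sd:.1f}")
# n=  4: mean = 100.0, SD = 20.0
# n= 16: mean = 100.0, SD = 10.0
# n=100: mean = 100.0, SD = 4.0
```

Notice that the mean stays the same for every sample size, while making the sample four times larger cuts the standard deviation in half.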

Question 6

6) Examine again the population distribution of the NYC Airbnb listing prices displayed in the data analysis tool. What are the values of [latex]\mu[/latex] and [latex]\sigma[/latex]?

Question 7

7) Calculate the mean and standard deviation of sample mean listing prices for each of  the following sample sizes using the mathematical formulas given previously.

a) [latex]n=2[/latex]

b) [latex]n=10[/latex]

c) [latex]n=50[/latex]

Question 8

8) Compare the simulated mean and standard deviation of the sample mean listing prices from Question 4 to those calculated in Question 7. Do the values seem similar? (Hint: They should!) 

In In-Class Activity 9.C, you saw the Central Limit Theorem at work for sample proportions. In this activity, you just witnessed the Central Limit Theorem at work for sample means. The Central Limit Theorem states that, as the sample size gets larger, the distribution of the sample mean will become closer to a normal distribution. Combining this with the expressions for the mean and standard deviation of the sample means results in:

Sampling Distribution of the Sample Mean

When taking many random samples of size [latex]n[/latex] from a population distribution with mean [latex]\mu[/latex] and standard deviation [latex]\sigma[/latex]:

The mean of the distribution of the sample means is [latex]\mu[/latex].

The standard deviation of the distribution of the sample means is [latex]\frac{\sigma}{\sqrt{n}}[/latex].

If the population distribution is not normal, the Central Limit Theorem states that the distribution of the sample means still follows an approximate normal distribution as long as the sample size is large (e.g., [latex]n \geq 30[/latex]) and the population distribution is not strongly skewed.
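
One way to see the sample means moving toward a normal shape is to measure their skewness, which is 0 for a perfectly symmetric distribution. The sketch below is an illustration under stated assumptions: the population is a hypothetical right-skewed one (exponential with mean 100), not real price data, and `skewness` is a helper written for this example that averages the cubed z-scores.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable

def skewness(values):
    """Average cubed z-score: positive for right skew, 0 for a symmetric distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])

# Hypothetical right-skewed population (exponential with mean 100), not real price data.
population = [rng.expovariate(1 / 100) for _ in range(20_000)]

skew_by_n = {}
for n in (2, 10, 50):
    # 5,000 random samples of size n, drawn with replacement.
    means = [statistics.fmean(rng.choices(population, k=n)) for _ in range(5_000)]
    skew_by_n[n] = skewness(means)
    print(f"n={n:>2}: skewness of sample means = {skew_by_n[n]:.2f}")
```

The skewness of the sample means should shrink toward 0 as the sample size grows, even though the population itself remains skewed.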

Question 9

9) Suppose you are planning a vacation to Los Angeles (LA), and you would like to  learn more about the distribution of the Airbnb listing prices in LA. You take a random sample of 50 LA Airbnb listings. The mean listing price in your sample is  $152.

Assume the population of all LA Airbnb listing prices has the same mean and  standard deviation as that for NYC. (You found these in Question 6.)

a) Use the mean and standard deviation you found in Question 7, Part C to  calculate the z-score for a sample mean of $152. Write a sentence  interpreting this value in the context of the problem.

b) Using the normal approximation, find the probability of observing a sample  mean listing price of $152 or higher from a random sample of 50 listings.

c) Based on your answers in Parts A and B, do these data provide evidence that the mean Airbnb listing price in LA is higher than the mean Airbnb listing  price in NYC? Explain.
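
The z-score and normal-approximation steps in Question 9 can be checked with a small Python sketch. Every number below is hypothetical (a population mean of 100, a population standard deviation of 40, a sample of 64, and a sample mean of 110), and the function names are invented for this illustration. Substitute your own values from Questions 6 and 7 to check your work.

```python
import math

def z_score(sample_mean, mu, sigma, n):
    """Number of standard deviations (of the sample mean) that sample_mean lies from mu."""
    return (sample_mean - mu) / (sigma / math.sqrt(n))

def upper_tail_probability(z):
    """P(Z >= z) for a standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# Hypothetical values, not the worksheet's: mu = 100, sigma = 40, n = 64, sample mean = 110.
z = z_score(110.0, mu=100.0, sigma=40.0, n=64)
p = upper_tail_probability(z)
print(f"z = {z:.2f}, P(sample mean >= 110) = {p:.4f}")
# z = 2.00, P(sample mean >= 110) = 0.0228
```

In this hypothetical case, a sample mean of 110 is 2 standard deviations above the population mean, and a sample mean at least that large would occur in only about 2.3% of random samples.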


  1. Kaggle. (2019). New York City Airbnb open data. https://www.kaggle.com/dgomonov/new-york-city-airbnb-open-data