12D Preview

Preparing for the next class

In the next in-class activity, you will need to be able to determine if two samples are  independent or paired and use information from independent samples to assess  whether the assumptions/conditions for the two-sample t confidence interval are  reasonably met.

When you are interested in estimating a difference in population means, you usually start with data from samples from each of the populations of interest. There are two different strategies for selecting the two samples. One strategy is to select a sample from one population and then independently select a sample from the second population. Using this strategy results in two samples where the individuals selected for the first sample do not influence the individuals selected for the second sample. This would be the case if you take a random sample from each population. Samples selected in this way are said to be independent samples.
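As a rough illustration of this first strategy, the sketch below draws a separate random sample from each of two populations. The ID lists, population sizes, and seed are made up for the example and do not come from any real study.

```python
import random

# Hypothetical populations: ID labels for two non-overlapping groups.
population_1 = [f"A{i}" for i in range(1, 201)]  # 200 individuals (assumed)
population_2 = [f"B{i}" for i in range(1, 301)]  # 300 individuals (assumed)

rng = random.Random(12)  # fixed seed so the sketch is repeatable

# Each sample is drawn on its own, without replacement. Who is selected from
# population 1 plays no part in who is selected from population 2.
sample_1 = rng.sample(population_1, 40)
sample_2 = rng.sample(population_2, 40)

print(len(sample_1), len(sample_2))
```

Because no individual in one sample is matched to a particular individual in the other, there is no natural way to line up the two lists observation by observation.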

Question 1

1) Suppose you are interested in estimating the difference in mean typing speeds for  high school students who have taken a keyboarding class and high school students  who have not taken a keyboarding class.

Consider the following ways of selecting the samples of high school students.

a) You select a random sample of 40 students from the population of all  students at a high school who have taken the keyboarding class offered by  the school, and you also select a random sample of 40 students from the  population of all students who have not taken the keyboarding class. Would  this result in independent samples? Explain.

Hint: Look back at the explanation of what it means for samples to be independent.

b) You select a random sample of students who have signed up for the  keyboarding class to represent the population of students who have not yet  taken the keyboarding class. You measure their typing speeds before they  start the class. Then, at the end of the class, you use this same sample of  students to represent the population of students who have taken the  keyboarding class and measure their typing speeds. Would this result in  independent samples? Explain.

Hint: There are still two samples—one from the population of students who have not had a keyboarding class and one from the population of students who have had a  keyboarding class—but are they independent samples?

The strategy for selecting the samples in Question 1, part (b) results in samples where each observation in one sample is paired in a logical way with a particular observation in the second sample. In that example, the observations would be paired by student: each student has one typing speed measured before the keyboarding class and one measured after it. If samples are chosen in a way that results in the observations in one sample being paired with the observations in the other sample, the samples are said to be paired samples. Paired samples are also sometimes called dependent samples.

Paired samples commonly arise when data are collected both before and after some intervention (like the keyboarding class). But there are other data collection methods that can result in paired samples. One example would be if participants in a study to evaluate the effect of exercise (light vs. moderate exercise) were paired by weight prior to the study, and then one person from each pair was randomly assigned to each exercise group. This would result in exercise groups that were similar with respect to weight, and the two samples would be paired because there is a logical way to match an observation from the light exercise group with a particular observation from the moderate exercise group.
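The weight-matching idea can be sketched in code. This is only one possible way to form the pairs (sort by weight and pair neighbors), and the function name `matched_pairs_by_weight`, the participant labels, and the weights are all hypothetical.

```python
import random

def matched_pairs_by_weight(participants, seed=0):
    """Pair participants with similar weights, then randomly split each pair.

    participants: list of (label, weight) tuples; the list length must be even.
    Returns a list of dicts, one per pair, with keys "light" and "moderate".
    """
    if len(participants) % 2 != 0:
        raise ValueError("Need an even number of participants to form pairs.")
    rng = random.Random(seed)
    ordered = sorted(participants, key=lambda p: p[1])  # sort by weight
    pairs = []
    for i in range(0, len(ordered), 2):
        pair = [ordered[i], ordered[i + 1]]  # neighbors have similar weights
        rng.shuffle(pair)                    # random assignment within the pair
        pairs.append({"light": pair[0], "moderate": pair[1]})
    return pairs

# Made-up participants: (label, weight in pounds)
participants = [("P1", 150), ("P2", 182), ("P3", 149),
                ("P4", 201), ("P5", 180), ("P6", 199)]
pairs = matched_pairs_by_weight(participants)
for pair in pairs:
    print(pair)
```

Each dictionary links one observation in the light exercise group to a specific observation in the moderate exercise group, which is what makes the two samples paired rather than independent.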

It is important to make a distinction between independent samples and paired samples  because the way the data from the samples are analyzed is different for these two  cases. In the next in-class activity, you will be working only with independent samples.  How to analyze data from paired samples is the topic of In-Class Activity 13.D.

Question 2

2) Suppose you want to compare the mean reaction times (while using cell phones) of students at your college to the mean reaction times (while not using cell phones) of  students at your college.

a) Give an example of a strategy that would result in two samples of size 40  that are independent samples.

Hint: Your strategy would involve 80 participants.

b) Give an example of a strategy that would result in two samples of size 40  that are paired samples.

Hint: Think of a way you could collect data from 40 total participants, or think of a  way you could collect data on 80 participants who have been paired in some way.

Recall that when you are using a confidence interval to estimate a population  parameter, there are a few assumptions/conditions that you should check before  proceeding. When you are interested in estimating a difference in population means  using data from independent samples, you will use a two-sample t confidence interval.  The conditions that you need to check for the two-sample t confidence interval are:

  1. The samples are independent.
  2. Each sample is a random sample from the corresponding population of interest, or it is reasonable to regard the sample as if it were a random sample. It is reasonable to regard the sample as a random sample if it was selected in a way that should result in the sample being representative of the population. If the data are from an experiment, you just need to check that there was random assignment to experimental groups. Random assignment substitutes for the random sample condition and also results in independent samples.
  3. For each population, the distribution of the variable that was measured is approximately normal, or the sample size for the sample from that population is large. Usually, a sample of size 30 or more is considered to be “large.” If a sample size is less than 30, you should look at a plot of the data from that sample (a dotplot, a boxplot, or, if the sample size isn’t really small, a histogram) to make sure that the distribution looks approximately symmetric and that there are no outliers.

Notice the last two conditions are the same as those for the one-sample t confidence  interval. You just have to remember to check them for each of the two samples and to  make sure that you have independent samples.
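Once the conditions are judged to be reasonably met, the interval itself is computed from the two sets of sample data. The sketch below is one common version, using the unpooled standard error and the Welch approximation for degrees of freedom; your course materials may use a different degrees-of-freedom rule. The function names and the data are made up for illustration, and the t critical value is passed in by hand (for example, from a t table) rather than computed.

```python
from math import sqrt
from statistics import mean, variance

def two_sample_summary(sample_1, sample_2):
    """Return (difference in sample means, standard error, Welch df)."""
    n1, n2 = len(sample_1), len(sample_2)
    v1 = variance(sample_1) / n1  # s1^2 / n1
    v2 = variance(sample_2) / n2  # s2^2 / n2
    diff = mean(sample_1) - mean(sample_2)
    se = sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return diff, se, df

def two_sample_t_interval(sample_1, sample_2, t_critical):
    """Interval: (difference in means) +/- t_critical * (standard error)."""
    diff, se, _ = two_sample_summary(sample_1, sample_2)
    margin = t_critical * se
    return diff - margin, diff + margin

def needs_plot_check(sample, large=30):
    """True if the sample is smaller than the usual 'large' cutoff of 30,
    meaning a plot should be inspected for symmetry and outliers."""
    return len(sample) < large

# Made-up data, only to show the arithmetic.
group_1 = [10, 12, 14, 16, 18]
group_2 = [9, 10, 11, 12, 13]

diff, se, df = two_sample_summary(group_1, group_2)
# t_critical = 2.0 is a placeholder; look up the value for your df and
# confidence level.
lower, upper = two_sample_t_interval(group_1, group_2, t_critical=2.0)
print(diff, se, df, lower, upper)
print(needs_plot_check(group_1), needs_plot_check(group_2))
```

The code does not check independence or random selection. Those conditions depend on how the data were collected, so they have to be judged from the study description rather than from the numbers.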

Question 3

3) Data from a study of 64 students at the University of Utah[1] were used to construct  the following graphs. In this study, the 64 students were randomly assigned to one of  two groups. Students in both groups were asked to drive in a driving simulator and to  press a brake button as quickly as possible when they saw a red light. Response  times (in milliseconds) were measured. Students in one group used their cell phones while driving in the simulator and students in the other group did not use their cell  phones. You would like to use a two-sample t confidence interval to estimate the  difference in mean reaction times for the two conditions.

a) Is it reasonable to think that conditions 1 and 2 (defined previously) for the two-sample t confidence interval are met?

Hint: See the note about data from an experiment.

b) Based on the boxplots, is it reasonable to think that the population distribution of the response times when driving while using cell phones is approximately normal?

c) Is it reasonable to think that condition 3 (defined previously) for the two-sample t confidence interval is met?


[1] Strayer, D. L., & Johnston, W. A. (2001, November 1). Driven to distraction: Dual-task studies of simulated driving and conversing on a cellular telephone. Psychological Science, 12(6), 462–466. https://doi.org/10.1111/1467-9280.00386