12D Preview

Preparing for the next class

In the next in-class activity, you will need to be able to determine if two samples are independent or paired and use information from independent samples to assess whether the assumptions/conditions for the two-sample t confidence interval are reasonably met.

When you are interested in estimating a difference in population means, you usually start with data from samples from each of the populations of interest. There are two different strategies for selecting the two samples. One strategy is to select a sample from one population and then independently select a sample from the second population. Using this strategy results in two samples where the individuals selected for the first sample do not influence the individuals selected for the second sample. This would be the case if you take a random sample from each population. Samples selected in this way are said to be independent samples.
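As a rough illustration of this first strategy, the Python sketch below draws a simple random sample from each of two populations in two separate steps. The ID lists, the sample size, and the function name draw_independent_samples are hypothetical and are not part of the activity.

```python
import random


def draw_independent_samples(population_1, population_2, n, seed=None):
    """Draw a simple random sample of size n from each population, separately."""
    rng = random.Random(seed)
    # The first selection uses only the first population.
    sample_1 = rng.sample(population_1, n)
    # The second selection uses only the second population and never
    # looks at who was chosen for sample_1.
    sample_2 = rng.sample(population_2, n)
    return sample_1, sample_2


# Hypothetical ID lists standing in for two populations of interest.
population_1 = [f"A{i:03d}" for i in range(200)]
population_2 = [f"B{i:03d}" for i in range(300)]

sample_1, sample_2 = draw_independent_samples(population_1, population_2, n=40, seed=1)
print(len(sample_1), len(sample_2))
```

Because each sample is drawn without reference to the other, there is no natural way to match a particular individual in one sample with a particular individual in the other.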

Question 1

1) Suppose you are interested in estimating the difference in mean typing speeds for high school students who have taken a keyboarding class and high school students who have not taken a keyboarding class.

Consider the following ways of selecting the samples of high school students.

a) You select a random sample of 40 students from the population of all students at a high school who have taken the keyboarding class offered by the school, and you also select a random sample of 40 students from the population of all students who have not taken the keyboarding class. Would this result in independent samples? Explain.

Hint: Look back at the explanation of what it means for samples to be independent.

b) You select a random sample of students who have signed up for the keyboarding class to represent the population of students who have not yet taken the keyboarding class. You measure their typing speeds before they start the class. Then, at the end of the class, you use this same sample of students to represent the population of students who have taken the keyboarding class and measure their typing speeds. Would this result in independent samples? Explain.

Hint: There are still two samples—one from the population of students who have not had a keyboarding class and one from the population of students who have had a keyboarding class—but are they independent samples?

The strategy for selecting the samples in Question 1, part (b) results in samples where each observation in one sample is paired in a logical way with a particular observation in the second sample. In that example, the observations would be paired by student: each student has a typing speed measured before the keyboarding class and a typing speed measured after it. If samples are chosen in a way that results in the observations in one sample being paired with the observations in the other sample, the samples are said to be paired samples. Paired samples are also sometimes called dependent samples.

One common process that results in paired samples is when data are collected both before and after some intervention (like the keyboarding class). But there are other data collection methods that can result in paired samples. One example would be if participants in a study to evaluate the effect of exercise (light vs. moderate exercise) were paired by weight prior to the study, and then one person from each pair was assigned to each exercise group. This would result in exercise groups that were similar with respect to weight, and the two samples would be paired because there is a logical way to match an observation from the light exercise group with a particular observation from the moderate exercise group.
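The pairing-by-weight design can be sketched in Python as follows. This is only one possible way to form the pairs: it assumes an even number of participants, pairs people who are adjacent after sorting by weight, and assigns the two members of each pair to the groups at random. The participant labels, the weights, and the function name are hypothetical.

```python
import random


def matched_pairs_by_weight(participants, seed=None):
    """Pair participants with similar weights and split each pair across two groups.

    participants: list of (label, weight) tuples with an even number of entries.
    Returns a list of dicts, one per pair, with keys "light" and "moderate".
    """
    if len(participants) % 2 != 0:
        raise ValueError("An even number of participants is needed to form pairs.")
    rng = random.Random(seed)
    # Sort by weight so that neighbors in the list have similar weights.
    ordered = sorted(participants, key=lambda person: person[1])
    pairs = []
    for i in range(0, len(ordered), 2):
        pair = [ordered[i], ordered[i + 1]]
        # Assumption: which member goes to which exercise group is decided at random.
        rng.shuffle(pair)
        pairs.append({"light": pair[0], "moderate": pair[1]})
    return pairs


# Hypothetical participants: (label, weight in pounds).
participants = [("P1", 150), ("P2", 182), ("P3", 149),
                ("P4", 205), ("P5", 180), ("P6", 210)]

pairs = matched_pairs_by_weight(participants, seed=0)
for pair in pairs:
    print(pair["light"], "<->", pair["moderate"])
```

Each row of the output links one observation in the light exercise group to one specific observation in the moderate exercise group, which is what makes the two samples paired.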

It is important to make a distinction between independent samples and paired samples because the way the data from the samples are analyzed is different for these two cases. In the next in-class activity, you will be working only with independent samples. How to analyze data from paired samples is the topic of In-Class Activity 13.D.

Question 2

2) Suppose you want to compare the mean reaction times (while using cell phones) of students at your college to the mean reaction times (while not using cell phones) of students at your college.

a) Give an example of a strategy that would result in two samples of size 40 that are independent samples.

Hint: Your strategy would involve 80 participants.

b) Give an example of a strategy that would result in two samples of size 40 that are paired samples.

Hint: Think of a way you could collect data from 40 total participants, or think of a way you could collect data on 80 participants who have been paired in some way.

Recall that when you are using a confidence interval to estimate a population parameter, there are a few assumptions/conditions that you should check before proceeding. When you are interested in estimating a difference in population means using data from independent samples, you will use a two-sample t confidence interval. The conditions that you need to check for the two-sample t confidence interval are:

1. The samples are independent.
2. Each sample is a random sample from the corresponding population of interest or it is reasonable to regard the sample as if it were a random sample. It is reasonable to regard the sample as a random sample if it was selected in a way that should result in the sample being representative of the population. If the data are from an experiment, you just need to check that there was random assignment to experimental groups. This substitutes for the random sample condition and also results in independent samples.
3. For each population, the distribution of the variable that was measured is approximately normal, or the sample size for the sample from that population is large. Usually, a sample of size 30 or more is considered to be “large.” If a sample size is less than 30, you should look at a plot of the data from that sample (a dotplot, a boxplot, or, if the sample size isn’t really small, a histogram) to make sure that the distribution looks approximately symmetric and that there are no outliers.

Notice the last two conditions are the same as those for the one-sample t confidence interval. You just have to remember to check them for each of the two samples and to make sure that you have independent samples.
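The sample-size part of the last condition can be sketched as a small Python helper that you would run once for each sample. It is not a substitute for looking at a plot. It reports whether the sample has at least 30 observations and, if not, flags values beyond 1.5 times the interquartile range from the quartiles. That outlier rule is the usual boxplot convention and is an assumption here, as are the function name and the data; quartile conventions also vary slightly between tools.

```python
import statistics


def normality_condition_report(sample, large_n=30):
    """Summarize the sample-size check for one sample.

    Returns a dict with the sample size, whether it counts as large,
    and (for small samples) any values flagged as possible outliers.
    """
    n = len(sample)
    if n >= large_n:
        # Large sample: the condition is considered met without a plot.
        return {"n": n, "large": True, "possible_outliers": []}
    # Small sample: flag values outside the 1.5 * IQR fences (boxplot convention).
    q1, _, q3 = statistics.quantiles(sample, n=4)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    flagged = [x for x in sample if x < low_fence or x > high_fence]
    return {"n": n, "large": False, "possible_outliers": flagged}


# Hypothetical small sample with one unusually large value.
small_sample = [10, 11, 12, 13, 14, 15, 16, 17, 18, 60]
report = normality_condition_report(small_sample)
print(report)
```

A small sample with flagged values, or one whose plot looks clearly skewed, is a signal that the condition may not be reasonably met for that sample.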

Question 3

3) Data from a study of 64 students at the University of Utah [1] were used to construct the following graphs. In this study, the 64 students were randomly assigned to one of two groups. Students in both groups were asked to drive in a driving simulator and to press a brake button as quickly as possible when they saw a red light. Response times (in milliseconds) were measured. Students in one group used their cell phones while driving in the simulator and students in the other group did not use their cell phones. You would like to use a two-sample t confidence interval to estimate the difference in mean reaction times for the two conditions.

a) Is it reasonable to think that conditions 1 and 2 for the two-sample t confidence interval (the first two conditions listed above) are met?

Hint: See the note about data from an experiment.

b) Based on the boxplots, is it reasonable to think that the population distribution of the response times when driving while using cell phones is approximately normal?

c) Is it reasonable to think that condition 3 for the two-sample t confidence interval (the third condition listed above) is met?

[1] Strayer, D. L., & Johnston, W. A. (2001, November 1). Driven to distraction: Dual-task studies of simulated driving and conversing on a cellular telephone. Psychological Science, 12(6), 462–466. https://doi.org/10.1111/1467-9280.00386
