13D Preview

Preparing for the next class

In the next in-class activity, you will need to be able to calculate the mean of a difference and identify the differences between independent and dependent samples.

Dependent Samples vs. Independent Samples

Previously, you learned how to create confidence intervals and conduct hypothesis tests with a single variable. You also learned how to compare means or proportions from two samples. Some statistical studies use samples from more than one population. In order to compare two populations, it is important to identify whether the samples are dependent (paired) or independent. Dependent-sample and independent-sample hypothesis tests are used to answer questions about the difference between two population means.

For dependent (paired) samples, the same variable is recorded for each sample, and  there is a logical way to pair the observations from one sample with the observations in  the other sample. In contrast, when samples are independently selected, the same  variable is measured for both samples, but there is no logical way to pair an observation  from one sample with a particular observation from the other sample.

For an example of paired samples, consider an investigation on the effectiveness of hypnosis in reducing pain. The variable could be the pain level of a patient, and it could be measured “before” hypnosis and then again “after” hypnosis for the same patient. This would result in two samples, one of “before” pain measurements and one of “after” pain measurements, and there would be a logical pairing of the “before” measurement with the “after” measurement for the same person. This form of pairing, often referred to as “pre/post,” is not the only situation where paired samples can be used. Other cases involve using “natural pairs,” such as twins, siblings, or couples. In either case, it is reasonable to expect that the measurement from one sample is related to the corresponding measurement in the second sample.

Questions 1–7: Use the previous information to determine if the following situations  would result in dependent or independent samples.

Question 1

1) A company that creates fishing accessories is researching two of their most popular  fishing rods. The company collects a random sample of the number of sales for each  fishing rod from 100 of their stores.

Question 2

2) The North Carolina Zoo is researching whether their animals are more active in the  morning or in the evening. An employee at the zoo visits each habitat in the zoo and  collects information for the study. The employee counts how many of each species  is visible in the morning and then visits a second time to count how many of each  species is visible during the evening.

Question 3

3) A company that creates blood pressure medicine is researching the effectiveness of  their new blood pressure medicine. The company conducts a study in which  volunteers are randomly assigned to two groups. One group is given the new  medication and the other group continues to take their current blood pressure  medicine.

Question 4

4) The same company that creates blood pressure medicine is still researching the  effectiveness of their new blood pressure medicine. The company conducts a  second study in which volunteers are all given the new medication. The blood pressure of each patient is measured before the study begins. The patients are all  given the new medication for six weeks. The blood pressure of each patient is  measured after the six-week period.

Question 5

5) A psychologist wants to know if children’s levels of anxiety are different if their  parents are divorced. The psychologist decides to study 100 children from divorced  parents and 100 children from non-divorced parents.

Question 6

6) The quality control manager at a manufacturing plant is investigating the production  rate of two machines that were built with the same materials and the same design  but were manufactured at two different plants.

Question 7

7) A statistics teacher wants to know if a curriculum is effective. The teacher conducts  a pre-test, implements the curriculum, and then conducts a post-test on the same  group of students. The scores on the pre-tests and post-tests are used to compare  the difference in understanding of statistics before and after students completed the  curriculum.

Question 8

8) Suppose you want to study the effectiveness of a diet. Suppose that eight people  were randomly selected to participate in your study. The weight (lb) of each of the  eight participants is recorded before and after the diet in the following table. You  know from past studies that body weight is approximately normally distributed.

Patient Before After
1 150 146
2 160 159
3 200 200
4 178 174
5 190 189
6 167 160
7 151 148
8 210 198
Mean [latex](\mu)[/latex]

a) What is the average weight before and after the diet? Fill in your answers in the table.

b) On average, how many pounds did the participants lose? In other words,  what is the estimated difference between the mean weight before and after the diet?

c) Are the two samples independent or dependent?

Question 9

9) Consider the previous example using this new table:

Patient Before After Difference [latex](d)[/latex]
1 150 146 146 − 150 = −4
2 160 159 159 − 160 = −1
3 200 200
4 178 174
5 190 189
6 167 160
7 151 148
8 210 198
Mean [latex](\mu)[/latex]

a) How much weight did each individual lose? Complete the table by finding the difference in each participant’s weight (after − before).

b) Consider ONLY the difference variable. What is the average weight loss for  the eight participants?

c) How does your answer to Question 9, Part B compare to your answer from  Question 8, Part B?

Comparing Means from Two Dependent (Paired) Samples

We will use the individual differences, [latex]d[/latex], between each pair as our sample. A dependent or paired t-test compares the mean of the differences, [latex]\mu_{d}[/latex], to a hypothesized value, which is often 0. Thus, a dependent t-test is the same as a one-sample t-test performed on the difference variable, [latex]d[/latex].
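The idea that a paired t-test is a one-sample t-test on the differences can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `one_sample_t` and `paired_t` are hypothetical, the pain scores are made-up values (not data from any study), and the differences are taken as second minus first.

```python
import math
import statistics


def one_sample_t(values, k=0.0):
    """Return (t, df) for testing H0: mean = k using the sample `values`."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)
    t = (mean - k) / (sd / math.sqrt(n))
    return t, n - 1


def paired_t(first, second, k=0.0):
    """Paired t statistic: a one-sample t computed on d = second - first."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    d = [b - a for a, b in zip(first, second)]
    return one_sample_t(d, k)


# Hypothetical pain scores (0-10 scale) for five patients; not real data.
before = [6, 7, 5, 8, 6]
after = [4, 6, 5, 5, 4]

t_stat, df = paired_t(before, after)
print(f"t = {t_stat:.3f}, df = {df}")
```

Note that `paired_t` does no statistical work of its own: it only builds the difference variable and passes it to the one-sample routine.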

When thinking about the difference variable, we need to use a different calculation for the standard deviation of the estimate. The standard deviation of the difference in the sample means, [latex]\bar{x}_{1}-\bar{x}_{2}[/latex], is NOT the same as the standard deviation of the difference variable, which is denoted [latex]s_{d}[/latex].
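To see that the two calculations really do give different numbers, here is a minimal sketch using made-up before/after values (hypothetical, not from any study). It assumes the independent-samples standard error takes the unpooled form, the square root of the sum of each sample variance divided by its sample size; your course materials may present that formula differently.

```python
import math
import statistics

# Hypothetical before/after measurements for five subjects; not real data.
before = [6, 7, 5, 8, 6]
after = [4, 6, 5, 5, 4]
n = len(before)

# Paired view: treat the differences d = after - before as a single sample.
d = [a - b for b, a in zip(before, after)]
s_d = statistics.stdev(d)          # sample standard deviation of d
se_paired = s_d / math.sqrt(n)     # standard error of the mean difference

# Independent view (unpooled form, an assumption here): combine the two
# sample variances as if the samples were unrelated.
se_independent = math.sqrt(
    statistics.variance(before) / n + statistics.variance(after) / n
)

print(f"paired standard error:      {se_paired:.4f}")
print(f"independent standard error: {se_independent:.4f}")
```

With these values the paired standard error is the smaller of the two, which is typical when the paired measurements tend to rise and fall together, but the size of the gap depends on the data.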

Since a dependent t-test is the same as a one-sample t-test on the mean of the difference variable, the assumptions for a paired t-test are the same as those discussed in In-Class Activity 13.B for a single-sample hypothesis test for means.

Conditions for a One-Sample t-Test

  1. The sample is a random sample from the population of interest or it is  reasonable to regard the sample as random. It is reasonable to regard the  sample as a random sample if it was selected in a way that should result in a  sample that is representative of the population.
  2. The distribution of the variable that was measured is approximately normal in the population, or the sample size is large. Usually, a sample of size 30 or more is considered to be “large.” If the sample size is less than 30, you should look at a plot of the data from the sample (a dotplot, a boxplot, or, if the sample size isn’t really small, a histogram) to make sure that the distribution looks approximately symmetric and that there are no outliers.

In summary, where [latex]k[/latex] is the value of the null hypothesis, we have:

Null Hypothesis for Independent Samples Null Hypothesis for Dependent Samples
[latex]H_{0}:\mu_{1}-\mu_{2}=k[/latex] [latex]H_{0}:\mu_{d}=k[/latex]
Alternative Hypothesis for Independent Samples Alternative Hypothesis for Dependent Samples
[latex]H_{A}:\mu_{1}-\mu_{2}>k[/latex] [latex]H_{A}:\mu_{d}>k[/latex]
[latex]H_{A}:\mu_{1}-\mu_{2}<k[/latex] [latex]H_{A}:\mu_{d}<k[/latex]
[latex]H_{A}:\mu_{1}-\mu_{2}\neq k[/latex] [latex]H_{A}:\mu_{d}\neq k[/latex]

The notations for the summary statistics used to compare paired populations/samples are shown in the following table. We will use [latex]d[/latex] to represent the difference variable.

Summary Statistics Notation
Population Mean of Difference [latex]\mu_{d}[/latex]
Sample Mean of Difference [latex]\bar{d}[/latex]
Population Standard Deviation of Difference [latex]\sigma_{d}[/latex]
Sample Standard Deviation of Difference [latex]s_{d}[/latex]
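
Using this notation, the test statistic for a paired t-test can be sketched as the usual one-sample t statistic applied to the differences. Here [latex]n[/latex] (the number of pairs) is a symbol assumed for illustration, since it does not appear in the table above, and [latex]k[/latex] is the hypothesized value from the null hypothesis:

[latex]t=\dfrac{\bar{d}-k}{s_{d}/\sqrt{n}}[/latex]

Under the conditions listed earlier, this statistic would be compared to a t-distribution with [latex]n-1[/latex] degrees of freedom.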

Question 10

10) It is a common belief that using higher-octane fuel will improve the gas mileage of a vehicle. In order to test this claim, a mechanic randomly selects 12 customers to  participate in a study. The mechanic puts 10 gallons of fuel in each participant’s car  and asks participants to circle a racetrack until they run out of gas. Each participant  is asked to perform this action two times, once with 87-octane fuel and another time  with 92-octane fuel. The differences in miles driven (miles driven with 87-octane fuel and miles driven with 92-octane fuel) are calculated and recorded. The  participants do not know which fuels they are using while they are driving around  the racetrack.

What are the appropriate null and alternative hypotheses for this scenario?