Chapter 6: Two-way Analysis of Variance

In the previous chapter we used one-way ANOVA to analyze data from three or more populations using the null hypothesis that all means were the same (no treatment effect). For example, a biologist wants to compare mean growth for three different levels of fertilizer. A one-way ANOVA tests to see if at least one of the treatment means is significantly different from the others. If the null hypothesis is rejected, a multiple comparison method, such as Tukey’s, can be used to identify which means are different, and the confidence interval can be used to estimate the difference between the different means.

Suppose the biologist wants to ask this same question but with two different species of plants while still testing the three different levels of fertilizer. The biologist needs to investigate not only the average growth between the two species (main effect A) and the average growth for the three levels of fertilizer (main effect B), but also the interaction or relationship between the two factors of species and fertilizer. Two-way analysis of variance allows the biologist to answer the question about growth affected by species and levels of fertilizer, and to account for the variation due to both factors simultaneously.

Our examination of one-way ANOVA was done in the context of a completely randomized design where the treatments are assigned randomly to each subject (or experimental unit). We now consider analysis in which two factors can explain variability in the response variable. Remember that we can deal with factors by controlling them, by fixing them at specific levels, and randomly applying the treatments so the effect of uncontrolled variables on the response variable is minimized. With two factors, we need a factorial experiment.

9778.png

Table 1. Observed data for two species at three levels of fertilizer. 

This is an example of a factorial experiment in which there are a total of 2 x 3 = 6 possible combinations of the levels for the two different factors (species and level of fertilizer). These six combinations are referred to as treatments and the experiment is called a 2 x 3 factorial experiment. We use this type of experiment to investigate the effect of multiple factors on a response and the interaction between the factors. Each of the n observations of the response variable for the different levels of the factors exists within a cell. In this example, there are six cells and each cell corresponds to a specific treatment.

When you compare treatment means for a factorial experiment (or for any other experiment), multiple observations are required for each treatment. These are called replicates. For example, if you have four observations for each of the six treatments, you have four replications of the experiment. Replication demonstrates the results to be reproducible and provides the means to estimate experimental error variance. Replication also provides the capacity to increase the precision for estimates of treatment means. Increasing the number of replicates m decreases the variance of a treatment mean, $s_{\bar{y}}^2 = s^2/m$, thereby increasing the precision of $\bar{y}$.

Notation

k = number of levels of factor A

l = number of levels of factor B

kl = number of treatments (each one a combination of a factor A level and a factor B level)

m = number of observations on each treatment

Main Effects and Interaction Effect

Main effects deal with each factor separately. In the previous example we have two factors, A and B. The main effect of Factor A (species) is the difference in mean growth between Species 1 and Species 2, averaged across the three levels of fertilizer. The main effect of Factor B (fertilizer) is the set of differences in mean growth among levels 1, 2, and 3, averaged across the two species. The interaction describes what happens when the levels of both factors change simultaneously. If a change in the level of Factor A produces different changes in the response variable at different levels of Factor B, we say that there is an interaction effect between the factors. Consider the following example to help clarify this idea of interaction.

Example 1

Factor A has two levels and Factor B has two levels. In the left box, when Factor A is at level 1, changing Factor B from level 1 to level 2 changes the response by 3 units. When Factor A is at level 2, the same change in Factor B again changes the response by 3 units. Similarly, when Factor B is at level 1, changing Factor A from level 1 to level 2 changes the response by 2 units, and when Factor B is at level 2 the change is again 2 units. There is no interaction. The change in the true average response when the level of either factor changes from 1 to 2 is the same for each level of the other factor. In this case, changes in levels of the two factors affect the true average response separately, or in an additive manner.

New%20Fig.%201%20pg.132.png

Figure 1. Illustration of interaction effect.

The right box illustrates the idea of interaction. When Factor A is at level 1, changing Factor B from level 1 to level 2 changes the response by 3 units, but when Factor A is at level 2 the same change in Factor B changes the response by 6 units. When Factor B is at level 1, changing Factor A changes the response by 2 units, but when Factor B is at level 2 it changes the response by 5 units. The change in the true average response when the levels of both factors change simultaneously from level 1 to level 2 is 8 units, which is much larger than the separate changes suggest (3 + 2 = 5 units). In this case, there is an interaction between the two factors, so the effect of simultaneous changes cannot be determined from the individual effects of the separate changes. The change in the true average response when the level of one factor changes depends on the level of the other factor. You cannot determine the separate effect of Factor A or Factor B on the response because of the interaction.
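The two boxes in Example 1 can be mimicked with a short calculation. The sketch below is only an illustration: the baseline response of 10 is an assumed value (the text gives only the changes, not the cell values), and the function names are hypothetical.

```python
# Hypothetical 2 x 2 tables of true average responses.
# Rows are levels of Factor A, columns are levels of Factor B.
# The baseline value of 10 is an assumption; only the differences come from the text.
additive = [[10, 13],
            [12, 15]]
interacting = [[10, 13],
               [12, 18]]


def effects_of_b(cells):
    """Change in response when Factor B goes from level 1 to 2, at each level of A."""
    return [row[1] - row[0] for row in cells]


def effects_of_a(cells):
    """Change in response when Factor A goes from level 1 to 2, at each level of B."""
    return [cells[1][j] - cells[0][j] for j in range(2)]


def interaction_contrast(cells):
    """Difference between the two effects of B; zero means the effects are additive."""
    b_at_a1, b_at_a2 = effects_of_b(cells)
    return b_at_a2 - b_at_a1


for name, cells in [("additive", additive), ("interacting", interacting)]:
    print(name, effects_of_b(cells), effects_of_a(cells), interaction_contrast(cells))
```

A contrast of zero corresponds to the left box (no interaction). A nonzero contrast corresponds to the right box, where the effect of one factor depends on the level of the other.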

Assumptions

Basic Assumption: The observations on any particular treatment are independently selected from a normal distribution with variance σ² (the same variance for each treatment), and samples from different treatments are independent of one another.

We can use normal probability plots to check the assumption of normality for each treatment. The requirement for equal variances is more difficult to confirm, but we can generally check by making sure that the largest sample standard deviation is no more than twice the smallest sample standard deviation.

Although not a requirement for two-way ANOVA, having an equal number of observations in each treatment, referred to as a balanced design, increases the power of the test. However, unequal replication (an unbalanced design) is very common. Some statistical software packages (such as Excel) will only work with balanced designs. Minitab will provide the correct analysis for both balanced and unbalanced designs in the General Linear Model component under ANOVA statistical analysis. However, for the sake of simplicity, we will focus on balanced designs in this chapter.
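The rule of thumb for equal variances is easy to automate. The following is a minimal sketch using made-up observations; the sample values and the function name are hypothetical and serve only to show the comparison of the largest and smallest sample standard deviations.

```python
from statistics import stdev


def equal_variance_rule_of_thumb(samples):
    """Return (ratio, ok) where ratio = largest SD / smallest SD and ok means ratio <= 2."""
    sds = [stdev(values) for values in samples.values()]
    ratio = max(sds) / min(sds)
    return ratio, ratio <= 2.0


# Hypothetical observations for three treatments (not data from this chapter).
samples = {
    "treatment 1": [12.1, 13.4, 11.8, 12.9],
    "treatment 2": [15.0, 14.2, 16.1, 15.5],
    "treatment 3": [10.3, 9.1, 11.0, 10.6],
}

ratio, ok = equal_variance_rule_of_thumb(samples)
print(f"largest SD / smallest SD = {ratio:.2f}; rule satisfied: {ok}")
```

This is only a quick screen, not a formal test of equal variances.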

Sums of Squares and the ANOVA Table

In the previous chapter, the idea of sums of squares was introduced to partition the variation due to treatment and random variation. The relationship is as follows:

SSTo = SSTr + SSE

We now partition the variation even more to reflect the main effects (Factor A and Factor B) and the interaction term:

SSTo = SSA + SSB + SSAB + SSE

where

  1. SSTo is the total sum of squares, with the associated degrees of freedom klm – 1
  2. SSA is the factor A main effect sum of squares, with associated degrees of freedom k – 1
  3. SSB is the factor B main effect sum of squares, with associated degrees of freedom l – 1
  4. SSAB is the interaction sum of squares, with associated degrees of freedom (k – 1)(l – 1)
  5. SSE is the error sum of squares, with associated degrees of freedom kl(m – 1)
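The degrees of freedom listed above follow directly from k, l, and m, and the four component values always add up to the total. A minimal sketch (the function name is hypothetical; the inputs k = 3, l = 4, m = 3 match Example 2 later in this chapter):

```python
def two_way_df(k, l, m):
    """Degrees of freedom for a balanced two-way ANOVA with replication."""
    return {
        "A": k - 1,
        "B": l - 1,
        "AB": (k - 1) * (l - 1),
        "error": k * l * (m - 1),
        "total": k * l * m - 1,
    }


df = two_way_df(k=3, l=4, m=3)
print(df)

# The component degrees of freedom partition the total, just like the sums of squares.
assert df["A"] + df["B"] + df["AB"] + df["error"] == df["total"]
```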

As we saw in the previous chapter, the magnitude of the SSE is related entirely to the amount of underlying variability in the distributions being sampled. It has nothing to do with values of the various true average responses. SSAB reflects in part underlying variability, but its value is also affected by whether or not there is an interaction between the factors; the greater the interaction, the greater the value of SSAB.
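For a balanced design, the sums of squares can be written out explicitly. The subscript notation below is an assumption introduced for illustration (it is not defined elsewhere in this chapter): $y_{ijr}$ is the r-th observation on the treatment at level i of factor A and level j of factor B, and a dot marks an index that has been averaged over.

```latex
\begin{aligned}
SSA  &= lm \sum_{i=1}^{k} \left(\bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot}\right)^2 \\
SSB  &= km \sum_{j=1}^{l} \left(\bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot}\right)^2 \\
SSAB &= m \sum_{i=1}^{k} \sum_{j=1}^{l} \left(\bar{y}_{ij\cdot} - \bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot j\cdot} + \bar{y}_{\cdot\cdot\cdot}\right)^2 \\
SSE  &= \sum_{i=1}^{k} \sum_{j=1}^{l} \sum_{r=1}^{m} \left(y_{ijr} - \bar{y}_{ij\cdot}\right)^2 \\
SSTo &= \sum_{i=1}^{k} \sum_{j=1}^{l} \sum_{r=1}^{m} \left(y_{ijr} - \bar{y}_{\cdot\cdot\cdot}\right)^2
\end{aligned}
```

SSE involves only deviations of observations from their own treatment mean, which is why it reflects underlying variability alone. SSAB measures how far the treatment means depart from what the two main effects would predict additively.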

The following ANOVA table illustrates the relationship between the sums of squares for each component and the resulting F-statistic for testing the three null and alternative hypotheses for a two-way ANOVA.

  1. H0: There is no interaction between factors
    H1: There is a significant interaction between factors
  2. H0: There is no effect of Factor A on the response variable
    H1: There is an effect of Factor A on the response variable
  3. H0: There is no effect of Factor B on the response variable
    H1: There is an effect of Factor B on the response variable

If there is a significant interaction, then ignore the following two sets of hypotheses for the main effects. A significant interaction tells you that the change in the true average response for a level of Factor A depends on the level of Factor B. The effect of simultaneous changes cannot be determined by examining the main effects separately. If there is NOT a significant interaction, then proceed to test the main effects. The Factor A sum of squares will reflect random variation and any differences between the true average responses for different levels of Factor A. Similarly, the Factor B sum of squares will reflect random variation and any differences between the true average responses for the different levels of Factor B.

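| Source of variation | df | SS | MS | F |
|---|---|---|---|---|
| Factor A | k – 1 | SSA | MSA = SSA / (k – 1) | MSA / MSE |
| Factor B | l – 1 | SSB | MSB = SSB / (l – 1) | MSB / MSE |
| Interaction AB | (k – 1)(l – 1) | SSAB | MSAB = SSAB / [(k – 1)(l – 1)] | MSAB / MSE |
| Error | kl(m – 1) | SSE | MSE = SSE / [kl(m – 1)] | |
| Total | klm – 1 | SSTo | | |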

Table 2. Two-way ANOVA table.

Each of the five sources of variation, when divided by the appropriate degrees of freedom (df), provides an estimate of the variation in the experiment. The estimates are called mean squares and are displayed along with their respective sums of squares and df in the analysis of variance table. In one-way ANOVA, the mean square error (MSE) is the best estimate of σ² (the population variance) and is the denominator in the F-statistic. In a two-way ANOVA, it is still the best estimate of σ². Notice that in each case, the MSE is the denominator in the test statistic and the numerator is the mean square for each main factor and interaction term. The F-statistic is found in the final column of this table and is used to test the three pairs of hypotheses. Typically, the p-values associated with each F-statistic are also presented in an ANOVA table. You will use the Decision Rule to determine the outcome for each of the three pairs of hypotheses.

If the p-value is smaller than α (level of significance), you will reject the null hypothesis.

When we conduct a two-way ANOVA, we always first test the hypothesis regarding the interaction effect. If the null hypothesis of no interaction is rejected, we do NOT interpret the results of the hypotheses involving the main effects. If the interaction term is NOT significant, then we examine the two main effects separately. Let’s look at an example.

Example 2

An experiment was carried out to assess the effects of soy plant variety (factor A, with k = 3 levels) and planting density (factor B, with l = 4 levels – 5, 10, 15, and 20 thousand plants per hectare) on yield. Each of the 12 treatments (k * l) was randomly applied to m = 3 plots (klm = 36 total observations). Use a two-way ANOVA to assess the effects at a 5% level of significance.

9695.png

Table 3. Observed data for three varieties of soy plants at four densities.

It is always important to look at the sample average yields for each treatment, each level of factor A, and each level of factor B.

| Variety | Density 5 | Density 10 | Density 15 | Density 20 | Sample average yield for each level of factor A |
|---|---|---|---|---|---|
| 1 | 9.17 | 12.40 | 12.90 | 10.80 | 11.32 |
| 2 | 8.90 | 12.67 | 14.50 | 12.77 | 12.21 |
| 3 | 16.30 | 18.10 | 19.87 | 18.20 | 18.12 |
| Sample average yield for each level of factor B | 11.46 | 14.39 | 15.76 | 13.92 | 13.88 |

Table 4. Summary table.
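Because the design is balanced, each marginal average in Table 4 is simply the average of the cell means in its row or column. The sketch below recomputes them from the rounded cell means, so the results can differ from the software output in the last digit. The variable names are hypothetical.

```python
# Cell means from Table 4: rows are varieties 1-3, columns are densities 5, 10, 15, 20.
cell_means = [
    [9.17, 12.40, 12.90, 10.80],
    [8.90, 12.67, 14.50, 12.77],
    [16.30, 18.10, 19.87, 18.20],
]


def mean(values):
    return sum(values) / len(values)


# Average across densities for each variety (factor A).
variety_means = [mean(row) for row in cell_means]
# Average across varieties for each density (factor B); zip(*...) transposes the table.
density_means = [mean(col) for col in zip(*cell_means)]
# Grand mean over all twelve treatments.
grand_mean = mean([y for row in cell_means for y in row])

print([round(v, 2) for v in variety_means])
print([round(d, 2) for d in density_means])
print(round(grand_mean, 2))
```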

For example, 11.32 is the average yield for variety #1 over all levels of planting densities. The value 11.46 is the average yield for plots planted with 5,000 plants across all varieties. The grand mean is 13.88. The ANOVA table is presented next.

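| Source | DF | SS | MS | F | P |
|---|---|---|---|---|---|
| variety | 2 | 327.774 | 163.887 | 100.48 | <0.001 |
| density | 3 | 86.908 | 28.969 | 17.76 | <0.001 |
| variety*density | 6 | 8.068 | 1.345 | 0.82 | 0.562 |
| error | 24 | 39.147 | 1.631 | | |
| total | 35 | | | | |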

Table 5. Two-way ANOVA table.
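The mean squares and F-statistics in Table 5 can be reproduced from the sums of squares and degrees of freedom alone. This is a minimal sketch with hypothetical function names. Because the inputs are rounded to three decimals, the results may differ slightly from the table in the last digit.

```python
# Sums of squares and degrees of freedom copied from Table 5.
table5 = {
    "variety": (327.774, 2),
    "density": (86.908, 3),
    "variety*density": (8.068, 6),
    "error": (39.147, 24),
}


def mean_squares(table):
    """Mean square = sum of squares / degrees of freedom, for every source."""
    return {source: ss / df for source, (ss, df) in table.items()}


def f_statistics(table):
    """F = mean square of the source / mean square error."""
    ms = mean_squares(table)
    mse = ms["error"]
    return {source: value / mse for source, value in ms.items() if source != "error"}


for source, f in f_statistics(table5).items():
    print(f"{source}: F = {f:.2f}")
```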

You begin with the following null and alternative hypotheses:

H0: There is no interaction between factors

H1: There is a significant interaction between factors

The F-statistic: $F = MSAB / MSE = 1.345 / 1.631 = 0.82$

The p-value for the test for a significant interaction between factors is 0.562. This p-value is greater than 5% (α), therefore we fail to reject the null hypothesis. There is no evidence of a significant interaction between variety and density. So it is appropriate to carry out further tests concerning the presence of the main effects.

H0: There is no effect of Factor A (variety) on the response variable

H1: There is an effect of Factor A on the response variable

The F-statistic: $F = MSA / MSE = 163.887 / 1.631 = 100.48$

The p-value (<0.001) is less than 0.05 so we will reject the null hypothesis. There is a significant difference in yield between the three varieties.

H0: There is no effect of Factor B (density) on the response variable

H1: There is an effect of Factor B on the response variable

The F-statistic: $F = MSB / MSE = 28.969 / 1.631 = 17.76$

The p-value (<0.001) is less than 0.05 so we will reject the null hypothesis. There is a significant difference in yield between the four planting densities.
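The order of the three tests can be summarized as a small decision procedure: test the interaction first, and look at the main effects only when the interaction is not significant. The sketch below uses a hypothetical function name. Since the main-effect p-values are reported only as "<0.001", the value 0.001 is used as an upper bound, which is an assumption rather than the exact p-value.

```python
def interpret_two_way(p_interaction, p_a, p_b, alpha=0.05):
    """Apply the decision rule: test the interaction first, then the main effects.

    Returns a dict of decisions (True = reject H0). The main effects are None when
    the interaction is significant, because they are not interpreted in that case.
    """
    if p_interaction < alpha:
        return {"interaction": True, "A": None, "B": None}
    return {"interaction": False, "A": p_a < alpha, "B": p_b < alpha}


# Example 2: interaction p = 0.562; main effects reported as "<0.001",
# so 0.001 stands in as an upper bound (an assumption, not the exact value).
result = interpret_two_way(p_interaction=0.562, p_a=0.001, p_b=0.001)
print(result)
```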

Multiple Comparisons

The next step is to examine the multiple comparisons for each main effect to determine the differences. We will proceed as we did with one-way ANOVA multiple comparisons by examining the Tukey’s Grouping for each main effect. For factor A, variety, the sample means and grouping letters are presented to identify those varieties that are significantly different from other varieties. Varieties 1 and 2 are not significantly different from each other, both producing similar yields. Variety 3 produced significantly greater yields than both varieties 1 and 2.

Grouping Information Using Tukey Method and 95.0% Confidence

| variety | N | Mean | Grouping |
|---|---|---|---|
| 3 | 12 | 18.117 | A |
| 2 | 12 | 12.208 | B |
| 1 | 12 | 11.317 | B |

Means that do not share a letter are significantly different.

Some of the densities are also significantly different. We will follow the same procedure to determine the differences.

Grouping Information Using Tukey Method and 95.0% Confidence

| density | N | Mean | Grouping |
|---|---|---|---|
| 15 | 9 | 15.756 | A |
| 10 | 9 | 14.389 | A B |
| 20 | 9 | 13.922 | B |
| 5 | 9 | 11.456 | C |

Means that do not share a letter are significantly different.

The Grouping Information shows us that a planting density of 15,000 plants per hectare results in the greatest yield. However, there is no significant difference in yield between 10,000 and 15,000 plants per hectare or between 10,000 and 20,000 plants per hectare. The plots planted at 5,000 plants per hectare result in the lowest yields, and these yields are significantly lower than those for all other densities tested.
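Reading a Tukey grouping table amounts to checking whether two levels share a letter. The following sketch (with a hypothetical function name) lists the pairs of densities that share no letter and are therefore significantly different.

```python
from itertools import combinations

# Grouping letters copied from the Tukey output for density (thousand plants).
density_groups = {15: "A", 10: "AB", 20: "B", 5: "C"}


def significantly_different_pairs(groups):
    """Return the pairs of levels whose sets of grouping letters do not overlap."""
    return [
        (a, b)
        for a, b in combinations(groups, 2)
        if not set(groups[a]) & set(groups[b])
    ]


for a, b in significantly_different_pairs(density_groups):
    print(f"densities {a} and {b} differ significantly")
```

Densities 10 and 15 share the letter A, and densities 10 and 20 share the letter B, so neither of those pairs appears in the output.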

The main effects plots also illustrate the differences in yield across the three varieties and four densities.

9662.png

Figure 2. Main effects plots.

But what happens if there is a significant interaction between the main effects? This next example will demonstrate how a significant interaction alters the interpretation of a two-way ANOVA.

Example 3

A researcher was interested in the effects of four levels of fertilization (control, 100 lb., 150 lb., and 200 lb.) and four levels of irrigation (A, B, C, and D) on biomass yield. The sixteen possible treatment combinations were randomly assigned to 80 plots (5 plots for each treatment). The total biomass yields for each treatment are listed below.

| Irrigation | Fertilizer: Control | Fertilizer: 100 lb. | Fertilizer: 150 lb. | Fertilizer: 200 lb. |
|---|---|---|---|---|
| A | 2700, 2801, 2720, 2390, 2890 | 3250, 3151, 3170, 3300, 3290 | 3300, 3235, 3025, 3165, 3120 | 3500, 3455, 3100, 3600, 3250 |
| B | 3101, 3035, 3205, 3007, 3100 | 2700, 2935, 2250, 2495, 2850 | 3050, 3110, 3033, 3195, 4250 | 3100, 3235, 3005, 3095, 3050 |
| C | 101, 97, 106, 142, 99 | 400, 302, 296, 315, 390 | 630, 624, 595, 675, 595 | 400, 325, 200, 375, 390 |
| D | 121, 174, 88, 100, 76 | 100, 125, 91, 222, 219 | 60, 28, 112, 89, 67 | 201, 223, 195, 120, 180 |

Table 6. Observed data for four irrigation levels and four fertilizer levels.

Factor A (irrigation level) has k = 4 levels and factor B (fertilizer) has l = 4 levels. There are m = 5 replicates and 80 total observations. This is a balanced design as the number of replicates is equal. The ANOVA table is presented next.

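| Source | DF | SS | MS | F | P |
|---|---|---|---|---|---|
| fertilizer | 3 | 1128272 | 376091 | 12.76 | <0.001 |
| irrigation | 3 | 161776127 | 53925376 | 1830.16 | <0.001 |
| fert*irrigation | 9 | 2088667 | 232074 | 7.88 | <0.001 |
| error | 64 | 1885746 | 29465 | | |
| total | 79 | 166878812 | | | |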

Table 7. Two-way ANOVA table.
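The entries in Table 7 can be recomputed from the raw yields in Table 6. The sketch below is one way to carry out the balanced two-way calculations by hand in plain Python. It assumes equal replication in every cell, and the function and variable names are hypothetical rather than taken from any statistical package.

```python
# Yields from Table 6, keyed by (irrigation, fertilizer); five plots per treatment.
data = {
    ("A", "control"): [2700, 2801, 2720, 2390, 2890],
    ("A", "100"): [3250, 3151, 3170, 3300, 3290],
    ("A", "150"): [3300, 3235, 3025, 3165, 3120],
    ("A", "200"): [3500, 3455, 3100, 3600, 3250],
    ("B", "control"): [3101, 3035, 3205, 3007, 3100],
    ("B", "100"): [2700, 2935, 2250, 2495, 2850],
    ("B", "150"): [3050, 3110, 3033, 3195, 4250],
    ("B", "200"): [3100, 3235, 3005, 3095, 3050],
    ("C", "control"): [101, 97, 106, 142, 99],
    ("C", "100"): [400, 302, 296, 315, 390],
    ("C", "150"): [630, 624, 595, 675, 595],
    ("C", "200"): [400, 325, 200, 375, 390],
    ("D", "control"): [121, 174, 88, 100, 76],
    ("D", "100"): [100, 125, 91, 222, 219],
    ("D", "150"): [60, 28, 112, 89, 67],
    ("D", "200"): [201, 223, 195, 120, 180],
}


def mean(values):
    return sum(values) / len(values)


def two_way_anova(data):
    """Balanced two-way ANOVA with replication; returns (ss, df, ms, f) dicts."""
    a_levels = sorted({a for a, _ in data})
    b_levels = sorted({b for _, b in data})
    k, l = len(a_levels), len(b_levels)
    m = len(next(iter(data.values())))  # replicates per cell (balanced design assumed)

    grand = mean([y for ys in data.values() for y in ys])
    cell = {key: mean(ys) for key, ys in data.items()}
    a_mean = {a: mean([cell[(a, b)] for b in b_levels]) for a in a_levels}
    b_mean = {b: mean([cell[(a, b)] for a in a_levels]) for b in b_levels}

    ss = {
        "A": l * m * sum((a_mean[a] - grand) ** 2 for a in a_levels),
        "B": k * m * sum((b_mean[b] - grand) ** 2 for b in b_levels),
        "AB": m * sum((cell[(a, b)] - a_mean[a] - b_mean[b] + grand) ** 2
                      for a in a_levels for b in b_levels),
        "error": sum((y - cell[key]) ** 2 for key, ys in data.items() for y in ys),
    }
    df = {"A": k - 1, "B": l - 1, "AB": (k - 1) * (l - 1), "error": k * l * (m - 1)}
    ms = {source: ss[source] / df[source] for source in ss}
    f = {source: ms[source] / ms["error"] for source in ("A", "B", "AB")}
    return ss, df, ms, f


ss, df, ms, f = two_way_anova(data)
for source in ("A", "B", "AB"):
    print(f"{source}: SS = {ss[source]:.0f}, df = {df[source]}, F = {f[source]:.2f}")
print(f"error: SS = {ss['error']:.0f}, df = {df['error']}, MSE = {ms['error']:.0f}")
```

Here factor A is irrigation and factor B is fertilizer, matching the labels used in this example. The p-values are not computed because they require the F distribution, which is not available in the Python standard library.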

We again begin with testing the interaction term. Remember, if the interaction term is significant, we ignore the main effects.

H0: There is no interaction between factors

H1: There is a significant interaction between factors

The F-statistic: $F = MSAB / MSE = 232074 / 29465 = 7.88$

The p-value for the test for a significant interaction between factors is <0.001. This p-value is less than 5%, therefore we reject the null hypothesis. There is evidence of a significant interaction between fertilizer and irrigation. Since the interaction term is significant, we do not investigate the presence of the main effects. We must now examine multiple comparisons for all 16 treatments (each combination of fertilizer and irrigation level) to determine the differences in yield, aided by the factor plot.

Grouping Information Using Tukey Method and 95.0% Confidence

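| fert | irrigation | N | Mean | Grouping |
|---|---|---|---|---|
| 200 | A | 5 | 3381.00 | A |
| 150 | B | 5 | 3327.60 | A |
| 100 | A | 5 | 3232.20 | A |
| 150 | A | 5 | 3169.00 | A |
| 200 | B | 5 | 3097.00 | A |
| C | B | 5 | 3089.60 | A |
| C | A | 5 | 2700.20 | B |
| 100 | B | 5 | 2646.00 | B |
| 150 | C | 5 | 623.80 | C |
| 100 | C | 5 | 340.60 | C D |
| 200 | C | 5 | 338.00 | C D |
| 200 | D | 5 | 183.80 | D |
| 100 | D | 5 | 151.40 | D |
| C | D | 5 | 111.80 | D |
| C | C | 5 | 109.00 | D |
| 150 | D | 5 | 71.20 | D |

In the fert column, C denotes the control.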

Means that do not share a letter are significantly different.

The factor plot allows you to visualize the differences between the 16 treatments. Factor plots can present the information two ways, each with a different factor on the x-axis. In the first plot, fertilizer level is on the x-axis. There is a clear distinction in average yields for the different treatments. Irrigation levels A and B appear to be producing greater yields across all levels of fertilizers compared to irrigation levels C and D. In the second plot, irrigation level is on the x-axis. All levels of fertilizer seem to result in greater yields for irrigation levels A and B compared to C and D.

9631.png

Figure 3. Interaction plots.

The next step is to use the multiple comparison output to determine where there are SIGNIFICANT differences. Let’s focus on the first factor plot to do this.

9620.png

Figure 4. Interaction plot.

The Grouping Information tells us that while irrigation levels A and B look similar across all levels of fertilizer, only treatments A-100, A-150, A-200, B-control, B-150, and B-200 are statistically similar (upper circle). Treatment B-100 and A-control also result in similar yields (middle circle) and both have significantly lower yields than the first group.

Irrigation levels C and D result in the lowest yields across the fertilizer levels. We again refer to the Grouping Information to identify the differences. There is no significant difference in yield for irrigation level D over any level of fertilizer. Yields for D are also similar to yields for irrigation level C at the 100, 200, and control levels of fertilizer (lowest circle). Irrigation level C at the 150 lb. fertilizer level results in significantly higher yields than irrigation level D at any fertilizer level. However, this yield is still significantly smaller than the yields in the first group using irrigation levels A and B.

Interpreting Factor Plots

When the interaction term is significant the analysis focuses solely on the treatments, not the main effects. The factor plot and grouping information allow the researcher to identify similarities and differences, along with any trends or patterns. The following series of factor plots illustrate some true average responses in terms of interactions and main effects.

This first plot clearly shows a significant interaction between the factors. The change in response when the level of factor B changes depends on the level of factor A.

9609.png

Figure 5. Interaction plot.

The second plot shows no significant interaction. The change in response for the level of factor A is the same for each level of factor B.

9598.png

Figure 6. Interaction plot.

The third plot shows no significant interaction and shows that the average response does not depend on the level of factor A.

9588.png

Figure 7. Interaction plot.

This fourth plot again shows no significant interaction and shows that the average response does not depend on the level of factor B.

9579.png

Figure 8. Interaction plot.

This final plot illustrates no interaction and neither factor has any effect on the response.

9568.png

Figure 9. Interaction plot.

Summary

Two-way analysis of variance allows you to examine the effect of two factors simultaneously on the average response. The interaction of these two factors is always the starting point for two-way ANOVA. If the interaction term is significant, then you will ignore the main effects and focus solely on the unique treatments (combinations of the different levels of the two factors). If the interaction term is not significant, then it is appropriate to investigate the main effect of each factor on the response variable separately.

Software Solutions

Minitab


General Linear Model: yield vs. fert, irrigation

| Factor | Type | Levels | Values |
|---|---|---|---|
| fert | fixed | 4 | 100, 150, 200, C |
| irrigation | fixed | 4 | A, B, C, D |

Analysis of Variance for Yield, using Adjusted SS for Tests

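| Source | DF | Seq SS | Adj SS | Adj MS | F | P |
|---|---|---|---|---|---|---|
| fert | 3 | 1128272 | 1128272 | 376091 | 12.76 | 0.000 |
| irrigation | 3 | 161776127 | 161776127 | 53925376 | 1830.16 | 0.000 |
| fert*irrigation | 9 | 2088667 | 2088667 | 232074 | 7.88 | 0.000 |
| Error | 64 | 1885746 | 1885746 | 29465 | | |
| Total | 79 | 166878812 | | | | |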

S = 171.653 R-Sq = 98.87% R-Sq(adj) = 98.61%
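The summary line in the Minitab output can be reproduced from the error and total rows of the analysis of variance table. The sketch below uses the usual definitions (S is the square root of the mean square error, and R-Sq is one minus the ratio of error to total sum of squares). The variable names are hypothetical.

```python
from math import sqrt

# Values copied from the Error and Total rows of the Minitab ANOVA table.
ss_error, df_error = 1885746, 64
ss_total, df_total = 166878812, 79

# S is the square root of the mean square error.
s = sqrt(ss_error / df_error)
# R-Sq is the share of total variation not left in the error term.
r_sq = 1 - ss_error / ss_total
# The adjusted version compares mean squares instead of sums of squares.
r_sq_adj = 1 - (ss_error / df_error) / (ss_total / df_total)

print(f"S = {s:.3f}  R-Sq = {100 * r_sq:.2f}%  R-Sq(adj) = {100 * r_sq_adj:.2f}%")
```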

Unusual Observations for yield

| Obs | yield | Fit | SE Fit | Residual | St Resid | |
|---|---|---|---|---|---|---|
| 4 | 2390.00 | 2700.20 | 76.77 | -310.20 | -2.02 | R |
| 28 | 2250.00 | 2646.00 | 76.77 | -396.00 | -2.58 | R |
| 35 | 4250.00 | 3327.60 | 76.77 | 922.40 | 6.01 | R |

R denotes an observation with a large standardized residual.
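The SE Fit and St Resid columns can be checked by hand. In this model the fitted value for each observation is its treatment mean, so for a balanced design with m replicates the standard error of a fit is S divided by the square root of m. The sketch below also assumes the standard result that each observation then has leverage 1/m, which is not stated in this chapter. The names are hypothetical.

```python
from math import sqrt

s = 171.653  # S reported in the Minitab output
m = 5        # replicates per treatment

# Standard error of a fitted treatment mean.
se_fit = s / sqrt(m)

# Assumption: in a balanced cell-means model each observation has leverage 1/m.
leverage = 1 / m


def standardized_residual(residual):
    """Residual divided by its estimated standard deviation, s * sqrt(1 - leverage)."""
    return residual / (s * sqrt(1 - leverage))


# Residuals copied from the unusual observations table (observation number: residual).
unusual = {4: -310.20, 28: -396.00, 35: 922.40}

print(f"SE Fit = {se_fit:.2f}")
for obs, residual in unusual.items():
    print(f"obs {obs}: standardized residual = {standardized_residual(residual):.2f}")
```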

Grouping Information Using Tukey Method and 95.0% Confidence

| irrigation | N | Mean | Grouping |
|---|---|---|---|
| A | 20 | 3120.60 | A |
| B | 20 | 3040.05 | A |
| C | 20 | 352.85 | B |
| D | 20 | 129.55 | C |

Means that do not share a letter are significantly different.

Grouping Information Using Tukey Method and 95.0% Confidence

| fert | N | Mean | Grouping |
|---|---|---|---|
| 150 | 20 | 1797.90 | A |
| 200 | 20 | 1749.95 | A |
| 100 | 20 | 1592.55 | B |
| C | 20 | 1502.65 | B |

Means that do not share a letter are significantly different.

Grouping Information Using Tukey Method and 95.0% Confidence

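| fert | irrigation | N | Mean | Grouping |
|---|---|---|---|---|
| 200 | A | 5 | 3381.00 | A |
| 150 | B | 5 | 3327.60 | A |
| 100 | A | 5 | 3232.20 | A |
| 150 | A | 5 | 3169.00 | A |
| 200 | B | 5 | 3097.00 | A |
| C | B | 5 | 3089.60 | A |
| C | A | 5 | 2700.20 | B |
| 100 | B | 5 | 2646.00 | B |
| 150 | C | 5 | 623.80 | C |
| 100 | C | 5 | 340.60 | C D |
| 200 | C | 5 | 338.00 | C D |
| 200 | D | 5 | 183.80 | D |
| 100 | D | 5 | 151.40 | D |
| C | D | 5 | 111.80 | D |
| C | C | 5 | 109.00 | D |
| 150 | D | 5 | 71.20 | D |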

Means that do not share a letter are significantly different.

Excel


Anova: Two-Factor With Replication

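| SUMMARY | Bcontrol | B100 | B150 | B200 | Total |
|---|---|---|---|---|---|
| AA | | | | | |
| Count | 5 | 5 | 5 | 5 | 20 |
| Sum | 13501 | 16161 | 15845 | 16905 | 62412 |
| Average | 2700.2 | 3232.2 | 3169 | 3381 | 3120.6 |
| Variance | 35700.2 | 4679.2 | 11167.5 | 40930 | 87716.57 |
| AB | | | | | |
| Count | 5 | 5 | 5 | 5 | 20 |
| Sum | 15448 | 13230 | 16638 | 15485 | 60801 |
| Average | 3089.6 | 2646 | 3327.6 | 3097 | 3040.05 |
| Variance | 5839.8 | 76917.5 | 269901.3 | 7432.5 | 139929.4 |
| AC | | | | | |
| Count | 5 | 5 | 5 | 5 | 20 |
| Sum | 545 | 1703 | 3119 | 1690 | 7057 |
| Average | 109 | 340.6 | 623.8 | 338 | 352.85 |
| Variance | 351.5 | 2525.8 | 1079.7 | 6782.5 | 37326.03 |
| AD | | | | | |
| Count | 5 | 5 | 5 | 5 | 20 |
| Sum | 559 | 757 | 356 | 919 | 2591 |
| Average | 111.8 | 151.4 | 71.2 | 183.8 | 129.55 |
| Variance | 1485.2 | 4135.3 | 997.7 | 1510.7 | 3590.366 |
| Total | | | | | |
| Count | 20 | 20 | 20 | 20 | |
| Sum | 30053 | 31851 | 35958 | 34999 | |
| Average | 1502.65 | 1592.55 | 1797.9 | 1749.95 | |
| Variance | 2069464 | 1977134 | 2317478 | 2359637 | |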

ANOVA

| Source of Variation | SS | df | MS | F | p-value | F crit |
|---|---|---|---|---|---|---|
| Sample | 1.62E+08 | 3 | 53925376 | 1830.164 | 5.98E-62 | 2.748191 |
| Columns | 1128272 | 3 | 376090.7 | 12.76408 | 1.23E-06 | 2.748191 |
| Interaction | 2088667 | 9 | 232074.2 | 7.876325 | 1.02E-07 | 2.029792 |
| Within | 1885746 | 64 | 29464.78 | | | |
| Total | 1.67E+08 | 79 | | | | |