{"id":629,"date":"2017-05-11T17:14:13","date_gmt":"2017-05-11T17:14:13","guid":{"rendered":"https:\/\/courses.lumenlearning.com\/suny-natural-resources-biometrics\/chapter\/chapter-5-one-way-analysis-of-variance\/"},"modified":"2017-05-11T18:24:11","modified_gmt":"2017-05-11T18:24:11","slug":"chapter-5-one-way-analysis-of-variance","status":"publish","type":"chapter","link":"https:\/\/courses.lumenlearning.com\/suny-natural-resources-biometrics\/chapter\/chapter-5-one-way-analysis-of-variance\/","title":{"raw":"Chapter 5:  One-Way Analysis of Variance","rendered":"Chapter 5:  One-Way Analysis of Variance"},"content":{"raw":"<div class=\"Basic-Text-Frame\">\r\n\r\nPreviously, we have tested hypotheses about two population means. This chapter examines methods for comparing more than two means. Analysis of variance (ANOVA) is an inferential method used to test the equality of three or more population means.\r\n<p class=\"Centered\">H<span class=\"Subscript SmallText\">0<\/span>: \u00b5<span class=\"Subscript SmallText\">1<\/span>= \u00b5<span class=\"Subscript SmallText\">2<\/span>= \u00b5<span class=\"Subscript SmallText\">3<\/span>= \u2026=\u00b5<span class=\"Subscript SmallText\">k<\/span><\/p>\r\nThis method is also referred to as single-factor ANOVA because we use a single property, or characteristic, for categorizing the populations. 
This characteristic is sometimes referred to as a treatment or factor.\r\n<p class=\"Callout\"><span class=\"pullquote-left\">A treatment (or factor) is a property, or characteristic, that allows us to distinguish the different populations from one another.<\/span><\/p>\r\nThe objectives of ANOVA are (1) to estimate treatment means and the differences between treatment means, and (2) to test hypotheses about the statistical significance of comparisons of treatment means, where \u201ctreatment\u201d or \u201cfactor\u201d is the characteristic that distinguishes the populations.\r\n\r\nFor example, a biologist might compare the effect that three different herbicides may have on seed production of an invasive species in a forest environment. The biologist would want to estimate the mean annual seed production under the three different treatments, while also testing to see which treatment results in the lowest annual seed production. The null and alternative hypotheses are:\r\n<table class=\"no-lines\"><colgroup> <col \/> <col \/><\/colgroup>\r\n<tbody>\r\n<tr>\r\n<td class=\"Table\">H<span class=\"Subscript SmallText\">0<\/span>: \u00b5<span class=\"Subscript SmallText\">1<\/span>= \u00b5<span class=\"Subscript SmallText\">2<\/span>= \u00b5<span class=\"Subscript SmallText\">3<\/span><\/td>\r\n<td class=\"Table\">H<span class=\"Subscript SmallText\">1<\/span>: at least one of the means is significantly different from the others<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\nIt would be tempting to test this null hypothesis H<span class=\"Subscript SmallText\">0<\/span>: \u00b5<span class=\"Subscript SmallText\">1<\/span>= \u00b5<span class=\"Subscript SmallText\">2<\/span>= \u00b5<span class=\"Subscript SmallText\">3<\/span> by comparing the population means two at a time. 
If we continue this way, we would need to test three different pairs of hypotheses:\r\n<table class=\"no-lines\"><colgroup> <col \/> <col \/> <col \/> <col \/> <col \/><\/colgroup>\r\n<tbody>\r\n<tr>\r\n<td class=\"Table\">H<span class=\"Subscript SmallText\">0<\/span>: \u00b5<span class=\"Subscript SmallText\">1<\/span>= \u00b5<span class=\"Subscript SmallText\">2<\/span><\/td>\r\n<td class=\"Table\">AND<\/td>\r\n<td class=\"Table\">H<span class=\"Subscript SmallText\">0<\/span>: \u00b5<span class=\"Subscript SmallText\">1<\/span>= \u00b5<span class=\"Subscript SmallText\">3<\/span><\/td>\r\n<td class=\"Table\">AND<\/td>\r\n<td class=\"Table\">H<span class=\"Subscript SmallText\">0<\/span>: \u00b5<span class=\"Subscript SmallText\">2<\/span>= \u00b5<span class=\"Subscript SmallText\">3<\/span><\/td>\r\n<\/tr>\r\n<tr>\r\n<td class=\"Table\">H<span class=\"Subscript SmallText\">1<\/span>: \u00b5<span class=\"Subscript SmallText\">1<\/span>\u2260 \u00b5<span class=\"Subscript SmallText\">2<\/span><\/td>\r\n<td class=\"Table\"><span class=\"Subscript SmallText\">\u00a0<\/span><\/td>\r\n<td class=\"Table\">H<span class=\"Subscript SmallText\">1<\/span>: \u00b5<span class=\"Subscript SmallText\">1<\/span>\u2260 \u00b5<span class=\"Subscript SmallText\">3<\/span><\/td>\r\n<td class=\"Table\"><span class=\"Subscript SmallText\">\u00a0<\/span><\/td>\r\n<td class=\"Table\">H<span class=\"Subscript SmallText\">1<\/span>: \u00b5<span class=\"Subscript SmallText\">2<\/span>\u2260 \u00b5<span class=\"Subscript SmallText\">3<\/span><\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\nIf we used a 5% level of significance, each test would have a probability of a Type I error (rejecting the null hypothesis when it is true) of <span class=\"Symbols\" xml:lang=\"ar-SA\">\u03b1<\/span> = 0.05. Each test would have a 95% probability of correctly not rejecting the null hypothesis. 
The probability that all three tests correctly do not reject the null hypothesis is 0.95<span class=\"Superscript SmallText\">3<\/span> = 0.86. There is a 1 - 0.95<span class=\"Superscript SmallText\">3<\/span> = 0.14 (14%) probability that at least one test will lead to an incorrect rejection of the null hypothesis. A 14% probability of a Type I error is much higher than the desired alpha of 5% (remember: <span class=\"Symbols\" xml:lang=\"ar-SA\">\u03b1<\/span> is the probability of a Type I error). As the number of populations increases, the probability of making a Type I error using multiple t-tests also increases. Analysis of variance allows us to test the null hypothesis (all means are equal) against the alternative hypothesis (at least one mean is different) with a specified value of <span class=\"Symbols\" xml:lang=\"ar-SA\">\u03b1<\/span>.\r\n\r\nThe assumptions for ANOVA are (1) the observations in each treatment group represent a random sample from that population; (2) each of the populations is normally distributed; (3) population variances for each treatment group are homogeneous (i.e., <span class=\"Inline-Equation\"><img class=\"frame-46\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171319\/Image37184_fmt.png\" alt=\"Image37184.PNG\" \/><\/span>). We can easily test the normality of the samples by creating a normal probability plot; however, verifying homogeneous variances can be more difficult. A general rule of thumb is as follows: <em>One-way ANOVA may be used if the largest sample standard deviation is no more than twice the smallest sample standard deviation.<\/em>\r\n\r\nIn the previous chapter, we used a two-sample t-test to compare the means from two independent samples with a common variance. 
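The inflation of the familywise Type I error described above generalizes to 1 - (1 - alpha)^m for m independent tests, each run at level alpha. A minimal Python sketch (illustrative only; the function name is our own, not part of the chapter):

```python
# Familywise Type I error rate for m independent tests, each at level alpha.
# (Hypothetical helper for illustration; not from the chapter.)
def familywise_error(alpha, m):
    """Probability that at least one of m independent tests falsely rejects H0."""
    return 1 - (1 - alpha) ** m

print(round(familywise_error(0.05, 3), 2))   # three pairwise t-tests, as in the text -> 0.14
print(round(familywise_error(0.05, 10), 2))  # the problem worsens with more comparisons -> 0.4
```

ANOVA avoids this inflation by testing all k means at once in a single F-test at the stated alpha.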
The sample data are used to compute the test statistic:\r\n<p class=\"Centered\"><span class=\"Inline-Equation-Large\"><img class=\"frame-45\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171320\/Image37204_fmt.png\" alt=\"Image37204.PNG\" \/><\/span> where <span class=\"Inline-Equation\"><img class=\"frame-10\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171321\/Image37227_fmt.png\" alt=\"Image37227.PNG\" \/><\/span><\/p>\r\nis the pooled estimate of the common population variance \u03c3<span class=\"Superscript SmallText\">2<\/span>. To test more than two populations, we must extend this idea of pooled variance to include all samples as shown below:\r\n<p class=\"Centered\"><img class=\"frame-172 aligncenter\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171323\/Image37244_fmt.png\" alt=\"Image37244.PNG\" \/><\/p>\r\nwhere S<span class=\"Subscript SmallText\">w<\/span><span class=\"Superscript SmallText\">2<\/span> represents the pooled estimate of the common variance \u03c3<span class=\"Superscript SmallText\">2<\/span>, and it measures the variability of the observations within the different populations <strong class=\"Strong-2\">whether or not H<\/strong><strong><span class=\"Subscript SmallText\">0<\/span> <\/strong><strong class=\"Strong-2\">is true<\/strong>. This is often referred to as the variance within samples (variation due to error).\r\n\r\nIf the null hypothesis IS true (all the means are equal), then all the populations are the same, with a common mean <span class=\"Symbols\" xml:lang=\"ar-SA\">\u03bc<\/span> and variance <span class=\"Symbols\" xml:lang=\"ar-SA\">\u03c3<\/span><span class=\"Superscript SmallText\">2<\/span>. 
Instead of randomly selecting different samples from different populations, we are actually drawing <em>k<\/em> different samples from one population. We know that the sampling distribution for <em>k<\/em> means based on <em>n<\/em> observations will have mean <span class=\"Symbols\" xml:lang=\"ar-SA\">\u03bc<span class=\"Subscript SmallText\"><em>x\u0304<\/em><\/span><\/span> and variance <span class=\"Symbols\" xml:lang=\"ar-SA\">\u03c3<\/span><span class=\"Superscript SmallText\">2<\/span>\/n (squared standard error). Since we have drawn <em>k<\/em> samples of <em>n<\/em> observations each, we can estimate the variance of the k sample means (<span class=\"Symbols\" xml:lang=\"ar-SA\">\u03c3<\/span><span class=\"Superscript SmallText\">2<\/span>\/n) by\r\n<p class=\"Centered\"><img class=\"frame-11 aligncenter\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171325\/8840.png\" alt=\"8840.png\" \/><\/p>\r\nConsequently, <em>n<\/em> times the sample variance of the means estimates <span class=\"Symbols\" xml:lang=\"ar-SA\">\u03c3<\/span><span class=\"Superscript SmallText\">2<\/span>. We designate this quantity as S<span class=\"Subscript SmallText\">B<\/span><span class=\"Superscript SmallText\">2<\/span> such that\r\n<p class=\"Centered\"><img class=\"frame-73 aligncenter\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171327\/8847.png\" alt=\"8847.png\" \/><\/p>\r\nwhere S<sub>B<sup>2<\/sup><\/sub>\u00a0is also an unbiased estimate of the common variance <span class=\"Symbols\" xml:lang=\"ar-SA\">\u03c3<\/span><span class=\"Superscript SmallText\">2<\/span>, IF H<span class=\"Subscript SmallText\">0<\/span> IS TRUE. 
This is often referred to as the variance between samples (variation due to treatment).\r\n\r\nUnder the null hypothesis that all <em>k<\/em> populations are identical, we have two estimates of <span class=\"Symbols\" xml:lang=\"ar-SA\">\u03c3<\/span><span class=\"Superscript SmallText\">2<\/span> (S<span class=\"Subscript SmallText\">W<\/span><span class=\"Superscript SmallText\">2<\/span> and S<span class=\"Subscript SmallText\">B<\/span><span class=\"Superscript SmallText\">2<\/span>). We can use the ratio of S<span class=\"Subscript SmallText\">B<\/span><span class=\"Superscript SmallText\">2<\/span>\/ S<span class=\"Subscript SmallText\">W<\/span><span class=\"Superscript SmallText\">2<\/span> as a test statistic for the null hypothesis H<span class=\"Subscript SmallText\">0<\/span>: \u00b5<span class=\"Subscript SmallText\">1<\/span>= \u00b5<span class=\"Subscript SmallText\">2<\/span>= \u00b5<span class=\"Subscript SmallText\">3<\/span>= \u2026= \u00b5<span class=\"Subscript SmallText\">k<\/span>; under H<span class=\"Subscript SmallText\">0<\/span>, this ratio follows an F-distribution with degrees of freedom df<span class=\"Subscript SmallText\">1<\/span>= k - 1 and df<span class=\"Subscript SmallText\">2<\/span> = N - <em>k<\/em> (where <em>k<\/em> is the number of populations and N is the total number of observations, N = n<span class=\"Subscript SmallText\">1<\/span> + n<span class=\"Subscript SmallText\">2<\/span>+\u2026+ n<span class=\"Subscript SmallText\">k<\/span>). The numerator of the test statistic measures the variation between sample means. The estimate of the variance in the denominator depends only on the sample variances and is not affected by the differences among the sample means.\r\n\r\nWhen the null hypothesis is true, the ratio of S<span class=\"Subscript SmallText\">B<\/span><span class=\"Superscript SmallText\">2<\/span> and S<span class=\"Subscript SmallText\">W<\/span><span class=\"Superscript SmallText\">2<\/span> will be close to 1. 
When the null hypothesis is false, S<span class=\"Subscript SmallText\">B<\/span><span class=\"Superscript SmallText\">2<\/span> will tend to be larger than S<span class=\"Subscript SmallText\">W<\/span><span class=\"Superscript SmallText\">2<\/span> due to the differences among the populations. We will reject the null hypothesis if the F test statistic is larger than the F critical value at a given level of significance (or if the p-value is less than the level of significance).\r\n\r\nTables are a convenient format for summarizing the key results in ANOVA calculations. The following one-way ANOVA table illustrates the required computations and the relationships between the various ANOVA table elements.\r\n\r\n[caption id=\"\" align=\"aligncenter\" width=\"901\"]<img class=\"frame-13\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171329\/8636.png\" alt=\"8636.png\" width=\"901\" height=\"253\" \/> Table 1. One-way ANOVA table.[\/caption]\r\n\r\nThe sum of squares for the ANOVA table has the relationship of SSTo = SSTr + SSE where:\r\n<p class=\"Centered\"><span class=\"Inline-Equation\"><img class=\"frame-26\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171331\/8869.png\" alt=\"8869.png\" \/><\/span>\u00a0\u00a0\u00a0 <span class=\"Inline-Equation\"><img class=\"frame-51\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171332\/8882.png\" alt=\"8882.png\" \/><\/span> \u00a0\u00a0\u00a0<span class=\"Inline-Equation\"><img class=\"frame-7\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171334\/8890.png\" alt=\"8890.png\" \/><\/span><\/p>\r\n<p class=\"Centered\"><strong class=\"Strong-2\">Total variation (SSTo) = explained variation (SSTr) + unexplained variation (SSE)<\/strong><\/p>\r\nThe degrees of freedom also have a similar 
relationship: df<span class=\"Subscript SmallText\">(SSTo)<\/span> = df<span class=\"Subscript SmallText\">(SSTr)<\/span> + df<span class=\"Subscript SmallText\">(SSE)<\/span>\r\n\r\nThe Mean Sums of Squares for the treatment and error are found by dividing the Sums of Squares by the degrees of freedom for each. While the Sums of Squares are additive, the Mean Sums of Squares are not. The F-statistic is then found by dividing the Mean Sum of Squares for the treatment (MSTr) by the Mean Sum of Squares for the error (MSE). The MSTr is S<span class=\"Subscript SmallText\">B<\/span><span class=\"Superscript SmallText\">2<\/span> and the MSE is S<span class=\"Subscript SmallText\">W<\/span><span class=\"Superscript SmallText\">2<\/span>.\r\n<p class=\"Centered\" style=\"text-align: center\"><strong class=\"Strong-2\">F = S<sub>B<\/sub><\/strong><span class=\"Superscript SmallText\">2<\/span><strong class=\"Strong-2\">\/ S<sub>w<\/sub><\/strong><span class=\"Superscript SmallText\">2<\/span> <strong class=\"Strong-2\">= MSTr\/MSE<\/strong><\/p>\r\n\r\n<div class=\"textbox examples\">\r\n<h3>Example 1<\/h3>\r\n<p class=\"ExampleHeading\">An environmentalist wanted to determine if the mean acidity of rain differed among Alaska, Florida, and Texas. He randomly selected six rain dates at each site and obtained the following data:<\/p>\r\n\r\n\r\n[caption id=\"\" align=\"aligncenter\" width=\"655\"]<img class=\"frame-47\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171336\/8997.png\" alt=\"8997.png\" width=\"655\" height=\"423\" \/> Table 2. 
Data for Alaska, Florida, and Texas.[\/caption]\r\n<p class=\"Example\">H<span class=\"Subscript SmallText\">0<\/span>: <span class=\"Symbols\" xml:lang=\"ar-SA\">\u03bc<\/span><span class=\"Subscript SmallText\">A<\/span> = <span class=\"Symbols\" xml:lang=\"ar-SA\">\u03bc<\/span><span class=\"Subscript SmallText\">F<\/span> = <span class=\"Symbols\" xml:lang=\"ar-SA\">\u03bc<\/span><span class=\"Subscript SmallText\">T<\/span> H<span class=\"Subscript SmallText\">1<\/span>: at least one of the means is different<\/p>\r\n\r\n<table class=\"Table\" style=\"margin-left: 23px;width: 705.141px\"><colgroup> <col \/> <col \/> <col \/> <col \/> <col \/><\/colgroup>\r\n<tbody>\r\n<tr style=\"height: 29px\">\r\n<td class=\"Table\" style=\"width: 80px;height: 29px\">\r\n<p class=\"Table\">State<\/p>\r\n<\/td>\r\n<td class=\"Table\" style=\"width: 124px;height: 29px\">\r\n<p class=\"Table\">Sample size<\/p>\r\n<\/td>\r\n<td class=\"Table\" style=\"width: 131px;height: 29px\">\r\n<p class=\"Table\">Sample total<\/p>\r\n<\/td>\r\n<td class=\"Table\" style=\"width: 143px;height: 29px\">\r\n<p class=\"Table\">Sample mean<\/p>\r\n<\/td>\r\n<td class=\"Table\" style=\"width: 172.141px;height: 29px\">\r\n<p class=\"Table\">Sample variance<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr style=\"height: 29.8125px\">\r\n<td class=\"Table\" style=\"width: 80px;height: 29.8125px\">\r\n<p class=\"Table\">Alaska<\/p>\r\n<\/td>\r\n<td class=\"Table\" style=\"width: 124px;height: 29.8125px\">\r\n<p class=\"Table\">n<span class=\"Subscript SmallText\">1<\/span> = 6<\/p>\r\n<\/td>\r\n<td class=\"Table\" style=\"width: 131px;height: 29.8125px\">\r\n<p class=\"Table\">30.2<\/p>\r\n<\/td>\r\n<td class=\"Table\" style=\"width: 143px;height: 29.8125px\">\r\n<p class=\"Table\">5.033<\/p>\r\n<\/td>\r\n<td class=\"Table\" style=\"width: 172.141px;height: 29.8125px\">\r\n<p class=\"Table\">0.0265<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr style=\"height: 29px\">\r\n<td class=\"Table\" style=\"width: 80px;height: 29px\">\r\n<p 
class=\"Table\">Florida<\/p>\r\n<\/td>\r\n<td class=\"Table\" style=\"width: 124px;height: 29px\">\r\n<p class=\"Table\">n<span class=\"Subscript SmallText\">2<\/span> = 6<\/p>\r\n<\/td>\r\n<td class=\"Table\" style=\"width: 131px;height: 29px\">\r\n<p class=\"Table\">27.1<\/p>\r\n<\/td>\r\n<td class=\"Table\" style=\"width: 143px;height: 29px\">\r\n<p class=\"Table\">4.517<\/p>\r\n<\/td>\r\n<td class=\"Table\" style=\"width: 172.141px;height: 29px\">\r\n<p class=\"Table\">0.1193<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr style=\"height: 29px\">\r\n<td class=\"Table\" style=\"width: 80px;height: 29px\">\r\n<p class=\"Table\">Texas<\/p>\r\n<\/td>\r\n<td class=\"Table\" style=\"width: 124px;height: 29px\">\r\n<p class=\"Table\">n<span class=\"Subscript SmallText\">3<\/span> = 6<\/p>\r\n<\/td>\r\n<td class=\"Table\" style=\"width: 131px;height: 29px\">\r\n<p class=\"Table\">33.22<\/p>\r\n<\/td>\r\n<td class=\"Table\" style=\"width: 143px;height: 29px\">\r\n<p class=\"Table\">5.537<\/p>\r\n<\/td>\r\n<td class=\"Table\" style=\"width: 172.141px;height: 29px\">\r\n<p class=\"Table\">0.1575<\/p>\r\n<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<p class=\"Caption\" style=\"text-align: center\"><em>Table 3. Summary Table.<\/em><\/p>\r\n<p class=\"Example\">Notice that there are differences among the sample means. Are the differences small enough to be explained solely by sampling variability? Or are they of sufficient magnitude so that a more reasonable explanation is that the <span class=\"Symbols\" xml:lang=\"ar-SA\">\u03bc<\/span>\u2019s are not all equal? 
The conclusion depends on how the variation among the sample means (based on their deviations from the grand mean) compares with the variation within the three samples.<\/p>\r\n<p class=\"Example\">The grand mean is equal to the sum of all observations divided by the total sample size:<\/p>\r\n<p class=\"Example\"><span class=\"Inline-Equation\"><img class=\"frame-5\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171337\/8898.png\" alt=\"8898.png\" \/><\/span> = grand total\/N = 90.52\/18 = 5.0289<\/p>\r\n<p class=\"Example\">SSTo = (5.11-5.0289)<span class=\"Superscript SmallText\">2<\/span> + (5.01-5.0289)<span class=\"Superscript SmallText\">2<\/span> +\u2026+(5.24-5.0289)<span class=\"Superscript SmallText\">2<\/span><\/p>\r\n<p class=\"Example\">+ (4.87-5.0289)<span class=\"Superscript SmallText\">2<\/span> + (4.18-5.0289)<span class=\"Superscript SmallText\">2<\/span> +\u2026+(4.09-5.0289)<span class=\"Superscript SmallText\">2<\/span><\/p>\r\n<p class=\"Example\">+ (5.46-5.0289)<span class=\"Superscript SmallText\">2<\/span> + (6.29-5.0289)<span class=\"Superscript SmallText\">2<\/span> +\u2026+(5.30-5.0289)<span class=\"Superscript SmallText\">2<\/span> = 4.6384<\/p>\r\n<p class=\"Example\">SSTr = 6(5.033-5.0289)<span class=\"Superscript SmallText\">2<\/span> + 6(4.517-5.0289)<span class=\"Superscript SmallText\">2<\/span> + 6(5.537-5.0289)<span class=\"Superscript SmallText\">2<\/span> = 3.1214<\/p>\r\n<p class=\"Example\">SSE = SSTo \u2013 SSTr = 4.6384 \u2013 3.1214 = 1.5170<\/p>\r\n\r\n\r\n[caption id=\"\" align=\"aligncenter\" width=\"976\"]<img class=\"frame-13\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171339\/8605.png\" alt=\"8605.png\" width=\"976\" height=\"269\" \/> Table 4. 
One-way ANOVA Table.[\/caption]\r\n<p class=\"Example\">This test is based on df<span class=\"Subscript SmallText\">1<\/span> = k - 1 = 2 and df<span class=\"Subscript SmallText\">2<\/span> = N - k = 15. For <span class=\"Symbols\" xml:lang=\"ar-SA\">\u03b1<\/span> = 0.05, the F critical value is 3.68. Since the observed F = 15.4372 is greater than the F critical value of 3.68, we reject the null hypothesis. There is enough evidence to state that at least one of the means is different.<\/p>\r\n\r\n<\/div>\r\n<h2 class=\"ExampleHeading\">Software Solutions<\/h2>\r\n<h3>Minitab<\/h3>\r\n<p class=\"Centered\"><img class=\"frame-67 aligncenter\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171342\/093_1_fmt.png\" alt=\"093_1.tif\" \/><img class=\"frame-67 aligncenter\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171346\/093_2_fmt.png\" alt=\"093_2.tif\" \/><\/p>\r\n\r\n<h4>One-way ANOVA: pH vs. 
State<\/h4>\r\n<table class=\"Table\"><colgroup> <col \/> <col \/> <col \/> <col \/> <col \/> <col \/><\/colgroup>\r\n<tbody>\r\n<tr>\r\n<td class=\"Table\">\r\n<p class=\"Table\">Source<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">DF<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">SS<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">MS<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">F<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">P<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr>\r\n<td class=\"Table\">\r\n<p class=\"Table\">State<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">2<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">3.121<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">1.561<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">15.43<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">0.000<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr>\r\n<td class=\"Table\">\r\n<p class=\"Table\">Error<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">15<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">1.517<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">0.101<\/p>\r\n<\/td>\r\n<td class=\"Table\"><\/td>\r\n<td class=\"Table\"><\/td>\r\n<\/tr>\r\n<tr>\r\n<td class=\"Table\">\r\n<p class=\"Table\">Total<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">17 4.638<\/p>\r\n<\/td>\r\n<td class=\"Table\"><\/td>\r\n<td class=\"Table\"><\/td>\r\n<td class=\"Table\"><\/td>\r\n<td class=\"Table\"><\/td>\r\n<\/tr>\r\n<tr>\r\n<td class=\"Table\" colspan=\"6\">\r\n<p class=\"Table\">S = 0.3180 R-Sq = 67.29% R-Sq(adj) = 62.93%<\/p>\r\n<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<table class=\"Table\"><colgroup> <col \/> <col \/> <col \/> <col \/> <col \/> <col \/> <col \/> <col \/> <col \/><\/colgroup>\r\n<tbody>\r\n<tr style=\"height: 43.4844px\">\r\n<td class=\"Table\" style=\"height: 43.4844px\"><\/td>\r\n<td class=\"Table\" 
style=\"height: 43.4844px\"><\/td>\r\n<td class=\"Table\" style=\"height: 43.4844px\"><\/td>\r\n<td class=\"Table\" style=\"height: 43.4844px\" colspan=\"6\">\r\n<p class=\"Table\">Individual 95% CIs For Mean Based on Pooled StDev<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr style=\"height: 43px\">\r\n<td class=\"Table\" style=\"height: 43px\">\r\n<p class=\"Table\">Level<\/p>\r\n<\/td>\r\n<td class=\"Table\" style=\"height: 43px\">\r\n<p class=\"Table\">N<\/p>\r\n<\/td>\r\n<td class=\"Table\" style=\"height: 43px\">\r\n<p class=\"Table\">Mean<\/p>\r\n<\/td>\r\n<td class=\"Table\" style=\"height: 43px\">\r\n<p class=\"Table\">StDev<\/p>\r\n<\/td>\r\n<td class=\"Table\" style=\"height: 43px\" colspan=\"5\">\r\n<p class=\"Table\">----+---------+---------+---------+-----<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr style=\"height: 44px\">\r\n<td class=\"Table\" style=\"height: 44px\">\r\n<p class=\"Table\">Alaska<\/p>\r\n<\/td>\r\n<td class=\"Table\" style=\"height: 44px\">\r\n<p class=\"Table\">6<\/p>\r\n<\/td>\r\n<td class=\"Table\" style=\"height: 44px\">\r\n<p class=\"Table\">5.0333<\/p>\r\n<\/td>\r\n<td class=\"Table\" style=\"height: 44px\">\r\n<p class=\"Table\">0.1629<\/p>\r\n<\/td>\r\n<td class=\"Table\" style=\"height: 44px\"><\/td>\r\n<td class=\"Table\" style=\"height: 44px\" colspan=\"2\">\r\n<p class=\"Table\">(------*------)<\/p>\r\n<\/td>\r\n<td class=\"Table\" style=\"height: 44px\"><\/td>\r\n<td class=\"Table\" style=\"height: 44px\"><\/td>\r\n<\/tr>\r\n<tr style=\"height: 44px\">\r\n<td class=\"Table\" style=\"height: 44px\">\r\n<p class=\"Table\">Florida<\/p>\r\n<\/td>\r\n<td class=\"Table\" style=\"height: 44px\">\r\n<p class=\"Table\">6<\/p>\r\n<\/td>\r\n<td class=\"Table\" style=\"height: 44px\">\r\n<p class=\"Table\">4.5167<\/p>\r\n<\/td>\r\n<td class=\"Table\" style=\"height: 44px\">\r\n<p class=\"Table\">0.3455<\/p>\r\n<\/td>\r\n<td class=\"Table\" style=\"height: 44px\" colspan=\"2\">\r\n<p class=\"Table\">(------*------)<\/p>\r\n<\/td>\r\n<td class=\"Table\" 
style=\"height: 44px\"><\/td>\r\n<td class=\"Table\" style=\"height: 44px\"><\/td>\r\n<td class=\"Table\" style=\"height: 44px\"><\/td>\r\n<\/tr>\r\n<tr style=\"height: 44px\">\r\n<td class=\"Table\" style=\"height: 44px\">\r\n<p class=\"Table\">Texas<\/p>\r\n<\/td>\r\n<td class=\"Table\" style=\"height: 44px\">\r\n<p class=\"Table\">6<\/p>\r\n<\/td>\r\n<td class=\"Table\" style=\"height: 44px\">\r\n<p class=\"Table\">5.5367<\/p>\r\n<\/td>\r\n<td class=\"Table\" style=\"height: 44px\">\r\n<p class=\"Table\">0.3969<\/p>\r\n<\/td>\r\n<td class=\"Table\" style=\"height: 44px\"><\/td>\r\n<td class=\"Table\" style=\"height: 44px\"><\/td>\r\n<td class=\"Table\" style=\"height: 44px\" colspan=\"2\">\r\n<p class=\"Table\">(------*------)<\/p>\r\n<\/td>\r\n<td class=\"Table\" style=\"height: 44px\"><\/td>\r\n<\/tr>\r\n<tr style=\"height: 43px\">\r\n<td class=\"Table\" style=\"height: 43px\"><\/td>\r\n<td class=\"Table\" style=\"height: 43px\"><\/td>\r\n<td class=\"Table\" style=\"height: 43px\"><\/td>\r\n<td class=\"Table\" style=\"height: 43px\"><\/td>\r\n<td class=\"Table\" style=\"height: 43px\" colspan=\"5\">\r\n<p class=\"Table\">----+---------+---------+---------+-----<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr style=\"height: 44px\">\r\n<td class=\"Table\" style=\"height: 44px\"><\/td>\r\n<td class=\"Table\" style=\"height: 44px\"><\/td>\r\n<td class=\"Table\" style=\"height: 44px\"><\/td>\r\n<td class=\"Table\" style=\"height: 44px\"><\/td>\r\n<td class=\"Table\" style=\"height: 44px\">\r\n<p class=\"Table\">4.40<\/p>\r\n<\/td>\r\n<td class=\"Table\" style=\"height: 44px\">\r\n<p class=\"Table\">4.80<\/p>\r\n<\/td>\r\n<td class=\"Table\" style=\"height: 44px\">\r\n<p class=\"Table\">5.20<\/p>\r\n<\/td>\r\n<td class=\"Table\" style=\"height: 44px\">\r\n<p class=\"Table\">5.60<\/p>\r\n<\/td>\r\n<td class=\"Table\" style=\"height: 44px\"><\/td>\r\n<\/tr>\r\n<tr style=\"height: 44px\">\r\n<td class=\"Table\" style=\"height: 44px\" colspan=\"3\">\r\n<p class=\"Table\">Pooled StDev 
= 0.3180<\/p>\r\n<\/td>\r\n<td class=\"Table\" style=\"height: 44px\"><\/td>\r\n<td class=\"Table\" style=\"height: 44px\"><\/td>\r\n<td class=\"Table\" style=\"height: 44px\"><\/td>\r\n<td class=\"Table\" style=\"height: 44px\"><\/td>\r\n<td class=\"Table\" style=\"height: 44px\"><\/td>\r\n<td class=\"Table\" style=\"height: 44px\"><\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\nThe p-value (0.000) is less than the level of significance (0.05) so we will reject the null hypothesis.\r\n<h3>Excel<\/h3>\r\n<p class=\"No-Caption\"><span class=\"Picture\"><img class=\"frame-13 aligncenter\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171349\/092_1_fmt.png\" alt=\"092_1.tif\" \/><\/span><\/p>\r\n\r\n<h4><span class=\"Picture\"><img class=\"frame-13 aligncenter\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171354\/092_2_fmt.png\" alt=\"092_2.tif\" \/><\/span>ANOVA: Single Factor<\/h4>\r\n<table class=\"Table\"><colgroup> <col \/> <col \/> <col \/> <col \/> <col \/><\/colgroup>\r\n<tbody>\r\n<tr>\r\n<td class=\"Table-Heading\">\r\n<p class=\"Table-Heading\"><strong>SUMMARY<\/strong><\/p>\r\n<\/td>\r\n<td class=\"Table-Heading\"><\/td>\r\n<td class=\"Table-Heading\"><\/td>\r\n<td class=\"Table-Heading\"><\/td>\r\n<td class=\"Table-Heading\"><\/td>\r\n<\/tr>\r\n<tr>\r\n<td class=\"Table\">\r\n<p class=\"Table\">Groups<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">Count<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">Sum<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">Average<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">Variance<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr>\r\n<td class=\"Table\">\r\n<p class=\"Table\">Column 1<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">6<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">30.2<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p 
class=\"Table\">5.033333<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">0.026547<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr>\r\n<td class=\"Table\">\r\n<p class=\"Table\">Column 2<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">6<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">27.1<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">4.516667<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">0.119347<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr>\r\n<td class=\"Table\">\r\n<p class=\"Table\">Column 3<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">6<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">33.22<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">5.536667<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">0.157507<\/p>\r\n<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<table class=\"Table\"><colgroup> <col \/> <col \/> <col \/> <col \/> <col \/> <col \/> <col \/><\/colgroup>\r\n<tbody>\r\n<tr>\r\n<td class=\"Table-Heading\">\r\n<p class=\"Table-Heading\"><strong>ANOVA<\/strong><\/p>\r\n<\/td>\r\n<td class=\"Table-Heading\"><\/td>\r\n<td class=\"Table-Heading\"><\/td>\r\n<td class=\"Table-Heading\"><\/td>\r\n<td class=\"Table-Heading\"><\/td>\r\n<td class=\"Table-Heading\"><\/td>\r\n<td class=\"Table-Heading\"><\/td>\r\n<\/tr>\r\n<tr>\r\n<td class=\"Table\">\r\n<p class=\"Table\">Source of Variation<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">SS<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">df<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">MS<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">F<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">p-value<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">F crit<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr>\r\n<td class=\"Table\">\r\n<p class=\"Table\">Between Groups<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">3.121378<\/p>\r\n<\/td>\r\n<td 
class=\"Table\">\r\n<p class=\"Table\">2<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">1.560689<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">15.43199<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">0.000229<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">3.68232<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr>\r\n<td class=\"Table\">\r\n<p class=\"Table\">Within Groups<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">1.517<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">15<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">0.101133<\/p>\r\n<\/td>\r\n<td class=\"Table\"><\/td>\r\n<td class=\"Table\"><\/td>\r\n<td class=\"Table\"><\/td>\r\n<\/tr>\r\n<tr>\r\n<td class=\"Table\">\r\n<p class=\"Table\">Total<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">4.638378<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">17<\/p>\r\n<\/td>\r\n<td class=\"Table\"><\/td>\r\n<td class=\"Table\"><\/td>\r\n<td class=\"Table\"><\/td>\r\n<td class=\"Table\"><\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\nThe p-value (0.000229) is less than alpha (0.05) so we reject the null hypothesis. There is enough evidence to support the claim that at least one of the means is different.\r\n\r\nOnce we have rejected the null hypothesis and found that at least one of the treatment means is different, the next step is to identify those differences. There are two approaches that can be used to answer this type of question: contrasts and multiple comparisons.\r\n\r\nContrasts can be used only when there are clear expectations BEFORE starting an experiment, and these are reflected in the experimental design. Contrasts are <strong class=\"Strong-2\">planned comparisons<\/strong>. For example, mule deer are treated with drug A, drug B, or a placebo to treat an infection. The three treatments are not symmetrical. The placebo is meant to provide a baseline against which the other drugs can be compared. 
Contrasts are more powerful than multiple comparisons because they are more specific, which makes them better able to detect a significant difference. Contrasts are not always readily available in statistical software packages (when they are, you often need to assign the coefficients), or they may be limited to comparing each sample to a control.\r\n\r\nMultiple comparisons should be used when there are no justified expectations. They are <em>a posteriori<\/em>, <strong class=\"Strong-2\">pair-wise tests<\/strong> of significance. For example, we compare the gas mileage for six brands of all-terrain vehicles. We have no prior knowledge to expect any vehicle to perform differently from the rest. Pair-wise comparisons should be performed here, but only if an ANOVA test on all six vehicles rejected the null hypothesis first.\r\n\r\n<strong class=\"Strong-2\">It is NOT appropriate to use a contrast test when suggested comparisons appear only after the data have been collected.<\/strong> We are going to focus on multiple comparisons instead of planned contrasts.\r\n<h2>Multiple Comparisons<\/h2>\r\nWhen the null hypothesis is rejected by the F-test, we believe that there are significant differences among the <em>k<\/em> population means. So, which ones are different? A multiple comparison method identifies which of the means are different while controlling the experiment-wise error rate (the accumulated risk associated with a family of comparisons). There are many multiple comparison methods available.\r\n\r\nIn the <strong class=\"Strong-2\">Least Significant Difference Test<\/strong>, each individual hypothesis is tested with the Student t-statistic. 
When the Type I error probability is set at some value and the variance s<span class=\"Superscript SmallText\">2<\/span> has <em>v<\/em> degrees of freedom, the null hypothesis is rejected for any observed value such that |t<span class=\"Subscript SmallText\">o<\/span>|&gt;t<span class=\"Symbol-Subscript SmallText\" xml:lang=\"ar-SA\">\u03b1\/2,v<\/span>. It is an abbreviated version of conducting all possible pair-wise t-tests. This method provides only weak control of the experiment-wise error rate. Fisher\u2019s Protected LSD is somewhat better at controlling this problem.\r\n\r\nThe <strong class=\"Strong-2\">Bonferroni<\/strong> inequality provides a conservative alternative when software is not available. When conducting n comparisons, <span class=\"Symbols\" xml:lang=\"ar-SA\">\u03b1<\/span><span class=\"Subscript SmallText\">e<\/span> \u2264 n <span class=\"Symbols\" xml:lang=\"ar-SA\">\u03b1<\/span><span class=\"Subscript SmallText\">c<\/span>; therefore <span class=\"Symbols\" xml:lang=\"ar-SA\">\u03b1<\/span><span class=\"Subscript SmallText\">c<\/span> = <span class=\"Symbols\" xml:lang=\"ar-SA\">\u03b1<\/span><span class=\"Subscript SmallText\">e<\/span>\/n. In other words, divide the experiment-wise level of significance by the number of multiple comparisons to get the comparison-wise level of significance. The Bonferroni procedure is based on computing confidence intervals for the differences between each possible pair of <span class=\"Symbols\" xml:lang=\"ar-SA\">\u03bc<\/span>\u2019s. The critical value for the confidence intervals comes from a table with (N - <em>k<\/em>) degrees of freedom and <em>k<\/em>(<em>k<\/em> - 1)\/2 intervals. If a particular interval does not contain zero, the two means are declared to be significantly different from one another. 
An interval that contains zero indicates that the two means are NOT significantly different.\r\n\r\n<strong class=\"Strong-2\">Dunnett\u2019s<\/strong> procedure was created for studies where one of the treatments acts as a control treatment for some or all of the remaining treatments. It is used primarily when the goal of the study is to determine whether the mean responses for the treatments differ from that of the control. Like the Bonferroni procedure, it builds confidence intervals to estimate the difference between two treatment means, with a specific table of critical values used to control the experiment-wise error rate. The standard error of the difference is <span class=\"Inline-Equation-Large\"><img class=\"frame-43\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171356\/Image37382_fmt.png\" alt=\"Image37382.PNG\" \/><\/span>.\r\n\r\n<strong class=\"Strong-2\">Scheffe\u2019s<\/strong> test is also a conservative method for all possible simultaneous comparisons suggested by the data. This test equates the F statistic of ANOVA with the t-test statistic. Since t<span class=\"Superscript SmallText\">2<\/span> = F, and therefore t = \u221aF, we can substitute \u221aF(<span class=\"Symbols\" xml:lang=\"ar-SA\">\u03b1<\/span><span class=\"Subscript SmallText\">e<\/span>, v<span class=\"Subscript SmallText\">1<\/span>, v<span class=\"Subscript SmallText\">2<\/span>) for t(<span class=\"Symbols\" xml:lang=\"ar-SA\">\u03b1<\/span><span class=\"Subscript SmallText\">e<\/span>, v<span class=\"Subscript SmallText\">2<\/span>) to obtain Scheffe\u2019s statistic.\r\n\r\n<strong class=\"Strong-2\">Tukey\u2019s<\/strong> test provides strong control of the experiment-wise error rate for all pair-wise comparisons of treatment means. This test is also known as the <em>Honestly Significant Difference<\/em>. 
This test orders the treatments from smallest to largest and uses the studentized range statistic\r\n<p class=\"Centered\"><img class=\"frame-710 aligncenter\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171357\/8905.png\" alt=\"8905.png\" \/><\/p>\r\nThe absolute difference of the two means is used because the location of the two means in the calculated difference is arbitrary, with the sign of the difference depending on which mean is used first. For unequal replications, the Tukey-Kramer approximation is used instead.\r\n\r\nThe <strong class=\"Strong-2\">Student-Newman-Keuls<\/strong> (SNK) test is a multiple range test based, like Tukey\u2019s, on the studentized range statistic. The critical value is based on a particular pair of means being tested within the entire set of ordered means. Two or more ranges among means are used for test criteria. While it is similar to Tukey\u2019s in terms of its test statistic, it provides only weak control of the experiment-wise error rate.\r\n\r\nBonferroni, Dunnett\u2019s, and Scheffe\u2019s tests are the most conservative, meaning that the difference between two means must be greater before a significant difference is concluded. The LSD and SNK tests are the least conservative. Tukey\u2019s test is in the middle. Robert Kuehl, author of <em>Design of Experiments: Statistical Principles of Research Design and Analysis<\/em> (2000), states that the Tukey method provides the best protection against decision errors, along with a strong inference about the magnitude and direction of differences.\r\n\r\nLet\u2019s go back to our question on mean rain acidity in Alaska, Florida, and Texas. 
The null and alternative hypotheses were as follows:\r\n<table class=\"Table\"><colgroup> <col \/> <col \/><\/colgroup>\r\n<tbody>\r\n<tr>\r\n<td class=\"Table\">\r\n<p class=\"Table\">H<sub>0<\/sub>: <span class=\"Symbols\" xml:lang=\"ar-SA\">\u03bc<\/span><span class=\"Subscript SmallText\">A<\/span> = <span class=\"Symbols\" xml:lang=\"ar-SA\">\u03bc<\/span><span class=\"Subscript SmallText\">F<\/span> = <span class=\"Symbols\" xml:lang=\"ar-SA\">\u03bc<\/span><span class=\"Subscript SmallText\">T<\/span><\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">H<sub>1<\/sub>: at least one of the means is different<\/p>\r\n<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\nThe p-value for the F-test was 0.000229, which is less than our 5% level of significance. We rejected the null hypothesis and had enough evidence to support the claim that at least one of the means was significantly different from another. We will use Bonferroni and Tukey\u2019s methods for multiple comparisons in order to determine which mean(s) is different.\r\n<h2>Bonferroni Multiple Comparison Method<\/h2>\r\nA Bonferroni confidence interval is computed for each pair-wise comparison. For <em>k<\/em> populations, there will be <em>k<\/em>(<em>k<\/em>-1)\/2 multiple comparisons. The confidence interval takes the form of:\r\n<p class=\"Equation\"><span class=\"Inline-Equation\"><img class=\"frame-96 aligncenter\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171358\/8913.png\" alt=\"8913.png\" \/><\/span><\/p>\r\n<span class=\"Inline-Equation\"><img class=\"frame-96 aligncenter\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171400\/8920.png\" alt=\"8920.png\" \/><\/span>\r\n\r\nWhere MSE is from the analysis of variance table and the Bonferroni <em>t<\/em> critical value comes from the Bonferroni Table given below. 
Using the Bonferroni <em>t<\/em> critical value, instead of the Student <em>t<\/em> critical value, together with the MSE achieves a simultaneous confidence level of at least 95% for all intervals computed. The two means are judged to be significantly different if the corresponding interval does not include zero.\r\n\r\n[caption id=\"\" align=\"aligncenter\" width=\"629\"]<img class=\"frame-13\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171402\/8535.png\" alt=\"8535.png\" width=\"629\" height=\"715\" \/> Table 5. Bonferroni t-critical values.[\/caption]\r\n\r\nFor this problem, <em>k<\/em> = 3, so there are <em>k<\/em>(<em>k<\/em> - 1)\/2 = 3(3 - 1)\/2 = 3 multiple comparisons. The degrees of freedom are equal to N - <em>k<\/em> = 18 - 3 = 15. The Bonferroni critical value is 2.69.\r\n<p class=\"Centered\"><span class=\"Inline-Equation\"><img class=\"frame-13 aligncenter\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171404\/8942.png\" alt=\"8942.png\" \/><\/span><\/p>\r\n<p class=\"Centered\"><span class=\"Inline-Equation\"><img class=\"frame-13 aligncenter\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171407\/9310.png\" alt=\"9310.png\" \/><\/span><\/p>\r\n<p class=\"Centered\"><span class=\"Inline-Equation\"><img class=\"frame-13 aligncenter\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171409\/8960.png\" alt=\"8960.png\" \/><\/span><\/p>\r\nThe first confidence interval contains all positive values. This tells you that there is a significant difference between the two means and that the mean rain pH for Alaska is significantly greater than the mean rain pH for Florida.\r\n\r\nThe second confidence interval contains all negative values. 
This tells you that there is a significant difference between the two means and that the mean rain pH of Alaska is significantly lower than the mean rain pH of Texas.\r\n\r\nThe third confidence interval also contains all negative values. This tells you that there is a significant difference between the two means and that the mean rain pH of Florida is significantly lower than the mean rain pH of Texas.\r\n\r\nAll three states have significantly different levels of rain pH. Texas has the highest rain pH, followed by Alaska, and then Florida, which has the lowest mean rain pH level. You can use the confidence intervals to estimate the mean difference between the states. For example, the average rain pH in Texas ranges from 0.5262 to 1.5138 higher than the average rain pH in Florida.\r\n\r\nNow let\u2019s use the Tukey method for multiple comparisons. We are going to let software compute the values for us. Excel doesn\u2019t perform multiple comparisons, so we are going to rely on Minitab output.\r\n<p class=\"Centered\"><img class=\"frame-64 aligncenter\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171412\/095_fmt.png\" alt=\"095.tif\" \/><\/p>\r\n\r\n<h4>One-way ANOVA: pH vs. 
state<\/h4>\r\n<table class=\"Table\"><colgroup> <col \/> <col \/> <col \/> <col \/> <col \/> <col \/><\/colgroup>\r\n<tbody>\r\n<tr>\r\n<td class=\"Table\">\r\n<p class=\"Table\">Source<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">DF<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">SS<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">MS<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">F<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">P<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr>\r\n<td class=\"Table\">\r\n<p class=\"Table\">state<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">2<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">3.121<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">1.561<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">15.4<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">0.000<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr>\r\n<td class=\"Table\">\r\n<p class=\"Table\">Error<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">15<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">1.517<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">0.101<\/p>\r\n<\/td>\r\n<td class=\"Table\"><\/td>\r\n<td class=\"Table\"><\/td>\r\n<\/tr>\r\n<tr>\r\n<td class=\"Table\">\r\n<p class=\"Table\">Total<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">17<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">4.638<\/p>\r\n<\/td>\r\n<td class=\"Table\"><\/td>\r\n<td class=\"Table\"><\/td>\r\n<td class=\"Table\"><\/td>\r\n<\/tr>\r\n<tr>\r\n<td class=\"Table\" colspan=\"2\">\r\n<p class=\"Table\">S = 0.3180<\/p>\r\n<\/td>\r\n<td class=\"Table\" colspan=\"2\">\r\n<p class=\"Table\">R-Sq = 67.29%<\/p>\r\n<\/td>\r\n<td class=\"Table\" colspan=\"2\">\r\n<p class=\"Table\">R-Sq(adj) = 62.93%<\/p>\r\n<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\nWe have seen this part of the output before. 
We now want to focus on the <em>Grouping Information Using Tukey Method.<\/em> All three states have different letters, indicating that the mean rain pH for each state is significantly different. They are also listed from highest to lowest. It is easy to see that Texas has the highest mean rain pH, while Florida has the lowest.\r\n<h4>Grouping Information Using Tukey Method<\/h4>\r\n<table class=\"Table\"><colgroup> <col \/> <col \/> <col \/> <col \/><\/colgroup>\r\n<tbody>\r\n<tr>\r\n<td class=\"Table\">\r\n<p class=\"Table\">state<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">N<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">Mean<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">Grouping<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr>\r\n<td class=\"Table\">\r\n<p class=\"Table\">Texas<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">6<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">5.5367<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">A<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr>\r\n<td class=\"Table\">\r\n<p class=\"Table\">Alaska<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">6<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">5.0333<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">B<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr>\r\n<td class=\"Table\">\r\n<p class=\"Table\">Florida<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">6<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">4.5167<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">C<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr>\r\n<td class=\"Table\" colspan=\"4\">\r\n<p class=\"Table\">Means that do not share a letter are significantly different.<\/p>\r\n<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\nThis next set of confidence intervals is similar to the Bonferroni confidence intervals. They estimate the difference of each pair of means. 
The individual confidence interval level is set at 97.97% instead of 95% thus controlling the experiment-wise error rate.\r\n<table class=\"Table\"><colgroup> <col \/><\/colgroup>\r\n<tbody>\r\n<tr>\r\n<td class=\"Table\">\r\n<p class=\"Table\">Tukey 95% Simultaneous Confidence Intervals<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr>\r\n<td class=\"Table\">\r\n<p class=\"Table\">All Pairwise Comparisons among Levels of state<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr>\r\n<td class=\"Table\">\r\n<p class=\"Table\">Individual confidence level = <strong class=\"Strong-2\">97.97%<\/strong><\/p>\r\n<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<table class=\"Table\"><colgroup> <col \/> <col \/> <col \/> <col \/> <col \/> <col \/> <col \/> <col \/><\/colgroup>\r\n<tbody>\r\n<tr>\r\n<td class=\"Table\" colspan=\"8\">\r\n<p class=\"Table\">state = Alaska subtracted from:<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr>\r\n<td class=\"Table\">\r\n<p class=\"Table\">state<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">Lower<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">Center<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">Upper<\/p>\r\n<\/td>\r\n<td class=\"Table\" colspan=\"4\">\r\n<p class=\"Table\">---------+---------+---------+---------+<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr>\r\n<td class=\"Table\">\r\n<p class=\"Table\">Florida<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">-0.9931<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">-0.5167<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">-0.0402<\/p>\r\n<\/td>\r\n<td class=\"Table\"><\/td>\r\n<td class=\"Table\" colspan=\"2\">\r\n<p class=\"Table\">(-----*----)<\/p>\r\n<\/td>\r\n<td class=\"Table\"><\/td>\r\n<\/tr>\r\n<tr>\r\n<td class=\"Table\">\r\n<p class=\"Table\">Texas<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">0.0269<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">0.5033<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p 
class=\"Table\">0.9798<\/p>\r\n<\/td>\r\n<td class=\"Table\"><\/td>\r\n<td class=\"Table\"><\/td>\r\n<td class=\"Table\" colspan=\"2\">\r\n<p class=\"Table\">(-----*-----)<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr>\r\n<td class=\"Table\"><\/td>\r\n<td class=\"Table\"><\/td>\r\n<td class=\"Table\"><\/td>\r\n<td class=\"Table\"><\/td>\r\n<td class=\"Table\" colspan=\"4\">\r\n<p class=\"Table\">---------+---------+---------+---------+<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr>\r\n<td class=\"Table\"><\/td>\r\n<td class=\"Table\"><\/td>\r\n<td class=\"Table\"><\/td>\r\n<td class=\"Table\"><\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">-0.80<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">0.00<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">0.80<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">1.60<\/p>\r\n<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<table class=\"Table\"><colgroup> <col \/> <col \/> <col \/> <col \/> <col \/> <col \/> <col \/> <col \/><\/colgroup>\r\n<tbody>\r\n<tr>\r\n<td class=\"Table\" colspan=\"8\">\r\n<p class=\"Table\">state = Florida subtracted from:<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr>\r\n<td class=\"Table\">\r\n<p class=\"Table\">state<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">Lower<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">Center<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">Upper<\/p>\r\n<\/td>\r\n<td class=\"Table\" colspan=\"4\">\r\n<p class=\"Table\">---------+---------+---------+---------+<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr>\r\n<td class=\"Table\">\r\n<p class=\"Table\">Texas<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">0.5435<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">1.0200<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">1.4965<\/p>\r\n<\/td>\r\n<td class=\"Table\"><\/td>\r\n<td class=\"Table\"><\/td>\r\n<td class=\"Table\" colspan=\"2\">\r\n<p class=\"Table\">(-----*-----)<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr>\r\n<td 
class=\"Table\"><\/td>\r\n<td class=\"Table\"><\/td>\r\n<td class=\"Table\"><\/td>\r\n<td class=\"Table\"><\/td>\r\n<td class=\"Table\" colspan=\"4\">\r\n<p class=\"Table\">---------+---------+---------+---------+<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr>\r\n<td class=\"Table\"><\/td>\r\n<td class=\"Table\"><\/td>\r\n<td class=\"Table\"><\/td>\r\n<td class=\"Table\"><\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">-0.80<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">0.00<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">0.80<\/p>\r\n<\/td>\r\n<td class=\"Table\">\r\n<p class=\"Table\">1.60<\/p>\r\n<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\nThe first pairing is Florida \u2013 Alaska, which results in an interval of (-0.9931, -0.0402). The interval has all negative values, indicating that the mean for Florida is significantly lower than the mean for Alaska. The second pairing is Texas \u2013 Alaska, which results in an interval of (0.0269, 0.9798). The interval has all positive values, indicating that the mean for Texas is significantly greater than the mean for Alaska. The third pairing is Texas \u2013 Florida, which results in an interval of (0.5435, 1.4965). All positive values indicate that the mean for Texas is significantly greater than the mean for Florida.\r\n\r\nThe intervals are similar to the Bonferroni intervals, with differences in width due to the methods used. In both cases, the same conclusions are reached.\r\n\r\nWhen we use one-way ANOVA and conclude that the differences among the means are significant, we can\u2019t be absolutely sure that the given factor is responsible for the differences. It is possible that the variation of some other unknown factor is responsible. One way to reduce the effect of extraneous factors is to design an experiment so that it has a completely randomized design. This means that each element has an equal probability of receiving any treatment or belonging to any different group. 
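A completely randomized assignment like the one just described can be sketched in a few lines of Python. This is a minimal illustration, not part of the chapter's software examples; the plot numbering, treatment labels, and seed are hypothetical.

```python
import random

def completely_randomized_design(units, treatments, seed=None):
    """Assign each experimental unit to a treatment completely at random.

    Every unit has an equal probability of receiving any treatment;
    len(units) must be divisible by len(treatments) for equal group sizes.
    """
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    size = len(shuffled) // len(treatments)
    return {t: shuffled[i * size:(i + 1) * size]
            for i, t in enumerate(treatments)}

# Example: 18 plots assigned to three herbicide treatments (as in the
# chapter's opening example), 6 plots per treatment
plan = completely_randomized_design(range(1, 19), ["A", "B", "C"], seed=42)
for treatment, plots in plan.items():
    print(treatment, sorted(plots))
```

Because the assignment is driven entirely by the shuffle, every plot is equally likely to end up in any treatment group, which is exactly the property the text describes.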
In general good results require that the experiment be carefully designed and executed.\r\n\r\nAdditional example:\r\n\r\nhttps:\/\/youtu.be\/BMyYXc8cWHs\r\n\r\n<\/div>","rendered":"<div class=\"Basic-Text-Frame\">\n<p>Previously, we have tested hypotheses about two population means. This chapter examines methods for comparing more than two means. Analysis of variance (ANOVA) is an inferential method used to test the equality of three or more population means.<\/p>\n<p class=\"Centered\">H<span class=\"Subscript SmallText\">0<\/span>: \u00b5<span class=\"Subscript SmallText\">1<\/span>= \u00b5<span class=\"Subscript SmallText\">2<\/span>= \u00b5<span class=\"Subscript SmallText\">3<\/span>= \u2026=\u00b5<span class=\"Subscript SmallText\">k<\/span><\/p>\n<p>This method is also referred to as single-factor ANOVA because we use a single property, or characteristic, for categorizing the populations. This characteristic is sometimes referred to as a treatment or factor.<\/p>\n<p class=\"Callout\"><span class=\"pullquote-left\">A treatment (or factor) is a property, or characteristic, that allows us to distinguish the different populations from one another.<\/span><\/p>\n<p>The objects of ANOVA are (1) estimate treatment means, and the differences of treatment means; (2) test hypotheses for statistical significance of comparisons of treatment means, where \u201ctreatment\u201d or \u201cfactor\u201d is the characteristic that distinguishes the populations.<\/p>\n<p>For example, a biologist might compare the effect that three different herbicides may have on seed production of an invasive species in a forest environment. The biologist would want to estimate the mean annual seed production under the three different treatments, while also testing to see which treatment results in the lowest annual seed production. 
The null and alternative hypotheses are:<\/p>\n<table class=\"no-lines\">\n<colgroup>\n<col \/>\n<col \/><\/colgroup>\n<tbody>\n<tr>\n<td class=\"Table\">H<span class=\"Subscript SmallText\">0<\/span>: \u00b5<span class=\"Subscript SmallText\">1<\/span>= \u00b5<span class=\"Subscript SmallText\">2<\/span>= \u00b5<span class=\"Subscript SmallText\">3<\/span><\/td>\n<td class=\"Table\">H<span class=\"Subscript SmallText\">1<\/span>: at least one of the means is significantly different from the others<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>It would be tempting to test this null hypothesis H<span class=\"Subscript SmallText\">0<\/span>: \u00b5<span class=\"Subscript SmallText\">1<\/span>= \u00b5<span class=\"Subscript SmallText\">2<\/span>= \u00b5<span class=\"Subscript SmallText\">3<\/span> by comparing the population means two at a time. If we continue this way, we would need to test three different pairs of hypotheses:<\/p>\n<table class=\"no-lines\">\n<colgroup>\n<col \/>\n<col \/>\n<col \/>\n<col \/>\n<col \/><\/colgroup>\n<tbody>\n<tr>\n<td class=\"Table\">H<span class=\"Subscript SmallText\">0<\/span>: \u00b5<span class=\"Subscript SmallText\">1<\/span>= \u00b5<span class=\"Subscript SmallText\">2<\/span><\/td>\n<td class=\"Table\">AND<\/td>\n<td class=\"Table\">H<span class=\"Subscript SmallText\">0<\/span>: \u00b5<span class=\"Subscript SmallText\">1<\/span>= \u00b5<span class=\"Subscript SmallText\">3<\/span><\/td>\n<td class=\"Table\">AND<\/td>\n<td class=\"Table\">H<span class=\"Subscript SmallText\">0<\/span>: \u00b5<span class=\"Subscript SmallText\">2<\/span>= \u00b5<span class=\"Subscript SmallText\">3<\/span><\/td>\n<\/tr>\n<tr>\n<td class=\"Table\">H<span class=\"Subscript SmallText\">1<\/span>: \u00b5<span class=\"Subscript SmallText\">1<\/span>\u2260 \u00b5<span class=\"Subscript SmallText\">2<\/span><\/td>\n<td class=\"Table\"><span class=\"Subscript SmallText\">\u00a0<\/span><\/td>\n<td class=\"Table\">H<span class=\"Subscript 
SmallText\">1<\/span>: \u00b5<span class=\"Subscript SmallText\">1<\/span>\u2260 \u00b5<span class=\"Subscript SmallText\">3<\/span><\/td>\n<td class=\"Table\"><span class=\"Subscript SmallText\">\u00a0<\/span><\/td>\n<td class=\"Table\">H<span class=\"Subscript SmallText\">1<\/span>: \u00b5<span class=\"Subscript SmallText\">2<\/span>\u2260 \u00b5<span class=\"Subscript SmallText\">3<\/span><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>If we used a 5% level of significance, each test would have a probability of a Type I error (rejecting the null hypothesis when it is true) of <span class=\"Symbols\" xml:lang=\"ar-SA\">\u03b1<\/span> = 0.05. Each test would have a 95% probability of correctly not rejecting the null hypothesis. Assuming the tests are independent, the probability that all three tests correctly do not reject the null hypothesis is 0.95<span class=\"Superscript SmallText\">3<\/span> = 0.86. There is a 1 &#8211; 0.95<span class=\"Superscript SmallText\">3<\/span> = 0.14 (14%) probability that at least one test will lead to an incorrect rejection of the null hypothesis. A 14% probability of a Type I error is much higher than the desired alpha of 5% (remember: <span class=\"Symbols\" xml:lang=\"ar-SA\">\u03b1<\/span> is the probability of a Type I error). As the number of populations increases, the probability of making a Type I error using multiple t-tests also increases. 
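A quick Python check of this arithmetic. The helper function is ours, not from the text, and it assumes the tests are independent, as the calculation above does.

```python
def familywise_error(alpha, m):
    """P(at least one Type I error) across m independent tests at level alpha."""
    return 1 - (1 - alpha) ** m

# Three pairwise tests at alpha = 0.05, as computed in the text:
print(round(familywise_error(0.05, 3), 2))  # 0.14

# k populations require k(k - 1)/2 pairwise tests, so the risk grows quickly:
for k in (3, 4, 5, 6):
    m = k * (k - 1) // 2
    print(k, "groups:", m, "tests, familywise error",
          round(familywise_error(0.05, m), 2))
```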
Analysis of variance allows us to test the null hypothesis (all means are equal) against the alternative hypothesis (at least one mean is different) with a specified value of <span class=\"Symbols\" xml:lang=\"ar-SA\">\u03b1<\/span>.<\/p>\n<p>The assumptions for ANOVA are (1) the observations in each treatment group represent a random sample from that population; (2) each of the populations is normally distributed; (3) the population variances for each treatment group are homogeneous (i.e., <span class=\"Inline-Equation\"><img decoding=\"async\" class=\"frame-46\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171319\/Image37184_fmt.png\" alt=\"Image37184.PNG\" \/><\/span>). We can easily test the normality of the samples by creating a normal probability plot; however, verifying homogeneous variances can be more difficult. A general rule of thumb is as follows: <em>One-way ANOVA may be used if the largest sample standard deviation is no more than twice the smallest sample standard deviation.<\/em><\/p>\n<p>In the previous chapter, we used a two-sample t-test to compare the means from two independent samples with a common variance. The sample data are used to compute the test statistic:<\/p>\n<p class=\"Centered\"><span class=\"Inline-Equation-Large\"><img decoding=\"async\" class=\"frame-45\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171320\/Image37204_fmt.png\" alt=\"Image37204.PNG\" \/><\/span> where <span class=\"Inline-Equation\"><img decoding=\"async\" class=\"frame-10\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171321\/Image37227_fmt.png\" alt=\"Image37227.PNG\" \/><\/span><\/p>\n<p>is the pooled estimate of the common population variance \u03c3<span class=\"Superscript SmallText\">2<\/span>. 
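That pooled estimate is simply a degrees-of-freedom-weighted average of the two sample variances; a small sketch with made-up sample sizes and variances (the numbers are hypothetical, chosen only for illustration):

```python
def pooled_variance(n1, s1_sq, n2, s2_sq):
    """Pooled estimate of the common variance from two independent samples:
    sp^2 = ((n1 - 1)*s1^2 + (n2 - 1)*s2^2) / (n1 + n2 - 2)
    """
    return ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)

# Hypothetical samples: n1 = 10 with s1^2 = 4, and n2 = 16 with s2^2 = 9
print(pooled_variance(10, 4, 16, 9))  # 7.125
```

Note that the pooled value lands between the two sample variances, closer to the variance from the larger sample because it carries more degrees of freedom.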
To test more than two populations, we must extend this idea of pooled variance to include all samples as shown below:<\/p>\n<p class=\"Centered\"><img decoding=\"async\" class=\"frame-172 aligncenter\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171323\/Image37244_fmt.png\" alt=\"Image37244.PNG\" \/><\/p>\n<p>where S<span class=\"Subscript SmallText\">w<\/span><span class=\"Superscript SmallText\">2<\/span> represents the pooled estimate of the common variance \u03c3<span class=\"Superscript SmallText\">2<\/span>, and it measures the variability of the observations within the different populations <strong class=\"Strong-2\">whether or not H<\/strong><strong><span class=\"Subscript SmallText\">0<\/span> <\/strong><strong class=\"Strong-2\">is true<\/strong>. This is often referred to as the variance within samples (variation due to error).<\/p>\n<p>If the null hypothesis IS true (all the means are equal), then all the populations are the same, with a common mean <span class=\"Symbols\" xml:lang=\"ar-SA\">\u03bc<\/span> and variance <span class=\"Symbols\" xml:lang=\"ar-SA\">\u03c3<\/span><span class=\"Superscript SmallText\">2<\/span>. Instead of randomly selecting different samples from different populations, we are actually drawing <em>k<\/em> different samples from one population. We know that the sampling distribution for <em>k<\/em> means based on <em>n<\/em> observations will have mean <span class=\"Symbols\" xml:lang=\"ar-SA\">\u03bc<span class=\"Subscript SmallText\"><em>x\u0304<\/em><\/span><\/span> and variance <span class=\"Symbols\" xml:lang=\"ar-SA\">\u03c3<\/span><span class=\"Superscript SmallText\">2<\/span>\/n (squared standard error). 
Since we have drawn <em>k<\/em> samples of <em>n<\/em> observations each, we can estimate the variance of the k sample means (<span class=\"Symbols\" xml:lang=\"ar-SA\">\u03c3<\/span><span class=\"Superscript SmallText\">2<\/span>\/n) by<\/p>\n<p class=\"Centered\"><img decoding=\"async\" class=\"frame-11 aligncenter\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171325\/8840.png\" alt=\"8840.png\" \/><\/p>\n<p>Consequently, <em>n<\/em> times the sample variance of the means estimates <span class=\"Symbols\" xml:lang=\"ar-SA\">\u03c3<\/span><span class=\"Superscript SmallText\">2<\/span>. We designate this quantity as S<span class=\"Subscript SmallText\">B<\/span><span class=\"Superscript SmallText\">2<\/span> such that<\/p>\n<p class=\"Centered\"><img decoding=\"async\" class=\"frame-73 aligncenter\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171327\/8847.png\" alt=\"8847.png\" \/><\/p>\n<p>where S<sub>B<sup>2<\/sup><\/sub>\u00a0is also an unbiased estimate of the common variance <span class=\"Symbols\" xml:lang=\"ar-SA\">\u03c3<\/span><span class=\"Superscript SmallText\">2<\/span>, IF H<span class=\"Subscript SmallText\">0<\/span> IS TRUE. This is often referred to as the variance between samples (variation due to treatment).<\/p>\n<p>Under the null hypothesis that all <em>k<\/em> populations are identical, we have two estimates of <span class=\"Symbols\" xml:lang=\"ar-SA\">\u03c3<\/span><span class=\"Superscript SmallText\">2<\/span> (S<span class=\"Subscript SmallText\">W<\/span><span class=\"Superscript SmallText\">2<\/span> and S<span class=\"Subscript SmallText\">B<\/span><span class=\"Superscript SmallText\">2<\/span>). 
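To make the between-sample estimate concrete, here is a minimal sketch: multiplying the sample variance of the <em>k<\/em> group means by <em>n<\/em> recovers an estimate of \u03c3<span class=\"Superscript SmallText\">2<\/span> when the null hypothesis holds. The means used below are the rain pH sample means from Example 1 later in this chapter:

```python
k, n = 3, 6                          # three groups of six observations each
means = [5.0333, 4.5167, 5.5367]     # sample means (rain pH example)

grand = sum(means) / k
# Sample variance of the k means estimates sigma^2 / n under H0 ...
var_of_means = sum((m - grand) ** 2 for m in means) / (k - 1)
# ... so n times it estimates sigma^2: the between-sample variance S_B^2
s_b_sq = n * var_of_means
```

This reproduces the between-sample variance (MSTr) of about 1.561 that appears in the ANOVA table for Example 1.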
We can use the ratio S<span class=\"Subscript SmallText\">B<\/span><span class=\"Superscript SmallText\">2<\/span>\/ S<span class=\"Subscript SmallText\">W<\/span><span class=\"Superscript SmallText\">2<\/span> as a test statistic for the null hypothesis H<span class=\"Subscript SmallText\">0<\/span>: \u00b5<span class=\"Subscript SmallText\">1<\/span>= \u00b5<span class=\"Subscript SmallText\">2<\/span>= \u00b5<span class=\"Subscript SmallText\">3<\/span>= \u2026= \u00b5<span class=\"Subscript SmallText\">k<\/span>. Under the null hypothesis, this ratio follows an F-distribution with degrees of freedom df<span class=\"Subscript SmallText\">1<\/span>= k &#8211; 1 and df<span class=\"Subscript SmallText\">2<\/span> = N &#8211; <em>k<\/em>, where <em>k<\/em> is the number of populations and N is the total number of observations (N = n<span class=\"Subscript SmallText\">1<\/span> + n<span class=\"Subscript SmallText\">2<\/span>+\u2026+ n<span class=\"Subscript SmallText\">k<\/span>). The numerator of the test statistic measures the variation between sample means. The estimate of the variance in the denominator depends only on the sample variances and is not affected by the differences among the sample means.<\/p>\n<p>When the null hypothesis is true, the ratio of S<span class=\"Subscript SmallText\">B<\/span><span class=\"Superscript SmallText\">2<\/span> and S<span class=\"Subscript SmallText\">W<\/span><span class=\"Superscript SmallText\">2<\/span> will be close to 1. When the null hypothesis is false, S<span class=\"Subscript SmallText\">B<\/span><span class=\"Superscript SmallText\">2<\/span> will tend to be larger than S<span class=\"Subscript SmallText\">W<\/span><span class=\"Superscript SmallText\">2<\/span> due to the differences among the populations. 
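The whole computation can be sketched from summary statistics alone; the group sizes, means, and variances below are those of Example 1 later in this chapter:

```python
ns    = [6, 6, 6]                      # group sizes
means = [5.0333, 4.5167, 5.5367]       # group sample means
varis = [0.0265, 0.1193, 0.1575]       # group sample variances

k, N = len(ns), sum(ns)
grand = sum(n * m for n, m in zip(ns, means)) / N

# Between-sample variance S_B^2 (MSTr): n-weighted squared deviations
# of the group means from the grand mean, divided by k - 1
s_b_sq = sum(n * (m - grand) ** 2 for n, m in zip(ns, means)) / (k - 1)
# Within-sample variance S_W^2 (MSE): pooled group variances
s_w_sq = sum((n - 1) * v for n, v in zip(ns, varis)) / (N - k)

F = s_b_sq / s_w_sq                    # compare with F(k - 1, N - k)
df1, df2 = k - 1, N - k
```

This yields F close to the 15.44 reported for Example 1, with df<span class=\"Subscript SmallText\">1<\/span> = 2 and df<span class=\"Subscript SmallText\">2<\/span> = 15.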
We will reject the null hypothesis if the F test statistic is larger than the F critical value at a given level of significance (or if the p-value is less than the level of significance).<\/p>\n<p>Tables are a convenient format for summarizing the key results in ANOVA calculations. The following one-way ANOVA table illustrates the required computations and the relationships between the various ANOVA table elements.<\/p>\n<div style=\"width: 911px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"frame-13\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171329\/8636.png\" alt=\"8636.png\" width=\"901\" height=\"253\" \/><\/p>\n<p class=\"wp-caption-text\">Table 1. One-way ANOVA table.<\/p>\n<\/div>\n<p>The sum of squares for the ANOVA table has the relationship of SSTo = SSTr + SSE where:<\/p>\n<p class=\"Centered\"><span class=\"Inline-Equation\"><img decoding=\"async\" class=\"frame-26\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171331\/8869.png\" alt=\"8869.png\" \/><\/span>\u00a0\u00a0\u00a0 <span class=\"Inline-Equation\"><img decoding=\"async\" class=\"frame-51\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171332\/8882.png\" alt=\"8882.png\" \/><\/span> \u00a0\u00a0\u00a0<span class=\"Inline-Equation\"><img decoding=\"async\" class=\"frame-7\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171334\/8890.png\" alt=\"8890.png\" \/><\/span><\/p>\n<p class=\"Centered\"><strong class=\"Strong-2\">Total variation (SSTo) = explained variation (SSTr) + unexplained variation (SSE)<\/strong><\/p>\n<p>The degrees of freedom also have a similar relationship: df<span class=\"Subscript SmallText\">(SSTo)<\/span> = df<span class=\"Subscript SmallText\">(SSTr)<\/span> + df<span class=\"Subscript 
(SSE)">
SmallText\">(SSE)<\/span><\/p>\n<p>The Mean Sum of Squares for the treatment and error are found by dividing the Sums of Squares by the degrees of freedom for each. While the Sums of Squares are additive, the Mean Sums of Squares are not. The F-statistic is then found by dividing the Mean Sum of Squares for the treatment (MSTr) by the Mean Sum of Squares for the error (MSE). The MSTr corresponds to S<span class=\"Subscript SmallText\">B<\/span><span class=\"Superscript SmallText\">2<\/span> and the MSE to S<span class=\"Subscript SmallText\">W<\/span><span class=\"Superscript SmallText\">2<\/span>.<\/p>\n<p class=\"Centered\" style=\"text-align: center\"><strong class=\"Strong-2\">F = S<sub>B<\/sub><\/strong><span class=\"Superscript SmallText\">2<\/span><strong class=\"Strong-2\">\/ S<sub>w<\/sub><\/strong><span class=\"Superscript SmallText\">2<\/span> <strong class=\"Strong-2\">= MSTr\/MSE<\/strong><\/p>\n<div class=\"textbox examples\">\n<h3>Example 1<\/h3>\n<p class=\"ExampleHeading\">An environmentalist wanted to determine if the mean acidity of rain differed among Alaska, Florida, and Texas. He randomly selected six rain dates at each site and obtained the following data:<\/p>\n<div style=\"width: 665px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"frame-47\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171336\/8997.png\" alt=\"8997.png\" width=\"655\" height=\"423\" \/><\/p>\n<p class=\"wp-caption-text\">Table 2. 
Data for Alaska, Florida, and Texas.<\/p>\n<\/div>\n<p class=\"Example\">H<span class=\"Subscript SmallText\">0<\/span>: <span class=\"Symbols\" xml:lang=\"ar-SA\">\u03bc<\/span><span class=\"Subscript SmallText\">A<\/span> = <span class=\"Symbols\" xml:lang=\"ar-SA\">\u03bc<\/span><span class=\"Subscript SmallText\">F<\/span> = <span class=\"Symbols\" xml:lang=\"ar-SA\">\u03bc<\/span><span class=\"Subscript SmallText\">T<\/span> H<span class=\"Subscript SmallText\">1<\/span>: at least one of the means is different<\/p>\n<table class=\"Table\" style=\"margin-left: 23px;width: 705.141px\">\n<colgroup>\n<col \/>\n<col \/>\n<col \/>\n<col \/>\n<col \/><\/colgroup>\n<tbody>\n<tr style=\"height: 29px\">\n<td class=\"Table\" style=\"width: 80px;height: 29px\">\n<p class=\"Table\">State<\/p>\n<\/td>\n<td class=\"Table\" style=\"width: 124px;height: 29px\">\n<p class=\"Table\">Sample size<\/p>\n<\/td>\n<td class=\"Table\" style=\"width: 131px;height: 29px\">\n<p class=\"Table\">Sample total<\/p>\n<\/td>\n<td class=\"Table\" style=\"width: 143px;height: 29px\">\n<p class=\"Table\">Sample mean<\/p>\n<\/td>\n<td class=\"Table\" style=\"width: 172.141px;height: 29px\">\n<p class=\"Table\">Sample variance<\/p>\n<\/td>\n<\/tr>\n<tr style=\"height: 29.8125px\">\n<td class=\"Table\" style=\"width: 80px;height: 29.8125px\">\n<p class=\"Table\">Alaska<\/p>\n<\/td>\n<td class=\"Table\" style=\"width: 124px;height: 29.8125px\">\n<p class=\"Table\">n<span class=\"Subscript SmallText\">1<\/span> = 6<\/p>\n<\/td>\n<td class=\"Table\" style=\"width: 131px;height: 29.8125px\">\n<p class=\"Table\">30.2<\/p>\n<\/td>\n<td class=\"Table\" style=\"width: 143px;height: 29.8125px\">\n<p class=\"Table\">5.033<\/p>\n<\/td>\n<td class=\"Table\" style=\"width: 172.141px;height: 29.8125px\">\n<p class=\"Table\">0.0265<\/p>\n<\/td>\n<\/tr>\n<tr style=\"height: 29px\">\n<td class=\"Table\" style=\"width: 80px;height: 29px\">\n<p class=\"Table\">Florida<\/p>\n<\/td>\n<td class=\"Table\" style=\"width: 
124px;height: 29px\">\n<p class=\"Table\">n<span class=\"Subscript SmallText\">2<\/span> = 6<\/p>\n<\/td>\n<td class=\"Table\" style=\"width: 131px;height: 29px\">\n<p class=\"Table\">27.1<\/p>\n<\/td>\n<td class=\"Table\" style=\"width: 143px;height: 29px\">\n<p class=\"Table\">4.517<\/p>\n<\/td>\n<td class=\"Table\" style=\"width: 172.141px;height: 29px\">\n<p class=\"Table\">0.1193<\/p>\n<\/td>\n<\/tr>\n<tr style=\"height: 29px\">\n<td class=\"Table\" style=\"width: 80px;height: 29px\">\n<p class=\"Table\">Texas<\/p>\n<\/td>\n<td class=\"Table\" style=\"width: 124px;height: 29px\">\n<p class=\"Table\">n<span class=\"Subscript SmallText\">3<\/span> = 6<\/p>\n<\/td>\n<td class=\"Table\" style=\"width: 131px;height: 29px\">\n<p class=\"Table\">33.22<\/p>\n<\/td>\n<td class=\"Table\" style=\"width: 143px;height: 29px\">\n<p class=\"Table\">5.537<\/p>\n<\/td>\n<td class=\"Table\" style=\"width: 172.141px;height: 29px\">\n<p class=\"Table\">0.1575<\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p class=\"Caption\" style=\"text-align: center\"><em>Table 3. Summary Table.<\/em><\/p>\n<p class=\"Example\">Notice that there are differences among the sample means. Are the differences small enough to be explained solely by sampling variability? Or are they of sufficient magnitude so that a more reasonable explanation is that the <span class=\"Symbols\" xml:lang=\"ar-SA\">\u03bc<\/span>\u2019s are not all equal? 
The conclusion depends on how the variation among the sample means (based on their deviations from the grand mean) compares to the variation within the three samples.<\/p>\n<p class=\"Example\">The grand mean is equal to the sum of all observations divided by the total sample size:<\/p>\n<p class=\"Example\"><span class=\"Inline-Equation\"><img decoding=\"async\" class=\"frame-5\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171337\/8898.png\" alt=\"8898.png\" \/><\/span> = grand total\/N = 90.52\/18 = 5.0289<\/p>\n<p class=\"Example\">SSTo = (5.11-5.0289)<span class=\"Superscript SmallText\">2<\/span> + (5.01-5.0289)<span class=\"Superscript SmallText\">2<\/span> +\u2026+(5.24-5.0289)<span class=\"Superscript SmallText\">2<\/span><\/p>\n<p class=\"Example\">+ (4.87-5.0289)<span class=\"Superscript SmallText\">2<\/span> + (4.18-5.0289)<span class=\"Superscript SmallText\">2<\/span> +\u2026+(4.09-5.0289)<span class=\"Superscript SmallText\">2<\/span><\/p>\n<p class=\"Example\">+ (5.46-5.0289)<span class=\"Superscript SmallText\">2<\/span> + (6.29-5.0289)<span class=\"Superscript SmallText\">2<\/span> +\u2026+(5.30-5.0289)<span class=\"Superscript SmallText\">2<\/span> = 4.6384<\/p>\n<p class=\"Example\">SSTr = 6(5.033-5.0289)<span class=\"Superscript SmallText\">2<\/span> + 6(4.517-5.0289)<span class=\"Superscript SmallText\">2<\/span> + 6(5.537-5.0289)<span class=\"Superscript SmallText\">2<\/span> = 3.1214<\/p>\n<p class=\"Example\">SSE = SSTo \u2013 SSTr = 4.6384 \u2013 3.1214 = 1.5170<\/p>\n<div style=\"width: 986px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"frame-13\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171339\/8605.png\" alt=\"8605.png\" width=\"976\" height=\"269\" \/><\/p>\n<p class=\"wp-caption-text\">Table 4. 
One-way ANOVA Table.<\/p>\n<\/div>\n<p class=\"Example\">This test is based on df<span class=\"Subscript SmallText\">1<\/span> = k &#8211; 1 = 2 and df<span class=\"Subscript SmallText\">2<\/span> = N &#8211; k = 15. For <span class=\"Symbols\" xml:lang=\"ar-SA\">\u03b1<\/span> = 0.05, the F critical value is 3.68. Since the observed F = 15.4372 is greater than the F critical value of 3.68, we reject the null hypothesis. There is enough evidence to state that at least one of the means is different.<\/p>\n<\/div>\n<h2 class=\"ExampleHeading\">Software Solutions<\/h2>\n<h3>Minitab<\/h3>\n<p class=\"Centered\"><img decoding=\"async\" class=\"frame-67 aligncenter\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171342\/093_1_fmt.png\" alt=\"093_1.tif\" \/><img decoding=\"async\" class=\"frame-67 aligncenter\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171346\/093_2_fmt.png\" alt=\"093_2.tif\" \/><\/p>\n<h4>One-way ANOVA: pH vs. 
State<\/h4>\n<table class=\"Table\">\n<colgroup>\n<col \/>\n<col \/>\n<col \/>\n<col \/>\n<col \/>\n<col \/><\/colgroup>\n<tbody>\n<tr>\n<td class=\"Table\">\n<p class=\"Table\">Source<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">DF<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">SS<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">MS<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">F<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">P<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td class=\"Table\">\n<p class=\"Table\">State<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">2<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">3.121<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">1.561<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">15.43<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">0.000<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td class=\"Table\">\n<p class=\"Table\">Error<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">15<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">1.517<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">0.101<\/p>\n<\/td>\n<td class=\"Table\"><\/td>\n<td class=\"Table\"><\/td>\n<\/tr>\n<tr>\n<td class=\"Table\">\n<p class=\"Table\">Total<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">17<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">4.638<\/p>\n<\/td>\n<td class=\"Table\"><\/td>\n<td class=\"Table\"><\/td>\n<td class=\"Table\"><\/td>\n<\/tr>\n<tr>\n<td class=\"Table\" colspan=\"6\">\n<p class=\"Table\">S = 0.3180 R-Sq = 67.29% R-Sq(adj) = 62.93%<\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<table class=\"Table\">\n<colgroup>\n<col \/>\n<col \/>\n<col \/>\n<col \/>\n<col \/>\n<col \/>\n<col \/>\n<col \/>\n<col \/><\/colgroup>\n<tbody>\n<tr style=\"height: 43.4844px\">\n<td class=\"Table\" style=\"height: 43.4844px\"><\/td>\n<td class=\"Table\" style=\"height: 43.4844px\"><\/td>\n<td class=\"Table\" style=\"height: 43.4844px\"><\/td>\n<td class=\"Table\" style=\"height: 43.4844px\" colspan=\"6\">\n<p 
class=\"Table\">Individual 95% CIs For Mean Based on Pooled StDev<\/p>\n<\/td>\n<\/tr>\n<tr style=\"height: 43px\">\n<td class=\"Table\" style=\"height: 43px\">\n<p class=\"Table\">Level<\/p>\n<\/td>\n<td class=\"Table\" style=\"height: 43px\">\n<p class=\"Table\">N<\/p>\n<\/td>\n<td class=\"Table\" style=\"height: 43px\">\n<p class=\"Table\">Mean<\/p>\n<\/td>\n<td class=\"Table\" style=\"height: 43px\">\n<p class=\"Table\">StDev<\/p>\n<\/td>\n<td class=\"Table\" style=\"height: 43px\" colspan=\"5\">\n<p class=\"Table\">&#8212;-+&#8212;&#8212;&#8212;+&#8212;&#8212;&#8212;+&#8212;&#8212;&#8212;+&#8212;&#8211;<\/p>\n<\/td>\n<\/tr>\n<tr style=\"height: 44px\">\n<td class=\"Table\" style=\"height: 44px\">\n<p class=\"Table\">Alaska<\/p>\n<\/td>\n<td class=\"Table\" style=\"height: 44px\">\n<p class=\"Table\">6<\/p>\n<\/td>\n<td class=\"Table\" style=\"height: 44px\">\n<p class=\"Table\">5.0333<\/p>\n<\/td>\n<td class=\"Table\" style=\"height: 44px\">\n<p class=\"Table\">0.1629<\/p>\n<\/td>\n<td class=\"Table\" style=\"height: 44px\"><\/td>\n<td class=\"Table\" style=\"height: 44px\" colspan=\"2\">\n<p class=\"Table\">(&#8212;&#8212;*&#8212;&#8212;)<\/p>\n<\/td>\n<td class=\"Table\" style=\"height: 44px\"><\/td>\n<td class=\"Table\" style=\"height: 44px\"><\/td>\n<\/tr>\n<tr style=\"height: 44px\">\n<td class=\"Table\" style=\"height: 44px\">\n<p class=\"Table\">Florida<\/p>\n<\/td>\n<td class=\"Table\" style=\"height: 44px\">\n<p class=\"Table\">6<\/p>\n<\/td>\n<td class=\"Table\" style=\"height: 44px\">\n<p class=\"Table\">4.5167<\/p>\n<\/td>\n<td class=\"Table\" style=\"height: 44px\">\n<p class=\"Table\">0.3455<\/p>\n<\/td>\n<td class=\"Table\" style=\"height: 44px\" colspan=\"2\">\n<p class=\"Table\">(&#8212;&#8212;*&#8212;&#8212;)<\/p>\n<\/td>\n<td class=\"Table\" style=\"height: 44px\"><\/td>\n<td class=\"Table\" style=\"height: 44px\"><\/td>\n<td class=\"Table\" style=\"height: 44px\"><\/td>\n<\/tr>\n<tr style=\"height: 44px\">\n<td class=\"Table\" 
style=\"height: 44px\">\n<p class=\"Table\">Texas<\/p>\n<\/td>\n<td class=\"Table\" style=\"height: 44px\">\n<p class=\"Table\">6<\/p>\n<\/td>\n<td class=\"Table\" style=\"height: 44px\">\n<p class=\"Table\">5.5367<\/p>\n<\/td>\n<td class=\"Table\" style=\"height: 44px\">\n<p class=\"Table\">0.3969<\/p>\n<\/td>\n<td class=\"Table\" style=\"height: 44px\"><\/td>\n<td class=\"Table\" style=\"height: 44px\"><\/td>\n<td class=\"Table\" style=\"height: 44px\" colspan=\"2\">\n<p class=\"Table\">(&#8212;&#8212;*&#8212;&#8212;)<\/p>\n<\/td>\n<td class=\"Table\" style=\"height: 44px\"><\/td>\n<\/tr>\n<tr style=\"height: 43px\">\n<td class=\"Table\" style=\"height: 43px\"><\/td>\n<td class=\"Table\" style=\"height: 43px\"><\/td>\n<td class=\"Table\" style=\"height: 43px\"><\/td>\n<td class=\"Table\" style=\"height: 43px\"><\/td>\n<td class=\"Table\" style=\"height: 43px\" colspan=\"5\">\n<p class=\"Table\">&#8212;-+&#8212;&#8212;&#8212;+&#8212;&#8212;&#8212;+&#8212;&#8212;&#8212;+&#8212;&#8211;<\/p>\n<\/td>\n<\/tr>\n<tr style=\"height: 44px\">\n<td class=\"Table\" style=\"height: 44px\"><\/td>\n<td class=\"Table\" style=\"height: 44px\"><\/td>\n<td class=\"Table\" style=\"height: 44px\"><\/td>\n<td class=\"Table\" style=\"height: 44px\"><\/td>\n<td class=\"Table\" style=\"height: 44px\">\n<p class=\"Table\">4.40<\/p>\n<\/td>\n<td class=\"Table\" style=\"height: 44px\">\n<p class=\"Table\">4.80<\/p>\n<\/td>\n<td class=\"Table\" style=\"height: 44px\">\n<p class=\"Table\">5.20<\/p>\n<\/td>\n<td class=\"Table\" style=\"height: 44px\">\n<p class=\"Table\">5.60<\/p>\n<\/td>\n<td class=\"Table\" style=\"height: 44px\"><\/td>\n<\/tr>\n<tr style=\"height: 44px\">\n<td class=\"Table\" style=\"height: 44px\" colspan=\"3\">\n<p class=\"Table\">Pooled StDev = 0.3180<\/p>\n<\/td>\n<td class=\"Table\" style=\"height: 44px\"><\/td>\n<td class=\"Table\" style=\"height: 44px\"><\/td>\n<td class=\"Table\" style=\"height: 44px\"><\/td>\n<td class=\"Table\" style=\"height: 44px\"><\/td>\n<td 
class=\"Table\" style=\"height: 44px\"><\/td>\n<td class=\"Table\" style=\"height: 44px\"><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>The p-value (0.000) is less than the level of significance (0.05) so we will reject the null hypothesis.<\/p>\n<h3>Excel<\/h3>\n<p class=\"No-Caption\"><span class=\"Picture\"><img decoding=\"async\" class=\"frame-13 aligncenter\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171349\/092_1_fmt.png\" alt=\"092_1.tif\" \/><\/span><\/p>\n<h4><span class=\"Picture\"><img decoding=\"async\" class=\"frame-13 aligncenter\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171354\/092_2_fmt.png\" alt=\"092_2.tif\" \/><\/span>ANOVA: Single Factor<\/h4>\n<table class=\"Table\">\n<colgroup>\n<col \/>\n<col \/>\n<col \/>\n<col \/>\n<col \/><\/colgroup>\n<tbody>\n<tr>\n<td class=\"Table-Heading\">\n<p class=\"Table-Heading\"><strong>SUMMARY<\/strong><\/p>\n<\/td>\n<td class=\"Table-Heading\"><\/td>\n<td class=\"Table-Heading\"><\/td>\n<td class=\"Table-Heading\"><\/td>\n<td class=\"Table-Heading\"><\/td>\n<\/tr>\n<tr>\n<td class=\"Table\">\n<p class=\"Table\">Groups<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">Count<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">Sum<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">Average<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">Variance<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td class=\"Table\">\n<p class=\"Table\">Column 1<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">6<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">30.2<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">5.033333<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">0.026547<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td class=\"Table\">\n<p class=\"Table\">Column 2<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">6<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">27.1<\/p>\n<\/td>\n<td 
class=\"Table\">\n<p class=\"Table\">4.516667<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">0.119347<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td class=\"Table\">\n<p class=\"Table\">Column 3<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">6<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">33.22<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">5.536667<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">0.157507<\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<table class=\"Table\">\n<colgroup>\n<col \/>\n<col \/>\n<col \/>\n<col \/>\n<col \/>\n<col \/>\n<col \/><\/colgroup>\n<tbody>\n<tr>\n<td class=\"Table-Heading\">\n<p class=\"Table-Heading\"><strong>ANOVA<\/strong><\/p>\n<\/td>\n<td class=\"Table-Heading\"><\/td>\n<td class=\"Table-Heading\"><\/td>\n<td class=\"Table-Heading\"><\/td>\n<td class=\"Table-Heading\"><\/td>\n<td class=\"Table-Heading\"><\/td>\n<td class=\"Table-Heading\"><\/td>\n<\/tr>\n<tr>\n<td class=\"Table\">\n<p class=\"Table\">Source of Variation<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">SS<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">df<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">MS<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">F<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">p-value<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">F crit<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td class=\"Table\">\n<p class=\"Table\">Between Groups<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">3.121378<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">2<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">1.560689<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">15.43199<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">0.000229<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">3.68232<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td class=\"Table\">\n<p class=\"Table\">Within Groups<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">1.517<\/p>\n<\/td>\n<td class=\"Table\">\n<p 
class=\"Table\">15<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">0.101133<\/p>\n<\/td>\n<td class=\"Table\"><\/td>\n<td class=\"Table\"><\/td>\n<td class=\"Table\"><\/td>\n<\/tr>\n<tr>\n<td class=\"Table\">\n<p class=\"Table\">Total<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">4.638378<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">17<\/p>\n<\/td>\n<td class=\"Table\"><\/td>\n<td class=\"Table\"><\/td>\n<td class=\"Table\"><\/td>\n<td class=\"Table\"><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>The p-value (0.000229) is less than alpha (0.05) so we reject the null hypothesis. There is enough evidence to support the claim that at least one of the means is different.<\/p>\n<p>Once we have rejected the null hypothesis and found that at least one of the treatment means is different, the next step is to identify those differences. There are two approaches that can be used to answer this type of question: contrasts and multiple comparisons.<\/p>\n<p>Contrasts can be used only when there are clear expectations BEFORE starting an experiment, and these are reflected in the experimental design. Contrasts are <strong class=\"Strong-2\">planned comparisons<\/strong>. For example, mule deer are treated with drug A, drug B, or a placebo to treat an infection. The three treatments are not symmetrical. The placebo is meant to provide a baseline against which the other drugs can be compared. Contrasts are more powerful than multiple comparisons because they are more specific. They are more able to pick up a significant difference. Contrasts are not always readily available in statistical software packages (when they are, you often need to assign the coefficients), or may be limited to comparing each sample to a control.<\/p>\n<p>Multiple comparisons should be used when there are no justified expectations. They are <em>aposteriori<\/em>, <strong class=\"Strong-2\">pair-wise tests<\/strong> of significance. 
For example, we compare the gas mileage for six brands of all-terrain vehicles. We have no prior knowledge to expect any vehicle to perform differently from the rest. Pair-wise comparisons should be performed here, but only if an ANOVA test on all six vehicles rejected the null hypothesis first.<\/p>\n<p><strong class=\"Strong-2\">It is NOT appropriate to use a contrast test when suggested comparisons appear only after the data have been collected.<\/strong> We are going to focus on multiple comparisons instead of planned contrasts.<\/p>\n<h2>Multiple Comparisons<\/h2>\n<p>When the null hypothesis is rejected by the F-test, we believe that there are significant differences among the <em>k<\/em> population means. So, which ones are different? Multiple comparison methods identify which of the means are different while controlling the experiment-wise error (the accumulated risk associated with a family of comparisons). There are many multiple comparison methods available.<\/p>\n<p>In <strong class=\"Strong-2\">The Least Significant Difference Test<\/strong>, each individual hypothesis is tested with the Student t-statistic. When the Type I error probability is set at some value and the variance s<span class=\"Superscript SmallText\">2<\/span> has <em>v<\/em> degrees of freedom, the null hypothesis is rejected for any observed value such that |t<span class=\"Subscript SmallText\">o<\/span>|&gt;t<span class=\"Symbol-Subscript SmallText\" xml:lang=\"ar-SA\">\u03b1\/2<\/span>, v. It is an abbreviated version of conducting all possible pair-wise t-tests. This method has a weak experiment-wise error rate. Fisher\u2019s Protected LSD is somewhat better at controlling this problem.<\/p>\n<p>The <strong class=\"Strong-2\">Bonferroni<\/strong> inequality is a conservative alternative when software is not available. 
When conducting n comparisons, <span class=\"Symbols\" xml:lang=\"ar-SA\">\u03b1<\/span><span class=\"Subscript SmallText\">e<\/span>\u2264 n <span class=\"Symbols\" xml:lang=\"ar-SA\">\u03b1<\/span><span class=\"Subscript SmallText\">c<\/span>; therefore, <span class=\"Symbols\" xml:lang=\"ar-SA\">\u03b1<\/span><span class=\"Subscript SmallText\">c<\/span> = <span class=\"Symbols\" xml:lang=\"ar-SA\">\u03b1<\/span><span class=\"Subscript SmallText\">e<\/span>\/n. In other words, divide the experiment-wise level of significance by the number of multiple comparisons to get the comparison-wise level of significance. The Bonferroni procedure is based on computing confidence intervals for the differences between each possible pair of <span class=\"Symbols\" xml:lang=\"ar-SA\">\u03bc<\/span>\u2019s. The critical value for the confidence intervals comes from a table with (N &#8211; <em>k<\/em>) degrees of freedom and <em>k<\/em>(<em>k<\/em> &#8211; 1)\/2 intervals. If a particular interval does not contain zero, the two means are declared to be significantly different from one another. An interval that contains zero indicates that the two means are NOT significantly different.<\/p>\n<p><strong class=\"Strong-2\">Dunnett\u2019s<\/strong> procedure was created for studies where one of the treatments acts as a control treatment for some or all of the remaining treatments. It is primarily used if the interest of the study is determining whether the mean responses for the treatments differ from that of the control. As with Bonferroni, confidence intervals are created to estimate the difference between two treatment means, with a specific table of critical values used to control the experiment-wise error rate. 
The standard error of the difference is <span class=\"Inline-Equation-Large\"><img decoding=\"async\" class=\"frame-43\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171356\/Image37382_fmt.png\" alt=\"Image37382.PNG\" \/><\/span>.<\/p>\n<p><strong class=\"Strong-2\">Scheffe\u2019s<\/strong> test is also a conservative method for all possible simultaneous comparisons suggested by the data. This test equates the F statistic of ANOVA with the t-test statistic. Since t<span class=\"Superscript SmallText\">2<\/span> = F, it follows that t = \u221aF, so we can substitute \u221aF(<span class=\"Symbols\" xml:lang=\"ar-SA\">\u03b1<\/span><span class=\"Subscript SmallText\">e<\/span>, v<span class=\"Subscript SmallText\">1<\/span>, v<span class=\"Subscript SmallText\">2<\/span>) for t(<span class=\"Symbols\" xml:lang=\"ar-SA\">\u03b1<\/span><span class=\"Subscript SmallText\">e<\/span>, v<span class=\"Subscript SmallText\">2<\/span>) for Scheffe\u2019s statistic.<\/p>\n<p><strong class=\"Strong-2\">Tukey\u2019s<\/strong> test provides strong control of the experiment-wise error rate for all pair-wise comparisons of treatment means. This test is also known as the <em>Honestly Significant Difference<\/em>. This test orders the treatments from smallest to largest and uses the studentized range statistic<\/p>\n<p class=\"Centered\"><img decoding=\"async\" class=\"frame-710 aligncenter\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171357\/8905.png\" alt=\"8905.png\" \/><\/p>\n<p>The absolute difference of the two means is used because the location of the two means in the calculated difference is arbitrary, with the sign of the difference depending on which mean is used first. 
For unequal replications, the Tukey-Kramer approximation is used instead.<\/p>\n<p>The <strong class=\"Strong-2\">Student-Newman-Keuls<\/strong> (SNK) test is a multiple range test based, like Tukey\u2019s, on the studentized range statistic. The critical value is based on a particular pair of means being tested within the entire set of ordered means. Two or more ranges among means are used for test criteria. While it is similar to Tukey\u2019s in terms of a test statistic, it has a weak experiment-wise error rate.<\/p>\n<p>Bonferroni, Dunnett\u2019s, and Scheffe\u2019s tests are the most conservative, meaning that the difference between the two means must be greater before concluding a significant difference. The LSD and SNK tests are the least conservative. Tukey\u2019s test is in the middle. Robert Kuehl, author of <em>Design of Experiments: Statistical Principles of Research Design and Analysis<\/em> (2000), states that the Tukey method provides the best protection against decision errors, along with a strong inference about magnitude and direction of differences.<\/p>\n<p>Let\u2019s go back to our question on mean rain acidity in Alaska, Florida, and Texas. The null and alternative hypotheses were as follows:<\/p>\n<table class=\"Table\">\n<colgroup>\n<col \/>\n<col \/><\/colgroup>\n<tbody>\n<tr>\n<td class=\"Table\">\n<p class=\"Table\">H<sub>0<\/sub>: <span class=\"Symbols\" xml:lang=\"ar-SA\">\u03bc<\/span><span class=\"Subscript SmallText\">A<\/span> = <span class=\"Symbols\" xml:lang=\"ar-SA\">\u03bc<\/span><span class=\"Subscript SmallText\">F<\/span> = <span class=\"Symbols\" xml:lang=\"ar-SA\">\u03bc<\/span><span class=\"Subscript SmallText\">T<\/span><\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">H<sub>1<\/sub>: at least one of the means is different<\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>The p-value for the F-test was 0.000229, which is less than our 5% level of significance. 
We rejected the null hypothesis and had enough evidence to support the claim that at least one of the means was significantly different from another. We will use the Bonferroni and Tukey methods for multiple comparisons to determine which means differ.<\/p>\n<h2>Bonferroni Multiple Comparison Method<\/h2>\n<p>A Bonferroni confidence interval is computed for each pair-wise comparison. For <em>k<\/em> populations, there will be <em>k<\/em>(<em>k<\/em>-1)\/2 multiple comparisons. The confidence interval takes the form:<\/p>\n<p class=\"Equation\"><span class=\"Inline-Equation\"><img decoding=\"async\" class=\"frame-96 aligncenter\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171358\/8913.png\" alt=\"8913.png\" \/><\/span><\/p>\n<p><span class=\"Inline-Equation\"><img decoding=\"async\" class=\"frame-96 aligncenter\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171400\/8920.png\" alt=\"8920.png\" \/><\/span><\/p>\n<p>where MSE comes from the analysis of variance table and the Bonferroni <em>t<\/em> critical value comes from the Bonferroni table given below. The Bonferroni <em>t<\/em> critical value, rather than the Student <em>t<\/em> critical value, is used together with the MSE to achieve a simultaneous confidence level of at least 95% for all intervals computed. Two means are judged to be significantly different if the corresponding interval does not include zero.<\/p>\n<div style=\"width: 639px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"frame-13\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171402\/8535.png\" alt=\"8535.png\" width=\"629\" height=\"715\" \/><\/p>\n<p class=\"wp-caption-text\">Table 5. 
Bonferroni t-critical values.<\/p>\n<\/div>\n<p>For this problem, <em>k<\/em> = 3 so there are <em>k<\/em>(<em>k<\/em> &#8211; 1)\/2= 3(3 &#8211; 1)\/2 = 3 multiple comparisons. The degrees of freedom are equal to N &#8211; <em>k<\/em> = 18 &#8211; 3 = 15. The Bonferroni critical value is 2.69.<\/p>\n<p class=\"Centered\"><span class=\"Inline-Equation\"><img decoding=\"async\" class=\"frame-13 aligncenter\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171404\/8942.png\" alt=\"8942.png\" \/><\/span><\/p>\n<p class=\"Centered\"><span class=\"Inline-Equation\"><img decoding=\"async\" class=\"frame-13 aligncenter\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171407\/9310.png\" alt=\"9310.png\" \/><\/span><\/p>\n<p class=\"Centered\"><span class=\"Inline-Equation\"><img decoding=\"async\" class=\"frame-13 aligncenter\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171409\/8960.png\" alt=\"8960.png\" \/><\/span><\/p>\n<p>The first confidence interval contains all positive values. This tells you that there is a significant difference between the two means and that the mean rain pH for Alaska is significantly greater than the mean rain pH for Florida.<\/p>\n<p>The second confidence interval contains all negative values. This tells you that there is a significant difference between the two means and that the mean rain pH of Alaska is significantly lower than the mean rain pH of Texas.<\/p>\n<p>The third confidence interval also contains all negative values. This tells you that there is a significant difference between the two means and that the mean rain pH of Florida is significantly lower than the mean rain pH of Texas.<\/p>\n<p>All three states have significantly different levels of rain pH. 
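The three intervals above can be reproduced from the summary statistics alone. A minimal sketch in Python using only the standard library (MSE = 0.101 and n = 6 per state come from the ANOVA output, 2.69 is the Bonferroni critical value used above, and the group means are those reported in the Minitab output later in this section):

```python
from math import sqrt

mse, n, t_bonf = 0.101, 6, 2.69   # MSE, replicates per state, Bonferroni t (3 comparisons, 15 df)
means = {"Alaska": 5.0333, "Florida": 4.5167, "Texas": 5.5367}

# Half-width is the same for every pair because all groups have equal n
half_width = t_bonf * sqrt(mse * (1 / n + 1 / n))
for a, b in [("Alaska", "Florida"), ("Alaska", "Texas"), ("Florida", "Texas")]:
    d = means[a] - means[b]
    lo, hi = d - half_width, d + half_width
    verdict = "significant" if lo > 0 or hi < 0 else "not significant"
    print(f"{a} - {b}: ({lo:.4f}, {hi:.4f})  {verdict}")
```

Each of the three intervals excludes zero, matching the conclusions drawn above.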
Texas has the highest rain pH, then Alaska followed by Florida, which has the lowest mean rain pH level. You can use the confidence intervals to estimate the mean difference between the states. For example, the average rain pH in Texas ranges from 0.5262 to 1.5138 higher than the average rain pH in Florida.<\/p>\n<p>Now let\u2019s use the Tukey method for multiple comparisons. We are going to let software compute the values for us. Excel doesn\u2019t do multiple comparisons so we are going to rely on Minitab output.<\/p>\n<p class=\"Centered\"><img decoding=\"async\" class=\"frame-64 aligncenter\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1888\/2017\/05\/11171412\/095_fmt.png\" alt=\"095.tif\" \/><\/p>\n<h4>One-way ANOVA: pH vs. state<\/h4>\n<table class=\"Table\">\n<colgroup>\n<col \/>\n<col \/>\n<col \/>\n<col \/>\n<col \/>\n<col \/><\/colgroup>\n<tbody>\n<tr>\n<td class=\"Table\">\n<p class=\"Table\">Source<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">DF<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">SS<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">MS<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">F<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">P<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td class=\"Table\">\n<p class=\"Table\">state<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">2<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">3.121<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">1.561<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">15.4<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">0.000<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td class=\"Table\">\n<p class=\"Table\">Error<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">15<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">1.517<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">0.101<\/p>\n<\/td>\n<td class=\"Table\"><\/td>\n<td class=\"Table\"><\/td>\n<\/tr>\n<tr>\n<td class=\"Table\">\n<p 
class=\"Table\">Total<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">17<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">4.638<\/p>\n<\/td>\n<td class=\"Table\"><\/td>\n<td class=\"Table\"><\/td>\n<td class=\"Table\"><\/td>\n<\/tr>\n<tr>\n<td class=\"Table\" colspan=\"2\">\n<p class=\"Table\">S = 0.3180<\/p>\n<\/td>\n<td class=\"Table\" colspan=\"2\">\n<p class=\"Table\">R-Sq = 67.29%<\/p>\n<\/td>\n<td class=\"Table\" colspan=\"2\">\n<p class=\"Table\">R-Sq(adj) = 62.93%<\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>We have seen this part of the output before. We now want to focus on the <em>Grouping Information Using Tukey Method.<\/em> All three states have different letters indicating that the mean rain pH for each state is significantly different. They are also listed from highest to lowest. It is easy to see that Texas has the highest mean rain pH while Florida has the lowest.<\/p>\n<h4>Grouping Information Using Tukey Method<\/h4>\n<table class=\"Table\">\n<colgroup>\n<col \/>\n<col \/>\n<col \/>\n<col \/><\/colgroup>\n<tbody>\n<tr>\n<td class=\"Table\">\n<p class=\"Table\">state<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">N<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">Mean<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">Grouping<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td class=\"Table\">\n<p class=\"Table\">Texas<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">6<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">5.5367<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">A<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td class=\"Table\">\n<p class=\"Table\">Alaska<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">6<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">5.0333<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">B<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td class=\"Table\">\n<p class=\"Table\">Florida<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">6<\/p>\n<\/td>\n<td class=\"Table\">\n<p 
class=\"Table\">4.516<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">C<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td class=\"Table\" colspan=\"4\">\n<p class=\"Table\">Means that do not share a letter are significantly different.<\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>This next set of confidence intervals is similar to the Bonferroni confidence intervals. They estimate the difference of each pair of means. The individual confidence interval level is set at 97.97% instead of 95% thus controlling the experiment-wise error rate.<\/p>\n<table class=\"Table\">\n<colgroup>\n<col \/><\/colgroup>\n<tbody>\n<tr>\n<td class=\"Table\">\n<p class=\"Table\">Tukey 95% Simultaneous Confidence Intervals<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td class=\"Table\">\n<p class=\"Table\">All Pairwise Comparisons among Levels of state<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td class=\"Table\">\n<p class=\"Table\">Individual confidence level = <strong class=\"Strong-2\">97.97%<\/strong><\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<table class=\"Table\">\n<colgroup>\n<col \/>\n<col \/>\n<col \/>\n<col \/>\n<col \/>\n<col \/>\n<col \/>\n<col \/><\/colgroup>\n<tbody>\n<tr>\n<td class=\"Table\" colspan=\"8\">\n<p class=\"Table\">state = Alaska subtracted from:<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td class=\"Table\">\n<p class=\"Table\">state<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">Lower<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">Center<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">Upper<\/p>\n<\/td>\n<td class=\"Table\" colspan=\"4\">\n<p class=\"Table\">&#8212;&#8212;&#8212;+&#8212;&#8212;&#8212;+&#8212;&#8212;&#8212;+&#8212;&#8212;&#8212;+<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td class=\"Table\">\n<p class=\"Table\">Florida<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">-0.9931<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">-0.5167<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">-0.0402<\/p>\n<\/td>\n<td class=\"Table\"><\/td>\n<td class=\"Table\" colspan=\"2\">\n<p 
class=\"Table\">(&#8212;&#8211;*&#8212;-)<\/p>\n<\/td>\n<td class=\"Table\"><\/td>\n<\/tr>\n<tr>\n<td class=\"Table\">\n<p class=\"Table\">Texas<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">0.0269<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">0.5033<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">0.9798<\/p>\n<\/td>\n<td class=\"Table\"><\/td>\n<td class=\"Table\"><\/td>\n<td class=\"Table\" colspan=\"2\">\n<p class=\"Table\">(&#8212;&#8211;*&#8212;&#8211;)<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td class=\"Table\"><\/td>\n<td class=\"Table\"><\/td>\n<td class=\"Table\"><\/td>\n<td class=\"Table\"><\/td>\n<td class=\"Table\" colspan=\"4\">\n<p class=\"Table\">&#8212;&#8212;&#8212;+&#8212;&#8212;&#8212;+&#8212;&#8212;&#8212;+&#8212;&#8212;&#8212;+<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td class=\"Table\"><\/td>\n<td class=\"Table\"><\/td>\n<td class=\"Table\"><\/td>\n<td class=\"Table\"><\/td>\n<td class=\"Table\">\n<p class=\"Table\">-0.80<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">0.00<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">0.80<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">1.60<\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<table class=\"Table\">\n<colgroup>\n<col \/>\n<col \/>\n<col \/>\n<col \/>\n<col \/>\n<col \/>\n<col \/>\n<col \/><\/colgroup>\n<tbody>\n<tr>\n<td class=\"Table\" colspan=\"8\">\n<p class=\"Table\">state = Florida subtracted from:<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td class=\"Table\">\n<p class=\"Table\">state<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">Lower<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">Center<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">Upper<\/p>\n<\/td>\n<td class=\"Table\" colspan=\"4\">\n<p class=\"Table\">&#8212;&#8212;&#8212;+&#8212;&#8212;&#8212;+&#8212;&#8212;&#8212;+&#8212;&#8212;&#8212;+<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td class=\"Table\">\n<p class=\"Table\">Texas<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">0.5435<\/p>\n<\/td>\n<td 
class=\"Table\">\n<p class=\"Table\">1.0200<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">1.4965<\/p>\n<\/td>\n<td class=\"Table\"><\/td>\n<td class=\"Table\"><\/td>\n<td class=\"Table\" colspan=\"2\">\n<p class=\"Table\">(&#8212;&#8211;*&#8212;&#8211;)<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td class=\"Table\"><\/td>\n<td class=\"Table\"><\/td>\n<td class=\"Table\"><\/td>\n<td class=\"Table\"><\/td>\n<td class=\"Table\" colspan=\"4\">\n<p class=\"Table\">&#8212;&#8212;&#8212;+&#8212;&#8212;&#8212;+&#8212;&#8212;&#8212;+&#8212;&#8212;&#8212;+<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td class=\"Table\"><\/td>\n<td class=\"Table\"><\/td>\n<td class=\"Table\"><\/td>\n<td class=\"Table\"><\/td>\n<td class=\"Table\">\n<p class=\"Table\">-0.80<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">0.00<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">0.80<\/p>\n<\/td>\n<td class=\"Table\">\n<p class=\"Table\">1.60<\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>The first pairing is Florida \u2013 Alaska, which results in an interval of (-0.9931, -0.0402). The interval has all negative values indicating that Florida is significantly lower than Alaska. The second pairing is Texas \u2013 Alaska, which results in an interval of (0.0269, 0.9798). The interval has all positive values indicating that Texas is greater than Alaska. The third pairing is Texas \u2013 Florida, which results in an interval from (0.5435, 1.4965). All positive values indicate that Texas is greater than Florida.<\/p>\n<p>The intervals are similar to the Bonferroni intervals with differences in width due to methods used. In both cases, the same conclusions are reached.<\/p>\n<p>When we use one-way ANOVA and conclude that the differences among the means are significant, we can\u2019t be absolutely sure that the given factor is responsible for the differences. It is possible that the variation of some other unknown factor is responsible. 
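As with the Bonferroni intervals, the Tukey simultaneous intervals shown above can be approximated from the summary statistics alone. A sketch in Python, assuming SciPy 1.7+ for its studentized_range distribution (the means, MSE, and degrees of freedom are taken from the Minitab output):

```python
from math import sqrt
from scipy.stats import studentized_range

mse, n, df, k = 0.101, 6, 15, 3         # MSE, replicates per state, error df, number of groups
q = studentized_range.ppf(0.95, k, df)  # studentized range critical value, about 3.67
half_width = q * sqrt(mse / n)          # Tukey HSD half-width for equal group sizes

means = {"Alaska": 5.0333, "Florida": 4.5167, "Texas": 5.5367}
for a, b in [("Florida", "Alaska"), ("Texas", "Alaska"), ("Texas", "Florida")]:
    d = means[a] - means[b]
    print(f"{a} - {b}: ({d - half_width:.4f}, {d + half_width:.4f})")
```

The intervals agree with the Minitab output to rounding, and they are slightly narrower than the Bonferroni intervals, consistent with Bonferroni being the more conservative method.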
One way to reduce the effect of extraneous factors is to design an experiment so that it has a completely randomized design. This means that each element has an equal probability of receiving any treatment or belonging to any group. In general, good results require that the experiment be carefully designed and executed.<\/p>\n<p>Additional example:<\/p>\n<p><iframe loading=\"lazy\" id=\"oembed-1\" title=\"Statistics - One-way ANOVA\" width=\"500\" height=\"375\" src=\"https:\/\/www.youtube.com\/embed\/BMyYXc8cWHs?feature=oembed&#38;rel=0\" frameborder=\"0\" allowfullscreen=\"allowfullscreen\"><\/iframe><\/p>\n<\/div>\n\n\t\t\t <section class=\"citations-section\" role=\"contentinfo\">\n\t\t\t <h3>Candela Citations<\/h3>\n\t\t\t\t\t <div>\n\t\t\t\t\t\t <div id=\"citation-list-629\">\n\t\t\t\t\t\t\t <div class=\"licensing\"><div class=\"license-attribution-dropdown-subheading\">CC licensed content, Shared previously<\/div><ul class=\"citation-list\"><li>Natural Resources Biometrics. <strong>Authored by<\/strong>: Diane Kiernan. <strong>Located at<\/strong>: <a target=\"_blank\" href=\"https:\/\/textbooks.opensuny.org\/natural-resources-biometrics\/\">https:\/\/textbooks.opensuny.org\/natural-resources-biometrics\/<\/a>. <strong>Project<\/strong>: Open SUNY Textbooks. 
<strong>License<\/strong>: <em><a target=\"_blank\" rel=\"license\" href=\"https:\/\/creativecommons.org\/licenses\/by-nc-sa\/4.0\/\">CC BY-NC-SA: Attribution-NonCommercial-ShareAlike<\/a><\/em><\/li><\/ul><\/div>\n\t\t\t\t\t\t <\/div>\n\t\t\t\t\t <\/div>\n\t\t\t <\/section>","protected":false},"author":622,"menu_order":5,"template":"","meta":{"_candela_citation":"[{\"type\":\"cc\",\"description\":\"Natural Resources Biometrics\",\"author\":\"Diane Kiernan\",\"organization\":\"\",\"url\":\"https:\/\/textbooks.opensuny.org\/natural-resources-biometrics\/\",\"project\":\"Open SUNY Textbooks\",\"license\":\"cc-by-nc-sa\",\"license_terms\":\"\"}]","CANDELA_OUTCOMES_GUID":"","pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[],"contributor":[],"license":[],"class_list":["post-629","chapter","type-chapter","status-publish","hentry"],"part":21,"_links":{"self":[{"href":"https:\/\/courses.lumenlearning.com\/suny-natural-resources-biometrics\/wp-json\/pressbooks\/v2\/chapters\/629","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/courses.lumenlearning.com\/suny-natural-resources-biometrics\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/courses.lumenlearning.com\/suny-natural-resources-biometrics\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/suny-natural-resources-biometrics\/wp-json\/wp\/v2\/users\/622"}],"version-history":[{"count":2,"href":"https:\/\/courses.lumenlearning.com\/suny-natural-resources-biometrics\/wp-json\/pressbooks\/v2\/chapters\/629\/revisions"}],"predecessor-version":[{"id":1251,"href":"https:\/\/courses.lumenlearning.com\/suny-natural-resources-biometrics\/wp-json\/pressbooks\/v2\/chapters\/629\/revisions\/1251"}],"part":[{"href":"https:\/\/courses.lumenlearning.com\/suny-natural-resources-biometrics\/wp-json\/pressbooks\/v2\/parts\/21"}],"metadata":[{"href":"https:\/\/courses.lumenlearning.c
om\/suny-natural-resources-biometrics\/wp-json\/pressbooks\/v2\/chapters\/629\/metadata\/"}],"wp:attachment":[{"href":"https:\/\/courses.lumenlearning.com\/suny-natural-resources-biometrics\/wp-json\/wp\/v2\/media?parent=629"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/suny-natural-resources-biometrics\/wp-json\/pressbooks\/v2\/chapter-type?post=629"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/suny-natural-resources-biometrics\/wp-json\/wp\/v2\/contributor?post=629"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/suny-natural-resources-biometrics\/wp-json\/wp\/v2\/license?post=629"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}