{"id":5467,"date":"2022-08-30T22:31:59","date_gmt":"2022-08-30T22:31:59","guid":{"rendered":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/?post_type=chapter&#038;p=5467"},"modified":"2022-08-30T22:34:16","modified_gmt":"2022-08-30T22:34:16","slug":"14c-preview","status":"publish","type":"chapter","link":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/chapter\/14c-preview\/","title":{"raw":"14C Preview","rendered":"14C Preview"},"content":{"raw":"Preparing for the next class\r\n\r\nIn the next in-class activity, you will need to identify the right types of data for an\u00a0 ANOVA, determine if ANOVA groups are independent random samples, and determine\u00a0 if groups have similar levels of variability.\r\n\r\nIn trying to understand when it is appropriate to use an ANOVA, there are three main\u00a0 conditions we should consider:\r\n<ol>\r\n \t<li>Is it the right type of data?<\/li>\r\n \t<li>Are the groups independent random samples?<\/li>\r\n \t<li>Do the groups have similar levels of variability?<\/li>\r\n<\/ol>\r\nAn ANOVA also requires that the data within each group be normally distributed, but\u00a0 testing for that is outside the scope of this course.\r\n\r\nLet\u2019s look at each condition in more detail.\r\n\r\nAn ANOVA only works if the factor of interest is categorical data, the response variable\u00a0 is numeric and continuous, and the mean of the response variable is the parameter of\u00a0 interest. Remember that categorical data are qualitative data that have no inherent\u00a0 ranking or order. Basically, in an ANOVA, we are interested in comparing the mean of\u00a0 the response variable across more than two independent groups defined by the factor of interest.\r\n<div class=\"textbox key-takeaways\">\r\n<h3>Question 1<\/h3>\r\n1) Which of the following variables are categorical data? 
There may be more than one\u00a0 correct answer.\r\n<ol>\r\n \t<li>a) Brand of shoes<\/li>\r\n \t<li>b) Body weight in pounds<\/li>\r\n \t<li>c) Age in years<\/li>\r\n \t<li>d) City of residence<\/li>\r\n \t<li>e) Outdoor temperature in degrees Celsius<\/li>\r\n \t<li>f) Type of medication<\/li>\r\n<\/ol>\r\nHint: Look for data that will be separated by categories rather than measured in\u00a0 numbers.\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>Question 2<\/h3>\r\n2) Which of the following variables are numeric and continuous data? There may be\u00a0 more than one correct answer.\r\n<ol>\r\n \t<li>a) Hours of TV watched<\/li>\r\n \t<li>b) Body weight in pounds<\/li>\r\n \t<li>c) Age in days<\/li>\r\n \t<li>d) Telephone number<\/li>\r\n \t<li>e) Body temperature in degrees Fahrenheit<\/li>\r\n \t<li>f) Type of diet<\/li>\r\n<\/ol>\r\nHint: Look for data measured in numbers.\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>Question 3<\/h3>\r\n<span style=\"font-size: 1rem; text-align: initial;\">3) Which of the following pairings contain the right kind of data for an ANOVA? 
There\u00a0 may be more than one correct answer.\u00a0<\/span>\r\n<ol>\r\n \t<li>a) Factor of interest: type of medication, response variable: blood pressure<\/li>\r\n \t<li>b) Factor of interest: body fat percentage, response variable: risk of heart attack<\/li>\r\n \t<li>c) Factor of interest: water temperature, response variable: frequency of coral\u00a0 bleaching events<\/li>\r\n \t<li>d) Factor of interest: television show, response variable: income earned per year<\/li>\r\n \t<li>e) Factor of interest: high school attended, response variable: location of current\u00a0 home<\/li>\r\n \t<li>f) Factor of interest: college degree earned, response variable: annual income<\/li>\r\n<\/ol>\r\nHint: The factor of interest should be categorical, and the response variable should\u00a0 be numeric.\r\n\r\n<\/div>\r\nThe groups being compared using an ANOVA need to be independent random samples or randomly assigned groups in an experiment. Consider the following examples.\r\n<div class=\"textbox key-takeaways\">\r\n<h3>Question 4<\/h3>\r\n4) Suppose an animal rescue group wants to determine the best kind of food to help\u00a0 undernourished animals gain weight. The rescue randomly divides a group of dogs\u00a0 into four groups and feeds a different type of food to each group. They then track\u00a0 weight gain over time. Which of the following statements is the best evaluation of the\u00a0 groups?\r\n<ol>\r\n \t<li>a) The groups are not independent, so an ANOVA is not appropriate.<\/li>\r\n \t<li>b) The groups are independent, randomly assigned experimental groups, so an\u00a0 ANOVA is appropriate.<\/li>\r\n \t<li>c) The groups are assigned correctly, but the type of data being collected is not\u00a0 appropriate for an ANOVA.<\/li>\r\n<\/ol>\r\nHint: Are the groups randomly assigned?\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>Question 5<\/h3>\r\n5) Suppose a high school principal wants to evaluate the impact of student interests on\u00a0 student academic performance. 
The principal compares the average GPAs of the\u00a0 students in the chess club, marching band, soccer team, and student government\u00a0 association. Which of the following statements is the best evaluation of the groups?\r\n<ol>\r\n \t<li>a) The groups are not randomly selected and there could be overlap between the\u00a0 groups, so an ANOVA is not appropriate.<\/li>\r\n \t<li>b) The groups are independent, randomly selected groups, so an ANOVA is\u00a0 appropriate.<\/li>\r\n \t<li>c) The groups are assigned correctly, but the type of data being collected is not\u00a0 appropriate for an ANOVA.<\/li>\r\n<\/ol>\r\nHint: Is each group an independent random sample?\r\n\r\n<\/div>\r\nThe groups being compared should have equal or similar variability within their groups.\u00a0 There are formal tests that can be used to assess the similarity of variability among\u00a0 ANOVA groups, but they are beyond the scope of this course.\r\n\r\nInstead, we can visually estimate variability by comparing boxplots of data or\u00a0 numerically comparing the standard deviations provided in summary statistics.\u00a0 Remember that the box in a boxplot visually represents the middle 50% of the data and\u00a0 that its length is the interquartile range. While this is not a measurement of the standard\u00a0 deviation, a boxplot allows us to visually compare the spread or variability in each\u00a0 group.\r\n\r\nA good rule of thumb is that, as long as the sample sizes are equal, the largest standard\u00a0 deviation should be no more than two times the smallest standard deviation. If the sample\u00a0 sizes are different, the standard deviations need to be much closer to one another.\r\n<div class=\"textbox key-takeaways\">\r\n<h3>Question 6<\/h3>\r\n6) Which two of the following boxplots represent the most similar variances? 
Hint: Look for boxplots that are similarly shaped.\r\n\r\n<img src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2022\/01\/26210955\/Picture632-300x91.png\" alt=\"A box plot with A, B, C, D on the vertical axis. For A, the low point is at approximately 0, the high point is at approximately 5, the low end of the box is at approximately 1, the high end is at approximately 3, and the middle line is at approximately 2. For B, the low point is at approximately 1, the high point is at approximately 5, the low end of the box is at approximately 2.25, the high end of the box is at approximately 3.5, and the middle line is at approximately 3. For C, the low point is at approximately 1, the high point is at approximately 7, the low end of the box is at approximately 2, the high end of the box is at approximately 4.25, and the middle line is at approximately 4. For D, the low point is at approximately 2.5, the high point is at approximately 8, the low end of the box is at approximately 4, the high end is at approximately 6, and the middle line is at approximately 5. There are points \u201cy bar sub 1\u201d at approximately 2, \u201cy bar sub 2\u201d at approximately 3, \u201cy bar sub 3\u201d at approximately 3.5, and \u201cy bar sub 4\u201d at approximately 5.\" \/><\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>Question 7<\/h3>\r\n7) Using the previous rule of thumb, determine whether the equal variance assumption\u00a0 for ANOVA is reasonable for the following four studies. 
Suppose each of these\u00a0 studies has equal sample sizes across all groups.\r\n<div align=\"left\">\r\n<table>\r\n<tbody>\r\n<tr>\r\n<td>Study #<\/td>\r\n<td>Smallest SD<\/td>\r\n<td>Largest SD<\/td>\r\n<td>Similar variability?<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>1<\/td>\r\n<td>3.06<\/td>\r\n<td>3.79<\/td>\r\n<td>Yes<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>2<\/td>\r\n<td>0.22<\/td>\r\n<td>2.54<\/td>\r\n<td>No<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>3<\/td>\r\n<td>1.57<\/td>\r\n<td>3.32<\/td>\r\n<td>No<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>4<\/td>\r\n<td>2.39<\/td>\r\n<td>4.16<\/td>\r\n<td>Yes<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<\/div>\r\n&nbsp;\r\n\r\nHint: The largest SD should be no more than two times the smallest SD.\r\n\r\n<\/div>\r\nLooking ahead\r\n\r\nOur in-class activity will use osteoporosis research as an example. Explore the\u00a0 information found at https:\/\/medlineplus.gov\/osteoporosis.html and be ready to discuss risk factors, prevention, and treatment of the disease.","rendered":"<p>Preparing for the next class<\/p>\n<p>In the next in-class activity, you will need to identify the right types of data for an\u00a0 ANOVA, determine if ANOVA groups are independent random samples, and determine\u00a0 if groups have similar levels of variability.<\/p>\n<p>In trying to understand when it is appropriate to use an ANOVA, there are three main\u00a0 conditions we should consider:<\/p>\n<ol>\n<li>Is it the right type of data?<\/li>\n<li>Are the groups independent random samples?<\/li>\n<li>Do the groups have similar levels of variability?<\/li>\n<\/ol>\n<p>An ANOVA also requires that the data within each group be normally distributed, but\u00a0 testing for that is outside the scope of this course.<\/p>\n<p>Let\u2019s look at each condition in more detail.<\/p>\n<p>An ANOVA only works if the factor of interest is categorical data, the response variable\u00a0 is numeric and continuous, and the mean of the response variable is the parameter of\u00a0 interest. 
Remember that categorical data are qualitative data that have no inherent\u00a0 ranking or order. Basically, in an ANOVA, we are interested in comparing the mean of\u00a0 the response variable across more than two independent groups defined by the factor of interest.<\/p>\n<div class=\"textbox key-takeaways\">\n<h3>Question 1<\/h3>\n<p>1) Which of the following variables are categorical data? There may be more than one\u00a0 correct answer.<\/p>\n<ol>\n<li>a) Brand of shoes<\/li>\n<li>b) Body weight in pounds<\/li>\n<li>c) Age in years<\/li>\n<li>d) City of residence<\/li>\n<li>e) Outdoor temperature in degrees Celsius<\/li>\n<li>f) Type of medication<\/li>\n<\/ol>\n<p>Hint: Look for data that will be separated by categories rather than measured in\u00a0 numbers.<\/p>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>Question 2<\/h3>\n<p>2) Which of the following variables are numeric and continuous data? There may be\u00a0 more than one correct answer.<\/p>\n<ol>\n<li>a) Hours of TV watched<\/li>\n<li>b) Body weight in pounds<\/li>\n<li>c) Age in days<\/li>\n<li>d) Telephone number<\/li>\n<li>e) Body temperature in degrees Fahrenheit<\/li>\n<li>f) Type of diet<\/li>\n<\/ol>\n<p>Hint: Look for data measured in numbers.<\/p>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>Question 3<\/h3>\n<p><span style=\"font-size: 1rem; text-align: initial;\">3) Which of the following pairings contain the right kind of data for an ANOVA? 
There\u00a0 may be more than one correct answer.\u00a0<\/span><\/p>\n<ol>\n<li>a) Factor of interest: type of medication, response variable: blood pressure<\/li>\n<li>b) Factor of interest: body fat percentage, response variable: risk of heart attack<\/li>\n<li>c) Factor of interest: water temperature, response variable: frequency of coral\u00a0 bleaching events<\/li>\n<li>d) Factor of interest: television show, response variable: income earned per year<\/li>\n<li>e) Factor of interest: high school attended, response variable: location of current\u00a0 home<\/li>\n<li>f) Factor of interest: college degree earned, response variable: annual income<\/li>\n<\/ol>\n<p>Hint: The factor of interest should be categorical, and the response variable should\u00a0 be numeric.<\/p>\n<\/div>\n<p>The groups being compared using an ANOVA need to be independent random samples or randomly assigned groups in an experiment. Consider the following examples.<\/p>\n<div class=\"textbox key-takeaways\">\n<h3>Question 4<\/h3>\n<p>4) Suppose an animal rescue group wants to determine the best kind of food to help\u00a0 undernourished animals gain weight. The rescue randomly divides a group of dogs\u00a0 into four groups and feeds a different type of food to each group. They then track\u00a0 weight gain over time. Which of the following statements is the best evaluation of the\u00a0 groups?<\/p>\n<ol>\n<li>a) The groups are not independent, so an ANOVA is not appropriate.<\/li>\n<li>b) The groups are independent, randomly assigned experimental groups, so an\u00a0 ANOVA is appropriate.<\/li>\n<li>c) The groups are assigned correctly, but the type of data being collected is not\u00a0 appropriate for an ANOVA.<\/li>\n<\/ol>\n<p>Hint: Are the groups randomly assigned?<\/p>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>Question 5<\/h3>\n<p>5) Suppose a high school principal wants to evaluate the impact of student interests on\u00a0 student academic performance. 
The principal compares the average GPAs of the\u00a0 students in the chess club, marching band, soccer team, and student government\u00a0 association. Which of the following statements is the best evaluation of the groups?<\/p>\n<ol>\n<li>a) The groups are not randomly selected and there could be overlap between the\u00a0 groups, so an ANOVA is not appropriate.<\/li>\n<li>b) The groups are independent, randomly selected groups, so an ANOVA is\u00a0 appropriate.<\/li>\n<li>c) The groups are assigned correctly, but the type of data being collected is not\u00a0 appropriate for an ANOVA.<\/li>\n<\/ol>\n<p>Hint: Is each group an independent random sample?<\/p>\n<\/div>\n<p>The groups being compared should have equal or similar variability within their groups.\u00a0 There are formal tests that can be used to assess the similarity of variability among\u00a0 ANOVA groups, but they are beyond the scope of this course.<\/p>\n<p>Instead, we can visually estimate variability by comparing boxplots of data or\u00a0 numerically comparing the standard deviations provided in summary statistics.\u00a0 Remember that the box in a boxplot visually represents the middle 50% of the data and\u00a0 that its length is the interquartile range. While this is not a measurement of the standard\u00a0 deviation, a boxplot allows us to visually compare the spread or variability in each\u00a0 group.<\/p>\n<p>A good rule of thumb is that, as long as the sample sizes are equal, the largest standard\u00a0 deviation should be no more than two times the smallest standard deviation. If the sample\u00a0 sizes are different, the standard deviations need to be much closer to one another.<\/p>\n<div class=\"textbox key-takeaways\">\n<h3>Question 6<\/h3>\n<p>6) Which two of the following boxplots represent the most similar variances? 
Hint: Look for boxplots that are similarly shaped.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2022\/01\/26210955\/Picture632-300x91.png\" alt=\"A box plot with A, B, C, D on the vertical axis. For A, the low point is at approximately 0, the high point is at approximately 5, the low end of the box is at approximately 1, the high end is at approximately 3, and the middle line is at approximately 2. For B, the low point is at approximately 1, the high point is at approximately 5, the low end of the box is at approximately 2.25, the high end of the box is at approximately 3.5, and the middle line is at approximately 3. For C, the low point is at approximately 1, the high point is at approximately 7, the low end of the box is at approximately 2, the high end of the box is at approximately 4.25, and the middle line is at approximately 4. For D, the low point is at approximately 2.5, the high point is at approximately 8, the low end of the box is at approximately 4, the high end is at approximately 6, and the middle line is at approximately 5. There are points \u201cy bar sub 1\u201d at approximately 2, \u201cy bar sub 2\u201d at approximately 3, \u201cy bar sub 3\u201d at approximately 3.5, and \u201cy bar sub 4\u201d at approximately 5.\" \/><\/div>\n<div class=\"textbox key-takeaways\">\n<h3>Question 7<\/h3>\n<p>7) Using the previous rule of thumb, determine whether the equal variance assumption\u00a0 for ANOVA is reasonable for the following four studies. 
Suppose each of these\u00a0 studies has equal sample sizes across all groups.<\/p>\n<div style=\"text-align: left;\">\n<table>\n<tbody>\n<tr>\n<td>Study #<\/td>\n<td>Smallest SD<\/td>\n<td>Largest SD<\/td>\n<td>Similar variability?<\/td>\n<\/tr>\n<tr>\n<td>1<\/td>\n<td>3.06<\/td>\n<td>3.79<\/td>\n<td>Yes<\/td>\n<\/tr>\n<tr>\n<td>2<\/td>\n<td>0.22<\/td>\n<td>2.54<\/td>\n<td>No<\/td>\n<\/tr>\n<tr>\n<td>3<\/td>\n<td>1.57<\/td>\n<td>3.32<\/td>\n<td>No<\/td>\n<\/tr>\n<tr>\n<td>4<\/td>\n<td>2.39<\/td>\n<td>4.16<\/td>\n<td>Yes<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<p>&nbsp;<\/p>\n<p>Hint: The largest SD should be no more than two times the smallest SD.<\/p>\n<\/div>\n<p>Looking ahead<\/p>\n<p>Our in-class activity will use osteoporosis research as an example. Explore the\u00a0 information found at https:\/\/medlineplus.gov\/osteoporosis.html and be ready to discuss risk factors, prevention, and treatment of the disease.<\/p>\n","protected":false},"author":23592,"menu_order":9,"template":"","meta":{"_candela_citation":"[]","CANDELA_OUTCOMES_GUID":"","pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[],"contributor":[],"license":[],"class_list":["post-5467","chapter","type-chapter","status-publish","hentry"],"part":5448,"_links":{"self":[{"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/chapters\/5467","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/wp\/v2\/users\/23592"}],"version-history":[{"count":2,"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/chapters\/5467
\/revisions"}],"predecessor-version":[{"id":5469,"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/chapters\/5467\/revisions\/5469"}],"part":[{"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/parts\/5448"}],"metadata":[{"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/chapters\/5467\/metadata\/"}],"wp:attachment":[{"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/wp\/v2\/media?parent=5467"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/chapter-type?post=5467"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/wp\/v2\/contributor?post=5467"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/wp\/v2\/license?post=5467"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}