{"id":5495,"date":"2022-09-14T16:23:48","date_gmt":"2022-09-14T16:23:48","guid":{"rendered":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/?post_type=chapter&#038;p=5495"},"modified":"2022-09-14T16:23:48","modified_gmt":"2022-09-14T16:23:48","slug":"15c-inclass","status":"publish","type":"chapter","link":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/chapter\/15c-inclass\/","title":{"raw":"15C InClass","rendered":"15C InClass"},"content":{"raw":"<div id=\"bp-page-1\" class=\"page\" data-page-number=\"1\" data-loaded=\"true\">\r\n<div class=\"textLayer\">In the preview assignment, you were introduced to flight data for three airlines in March 2021. The two-way table is given again below.[footnote]U.S. Department of Transportation, Bureau of Transportation Statistics. (n.d.). On-time performance -Reporting operating carrier flight delays at a glance. https:\/\/www.transtats.bts.gov\/HomeDrillChart_Month.asp?5ry_lrn4=FDFD&amp;N44_Qry=E&amp;5ry_Pn44vr4=DDD&amp;5ry_Nv42146=DDD&amp;heY_fryrp6lrn4=FDFE&amp;heY_fryrp6Z106u=F[\/footnote] Recall that we are considering the question of whether the distributions of flight status are the same among the flights of the three airlines. 
For our test, the null hypothesis is that the distribution of flight status is the same for these three airlines, and the alternative hypothesis is that the distribution of flight status is not the same for these three airlines.<\/div>\r\n<div class=\"textLayer\"><table><thead><tr><th><\/th><th>On-Time Flights<\/th><th>Delayed Flights<\/th><th>Canceled Flights<\/th><th>Diverted Flights<\/th><th>Total<\/th><\/tr><\/thead><tbody><tr><th>American Airlines<\/th><td>42,600<\/td><td>4,657<\/td><td>296<\/td><td>95<\/td><td>47,648<\/td><\/tr><tr><th>Delta Airlines<\/th><td>51,620<\/td><td>4,030<\/td><td>150<\/td><td>56<\/td><td>55,856<\/td><\/tr><tr><th>Southwest Airlines<\/th><td>69,384<\/td><td>9,280<\/td><td>1,782<\/td><td>128<\/td><td>80,574<\/td><\/tr><tr><th>Total<\/th><td>163,604<\/td><td>17,967<\/td><td>2,228<\/td><td>279<\/td><td>184,078<\/td><\/tr><\/tbody><\/table><\/div>\r\n<div><\/div>\r\n<div class=\"textLayer\">\r\n<div class=\"textbox key-takeaways\">\r\n<h3>Question 1<\/h3>\r\n1) Suppose that we obtain enough evidence to reject the null hypothesis; that is, we decide that there is strong evidence to support the idea that the flight status distributions are not the same for at least one of the three airlines. At that point, what else might you want to know about the situation?<\/div>\r\n<\/div>\r\n<\/div>\r\n<div id=\"bp-page-2\" class=\"page\" data-page-number=\"2\" data-loaded=\"true\">\r\n<div><\/div>\r\n<div class=\"textLayer\">\r\n<div class=\"textbox key-takeaways\">\r\n<h3>Question 2<\/h3>\r\n<div class=\"textLayer\">2) Consider the following listed assumptions and conditions required to perform a chi-square test of homogeneity. Discuss whether each condition is met and whether there is any cause for concern. Appropriate Data and Variables: This is not an official condition, but it is important to make sure that we are dealing with data that give the counts for each value of a categorical variable. 
That categorical variable should be measured for a sample from each population of interest.<\/div>\r\n<div class=\"textLayer\">Part A: Is this condition met? Explain. Independence\/Randomness Condition: The samples from our populations should be independent, random samples or independent samples that can be considered representative of the respective populations.<\/div>\r\n<div class=\"textLayer\">Part B: Is this condition met with our sample of March 2021 flights? Explain. Large Sample Sizes Condition: The sample sizes need to be large enough so that the expected count in each cell is at least five. Notice that this condition requires us to know the expected count in each cell. We can use technology to obtain that information. Go to the DCMP Chi-Square Test tool at https:\/\/dcmathpathways.shinyapps.io\/ChiSquaredTest\/.\r\n<ul>\r\n<li>Click the Test of Independence\/Homogeneity tab at the top.<\/li>\r\n<li>In the drop-down menu for \u201cEnter Data,\u201d select \u201cTextbook\u201d and \u201cAirline Flight Status.\u201d<\/li>\r\n<li>Notice that the rows represent the different populations you are testing for homogeneity.<\/li>\r\n<li>Notice that the columns are the values of the categorical variable whose distributions we are comparing.<\/li>\r\n<li>Make sure to check the box for \u201cShow Expected Cell Counts.\u201d<\/li>\r\n<\/ul>\r\n<\/div>\r\n<div class=\"textLayer\">Part C: Is the expected cell frequency condition met? Explain.<\/div>\r\n<\/div>\r\n<\/div>\r\n<div><\/div>\r\n<div class=\"textLayer\">\r\n<div class=\"textbox key-takeaways\">\r\n<h3>Question 3<\/h3>\r\n3) Let\u2019s proceed with this chi-square test at a significance level of 0.01. Continue using the data analysis tool. What is the value of the chi-square test statistic obtained from the test?<\/div>\r\n<\/div>\r\n<div><\/div>\r\n<div class=\"textLayer\">\r\n<div class=\"textbox key-takeaways\">\r\n<h3>Question 4<\/h3>\r\n4) What is the P-value obtained from the chi-square test? 
What does the P-value represent and what does it tell you?<\/div>\r\n<\/div>\r\n<div><\/div>\r\n<div class=\"textLayer\">\r\n<div class=\"textbox key-takeaways\">\r\n<h3>Question 5<\/h3>\r\n5) What is the conclusion of our hypothesis test? State your conclusion in context. Even though we have already drawn a conclusion from our hypothesis test, there is still some information we can glean by looking at the difference between the observed count and the expected count for each cell. The data analysis tool calls this difference the residual for that cell (and the idea is similar to the concept of residuals you saw when looking at the differences between observed values and predicted values in the linear regression context). Residuals are calculated using the formula: Residual = Observed \u2212 Expected. Since the values in our cells may vary quite a bit, it\u2019s a good idea to look at what the data analysis tool calls standardized residuals instead. These are sometimes referred to as standardized Pearson residuals. These are values that standardize the residuals so that if the null hypothesis is assumed to be true, they can be interpreted as normal z-scores. In particular, if the null hypothesis is true, most standardized residuals for a given test will fall between \u22122 and 2. We can use these standardized residuals to determine how far off our observed count is from what was expected if the null hypothesis is true (i.e., if the distributions are really the same). 
The sign of the standardized residual tells us whether we observed more cases in that cell than we expected (a positive residual) or fewer cases than we expected (a negative residual).<\/div>\r\n<\/div>\r\n<div><\/div>\r\n<div class=\"textLayer\">\r\n<div class=\"textbox key-takeaways\">\r\n<h3>Question 6<\/h3>\r\n<div id=\"bp-page-2\" class=\"page\" data-page-number=\"2\" data-loaded=\"true\">\r\n<div class=\"textLayer\">6) In the data analysis tool, check the boxes for \u201cShow Residuals\u201d and \u201cShow Standardized Residuals.\u201d<\/div>\r\n<div class=\"textLayer\">Part A: Complete the following table with the standardized residuals for Delta Airlines and then interpret the standardized residuals.<table><thead><tr><th><\/th><th>Standardized Residual for On-time Flights<\/th><th>Standardized Residual for Delayed Flights<\/th><th>Standardized Residual for Canceled Flights<\/th><th>Standardized Residual for Diverted Flights<\/th><\/tr><\/thead><tbody><tr><th>Delta Airlines<\/th><td>31.9<\/td><td><\/td><td><\/td><td><\/td><\/tr><\/tbody><\/table><\/div>\r\n<\/div>\r\n<div id=\"bp-page-3\" class=\"page\" data-page-number=\"3\" data-loaded=\"true\">\r\n<div class=\"ba-Layer ba-Layer--region\" data-resin-fileid=\"910630775463\" data-resin-iscurrent=\"true\" data-resin-feature=\"annotations\" data-testid=\"ba-Layer--region\">\r\n<div class=\"ba-RegionAnnotations-list is-listening\" data-resin-component=\"regionList\"><span style=\"font-size: 1em;\">Part B: Complete the following table with the standardized residuals for Southwest Airlines and then interpret the standardized residuals.<\/span><table><thead><tr><th><\/th><th>Standardized Residual for On-time Flights<\/th><th>Standardized Residual for Delayed Flights<\/th><th>Standardized Residual for Canceled Flights<\/th><th>Standardized Residual for Diverted Flights<\/th><\/tr><\/thead><tbody><tr><th>Southwest Airlines<\/th><td>\u221233.3<\/td><td><\/td><td><\/td><td><\/td><\/tr><\/tbody><\/table><\/div>\r\n<div class=\"ba-RegionAnnotations-list is-listening\" data-resin-component=\"regionList\"><span style=\"font-size: 1em;\">Part C: Based on your answers to Parts A and B, which airline might you choose between Delta Airlines and Southwest Airlines if you were planning a flight? 
A word of caution: As we saw in the preview assignment, the degrees of freedom for a chi-square test of homogeneity are not related to the sample size at all, so they do not increase as the sample size increases. The degrees of freedom depend only on the number of rows and columns in the associated two-way table. As a consequence, when the sample size is very large, a chi-square test may reject the null hypothesis even when the actual differences between the distributions are small. In our airline example, we had a very large sample size for each population, and we got a very small P-value that led us to reject the null hypothesis.<\/span><\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<div id=\"bp-page-3\" class=\"page\" data-page-number=\"3\" data-loaded=\"true\">\r\n<div class=\"ba-Layer ba-Layer--region\" data-resin-fileid=\"910630775463\" data-resin-iscurrent=\"true\" data-resin-feature=\"annotations\" data-testid=\"ba-Layer--region\">\r\n<div data-resin-component=\"regionList\"><\/div>\r\n<div class=\"ba-RegionAnnotations-list is-listening\" data-resin-component=\"regionList\">\r\n<div class=\"textbox key-takeaways\">\r\n<h3>Question 7<\/h3>\r\n7) Discuss this result in terms of practical significance and statistical significance. Can we safely conclude that any of our three airlines are doing much better than the others?<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>","rendered":"<div id=\"bp-page-1\" class=\"page\" data-page-number=\"1\" data-loaded=\"true\">\n<div class=\"textLayer\">In the preview assignment, you were introduced to flight data for three airlines in March 2021. The two-way table is given again below.<a class=\"footnote\" title=\"U.S. Department of Transportation, Bureau of Transportation Statistics. (n.d.). On-time performance -Reporting operating carrier flight delays at a glance. 
https:\/\/www.transtats.bts.gov\/HomeDrillChart_Month.asp?5ry_lrn4=FDFD&amp;N44_Qry=E&amp;5ry_Pn44vr4=DDD&amp;5ry_Nv42146=DDD&amp;heY_fryrp6lrn4=FDFE&amp;heY_fryrp6Z106u=F\" id=\"return-footnote-5495-1\" href=\"#footnote-5495-1\" aria-label=\"Footnote 1\"><sup class=\"footnote\">[1]<\/sup><\/a> Recall that we are considering the question of whether the distributions of flight status are the same among the flights of the three airlines. For our test, the null hypothesis is that the distribution of flight status is the same for these three airlines, and the alternative hypothesis is that the distribution of flight status is not the same for these three airlines.<\/div>\n<div class=\"textLayer\"><table><thead><tr><th><\/th><th>On-Time Flights<\/th><th>Delayed Flights<\/th><th>Canceled Flights<\/th><th>Diverted Flights<\/th><th>Total<\/th><\/tr><\/thead><tbody><tr><th>American Airlines<\/th><td>42,600<\/td><td>4,657<\/td><td>296<\/td><td>95<\/td><td>47,648<\/td><\/tr><tr><th>Delta Airlines<\/th><td>51,620<\/td><td>4,030<\/td><td>150<\/td><td>56<\/td><td>55,856<\/td><\/tr><tr><th>Southwest Airlines<\/th><td>69,384<\/td><td>9,280<\/td><td>1,782<\/td><td>128<\/td><td>80,574<\/td><\/tr><tr><th>Total<\/th><td>163,604<\/td><td>17,967<\/td><td>2,228<\/td><td>279<\/td><td>184,078<\/td><\/tr><\/tbody><\/table><\/div>\n<div><\/div>\n<div class=\"textLayer\">\n<div class=\"textbox key-takeaways\">\n<h3>Question 1<\/h3>\n<p>1) Suppose that we obtain enough evidence to reject the null hypothesis; that is, we decide that there is strong evidence to support the idea that the flight status distributions are not the same for at least one of the three airlines. At that point, what else might you want to know about the situation?<\/p><\/div>\n<\/div>\n<\/div>\n<div id=\"bp-page-2\" class=\"page\" data-page-number=\"2\" data-loaded=\"true\">\n<div><\/div>\n<div class=\"textLayer\">\n<div class=\"textbox key-takeaways\">\n<h3>Question 2<\/h3>\n<div class=\"textLayer\">2) Consider the following listed assumptions and conditions required to perform a chi-square test of homogeneity. Discuss whether each condition is met and whether there is any cause for concern. Appropriate Data and Variables: This is not an official condition, but it is important to make sure that we are dealing with data that give the counts for each value of a categorical variable. 
That categorical variable should be measured for a sample from each population of interest.<\/div>\n<div class=\"textLayer\">Part A: Is this condition met? Explain. Independence\/Randomness Condition: The samples from our populations should be independent, random samples or independent samples that can be considered representative of the respective populations.<\/div>\n<div class=\"textLayer\">Part B: Is this condition met with our sample of March 2021 flights? Explain. Large Sample Sizes Condition: The sample sizes need to be large enough so that the expected count in each cell is at least five. Notice that this condition requires us to know the expected count in each cell. We can use technology to obtain that information. Go to the DCMP Chi-Square Test tool at https:\/\/dcmathpathways.shinyapps.io\/ChiSquaredTest\/.\n<ul>\n<li>Click the Test of Independence\/Homogeneity tab at the top.<\/li>\n<li>In the drop-down menu for \u201cEnter Data,\u201d select \u201cTextbook\u201d and \u201cAirline Flight Status.\u201d<\/li>\n<li>Notice that the rows represent the different populations you are testing for homogeneity.<\/li>\n<li>Notice that the columns are the values of the categorical variable whose distributions we are comparing.<\/li>\n<li>Make sure to check the box for \u201cShow Expected Cell Counts.\u201d<\/li>\n<\/ul>\n<\/div>\n<div class=\"textLayer\">Part C: Is the expected cell frequency condition met? Explain.<\/div>\n<\/div>\n<\/div>\n<div><\/div>\n<div class=\"textLayer\">\n<div class=\"textbox key-takeaways\">\n<h3>Question 3<\/h3>\n<p>3) Let\u2019s proceed with this chi-square test at a significance level of 0.01. Continue using the data analysis tool. What is the value of the chi-square test statistic obtained from the test?<\/p><\/div>\n<\/div>\n<div><\/div>\n<div class=\"textLayer\">\n<div class=\"textbox key-takeaways\">\n<h3>Question 4<\/h3>\n<p>4) What is the P-value obtained from the chi-square test? 
What does the P-value represent and what does it tell you?<\/p><\/div>\n<\/div>\n<div><\/div>\n<div class=\"textLayer\">\n<div class=\"textbox key-takeaways\">\n<h3>Question 5<\/h3>\n<p>5) What is the conclusion of our hypothesis test? State your conclusion in context. Even though we have already drawn a conclusion from our hypothesis test, there is still some information we can glean by looking at the difference between the observed count and the expected count for each cell. The data analysis tool calls this difference the residual for that cell (and the idea is similar to the concept of residuals you saw when looking at the differences between observed values and predicted values in the linear regression context). Residuals are calculated using the formula: Residual = Observed \u2212 Expected. Since the values in our cells may vary quite a bit, it\u2019s a good idea to look at what the data analysis tool calls standardized residuals instead. These are sometimes referred to as standardized Pearson residuals. These are values that standardize the residuals so that if the null hypothesis is assumed to be true, they can be interpreted as normal z-scores. In particular, if the null hypothesis is true, most standardized residuals for a given test will fall between \u22122 and 2. We can use these standardized residuals to determine how far off our observed count is from what was expected if the null hypothesis is true (i.e., if the distributions are really the same). 
The sign of the standardized residual tells us whether we observed more cases in that cell than we expected (a positive residual) or fewer cases than we expected (a negative residual).<\/p><\/div>\n<\/div>\n<div><\/div>\n<div class=\"textLayer\">\n<div class=\"textbox key-takeaways\">\n<h3>Question 6<\/h3>\n<div id=\"bp-page-2\" class=\"page\" data-page-number=\"2\" data-loaded=\"true\">\n<div class=\"textLayer\">6) In the data analysis tool, check the boxes for \u201cShow Residuals\u201d and \u201cShow Standardized Residuals.\u201d<\/div>\n<div class=\"textLayer\">Part A: Complete the following table with the standardized residuals for Delta Airlines and then interpret the standardized residuals.<table><thead><tr><th><\/th><th>Standardized Residual for On-time Flights<\/th><th>Standardized Residual for Delayed Flights<\/th><th>Standardized Residual for Canceled Flights<\/th><th>Standardized Residual for Diverted Flights<\/th><\/tr><\/thead><tbody><tr><th>Delta Airlines<\/th><td>31.9<\/td><td><\/td><td><\/td><td><\/td><\/tr><\/tbody><\/table><\/div>\n<\/div>\n<div id=\"bp-page-3\" class=\"page\" data-page-number=\"3\" data-loaded=\"true\">\n<div class=\"ba-Layer ba-Layer--region\" data-resin-fileid=\"910630775463\" data-resin-iscurrent=\"true\" data-resin-feature=\"annotations\" data-testid=\"ba-Layer--region\">\n<div class=\"ba-RegionAnnotations-list is-listening\" data-resin-component=\"regionList\"><span style=\"font-size: 1em;\">Part B: Complete the following table with the standardized residuals for Southwest Airlines and then interpret the standardized residuals.<\/span><table><thead><tr><th><\/th><th>Standardized Residual for On-time Flights<\/th><th>Standardized Residual for Delayed Flights<\/th><th>Standardized Residual for Canceled Flights<\/th><th>Standardized Residual for Diverted Flights<\/th><\/tr><\/thead><tbody><tr><th>Southwest Airlines<\/th><td>\u221233.3<\/td><td><\/td><td><\/td><td><\/td><\/tr><\/tbody><\/table><\/div>\n<div class=\"ba-RegionAnnotations-list is-listening\" data-resin-component=\"regionList\"><span style=\"font-size: 1em;\">Part C: Based on your answers to Parts A and B, which airline might you choose between Delta Airlines and Southwest Airlines if you were planning a flight? 
A word of caution: As we saw in the preview assignment, the degrees of freedom for a chi-square test of homogeneity are not related to the sample size at all, so they do not increase as the sample size increases. The degrees of freedom depend only on the number of rows and columns in the associated two-way table. As a consequence, when the sample size is very large, a chi-square test may reject the null hypothesis even when the actual differences between the distributions are small. In our airline example, we had a very large sample size for each population, and we got a very small P-value that led us to reject the null hypothesis.<\/span><\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div id=\"bp-page-3\" class=\"page\" data-page-number=\"3\" data-loaded=\"true\">\n<div class=\"ba-Layer ba-Layer--region\" data-resin-fileid=\"910630775463\" data-resin-iscurrent=\"true\" data-resin-feature=\"annotations\" data-testid=\"ba-Layer--region\">\n<div data-resin-component=\"regionList\"><\/div>\n<div class=\"ba-RegionAnnotations-list is-listening\" data-resin-component=\"regionList\">\n<div class=\"textbox key-takeaways\">\n<h3>Question 7<\/h3>\n<p>7) Discuss this result in terms of practical significance and statistical significance. Can we safely conclude that any of our three airlines are doing much better than the others?<\/p><\/div>\n<\/div>\n<\/div>\n<\/div>\n<hr class=\"before-footnotes clear\" \/><div class=\"footnotes\"><ol><li id=\"footnote-5495-1\">U.S. Department of Transportation, Bureau of Transportation Statistics. (n.d.). On-time performance -Reporting operating carrier flight delays at a glance. 
https:\/\/www.transtats.bts.gov\/HomeDrillChart_Month.asp?5ry_lrn4=FDFD&amp;N44_Qry=E&amp;5ry_Pn44vr4=DDD&amp;5ry_Nv42146=DDD&amp;heY_fryrp6lrn4=FDFE&amp;heY_fryrp6Z106u=F <a href=\"#return-footnote-5495-1\" class=\"return-footnote\" aria-label=\"Return to footnote 1\">&crarr;<\/a><\/li><\/ol><\/div>","protected":false},"author":23592,"menu_order":8,"template":"","meta":{"_candela_citation":"[]","CANDELA_OUTCOMES_GUID":"","pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[],"contributor":[],"license":[],"class_list":["post-5495","chapter","type-chapter","status-publish","hentry"],"part":5479,"_links":{"self":[{"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/chapters\/5495","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/wp\/v2\/users\/23592"}],"version-history":[{"count":1,"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/chapters\/5495\/revisions"}],"predecessor-version":[{"id":5496,"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/chapters\/5495\/revisions\/5496"}],"part":[{"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/parts\/5479"}],"metadata":[{"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/chapters\/5495\/metadata\/"}],"wp:attachment":[{"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/wp\/v2\/media?parent=5495"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"h
ref":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/chapter-type?post=5495"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/wp\/v2\/contributor?post=5495"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/wp\/v2\/license?post=5495"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}