{"id":5485,"date":"2022-09-14T09:48:04","date_gmt":"2022-09-14T09:48:04","guid":{"rendered":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/?post_type=chapter&#038;p=5485"},"modified":"2022-10-02T19:34:24","modified_gmt":"2022-10-02T19:34:24","slug":"15a-preview","status":"publish","type":"chapter","link":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/chapter\/15a-preview\/","title":{"raw":"15A Preview","rendered":"15A Preview"},"content":{"raw":"<div id=\"bp-page-1\" class=\"page\" data-page-number=\"1\" data-loaded=\"true\">\r\n<div class=\"textLayer\">Preparing for the next class<\/div>\r\n<div class=\"textLayer\">In the next in-class activity, you will need to be able to calculate expected counts for a single categorical variable, calculate a chi-square test statistic and explain the purpose behind each step in the calculation, and use technology to find the probability associated with a chi-square test statistic value. In the next activity, we will test hypotheses about the frequency distribution of a categorical variable and consider hypotheses that compare the proportion of a population that falls into two or more possible categories. Recall that a frequency table organizes data by listing different possible categories and the number of times each category occurs in the dataset. Thus, the frequency table displays the frequency or relative frequency distribution of a categorical variable. In this activity, you will compare the observed distribution of a categorical variable to its hypothesized distribution. In doing so, you will perform a chi-square test for goodness of fit. The \u201cchi-square\u201d part of the name represents the underlying statistical distribution (the chi-square distribution), which you will explore in this assignment. 
The \u201cgoodness of fit\u201d part of the name describes the main task at hand: comparing how well the observed values \u201cfit\u201d with the hypothesized values or baseline distribution. <strong>Example: Italian Soccer<\/strong> Italian youth soccer leagues create cohorts of children based on year of birth. For example, children born in 2005 only played other children born in that same year. If a child was born on December 31, 2004, they played with the 2004 cohort (rather than the younger 2005 cohort). So, children born earlier in the year (e.g., January or February) tend to be the eldest players in their leagues. Children born later in the year (e.g., November or December) tend to be the youngest players in their leagues. Could this seemingly unimportant practice\u2014grouping by year of birth\u2014have an effect on players\u2019 later soccer careers? We\u2019re going to explore this question using data<sup>1<\/sup> compiled by researchers on professional Italian soccer players.<\/div>\r\n<div class=\"textLayer\">\r\n<div class=\"textbox key-takeaways\">\r\n<h3>Question 1<\/h3>\r\n<div id=\"bp-page-1\" class=\"page\" data-page-number=\"1\" data-loaded=\"true\">\r\n<div class=\"textLayer\">1) A calendar year can be defined in quarters (as shown in the following table). Birth rates differ between quarters of the year. Some quarters are longer (have more days), and different cultures have different preferences for times of birth. Researchers measured birth rates in Italy and found the following results:<\/div>\r\n<div class=\"textLayer\"><sup>1<\/sup> Fumarco, L. &amp; Rossi, G. (2018, August 8). The relative age effect on labour market outcomes \u2013 Evidence from Italian football. European Sport Management Quarterly, 18(4), 501\u2013516. DOI: 10.1080\/16184742.2018.1424225<\/div>\r\n<\/div>\r\n<table>\r\n<tbody>\r\n<tr>\r\n<td><strong>Quarter<\/strong><\/td>\r\n<td><strong>Quarter 1<\/strong>\r\n\r\n<strong>(Jan. 
\u2013 March)<\/strong><\/td>\r\n<td><strong>Quarter 2 <\/strong>\r\n\r\n<strong>(April \u2013 June)<\/strong><\/td>\r\n<td><strong>Quarter 3<\/strong>\r\n\r\n<strong>(July \u2013 Sept.)<\/strong><\/td>\r\n<td><strong>Quarter 4<\/strong>\r\n\r\n<strong>(Oct. \u2013 Dec.)<\/strong><\/td>\r\n<\/tr>\r\n<tr>\r\n<td>Proportion of births in Italy<\/td>\r\n<td>0.2248<\/td>\r\n<td>0.2498<\/td>\r\n<td>0.2574<\/td>\r\n<td>0.2680<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<div id=\"bp-page-2\" class=\"page\" data-page-number=\"2\" data-loaded=\"true\">\r\n<div class=\"textLayer\">Part A: Researchers collected data on 1,703 professional Italian soccer players. Assume their birthdates are distributed across the four quarters in the same way as the birthdates in the general Italian population presented in the previous table. If the birth quarters of the professional Italian soccer players are like the general Italian population, how many of the players would be born in each quarter? Find the expected counts for each quarter. Show your results and work in the following table.<\/div>\r\n<div>\r\n<table>\r\n<tbody>\r\n<tr>\r\n<td><strong>Quarter<\/strong><\/td>\r\n<td><strong>Quarter 1<\/strong>\r\n\r\n<strong>(Jan. \u2013 March)<\/strong><\/td>\r\n<td><strong>Quarter 2 <\/strong>\r\n\r\n<strong>(April \u2013 June)<\/strong><\/td>\r\n<td><strong>Quarter 3<\/strong>\r\n\r\n<strong>(July \u2013 Sept.)<\/strong><\/td>\r\n<td><strong>Quarter 4<\/strong>\r\n\r\n<strong>(Oct. 
\u2013 Dec.)<\/strong><\/td>\r\n<\/tr>\r\n<tr>\r\n<td>Expected number of soccer players born this quarter<\/td>\r\n<td>&nbsp;\r\n\r\n0.2248 \u00d7 1703 = <strong>382.83<\/strong>\r\n\r\n&nbsp;<\/td>\r\n<td><\/td>\r\n<td><\/td>\r\n<td><\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<\/div>\r\n<div class=\"textLayer\">Hint: Since the expected counts are theoretical values, they do not need to be whole numbers. The observed birthdates of the actual sample of 1,703 professional Italian soccer players were classified by quarter and the results are summarized in the following table.<\/div>\r\n<div>\r\n<table>\r\n<tbody>\r\n<tr>\r\n<td><strong>Quarter<\/strong><\/td>\r\n<td><strong>Quarter 1<\/strong>\r\n\r\n<strong>(Jan. \u2013 March)<\/strong><\/td>\r\n<td><strong>Quarter 2 <\/strong>\r\n\r\n<strong>(April \u2013 June)<\/strong><\/td>\r\n<td><strong>Quarter 3<\/strong>\r\n\r\n<strong>(July \u2013 Sept.)<\/strong><\/td>\r\n<td><strong>Quarter 4<\/strong>\r\n\r\n<strong>(Oct. \u2013 Dec.)<\/strong><\/td>\r\n<\/tr>\r\n<tr>\r\n<td>Observed number of soccer players<\/td>\r\n<td>507<\/td>\r\n<td>534<\/td>\r\n<td>389<\/td>\r\n<td>273<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<\/div>\r\n<div class=\"textLayer\">Part B: Compare the expected counts in Part A to the actual observed counts. Do the observed counts have a different pattern than the expected counts? 
If so, describe the difference.<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<div id=\"bp-page-2\" class=\"page\" data-page-number=\"2\" data-loaded=\"true\">\r\n<div class=\"textLayer\">\r\n<div class=\"textbox key-takeaways\">\r\n<h3>Question 2<\/h3>\r\n2) Let\u2019s quantify the differences between the expected counts and the observed counts.\r\n<table>\r\n<tbody>\r\n<tr>\r\n<td><strong>Quarter<\/strong><\/td>\r\n<td><strong>Observed (O)<\/strong><\/td>\r\n<td><strong>Expected (E)<\/strong><\/td>\r\n<td><strong>(O \u2013 E)<\/strong><\/td>\r\n<td><strong>(O \u2013 E)<sup>2<\/sup><\/strong><\/td>\r\n<td><strong>(O \u2013 E)<sup>2<\/sup> \/ E<\/strong><\/td>\r\n<\/tr>\r\n<tr>\r\n<td>1<\/td>\r\n<td>507<\/td>\r\n<td>382.83<\/td>\r\n<td><\/td>\r\n<td><\/td>\r\n<td><\/td>\r\n<\/tr>\r\n<tr>\r\n<td>2<\/td>\r\n<td>534<\/td>\r\n<td>425.41<\/td>\r\n<td><\/td>\r\n<td><\/td>\r\n<td><\/td>\r\n<\/tr>\r\n<tr>\r\n<td>3<\/td>\r\n<td>389<\/td>\r\n<td>438.35<\/td>\r\n<td><\/td>\r\n<td><\/td>\r\n<td><\/td>\r\n<\/tr>\r\n<tr>\r\n<td>4<\/td>\r\n<td>273<\/td>\r\n<td>456.40<\/td>\r\n<td><\/td>\r\n<td><\/td>\r\n<td><\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\nPart A: For each quarter, find the difference between the observed and the expected numbers of soccer players with birthdates in the quarter (Observed \u2013 Expected, or O \u2013 E). Fill in the (O \u2013 E) column in the previous table.\r\nPart B: To get a sense of the overall difference between the observed and expected counts, we may be tempted to add up all of the differences we calculated. Why might this be counterproductive?\r\nPart C: To get a full set of positive differences, one way is to square the differences. This is similar to the way we calculate standard deviations. Write your answers in the (O \u2013 E)<sup>2<\/sup> column of the previous table.\r\nPart D: To further understand the intuition of the next step in the previous scenario, let\u2019s briefly explore a simple example. Imagine you\u2019re a senior in high school in these two scenarios: i. You are buying donuts for three of your friends. 
When your friends reach into the bag, they only find two donuts\u2014the store has shorted you by one donut. ii. You are buying donuts for your entire school. There are 600 students at your school. The donut shop mistakenly only gives you 599 donuts in the boxes you bring to school. In which situation is the donut shortage more severe? Explain.\r\n<div class=\"textLayer\">Part E: To get a sense of the difference between our observed and expected values on the scale of what we expected, divide the values in the (O \u2013 E)<sup>2<\/sup> column by the expected counts. Put your results in the final (\ud835\udc42 \u2212 \ud835\udc38)<sup>2<\/sup> \/ \ud835\udc38 column in the previous table.<\/div>\r\n<div class=\"textLayer\">Part F: Finally, add up the final values in the (\ud835\udc42 \u2212 \ud835\udc38)<sup>2<\/sup> \/ \ud835\udc38 column and report your result.<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<div id=\"bp-page-4\" class=\"page\" data-page-number=\"4\" data-loaded=\"true\">\r\n<div class=\"textLayer\">You just calculated the value of the chi-square (pronounced \u201ckai-square\u201d) test statistic for this problem. It measures the overall distance between observed and expected counts. The greater the chi-square test statistic, the further the observed counts are from what we expected. Here is the formula for the chi-square test statistic: \ud835\udf12<sup>2<\/sup> = \u2211 (Observed \u2212 Expected)<sup>2<\/sup> \/ Expected. This formula shows what we did in Part F\u2014we added up (the large sigma \u03a3 represents summation) the (\ud835\udc42 \u2212 \ud835\udc38)<sup>2<\/sup> \/ \ud835\udc38 for each quarter of the year (each category). It\u2019s important to remember the intuition behind this formula\u2014we get the differences, square them to get rid of the negative values, and then scale them by dividing the squared differences by the expected counts. In this way, we get a robust measure of the overall difference between the observed and expected counts for a categorical variable. So, what does our chi-square test statistic value mean? 
To assess what our chi-square value tells us about the distance between the expected and observed counts, we\u2019ll turn to the chi-square distribution. Go to the DCMP Chi-Square Distribution tool at https:\/\/dcmathpathways.shinyapps.io\/ChisqDist\/.<\/div>\r\n<div class=\"textLayer\">\r\n<div class=\"textbox key-takeaways\">\r\n<h3>Question 3<\/h3>\r\n<div id=\"bp-page-4\" class=\"page\" data-page-number=\"4\" data-loaded=\"true\">\r\n<div class=\"textLayer\">3) Let\u2019s assume that the distribution of birthdates among all professional Italian soccer players is the same as the distribution of birthdates in the general Italian population. Under this assumption, and if certain conditions are met (we will talk about these conditions in the next in-class activity), the statistic we calculated should follow a chi-square distribution with three degrees of freedom (df = 3). The number of degrees of freedom is one fewer than the number of possible categories for our categorical variable. In our case, we categorized birthdates into four quarters, so one fewer makes three degrees of freedom.<\/div>\r\n<\/div>\r\n<div id=\"bp-page-5\" class=\"page\" data-page-number=\"5\" data-loaded=\"true\">\r\n<div class=\"textLayer\">Part A: Set the degrees of freedom slider to three and observe the shape of the chi-square distribution. Does the distribution have any negative values? Does this make sense? Explain. Hint: Think about how we dealt with negatives when calculating the chi-square test statistic.<\/div>\r\n<div class=\"textLayer\">Part B: In the chi-square distribution with three degrees of freedom, are small values (close to 0) more common or are large values (six or higher) more common if our assumption about the distribution of the categorical variable is true? Does this make sense? 
Explain.<\/div>\r\n<div class=\"textLayer\">Hint: Think about the assumption stated in the setup to Question 3.<\/div>\r\n<div class=\"textLayer\">Part C: Select the Find Probability tab at the top of the data analysis tool. Keep the same degrees of freedom (df = 3) and select the \u201cUpper Tail\u201d probability option. Enter your calculated chi-square statistic value where it says \u201cValue of x.\u201d Record the probability the data analysis tool shows, and comment on what this probability says about the assumption that the distribution of birthdates among all professional Italian soccer players is the same as the distribution of birthdates in the general Italian population.<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>","rendered":"<div id=\"bp-page-1\" class=\"page\" data-page-number=\"1\" data-loaded=\"true\">\n<div class=\"textLayer\">Preparing for the next class<\/div>\n<div class=\"textLayer\">In the next in-class activity, you will need to be able to calculate expected counts for a single categorical variable, calculate a chi-square test statistic and explain the purpose behind each step in the calculation, and use technology to find the probability associated with a chi-square test statistic value. In the next activity, we will test hypotheses about the frequency distribution of a categorical variable and consider hypotheses that compare the proportion of a population that falls into two or more possible categories. Recall that a frequency table organizes data by listing different possible categories and the number of times each category occurs in the dataset. Thus, the frequency table displays the frequency or relative frequency distribution of a categorical variable. In this activity, you will compare the observed distribution of a categorical variable to its hypothesized distribution. In doing so, you will perform a chi-square test for goodness of fit. 
The \u201cchi-square\u201d part of the name represents the underlying statistical distribution (the chi-square distribution), which you will explore in this assignment. The \u201cgoodness of fit\u201d part of the name describes the main task at hand: comparing how well the observed values \u201cfit\u201d with the hypothesized values or baseline distribution. <strong>Example: Italian Soccer<\/strong> Italian youth soccer leagues create cohorts of children based on year of birth. For example, children born in 2005 only played other children born in that same year. If a child was born on December 31, 2004, they played with the 2004 cohort (rather than the younger 2005 cohort). So, children born earlier in the year (e.g., January or February) tend to be the eldest players in their leagues. Children born later in the year (e.g., November or December) tend to be the youngest players in their leagues. Could this seemingly unimportant practice\u2014grouping by year of birth\u2014have an effect on players\u2019 later soccer careers? We\u2019re going to explore this question using data<sup>1<\/sup> compiled by researchers on professional Italian soccer players.<\/div>\n<div class=\"textLayer\">\n<div class=\"textbox key-takeaways\">\n<h3>Question 1<\/h3>\n<div id=\"bp-page-1\" class=\"page\" data-page-number=\"1\" data-loaded=\"true\">\n<div class=\"textLayer\">1) A calendar year can be defined in quarters (as shown in the following table). Birth rates differ between quarters of the year. Some quarters are longer (have more days), and different cultures have different preferences for times of birth. Researchers measured birth rates in Italy and found the following results:<\/div>\n<div class=\"textLayer\"><sup>1<\/sup> Fumarco, L. &amp; Rossi, G. (2018, August 8). The relative age effect on labour market outcomes \u2013 Evidence from Italian football. European Sport Management Quarterly, 18(4), 501\u2013516. DOI: 10.1080\/16184742.2018.1424225<\/div>\n<\/div>\n<table>\n<tbody>\n<tr>\n<td><strong>Quarter<\/strong><\/td>\n<td><strong>Quarter 1<\/strong><\/p>\n<p><strong>(Jan. 
\u2013 March)<\/strong><\/td>\n<td><strong>Quarter 2 <\/strong><\/p>\n<p><strong>(April \u2013 June)<\/strong><\/td>\n<td><strong>Quarter 3<\/strong><\/p>\n<p><strong>(July \u2013 Sept.)<\/strong><\/td>\n<td><strong>Quarter 4<\/strong><\/p>\n<p><strong>(Oct. \u2013 Dec.)<\/strong><\/td>\n<\/tr>\n<tr>\n<td>Proportion of births in Italy<\/td>\n<td>0.2248<\/td>\n<td>0.2498<\/td>\n<td>0.2574<\/td>\n<td>0.2680<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<div id=\"bp-page-2\" class=\"page\" data-page-number=\"2\" data-loaded=\"true\">\n<div class=\"textLayer\">Part A: Researchers collected data on 1,703 professional Italian soccer players. Assume their birthdates are distributed across the four quarters in the same way as the birthdates in the general Italian population presented in the previous table. If the birth quarters of the professional Italian soccer players are like the general Italian population, how many of the players would be born in each quarter? Find the expected counts for each quarter. Show your results and work in the following table.<\/div>\n<div>\n<table>\n<tbody>\n<tr>\n<td><strong>Quarter<\/strong><\/td>\n<td><strong>Quarter 1<\/strong><\/p>\n<p><strong>(Jan. \u2013 March)<\/strong><\/td>\n<td><strong>Quarter 2 <\/strong><\/p>\n<p><strong>(April \u2013 June)<\/strong><\/td>\n<td><strong>Quarter 3<\/strong><\/p>\n<p><strong>(July \u2013 Sept.)<\/strong><\/td>\n<td><strong>Quarter 4<\/strong><\/p>\n<p><strong>(Oct. 
\u2013 Dec.)<\/strong><\/td>\n<\/tr>\n<tr>\n<td>Expected number of soccer players born this quarter<\/td>\n<td>&nbsp;<\/p>\n<p>0.2248 \u00d7 1703 = <strong>382.83<\/strong><\/p>\n<p>&nbsp;<\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<div class=\"textLayer\">Hint: Since the expected counts are theoretical values, they do not need to be whole numbers. The observed birthdates of the actual sample of 1,703 professional Italian soccer players were classified by quarter and the results are summarized in the following table.<\/div>\n<div>\n<table>\n<tbody>\n<tr>\n<td><strong>Quarter<\/strong><\/td>\n<td><strong>Quarter 1<\/strong><\/p>\n<p><strong>(Jan. \u2013 March)<\/strong><\/td>\n<td><strong>Quarter 2 <\/strong><\/p>\n<p><strong>(April \u2013 June)<\/strong><\/td>\n<td><strong>Quarter 3<\/strong><\/p>\n<p><strong>(July \u2013 Sept.)<\/strong><\/td>\n<td><strong>Quarter 4<\/strong><\/p>\n<p><strong>(Oct. \u2013 Dec.)<\/strong><\/td>\n<\/tr>\n<tr>\n<td>Observed number of soccer players<\/td>\n<td>507<\/td>\n<td>534<\/td>\n<td>389<\/td>\n<td>273<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<div class=\"textLayer\">Part B: Compare the expected counts in Part A to the actual observed counts. Do the observed counts have a different pattern than the expected counts? 
If so, describe the difference.<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div id=\"bp-page-2\" class=\"page\" data-page-number=\"2\" data-loaded=\"true\">\n<div class=\"textLayer\">\n<div class=\"textbox key-takeaways\">\n<h3>Question 2<\/h3>\n<p>2) Let\u2019s quantify the differences between the expected counts and the observed counts.<\/p>\n<table>\n<tbody>\n<tr>\n<td><strong>Quarter<\/strong><\/td>\n<td><strong>Observed (O)<\/strong><\/td>\n<td><strong>Expected (E)<\/strong><\/td>\n<td><strong>(O \u2013 E)<\/strong><\/td>\n<td><strong>(O \u2013 E)<sup>2<\/sup><\/strong><\/td>\n<td><strong>(O \u2013 E)<sup>2<\/sup> \/ E<\/strong><\/td>\n<\/tr>\n<tr>\n<td>1<\/td>\n<td>507<\/td>\n<td>382.83<\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td>2<\/td>\n<td>534<\/td>\n<td>425.41<\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td>3<\/td>\n<td>389<\/td>\n<td>438.35<\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td>4<\/td>\n<td>273<\/td>\n<td>456.40<\/td>\n<td><\/td>\n<td><\/td>\n<td><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Part A: For each quarter, find the difference between the observed and the expected numbers of soccer players with birthdates in the quarter (Observed \u2013 Expected, or O \u2013 E). Fill in the (O \u2013 E) column in the previous table.<br \/>\nPart B: To get a sense of the overall difference between the observed and expected counts, we may be tempted to add up all of the differences we calculated. Why might this be counterproductive?<br \/>\nPart C: To get a full set of positive differences, one way is to square the differences. This is similar to the way we calculate standard deviations. Write your answers in the (O \u2013 E)<sup>2<\/sup> column of the previous table.<br \/>\nPart D: To further understand the intuition of the next step in the previous scenario, let\u2019s briefly explore a simple example. Imagine you\u2019re a senior in high school in these two scenarios: i. You are buying donuts for three of your friends. 
When your friends reach into the bag, they only find two donuts\u2014the store has shorted you by one donut. ii. You are buying donuts for your entire school. There are 600 students at your school. The donut shop mistakenly only gives you 599 donuts in the boxes you bring to school. In which situation is the donut shortage more severe? Explain.<\/p>\n<div class=\"textLayer\">Part E: To get a sense of the difference between our observed and expected values on the scale of what we expected, divide the values in the (O \u2013 E)<sup>2<\/sup> column by the expected counts. Put your results in the final (\ud835\udc42 \u2212 \ud835\udc38)<sup>2<\/sup> \/ \ud835\udc38 column in the previous table.<\/div>\n<div class=\"textLayer\">Part F: Finally, add up the final values in the (\ud835\udc42 \u2212 \ud835\udc38)<sup>2<\/sup> \/ \ud835\udc38 column and report your result.<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div id=\"bp-page-4\" class=\"page\" data-page-number=\"4\" data-loaded=\"true\">\n<div class=\"textLayer\">You just calculated the value of the chi-square (pronounced \u201ckai-square\u201d) test statistic for this problem. It measures the overall distance between observed and expected counts. The greater the chi-square test statistic, the further the observed counts are from what we expected. Here is the formula for the chi-square test statistic: \ud835\udf12<sup>2<\/sup> = \u2211 (Observed \u2212 Expected)<sup>2<\/sup> \/ Expected. This formula shows what we did in Part F\u2014we added up (the large sigma \u03a3 represents summation) the (\ud835\udc42 \u2212 \ud835\udc38)<sup>2<\/sup> \/ \ud835\udc38 for each quarter of the year (each category). It\u2019s important to remember the intuition behind this formula\u2014we get the differences, square them to get rid of the negative values, and then scale them by dividing the squared differences by the expected counts. In this way, we get a robust measure of the overall difference between the observed and expected counts for a categorical variable. So, what does our chi-square test statistic value mean? 
To assess what our chi-square value tells us about the distance between the expected and observed counts, we\u2019ll turn to the chi-square distribution. Go to the DCMP Chi-Square Distribution tool at https:\/\/dcmathpathways.shinyapps.io\/ChisqDist\/.<\/div>\n<div class=\"textLayer\">\n<div class=\"textbox key-takeaways\">\n<h3>Question 3<\/h3>\n<div id=\"bp-page-4\" class=\"page\" data-page-number=\"4\" data-loaded=\"true\">\n<div class=\"textLayer\">3) Let\u2019s assume that the distribution of birthdates among all professional Italian soccer players is the same as the distribution of birthdates in the general Italian population. Under this assumption, and if certain conditions are met (we will talk about these conditions in the next in-class activity), the statistic we calculated should follow a chi-square distribution with three degrees of freedom (df = 3). The number of degrees of freedom is one fewer than the number of possible categories for our categorical variable. In our case, we categorized birthdates into four quarters, so one fewer makes three degrees of freedom.<\/div>\n<\/div>\n<div id=\"bp-page-5\" class=\"page\" data-page-number=\"5\" data-loaded=\"true\">\n<div class=\"textLayer\">Part A: Set the degrees of freedom slider to three and observe the shape of the chi-square distribution. Does the distribution have any negative values? Does this make sense? Explain. Hint: Think about how we dealt with negatives when calculating the chi-square test statistic.<\/div>\n<div class=\"textLayer\">Part B: In the chi-square distribution with three degrees of freedom, are small values (close to 0) more common or are large values (six or higher) more common if our assumption about the distribution of the categorical variable is true? Does this make sense? 
Explain.<\/div>\n<div class=\"textLayer\">Hint: Think about the assumption stated in the setup to Question 3.<\/div>\n<div class=\"textLayer\">Part C: Select the Find Probability tab at the top of the data analysis tool. Keep the same degrees of freedom (df = 3) and select the \u201cUpper Tail\u201d probability option. Enter your calculated chi-square statistic value where it says \u201cValue of x.\u201d Record the probability the data analysis tool shows, and comment on what this probability says about the assumption that the distribution of birthdates among all professional Italian soccer players is the same as the distribution of birthdates in the general Italian population.<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n","protected":false},"author":23592,"menu_order":3,"template":"","meta":{"_candela_citation":"[]","CANDELA_OUTCOMES_GUID":"","pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[],"contributor":[],"license":[],"class_list":["post-5485","chapter","type-chapter","status-publish","hentry"],"part":5479,"_links":{"self":[{"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/chapters\/5485","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/wp\/v2\/users\/23592"}],"version-history":[{"count":3,"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/chapters\/5485\/revisions"}],"predecessor-version":[{"id":5601,"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/chapters\/5485\/revisions\/5601"}],"part":[{"href":"https:\/\/cou
rses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/parts\/5479"}],"metadata":[{"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/chapters\/5485\/metadata\/"}],"wp:attachment":[{"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/wp\/v2\/media?parent=5485"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/chapter-type?post=5485"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/wp\/v2\/contributor?post=5485"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/wp\/v2\/license?post=5485"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}