{"id":5532,"date":"2022-09-21T13:11:54","date_gmt":"2022-09-21T13:11:54","guid":{"rendered":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/?post_type=chapter&#038;p=5532"},"modified":"2022-10-11T20:32:46","modified_gmt":"2022-10-11T20:32:46","slug":"16c-inclass","status":"publish","type":"chapter","link":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/chapter\/16c-inclass\/","title":{"raw":"16C InClass","rendered":"16C InClass"},"content":{"raw":"<div id=\"bp-page-1\" class=\"page\" data-page-number=\"1\" data-loaded=\"true\">\r\n<div class=\"textLayer\">Suppose you are a data scientist for Capital Bikeshare in Washington, D.C., and your job is to develop a linear regression model to predict the number of bike rentals based on the temperature. These predictions will be used to help determine the number of bikes to make available across the city each day. Previously, you\u2019ve used the regression model to calculate a predicted value of the response given a particular value of the explanatory variable. This time you decide to include an interval with your predictions, so that you report a plausible range of values for the number of bike rentals at a particular temperature.<\/div>\r\n<div><img class=\"\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2022\/01\/27010004\/Picture961-300x201.jpg\" alt=\"Someone riding a bike on a road in the countryside\" width=\"463\" height=\"310\" \/><\/div>\r\n<div class=\"textLayer\">\r\n<div class=\"textbox key-takeaways\">\r\n<h3>Question 1<\/h3>\r\n1) Briefly explain why it may be helpful to the team using your results to include an interval along with the predicted value.\r\n\r\n<\/div>\r\n<\/div>\r\n<div class=\"textLayer\">Today\u2019s dataset is from Capital Bikeshare[footnote]Capital Bikeshare. (n.d.). Declare your independence. https:\/\/www.capitalbikeshare.com Credit: iStock\/Asawiin_Klabma[\/footnote] in Washington, D.C. 
It contains daily information about the number of bike rentals, weather, day of the week, and other details for days in 2011 and 2012. Your primary objective in this activity is to predict the number of daily bike rentals during the winter months (December 21 to March 20). To do so, you\u2019ll use data from 50 randomly selected winter days in 2011 and 2012. The variables of interest for this activity are: \u2022 count: Total number of bikes rented \u2022 temperature: Approximate high temperature in degrees Fahrenheit<\/div>\r\n<div><\/div>\r\n<div class=\"textLayer\">\r\n<div class=\"textbox key-takeaways\">\r\n<h3>Question 2<\/h3>\r\n<div class=\"textLayer\">2) The primary goal is to use a linear regression model to predict the daily number of bike rentals based on the temperature in the winter. Go to the DCMP Linear Regression tool at https:\/\/dcmathpathways.shinyapps.io\/LinearRegression\/.<\/div>\r\n<div class=\"textLayer\">\u2022 Access spreadsheet DCMP_STAT_16C_dcbikeshare_winter_sample.<\/div>\r\n<div class=\"textLayer\">\u2022 Under \u201cEnter Data,\u201d select \u201cEnter Own.\u201d<\/div>\r\n<div class=\"textLayer\">\u2022 Select the appropriate explanatory variable (\ud835\udc4b) and response variable (\ud835\udc4c).<\/div>\r\n<div class=\"textLayer\">\u2022 Enter the data.<\/div>\r\n<div class=\"textLayer\">Part A: Create a graphical display of the two variables to visualize the relationship between the daily temperature and the number of bike rentals in the winter. Include an informative title and axis labels.<\/div>\r\n<div class=\"textLayer\">Part B: Use the plot to describe the relationship between the number of bike rentals and the temperature.<\/div>\r\n<\/div>\r\n<\/div>\r\n<div class=\"textLayer\">\r\n<div class=\"textbox key-takeaways\">\r\n<h3>Question 3<\/h3>\r\n<div id=\"bp-page-1\" class=\"page\" data-page-number=\"1\" data-loaded=\"true\">\r\n<div class=\"textLayer\">3) Now let\u2019s fit the linear regression model.<\/div>\r\n<div 
class=\"textLayer\">Part A: Use the linear regression tool (i.e., data analysis tool) to calculate the equation for the line of best fit. Write the equation using contextualized variable names.<\/div>\r\n<div class=\"textLayer\">Part B: Interpret the slope in the context of the data.<\/div>\r\n<div class=\"textLayer\">Part C: Briefly explain why the intercept is negative, even though it isn\u2019t possible to have a negative number of bike rentals.<\/div>\r\n<div class=\"textLayer\">Part D: Create a plot of the residuals vs. the fitted values. Based on this plot, is the linear regression equation a reasonable fit for these data? Explain. Recall that the residual plot is found on the Fitted Values &amp; Residual Analysis tab.<\/div>\r\n<\/div>\r\n<div id=\"bp-page-2\" class=\"page\" data-page-number=\"2\" data-loaded=\"true\">\r\n<div class=\"annotationLayer\"><span style=\"font-size: 1em;\">Part E: What is the predicted number of bike rentals when the temperature is 48 degrees Fahrenheit on a winter day?<\/span><\/div>\r\n<div class=\"annotationLayer\"><span style=\"font-size: 1em;\">Although the equation of the line of best fit is used to calculate the expected number of bike rentals on 48-degree days, we know that the same number of bikes is not rented on every winter day with a temperature of 48 degrees. This can be seen graphically in the scatter of points about the regression line. In addition, we have learned about sampling variability in previous lessons. If we were to randomly select another sample of 50 winter days, the line of best fit would be different, so the predicted number of bike rentals (the point estimate) for 48-degree days would change. As you\u2019ve seen previously for means, proportions, and slopes, you can calculate an interval to account for the variability in the predicted values. 
Before calculating the interval for predicted values, however, we first need to consider the type of prediction we\u2019re most interested in obtaining. There are two ways we can use the equation of the line of best fit:<\/span><\/div>\r\n<div class=\"annotationLayer\"><span style=\"font-size: 1em;\">1. To estimate the mean value of the response when the explanatory variable is equal to a particular value, \ud835\udc65<sub>0<\/sub><\/span><\/div>\r\n<div class=\"annotationLayer\"><span style=\"font-size: 1em;\">2. To predict the value of the response for an individual observation when the explanatory variable is equal to \ud835\udc65<sub>0<\/sub><\/span><\/div>\r\n<div class=\"annotationLayer\"><span style=\"font-size: 1em;\">The type of interval calculated will depend on whether the goal is to estimate the mean response for a value of the explanatory variable or to predict the value of the response variable for an individual observation. Suppose we have the following equation of the line of best fit: \ud835\udc66\u0302 = \ud835\udc4e + \ud835\udc4f\ud835\udc65. When the objective is to estimate the mean value of the response variable for a particular value of the explanatory variable, \ud835\udc65<sub>0<\/sub>, we will calculate a confidence interval for the mean response. This interval gives us a range of plausible values for the mean of the response variable when \ud835\udc65 = \ud835\udc65<sub>0<\/sub>. In practice, we will use software to calculate the interval, so our focus is on the interpretation. We can interpret the \ud835\udc36% interval as follows: We are \ud835\udc36% confident that the mean response when the explanatory variable equals \ud835\udc65<sub>0<\/sub> is between (lower bound) and (upper bound). <\/span><\/div>\r\n<div class=\"annotationLayer\"><span style=\"font-size: 1em;\">Part F: Use technology to calculate a 95% confidence interval for the mean number of bike rentals when the temperature is 48 degrees Fahrenheit on winter days. 
<\/span><\/div>\r\n<div class=\"annotationLayer\"><span style=\"font-size: 1em;\">Part G: What is the center of the interval? What is the margin of error?<\/span><\/div>\r\n<\/div>\r\n<div id=\"bp-page-3\" class=\"page\" data-page-number=\"3\" data-loaded=\"true\">\r\n<div class=\"ba-Layer ba-Layer--region\" data-resin-fileid=\"910623207586\" data-resin-iscurrent=\"true\" data-resin-feature=\"annotations\" data-testid=\"ba-Layer--region\">\r\n<div class=\"ba-RegionAnnotations-list is-listening\" data-resin-component=\"regionList\"><span style=\"font-size: 1em;\">Part H: Interpret the interval in the context of the data. When the objective is to predict the value of the response variable for an individual observation with the explanatory variable equal to \ud835\udc65<sub>0<\/sub>, we will calculate a \ud835\udc36% prediction interval for an individual response, where \ud835\udc36 is the confidence level. This interval gives us a range of plausible values of the response when an individual observation has a value of the explanatory variable equal to \ud835\udc65<sub>0<\/sub>. In practice, we will use software to calculate the interval, so our focus is on the interpretation. We can interpret the interval as follows: We are \ud835\udc36% confident that the value of the response variable for an individual with a value of the explanatory variable equal to \ud835\udc65<sub>0<\/sub> is between (lower bound) and (upper bound). When calculating the prediction interval for an individual observation, we have to take into account two sources of variability (i.e., the reasons our point estimates or predictions may not be exactly right). 
These are sources of variability due to: (1) the individual values that vary around the population regression line and (2) the fact that we don\u2019t have the equation of the population regression line and must rely on estimates of the slope and intercept.<\/span><\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<div id=\"bp-page-3\" class=\"page\" data-page-number=\"3\" data-loaded=\"true\">\r\n<div class=\"ba-Layer ba-Layer--region\" data-resin-fileid=\"910623207586\" data-resin-iscurrent=\"true\" data-resin-feature=\"annotations\" data-testid=\"ba-Layer--region\">\r\n<div class=\"ba-RegionAnnotations-list is-listening\" data-resin-component=\"regionList\">\r\n<div class=\"textbox key-takeaways\">\r\n<h3>Question 4<\/h3>\r\n<div class=\"ba-RegionAnnotations-list is-listening\" data-resin-component=\"regionList\"><span style=\"font-size: 1em;\">4) Because of the additional variability in the points scattered about the line, the prediction interval will always be wider than the confidence interval for a given value of the explanatory variable and a given confidence level.<\/span><\/div>\r\n<div class=\"ba-RegionAnnotations-list is-listening\" data-resin-component=\"regionList\"><span style=\"font-size: 1em;\">Part A: Use technology to calculate a 95% prediction interval for the predicted number of bike rentals when the temperature is 48 degrees Fahrenheit on an individual winter day. <\/span><\/div>\r\n<div class=\"ba-RegionAnnotations-list is-listening\" data-resin-component=\"regionList\"><span style=\"font-size: 1em;\">Part B: What is the center of the interval? What is the margin of error? <\/span><\/div>\r\n<div class=\"ba-RegionAnnotations-list is-listening\" data-resin-component=\"regionList\"><span style=\"font-size: 1em;\">Part C: Interpret the interval in the context of the data. When fitting a linear regression model, we assume that the distribution of the response variable is approximately normal for a given value of the explanatory variable. 
It is important for that condition to hold when using prediction intervals, since these intervals take into account the scatter of the points about the line. We can check that this condition holds by examining the distribution of the residuals. If the distribution of the residuals is approximately normal, we can feel confident that the response variable is approximately normally distributed about the regression line for each value of the explanatory variable.<\/span><\/div>\r\n<div class=\"ba-RegionAnnotations-list is-listening\" data-resin-component=\"regionList\">Part D: Use the linear regression tool (i.e., data analysis tool) to make a histogram of the residuals. \u2022 Select the Fitted Values and Residual Analysis tab. \u2022 Select the option \u201cHistogram\/Boxplot of Residuals.\u201d \u2022 Select the option \u201cSuperimpose Normal Curve.\u201d Based on this histogram, can this linear model be reliably used to calculate the predicted number of bike rentals on an individual day? Explain.<\/div>\r\n<\/div>\r\n<\/div>\r\n<div data-resin-component=\"regionList\"><\/div>\r\n<div class=\"ba-RegionAnnotations-list is-listening\" data-resin-component=\"regionList\">If the residuals are not normally distributed, you can report the prediction interval with a note indicating that caution should be applied when using the results from the interval. Notice that we checked the distribution of the residuals when calculating the prediction interval for an individual response but not for the confidence interval for the mean response. 
The reliability of the confidence interval for the mean response does not depend on the residuals being normally distributed, as long as the sample size is reasonably large, because of the Central Limit Theorem.<\/div>\r\n<div data-resin-component=\"regionList\"><\/div>\r\n<div class=\"ba-RegionAnnotations-list is-listening\" data-resin-component=\"regionList\">\r\n<div class=\"textbox key-takeaways\">\r\n<h3>Question 5<\/h3>\r\n<div class=\"ba-RegionAnnotations-list is-listening\" data-resin-component=\"regionList\">5) Compare the intervals from Questions 3 and 4.<\/div>\r\n<div class=\"ba-RegionAnnotations-list is-listening\" data-resin-component=\"regionList\">Part A: How do the centers of the intervals compare? Is this what you would expect? Explain.<\/div>\r\n<div class=\"ba-RegionAnnotations-list is-listening\" data-resin-component=\"regionList\">Part B: How do the margins of error of the intervals compare? Is this what you would expect? Explain.<\/div>\r\n<\/div>\r\n<\/div>\r\n<div class=\"ba-RegionAnnotations-list is-listening\" data-resin-component=\"regionList\">\r\n<div class=\"textbox key-takeaways\">\r\n<h3>Question 6<\/h3>\r\n<div class=\"ba-RegionAnnotations-list is-listening\" data-resin-component=\"regionList\">6) Consider the following analysis questions. 
Identify whether a confidence interval for the mean response or a prediction interval for an individual response should be used. Explain.<\/div>\r\n<div class=\"ba-RegionAnnotations-list is-listening\" data-resin-component=\"regionList\">Part A: Based on the weather forecast, the temperature on January 3 will be 40 degrees. About how many rentals are expected on that day?<\/div>\r\n<div class=\"ba-RegionAnnotations-list is-listening\" data-resin-component=\"regionList\">Part B: About how many bike rentals are expected on unusually warm winter days when the temperature is 70 degrees?<\/div>\r\n<\/div>\r\n<\/div>\r\n<div class=\"ba-RegionAnnotations-list is-listening\" data-resin-component=\"regionList\">\r\n<div class=\"textbox key-takeaways\">\r\n<h3>Question 7<\/h3>\r\n7) How many bikes would you recommend putting out on a day when the temperature is 48 degrees? Briefly explain how you made your decision using the concepts from this activity.\r\n\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>","rendered":"<div id=\"bp-page-1\" class=\"page\" data-page-number=\"1\" data-loaded=\"true\">\n<div class=\"textLayer\">Suppose you are a data scientist for Capital Bikeshare in Washington, D.C., and your job is to develop a linear regression model to predict the number of bike rentals based on the temperature. These predictions will be used to help determine the number of bikes to make available across the city each day. Previously, you\u2019ve used the regression model to calculate a predicted value of the response given a particular value of the explanatory variable. 
This time you decide to include an interval with your predictions, so that you report a plausible range of values for the number of bike rentals at a particular temperature.<\/div>\n<div><img loading=\"lazy\" decoding=\"async\" class=\"\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2022\/01\/27010004\/Picture961-300x201.jpg\" alt=\"Someone riding a bike on a road in the countryside\" width=\"463\" height=\"310\" \/><\/div>\n<div class=\"textLayer\">\n<div class=\"textbox key-takeaways\">\n<h3>Question 1<\/h3>\n<p>1) Briefly explain why it may be helpful to the team using your results to include an interval along with the predicted value.<\/p>\n<\/div>\n<\/div>\n<div class=\"textLayer\">Today\u2019s dataset is from Capital Bikeshare<a class=\"footnote\" title=\"Capital Bikeshare. (n.d.). Declare your independence. https:\/\/www.capitalbikeshare.com Credit: iStock\/Asawiin_Klabma\" id=\"return-footnote-5532-1\" href=\"#footnote-5532-1\" aria-label=\"Footnote 1\"><sup class=\"footnote\">[1]<\/sup><\/a> in Washington, D.C. It contains daily information about the number of bike rentals, weather, day of the week, and other details for days in 2011 and 2012. Your primary objective in this activity is to predict the number of daily bike rentals during the winter months (December 21 to March 20). To do so, you\u2019ll use data from 50 randomly selected winter days in 2011 and 2012. 
The variables of interest for this activity are: \u2022 count: Total number of bikes rented \u2022 temperature: Approximate high temperature in degrees Fahrenheit<\/div>\n<div><\/div>\n<div class=\"textLayer\">\n<div class=\"textbox key-takeaways\">\n<h3>Question 2<\/h3>\n<div class=\"textLayer\">2) The primary goal is to use a linear regression model to predict the daily number of bike rentals based on the temperature in the winter. Go to the DCMP Linear Regression tool at https:\/\/dcmathpathways.shinyapps.io\/LinearRegression\/.<\/div>\n<div class=\"textLayer\">\u2022 Access spreadsheet DCMP_STAT_16C_dcbikeshare_winter_sample.<\/div>\n<div class=\"textLayer\">\u2022 Under \u201cEnter Data,\u201d select \u201cEnter Own.\u201d<\/div>\n<div class=\"textLayer\">\u2022 Select the appropriate explanatory variable (\ud835\udc4b) and response variable (\ud835\udc4c).<\/div>\n<div class=\"textLayer\">\u2022 Enter the data.<\/div>\n<div class=\"textLayer\">Part A: Create a graphical display of the two variables to visualize the relationship between the daily temperature and the number of bike rentals in the winter. Include an informative title and axis labels.<\/div>\n<div class=\"textLayer\">Part B: Use the plot to describe the relationship between the number of bike rentals and the temperature.<\/div>\n<\/div>\n<\/div>\n<div class=\"textLayer\">\n<div class=\"textbox key-takeaways\">\n<h3>Question 3<\/h3>\n<div id=\"bp-page-1\" class=\"page\" data-page-number=\"1\" data-loaded=\"true\">\n<div class=\"textLayer\">3) Now let\u2019s fit the linear regression model.<\/div>\n<div class=\"textLayer\">Part A: Use the linear regression tool (i.e., data analysis tool) to calculate the equation for the line of best fit. 
Write the equation using contextualized variable names.<\/div>\n<div class=\"textLayer\">Part B: Interpret the slope in the context of the data.<\/div>\n<div class=\"textLayer\">Part C: Briefly explain why the intercept is negative, even though it isn\u2019t possible to have a negative number of bike rentals.<\/div>\n<div class=\"textLayer\">Part D: Create a plot of the residuals vs. the fitted values. Based on this plot, is the linear regression equation a reasonable fit for these data? Explain. Recall that the residual plot is found on the Fitted Values &amp; Residual Analysis tab.<\/div>\n<\/div>\n<div id=\"bp-page-2\" class=\"page\" data-page-number=\"2\" data-loaded=\"true\">\n<div class=\"annotationLayer\"><span style=\"font-size: 1em;\">Part E: What is the predicted number of bike rentals when the temperature is 48 degrees Fahrenheit on a winter day?<\/span><\/div>\n<div class=\"annotationLayer\"><span style=\"font-size: 1em;\">Although the equation of the line of best fit is used to calculate the expected number of bike rentals on 48-degree days, we know that the same number of bikes is not rented on every winter day with a temperature of 48 degrees. This can be seen graphically in the scatter of points about the regression line. In addition, we have learned about sampling variability in previous lessons. If we were to randomly select another sample of 50 winter days, the line of best fit would be different, so the predicted number of bike rentals (the point estimate) for 48-degree days would change. As you\u2019ve seen previously for means, proportions, and slopes, you can calculate an interval to account for the variability in the predicted values. Before calculating the interval for predicted values, however, we first need to consider the type of prediction we\u2019re most interested in obtaining. 
There are two ways we can use the equation of the line of best fit:<\/span><\/div>\n<div class=\"annotationLayer\"><span style=\"font-size: 1em;\">1. To estimate the mean value of the response when the explanatory variable is equal to a particular value, \ud835\udc65<sub>0<\/sub><\/span><\/div>\n<div class=\"annotationLayer\"><span style=\"font-size: 1em;\">2. To predict the value of the response for an individual observation when the explanatory variable is equal to \ud835\udc65<sub>0<\/sub><\/span><\/div>\n<div class=\"annotationLayer\"><span style=\"font-size: 1em;\">The type of interval calculated will depend on whether the goal is to estimate the mean response for a value of the explanatory variable or to predict the value of the response variable for an individual observation. Suppose we have the following equation of the line of best fit: \ud835\udc66\u0302 = \ud835\udc4e + \ud835\udc4f\ud835\udc65. When the objective is to estimate the mean value of the response variable for a particular value of the explanatory variable, \ud835\udc65<sub>0<\/sub>, we will calculate a confidence interval for the mean response. This interval gives us a range of plausible values for the mean of the response variable when \ud835\udc65 = \ud835\udc65<sub>0<\/sub>. In practice, we will use software to calculate the interval, so our focus is on the interpretation. We can interpret the \ud835\udc36% interval as follows: We are \ud835\udc36% confident that the mean response when the explanatory variable equals \ud835\udc65<sub>0<\/sub> is between (lower bound) and (upper bound). <\/span><\/div>\n<div class=\"annotationLayer\"><span style=\"font-size: 1em;\">Part F: Use technology to calculate a 95% confidence interval for the mean number of bike rentals when the temperature is 48 degrees Fahrenheit on winter days. <\/span><\/div>\n<div class=\"annotationLayer\"><span style=\"font-size: 1em;\">Part G: What is the center of the interval? 
What is the margin of error?<\/span><\/div>\n<\/div>\n<div id=\"bp-page-3\" class=\"page\" data-page-number=\"3\" data-loaded=\"true\">\n<div class=\"ba-Layer ba-Layer--region\" data-resin-fileid=\"910623207586\" data-resin-iscurrent=\"true\" data-resin-feature=\"annotations\" data-testid=\"ba-Layer--region\">\n<div class=\"ba-RegionAnnotations-list is-listening\" data-resin-component=\"regionList\"><span style=\"font-size: 1em;\">Part H: Interpret the interval in the context of the data. When the objective is to predict the value of the response variable for an individual observation with the explanatory variable equal to \ud835\udc65<sub>0<\/sub>, we will calculate a \ud835\udc36% prediction interval for an individual response, where \ud835\udc36 is the confidence level. This interval gives us a range of plausible values of the response when an individual observation has a value of the explanatory variable equal to \ud835\udc65<sub>0<\/sub>. In practice, we will use software to calculate the interval, so our focus is on the interpretation. We can interpret the interval as follows: We are \ud835\udc36% confident that the value of the response variable for an individual with a value of the explanatory variable equal to \ud835\udc65<sub>0<\/sub> is between (lower bound) and (upper bound). When calculating the prediction interval for an individual observation, we have to take into account two sources of variability (i.e., the reasons our point estimates or predictions may not be exactly right). 
These are sources of variability due to: (1) the individual values that vary around the population regression line and (2) the fact that we don\u2019t have the equation of the population regression line and must rely on estimates of the slope and intercept.<\/span><\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div id=\"bp-page-3\" class=\"page\" data-page-number=\"3\" data-loaded=\"true\">\n<div class=\"ba-Layer ba-Layer--region\" data-resin-fileid=\"910623207586\" data-resin-iscurrent=\"true\" data-resin-feature=\"annotations\" data-testid=\"ba-Layer--region\">\n<div class=\"ba-RegionAnnotations-list is-listening\" data-resin-component=\"regionList\">\n<div class=\"textbox key-takeaways\">\n<h3>Question 4<\/h3>\n<div class=\"ba-RegionAnnotations-list is-listening\" data-resin-component=\"regionList\"><span style=\"font-size: 1em;\">4) Because of the additional variability in the points scattered about the line, the prediction interval will always be wider than the confidence interval for a given value of the explanatory variable and a given confidence level.<\/span><\/div>\n<div class=\"ba-RegionAnnotations-list is-listening\" data-resin-component=\"regionList\"><span style=\"font-size: 1em;\">Part A: Use technology to calculate a 95% prediction interval for the predicted number of bike rentals when the temperature is 48 degrees Fahrenheit on an individual winter day. <\/span><\/div>\n<div class=\"ba-RegionAnnotations-list is-listening\" data-resin-component=\"regionList\"><span style=\"font-size: 1em;\">Part B: What is the center of the interval? What is the margin of error? <\/span><\/div>\n<div class=\"ba-RegionAnnotations-list is-listening\" data-resin-component=\"regionList\"><span style=\"font-size: 1em;\">Part C: Interpret the interval in the context of the data. When fitting a linear regression model, we assume that the distribution of the response variable is approximately normal for a given value of the explanatory variable. 
It is important for that condition to hold when using prediction intervals, since these intervals take into account the scatter of the points about the line. We can check that this condition holds by examining the distribution of the residuals. If the distribution of the residuals is approximately normal, we can feel confident that the response variable is approximately normally distributed about the regression line for each value of the explanatory variable.<\/span><\/div>\n<div class=\"ba-RegionAnnotations-list is-listening\" data-resin-component=\"regionList\">Part D: Use the linear regression tool (i.e., data analysis tool) to make a histogram of the residuals. \u2022 Select the Fitted Values and Residual Analysis tab. \u2022 Select the option \u201cHistogram\/Boxplot of Residuals.\u201d \u2022 Select the option \u201cSuperimpose Normal Curve.\u201d Based on this histogram, can this linear model be reliably used to calculate the predicted number of bike rentals on an individual day? Explain.<\/div>\n<\/div>\n<\/div>\n<div data-resin-component=\"regionList\"><\/div>\n<div class=\"ba-RegionAnnotations-list is-listening\" data-resin-component=\"regionList\">If the residuals are not normally distributed, you can report the prediction interval with a note indicating that caution should be applied when using the results from the interval. Notice that we checked the distribution of the residuals when calculating the prediction interval for an individual response but not for the confidence interval for the mean response. 
The reliability of the confidence interval for the mean response does not depend on the residuals being normally distributed, as long as the sample size is reasonably large, because of the Central Limit Theorem.<\/div>\n<div data-resin-component=\"regionList\"><\/div>\n<div class=\"ba-RegionAnnotations-list is-listening\" data-resin-component=\"regionList\">\n<div class=\"textbox key-takeaways\">\n<h3>Question 5<\/h3>\n<div class=\"ba-RegionAnnotations-list is-listening\" data-resin-component=\"regionList\">5) Compare the intervals from Questions 3 and 4.<\/div>\n<div class=\"ba-RegionAnnotations-list is-listening\" data-resin-component=\"regionList\">Part A: How do the centers of the intervals compare? Is this what you would expect? Explain.<\/div>\n<div class=\"ba-RegionAnnotations-list is-listening\" data-resin-component=\"regionList\">Part B: How do the margins of error of the intervals compare? Is this what you would expect? Explain.<\/div>\n<\/div>\n<\/div>\n<div class=\"ba-RegionAnnotations-list is-listening\" data-resin-component=\"regionList\">\n<div class=\"textbox key-takeaways\">\n<h3>Question 6<\/h3>\n<div class=\"ba-RegionAnnotations-list is-listening\" data-resin-component=\"regionList\">6) Consider the following analysis questions. 
Identify whether a confidence interval for the mean response or a prediction interval for an individual response should be used. Explain.<\/div>\n<div class=\"ba-RegionAnnotations-list is-listening\" data-resin-component=\"regionList\">Part A: Based on the weather forecast, the temperature on January 3 will be 40 degrees. About how many rentals are expected on that day?<\/div>\n<div class=\"ba-RegionAnnotations-list is-listening\" data-resin-component=\"regionList\">Part B: About how many bike rentals are expected on unusually warm winter days when the temperature is 70 degrees?<\/div>\n<\/div>\n<\/div>\n<div class=\"ba-RegionAnnotations-list is-listening\" data-resin-component=\"regionList\">\n<div class=\"textbox key-takeaways\">\n<h3>Question 7<\/h3>\n<p>7) How many bikes would you recommend putting out on a day when the temperature is 48 degrees? Briefly explain how you made your decision using the concepts from this activity.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<hr class=\"before-footnotes clear\" \/><div class=\"footnotes\"><ol><li id=\"footnote-5532-1\">Capital Bikeshare. (n.d.). Declare your independence. 
https:\/\/www.capitalbikeshare.comCredit: iStock\/Asawiin_Klabma <a href=\"#return-footnote-5532-1\" class=\"return-footnote\" aria-label=\"Return to footnote 1\">&crarr;<\/a><\/li><\/ol><\/div>","protected":false},"author":23592,"menu_order":68,"template":"","meta":{"_candela_citation":"[]","CANDELA_OUTCOMES_GUID":"","pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[],"contributor":[],"license":[],"class_list":["post-5532","chapter","type-chapter","status-publish","hentry"],"part":5514,"_links":{"self":[{"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/chapters\/5532","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/wp\/v2\/users\/23592"}],"version-history":[{"count":3,"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/chapters\/5532\/revisions"}],"predecessor-version":[{"id":5641,"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/chapters\/5532\/revisions\/5641"}],"part":[{"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/parts\/5514"}],"metadata":[{"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/chapters\/5532\/metadata\/"}],"wp:attachment":[{"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/wp\/v2\/media?parent=5532"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/chapter-
type?post=5532"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/wp\/v2\/contributor?post=5532"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/wp\/v2\/license?post=5532"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}