{"id":5518,"date":"2022-09-19T19:05:33","date_gmt":"2022-09-19T19:05:33","guid":{"rendered":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/?post_type=chapter&#038;p=5518"},"modified":"2022-10-07T07:50:48","modified_gmt":"2022-10-07T07:50:48","slug":"16a-inclass","status":"publish","type":"chapter","link":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/chapter\/16a-inclass\/","title":{"raw":"16A InClass","rendered":"16A InClass"},"content":{"raw":"<div id=\"bp-page-1\" class=\"page\" data-page-number=\"1\" data-loaded=\"true\">\r\n<div class=\"textLayer\">House prices fluctuate over time and are impacted by several variables. There are both internal variables (characteristics of the house) and external variables (the economy) that can impact house prices. It is even possible for two identical houses to sell at different prices.<\/div>\r\n<div class=\"textLayer\"><img class=\"alignnone\" style=\"font-size: 1em;\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2022\/01\/26232409\/Picture762-300x205.jpg\" alt=\"An illustration of a city.\" \/><\/div>\r\n<div class=\"textLayer\">\r\n<div class=\"textbox key-takeaways\">\r\n<h3>Question 1<\/h3>\r\n1) What are some other things that might impact the price of a house?\r\n\r\n<\/div>\r\n<span style=\"font-size: 1em;\">Previously, you learned that there are several conditions that must be met in order to analyze data using linear regression. Similarly, in order to conduct a hypothesis test for the significance of slope, you will need to verify those same conditions. Recall from In-Class Activity 6.D that when a linear regression is appropriate, the value of the residuals will be randomly scattered around 0. In other words, some residuals will be positive (observed value above the line on the residual plot) and some will be negative (observed value below the line on the residual plot). 
We should not see a systematic pattern (e.g., all above in order then all below in order). In particular, we should worry about the appropriateness of the model if we notice the following red flags:<\/span>\r\n\r\n<\/div>\r\n<div class=\"textLayer\">\u2022 Red Flag 1: The trend in the scatterplot is nonlinear, indicating that the relationship between the explanatory variable and the response variable is not modeled well by a line. The residuals tend to have a pattern.<\/div>\r\n<div class=\"textLayer\">\u2022 Red Flag 2: The observed values are farther and farther away from the line of best fit for a portion of the data. In other words, the errors are not consistent for all values of the explanatory variable. The size of the residuals tends to increase or decrease as the value of the explanatory variable increases. When this happens, it can be hard to get a handle on the accuracy of the model because the standard deviation of the residuals is not constant over the values of the explanatory variable.<\/div>\r\n<div class=\"textLayer\">Formally, to conduct a hypothesis test for the slope, we require:<\/div>\r\n<div class=\"textLayer\">\u2022 A simple random sample.<\/div>\r\n<div class=\"textLayer\">\u2022 For any given value of \ud835\udc65, the residuals have a mean of 0 (Red Flag 1).<\/div>\r\n<div class=\"textLayer\">\u2022 The standard deviation of the residuals is the same at all values of \ud835\udc65 (Red Flag 2).<\/div>\r\n<div class=\"textLayer\">\u2022 For any given value of \ud835\udc65, the residuals follow a normal distribution.<\/div>\r\n<\/div>\r\n<div>Credit: iStock\/ma_rish<\/div>\r\n<div id=\"bp-page-1\" class=\"page\" data-page-number=\"1\" data-loaded=\"true\">\r\n<div class=\"textLayer\">\r\n<div class=\"textbox key-takeaways\">\r\n<h3>Question 2<\/h3>\r\n2) A random sample of 21 apartments all located in the same zip code is observed. The sizes (in square feet) and the rents (per month) are recorded. Consider the following residual plot. Would it be appropriate to do a hypothesis test for significance of slope? 
Explain.\r\n\r\n<img class=\"\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2022\/01\/26232413\/Picture771-300x87.png\" alt=\"A residual plot labeled \u201cSize\u201d on the x-axis and \u201cResidual\u201d on the y-axis. There is a horizontal line at y = 0. The points with lower x-values are generally closer to the line than those with higher x-values.\" width=\"797\" height=\"231\" \/>\r\n\r\n<\/div>\r\n<\/div>\r\n<div class=\"textLayer\">\r\n<div class=\"textbox key-takeaways\">\r\n<h3>Question 3<\/h3>\r\n<div class=\"textLayer\">3) A random sample of 34 houses for sale, all located in the same zip code, is observed. The house sizes (in square feet) and the house prices (in dollars) are recorded.<\/div>\r\n<div class=\"textLayer\">Part A: Are the variables qualitative or quantitative?<\/div>\r\n<div class=\"textLayer\">Part B: What is the explanatory variable?<\/div>\r\n<div class=\"textLayer\">Part C: What is the response variable?<\/div>\r\n<\/div>\r\n<\/div>\r\n<div class=\"textLayer\">The random sample of 34 house prices and house sizes (in square feet) was collected in Saratoga, Florida. Use the following scatterplot (Figure 1) and residual plot (Figure 2) to answer the questions that follow.<\/div>\r\n<div><img class=\"\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2022\/01\/26232417\/Picture781-300x112.png\" alt=\"A scatterplot labeled \u201cSquare Feet\u201d on the x-axis and \u201cPrice\u201d on the y-axis. There is a line of best fit that extends approximately from (1300, 175000) to (3200, 390000). The points are clustered near the line.\" width=\"774\" height=\"289\" \/><\/div>\r\n<div><img class=\"\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2022\/01\/26232421\/Picture791-300x93.png\" alt=\"A residual plot labeled \u201cSquare Feet\u201d on the x-axis and \u201cResidual\u201d on the y-axis. 
There is a horizontal line at y = 0. There is no pattern to the points.\" width=\"784\" height=\"243\" \/><\/div>\r\n<div><\/div>\r\n<div class=\"textLayer\">\r\n<div class=\"textbox key-takeaways\">\r\n<h3>Question 4<\/h3>\r\n<div id=\"bp-page-1\" class=\"page\" data-page-number=\"1\" data-loaded=\"true\">\r\n<div class=\"textLayer\">4) A family wants to put their house up for sale and, using this set of data, wants to know whether there is a useful linear relationship between the size and price of a house in their zip code to help them predict the price of their home.<\/div>\r\n<div class=\"textLayer\">Part A: Does the scatterplot appear to show a linear pattern?<\/div>\r\n<div class=\"textLayer\">Part B: To carry out a hypothesis test for the significance of slope, determine whether the conditions for fitting a linear regression line have been met.<\/div>\r\n<\/div>\r\n<div id=\"bp-page-4\" class=\"page\" data-page-number=\"4\" data-loaded=\"true\">\r\n<div class=\"textLayer\">At the 5% significance level, use Parts C through E to determine whether there is convincing evidence to conclude that there is a useful linear relationship between the size and price of a house in this zip code.<\/div>\r\n<div class=\"textLayer\">Part C: Write the null and alternative hypotheses in symbols and in complete sentences.<\/div>\r\n<div class=\"textLayer\">Part D: Use the following output to determine the value of the test statistic and the P-value.<\/div>\r\n<div><img class=\"\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2022\/01\/26232426\/Picture801-300x76.png\" alt=\"A table labeled &quot;Linear Regression Equation,&quot; with headings &quot;Parameter,&quot; &quot;Estimate,&quot; &quot;Standard Error,&quot; &quot;t Statistic,&quot; and &quot;P-value.&quot; The first row has values &quot;intercept,&quot; 44,321, 12,357, 3.59, and 0.001. 
The second row has values &quot;Slope (Square Feet),&quot; 108.1, 5.808, 18.62, and &lt;0.0001.\" width=\"604\" height=\"153\" \/><\/div>\r\n<div class=\"textLayer\">Part E: Will the null hypothesis be rejected? Write your conclusion in a complete sentence.<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>","rendered":"<div id=\"bp-page-1\" class=\"page\" data-page-number=\"1\" data-loaded=\"true\">\n<div class=\"textLayer\">House prices fluctuate over time and are impacted by several variables. There are both internal variables (characteristics of the house) and external variables (the economy) that can impact house prices. It is even possible for two identical houses to sell at different prices.<\/div>\n<div class=\"textLayer\"><img decoding=\"async\" class=\"alignnone\" style=\"font-size: 1em;\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2022\/01\/26232409\/Picture762-300x205.jpg\" alt=\"An illustration of a city.\" \/><\/div>\n<div class=\"textLayer\">\n<div class=\"textbox key-takeaways\">\n<h3>Question 1<\/h3>\n<p>1) What are some other things that might impact the price of a house?<\/p>\n<\/div>\n<p><span style=\"font-size: 1em;\">Previously, you learned that there are several conditions that must be met in order to analyze data using linear regression. Similarly, in order to conduct a hypothesis test for the significance of slope, you will need to verify those same conditions. Recall from In-Class Activity 6.D that when a linear regression is appropriate, the value of the residuals will be randomly scattered around 0. In other words, some residuals will be positive (observed value above the line on the residual plot) and some will be negative (observed value below the line on the residual plot). We should not see a systematic pattern (e.g., all above in order then all below in order). 
In particular, we should worry about the appropriateness of the model if we notice the following red flags:<\/span><\/p>\n<\/div>\n<div class=\"textLayer\">\u2022 Red Flag 1: The trend in the scatterplot is nonlinear, indicating that the relationship between the explanatory variable and the response variable is not modeled well by a line. The residuals tend to have a pattern.<\/div>\n<div class=\"textLayer\">\u2022 Red Flag 2: The observed values are farther and farther away from the line of best fit for a portion of the data. In other words, the errors are not consistent for all values of the explanatory variable. The size of the residuals tends to increase or decrease as the value of the explanatory variable increases. When this happens, it can be hard to get a handle on the accuracy of the model because the standard deviation of the residuals is not constant over the values of the explanatory variable.<\/div>\n<div class=\"textLayer\">Formally, to conduct a hypothesis test for the slope, we require:<\/div>\n<div class=\"textLayer\">\u2022 A simple random sample.<\/div>\n<div class=\"textLayer\">\u2022 For any given value of \ud835\udc65, the residuals have a mean of 0 (Red Flag 1).<\/div>\n<div class=\"textLayer\">\u2022 The standard deviation of the residuals is the same at all values of \ud835\udc65 (Red Flag 2).<\/div>\n<div class=\"textLayer\">\u2022 For any given value of \ud835\udc65, the residuals follow a normal distribution.<\/div>\n<\/div>\n<div>Credit: iStock\/ma_rish<\/div>\n<div id=\"bp-page-1\" class=\"page\" data-page-number=\"1\" data-loaded=\"true\">\n<div class=\"textLayer\">\n<div class=\"textbox key-takeaways\">\n<h3>Question 2<\/h3>\n<p>2) A random sample of 21 apartments all located in the same zip code is observed. The sizes (in square feet) and the rents (per month) are recorded. Consider the following residual plot. Would it be appropriate to do a hypothesis test for significance of slope? 
Explain.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2022\/01\/26232413\/Picture771-300x87.png\" alt=\"A residual plot labeled \u201cSize\u201d on the x-axis and \u201cResidual\u201d on the y-axis. There is a horizontal line at y = 0. The points with lower x-values are generally closer to the line than those with higher x-values.\" width=\"797\" height=\"231\" \/><\/p>\n<\/div>\n<\/div>\n<div class=\"textLayer\">\n<div class=\"textbox key-takeaways\">\n<h3>Question 3<\/h3>\n<div class=\"textLayer\">3) A random sample of 34 houses for sale, all located in the same zip code, is observed. The house sizes (in square feet) and the house prices (in dollars) are recorded.<\/div>\n<div class=\"textLayer\">Part A: Are the variables qualitative or quantitative?<\/div>\n<div class=\"textLayer\">Part B: What is the explanatory variable?<\/div>\n<div class=\"textLayer\">Part C: What is the response variable?<\/div>\n<\/div>\n<\/div>\n<div class=\"textLayer\">The random sample of 34 house prices and house sizes (in square feet) was collected in Saratoga, Florida. Use the following scatterplot (Figure 1) and residual plot (Figure 2) to answer the questions that follow.<\/div>\n<div><img loading=\"lazy\" decoding=\"async\" class=\"\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2022\/01\/26232417\/Picture781-300x112.png\" alt=\"A scatterplot labeled \u201cSquare Feet\u201d on the x-axis and \u201cPrice\u201d on the y-axis. There is a line of best fit that extends approximately from (1300, 175000) to (3200, 390000). 
The points are clustered near the line.\" width=\"774\" height=\"289\" \/><\/div>\n<div><img loading=\"lazy\" decoding=\"async\" class=\"\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2022\/01\/26232421\/Picture791-300x93.png\" alt=\"A residual plot labeled \u201cSquare Feet\u201d on the x-axis and \u201cResidual\u201d on the y-axis. There is a horizontal line at y = 0. There is no pattern to the points.\" width=\"784\" height=\"243\" \/><\/div>\n<div><\/div>\n<div class=\"textLayer\">\n<div class=\"textbox key-takeaways\">\n<h3>Question 4<\/h3>\n<div id=\"bp-page-1\" class=\"page\" data-page-number=\"1\" data-loaded=\"true\">\n<div class=\"textLayer\">4) A family wants to put their house up for sale and, using this set of data, wants to know whether there is a useful linear relationship between the size and price of a house in their zip code to help them predict the price of their home.<\/div>\n<div class=\"textLayer\">Part A: Does the scatterplot appear to show a linear pattern?<\/div>\n<div class=\"textLayer\">Part B: To carry out a hypothesis test for the significance of slope, determine whether the conditions for fitting a linear regression line have been met.<\/div>\n<\/div>\n<div id=\"bp-page-4\" class=\"page\" data-page-number=\"4\" data-loaded=\"true\">\n<div class=\"textLayer\">At the 5% significance level, use Parts C through E to determine whether there is convincing evidence to conclude that there is a useful linear relationship between the size and price of a house in this zip code.<\/div>\n<div class=\"textLayer\">Part C: Write the null and alternative hypotheses in symbols and in complete sentences.<\/div>\n<div class=\"textLayer\">Part D: Use the following output to determine the value of the test statistic and the P-value.<\/div>\n<div><img loading=\"lazy\" decoding=\"async\" class=\"\" 
src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2022\/01\/26232426\/Picture801-300x76.png\" alt=\"A table labeled &quot;Linear Regression Equation,&quot; with headings &quot;Parameter,&quot; &quot;Estimate,&quot; &quot;Standard Error,&quot; &quot;t Statistic,&quot; and &quot;P-value.&quot; The first row has values &quot;intercept,&quot; 44,321, 12,357, 3.59, and 0.001. The second row has values &quot;Slope (Square Feet),&quot; 108.1, 5.808, 18.62, and &lt;0.0001.\" width=\"604\" height=\"153\" \/><\/div>\n<div class=\"textLayer\">Part E: Will the null hypothesis be rejected? Write your conclusion in a complete sentence.<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n","protected":false},"author":23592,"menu_order":2,"template":"","meta":{"_candela_citation":"[]","CANDELA_OUTCOMES_GUID":"","pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[],"contributor":[],"license":[],"class_list":["post-5518","chapter","type-chapter","status-publish","hentry"],"part":5514,"_links":{"self":[{"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/chapters\/5518","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/wp\/v2\/users\/23592"}],"version-history":[{"count":3,"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/chapters\/5518\/revisions"}],"predecessor-version":[{"id":5628,"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/chapters\/5518\/revisions\/5628"}],"part":[{"href":"https:\/\/courses.lume
nlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/parts\/5514"}],"metadata":[{"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/chapters\/5518\/metadata\/"}],"wp:attachment":[{"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/wp\/v2\/media?parent=5518"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/chapter-type?post=5518"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/wp\/v2\/contributor?post=5518"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/wp\/v2\/license?post=5518"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}