{"id":472,"date":"2016-04-21T22:43:37","date_gmt":"2016-04-21T22:43:37","guid":{"rendered":"https:\/\/courses.lumenlearning.com\/introstats1xmaster\/?post_type=chapter&#038;p=472"},"modified":"2019-05-29T22:22:21","modified_gmt":"2019-05-29T22:22:21","slug":"prediction","status":"publish","type":"chapter","link":"https:\/\/courses.lumenlearning.com\/introstats1\/chapter\/prediction\/","title":{"raw":"Prediction","rendered":"Prediction"},"content":{"raw":"<div class=\"textbox learning-objectives\">\r\n<h3>Learning Outcomes<\/h3>\r\n<section>\r\n<ul>\r\n \t<li>Use interpolation and extrapolation<\/li>\r\n<\/ul>\r\n<\/section><\/div>\r\nRecall this example from earlier content:\r\n<p style=\"margin-left: 20px;\">A random sample of 11 statistics students produced the following data, where\r\n<em>x<\/em> is the third exam score out of 80, and <em>y<\/em> is the final exam score out of 200. Can you predict the final exam score of a random student if you know the third exam score?<\/p>\r\n\r\n<table>\r\n<thead>\r\n<tr>\r\n<th>x (third exam score)<\/th>\r\n<th>y (final exam score)<\/th>\r\n<\/tr>\r\n<\/thead>\r\n<tbody>\r\n<tr>\r\n<td>65<\/td>\r\n<td>175<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>67<\/td>\r\n<td>133<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>71<\/td>\r\n<td>185<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>71<\/td>\r\n<td>163<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>66<\/td>\r\n<td>126<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>75<\/td>\r\n<td>198<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>67<\/td>\r\n<td>153<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>70<\/td>\r\n<td>163<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>71<\/td>\r\n<td>159<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>69<\/td>\r\n<td>151<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>69<\/td>\r\n<td>159<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<p style=\"margin-left: 20px;\">Table showing the scores on the final exam based on scores from the third exam.<\/p>\r\n<p style=\"margin-left: 20px;\"><img src=\"https:\/\/textimgs.s3.amazonaws.com\/DE\/stats\/ye6m-pdce957i#fixme#fixme#fixme\" alt=\"This is a scatter plot of 
the data provided. The third exam score is plotted on the x-axis, and the final exam score is plotted on the y-axis. The points form a strong, positive, linear pattern.\" \/>Scatter plot showing the scores on the final exam based on scores from the third exam.<\/p>\r\nWe examined the scatterplot and showed that the correlation coefficient is significant. We found the equation of the best-fit line for the final exam grade as a function of the grade on the third exam. We can now use the least-squares regression line for prediction.\r\n\r\nSuppose you want to estimate, or predict, the mean final exam score of statistics students who received 73 on the third exam. The exam scores <strong>(<em data-redactor-tag=\"em\">x<\/em>-values)<\/strong> range from 65 to 75. <strong>Since 73 is between the <em data-redactor-tag=\"em\">x<\/em>-values 65 and 75<\/strong>, substitute <em>x<\/em> = 73 into the equation of the best-fit line. Then:\r\n\r\n[latex]\\displaystyle\\hat{{y}}=-{173.51}+{4.83}{({73})}={179.08}[\/latex]\r\n\r\nWe predict that statistics students who earn a grade of 73 on the third exam will earn a grade of 179.08 on the final exam, on average.\r\n<div class=\"textbox exercises\">\r\n<h3>Example<\/h3>\r\nUse the data above for this example:\r\n<ol>\r\n \t<li>What would you predict the final exam score to be for a student who scored a 66 on the third exam?<\/li>\r\n \t<li>What would you predict the final exam score to be for a student who scored a 90 on the third exam?<\/li>\r\n<\/ol>\r\nSolution:\r\n<ol>\r\n \t<li>145.27<\/li>\r\n \t<li>The <em>x<\/em> values in the data are between 65 and 75. Ninety is outside the range of the observed <em>x<\/em> values (the independent variable), so you cannot reliably predict the final exam score for this student. 
(Even though it is possible to enter 90 into the equation for <em>x<\/em> and calculate a corresponding <em>y<\/em> value, the <em>y<\/em> value that you get will not be reliable.) To see how unreliable a prediction can be outside the range of <em>x<\/em> values observed in the data, substitute <em>x<\/em> = 90 into the equation: [latex]\\displaystyle\\hat{{y}}=-{173.51}+{4.83}{({90})}={261.19}[\/latex] The predicted final exam score is 261.19, but the highest possible final exam score is 200.<\/li>\r\n<\/ol>\r\n<\/div>\r\n\r\n<hr \/>\r\n\r\n<h4>Note<\/h4>\r\nThe process of predicting within the range of <em>x<\/em> values observed in the data is called <strong>interpolation<\/strong>. The process of predicting outside the range of <em>x<\/em> values observed in the data is called <strong>extrapolation<\/strong>.\r\n<div class=\"textbox key-takeaways\">\r\n<h3>Try It<\/h3>\r\nData are collected on the relationship between the number of hours per week practicing a musical instrument and scores on a math test. The line of best fit is as follows:\r\n\r\n[latex]\\displaystyle\\hat{{y}}={72.5}+{2.8}{x}[\/latex]\r\n\r\nWhat score on the math test would you predict for a student who practices a musical instrument for five hours a week?\r\n\r\n86.5\r\n\r\n<\/div>\r\n\r\n<hr \/>\r\n\r\n<h2>References<\/h2>\r\nData from the Centers for Disease Control and Prevention.\r\n\r\nData from the National Center for HIV, STD, and TB Prevention.\r\n\r\nData from the United States Census Bureau. 
Available online at http:\/\/www.census.gov\/compendia\/statab\/cats\/transportation\/motor_vehicle_accidents_and_fatalities.html\r\n\r\nData from the National Center for Health Statistics.\r\n<h2>Concept Review<\/h2>\r\nAfter determining the presence of a strong correlation coefficient and calculating the line of best fit, you can use the least squares regression line to make predictions about your data.","rendered":"<div class=\"textbox learning-objectives\">\n<h3>Learning Outcomes<\/h3>\n<section>\n<ul>\n<li>Use interpolation and extrapolation<\/li>\n<\/ul>\n<\/section>\n<\/div>\n<p>Recall this example from earlier content:<\/p>\n<p style=\"margin-left: 20px;\">A random sample of 11 statistics students produced the following data, where<br \/>\n<em>x<\/em> is the third exam score out of 80, and <em>y<\/em> is the final exam score out of 200. Can you predict the final exam score of a random student if you know the third exam score?<\/p>\n<table>\n<thead>\n<tr>\n<th>x (third exam score)<\/th>\n<th>y (final exam score)<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>65<\/td>\n<td>175<\/td>\n<\/tr>\n<tr>\n<td>67<\/td>\n<td>133<\/td>\n<\/tr>\n<tr>\n<td>71<\/td>\n<td>185<\/td>\n<\/tr>\n<tr>\n<td>71<\/td>\n<td>163<\/td>\n<\/tr>\n<tr>\n<td>66<\/td>\n<td>126<\/td>\n<\/tr>\n<tr>\n<td>75<\/td>\n<td>198<\/td>\n<\/tr>\n<tr>\n<td>67<\/td>\n<td>153<\/td>\n<\/tr>\n<tr>\n<td>70<\/td>\n<td>163<\/td>\n<\/tr>\n<tr>\n<td>71<\/td>\n<td>159<\/td>\n<\/tr>\n<tr>\n<td>69<\/td>\n<td>151<\/td>\n<\/tr>\n<tr>\n<td>69<\/td>\n<td>159<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p style=\"margin-left: 20px;\">Table showing the scores on the final exam based on scores from the third exam.<\/p>\n<p style=\"margin-left: 20px;\"><img decoding=\"async\" src=\"https:\/\/textimgs.s3.amazonaws.com\/DE\/stats\/ye6m-pdce957i#fixme#fixme#fixme\" alt=\"This is a scatter plot of the data provided. The third exam score is plotted on the x-axis, and the final exam score is plotted on the y-axis. 
The points form a strong, positive, linear pattern.\" \/>Scatter plot showing the scores on the final exam based on scores from the third exam.<\/p>\n<p>We examined the scatterplot and showed that the correlation coefficient is significant. We found the equation of the best-fit line for the final exam grade as a function of the grade on the third exam. We can now use the least-squares regression line for prediction.<\/p>\n<p>Suppose you want to estimate, or predict, the mean final exam score of statistics students who received 73 on the third exam. The exam scores <strong>(<em data-redactor-tag=\"em\">x<\/em>-values)<\/strong> range from 65 to 75. <strong>Since 73 is between the <em data-redactor-tag=\"em\">x<\/em>-values 65 and 75<\/strong>, substitute <em>x<\/em> = 73 into the equation of the best-fit line. Then:<\/p>\n<p>[latex]\\displaystyle\\hat{{y}}=-{173.51}+{4.83}{({73})}={179.08}[\/latex]<\/p>\n<p>We predict that statistics students who earn a grade of 73 on the third exam will earn a grade of 179.08 on the final exam, on average.<\/p>\n<div class=\"textbox exercises\">\n<h3>Example<\/h3>\n<p>Use the data above for this example:<\/p>\n<ol>\n<li>What would you predict the final exam score to be for a student who scored a 66 on the third exam?<\/li>\n<li>What would you predict the final exam score to be for a student who scored a 90 on the third exam?<\/li>\n<\/ol>\n<p>Solution:<\/p>\n<ol>\n<li>145.27<\/li>\n<li>The <em>x<\/em> values in the data are between 65 and 75. Ninety is outside the range of the observed <em>x<\/em> values (the independent variable), so you cannot reliably predict the final exam score for this student. 
(Even though it is possible to enter 90 into the equation for <em>x<\/em> and calculate a corresponding <em>y<\/em> value, the <em>y<\/em> value that you get will not be reliable.) To see how unreliable a prediction can be outside the range of <em>x<\/em> values observed in the data, substitute <em>x<\/em> = 90 into the equation: [latex]\\displaystyle\\hat{{y}}=-{173.51}+{4.83}{({90})}={261.19}[\/latex] The predicted final exam score is 261.19, but the highest possible final exam score is 200.<\/li>\n<\/ol>\n<\/div>\n<hr \/>\n<h4>Note<\/h4>\n<p>The process of predicting within the range of <em>x<\/em> values observed in the data is called <strong>interpolation<\/strong>. The process of predicting outside the range of <em>x<\/em> values observed in the data is called <strong>extrapolation<\/strong>.<\/p>\n<div class=\"textbox key-takeaways\">\n<h3>Try It<\/h3>\n<p>Data are collected on the relationship between the number of hours per week practicing a musical instrument and scores on a math test. The line of best fit is as follows:<\/p>\n<p>[latex]\\displaystyle\\hat{{y}}={72.5}+{2.8}{x}[\/latex]<\/p>\n<p>What score on the math test would you predict for a student who practices a musical instrument for five hours a week?<\/p>\n<p>86.5<\/p>\n<\/div>\n<hr \/>\n<h2>References<\/h2>\n<p>Data from the Centers for Disease Control and Prevention.<\/p>\n<p>Data from the National Center for HIV, STD, and TB Prevention.<\/p>\n<p>Data from the United States Census Bureau. 
Available online at http:\/\/www.census.gov\/compendia\/statab\/cats\/transportation\/motor_vehicle_accidents_and_fatalities.html<\/p>\n<p>Data from the National Center for Health Statistics.<\/p>\n<h2>Concept Review<\/h2>\n<p>After determining the presence of a strong correlation coefficient and calculating the line of best fit, you can use the least squares regression line to make predictions about your data.<\/p>\n\n\t\t\t <section class=\"citations-section\" role=\"contentinfo\">\n\t\t\t <h3>Candela Citations<\/h3>\n\t\t\t\t\t <div>\n\t\t\t\t\t\t <div id=\"citation-list-472\">\n\t\t\t\t\t\t\t <div class=\"licensing\"><div class=\"license-attribution-dropdown-subheading\">CC licensed content, Shared previously<\/div><ul class=\"citation-list\"><li>OpenStax, Statistics, Prediction. <strong>Provided by<\/strong>: OpenStax. <strong>Located at<\/strong>: <a target=\"_blank\" href=\"http:\/\/cnx.org\/contents\/30189442-6998-4686-ac05-ed152b91b9de@17.41:84\/Introductory_Statistics\">http:\/\/cnx.org\/contents\/30189442-6998-4686-ac05-ed152b91b9de@17.41:84\/Introductory_Statistics<\/a>. <strong>License<\/strong>: <em><a target=\"_blank\" rel=\"license\" href=\"https:\/\/creativecommons.org\/licenses\/by\/4.0\/\">CC BY: Attribution<\/a><\/em><\/li><li>Introductory Statistics . <strong>Authored by<\/strong>: Barbara Illowski, Susan Dean. <strong>Provided by<\/strong>: Open Stax. <strong>Located at<\/strong>: <a target=\"_blank\" href=\"http:\/\/cnx.org\/contents\/30189442-6998-4686-ac05-ed152b91b9de@17.44\">http:\/\/cnx.org\/contents\/30189442-6998-4686-ac05-ed152b91b9de@17.44<\/a>. <strong>License<\/strong>: <em><a target=\"_blank\" rel=\"license\" href=\"https:\/\/creativecommons.org\/licenses\/by\/4.0\/\">CC BY: Attribution<\/a><\/em>. 
<strong>License Terms<\/strong>: Download for free at http:\/\/cnx.org\/contents\/30189442-6998-4686-ac05-ed152b91b9de@17.44<\/li><\/ul><\/div>\n\t\t\t\t\t\t <\/div>\n\t\t\t\t\t <\/div>\n\t\t\t <\/section>","protected":false},"author":21,"menu_order":6,"template":"","meta":{"_candela_citation":"[{\"type\":\"cc\",\"description\":\"OpenStax, Statistics, Prediction\",\"author\":\"\",\"organization\":\"OpenStax\",\"url\":\"http:\/\/cnx.org\/contents\/30189442-6998-4686-ac05-ed152b91b9de@17.41:84\/Introductory_Statistics\",\"project\":\"\",\"license\":\"cc-by\",\"license_terms\":\"\"},{\"type\":\"cc\",\"description\":\"Introductory Statistics \",\"author\":\"Barbara Illowski, Susan Dean\",\"organization\":\"Open Stax\",\"url\":\"http:\/\/cnx.org\/contents\/30189442-6998-4686-ac05-ed152b91b9de@17.44\",\"project\":\"\",\"license\":\"cc-by\",\"license_terms\":\"Download for free at http:\/\/cnx.org\/contents\/30189442-6998-4686-ac05-ed152b91b9de@17.44\"}]","CANDELA_OUTCOMES_GUID":"","pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[],"contributor":[],"license":[],"class_list":["post-472","chapter","type-chapter","status-publish","hentry"],"part":457,"_links":{"self":[{"href":"https:\/\/courses.lumenlearning.com\/introstats1\/wp-json\/pressbooks\/v2\/chapters\/472","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/courses.lumenlearning.com\/introstats1\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/courses.lumenlearning.com\/introstats1\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/introstats1\/wp-json\/wp\/v2\/users\/21"}],"version-history":[{"count":5,"href":"https:\/\/courses.lumenlearning.com\/introstats1\/wp-json\/pressbooks\/v2\/chapters\/472\/revisions"}],"predecessor-version":[{"id":1843,"href":"https:\/\/courses.lumenlearning.com\/introstats1\/wp-json\/pressbooks\/v2\/chapters\/472\/revisions\/1843"}],"p
art":[{"href":"https:\/\/courses.lumenlearning.com\/introstats1\/wp-json\/pressbooks\/v2\/parts\/457"}],"metadata":[{"href":"https:\/\/courses.lumenlearning.com\/introstats1\/wp-json\/pressbooks\/v2\/chapters\/472\/metadata\/"}],"wp:attachment":[{"href":"https:\/\/courses.lumenlearning.com\/introstats1\/wp-json\/wp\/v2\/media?parent=472"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/introstats1\/wp-json\/pressbooks\/v2\/chapter-type?post=472"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/introstats1\/wp-json\/wp\/v2\/contributor?post=472"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/introstats1\/wp-json\/wp\/v2\/license?post=472"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}