{"id":3859,"date":"2022-03-15T23:20:36","date_gmt":"2022-03-15T23:20:36","guid":{"rendered":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/?post_type=chapter&#038;p=3859"},"modified":"2022-06-03T02:15:58","modified_gmt":"2022-06-03T02:15:58","slug":"what-to-know-about-6-c","status":"publish","type":"chapter","link":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/chapter\/what-to-know-about-6-c\/","title":{"raw":"What to Know About 6.C: Understanding the Coefficient of Determination","rendered":"What to Know About 6.C: Understanding the Coefficient of Determination"},"content":{"raw":"<div class=\"textbox learning-objectives\">\r\n<h3>Learning Goals<\/h3>\r\nAt the end of this page, you should feel comfortable performing these skills:\r\n<ul>\r\n \t<li>Develop intuition about how [latex]R^{2}[\/latex] is related to the shape of a scatterplot.<\/li>\r\n \t<li>Use technology to calculate [latex]R^{2}[\/latex].<\/li>\r\n \t<li>Interpret the meaning of [latex]R^{2}[\/latex] in context.<\/li>\r\n \t<li>Identify possible values of [latex]R^{2}[\/latex].<\/li>\r\n<\/ul>\r\n<\/div>\r\nIn the next in-class activity, you will need to be able to interpret the meaning of [latex]R^2[\/latex] in context, relate [latex]R^2[\/latex] to the shape of a scatterplot, and identify variable types (explanatory and response) and plot data in a scatterplot. We'll prepare for this by developing your understanding of how [latex]R^{2}[\/latex] is related to the shape of a scatterplot as you learn to calculate, interpret, and recognize this measure in different scenarios.\r\n<h2>The Coefficient of Determination<\/h2>\r\nThe coefficient of determination, denoted [latex]R^2[\/latex] and pronounced \u201cR squared,\u201d is the proportion of the variation in the response variable that can be explained by its linear relationship with the explanatory variable. 
The following graphic shows a visualization of what we mean by this.\r\n\r\n<img class=\"alignnone wp-image-1240\" style=\"font-size: 1rem; text-align: initial;\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2022\/01\/12193049\/Picture156-300x230.jpg\" alt=\"Several graphs and charts. The first two both show House prices in Florida, graphing size in square feet on the x-axis and price in dollars on the y-axis. The first chart has a horizontal line across the center. For each point, a square is drawn based on its distance from the horizontal line. On the second graph, the points are in the same locations, but instead of a horizontal line, there is a diagonal line of best fit. For each point, a square is drawn based on its vertical distance from this line. Beneath this is two illustrations, each one corresponding to one of the graphs. The first one is titled &quot;Total variation in price&quot; and shows each of the squares from the graph with the horizontal line. The second one is labeled &quot;Variation in price not explained by its linear relationship with size&quot; and shows the squares from the graph with the line of best fit. Beneath both of these is another heading that says &quot;Proportion of variation in price not explained by its linear relationship with size.&quot; Beneath it, the squares from each graph are shown again, but this time, overlain, showing that the squares from the graph with the line of best fit have approximately one fifth of the area of those from the graph with the horizontal line.\" width=\"1243\" height=\"953\" \/>\r\n\r\nThe scatterplots show the prices and sizes of six houses. The first scatterplot includes the data points, as well as a horizontal line whose y-intercept is the mean house price. The second scatterplot includes the data points and the line of best fit. 
The blue squares in the first scatterplot are a demonstration of the total variation in the price; as you may recall from your discussion of variance, this is related to the sum of the squared distance from the mean. When we find the line of best fit, the distance of each data point to the line is minimized. The green squares in the second scatterplot are a demonstration of the variation in price that is left over after fitting a line to the data; in other words, the green squares show the variation in price that is not explained by its linear relationship with size.\r\n\r\nWhen we compare the unexplained variation with the total variation, we can visually estimate that the unexplained variation comprises about one-fifth of the total variation. As a result, we estimate that about four-fifths (or about 80%) of the variation is explained.\r\n\r\nWhen we use technology to compute [latex]R^2[\/latex] for this dataset, we find that [latex]R^2=0.82[\/latex]. This is consistent with our visual estimations. In other words, 82% of the variation in house price can be explained by the fact that houses differ in size and there is a linear relationship between price and size.\r\n<div class=\"textbox tryit\">\r\n<h3>Video Placement<\/h3>\r\n<span style=\"background-color: #e6daf7;\">[Perspective Video: A three-instructor video that gives perspectives for how to see [latex]R^{2}[\/latex] as the proportion of the variation in the response variable that can be explained by its linear relationship with the explanatory variable. This video shouldn't be technical for these introductory stats students. In fact, it should reassure them that the interpretation of [latex]R^{2}[\/latex] will be the more important skill to attain, while still developing an intuition of what [latex]R^{2}[\/latex] measures in linearly related data. 
Stress that students don't need to follow the idea presented in the above images thoroughly, but do refer to them, and make it clear that a key point within the scope of this course is made in the notion that [latex]R^{2}[\/latex] close to 1 indicates that a proportion of the variation close to 100% can be explained by the explanatory variable -- a strong linear correlation.]<\/span>\r\n\r\n<\/div>\r\n<div class=\"textbox\">\r\n\r\nAbout the Notation\r\n\r\nYou'll sometimes see\u00a0[latex]R^{2}[\/latex] written in the lowercase as [latex]r^2[\/latex] (like in the <em>DCMP Data Analysis tool<\/em>), but [latex]R^2[\/latex] and [latex]r^2[\/latex] mean the same thing. In these activities, we will use the notation [latex]R^2[\/latex].\r\n\r\nThe reason that we use the symbol [latex]R^{2}[\/latex] is that the coefficient of determination is equal to the square of the correlation coefficient [latex]r[\/latex]. Because of this, [latex]R^2[\/latex] is more sensitive to differences in the strength of the linear relationship between the two variables than [latex]r[\/latex] is. This increased sensitivity can be seen in the graphic below; the difference between [latex]R^2[\/latex] values is greater than the difference between corresponding [latex]r[\/latex] values.\r\n\r\n<span style=\"background-color: #ffff00;\">[The image below is missing the line that indicates the R^2 values beneath each r value! -- Please re-snip and re-insert the full image.]<\/span>\r\n\r\n<img class=\"alignnone wp-image-1241\" style=\"font-size: 1rem; text-align: initial;\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2022\/01\/12193055\/Picture157-300x79.png\" alt=\"Several scatterplots labeled by the correlation of their line of best fit. The first graph is labeled &quot;Perfect Positive Correlation&quot; and shows points exactly on the line of best fit. The line has a positive slope and the r value is 1. 
The second graph is labeled &quot;Strong Positive Correlation&quot; and shows points close to the line of best fit. The slope of the line is positive and the r value is 0.91. The next graph is labeled &quot;Weak Positive Correlation&quot; and shows points that are not close to the line of best fit, but still show a correlation to the line. The slope of the line is positive and the r value is 0.48. The next graph is labeled &quot;No Correlation&quot; and show points randomly scattered across the graph. There is no line of best fit and the r value is 0. The next graph is labeled &quot;Weak Negative Distribution&quot; and shows points that are not close to the line of best fit, but still show a correlation to the line. The slope of the line is negative and the r-value is -0.48. The next graph is labeled &quot;Strong Negative Correlation&quot; and shows points that are close to the line of best fit. The slope of the line is negative and the r-value is -0.91. The last graph is labeled &quot;Perfect Negative Correlation&quot; and shows points that are exactly on the line of best fit. It has a negative slope and the r-value is -1.\" width=\"1067\" height=\"281\" \/>\r\n\r\nWe will not go into more detail here about how [latex]R^2[\/latex] is calculated; instead, you will practice finding and interpreting this value.\r\n<p style=\"padding-left: 30px;\">If you are curious about how this quantity is computed, see this video\u00a0<a href=\"https:\/\/www.youtube.com\/watch?v=lng4ZgConCM\">https:\/\/www.youtube.com\/watch?v=lng4ZgConCM<\/a>.<\/p>\r\n\r\n<\/div>\r\n<h3>\u00a0[latex]R^{2}[\/latex] and Scatterplot Shape<\/h3>\r\nYou've seen that the coefficient of determination\u00a0[latex]R^2[\/latex] is a measure of the proportion of the variation of a response variable in linearly related bivariate data that can be explained by its relationship with the explanatory variable. 
You should understand that\r\n<ul>\r\n \t<li>[latex]R^2[\/latex] is equal to the square of the correlation coefficient\u00a0[latex]r[\/latex] and is never negative (it always lies between 0 and 1).<\/li>\r\n \t<li>[latex]R^2[\/latex] should be interpreted as a percentage.\r\n<ul>\r\n \t<li>That is, if a data set has\u00a0[latex]r=0.87[\/latex], then\u00a0[latex]R^2=0.7569[\/latex], which would be expressed as [latex]75.69\\%[\/latex].<\/li>\r\n \t<li>We would say that approximately [latex]75.7 \\%[\/latex] of the variation in the response variable can be explained by its linear relationship with the explanatory variable.<\/li>\r\n<\/ul>\r\n<\/li>\r\n<\/ul>\r\nConsider what you already understand about the shape and spread of a scatterplot.\r\n<ul>\r\n \t<li>The strongest linear relationships appear in plots as data that is roughly linear in shape with data points that lie very close to some line.<\/li>\r\n \t<li>Weaker relationships may be very roughly linear in shape and more spread out, with data points that lie further from some line.<\/li>\r\n \t<li>Non-linear relationships have data points that either form other shapes or are randomly scattered across the plot.<\/li>\r\n<\/ul>\r\nUse what you already know together with the idea that [latex]R^2[\/latex] is the square of [latex]r[\/latex] to answer Question 1.\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 1<\/h3>\r\nFor this question, use the following graphs.\r\n\r\n[caption id=\"attachment_1242\" align=\"alignnone\" width=\"1184\"]<img class=\"wp-image-1242\" style=\"font-size: 1rem; text-align: initial;\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2022\/01\/12193100\/Picture159-300x201.png\" alt=\"A scatterplot showing points that are fairly close together arranged in a semicircular pattern.\" width=\"1184\" height=\"793\" \/> Graph A[\/caption]\r\n\r\n&nbsp;\r\n\r\n[caption id=\"attachment_1243\" align=\"alignnone\" width=\"1113\"]<img class=\"wp-image-1243\" 
style=\"font-size: 1rem; text-align: initial;\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2022\/01\/12193105\/Picture160-300x200.png\" alt=\"A scatterplot showing points that are close together in a roughly linear shape.\" width=\"1113\" height=\"741\" \/> Graph B[\/caption]\r\n\r\n&nbsp;\r\n\r\n[caption id=\"attachment_1244\" align=\"alignnone\" width=\"1250\"]<img class=\"wp-image-1244\" style=\"font-size: 1rem; text-align: initial;\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2022\/01\/12193109\/Picture161-300x200.png\" alt=\"A scatterplot showing points that are not grouped closely together and follow a somewhat linear pattern.\" width=\"1250\" height=\"833\" \/> Graph C[\/caption]\r\n\r\n&nbsp;\r\n\r\nPart A: Which of the previous scatterplots demonstrates the strongest linear relationship?\r\n\r\n[reveal-answer q=\"739788\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"739788\"]Recall what you have learned about the spread and shape of linear data as it appears visually in a scatterplot.[\/hidden-answer]\r\n\r\n&nbsp;\r\n\r\nPart B: Which of the previous scatterplots do you expect to have the highest value of [latex]R^2[\/latex]? Explain.\r\n\r\n[reveal-answer q=\"712482\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"712482\"]Recall that [latex]R^2[\/latex] is the proportion of the variation in the response variable that can be explained by its linear relationship with the explanatory variable.[\/hidden-answer]\r\n\r\n<\/div>\r\nNow that you have developed some intuition about\u00a0[latex]R^2[\/latex] and the shape of a plot, let's explore how the spread of a plot affects the value of\u00a0[latex]R^2[\/latex].\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 2<\/h3>\r\nGo to the <em>DCMP Explore Linear Regression<\/em> tool at <a href=\"https:\/\/dcmathpathways.shinyapps.io\/ExploreLinReg\/\">https:\/\/dcmathpathways.shinyapps.io\/ExploreLinReg\/<\/a>. 
From the drop-down menu, select \u201cLinear Relationship.\u201d\r\n\r\n<img class=\"wp-image-1245 aligncenter\" style=\"font-size: 1rem; text-align: initial;\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2022\/01\/12193117\/Picture162-300x147.png\" alt=\"A selection menu. At the top, &quot;Explore Linear Regression&quot; is selected and &quot;Scatterplot&quot; and &quot;Residualplot&quot; are unselected. Beneath this is a dropdown menu where &quot;Draw Your Own (Click in Graph)&quot; and &quot;Random Scatter&quot; are unselected, &quot;Linear Relationship&quot; is selected, and &quot;Quadratic Relationship&quot; is unselected. Beneath that is another heading that says &quot;Initial Number of Points.&quot; Under it, 50 is selected and 20, 100, and 500 are all unselected.\" width=\"392\" height=\"192\" \/>\r\n\r\nSelect the boxes that will display [latex]r[\/latex] and [latex]R^2[\/latex]:\r\n\r\n<img class=\"wp-image-1246 aligncenter\" style=\"font-size: 1rem; text-align: initial;\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2022\/01\/12193122\/Picture163-300x175.png\" alt=\"A checklist with the heading &quot;Options.&quot; &quot;Linear Regression Line&quot; is selected, &quot;Smooth Trend&quot; is unselected, and &quot;Show Correlation Coefficient r&quot; and &quot;Squared Correlation Coefficient r squared&quot; are both selected.\" width=\"393\" height=\"230\" \/>\r\n\r\nToggle back and forth between the different options for spread. Make a note for yourself about how [latex]R^2[\/latex] changes as you change the spread from large to medium to small. 
As you do this, note that squaring [latex]r[\/latex] does, in fact, yield [latex]R^2[\/latex].\r\n\r\n<img class=\"wp-image-1247 aligncenter\" style=\"font-size: 1rem; text-align: initial;\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2022\/01\/12193128\/Picture164-300x91.png\" alt=\"A selection menu with the heading &quot;Spread.&quot; Beneath it, &quot;medium&quot; has been selected and &quot;small&quot; and &quot;large&quot; are both unselected. Beneath those is a button that reads &quot;Refresh.&quot;\" width=\"393\" height=\"119\" \/>\r\n\r\nPart A: When the data points lie closer to the line of best fit, the linear relationship between the explanatory variable and the response variable ___________.\r\n<ol>\r\n \t<li>a) gets stronger<\/li>\r\n \t<li>b) gets weaker<\/li>\r\n \t<li>c) stays the same<\/li>\r\n<\/ol>\r\n[reveal-answer q=\"684619\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"684619\"]Recall what you know about the strength of linear relationships.[\/hidden-answer]\r\n\r\n&nbsp;\r\n\r\nPart B: As the linear relationship between the explanatory variable and the response variable gets stronger, the value of [latex]R^2[\/latex] __________.\r\n<ol>\r\n \t<li>a) increases<\/li>\r\n \t<li>b) decreases<\/li>\r\n \t<li>c) stays the same<\/li>\r\n<\/ol>\r\n[reveal-answer q=\"861892\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"861892\"]What happens to [latex]R^2[\/latex] as you change the spread of the points relative to the line in the scatterplot?[\/hidden-answer]\r\n\r\n<\/div>\r\nHopefully you are feeling more confident in understanding\u00a0[latex]R^2[\/latex] with regard to what you already knew about linear relationships and the shape and spread of the data on a scatterplot.\r\n<h3>Interpreting\u00a0[latex]R^2[\/latex] in context<\/h3>\r\nNow it's time to put together what you've learned so far about explanatory and response variables, visual clues in a scatterplot regarding the appropriateness of 
linear analysis, the correlation coefficient\u00a0[latex]r[\/latex], and the coefficient of determination\u00a0[latex]R^2[\/latex]. See the video below for a summary of these ideas, then use what you've learned to answer Questions 3 - 6.\r\n<div class=\"textbox tryit\">\r\n<h3>Video Placement<\/h3>\r\n<span style=\"background-color: #e6daf7;\">[A 3-instructor worked example that summarizes the ideas appearing in Questions 3 - 6. This can be a place to use an inclusion or social justice example. The example should walk through identifying response and explanatory variables from a scenario description, anticipation of the R^2 value upon visual inspection of the plot, confirmation of R^2 via a data analysis tool, and interpretation of R^2 in the context of the scenario.]\u00a0<\/span>\r\n\r\n<\/div>\r\nNow you try it. Go to the <em>DCMP Linear Regression<\/em> tool at <a href=\"https:\/\/dcmathpathways.shinyapps.io\/LinearRegression\/\">https:\/\/dcmathpathways.shinyapps.io\/LinearRegression\/<\/a>. Select \u201cFrom Textbook\u201d and then select the \u201cBad Drivers\u201d dataset to answer Questions 3 - 6 below.\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 3<\/h3>\r\nWe wish to investigate the relationship between the losses (in dollars) incurred by insurance companies for collisions per insured driver and insurance premiums (in dollars). Insurance companies incur losses when drivers who are insured through them are involved in collisions, and the insurance companies then have to pay for the associated costs. 
Insurance premiums are the fees that insurance companies charge; drivers pay premiums to the insurance companies in order to buy insurance coverage.\r\n\r\n&nbsp;\r\n\r\nPart A: Which variable is the explanatory variable?\r\n<ol>\r\n \t<li>a) Losses<\/li>\r\n \t<li>b) Insurance premiums<\/li>\r\n<\/ol>\r\n[reveal-answer q=\"765472\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"765472\"]Which variable measures the outcome and which is the variable driving the outcome?[\/hidden-answer]\r\n\r\n&nbsp;\r\n\r\nPart B: Which variable is the response variable?\r\n<ol>\r\n \t<li>a) Losses<\/li>\r\n \t<li>b) Insurance premiums<\/li>\r\n<\/ol>\r\n[reveal-answer q=\"439776\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"439776\"]Which variable measures the outcome and which is the variable driving the outcome?[\/hidden-answer]\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 4<\/h3>\r\nVisually inspect the scatterplot. Which of the following do you expect of the [latex]R^2[\/latex] value?\r\n<ol>\r\n \t<li>a) Very close to 0%<\/li>\r\n \t<li>b) Between 10% and 50%<\/li>\r\n \t<li>c) Between 50% and 90%<\/li>\r\n \t<li>d) Very close to 100%<\/li>\r\n<\/ol>\r\n[reveal-answer q=\"816551\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"816551\"]Visually inspect the scatterplot for shape and spread.[\/hidden-answer]\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 5<\/h3>\r\nUse the tool to find [latex]R^2[\/latex]. Note that the coefficient of determination ([latex]R^2[\/latex]) is given in a table below the scatterplot. 
What is the value of [latex]R^2[\/latex]?\r\n\r\n[reveal-answer q=\"383832\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"383832\"]It will be listed in a summary of the statistical model, near [latex]r[\/latex].[\/hidden-answer]\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 6<\/h3>\r\nInterpret [latex]R^2[\/latex] for this scatterplot: _______% of the variation in _______ can be explained by its linear relationship with _______.\r\n\r\n[reveal-answer q=\"271381\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"271381\"]Refer to the definition of [latex]R^2[\/latex] given at the beginning of this assignment. [\/hidden-answer]\r\n\r\n<\/div>\r\n<h3>\u00a0Identifying\u00a0[latex]R^2[\/latex] in Context<\/h3>\r\nThe final two questions below ask you to think carefully about the characteristics of [latex]R^2[\/latex] and the characteristics of data graphed on a scatterplot. Use what you have learned about these to answer the questions below. Don't forget to use the given hint if necessary.\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 7<\/h3>\r\nDepending on the tools you use, [latex]R^2[\/latex] may be expressed as a decimal or as a percentage. Even though the tool expresses [latex]R^2[\/latex] using a percentage, it is important to be able to read values of [latex]R^2[\/latex] in both decimal and percentage form. Consider an arbitrary [latex]R^2[\/latex]. Which of the following are possible values of [latex]R^2[\/latex]? There may be more than one correct answer.\r\n<ol>\r\n \t<li>a) 0.1<\/li>\r\n \t<li>b) 30%<\/li>\r\n \t<li>c) 1.5<\/li>\r\n \t<li>d) 1.5%<\/li>\r\n \t<li>e) 100%<\/li>\r\n \t<li>f) -0.1<\/li>\r\n \t<li>g) -44<\/li>\r\n \t<li>h) 1<\/li>\r\n<\/ol>\r\n[reveal-answer q=\"555842\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"555842\"]\u00a0Recall that [latex]R^2[\/latex] is the proportion of the variation in the response variable that can be explained by its linear relationship with the explanatory variable. 
Does it make sense for this proportion to be negative? Can more than 100% of the variation in the response variable be explained by its linear relationship with the explanatory variable? You may have to convert between decimals and percentages.[\/hidden-answer]\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 8<\/h3>\r\nBoth of the plots below have an [latex]R^2[\/latex] value of 0.9%. This is quite close to zero, which means that very little of the variation in the response variable can be explained by its linear relationship with the explanatory variable.\r\n\r\n<img class=\"alignnone wp-image-1248\" style=\"font-size: 1rem; text-align: initial;\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2022\/01\/12193134\/Picture165-300x195.png\" alt=\"A scatterplot showing points arranged in a horizontal zig-zag pattern.\" width=\"325\" height=\"211\" \/>.\u00a0 \u00a0<img class=\"alignnone wp-image-1249\" style=\"font-size: 1rem; text-align: initial;\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2022\/01\/12193140\/Picture166-300x193.png\" alt=\"A scatterplot showing points arranged randomly.\" width=\"326\" height=\"210\" \/>\r\n\r\nDetermine whether this statement is true or false: If [latex]R^2[\/latex] is very small, then there is no relationship between the explanatory variable and the response variable.\r\n\r\n[reveal-answer q=\"674667\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"674667\"]Can [latex]R^2[\/latex] indicate a weak linear relationship without revealing some other kind of strong relationship between the variables?[\/hidden-answer]\r\n\r\n<\/div>\r\n<h2>Summary<\/h2>\r\nIn this <em>What to Know\u00a0<\/em>page,\u00a0you learned about the coefficient of determination [latex]R^{2}[\/latex].\u00a0 Here is a summary, per question, of what you saw.\r\n<ul>\r\n \t<li>In Questions 1, 2, 4, and 8, you developed intuition about how 
[latex]R^{2}[\/latex] is related to the shape of a scatterplot.<\/li>\r\n \t<li>In Question 3, you identified variable types (explanatory and response) and plotted data in a scatterplot.<\/li>\r\n \t<li>In Question 5, you used technology to calculate\u00a0[latex]R^{2}[\/latex].<\/li>\r\n \t<li>In Question 6, you interpreted the meaning of\u00a0[latex]R^{2}[\/latex] in context.<\/li>\r\n \t<li>In Question 7, you identified possible values of\u00a0[latex]R^{2}[\/latex].<\/li>\r\n<\/ul>\r\nHopefully, you are beginning to develop a basic understanding of the coefficient of determination\u00a0[latex]R^{2}[\/latex]. Let's move to the activity in\u00a0<em>Forming Connections\u00a0<\/em>to continue exploring these ideas.","rendered":"<div class=\"textbox learning-objectives\">\n<h3>Learning Goals<\/h3>\n<p>At the end of this page, you should feel comfortable performing these skills:<\/p>\n<ul>\n<li>Develop intuition about how [latex]R^{2}[\/latex] is related to the shape of a scatterplot.<\/li>\n<li>Use technology to calculate [latex]R^{2}[\/latex].<\/li>\n<li>Interpret the meaning of [latex]R^{2}[\/latex] in context.<\/li>\n<li>Identify possible values of [latex]R^{2}[\/latex].<\/li>\n<\/ul>\n<\/div>\n<p>In the next in-class activity, you will need to be able to interpret the meaning of [latex]R^2[\/latex] in context, relate [latex]R^2[\/latex] to the shape of a scatterplot, and identify variable types (explanatory and response) and plot data in a scatterplot. We&#8217;ll prepare for this by developing your understanding of how [latex]R^{2}[\/latex] is related to the shape of a scatterplot as you learn to calculate, interpret, and recognize this measure in different scenarios.<\/p>\n<h2>The Coefficient of Determination<\/h2>\n<p>The coefficient of determination, denoted [latex]R^2[\/latex] and pronounced \u201cR squared,\u201d is the proportion of the variation in the response variable that can be explained by its linear relationship with the explanatory variable. 
The following graphic shows a visualization of what we mean by this.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-1240\" style=\"font-size: 1rem; text-align: initial;\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2022\/01\/12193049\/Picture156-300x230.jpg\" alt=\"Several graphs and charts. The first two both show House prices in Florida, graphing size in square feet on the x-axis and price in dollars on the y-axis. The first chart has a horizontal line across the center. For each point, a square is drawn based on its distance from the horizontal line. On the second graph, the points are in the same locations, but instead of a horizontal line, there is a diagonal line of best fit. For each point, a square is drawn based on its vertical distance from this line. Beneath this is two illustrations, each one corresponding to one of the graphs. The first one is titled &quot;Total variation in price&quot; and shows each of the squares from the graph with the horizontal line. The second one is labeled &quot;Variation in price not explained by its linear relationship with size&quot; and shows the squares from the graph with the line of best fit. Beneath both of these is another heading that says &quot;Proportion of variation in price not explained by its linear relationship with size.&quot; Beneath it, the squares from each graph are shown again, but this time, overlain, showing that the squares from the graph with the line of best fit have approximately one fifth of the area of those from the graph with the horizontal line.\" width=\"1243\" height=\"953\" \/><\/p>\n<p>The scatterplots show the prices and sizes of six houses. The first scatterplot includes the data points, as well as a horizontal line whose y-intercept is the mean house price. The second scatterplot includes the data points and the line of best fit. 
The blue squares in the first scatterplot are a demonstration of the total variation in the price; as you may recall from your discussion of variance, this is related to the sum of the squared distance from the mean. When we find the line of best fit, the distance of each data point to the line is minimized. The green squares in the second scatterplot are a demonstration of the variation in price that is left over after fitting a line to the data; in other words, the green squares show the variation in price that is not explained by its linear relationship with size.<\/p>\n<p>When we compare the unexplained variation with the total variation, we can visually estimate that the unexplained variation comprises about one-fifth of the total variation. As a result, we estimate that about four-fifths (or about 80%) of the variation is explained.<\/p>\n<p>When we use technology to compute [latex]R^2[\/latex] for this dataset, we find that [latex]R^2=0.82[\/latex]. This is consistent with our visual estimations. In other words, 82% of the variation in house price can be explained by the fact that houses differ in size and there is a linear relationship between price and size.<\/p>\n<div class=\"textbox tryit\">\n<h3>Video Placement<\/h3>\n<p><span style=\"background-color: #e6daf7;\">[Perspective Video: A three-instructor video that gives perspectives for how to see [latex]R^{2}[\/latex] as the proportion of the variation in the response variable that can be explained by its linear relationship with the explanatory variable. This video shouldn&#8217;t be technical for these introductory stats students. In fact, it should reassure them that the interpretation of [latex]R^{2}[\/latex] will be the more important skill to attain, while still developing an intuition of what [latex]R^{2}[\/latex] measures in linearly related data. 
Stress that students don&#8217;t need to follow the idea presented in the above images thoroughly, but do refer to them, and make it clear that a key point within the scope of this course is made in the notion that [latex]R^{2}[\/latex] close to 1 indicates that a proportion of the variation close to 100% can be explained by the explanatory variable &#8212; a strong linear correlation.]<\/span><\/p>\n<\/div>\n<div class=\"textbox\">\n<p>About the Notation<\/p>\n<p>You&#8217;ll sometimes see\u00a0[latex]R^{2}[\/latex] written in the lowercase as [latex]r^2[\/latex] (like in the <em>DCMP Data Analysis tool<\/em>), but [latex]R^2[\/latex] and [latex]r^2[\/latex] mean the same thing. In these activities, we will use the notation [latex]R^2[\/latex].<\/p>\n<p>The reason that we use the symbol [latex]R^{2}[\/latex] is that the coefficient of determination is equal to the square of the correlation coefficient [latex]r[\/latex]. Because of this, [latex]R^2[\/latex] is more sensitive to differences in the strength of the linear relationship between the two variables than [latex]r[\/latex] is. This increased sensitivity can be seen in the graphic below; the difference between [latex]R^2[\/latex] values is greater than the difference between corresponding [latex]r[\/latex] values.<\/p>\n<p><span style=\"background-color: #ffff00;\">[The image below is missing the line that indicates the R^2 values beneath each r value! &#8212; Please re-snip and re-insert the full image.]<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-1241\" style=\"font-size: 1rem; text-align: initial;\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2022\/01\/12193055\/Picture157-300x79.png\" alt=\"Several scatterplots labeled by the correlation of their line of best fit. The first graph is labeled &quot;Perfect Positive Correlation&quot; and shows points exactly on the line of best fit. 
The line has a positive slope and the r value is 1. The second graph is labeled &quot;Strong Positive Correlation&quot; and shows points close to the line of best fit. The slope of the line is positive and the r value is 0.91. The next graph is labeled &quot;Weak Positive Correlation&quot; and shows points that are not close to the line of best fit, but still show a correlation to the line. The slope of the line is positive and the r value is 0.48. The next graph is labeled &quot;No Correlation&quot; and show points randomly scattered across the graph. There is no line of best fit and the r value is 0. The next graph is labeled &quot;Weak Negative Distribution&quot; and shows points that are not close to the line of best fit, but still show a correlation to the line. The slope of the line is negative and the r-value is -0.48. The next graph is labeled &quot;Strong Negative Correlation&quot; and shows points that are close to the line of best fit. The slope of the line is negative and the r-value is -0.91. The last graph is labeled &quot;Perfect Negative Correlation&quot; and shows points that are exactly on the line of best fit. It has a negative slope and the r-value is -1.\" width=\"1067\" height=\"281\" \/><\/p>\n<p>We will not go into more detail here about how [latex]R^2[\/latex] is calculated; instead, you will practice finding and interpreting this value.<\/p>\n<p style=\"padding-left: 30px;\">If you are curious about how this quantity is computed, see this video\u00a0<a href=\"https:\/\/www.youtube.com\/watch?v=lng4ZgConCM\">https:\/\/www.youtube.com\/watch?v=lng4ZgConCM<\/a>.<\/p>\n<\/div>\n<h3>\u00a0[latex]R^{2}[\/latex] and Scatterplot Shape<\/h3>\n<p>You&#8217;ve seen that the coefficient of determination\u00a0[latex]R^2[\/latex] is a measure of the proportion of the variation of a response variable in linearly related bivariate data that can be explained by its relationship with the explanatory variable. 
You should understand that<\/p>\n<ul>\n<li>[latex]R^2[\/latex] is equal to the square of the correlation coefficient\u00a0[latex]r[\/latex], so it is never negative.<\/li>\n<li>[latex]R^2[\/latex] should be interpreted as a percentage.\n<ul>\n<li>That is, if a data set has\u00a0[latex]r=0.87[\/latex], then\u00a0[latex]R^2=0.7569[\/latex], which would be expressed as [latex]75.69\\%[\/latex].<\/li>\n<li>We would say that approximately [latex]75.7 \\%[\/latex] of the variation in the response variable can be explained by its linear relationship with the explanatory variable.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p>Consider what you already understand about the shape and spread of a scatterplot.<\/p>\n<ul>\n<li>The strongest linear relationships appear in plots as data that is roughly linear in shape with data points that lie very close to some line.<\/li>\n<li>Weaker relationships may be very roughly linear in shape and more spread out, with data points that lie farther from some line.<\/li>\n<li>Non-linear relationships have data points that either form other shapes or are randomly scattered across the plot.<\/li>\n<\/ul>\n<p>Use what you already know together with the idea that [latex]R^2[\/latex] is the square of [latex]r[\/latex] to answer Question 1.<\/p>\n<div class=\"textbox key-takeaways\">\n<h3>question 1<\/h3>\n<p>For this question, use the following graphs.<\/p>\n<div id=\"attachment_1242\" style=\"width: 1194px\" class=\"wp-caption alignnone\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-1242\" class=\"wp-image-1242\" style=\"font-size: 1rem; text-align: initial;\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2022\/01\/12193100\/Picture159-300x201.png\" alt=\"A scatterplot showing points that are fairly close together arranged in a semicircular pattern.\" width=\"1184\" height=\"793\" \/><\/p>\n<p id=\"caption-attachment-1242\" class=\"wp-caption-text\">Graph 
A<\/p>\n<\/div>\n<p>&nbsp;<\/p>\n<div id=\"attachment_1243\" style=\"width: 1123px\" class=\"wp-caption alignnone\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-1243\" class=\"wp-image-1243\" style=\"font-size: 1rem; text-align: initial;\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2022\/01\/12193105\/Picture160-300x200.png\" alt=\"A scatterplot showing points that are close together in a roughly linear shape.\" width=\"1113\" height=\"741\" \/><\/p>\n<p id=\"caption-attachment-1243\" class=\"wp-caption-text\">Graph B<\/p>\n<\/div>\n<p>&nbsp;<\/p>\n<div id=\"attachment_1244\" style=\"width: 1260px\" class=\"wp-caption alignnone\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-1244\" class=\"wp-image-1244\" style=\"font-size: 1rem; text-align: initial;\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2022\/01\/12193109\/Picture161-300x200.png\" alt=\"A scatterplot showing points that are not grouped closely together and follow a somewhat linear pattern.\" width=\"1250\" height=\"833\" \/><\/p>\n<p id=\"caption-attachment-1244\" class=\"wp-caption-text\">Graph C<\/p>\n<\/div>\n<p>&nbsp;<\/p>\n<p>Part A: Which of the previous scatterplots demonstrates the strongest linear relationship?<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q739788\">Hint<\/span><\/p>\n<div id=\"q739788\" class=\"hidden-answer\" style=\"display: none\">Recall what you have learned about the spread and shape of linear data as it appears visually in a scatterplot.<\/div>\n<\/div>\n<p>&nbsp;<\/p>\n<p>Part B: Which of the previous scatterplots do you expect to have the highest value of [latex]R^2[\/latex]? 
Explain.<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q712482\">Hint<\/span><\/p>\n<div id=\"q712482\" class=\"hidden-answer\" style=\"display: none\">Recall that [latex]R^2[\/latex] is the proportion of the variation in the response variable that can be explained by its linear relationship with the explanatory variable.<\/div>\n<\/div>\n<\/div>\n<p>Now that you have developed some intuition about\u00a0[latex]R^2[\/latex] and the shape of a plot, let&#8217;s explore how the spread of a plot affects the value of\u00a0[latex]R^2[\/latex].<\/p>\n<div class=\"textbox key-takeaways\">\n<h3>question 2<\/h3>\n<p>Go to the <em>DCMP Explore Linear Regression<\/em> tool at <a href=\"https:\/\/dcmathpathways.shinyapps.io\/ExploreLinReg\/\">https:\/\/dcmathpathways.shinyapps.io\/ExploreLinReg\/<\/a>. From the drop-down menu, select \u201cLinear Relationship.\u201d<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-1245 aligncenter\" style=\"font-size: 1rem; text-align: initial;\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2022\/01\/12193117\/Picture162-300x147.png\" alt=\"A selection menu. At the top, &quot;Explore Linear Regression&quot; is selected and &quot;Scatterplot&quot; and &quot;Residualplot&quot; are unselected. Beneath this is a dropdown menu where &quot;Draw Your Own (Click in Graph)&quot; and &quot;Random Scatter&quot; are unselected, &quot;Linear Relationship&quot; is selected, and &quot;Quadratic Relationship&quot; is unselected. 
Beneath that is another heading that says &quot;Initial Number of Points.&quot; Under it, 50 is selected and 20, 100, and 500 are all unselected.\" width=\"392\" height=\"192\" \/><\/p>\n<p>Select the boxes that will display [latex]r[\/latex] and [latex]R^2[\/latex]:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-1246 aligncenter\" style=\"font-size: 1rem; text-align: initial;\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2022\/01\/12193122\/Picture163-300x175.png\" alt=\"A checklist with the heading &quot;Options.&quot; &quot;Linear Regression Line&quot; is selected, &quot;Smooth Trend&quot; is unselected, and &quot;Show Correlation Coefficient r&quot; and &quot;Squared Correlation Coefficient r squared&quot; are both selected.\" width=\"393\" height=\"230\" \/><\/p>\n<p>Toggle back and forth between the different options for spread. Make a note for yourself about how [latex]R^2[\/latex] changes as you change the spread from large to medium to small. As you do this, note that squaring [latex]r[\/latex] does, in fact, yield [latex]R^2[\/latex].<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-1247 aligncenter\" style=\"font-size: 1rem; text-align: initial;\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2022\/01\/12193128\/Picture164-300x91.png\" alt=\"A selection menu with the heading &quot;Spread.&quot; Beneath it, &quot;medium&quot; has been selected and &quot;small&quot; and &quot;large&quot; are both unselected. 
Beneath those is a button that reads &quot;Refresh.&quot;\" width=\"393\" height=\"119\" \/><\/p>\n<p>Part A: When the data points lie closer to the line of best fit, the linear relationship between the explanatory variable and the response variable ___________.<\/p>\n<ol>\n<li>a) gets stronger<\/li>\n<li>b) gets weaker<\/li>\n<li>c) stays the same<\/li>\n<\/ol>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q684619\">Hint<\/span><\/p>\n<div id=\"q684619\" class=\"hidden-answer\" style=\"display: none\">Recall what you know about the strength of linear relationships.<\/div>\n<\/div>\n<p>&nbsp;<\/p>\n<p>Part B: As the linear relationship between the explanatory variable and the response variable gets stronger, the value of [latex]R^2[\/latex] __________.<\/p>\n<ol>\n<li>a) increases<\/li>\n<li>b) decreases<\/li>\n<li>c) stays the same<\/li>\n<\/ol>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q861892\">Hint<\/span><\/p>\n<div id=\"q861892\" class=\"hidden-answer\" style=\"display: none\">What happens to [latex]R^2[\/latex] as you change the spread of the points relative to the line in the scatterplot?<\/div>\n<\/div>\n<\/div>\n<p>Hopefully you are feeling more confident in understanding\u00a0[latex]R^2[\/latex] with regard to what you already knew about linear relationships and the shape and spread of the data on a scatterplot.<\/p>\n<h3>Interpreting\u00a0[latex]R^2[\/latex] in context<\/h3>\n<p>Now it&#8217;s time to put together what you&#8217;ve learned so far about explanatory and response variables, visual clues in a scatterplot regarding the appropriateness of linear analysis, the correlation coefficient\u00a0[latex]r[\/latex], and the coefficient of determination\u00a0[latex]R^2[\/latex]. 
See the video below for a summary of these ideas, then use what you&#8217;ve learned to answer Questions 3 &#8211; 6.<\/p>\n<div class=\"textbox tryit\">\n<h3>Video Placement<\/h3>\n<p><span style=\"background-color: #e6daf7;\">[A 3-instructor worked example that summarizes the ideas appearing in Questions 3 &#8211; 6. This can be a place to use an inclusion or social justice example. The example should walk through identifying response and explanatory variables from a scenario description, anticipation of the R^2 value upon visual inspection of the plot, confirmation of R^2 via a data analysis tool, and interpretation of R^2 in the context of the scenario.]\u00a0<\/span><\/p>\n<\/div>\n<p>Now you try it. Go to the <em>DCMP Linear Regression<\/em> tool at <a href=\"https:\/\/dcmathpathways.shinyapps.io\/LinearRegression\/\">https:\/\/dcmathpathways.shinyapps.io\/LinearRegression\/<\/a>. Select \u201cFrom Textbook\u201d and then select the \u201cBad Drivers\u201d dataset to answer Questions 3 &#8211; 6 below.<\/p>\n<div class=\"textbox key-takeaways\">\n<h3>question 3<\/h3>\n<p>We wish to investigate the relationship between the losses (in dollars) incurred by insurance companies for collisions per insured driver and insurance premiums (in dollars). Insurance companies incur losses when drivers who are insured through them are involved in collisions, and the insurance companies then have to pay for the associated costs. 
Insurance premiums are the fees that insurance companies charge; drivers pay premiums to the insurance companies in order to buy insurance coverage.<\/p>\n<p>&nbsp;<\/p>\n<p>Part A: Which variable is the explanatory variable?<\/p>\n<ol>\n<li>a) Losses<\/li>\n<li>b) Insurance premiums<\/li>\n<\/ol>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q765472\">Hint<\/span><\/p>\n<div id=\"q765472\" class=\"hidden-answer\" style=\"display: none\">Which variable measures the outcome and which is the variable driving the outcome?<\/div>\n<\/div>\n<p>&nbsp;<\/p>\n<p>Part B: Which variable is the response variable?<\/p>\n<ol>\n<li>a) Losses<\/li>\n<li>b) Insurance premiums<\/li>\n<\/ol>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q439776\">Hint<\/span><\/p>\n<div id=\"q439776\" class=\"hidden-answer\" style=\"display: none\">Which variable measures the outcome and which is the variable driving the outcome?<\/div>\n<\/div>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>question 4<\/h3>\n<p>Visually inspect the scatterplot. Which of the following do you expect of the [latex]R^2[\/latex] value?<\/p>\n<ol>\n<li>a) Very close to 0%<\/li>\n<li>b) Between 10% and 50%<\/li>\n<li>c) Between 50% and 90%<\/li>\n<li>d) Very close to 100%<\/li>\n<\/ol>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q816551\">Hint<\/span><\/p>\n<div id=\"q816551\" class=\"hidden-answer\" style=\"display: none\">Visually inspect the scatterplot for shape and spread.<\/div>\n<\/div>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>question 5<\/h3>\n<p>Use the tool to find [latex]R^2[\/latex]. Note that the coefficient of determination ([latex]R^2[\/latex]) is given in a table below the scatterplot. 
What is the value of [latex]R^2[\/latex]?<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q383832\">Hint<\/span><\/p>\n<div id=\"q383832\" class=\"hidden-answer\" style=\"display: none\">It will be listed in a summary of the statistical model, near [latex]r[\/latex].<\/div>\n<\/div>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>question 6<\/h3>\n<p>Interpret [latex]R^2[\/latex] for this scatterplot: _______% of the variation in _______ can be explained by its linear relationship with _______.<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q271381\">Hint<\/span><\/p>\n<div id=\"q271381\" class=\"hidden-answer\" style=\"display: none\">Refer to the definition of [latex]R^2[\/latex] given at the beginning of this assignment. <\/div>\n<\/div>\n<\/div>\n<h3>\u00a0Identifying\u00a0[latex]R^2[\/latex] in Context<\/h3>\n<p>The final two questions below ask you to think carefully about the characteristics of [latex]R^2[\/latex] and the characteristics of data graphed on a scatterplot. Use what you have learned about these to answer the questions below. Don&#8217;t forget to use the given hint if necessary.<\/p>\n<div class=\"textbox key-takeaways\">\n<h3>question 7<\/h3>\n<p>Depending on the tools you use, [latex]R^2[\/latex] may be expressed as a decimal or as a percentage. Even though the tool expresses [latex]R^2[\/latex] using a percentage, it is important to be able to read values of [latex]R^2[\/latex] in both decimal and percentage form. Consider an arbitrary [latex]R^2[\/latex]. Which of the following are possible values of [latex]R^2[\/latex]? 
There may be more than one correct answer.<\/p>\n<ol>\n<li>a) 0.1<\/li>\n<li>b) 30%<\/li>\n<li>c) 1.5<\/li>\n<li>d) 1.5%<\/li>\n<li>e) 100%<\/li>\n<li>f) -0.1<\/li>\n<li>g) -44<\/li>\n<li>h) 1<\/li>\n<\/ol>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q555842\">Hint<\/span><\/p>\n<div id=\"q555842\" class=\"hidden-answer\" style=\"display: none\">\u00a0Recall that [latex]R^2[\/latex] is the proportion of the variation in the response variable that can be explained by its linear relationship with the explanatory variable. Does it make sense for this proportion to be negative? Can more than 100% of the variation in the response variable be explained by its linear relationship with the explanatory variable? You may have to convert between decimals and percentages.<\/div>\n<\/div>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>question 8<\/h3>\n<p>Both of the plots below have an [latex]R^2[\/latex] value of 0.9%. 
This is quite close to zero, which means that very little of the variation in the response variable can be explained by its linear relationship with the explanatory variable.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-1248\" style=\"font-size: 1rem; text-align: initial;\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2022\/01\/12193134\/Picture165-300x195.png\" alt=\"A scatterplot showing points arranged in a horizontal zig-zag pattern.\" width=\"325\" height=\"211\" \/>\u00a0 \u00a0<img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-1249\" style=\"font-size: 1rem; text-align: initial;\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2022\/01\/12193140\/Picture166-300x193.png\" alt=\"A scatterplot showing points arranged randomly.\" width=\"326\" height=\"210\" \/><\/p>\n<p>Determine whether this statement is true or false: If [latex]R^2[\/latex] is very small, then there is no relationship between the explanatory variable and the response variable.<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q674667\">Hint<\/span><\/p>\n<div id=\"q674667\" class=\"hidden-answer\" style=\"display: none\">Can [latex]R^2[\/latex] indicate a weak linear relationship without revealing some other kind of strong relationship between the variables?<\/div>\n<\/div>\n<\/div>\n<h2>Summary<\/h2>\n<p>In this <em>What to Know\u00a0<\/em>page,\u00a0you learned about the coefficient of determination [latex]R^{2}[\/latex].\u00a0 Here is a summary, per question, of what you saw.<\/p>\n<ul>\n<li>In Questions 1, 2, 4, and 8, you developed intuition about how [latex]R^{2}[\/latex] is related to the shape of a scatterplot.<\/li>\n<li>In Question 3, you identified variable types (explanatory and response) and plotted data in a scatterplot.<\/li>\n<li>In Question 5, you 
used technology to calculate\u00a0[latex]R^{2}[\/latex].<\/li>\n<li>In Question 6, you interpreted the meaning of\u00a0[latex]R^{2}[\/latex] in context.<\/li>\n<li>In Question 7, you identified possible values of\u00a0[latex]R^{2}[\/latex].<\/li>\n<\/ul>\n<p>Hopefully, you are beginning to develop a basic understanding of the coefficient of determination\u00a0[latex]R^{2}[\/latex]. Let&#8217;s move to the activity in\u00a0<em>Forming Connections\u00a0<\/em>to continue exploring these ideas.<\/p>\n","protected":false},"author":428269,"menu_order":13,"template":"","meta":{"_candela_citation":"[]","CANDELA_OUTCOMES_GUID":"","pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[],"contributor":[],"license":[],"class_list":["post-3859","chapter","type-chapter","status-publish","hentry"],"part":4241,"_links":{"self":[{"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/chapters\/3859","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/wp\/v2\/users\/428269"}],"version-history":[{"count":10,"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/chapters\/3859\/revisions"}],"predecessor-version":[{"id":4839,"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/chapters\/3859\/revisions\/4839"}],"part":[{"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/parts\/4241"}],"metadata":[{"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/chapters\/3859
\/metadata\/"}],"wp:attachment":[{"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/wp\/v2\/media?parent=3859"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/chapter-type?post=3859"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/wp\/v2\/contributor?post=3859"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/wp\/v2\/license?post=3859"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}