{"id":261,"date":"2017-04-15T03:20:18","date_gmt":"2017-04-15T03:20:18","guid":{"rendered":"https:\/\/courses.lumenlearning.com\/conceptstest1\/chapter\/introduction-6\/"},"modified":"2017-05-30T23:39:57","modified_gmt":"2017-05-30T23:39:57","slug":"introduction-6","status":"web-only","type":"chapter","link":"https:\/\/courses.lumenlearning.com\/atd-herkimer-statisticssocsci\/chapter\/introduction-6\/","title":{"raw":"Why It Matters: Probability and Probability Distributions","rendered":"Why It Matters: Probability and Probability Distributions"},"content":{"raw":"&nbsp;\r\n\r\nRecall the Big Picture\u2014the four-step process that encompasses statistics (as it is presented in this course):\r\n\r\n<img class=\"alignnone\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1729\/2017\/04\/15032014\/m6_probability_big_picture_probability.gif\" alt=\"The Big Picture of statistics. Shown on the diagram are Step 1: Producing Data, Step 2: Exploratory Data Analysis, Step 3: Probability, and Step 4: Inference. Highlighted in this diagram is Step 3: Probability\" width=\"868\" height=\"420\" \/>\r\n\r\nSo far, we've discussed the first two steps:\r\n\r\n<strong>Producing data<\/strong>\u2014how data are obtained and what considerations affect the data production process.\r\n\r\n<strong>Exploratory data analysis<\/strong>\u2014tools that help us get a first feel for the data by exposing their features using graphs and numbers.\r\n\r\nOur eventual goal is <strong>inference<\/strong>\u2014drawing reliable conclusions about the population on the basis of what we've discovered in our sample. To really understand how inference works, though, we first need to talk about <strong>probability<\/strong>, because it is the underlying foundation for the methods of statistical inference. 
We use an example to explain why probability is so essential to inference.\r\n\r\nFirst, here is the general idea: As we all know, statistics works by using a sample to learn about the population from which it was drawn. Ideally, the sample should be random so that it represents the population well.\r\n\r\nRecall from <em>Types of Statistical Studies and Producing Data<\/em> that when we say a random sample represents the population <em>well<\/em>, we mean that there is no inherent bias in this sampling technique. It is important to acknowledge, though, that this does not mean all random samples are necessarily \u201cperfect.\u201d Random samples are still random, and therefore no random sample will be exactly the same as another. One random sample may give a fairly accurate representation of the population, whereas another random sample might be \u201coff\u201d purely because of chance. Unfortunately, when looking at a particular sample (which is what happens in practice), we never know how much it differs from the population. This uncertainty is where probability comes into the picture. We use probability to quantify how much we expect random samples to vary. This gives us a way to draw conclusions about the population in the face of the uncertainty that is generated by the use of a random sample. The following example illustrates this important point.\r\n<div class=\"textbox examples\">\r\n<h3>Example<\/h3>\r\n<h2>Death Penalty<\/h2>\r\nSuppose we are interested in estimating the percentage of U.S. adults who favor the death penalty. To do so, we choose a random sample of 1,200 U.S. adults and ask their opinion: either in favor of or against the death penalty. We find that 744 of the 1,200, or 62%, are in favor. (Although this is only an example, 62% is quite realistic given some recent polls.) 
Here is a picture that illustrates what we have done and found in our example:\r\n\r\n<img class=\"alignnone\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1729\/2017\/04\/15032017\/m6_probability_image002.gif\" alt=\"Illustration of random sampling\" width=\"424\" height=\"265\" \/>\r\n\r\nOur goal is to do inference\u2014to learn and draw conclusions about the opinions of the entire population of U.S. adults regarding the death penalty on the basis of the opinions of only 1,200 of them.\r\n\r\nCan we conclude that 62% of the population favors the death penalty? Another random sample could give a very different result, so we are uncertain. But because our sample is random, we know that our uncertainty is due to chance, not to problems with how the sample was collected. So we can use probability to describe the likelihood that our sample is within a desired level of accuracy. For example, probability can answer the question, <em>How likely is it that our sample estimate is no more than 3% away from the true percentage of all U.S. adults who are in favor of the death penalty?<\/em>\r\n\r\nAnswering this question (which we do using probability) has an important impact on the confidence we can attach to the inference step. In particular, if we find it quite unlikely that the sample percentage will be very different from the population percentage, then we have a lot of confidence that we can draw conclusions about the population on the basis of the sample.\r\n\r\nIn this module, we discuss probability more generally. 
Then we begin to develop the probability machinery that underlies inference.\r\n\r\n<\/div>\r\n&nbsp;","rendered":"<p>&nbsp;<\/p>\n<p>Recall the Big Picture\u2014the four-step process that encompasses statistics (as it is presented in this course):<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1729\/2017\/04\/15032014\/m6_probability_big_picture_probability.gif\" alt=\"The Big Picture of statistics. Shown on the diagram are Step 1: Producing Data, Step 2: Exploratory Data Analysis, Step 3: Probability, and Step 4: Inference. Highlighted in this diagram is Step 3: Probability\" width=\"868\" height=\"420\" \/><\/p>\n<p>So far, we&#8217;ve discussed the first two steps:<\/p>\n<p><strong>Producing data<\/strong>\u2014how data are obtained and what considerations affect the data production process.<\/p>\n<p><strong>Exploratory data analysis<\/strong>\u2014tools that help us get a first feel for the data by exposing their features using graphs and numbers.<\/p>\n<p>Our eventual goal is <strong>inference<\/strong>\u2014drawing reliable conclusions about the population on the basis of what we&#8217;ve discovered in our sample. To really understand how inference works, though, we first need to talk about <strong>probability<\/strong>, because it is the underlying foundation for the methods of statistical inference. We use an example to explain why probability is so essential to inference.<\/p>\n<p>First, here is the general idea: As we all know, statistics works by using a sample to learn about the population from which it was drawn. Ideally, the sample should be random so that it represents the population well.<\/p>\n<p>Recall from <em>Types of Statistical Studies and Producing Data<\/em> that when we say a random sample represents the population <em>well<\/em>, we mean that there is no inherent bias in this sampling technique. 
It is important to acknowledge, though, that this does not mean all random samples are necessarily \u201cperfect.\u201d Random samples are still random, and therefore no random sample will be exactly the same as another. One random sample may give a fairly accurate representation of the population, whereas another random sample might be \u201coff\u201d purely because of chance. Unfortunately, when looking at a particular sample (which is what happens in practice), we never know how much it differs from the population. This uncertainty is where probability comes into the picture. We use probability to quantify how much we expect random samples to vary. This gives us a way to draw conclusions about the population in the face of the uncertainty that is generated by the use of a random sample. The following example illustrates this important point.<\/p>\n<div class=\"textbox examples\">\n<h3>Example<\/h3>\n<h2>Death Penalty<\/h2>\n<p>Suppose we are interested in estimating the percentage of U.S. adults who favor the death penalty. To do so, we choose a random sample of 1,200 U.S. adults and ask their opinion: either in favor of or against the death penalty. We find that 744 of the 1,200, or 62%, are in favor. (Although this is only an example, 62% is quite realistic given some recent polls.) Here is a picture that illustrates what we have done and found in our example:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1729\/2017\/04\/15032017\/m6_probability_image002.gif\" alt=\"Illustration of random sampling\" width=\"424\" height=\"265\" \/><\/p>\n<p>Our goal is to do inference\u2014to learn and draw conclusions about the opinions of the entire population of U.S. adults regarding the death penalty on the basis of the opinions of only 1,200 of them.<\/p>\n<p>Can we conclude that 62% of the population favors the death penalty? 
Another random sample could give a very different result, so we are uncertain. But because our sample is random, we know that our uncertainty is due to chance, not to problems with how the sample was collected. So we can use probability to describe the likelihood that our sample is within a desired level of accuracy. For example, probability can answer the question, <em>How likely is it that our sample estimate is no more than 3% away from the true percentage of all U.S. adults who are in favor of the death penalty?<\/em><\/p>\n<p>Answering this question (which we do using probability) has an important impact on the confidence we can attach to the inference step. In particular, if we find it quite unlikely that the sample percentage will be very different from the population percentage, then we have a lot of confidence that we can draw conclusions about the population on the basis of the sample.<\/p>\n<p>In this module, we discuss probability more generally. Then we begin to develop the probability machinery that underlies inference.<\/p>\n<\/div>\n<p>&nbsp;<\/p>\n\n\t\t\t <section class=\"citations-section\" role=\"contentinfo\">\n\t\t\t <h3>Candela Citations<\/h3>\n\t\t\t\t\t <div>\n\t\t\t\t\t\t <div id=\"citation-list-261\">\n\t\t\t\t\t\t\t <div class=\"licensing\"><div class=\"license-attribution-dropdown-subheading\">CC licensed content, Shared previously<\/div><ul class=\"citation-list\"><li>Concepts in Statistics. <strong>Provided by<\/strong>: Open Learning Initiative. <strong>Located at<\/strong>: <a target=\"_blank\" href=\"http:\/\/oli.cmu.edu\">http:\/\/oli.cmu.edu<\/a>. 
<strong>License<\/strong>: <em><a target=\"_blank\" rel=\"license\" href=\"https:\/\/creativecommons.org\/licenses\/by\/4.0\/\">CC BY: Attribution<\/a><\/em><\/li><\/ul><\/div>\n\t\t\t\t\t\t <\/div>\n\t\t\t\t\t <\/div>\n\t\t\t <\/section>","protected":false},"author":163,"menu_order":1,"template":"","meta":{"_candela_citation":"[{\"type\":\"cc\",\"description\":\"Concepts in Statistics\",\"author\":\"\",\"organization\":\"Open Learning Initiative\",\"url\":\"http:\/\/oli.cmu.edu\",\"project\":\"\",\"license\":\"cc-by\",\"license_terms\":\"\"}]","CANDELA_OUTCOMES_GUID":"69c8488a-0afd-41bf-91c2-c1f6e51dd816","pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[],"contributor":[],"license":[],"class_list":["post-261","chapter","type-chapter","status-web-only","hentry"],"part":258,"_links":{"self":[{"href":"https:\/\/courses.lumenlearning.com\/atd-herkimer-statisticssocsci\/wp-json\/pressbooks\/v2\/chapters\/261","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/courses.lumenlearning.com\/atd-herkimer-statisticssocsci\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/courses.lumenlearning.com\/atd-herkimer-statisticssocsci\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/atd-herkimer-statisticssocsci\/wp-json\/wp\/v2\/users\/163"}],"version-history":[{"count":4,"href":"https:\/\/courses.lumenlearning.com\/atd-herkimer-statisticssocsci\/wp-json\/pressbooks\/v2\/chapters\/261\/revisions"}],"predecessor-version":[{"id":1396,"href":"https:\/\/courses.lumenlearning.com\/atd-herkimer-statisticssocsci\/wp-json\/pressbooks\/v2\/chapters\/261\/revisions\/1396"}],"part":[{"href":"https:\/\/courses.lumenlearning.com\/atd-herkimer-statisticssocsci\/wp-json\/pressbooks\/v2\/parts\/258"}],"metadata":[{"href":"https:\/\/courses.lumenlearning.com\/atd-herkimer-statisticssocsci\/wp-json\/pressbooks\/v2\/chapters\/261\/metadata\/"}],"
wp:attachment":[{"href":"https:\/\/courses.lumenlearning.com\/atd-herkimer-statisticssocsci\/wp-json\/wp\/v2\/media?parent=261"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/atd-herkimer-statisticssocsci\/wp-json\/pressbooks\/v2\/chapter-type?post=261"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/atd-herkimer-statisticssocsci\/wp-json\/wp\/v2\/contributor?post=261"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/atd-herkimer-statisticssocsci\/wp-json\/wp\/v2\/license?post=261"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}