{"id":356,"date":"2017-04-15T03:22:32","date_gmt":"2017-04-15T03:22:32","guid":{"rendered":"https:\/\/courses.lumenlearning.com\/conceptstest1\/chapter\/distribution-of-sample-proportions-5-of-6\/"},"modified":"2017-05-31T00:31:43","modified_gmt":"2017-05-31T00:31:43","slug":"distribution-of-sample-proportions-5-of-6","status":"publish","type":"chapter","link":"https:\/\/courses.lumenlearning.com\/suny-wmopen-concepts-statistics\/chapter\/distribution-of-sample-proportions-5-of-6\/","title":{"raw":"Distribution of Sample Proportions (5 of 6)","rendered":"Distribution of Sample Proportions (5 of 6)"},"content":{"raw":"&nbsp;\r\n<div class=\"textbox learning-objectives\">\r\n<h3>Learning Objectives<\/h3>\r\n<ul>\r\n \t<li>Use a z-score and the standard normal model to estimate probabilities of specified events.<\/li>\r\n<\/ul>\r\n<\/div>\r\nFrom our work on the previous page, we now have a mathematical model of the sampling distribution of sample proportions. This model describes how much variability we can expect in random samples from a population with a given parameter. If a normal model is a good fit for a sampling distribution, we can apply the empirical rule and use <em>z<\/em>-scores to determine probabilities. Here we link probability to the kind of thinking we do in inference.\r\n<h3>Making Connections to Probability Models in <em>Probability and Probability Distribution<\/em><\/h3>\r\nProbability describes the chance that a random event occurs. Recall the concept of a random variable from the module <em>Probability and Probability Distribution<\/em>. When a variable is random, it varies unpredictably in the short run but has a predictable pattern in the long run. Sample proportions from random samples are a random variable. We cannot predict the proportion for any one random sample; they vary. But we can predict the pattern that occurs when we select a great many random samples from a population. The sampling distribution describes this pattern. 
When a normal model is a good fit for the sampling distribution, we can use what we learned in the previous module to find probabilities.\r\n\r\nRecall probability models we saw in <em>Probability and Probability Distribution<\/em>. We saw examples of models with skewed curves, but we focused on normal curves because we use normal probability models to describe sampling distributions in Modules 7 to 10 when we make inferences about a population. As we now know, we can use a normal model only when certain conditions are met. Whenever we want to use a normal model, we must check the conditions to make sure a normal model is a good fit.\r\n\r\nHere we summarize our general process for developing a probability model for inference. This is essentially the same process we used in the previous module for developing normal probability models from relative frequencies.\r\n\r\n<img class=\"alignnone\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1729\/2017\/04\/15032227\/m7_link_prob_statistical_inference_topic_7_1_m7_dist_sample_proportions_5_image1.png\" alt=\"Normal curve showing probability\" width=\"659\" height=\"623\" \/>\r\n\r\nIf a normal model is a good fit for the sampling distribution, we can standardize the values by calculating a <em>z<\/em>-score. Then we can use the standard normal model to find probabilities, as we did in <em>Probability and Probability Distribution<\/em>.\r\n\r\nThe <em>z<\/em>-score is the error in the statistic divided by the standard error. 
For sample proportions, we have the following formulas.\r\n<p style=\"text-align: center\">[latex]\\begin{array}{l}\\mathrm{standard}\\text{}\\mathrm{error}=\\sqrt{\\frac{p(1-p)}{n}}\\\\ Z=\\frac{\\mathrm{statistic}-\\mathrm{parameter}}{\\mathrm{standard}\\text{}\\mathrm{error}}=\\frac{\\stackrel{\u02c6}{p}-p}{\\mathrm{standard}\\text{}\\mathrm{error}}\\end{array}[\/latex]<\/p>\r\nWe can also write this as one formula:\r\n<p style=\"text-align: center\">[latex]Z=\\frac{\\stackrel{\u02c6}{p}-p}{\\sqrt{\\frac{p(1-p)}{n}}}[\/latex]<\/p>\r\n\r\n<h3>Comment<\/h3>\r\nThis <em>z<\/em>-score formula is similar to the <em>z<\/em>-score formula we used in <em>Probability and Probability Distribution<\/em>. We described the <em>z<\/em>-score as the number of standard deviations a data value is from the mean. Here we can describe the <em>z<\/em>-score as the number of standard errors a sample proportion is from the mean. Because the mean is the parameter value, we can say that the <em>z<\/em>-score is the number of standard errors a sample proportion is from the parameter.\r\n\r\nA positive <em>z<\/em>-score indicates that the sample proportion is larger than the parameter. A negative <em>z<\/em>-score indicates that the sample proportion is smaller than the parameter.\r\n<div class=\"textbox examples\">\r\n<h3>Example<\/h3>\r\n<h2>Probability Calculations for Community College Enrollment<\/h2>\r\nLet\u2019s return to the example of community college enrollment. Recall that a 2007 report by the Pew Research Center stated that about 10% of the 3.1 million 18- to 24-year-olds in the United States were enrolled in a community college. Let\u2019s again suppose we randomly selected 100 young adults in this age group and found that 15% of the sample was enrolled in a community college.\r\n\r\nPreviously, we determined that 15% is a surprising result. Now we want to be more precise. 
We ask this question: <em>What is the probability that a random sample of this size has 15% or more enrolled in a community college?<\/em>\r\n\r\nTo answer this question, we first determine whether a normal model is a good fit for the sampling distribution.\r\n\r\n<strong>Check normality conditions:<\/strong>\r\n\r\nYes, the conditions are met. The expected numbers of successes and failures in a sample of 100 are both at least 10. We expect 10% of the 100 to be enrolled in a community college, [latex]np=100(0.10)=10[\/latex]. We expect 90% of the 100 to not be enrolled, [latex]n(1-p)=100(0.90)=90[\/latex].\r\n\r\nWe therefore can use a normal model, which allows us to use a <em>z<\/em>-score to find the probability.\r\n\r\n<strong>Find the <em>z<\/em>-score:<\/strong>\r\n<p style=\"text-align: center\">[latex]\\mathrm{standard}\\text{}\\mathrm{error}\\text{}=\\text{}\\sqrt{\\frac{p(1-p)}{n}}=\\sqrt{\\frac{0.10(0.90)}{100}}=0.03[\/latex]<\/p>\r\n<p style=\"text-align: center\">[latex]Z=\\frac{\\mathrm{statistic}-\\mathrm{parameter}}{\\mathrm{standard}\\text{}\\mathrm{error}}=\\frac{0.15-0.10}{0.03}\\approx 1.67[\/latex]<\/p>\r\n<strong>Find the probability using the standard normal model:<\/strong>\r\n\r\n<img class=\"alignnone\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1729\/2017\/04\/15032230\/m7_link_prob_statistical_inference_topic_7_1_m7_dist_sample_proportions_5_image4.png\" alt=\"Standard normal model. First model shows sample proportions, second model shows the area to the left and right of the Z value\" width=\"444\" height=\"181\" \/>\r\n\r\nWe want the probability that the sample proportion is 15% or more. So we want the probability that the <em>z<\/em>-score is greater than or equal to 1.67. 
The probability is about 0.0475.\r\n\r\n<strong>Conclusion: <\/strong>If it is true that 10% of the population of 18- to 24-year-olds are enrolled at a community college, then it is unusual to see a random sample of 100 with 15% or more enrolled. The probability is about 0.0475.\r\n\r\nNote: This probability is a conditional probability. Recall from <em>Relationships in Categorical Data with Intro to Probability<\/em> that we write a conditional probability <em>P<\/em>(<em>A<\/em> given <em>B<\/em>) as <em>P<\/em>(<em>A<\/em> | <em>B<\/em>). Here we write <em>P<\/em>(a sample proportion is 0.15 or more given that the population proportion is 0.10) as\r\n<p style=\"text-align: center\">[latex]P(\\text{}\\stackrel{\u02c6}{p}\\text{}\u2265\\text{}0.15\\text{}|\\text{}p=0.10)\\text{}\\approx \\text{}0.0475[\/latex]<\/p>\r\n\r\n<\/div>\r\n<a href=\"https:\/\/s3-us-west-2.amazonaws.com\/oerfiles\/Concepts+in+Statistics\/interactives\/dist_sample_prop_6of6\/dist_sample_proportions_normal.html\" target=\"new\">Click here to open this simulation in its own window.<\/a>\r\n\r\n<iframe id=\"\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/oerfiles\/Concepts+in+Statistics\/interactives\/dist_sample_prop_6of6\/dist_sample_proportions_normal.html\" width=\"750\" height=\"750\"><\/iframe>\r\n<div class=\"textbox exercises\">\r\n<h3>Learn By Doing<\/h3>\r\nhttps:\/\/assessments.lumenlearning.com\/assessments\/3892\r\n\r\n<\/div>\r\n&nbsp;","rendered":"<p>&nbsp;<\/p>\n<div class=\"textbox learning-objectives\">\n<h3>Learning Objectives<\/h3>\n<ul>\n<li>Use a z-score and the standard normal model to estimate probabilities of specified events.<\/li>\n<\/ul>\n<\/div>\n<p>From our work on the previous page, we now have a mathematical model of the sampling distribution of sample proportions. This model describes how much variability we can expect in random samples from a population with a given parameter. 
If a normal model is a good fit for a sampling distribution, we can apply the empirical rule and use <em>z<\/em>-scores to determine probabilities. Here we link probability to the kind of thinking we do in inference.<\/p>\n<h3>Making Connections to Probability Models in <em>Probability and Probability Distribution<\/em><\/h3>\n<p>Probability describes the chance that a random event occurs. Recall the concept of a random variable from the module <em>Probability and Probability Distribution<\/em>. When a variable is random, it varies unpredictably in the short run but has a predictable pattern in the long run. Sample proportions from random samples are a random variable. We cannot predict the proportion for any one random sample; they vary. But we can predict the pattern that occurs when we select a great many random samples from a population. The sampling distribution describes this pattern. When a normal model is a good fit for the sampling distribution, we can use what we learned in the previous module to find probabilities.<\/p>\n<p>Recall probability models we saw in <em>Probability and Probability Distribution<\/em>. We saw examples of models with skewed curves, but we focused on normal curves because we use normal probability models to describe sampling distributions in Modules 7 to 10 when we make inferences about a population. As we now know, we can use a normal model only when certain conditions are met. Whenever we want to use a normal model, we must check the conditions to make sure a normal model is a good fit.<\/p>\n<p>Here we summarize our general process for developing a probability model for inference. 
This is essentially the same process we used in the previous module for developing normal probability models from relative frequencies.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1729\/2017\/04\/15032227\/m7_link_prob_statistical_inference_topic_7_1_m7_dist_sample_proportions_5_image1.png\" alt=\"Normal curve showing probability\" width=\"659\" height=\"623\" \/><\/p>\n<p>If a normal model is a good fit for the sampling distribution, we can standardize the values by calculating a <em>z<\/em>-score. Then we can use the standard normal model to find probabilities, as we did in <em>Probability and Probability Distribution<\/em>.<\/p>\n<p>The <em>z<\/em>-score is the error in the statistic divided by the standard error. For sample proportions, we have the following formulas.<\/p>\n<p style=\"text-align: center\">[latex]\\begin{array}{l}\\mathrm{standard}\\text{}\\mathrm{error}=\\sqrt{\\frac{p(1-p)}{n}}\\\\ Z=\\frac{\\mathrm{statistic}-\\mathrm{parameter}}{\\mathrm{standard}\\text{}\\mathrm{error}}=\\frac{\\stackrel{\u02c6}{p}-p}{\\mathrm{standard}\\text{}\\mathrm{error}}\\end{array}[\/latex]<\/p>\n<p>We can also write this as one formula:<\/p>\n<p style=\"text-align: center\">[latex]Z=\\frac{\\stackrel{\u02c6}{p}-p}{\\sqrt{\\frac{p(1-p)}{n}}}[\/latex]<\/p>\n<h3>Comment<\/h3>\n<p>This <em>z<\/em>-score formula is similar to the <em>z<\/em>-score formula we used in <em>Probability and Probability Distribution<\/em>. We described the <em>z<\/em>-score as the number of standard deviations a data value is from the mean. Here we can describe the <em>z<\/em>-score as the number of standard errors a sample proportion is from the mean. 
Because the mean is the parameter value, we can say that the <em>z<\/em>-score is the number of standard errors a sample proportion is from the parameter.<\/p>\n<p>A positive <em>z<\/em>-score indicates that the sample proportion is larger than the parameter. A negative <em>z<\/em>-score indicates that the sample proportion is smaller than the parameter.<\/p>\n<div class=\"textbox examples\">\n<h3>Example<\/h3>\n<h2>Probability Calculations for Community College Enrollment<\/h2>\n<p>Let\u2019s return to the example of community college enrollment. Recall that a 2007 report by the Pew Research Center stated that about 10% of the 3.1 million 18- to 24-year-olds in the United States were enrolled in a community college. Let\u2019s again suppose we randomly selected 100 young adults in this age group and found that 15% of the sample was enrolled in a community college.<\/p>\n<p>Previously, we determined that 15% is a surprising result. Now we want to be more precise. We ask this question: <em>What is the probability that a random sample of this size has 15% or more enrolled in a community college?<\/em><\/p>\n<p>To answer this question, we first determine whether a normal model is a good fit for the sampling distribution.<\/p>\n<p><strong>Check normality conditions:<\/strong><\/p>\n<p>Yes, the conditions are met. The expected numbers of successes and failures in a sample of 100 are both at least 10. We expect 10% of the 100 to be enrolled in a community college, [latex]np=100(0.10)=10[\/latex]. 
We expect 90% of the 100 to not be enrolled, [latex]n(1-p)=100(0.90)=90[\/latex].<\/p>\n<p>We therefore can use a normal model, which allows us to use a <em>z<\/em>-score to find the probability.<\/p>\n<p><strong>Find the <em>z<\/em>-score:<\/strong><\/p>\n<p style=\"text-align: center\">[latex]\\mathrm{standard}\\text{}\\mathrm{error}\\text{}=\\text{}\\sqrt{\\frac{p(1-p)}{n}}=\\sqrt{\\frac{0.10(0.90)}{100}}=0.03[\/latex]<\/p>\n<p style=\"text-align: center\">[latex]Z=\\frac{\\mathrm{statistic}-\\mathrm{parameter}}{\\mathrm{standard}\\text{}\\mathrm{error}}=\\frac{0.15-0.10}{0.03}\\approx 1.67[\/latex]<\/p>\n<p><strong>Find the probability using the standard normal model:<\/strong><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1729\/2017\/04\/15032230\/m7_link_prob_statistical_inference_topic_7_1_m7_dist_sample_proportions_5_image4.png\" alt=\"Standard normal model. First model shows sample proportions, second model shows the area to the left and right of the Z value\" width=\"444\" height=\"181\" \/><\/p>\n<p>We want the probability that the sample proportion is 15% or more. So we want the probability that the <em>z<\/em>-score is greater than or equal to 1.67. The probability is about 0.0475.<\/p>\n<p><strong>Conclusion: <\/strong>If it is true that 10% of the population of 18- to 24-year-olds are enrolled at a community college, then it is unusual to see a random sample of 100 with 15% or more enrolled. The probability is about 0.0475.<\/p>\n<p>Note: This probability is a conditional probability. Recall from <em>Relationships in Categorical Data with Intro to Probability<\/em> that we write a conditional probability <em>P<\/em>(<em>A<\/em> given <em>B<\/em>) as <em>P<\/em>(<em>A<\/em> | <em>B<\/em>). 
Here we write <em>P<\/em>(a sample proportion is 0.15 or more given that the population proportion is 0.10) as<\/p>\n<p style=\"text-align: center\">[latex]P(\\text{}\\stackrel{\u02c6}{p}\\text{}\u2265\\text{}0.15\\text{}|\\text{}p=0.10)\\text{}\\approx \\text{}0.0475[\/latex]<\/p>\n<\/div>\n<p><a href=\"https:\/\/s3-us-west-2.amazonaws.com\/oerfiles\/Concepts+in+Statistics\/interactives\/dist_sample_prop_6of6\/dist_sample_proportions_normal.html\" target=\"new\">Click here to open this simulation in its own window.<\/a><\/p>\n<p><iframe loading=\"lazy\" id=\"\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/oerfiles\/Concepts+in+Statistics\/interactives\/dist_sample_prop_6of6\/dist_sample_proportions_normal.html\" width=\"750\" height=\"750\"><\/iframe><\/p>\n<div class=\"textbox exercises\">\n<h3>Learn By Doing<\/h3>\n<p>\t<iframe id=\"lumen_assessment_3892\" class=\"resizable\" src=\"https:\/\/assessments.lumenlearning.com\/assessments\/load?assessment_id=3892&#38;embed=1&#38;external_user_id=&#38;external_context_id=&#38;iframe_resize_id=lumen_assessment_3892\" frameborder=\"0\" style=\"border:none;width:100%;height:100%;min-height:400px;\"><br \/>\n\t<\/iframe><\/p>\n<\/div>\n<p>&nbsp;<\/p>\n\n\t\t\t <section class=\"citations-section\" role=\"contentinfo\">\n\t\t\t <h3>Candela Citations<\/h3>\n\t\t\t\t\t <div>\n\t\t\t\t\t\t <div id=\"citation-list-356\">\n\t\t\t\t\t\t\t <div class=\"licensing\"><div class=\"license-attribution-dropdown-subheading\">CC licensed content, Shared previously<\/div><ul class=\"citation-list\"><li>Concepts in Statistics. <strong>Provided by<\/strong>: Open Learning Initiative. <strong>Located at<\/strong>: <a target=\"_blank\" href=\"http:\/\/oli.cmu.edu\">http:\/\/oli.cmu.edu<\/a>. 
<strong>License<\/strong>: <em><a target=\"_blank\" rel=\"license\" href=\"https:\/\/creativecommons.org\/licenses\/by\/4.0\/\">CC BY: Attribution<\/a><\/em><\/li><\/ul><\/div>\n\t\t\t\t\t\t <\/div>\n\t\t\t\t\t <\/div>\n\t\t\t <\/section>","protected":false},"author":163,"menu_order":7,"template":"","meta":{"_candela_citation":"[{\"type\":\"cc\",\"description\":\"Concepts in Statistics\",\"author\":\"\",\"organization\":\"Open Learning Initiative\",\"url\":\"http:\/\/oli.cmu.edu\",\"project\":\"\",\"license\":\"cc-by\",\"license_terms\":\"\"}]","CANDELA_OUTCOMES_GUID":"a2869fc9-e2a7-4604-a24a-4847aea0a201","pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[],"contributor":[],"license":[],"class_list":["post-356","chapter","type-chapter","status-publish","hentry"],"part":333,"_links":{"self":[{"href":"https:\/\/courses.lumenlearning.com\/suny-wmopen-concepts-statistics\/wp-json\/pressbooks\/v2\/chapters\/356","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/courses.lumenlearning.com\/suny-wmopen-concepts-statistics\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/courses.lumenlearning.com\/suny-wmopen-concepts-statistics\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/suny-wmopen-concepts-statistics\/wp-json\/wp\/v2\/users\/163"}],"version-history":[{"count":5,"href":"https:\/\/courses.lumenlearning.com\/suny-wmopen-concepts-statistics\/wp-json\/pressbooks\/v2\/chapters\/356\/revisions"}],"predecessor-version":[{"id":1430,"href":"https:\/\/courses.lumenlearning.com\/suny-wmopen-concepts-statistics\/wp-json\/pressbooks\/v2\/chapters\/356\/revisions\/1430"}],"part":[{"href":"https:\/\/courses.lumenlearning.com\/suny-wmopen-concepts-statistics\/wp-json\/pressbooks\/v2\/parts\/333"}],"metadata":[{"href":"https:\/\/courses.lumenlearning.com\/suny-wmopen-concepts-statistics\/wp-json\/pressbooks\/v2\/chapters\/356\/
metadata\/"}],"wp:attachment":[{"href":"https:\/\/courses.lumenlearning.com\/suny-wmopen-concepts-statistics\/wp-json\/wp\/v2\/media?parent=356"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/suny-wmopen-concepts-statistics\/wp-json\/pressbooks\/v2\/chapter-type?post=356"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/suny-wmopen-concepts-statistics\/wp-json\/wp\/v2\/contributor?post=356"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/suny-wmopen-concepts-statistics\/wp-json\/wp\/v2\/license?post=356"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}