{"id":36,"date":"2017-04-15T03:15:06","date_gmt":"2017-04-15T03:15:06","guid":{"rendered":"https:\/\/courses.lumenlearning.com\/conceptstest1\/chapter\/sampling-2-of-2\/"},"modified":"2017-06-01T02:03:47","modified_gmt":"2017-06-01T02:03:47","slug":"sampling-2-of-2","status":"publish","type":"chapter","link":"https:\/\/courses.lumenlearning.com\/suny-wmopen-concepts-statistics\/chapter\/sampling-2-of-2\/","title":{"raw":"Sampling (2 of 2)","rendered":"Sampling (2 of 2)"},"content":{"raw":"&nbsp;\r\n<div class=\"textbox learning-objectives\">\r\n<h3>Learning Objectives<\/h3>\r\n<ul>\r\n \t<li>For an observational study, critique the sampling plan. Recognize implications and limitations of the plan.<\/li>\r\n<\/ul>\r\n<\/div>\r\nLet\u2019s briefly summarize the main points about sampling:\r\n<ul>\r\n \t<li>We draw a conclusion about the population on the basis of the sample.<\/li>\r\n \t<li>To draw a valid conclusion, the sample must be representative of the population. A representative sample is a subset of the population that reflects the characteristics of the population.<\/li>\r\n \t<li>A sample is biased if it systematically favors a certain outcome.<\/li>\r\n \t<li>Random selection eliminates bias.<\/li>\r\n<\/ul>\r\nWe have not mentioned the size of the sample. Are larger samples more accurate? Well, the answer is yes and no.\r\n\r\nRecall the 1936 presidential election. A sample of over 2 million people did not correctly identify the winner of the election. Two million people is a huge sample, yet the results were completely wrong. So a large sample does not guarantee reliable results.\r\n\r\nHowever, if the samples are randomly selected, then size does matter. We see this in the next example.\r\n<div class=\"textbox examples\">\r\n<h3>Example<\/h3>\r\n<h2>For Random Samples, Size Matters<\/h2>\r\nLet\u2019s compare the accuracy of random samples of different sizes.\r\n\r\nSuppose there are 10,000 students at your college. 
Also suppose that 65% of these students are eligible for financial aid. How accurate are random samples at predicting this population value?\r\n\r\nTo answer this question, we randomly select 50 students and determine the proportion who are eligible for financial aid. We repeat this several times. Here are the results for three random samples:\r\n\r\n<img class=\"alignnone\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1729\/2017\/04\/15031459\/m1_statistical_analysis_topic_1_2_m1_sampling_2_image1.png\" alt=\"There are four column graphs in this diagram. The first graph shows that out of a population of 10,000 people, 65 percent of them are eligible for financial aid. The three graphs after each represent random samples of 50 people. The first shows that 56% of the population is eligible, the second shows that 72 percent are eligible, and the third shows that 64 percent are eligible\" width=\"659\" height=\"248\" \/>\r\n\r\nNotice that each random sample has a different result. Some results are larger than the true population value of 65%; some results are smaller than the true population value. Because there is no bias in random samples, we expect results above and below the true value to occur with similar frequency.\r\n\r\nNow we use a simulation to take many more random samples. Again, each sample is composed of 50 randomly selected people. Here is a dotplot of the proportion who are eligible for financial aid in 100 samples. Each dot is a random sample.\r\n\r\n<img class=\"alignnone\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1729\/2017\/04\/15031501\/m1_statistical_analysis_topic_1_2_m1_sampling_2_image2.png\" alt=\"Dotplot shows 100 random 50-student samples determining financial aid eligibility\" width=\"657\" height=\"152\" \/>\r\n\r\nWe see that the results from random samples vary from 0.48 to 0.80. 
Typical values range from about 0.58 to 0.74.\r\n\r\nNote: Many samples have results below the true population value of 0.65, and many have results above 0.65. This shows that random samples are not biased. For the question <em>Are you eligible for financial aid?<\/em>, there is no systematic favoring of one response over another. The samples are representative of the population.\r\n\r\n<strong>What happens when we increase the number of people in the random sample?<\/strong>\r\n\r\nWe increased the number of people in each sample to 250. Here is a dotplot of the results from 100 of these larger random samples.\r\n\r\n<img class=\"alignnone\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1729\/2017\/04\/15031502\/m1_statistical_analysis_topic_1_2_m1_sampling_2_image3.png\" alt=\"Dotplot showing 100 random 250-student samples determining financial aid eligibility. Higher eligible proportions are in the middle of the dotplot\" width=\"649\" height=\"142\" \/>\r\n\r\nNotice there is less variability in these larger samples. Results range from about 0.58 to 0.73. Typical values range from about 0.62 to 0.68. These samples give results that are closer to the true population value of 0.65.\r\n\r\nSo what\u2019s the point? <em>Larger samples tend to be more accurate than smaller samples if the samples are chosen randomly.<\/em>\r\n\r\n<\/div>\r\n<h3>Comment<\/h3>\r\nThe precision of the sample results depends on the size of the sample, not the size of the population. The following dotplots illustrate this point. Here we selected samples with 250 people in each sample, but we <em>varied the size of the population<\/em>. Each dotplot contains 100 samples.\r\n\r\nNotice that the sample results look very similar. For each population, the sample results fall between about 0.58 and 0.73. 
In each graph, it is common for sample results to fall between about 0.62 and 0.68.\r\n\r\n<img class=\"alignnone\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1729\/2017\/04\/15031505\/m1_statistical_analysis_topic_1_2_m1_sampling_2_image4.png\" alt=\"Three dotplots showing that accuracy relies on sample size more than on population size\" width=\"807\" height=\"172\" \/>\r\n\r\nWhat\u2019s the main point? <em>The size of the population does not affect the accuracy of a random sample as long as the population is large.<\/em>\r\n<div class=\"textbox exercises\">\r\n<h3>Learn By Doing<\/h3>\r\nhttps:\/\/assessments.lumenlearning.com\/assessments\/3408\r\n\r\nhttps:\/\/assessments.lumenlearning.com\/assessments\/3828\r\n\r\n<\/div>\r\n<h3>Comment<\/h3>\r\nIf an attempt is made to include every individual from a population in a sample, then the investigation is called a <strong>census<\/strong>. Every 10 years, the U.S. Census Bureau conducts a population census. It attempts to collect information about every person living in the United States. However, the population census misses between 1% and 3% of the U.S. population and accidentally counts some people more than once. A full census is possible only for small populations.\r\n<h3><strong>Let\u2019s Summarize<\/strong><\/h3>\r\n<ul>\r\n \t<li>We draw a conclusion about the population on the basis of the sample.<\/li>\r\n \t<li>To draw a valid conclusion, the sample must be representative of the population. A representative sample is a subset of the population. 
It also reflects the characteristics of the population.<\/li>\r\n \t<li>A sample is biased if it systematically favors a certain outcome.<\/li>\r\n \t<li>Random selection eliminates bias.<\/li>\r\n \t<li>Larger samples tend to be more accurate than smaller samples if the samples are chosen randomly.<\/li>\r\n \t<li>The size of the population does not affect the accuracy of a random sample as long as the population is large.<\/li>\r\n \t<li>If an attempt is made to include every individual from a population in a sample, then the investigation is called a <em>census<\/em>.<\/li>\r\n<\/ul>\r\n<h3><\/h3>","rendered":"<p>&nbsp;<\/p>\n<div class=\"textbox learning-objectives\">\n<h3>Learning Objectives<\/h3>\n<ul>\n<li>For an observational study, critique the sampling plan. Recognize implications and limitations of the plan.<\/li>\n<\/ul>\n<\/div>\n<p>Let\u2019s briefly summarize the main points about sampling:<\/p>\n<ul>\n<li>We draw a conclusion about the population on the basis of the sample.<\/li>\n<li>To draw a valid conclusion, the sample must be representative of the population. A representative sample is a subset of the population that reflects the characteristics of the population.<\/li>\n<li>A sample is biased if it systematically favors a certain outcome.<\/li>\n<li>Random selection eliminates bias.<\/li>\n<\/ul>\n<p>We have not mentioned the size of the sample. Are larger samples more accurate? Well, the answer is yes and no.<\/p>\n<p>Recall the 1936 presidential election. A sample of over 2 million people did not correctly identify the winner of the election. Two million people is a huge sample, yet the results were completely wrong. So a large sample does not guarantee reliable results.<\/p>\n<p>However, if the samples are randomly selected, then size does matter. 
We see this in the next example.<\/p>\n<div class=\"textbox examples\">\n<h3>Example<\/h3>\n<h2>For Random Samples, Size Matters<\/h2>\n<p>Let\u2019s compare the accuracy of random samples of different sizes.<\/p>\n<p>Suppose there are 10,000 students at your college. Also suppose that 65% of these students are eligible for financial aid. How accurate are random samples at predicting this population value?<\/p>\n<p>To answer this question, we randomly select 50 students and determine the proportion who are eligible for financial aid. We repeat this several times. Here are the results for three random samples:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1729\/2017\/04\/15031459\/m1_statistical_analysis_topic_1_2_m1_sampling_2_image1.png\" alt=\"There are four column graphs in this diagram. The first graph shows that out of a population of 10,000 people, 65 percent of them are eligible for financial aid. The three graphs after each represent random samples of 50 people. The first shows that 56% of the population is eligible, the second shows that 72 percent are eligible, and the third shows that 64 percent are eligible\" width=\"659\" height=\"248\" \/><\/p>\n<p>Notice that each random sample has a different result. Some results are larger than the true population value of 65%; some results are smaller than the true population value. Because there is no bias in random samples, we expect results above and below the true value to occur with similar frequency.<\/p>\n<p>Now we use a simulation to take many more random samples. Again, each sample is composed of 50 randomly selected people. Here is a dotplot of the proportion who are eligible for financial aid in 100 samples. 
Each dot is a random sample.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1729\/2017\/04\/15031501\/m1_statistical_analysis_topic_1_2_m1_sampling_2_image2.png\" alt=\"Dotplot shows 100 random 50-student samples determining financial aid eligibility\" width=\"657\" height=\"152\" \/><\/p>\n<p>We see that the results from random samples vary from 0.48 to 0.80. Typical values range from about 0.58 to 0.74.<\/p>\n<p>Note: Many samples have results below the true population value of 0.65, and many have results above 0.65. This shows that random samples are not biased. For the question <em>Are you eligible for financial aid?<\/em>, there is no systematic favoring of one response over another. The samples are representative of the population.<\/p>\n<p><strong>What happens when we increase the number of people in the random sample?<\/strong><\/p>\n<p>We increased the number of people in each sample to 250. Here is a dotplot of the results from 100 of these larger random samples.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1729\/2017\/04\/15031502\/m1_statistical_analysis_topic_1_2_m1_sampling_2_image3.png\" alt=\"Dotplot showing 100 random 250-student samples determining financial aid eligibility. Higher eligible proportions are in the middle of the dotplot\" width=\"649\" height=\"142\" \/><\/p>\n<p>Notice there is less variability in these larger samples. Results range from about 0.58 to 0.73. Typical values range from about 0.62 to 0.68. These samples give results that are closer to the true population value of 0.65.<\/p>\n<p>So what\u2019s the point? 
<em>Larger samples tend to be more accurate than smaller samples if the samples are chosen randomly.<\/em><\/p>\n<\/div>\n<h3>Comment<\/h3>\n<p>The precision of the sample results depends on the size of the sample, not the size of the population. The following dotplots illustrate this point. Here we selected samples with 250 people in each sample, but we <em>varied the size of the population<\/em>. Each dotplot contains 100 samples.<\/p>\n<p>Notice that the sample results look very similar. For each population, the sample results fall between about 0.58 and 0.73. In each graph, it is common for sample results to fall between about 0.62 and 0.68.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1729\/2017\/04\/15031505\/m1_statistical_analysis_topic_1_2_m1_sampling_2_image4.png\" alt=\"Three dotplots showing that accuracy relies on sample size more than on population size\" width=\"807\" height=\"172\" \/><\/p>\n<p>What\u2019s the main point? 
<em>The size of the population does not affect the accuracy of a random sample as long as the population is large.<\/em><\/p>\n<div class=\"textbox exercises\">\n<h3>Learn By Doing<\/h3>\n<p>\t<iframe id=\"lumen_assessment_3408\" class=\"resizable\" src=\"https:\/\/assessments.lumenlearning.com\/assessments\/load?assessment_id=3408&#38;embed=1&#38;external_user_id=&#38;external_context_id=&#38;iframe_resize_id=lumen_assessment_3408\" frameborder=\"0\" style=\"border:none;width:100%;height:100%;min-height:400px;\"><br \/>\n\t<\/iframe><\/p>\n<p>\t<iframe id=\"lumen_assessment_3828\" class=\"resizable\" src=\"https:\/\/assessments.lumenlearning.com\/assessments\/load?assessment_id=3828&#38;embed=1&#38;external_user_id=&#38;external_context_id=&#38;iframe_resize_id=lumen_assessment_3828\" frameborder=\"0\" style=\"border:none;width:100%;height:100%;min-height:400px;\"><br \/>\n\t<\/iframe><\/p>\n<\/div>\n<h3>Comment<\/h3>\n<p>If an attempt is made to include every individual from a population in a sample, then the investigation is called a <strong>census<\/strong>. Every 10 years, the U.S. Census Bureau conducts a population census. It attempts to collect information about every person living in the United States. However, the population census misses between 1% and 3% of the U.S. population and accidentally counts some people more than once. A full census is possible only for small populations.<\/p>\n<h3><strong>Let\u2019s Summarize<\/strong><\/h3>\n<ul>\n<li>We draw a conclusion about the population on the basis of the sample.<\/li>\n<li>To draw a valid conclusion, the sample must be representative of the population. A representative sample is a subset of the population. 
It also reflects the characteristics of the population.<\/li>\n<li>A sample is biased if it systematically favors a certain outcome.<\/li>\n<li>Random selection eliminates bias.<\/li>\n<li>Larger samples tend to be more accurate than smaller samples if the samples are chosen randomly.<\/li>\n<li>The size of the population does not affect the accuracy of a random sample as long as the population is large.<\/li>\n<li>If an attempt is made to include every individual from a population in a sample, then the investigation is called a <em>census<\/em>.<\/li>\n<\/ul>\n<h3><\/h3>\n\n\t\t\t <section class=\"citations-section\" role=\"contentinfo\">\n\t\t\t <h3>Candela Citations<\/h3>\n\t\t\t\t\t <div>\n\t\t\t\t\t\t <div id=\"citation-list-36\">\n\t\t\t\t\t\t\t <div class=\"licensing\"><div class=\"license-attribution-dropdown-subheading\">CC licensed content, Shared previously<\/div><ul class=\"citation-list\"><li>Concepts in Statistics. <strong>Provided by<\/strong>: Open Learning Initiative. <strong>Located at<\/strong>: <a target=\"_blank\" href=\"http:\/\/oli.cmu.edu\">http:\/\/oli.cmu.edu<\/a>. 
<strong>License<\/strong>: <em><a target=\"_blank\" rel=\"license\" href=\"https:\/\/creativecommons.org\/licenses\/by\/4.0\/\">CC BY: Attribution<\/a><\/em><\/li><\/ul><\/div>\n\t\t\t\t\t\t <\/div>\n\t\t\t\t\t <\/div>\n\t\t\t <\/section>","protected":false},"author":163,"menu_order":7,"template":"","meta":{"_candela_citation":"[{\"type\":\"cc\",\"description\":\"Concepts in Statistics\",\"author\":\"\",\"organization\":\"Open Learning Initiative\",\"url\":\"http:\/\/oli.cmu.edu\",\"project\":\"\",\"license\":\"cc-by\",\"license_terms\":\"\"}]","CANDELA_OUTCOMES_GUID":"820c4e40-4a47-44e1-8874-56f0915f9d19","pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[],"contributor":[],"license":[],"class_list":["post-36","chapter","type-chapter","status-publish","hentry"],"part":18,"_links":{"self":[{"href":"https:\/\/courses.lumenlearning.com\/suny-wmopen-concepts-statistics\/wp-json\/pressbooks\/v2\/chapters\/36","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/courses.lumenlearning.com\/suny-wmopen-concepts-statistics\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/courses.lumenlearning.com\/suny-wmopen-concepts-statistics\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/suny-wmopen-concepts-statistics\/wp-json\/wp\/v2\/users\/163"}],"version-history":[{"count":8,"href":"https:\/\/courses.lumenlearning.com\/suny-wmopen-concepts-statistics\/wp-json\/pressbooks\/v2\/chapters\/36\/revisions"}],"predecessor-version":[{"id":1529,"href":"https:\/\/courses.lumenlearning.com\/suny-wmopen-concepts-statistics\/wp-json\/pressbooks\/v2\/chapters\/36\/revisions\/1529"}],"part":[{"href":"https:\/\/courses.lumenlearning.com\/suny-wmopen-concepts-statistics\/wp-json\/pressbooks\/v2\/parts\/18"}],"metadata":[{"href":"https:\/\/courses.lumenlearning.com\/suny-wmopen-concepts-statistics\/wp-json\/pressbooks\/v2\/chapters\/36\/metadat
a\/"}],"wp:attachment":[{"href":"https:\/\/courses.lumenlearning.com\/suny-wmopen-concepts-statistics\/wp-json\/wp\/v2\/media?parent=36"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/suny-wmopen-concepts-statistics\/wp-json\/pressbooks\/v2\/chapter-type?post=36"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/suny-wmopen-concepts-statistics\/wp-json\/wp\/v2\/contributor?post=36"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/suny-wmopen-concepts-statistics\/wp-json\/wp\/v2\/license?post=36"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}