{"id":624,"date":"2017-04-15T05:10:18","date_gmt":"2017-04-15T05:10:18","guid":{"rendered":"https:\/\/courses.lumenlearning.com\/conceptstest1\/?post_type=chapter&#038;p=624"},"modified":"2017-05-27T23:32:44","modified_gmt":"2017-05-27T23:32:44","slug":"why-it-matters-why-it-matters-types-of-statistical-studies-and-producing-data","status":"web-only","type":"chapter","link":"https:\/\/courses.lumenlearning.com\/atd-herkimer-statisticssocsci\/chapter\/why-it-matters-why-it-matters-types-of-statistical-studies-and-producing-data\/","title":{"raw":"Why It Matters: Types of Statistical Studies and Producing Data","rendered":"Why It Matters: Types of Statistical Studies and Producing Data"},"content":{"raw":"We organized this course around the Big Picture of Statistics. As we learn new material, we will always look at how these new ideas relate to the Big Picture. In this way the Big Picture is a diagram that will help us organize and understand the material we will learn throughout the course.\r\n\r\nThe Big Picture summarizes the steps in a statistical investigation.\r\n\r\nWe begin a statistical investigation with a research question. The research question is frequently something we want to know about a <strong>population<\/strong>. The population can be people or other things, such as animals or objects. For example, we might want to know the answer to questions such as:\r\n<ul>\r\n \t<li>What percentage of U.S. adults supports the death penalty? (Population: U.S. adults)<\/li>\r\n \t<li>Do cell phones affect bees? (Population: bees)<\/li>\r\n \t<li>Do cars get better gas mileage with a new gasoline additive? (Population: cars)<\/li>\r\n<\/ul>\r\nThe population is the entire group that we want to know something about:\r\n\r\n<img class=\"alignnone\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/oerfiles\/Concepts+in+Statistics\/images\/big_picture_population.png\" alt=\"The Big Picture of statistics. 
Shown on the diagram are Step 1: Producing Data, Step 2: Exploratory Data Analysis, Step 3: Probability, and Step 4: Inference. This diagram represents population as randomly placed black dots in a circle.\" width=\"338\" height=\"230\" \/>\r\n\r\nIn most cases, the population is a large group. Often, the population is so large that we cannot collect information from every individual in the population. So we select a <strong>sample<\/strong> from the population. Then we collect data from this sample. This is the first step in the statistical investigation. We call this step <strong>producing data<\/strong>.\r\n\r\n<img class=\"alignnone\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/oerfiles\/Concepts+in+Statistics\/images\/big_picture_data.png\" alt=\"Shown on the diagram are Step 1: Producing Data, Step 2: Exploratory Data Analysis, Step 3: Probability, and Step 4: Inference. Highlighted in this diagram is Step 1: Producing Data \" width=\"650\" height=\"280\" \/>\r\n\r\nOf course, we need a sample that represents the population well. This involves careful planning but also involves chance. For example, if our goal is to determine the percentage of U.S. adults who favor the death penalty, we do not want our sample to contain only Democrats or only Republicans. So we can give everyone the same opportunity to be in the sample, but we will let chance select the sample.\r\n\r\nAt this step of the investigation we also carefully define what kind of information we plan to gather. Then we collect the data.\r\n\r\nData is often a long list of information. To make sense of the data, we explore it and summarize it using graphs and different numerical measures, such as percentages or averages. 
We call this step <strong>exploratory data analysis<\/strong>.\r\n\r\n<img class=\"alignnone\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/oerfiles\/Concepts+in+Statistics\/images\/big_picture_eda.png\" alt=\"Shown on the diagram are Step 1: Producing Data, Step 2: Exploratory Data Analysis, Step 3: Probability, and Step 4: Inference. Highlighted in this diagram is Step 2: Exploratory Data Analysis.\" width=\"732\" height=\"280\" \/>\r\n\r\nRemember, our goal is to answer a question about a population based on a sample. Of course, samples will vary due to chance, and we will need to answer our question in spite of this variability. So we need to understand how sample results will vary and how sample results relate to the population as a whole when chance is involved. This is where <strong>probability<\/strong> comes in.\r\n\r\n<img class=\"alignnone\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/oerfiles\/Concepts+in+Statistics\/images\/big_picture_probability.png\" alt=\"Shown on the diagram are Step 1: Producing Data, Step 2: Exploratory Data Analysis, Step 3: Probability, and Step 4: Inference. Highlighted in this diagram is Step 3: Probability\" width=\"732\" height=\"412\" \/>\r\n\r\nProbability is the \u201cmachinery\u201d behind the last step in the process called <strong>inference<\/strong>. We infer something about a population based on a sample. This inference is the conclusion we reach from our sample data that answers our original question about the population.\r\n\r\n<img class=\"alignnone\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/oerfiles\/Concepts+in+Statistics\/images\/big_picture_inference.png\" alt=\"Shown on the diagram are Step 1: Producing Data, Step 2: Exploratory Data Analysis, Step 3: Probability, and Step 4: Inference. 
Highlighted in this diagram is Step 4: Inference.\" width=\"730\" height=\"420\" \/>\r\n<div class=\"textbox examples\">\r\n<h3>Example - The big picture of statistics<\/h3>\r\nAt the end of April 2005, ABC News and the Washington Post conducted a poll to determine the percentage of U.S. adults who support the death penalty.\r\n\r\n<strong>Research question<\/strong>: What percentage of U.S. adults support the death penalty?\r\n\r\nSteps in the statistical investigation:\r\n<ol>\r\n \t<li><strong>Produce Data<\/strong>: <em>Determine what to measure, then collect the data.<\/em>\r\nThe poll selected 1,082 U.S. adults at random. Each adult answered this question: \u201cDo you favor or oppose the death penalty for a person convicted of murder?\u201d<\/li>\r\n \t<li><strong>Explore the Data<\/strong>: <em>Analyze and summarize the data.<\/em>\r\nIn the sample, 65% favored the death penalty.<\/li>\r\n \t<li><strong>Draw a Conclusion<\/strong>: <em>Use the data, probability, and statistical inference to draw a conclusion about the population.<\/em>\r\nOur goal is to determine the percentage of the U.S. adult population that supports the death penalty. We know that different samples will give different results. What are the chances that our sample reflects the opinions of the population within 3%? Probability describes the likelihood that our sample is this accurate. So we can say with 95% confidence that between 62% and 68% of the population favor the death penalty.<\/li>\r\n<\/ol>\r\n<img class=\"alignnone\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/oerfiles\/Concepts+in+Statistics\/images\/big_picture_example.png\" alt=\"Shown on the diagram are Step 1: Producing Data, Step 2: Exploratory Data Analysis, Step 3: Probability, and Step 4: Inference. 
This diagram includes an example of how this model would answer the question of what percentage of a sample population support the death penalty.\" width=\"751\" height=\"463\" \/>\r\n\r\n<\/div>\r\n&nbsp;\r\n<h3>Let's Summarize<\/h3>\r\nA\u00a0statistical investigation begins with a research question. Then the investigation proceeds with the following steps:\r\n<ul>\r\n \t<li>Produce Data: Determine what to measure, then collect the data.<\/li>\r\n \t<li>Explore the Data: Analyze and summarize the data (also called <em>exploratory data analysis<\/em>).<\/li>\r\n \t<li>Draw a Conclusion: Use the data, probability, and statistical inference to draw a conclusion about the population.<\/li>\r\n<\/ul>\r\n<h3>Types of Statistical Studies and Producing Data<\/h3>\r\nIn this first module, we focus on the <em>produce data<\/em> step in a statistical investigation. We discuss two types of statistical investigations: the observational study and the experiment. Each type of investigation involves a different approach to collecting data. We will also see that our approach to collecting data determines what we can conclude from the data.\r\n\r\n<img class=\"alignnone\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1729\/2017\/04\/15031444\/m1_statistical_analysis_big_picture_mod1.gif\" alt=\"Step 1: Produce Data\" width=\"868\" height=\"420\" \/>","rendered":"<p>We organized this course around the Big Picture of Statistics. As we learn new material, we will always look at how these new ideas relate to the Big Picture. In this way the Big Picture is a diagram that will help us organize and understand the material we will learn throughout the course.<\/p>\n<p>The Big Picture summarizes the steps in a statistical investigation.<\/p>\n<p>We begin a statistical investigation with a research question. The research question is frequently something we want to know about a <strong>population<\/strong>. 
The population can be people or other things, such as animals or objects. For example, we might want to know the answer to questions such as:<\/p>\n<ul>\n<li>What percentage of U.S. adults supports the death penalty? (Population: U.S. adults)<\/li>\n<li>Do cell phones affect bees? (Population: bees)<\/li>\n<li>Do cars get better gas mileage with a new gasoline additive? (Population: cars)<\/li>\n<\/ul>\n<p>The population is the entire group that we want to know something about:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/oerfiles\/Concepts+in+Statistics\/images\/big_picture_population.png\" alt=\"The Big Picture of statistics. Shown on the diagram are Step 1: Producing Data, Step 2: Exploratory Data Analysis, Step 3: Probability, and Step 4: Inference. This diagram represents population as randomly placed black dots in a circle.\" width=\"338\" height=\"230\" \/><\/p>\n<p>In most cases, the population is a large group. Often, the population is so large that we cannot collect information from every individual in the population. So we select a <strong>sample<\/strong> from the population. Then we collect data from this sample. This is the first step in the statistical investigation. We call this step <strong>producing data<\/strong>.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/oerfiles\/Concepts+in+Statistics\/images\/big_picture_data.png\" alt=\"Shown on the diagram are Step 1: Producing Data, Step 2: Exploratory Data Analysis, Step 3: Probability, and Step 4: Inference. Highlighted in this diagram is Step 1: Producing Data\" width=\"650\" height=\"280\" \/><\/p>\n<p>Of course, we need a sample that represents the population well. This involves careful planning but also involves chance. For example, if our goal is to determine the percentage of U.S. 
adults who favor the death penalty, we do not want our sample to contain only Democrats or only Republicans. So we can give everyone the same opportunity to be in the sample, but we will let chance select the sample.<\/p>\n<p>At this step of the investigation we also carefully define what kind of information we plan to gather. Then we collect the data.<\/p>\n<p>Data is often a long list of information. To make sense of the data, we explore it and summarize it using graphs and different numerical measures, such as percentages or averages. We call this step <strong>exploratory data analysis<\/strong>.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/oerfiles\/Concepts+in+Statistics\/images\/big_picture_eda.png\" alt=\"Shown on the diagram are Step 1: Producing Data, Step 2: Exploratory Data Analysis, Step 3: Probability, and Step 4: Inference. Highlighted in this diagram is Step 2: Exploratory Data Analysis.\" width=\"732\" height=\"280\" \/><\/p>\n<p>Remember, our goal is to answer a question about a population based on a sample. Of course, samples will vary due to chance, and we will need to answer our question in spite of this variability. So we need to understand how sample results will vary and how sample results relate to the population as a whole when chance is involved. This is where <strong>probability<\/strong> comes in.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/oerfiles\/Concepts+in+Statistics\/images\/big_picture_probability.png\" alt=\"Shown on the diagram are Step 1: Producing Data, Step 2: Exploratory Data Analysis, Step 3: Probability, and Step 4: Inference. Highlighted in this diagram is Step 3: Probability\" width=\"732\" height=\"412\" \/><\/p>\n<p>Probability is the \u201cmachinery\u201d behind the last step in the process called <strong>inference<\/strong>. 
We infer something about a population based on a sample. This inference is the conclusion we reach from our sample data that answers our original question about the population.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/oerfiles\/Concepts+in+Statistics\/images\/big_picture_inference.png\" alt=\"Shown on the diagram are Step 1: Producing Data, Step 2: Exploratory Data Analysis, Step 3: Probability, and Step 4: Inference. Highlighted in this diagram is Step 4: Inference.\" width=\"730\" height=\"420\" \/><\/p>\n<div class=\"textbox examples\">\n<h3>Example &#8211; The big picture of statistics<\/h3>\n<p>At the end of April 2005, ABC News and the Washington Post conducted a poll to determine the percentage of U.S. adults who support the death penalty.<\/p>\n<p><strong>Research question<\/strong>: What percentage of U.S. adults support the death penalty?<\/p>\n<p>Steps in the statistical investigation:<\/p>\n<ol>\n<li><strong>Produce Data<\/strong>: <em>Determine what to measure, then collect the data.<\/em><br \/>\nThe poll selected 1,082 U.S. adults at random. Each adult answered this question: \u201cDo you favor or oppose the death penalty for a person convicted of murder?\u201d<\/li>\n<li><strong>Explore the Data<\/strong>: <em>Analyze and summarize the data.<\/em><br \/>\nIn the sample, 65% favored the death penalty.<\/li>\n<li><strong>Draw a Conclusion<\/strong>: <em>Use the data, probability, and statistical inference to draw a conclusion about the population.<\/em><br \/>\nOur goal is to determine the percentage of the U.S. adult population that supports the death penalty. We know that different samples will give different results. What are the chances that our sample reflects the opinions of the population within 3%? Probability describes the likelihood that our sample is this accurate. 
So we can say with 95% confidence that between 62% and 68% of the population favor the death penalty.<\/li>\n<\/ol>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/oerfiles\/Concepts+in+Statistics\/images\/big_picture_example.png\" alt=\"Shown on the diagram are Step 1: Producing Data, Step 2: Exploratory Data Analysis, Step 3: Probability, and Step 4: Inference. This diagram includes an example of how this model would answer the question of what percentage of a sample population support the death penalty.\" width=\"751\" height=\"463\" \/><\/p>\n<\/div>\n<p>&nbsp;<\/p>\n<h3>Let&#8217;s Summarize<\/h3>\n<p>A\u00a0statistical investigation begins with a research question. Then the investigation proceeds with the following steps:<\/p>\n<ul>\n<li>Produce Data: Determine what to measure, then collect the data.<\/li>\n<li>Explore the Data: Analyze and summarize the data (also called <em>exploratory data analysis<\/em>).<\/li>\n<li>Draw a Conclusion: Use the data, probability, and statistical inference to draw a conclusion about the population.<\/li>\n<\/ul>\n<h3>Types of Statistical Studies and Producing Data<\/h3>\n<p>In this first module, we focus on the <em>produce data<\/em> step in a statistical investigation. We discuss two types of statistical investigations: the observational study and the experiment. Each type of investigation involves a different approach to collecting data. 
We will also see that our approach to collecting data determines what we can conclude from the data.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/1729\/2017\/04\/15031444\/m1_statistical_analysis_big_picture_mod1.gif\" alt=\"Step 1: Produce Data\" width=\"868\" height=\"420\" \/><\/p>\n\n\t\t\t <section class=\"citations-section\" role=\"contentinfo\">\n\t\t\t <h3>Candela Citations<\/h3>\n\t\t\t\t\t <div>\n\t\t\t\t\t\t <div id=\"citation-list-624\">\n\t\t\t\t\t\t\t <div class=\"licensing\"><div class=\"license-attribution-dropdown-subheading\">CC licensed content, Shared previously<\/div><ul class=\"citation-list\"><li>Concepts in Statistics. <strong>Provided by<\/strong>: Open Learning Initiative. <strong>Located at<\/strong>: <a target=\"_blank\" href=\"http:\/\/oli.cmu.edu\">http:\/\/oli.cmu.edu<\/a>. <strong>License<\/strong>: <em><a target=\"_blank\" rel=\"license\" href=\"https:\/\/creativecommons.org\/licenses\/by\/4.0\/\">CC BY: Attribution<\/a><\/em><\/li><\/ul><\/div>\n\t\t\t\t\t\t <\/div>\n\t\t\t\t\t <\/div>\n\t\t\t <\/section>","protected":false},"author":163,"menu_order":1,"template":"","meta":{"_candela_citation":"[{\"type\":\"cc\",\"description\":\"Concepts in Statistics\",\"author\":\"\",\"organization\":\"Open Learning 
Initiative\",\"url\":\"http:\/\/oli.cmu.edu\",\"project\":\"\",\"license\":\"cc-by\",\"license_terms\":\"\"}]","CANDELA_OUTCOMES_GUID":"fd4d64a4-59b9-4781-9309-936844063e2d","pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[],"contributor":[],"license":[],"class_list":["post-624","chapter","type-chapter","status-web-only","hentry"],"part":18,"_links":{"self":[{"href":"https:\/\/courses.lumenlearning.com\/atd-herkimer-statisticssocsci\/wp-json\/pressbooks\/v2\/chapters\/624","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/courses.lumenlearning.com\/atd-herkimer-statisticssocsci\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/courses.lumenlearning.com\/atd-herkimer-statisticssocsci\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/atd-herkimer-statisticssocsci\/wp-json\/wp\/v2\/users\/163"}],"version-history":[{"count":15,"href":"https:\/\/courses.lumenlearning.com\/atd-herkimer-statisticssocsci\/wp-json\/pressbooks\/v2\/chapters\/624\/revisions"}],"predecessor-version":[{"id":1305,"href":"https:\/\/courses.lumenlearning.com\/atd-herkimer-statisticssocsci\/wp-json\/pressbooks\/v2\/chapters\/624\/revisions\/1305"}],"part":[{"href":"https:\/\/courses.lumenlearning.com\/atd-herkimer-statisticssocsci\/wp-json\/pressbooks\/v2\/parts\/18"}],"metadata":[{"href":"https:\/\/courses.lumenlearning.com\/atd-herkimer-statisticssocsci\/wp-json\/pressbooks\/v2\/chapters\/624\/metadata\/"}],"wp:attachment":[{"href":"https:\/\/courses.lumenlearning.com\/atd-herkimer-statisticssocsci\/wp-json\/wp\/v2\/media?parent=624"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/atd-herkimer-statisticssocsci\/wp-json\/pressbooks\/v2\/chapter-type?post=624"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/atd-herkimer-statisticssocsci\/wp-json\/w
p\/v2\/contributor?post=624"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/atd-herkimer-statisticssocsci\/wp-json\/wp\/v2\/license?post=624"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}