{"id":247,"date":"2022-02-18T23:43:52","date_gmt":"2022-02-18T23:43:52","guid":{"rendered":"https:\/\/courses.lumenlearning.com\/exemplarstatistics\/?post_type=chapter&#038;p=247"},"modified":"2022-03-30T22:15:05","modified_gmt":"2022-03-30T22:15:05","slug":"visualizing-quantitative-data-what-to-know","status":"publish","type":"chapter","link":"https:\/\/courses.lumenlearning.com\/exemplarstatistics\/chapter\/visualizing-quantitative-data-what-to-know\/","title":{"raw":"Visualizing Quantitative Data: What to Know","rendered":"Visualizing Quantitative Data: What to Know"},"content":{"raw":"<div class=\"textbox learning-objectives\">\r\n<h3>Learning Goals<\/h3>\r\nAfter completing this section, you should feel comfortable performing these skills.\r\n<ul>\r\n \t<li><a href=\"#IdentQuant\">Identify Quantitative Variables in a Data Set<\/a><\/li>\r\n \t<li><a href=\"#QuantGraph\">Identify graphical displays appropriate for visualizing quantitative data distributions.<\/a><\/li>\r\n \t<li><a href=\"#Using a Data Analysis Tool to Create Histograms\">Use a data analysis tool to create a histogram of quantitative data.<\/a><\/li>\r\n \t<li><a href=\"#ReadHist\">Read and interpret a histogram.<\/a><\/li>\r\n \t<li><a href=\"#Affects of Bin Widths on Histograms\">Explain how the bin width affects a histogram.<\/a><\/li>\r\n \t<li><a href=\"#Using a Data Analysis Tool to Create Dotplots\">Use a data analysis tool to create a dotplot of quantitative data.<\/a><\/li>\r\n \t<li><a href=\"#Dotplots\">Read and interpret a dotplot.<\/a><\/li>\r\n \t<li><a href=\"#Drawing Conclusions about Larger Populations\">Determine if a population and sample are appropriate to draw conclusions about a larger population.<\/a><\/li>\r\n<\/ul>\r\nClick on a skill above to jump to its location in this section.\r\n\r\n<\/div>\r\nIn the next activity, you will need to identify quantitative variables, make plots of the distributions of quantitative variables, distinguish between a population and a 
sample, and explain limitations of analyses based on sample data. In this section, you'll prepare for the activity by exploring the types of displays used to visualize quantitative variables.\r\n<h2 id=\"Quantitative Variables\">Quantitative Variables<\/h2>\r\nYou learned to distinguish between categorical and quantitative variables in Module 1,\u00a0<em>Data Collection and Organization<\/em>, and you learned to identify and display distributions of categorical variables in the previous two sections,\u00a0<em>Displaying Categorical Data<\/em>\u00a0and\u00a0<em>Applications of Bar Graphs<\/em>. Before we turn our attention to a thorough study of quantitative variables, take a moment to refresh your knowledge in the recall boxes below.\r\n<div class=\"textbox examples\">\r\n<h3>Recall<\/h3>\r\nWhat is the distinguishing feature of a quantitative variable? That is, how can we tell a quantitative variable apart from a categorical variable?\r\n<p style=\"text-align: left;\">Core Skill:[reveal-answer q=\"244835\"]Identify a variable as quantitative or categorical[\/reveal-answer]<\/p>\r\n[hidden-answer a=\"244835\"]\r\n\r\n<strong>Quantitative<\/strong> variables measure a\u00a0<em>quantity\u00a0<\/em>associated with a person, item, or entity being observed. We can perform arithmetic on quantitative measures, such as taking their average.\r\n<ul>\r\n \t<li>Examples: Height, price of an ice cream cone, exam scores, temperature<\/li>\r\n<\/ul>\r\n<strong>Categorical<\/strong> variables measure a\u00a0<em>characteristic<\/em>\u00a0of a person, item, or entity being observed. Each observation of a categorical variable falls into one of a set of categories. 
We cannot do arithmetic on categorical variables.\r\n<ul>\r\n \t<li>Examples: eye color, zip code, flavor of ice cream, student ID number<\/li>\r\n<\/ul>\r\n\r\n<hr \/>\r\n\r\n<span style=\"font-size: 1rem; orphans: 1; text-align: initial;\">[\/hidden-answer]<\/span>\r\n\r\nWe will explore <em>quantitative<\/em> displays later in this section. In the meantime, can you recall which graphs and charts are appropriate for displaying the distribution of a <em>categorical<\/em> variable?\r\n\r\nCore skill:[reveal-answer q=\"21598\"]Identify appropriate visual displays for the distribution of a categorical variable. [\/reveal-answer]\r\n[hidden-answer a=\"21598\"]Categorical distributions can be visualized using pie charts, bar graphs, side-by-side bar graphs, and stacked bar graphs.\u00a0 [\/hidden-answer]\r\n\r\n<\/div>\r\n<div class=\"textbox exercises\">\r\n<h3>Example<\/h3>\r\nWhich are the quantitative variables in the list below?\r\n\r\nSalary, eye color, zip code, number of children in household, height, income level\r\n\r\n[reveal-answer q=\"956280\"]Show answer[\/reveal-answer]\r\n[hidden-answer a=\"956280\"]Quantitative variables: <em>salary<\/em>, <em>number of children in household<\/em>, and <em>height<\/em>.\r\n\r\nWe've seen before that categorical variables might have numbers in them, like <em>zip code\u00a0<\/em>or\u00a0<em>income level<\/em>, but they don't have numerical meaning. They are categories, not numbers.\u00a0[\/hidden-answer]\r\n\r\n<\/div>\r\nIn short, quantitative variables have numerical meaning. They are numbers that come with labels attached;\u00a0[latex]30[\/latex] <em>years<\/em>,\u00a0[latex]82[\/latex] <em>points<\/em>,\u00a0[latex]15,000[\/latex] <em>dollars<\/em>,\u00a0[latex]3[\/latex] <em>speeding tickets<\/em> are all examples of quantitative data observations. We can sum them up, take their average, and identify the minimum and maximum values.\r\n\r\nNow it's your turn to practice what you know using a real data set. 
Read the example and description of the data set and its variables below, then answer the questions that follow.\r\n<h3 id=\"IdentQuant\">Variables in a data set<\/h3>\r\nLet's say we are interested in the ages of film actors who have won the highest professional accolades. Do they tend to be younger or older when they win a big award? We can use a data set containing the ages of performers (a quantitative variable) at the time of receiving an award. While we won't be able to draw conclusions about\u00a0<em>why<\/em>\u00a0the award recipients might tend to be younger or older, we can use a visual display to see if an interesting tendency emerges.\r\n\r\nTo investigate, we'll ask the question,\u00a0<em>How old are the winners of the Best Actress and Best Actor awards at the Academy Awards (more commonly known as \u201cthe Oscars\u201d)?<\/em>\r\n\r\nTo answer this question, we will use data on \u201cBest Actress\/Actor\u201d for the\u00a0[latex]184[\/latex] winners from 1929 to 2018.[footnote]<em>Oscar winners, 1929 to 2018<\/em>. (n.d.). OpenIntro. 
Retrieved from https:\/\/www.openintro.org\/data\/index.php?data=oscars[\/footnote] The table below shows the first five observations.\r\n<table class=\"lines\" border=\"1\">\r\n<tbody>\r\n<tr>\r\n<td style=\"width: 56.9219px; text-align: center;\" colspan=\"10\"><strong>Best Actress\/Actor Winners from 1929 to 2018<\/strong><strong>\r\n<\/strong><\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"width: 56.9219px; text-align: center;\"><strong>oscar_no<\/strong><\/td>\r\n<td style=\"width: 53.375px; text-align: center;\"><strong>oscar_yr<\/strong><\/td>\r\n<td style=\"width: 70.4375px; text-align: center;\"><strong>award<\/strong><\/td>\r\n<td style=\"width: 87.5px; text-align: center;\"><strong>name<\/strong><\/td>\r\n<td style=\"width: 158.688px; text-align: center;\"><strong>movie<\/strong><\/td>\r\n<td style=\"width: 22.0625px; text-align: center;\"><strong>age<\/strong><\/td>\r\n<td style=\"width: 89.1719px; text-align: center;\"><strong>birth_pl<\/strong><\/td>\r\n<td style=\"width: 54.7656px; text-align: center;\"><strong>birth_mo<\/strong><\/td>\r\n<td style=\"width: 43.375px; text-align: center;\"><strong>birth_d<\/strong><\/td>\r\n<td style=\"width: 42.6875px; text-align: center;\"><strong>birth_y<\/strong><\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"width: 56.9219px; text-align: center;\"><strong>1<\/strong><\/td>\r\n<td style=\"width: 53.375px; text-align: center;\">1929<\/td>\r\n<td style=\"width: 70.4375px; text-align: center;\">Best actress<\/td>\r\n<td style=\"width: 87.5px; text-align: center;\">Janet Gaynor<\/td>\r\n<td style=\"width: 158.688px; text-align: center;\">7th Heaven<\/td>\r\n<td style=\"width: 22.0625px; text-align: center;\">22<\/td>\r\n<td style=\"width: 89.1719px; text-align: center;\">Pennsylvania<\/td>\r\n<td style=\"width: 54.7656px; text-align: center;\">10<\/td>\r\n<td style=\"width: 43.375px; text-align: center;\">6<\/td>\r\n<td style=\"width: 42.6875px; text-align: center;\">1906<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"width: 56.9219px; 
text-align: center;\"><strong>2<\/strong><\/td>\r\n<td style=\"width: 53.375px; text-align: center;\">1930<\/td>\r\n<td style=\"width: 70.4375px; text-align: center;\">Best actress<\/td>\r\n<td style=\"width: 87.5px; text-align: center;\">Mary Pickford<\/td>\r\n<td style=\"width: 158.688px; text-align: center;\">Coquette<\/td>\r\n<td style=\"width: 22.0625px; text-align: center;\">37<\/td>\r\n<td style=\"width: 89.1719px; text-align: center;\">Canada<\/td>\r\n<td style=\"width: 54.7656px; text-align: center;\">4<\/td>\r\n<td style=\"width: 43.375px; text-align: center;\">8<\/td>\r\n<td style=\"width: 42.6875px; text-align: center;\">1892<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"width: 56.9219px; text-align: center;\"><strong>3<\/strong><\/td>\r\n<td style=\"width: 53.375px; text-align: center;\">1931<\/td>\r\n<td style=\"width: 70.4375px; text-align: center;\">Best actress<\/td>\r\n<td style=\"width: 87.5px; text-align: center;\">Norma Shearer<\/td>\r\n<td style=\"width: 158.688px; text-align: center;\">The Divorcee<\/td>\r\n<td style=\"width: 22.0625px; text-align: center;\">28<\/td>\r\n<td style=\"width: 89.1719px; text-align: center;\">Canada<\/td>\r\n<td style=\"width: 54.7656px; text-align: center;\">8<\/td>\r\n<td style=\"width: 43.375px; text-align: center;\">10<\/td>\r\n<td style=\"width: 42.6875px; text-align: center;\">1902<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"width: 56.9219px; text-align: center;\"><strong>4<\/strong><\/td>\r\n<td style=\"width: 53.375px; text-align: center;\">1932<\/td>\r\n<td style=\"width: 70.4375px; text-align: center;\">Best actress<\/td>\r\n<td style=\"width: 87.5px; text-align: center;\">Marie Dressler<\/td>\r\n<td style=\"width: 158.688px; text-align: center;\">Min and Bill<\/td>\r\n<td style=\"width: 22.0625px; text-align: center;\">63<\/td>\r\n<td style=\"width: 89.1719px; text-align: center;\">Canada<\/td>\r\n<td style=\"width: 54.7656px; text-align: center;\">11<\/td>\r\n<td style=\"width: 43.375px; text-align: 
center;\">9<\/td>\r\n<td style=\"width: 42.6875px; text-align: center;\">1868<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"width: 56.9219px; text-align: center;\"><strong>5<\/strong><\/td>\r\n<td style=\"width: 53.375px; text-align: center;\">1933<\/td>\r\n<td style=\"width: 70.4375px; text-align: center;\">Best actress<\/td>\r\n<td style=\"width: 87.5px; text-align: center;\">Helen Hayes<\/td>\r\n<td style=\"width: 158.688px; text-align: center;\">The Sin of Madelon Claudet<\/td>\r\n<td style=\"width: 22.0625px; text-align: center;\">32<\/td>\r\n<td style=\"width: 89.1719px; text-align: center;\">Washington DC<\/td>\r\n<td style=\"width: 54.7656px; text-align: center;\">10<\/td>\r\n<td style=\"width: 43.375px; text-align: center;\">10<\/td>\r\n<td style=\"width: 42.6875px; text-align: center;\">1900<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\nThe following is the <strong>data dictionary<\/strong> for the variables in the table:\r\n<ul>\r\n \t<li><strong><em>oscar_no<\/em><\/strong>: Oscar ceremony number<\/li>\r\n \t<li><strong><em>oscar_yr<\/em><\/strong>: Year of the Oscar ceremony<\/li>\r\n \t<li><strong><em>award<\/em><\/strong>: Best Actress or Best Actor<\/li>\r\n \t<li><strong><em>name<\/em><\/strong>: Name of award recipient<\/li>\r\n \t<li><strong><em>movie<\/em><\/strong>: Name of movie<\/li>\r\n \t<li><strong><em>age<\/em><\/strong>: Age of award recipient<\/li>\r\n \t<li><strong><em>birth_pl<\/em><\/strong>: Birth place of award recipient<\/li>\r\n \t<li><strong><em>birth_mo<\/em><\/strong>: Birth month of award recipient<\/li>\r\n \t<li><strong><em>birth_d<\/em><\/strong>: Birth day of award recipient<\/li>\r\n \t<li><strong><em>birth_y<\/em><\/strong>: Birth year of award recipient<\/li>\r\n<\/ul>\r\n<div class=\"textbox tryit\">\r\n<h3>quantitative versus categorical variables<\/h3>\r\n<span style=\"background-color: #ffff00;\">[Insert a short video ( &lt; 30 seconds) introducing the features of quantitative variables vs categorical in a data table or 
data dictionary (this extends the understanding obtained in 1C of identifying them from a list of words). The confusing variables in the data dictionary above include oscar_yr and birth_mo, which will appear to be numerical to students.]\u00a0<\/span>\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 1<\/h3>\r\n[ohm_question hide_question_numbers=1]240827[\/ohm_question]\r\n\r\n[reveal-answer q=\"95535\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"95535\"]Recall that quantitative variables have numerical meaning. For which variable listed could you add the values or find their average?[\/hidden-answer]\r\n\r\n<\/div>\r\n<h2 id=\"QuantDisplay\">Quantitative Displays<\/h2>\r\nEarlier, you learned which kinds of graphs make good visualizations for categorical data. Just as certain graphs are useful for displaying data across categories (pie chart, bar graph, side-by-side and stacked bar graphs), others are especially well suited to quantitative data distributions. Categorical displays won't work for quantitative data and vice versa.\r\n<h3 id=\"QuantGraph\">Graphs and Charts<\/h3>\r\nIn the future, you may need to choose a display yourself, so it is important to know which display works for the type of data you have. Remind yourself in the example below which graphs and charts you have used to display categorical variables, then answer the following question about quantitative displays.\r\n<div class=\"textbox exercises\">\r\n<h3>Example<\/h3>\r\nSome graphs and charts can be used to display distributions of categorical variables. Others work for displaying quantitative variables. 
Which of the graphs and charts below did you use in previous sections to display categorical variables?\r\n<ol>\r\n \t<li>Pie chart<\/li>\r\n \t<li><span style=\"font-size: 1rem; orphans: 1; text-align: initial;\">Bar chart<\/span><\/li>\r\n \t<li><span style=\"font-size: 1rem; orphans: 1; text-align: initial;\">Dotplot<\/span><\/li>\r\n \t<li><span style=\"font-size: 1rem; orphans: 1; text-align: initial;\">Histogram<\/span><\/li>\r\n<\/ol>\r\n[reveal-answer q=\"382376\"]Show Solution[\/reveal-answer]\r\n[hidden-answer a=\"382376\"]1 and 2. Pie charts and bar charts were used to display categorical variables.[\/hidden-answer]\r\n\r\n<\/div>\r\nNow answer Question 2 about quantitative displays.\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 2<\/h3>\r\n[ohm_question hide_question_numbers=1]240828[\/ohm_question]\r\n\r\n[reveal-answer q=\"683208\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"683208\"]Think about which graphs or charts are appropriate for distributions of categorical variables.[\/hidden-answer]\r\n\r\n<\/div>\r\nWe know that pie charts and bar charts (and side-by-side and stacked bar charts) are used to display categorical distributions. <strong>Histograms<\/strong> and <strong>dotplots<\/strong> are appropriate for displaying quantitative data.\r\n<ul>\r\n \t<li><strong>Dotplots<\/strong> display how many individual observations there are of each value observed. Each observation in the data set appears as its own dot on the graph. 
A large number of observations could overwhelm the display, so dotplots work best when the data set is small.<\/li>\r\n \t<li><strong>Histograms<\/strong> are good choices for displaying data sets that have a large number of observations since they group observations into equal-size \"bins.\" Each bin can cover as wide an interval of values as desired, so a histogram will not be overwhelmed by a large number of observations in a data set.<\/li>\r\n<\/ul>\r\n<h2 id=\"Histograms\">Histograms<\/h2>\r\nWe've seen that a\u00a0<strong>histogram <\/strong>is a graphical display used to visualize the distribution of a quantitative variable. Because it remains readable when there are a large number of observations in the data set, it is one of the most commonly used displays for quantitative distributions. Let's take a closer look at how a histogram is created before using the tool to create one ourselves.\r\n<div class=\"textbox tryit\">\r\n<h3>creating a histogram<\/h3>\r\n<span style=\"background-color: #99cc00;\">[Perspective Video - a 3-instructors video demonstrating how to create a histogram for a variable from a copied-and-pasted data set, covering the features of a histogram, especially including binwidth and endpoints. Be sure to point out that students can also select \"dotplot\" in the tool to change the type of graph. Include a statement or two comparing and contrasting the two graphs. Boxplots have not been studied yet, so there is no need to compare them as well.]<\/span>\r\n\r\n<\/div>\r\nWe can use the \"Best Actress\/Actor\" data table as a resource to learn more about the features of a histogram. Below, see a histogram of the variable\u00a0<em>age<\/em> from the data set.\r\n\r\n<img class=\"alignnone wp-image-965\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2022\/01\/11181832\/Picture101-300x133.png\" alt=\"A histogram of Best Actress and Best Actor winners by age. 
The vertical axis is labeled &quot;Count&quot; and numbered in increments of 10 up to 30 and the horizontal is labeled &quot;Age of Best Actress and Best Actor Winners.&quot; The bar for ages 20-24 goes approximately two thirds of the way to the line at 10. The bar for ages 25-29 goes approximately three quarters of the way to the line at 30. The bar for ages 30-34 goes approximately one fifth of an increment above the line at 30. The bar for ages 35-39 goes approximately one fifth of an increment above the line at 30. The bar for ages 40-44 goes approximately one half of an increment above the line at 30. The bar for ages 45-49 goes approximately one tenth of an increment below the line at 20. The bar for ages 50-54 goes approximately one fifth of an increment above the line at 10. The bar for ages 55-59 goes approximately halfway to the line at 10. The bar for ages 60-64 goes approximately one third of an increment above the line at 10. The bar for ages 65-69 is at zero. The bar for ages 70-74 goes approximately one tenth of an increment above the line at 0. The bar for ages 75-79 goes approximately one tenth of an increment above the line at 0. The bar for ages 80-84 goes approximately one tenth of an increment above the line at 0.\" width=\"873\" height=\"387\" \/>\r\n\r\nSimilar to a bar graph, the height of each bar shows the number of observations within each \u201cbin\u201d (these would be the categories in the bar graph). A <strong>bin <\/strong>is a range of values that the quantitative variable can take. For example, the first bin on the histogram above is [20,25). The height of this bar shows there are six actors or actresses with ages that fall in this bin.\r\n\r\nA bin can be defined by its <strong>end points<\/strong>, the smallest and largest values of the quantitative variable represented in the bin. For the first bin [20,25), the end points are\u00a0[latex]20[\/latex] and\u00a0[latex]25[\/latex]. 
The notation [20,25) means this bin includes observations with ages that are at least\u00a0[latex]20[\/latex] and less than\u00a0[latex]25[\/latex].\r\n\r\nQuestions 3 and 4 below will help further understand the bins of a histogram.\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 3<\/h3>\r\n[ohm_question hide_question_numbers=1]240830[\/ohm_question]\r\n\r\n[reveal-answer q=\"939927\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"939927\"]The notation [35,40) indicates an interval of all values between\u00a0[latex]35[\/latex] and\u00a0[latex]40[\/latex]. The bracket is used to include an endpoint. A parenthesis is used to exclude an endpoint.[\/hidden-answer]\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 4<\/h3>\r\n[ohm_question hide_question_numbers=1]240831[\/ohm_question]\r\n\r\n[reveal-answer q=\"853664\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"853664\"]What is the difference between\u00a0[latex]40[\/latex] and\u00a0[latex]35[\/latex]? Does it appear that each bin has the same width?[\/hidden-answer]\r\n\r\n<\/div>\r\n<h3 id=\"Using a Data Analysis Tool to Create Histograms\">Creating histograms with technology<\/h3>\r\n<div class=\"textbox\">\r\n\r\nGo to the <em>Describing and Exploring Quantitative Variables<\/em> tool at <a href=\"https:\/\/istats.shinyapps.io\/EDA_quantitative\/\" target=\"_blank\" rel=\"noopener\">https:\/\/istats.shinyapps.io\/EDA_quantitative\/<\/a> and create a histogram for the distribution of <em>age<\/em> of the\u00a0[latex]184[\/latex] Best Actress\/Actor winners, following the steps below:\r\n<p style=\"padding-left: 30px;\">Step 1) Select the <strong>Single Group<\/strong> tab<\/p>\r\n<p style=\"padding-left: 30px;\">Step 2) Locate the dropdown under <strong>Enter Data<\/strong> and select <strong>Your Own<\/strong>.<\/p>\r\n<p style=\"padding-left: 30px;\">Step 3) For <strong>Do you have<\/strong>: select <strong>Individual Observations<\/strong>.<\/p>\r\n<p style=\"padding-left: 
30px;\">Step 4) In the <strong>Name of Variable<\/strong> box, type \"Age\".<\/p>\r\n<p style=\"padding-left: 30px;\">Step 5) Download the\u00a0<a href=\"https:\/\/docs.google.com\/spreadsheets\/d\/1qitPXlvaiNkCpL4wLkC349rkpnuisCTUoJ87mKkcTFU\/edit#gid=0\" target=\"_blank\" rel=\"noopener\">Oscars_Age spreadsheet<\/a>\u00a0and copy and paste the <em>age<\/em> data into the tool.<\/p>\r\n<p style=\"padding-left: 30px;\">Step 6) Locate <strong>Choose Type of Plot<\/strong>\u00a0and choose <strong>Histogram<\/strong>. Deselect any other types.<\/p>\r\n<p style=\"padding-left: 30px;\">Step 7) Set <strong>Select Binwidth For Histogram<\/strong> to 5.<\/p>\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 5<\/h3>\r\n[ohm_question hide_question_numbers=1]240607[\/ohm_question]\r\n\r\n[reveal-answer q=\"30112\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"30112\"]You can scroll through the Individual Observations box to check the accuracy of the data you typed or pasted. Double-check that the correct binwidth was selected.[\/hidden-answer]\r\n\r\n<\/div>\r\n<h3>Interpreting histograms<\/h3>\r\n<div class=\"textbox tryit\">\r\n<h3>reading and interpreting histograms<\/h3>\r\n<strong><span style=\"background-color: #99cc00;\">[Worked Example -- a 3-instructors worked example of reading and interpreting histograms with different binwidths -- showing which binwidth seems \"better\" for answering certain questions about the distribution.]<\/span><\/strong>\r\n\r\n<\/div>\r\nUse the histogram you created to answer the following questions. 
(Hint: Hover over the histogram to get the exact height of each bar.)\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 6<\/h3>\r\n[ohm_question hide_question_numbers=1]240832[\/ohm_question]\r\n\r\n[reveal-answer q=\"853644\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"853644\"]Look for the bin that contains the interval of ages [20,25).[\/hidden-answer]\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 7<\/h3>\r\n[ohm_question hide_question_numbers=1]240833[\/ohm_question]\r\n\r\n[reveal-answer q=\"796786\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"796786\"]Look for the bin that contains the interval of ages [50,70).[\/hidden-answer]\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 8<\/h3>\r\n[ohm_question hide_question_numbers=1]240834[\/ohm_question]\r\n\r\n[reveal-answer q=\"235142\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"235142\"]Recall that a proportion is some part out of the whole. Hover over the bins to reveal the count of the observations contained.[\/hidden-answer]\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 9<\/h3>\r\n[ohm_question hide_question_numbers=1]240836[\/ohm_question]\r\n\r\n[reveal-answer q=\"666392\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"666392\"]What do <em>you<\/em> think?[\/hidden-answer]\r\n\r\n<\/div>\r\n<h3 id=\"Affects of Bin Widths on Histograms\">Bin Width<\/h3>\r\nUsing a different bin width for the histogram can change the features of the distribution we are able to see from the graphical display.\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 10<\/h3>\r\n[ohm_question hide_question_numbers=1]240608[\/ohm_question]\r\n\r\n[reveal-answer q=\"106293\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"106293\"]Set the binwidth to 20.[\/hidden-answer]\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 11<\/h3>\r\n[ohm_question hide_question_numbers=1]240609[\/ohm_question]\r\n\r\n[reveal-answer 
q=\"935450\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"935450\"]Set the binwidth to 1.[\/hidden-answer]\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 12<\/h3>\r\n[ohm_question hide_question_numbers=1]240838[\/ohm_question]\r\n\r\n[reveal-answer q=\"32745\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"32745\"]Change the binwidths in the tool to make the determination.[\/hidden-answer]\r\n\r\n<\/div>\r\n<h2>Dotplots<\/h2>\r\nIn a previous activity, you created a dotplot, a graphical display for quantitative data where each dot represents a single observation in a data set.\u00a0Dotplots are useful for visualizing distributions when the data set is small.\r\n\r\nThere aren't as many features to understand about a dotplot as there are with histograms. We'll begin by creating one with the tool, then read and interpret it.\r\n<h3 id=\"Using a Data Analysis Tool to Create Dotplots\">Creating dotplots<\/h3>\r\nWe'll use a dotplot to visualize the same\u00a0distribution of <em>age<\/em> of Best Actress\/Actor winners.\r\n<div class=\"textbox\">\r\n\r\nWith the same tool open that you used to create the histogram (or by following Steps 1 - 4 above), check the \u201c<strong>Dotplot<\/strong>\u201d box. Use <strong>dotsize = 1<\/strong> and <strong>bin width = 1<\/strong>.\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 13<\/h3>\r\n[ohm_question hide_question_numbers=1]240611[\/ohm_question]\r\n\r\n[reveal-answer q=\"24307\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"24307\"]Check that the dotsize and binwidth are both set to 1.[\/hidden-answer]\r\n\r\n<\/div>\r\n<h3 id=\"Dotplots\">Interpreting dotplots<\/h3>\r\n<span style=\"background-color: #ffff00;\">At this point, students will be presented with two datasets. 
They will be able to choose which one they would like to use to answer example questions before creating dotplots using the data analysis tool.<\/span>\r\n\r\nReading a dotplot is much the same as reading a histogram. The horizontal axis shows the range of observed values of the variable, and the vertical axis shows how many times each value was observed. The difference is that a dotplot shows distinct counts of each value or binned value, one dot per observation. The example below demonstrates how to read and interpret a dotplot.\r\n<div class=\"textbox exercises\">\r\n<h3>Example<\/h3>\r\nLet's say that a marketing firm is interested in the age in years of the typical automobile driven by college students. Survey responses from 130 college students were collected and are displayed in the dotplot below.\r\n\r\n<img class=\"aligncenter size-full wp-image-1019\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5772\/2022\/02\/11201824\/AgeofCar_Dotplot.jpg\" alt=\"A dotplot with age of car in years labeled on the horizontal axis, ranging from 1 to 16. 
The following dots are displayed above each of the ages: 1 = 6, 2 = 7, 3 = 14, 4 = 9, 5 = 10, 6 = 21 , 7 = 20 , 8 = 16 , 9 = 7, 10 = 8, 11 = 4, 12 = 4, 13 = 1, 14 = 1, 16 = 2\" width=\"1789\" height=\"590\" \/>\r\n<ol>\r\n \t<li>What does each dot on the graph represent?<\/li>\r\n \t<li>How many students reported driving a car more than 12 years old?<\/li>\r\n \t<li>How many students reported driving a car that was 10 years old?<\/li>\r\n \t<li>Did more students report driving a car under 2 years old or one 6 years old?<\/li>\r\n<\/ol>\r\n[reveal-answer q=\"945832\"]Show Solution[\/reveal-answer]\r\n[hidden-answer a=\"945832\"]\r\n<ol>\r\n \t<li>Each dot represents one student's response with the age of their car in years.<\/li>\r\n \t<li>4<\/li>\r\n \t<li>8<\/li>\r\n \t<li>More reported driving a car that was 6 years old.<\/li>\r\n<\/ol>\r\n[\/hidden-answer]\r\n\r\n<\/div>\r\n&nbsp;\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 14<\/h3>\r\n[ohm_question hide_question_numbers=1]240841[\/ohm_question]\r\n\r\n[reveal-answer q=\"249799\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"249799\"]Use the dotplot. For example, to determine if it would be considered \"typical\" for a performer at least\u00a0[latex]70[\/latex] years old to win the award, we would say it is not since only a small number of observations are located over\u00a0[latex]70[\/latex]. [\/hidden-answer]\r\n\r\n<\/div>\r\n<h2 id=\"Drawing Conclusions about Larger Populations\">Looking Ahead: Drawing Conclusions about Larger Populations<\/h2>\r\nYou saw a brief introduction to statistical inference earlier in the course, the process of making inferences about a <strong>population<\/strong> based on data collected on a <strong>sample<\/strong> from that population. We'll study it in greater detail later, but it will be helpful to consider the idea of a <strong>representative sample<\/strong> from time to time along the way. 
You learned in section 2A that a sampling method is considered biased if it has a tendency to produce samples that are not representative of the population. When that happens, we cannot <strong>generalize<\/strong> our results to the population and can only make statements about the sample itself.\r\n<div class=\"textbox examples\">\r\n<h3>Recall<\/h3>\r\nCore skill:\r\n[reveal-answer q=\"908567\"]Understand the difference between a sample and a population.\u00a0[\/reveal-answer]\r\n[hidden-answer a=\"908567\"]\r\n\r\nA <strong>population <\/strong>is the group of individuals or entities that our research or survey question pertains to.\r\n\r\nA\u00a0<strong>sample<\/strong> is a group of individuals or entities on which we collect data.\r\n\r\nA sample is\u00a0<strong>representative<\/strong> of the population if the characteristics of the sample tend to match the characteristics of the population.\r\n\r\nIf a sample is not representative of the population, we cannot\u00a0<strong>generalize <\/strong>the results of our analysis\u00a0from the sample to the population.[\/hidden-answer]\r\n\r\n<\/div>\r\nThe question below will help you to develop your understanding of when you can use the results of an analysis to make statements about some larger population of which your sample is a subset.\r\n\r\nTo answer this question, consider that the data set we've explored in this section,\u00a0<em>\u201cBest Actress\/Actor\u201d for the\u00a0[latex]184[\/latex] winners from 1929 to 2018,\u00a0<\/em>includes observations on people who won this award over an\u00a0[latex]89[\/latex]-year span. 
The people about whom data was collected are also members of the set of all Oscar winners in the timespan, which is itself a subset of all Hollywood film actors.\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 15<\/h3>\r\n[ohm_question hide_question_numbers=1]240842[\/ohm_question]\r\n\r\n[reveal-answer q=\"714373\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"714373\"]What is the population of interest? What sample was used to make your plots?[\/hidden-answer]\r\n\r\n<\/div>\r\nIn the next activity, we'll continue this theme by talking about the runtime of well-loved movies. Get ready by thinking about those movies you could watch over and over.\u00a0Look up the \u201cruntime\u201d\u00a0(length of the movie in minutes) of your favorite movies to compare with others in the next activity.\r\n<div><span style=\"font-size: 1rem; orphans: 1; text-align: initial;\">To find the runtime of your favorite movie:<\/span><\/div>\r\n<div>\r\n<ul>\r\n \t<li>Navigate to <a href=\"https:\/\/www.imdb.com\/\">https:\/\/www.imdb.com\/<\/a>.<\/li>\r\n \t<li>Type your favorite movie in the search bar. Select the title.<\/li>\r\n \t<li>Convert the runtime into minutes and record that value.<\/li>\r\n<\/ul>\r\nFor example, if your favorite movie is Happy Gilmore, the runtime is listed as one hour,\u00a0[latex]32[\/latex] minutes. Therefore, the runtime that you will record is\u00a0[latex]92[\/latex] minutes.\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 16<\/h3>\r\n[ohm_question hide_question_numbers=1]240843[\/ohm_question]\r\n\r\n[reveal-answer q=\"690809\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"690809\"]Look up your favorite movie in IMDB.com.[\/hidden-answer]\r\n\r\n<\/div>\r\n<\/div>\r\n<h2>Summary<\/h2>\r\nIn this section, you've had a chance to practice the tasks that will be essential to forming deeper connections in the next activity. 
This is a good time to sum it all up before moving on.\r\n<ul>\r\n \t<li>In Questions 1 and 2, you identified quantitative variables and the plots used to visualize their distributions.<\/li>\r\n \t<li>In Questions 5 and 13, you used technology to make a plot of the distribution of a quantitative variable.<\/li>\r\n \t<li>In Questions 3, 4, and 6 - 9, you used a histogram to describe a distribution.<\/li>\r\n \t<li>In Questions 10, 11, and 12, you explored how bin width affects a histogram.<\/li>\r\n \t<li>In Question 14, you used a dotplot to describe a distribution.<\/li>\r\n \t<li>In Question 15, you identified the population and the sample.<\/li>\r\n \t<li>In Question 15, you considered limitations on the scope of analysis based on the sample data.<\/li>\r\n<\/ul>\r\nThis section gave you an opportunity to see that dotplots and histograms are good ways to visualize quantitative data. You also received some practice manipulating the bin width of a histogram to see how it affected the information displayed. Finally, you needed to differentiate between the population and the sample to discuss possible limitations on the scope of an analysis of sample data. 
If you feel comfortable with these ideas, please move on to the next activity in Forming Connections.","rendered":"<div class=\"textbox learning-objectives\">\n<h3>Learning Goals<\/h3>\n<p>After completing this section, you should feel comfortable performing these skills.<\/p>\n<ul>\n<li><a href=\"#IdentQuant\">Identify Quantitative Variables in a Data Set<\/a><\/li>\n<li><a href=\"#QuantGraph\">Identify graphical displays appropriate for visualizing quantitative data distributions.<\/a><\/li>\n<li><a href=\"#Using a Data Analysis Tool to Create Histograms\">Use a data analysis tool to create a histogram of quantitative data.<\/a><\/li>\n<li><a href=\"#ReadHist\">Read and interpret a histogram.<\/a><\/li>\n<li><a href=\"#Affects of Bin Widths on Histograms\">Explain how the bin width affects a histogram.<\/a><\/li>\n<li><a href=\"#Using a Data Analysis Tool to Create Dotplots\">Use a data analysis tool to create a dotplot of quantitative data.<\/a><\/li>\n<li><a href=\"#Dotplots\">Read and interpret a dotplot.<\/a><\/li>\n<li><a href=\"#Drawing Conclusions about Larger Populations\">Determine if a population and sample are appropriate to draw conclusions about a larger population.<\/a><\/li>\n<\/ul>\n<p>Click on a skill above to jump to its location in this section.<\/p>\n<\/div>\n<p>In the next activity, you will need to identify quantitative variables, make plots of the distributions of quantitative variables, distinguish between a population and a sample, and explain limitations of analyses based on sample data. 
In this section, you&#8217;ll prepare for the activity by exploring the types of displays used to visualize quantitative variables.<\/p>\n<h2 id=\"Quantitative Variables\">Quantitative Variables<\/h2>\n<p>You learned to distinguish between categorical and quantitative variables in Module 1,\u00a0<em>Data Collection and Organization<\/em>, and you learned to identify and display distributions of categorical variables in the previous two sections,\u00a0<em>Displaying Categorical Data<\/em>\u00a0and\u00a0<em>Applications of Bar Graphs<\/em>. Before we turn our attention to a thorough study of quantitative variables, take a moment to refresh your knowledge in the recall boxes below.<\/p>\n<div class=\"textbox examples\">\n<h3>Recall<\/h3>\n<p>What is the distinguishing feature of a quantitative variable? That is, how can we tell a quantitative variable apart from a categorical variable?<\/p>\n<p style=\"text-align: left;\">Core Skill:<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q244835\">Identify a variable as quantitative or categorical<\/span><\/p>\n<div id=\"q244835\" class=\"hidden-answer\" style=\"display: none\">\n<p><strong>Quantitative<\/strong> variables measure a\u00a0<em>quantity\u00a0<\/em>associated with a person, item, or entity being observed. We can perform arithmetic on quantitative measures, such as taking their average.<\/p>\n<ul>\n<li>Examples: Height, price of an ice cream cone, exam scores, temperature<\/li>\n<\/ul>\n<p><strong>Categorical<\/strong> variables measure a\u00a0<em>characteristic<\/em>\u00a0of a person, item, or entity being observed. Each observation of a categorical variable falls into one of a set of categories. 
We cannot do arithmetic on categorical variables.<\/p>\n<ul>\n<li>Examples: eye color, zip code, flavor of ice cream, student ID number<\/li>\n<\/ul>\n<hr \/>\n<p><span style=\"font-size: 1rem; orphans: 1; text-align: initial;\"><\/div>\n<\/div>\n<p><\/span><\/p>\n<p>We will explore <em>quantitative<\/em> displays later in this section. In the meantime, can you recall which graphs and charts are appropriate for displaying the distribution of a <em>categorical<\/em> variable?<\/p>\n<p>Core skill:<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q21598\">Identify appropriate visual displays for the distribution of a categorical variable. <\/span><\/p>\n<div id=\"q21598\" class=\"hidden-answer\" style=\"display: none\">Categorical distributions can be visualized using pie charts, bar graphs, side-by-side bar graphs, and stacked bar graphs.\u00a0 <\/div>\n<\/div>\n<\/div>\n<div class=\"textbox exercises\">\n<h3>Example<\/h3>\n<p>Which are the quantitative variables in the list below?<\/p>\n<p>Salary, eye color, zip code, number of children in household, height, income level<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q956280\">Show answer<\/span><\/p>\n<div id=\"q956280\" class=\"hidden-answer\" style=\"display: none\">Quantitative variables: <em>salary<\/em>, <em>number of children in household<\/em>, and <em>height<\/em>.<\/p>\n<p>We&#8217;ve seen before that categorical variables might have numbers in them, like <em>zip code\u00a0<\/em>or\u00a0<em>income level<\/em>, but they don&#8217;t have numerical meaning. They are categories, not numbers.\u00a0<\/div>\n<\/div>\n<\/div>\n<p>In short, quantitative variables have numerical meaning. 
They are numbers that come with labels attached;\u00a0[latex]30[\/latex] <em>years<\/em>,\u00a0[latex]82[\/latex] <em>points<\/em>,\u00a0[latex]15,000[\/latex] <em>dollars<\/em>,\u00a0[latex]3[\/latex] <em>speeding tickets<\/em> are all examples of quantitative data observations. We can sum them up, take their average, and identify the minimum and maximum values.<\/p>\n<p>Now it&#8217;s your turn to practice what you know using a real data set. Read the example and description of the data set and its variables below, then answer the questions that follow.<\/p>\n<h3 id=\"IdentQuant\">Variables in a data set<\/h3>\n<p>Let&#8217;s say we are interested in the ages of film actors who have won the highest professional accolades. Do they tend to be younger or older when they win a big award? We can use a data set containing the ages of performers (a quantitative variable) at the time of receiving an award. While we won&#8217;t be able to draw conclusions about\u00a0<em>why<\/em>\u00a0the award recipients might tend to be younger or older, we can use a visual display to see if an interesting tendency emerges.<\/p>\n<p>To investigate, we&#8217;ll ask the question,\u00a0<em>How old are the winners of the Best Actress and Best Actor awards at the Academy Awards (more commonly known as \u201cthe Oscars\u201d)?<\/em><\/p>\n<p>To answer this question, we will use data on \u201cBest Actress\/Actor\u201d for the\u00a0[latex]184[\/latex] winners from 1929 to 2018.<a class=\"footnote\" title=\"Oscar winners, 1929 to 2018. (n.d.). OpenIntro. 
Retrieved from https:\/\/www.openintro.org\/data\/index.php?data=oscars\" id=\"return-footnote-247-1\" href=\"#footnote-247-1\" aria-label=\"Footnote 1\"><sup class=\"footnote\">[1]<\/sup><\/a> The table below shows the first five observations.<\/p>\n<table class=\"lines\">\n<tbody>\n<tr>\n<td style=\"width: 56.9219px; text-align: center;\" colspan=\"10\"><strong>Best Actress\/Actor Winners from 1929 to 2018<\/strong><strong><br \/>\n<\/strong><\/td>\n<\/tr>\n<tr>\n<td style=\"width: 56.9219px; text-align: center;\"><strong>oscar_no<\/strong><\/td>\n<td style=\"width: 53.375px; text-align: center;\"><strong>oscar_yr<\/strong><\/td>\n<td style=\"width: 70.4375px; text-align: center;\"><strong>award<\/strong><\/td>\n<td style=\"width: 87.5px; text-align: center;\"><strong>name<\/strong><\/td>\n<td style=\"width: 158.688px; text-align: center;\"><strong>movie<\/strong><\/td>\n<td style=\"width: 22.0625px; text-align: center;\"><strong>age<\/strong><\/td>\n<td style=\"width: 89.1719px; text-align: center;\"><strong>birth_pl<\/strong><\/td>\n<td style=\"width: 54.7656px; text-align: center;\"><strong>birth_mo<\/strong><\/td>\n<td style=\"width: 43.375px; text-align: center;\"><strong>birth_d<\/strong><\/td>\n<td style=\"width: 42.6875px; text-align: center;\"><strong>birth_y<\/strong><\/td>\n<\/tr>\n<tr>\n<td style=\"width: 56.9219px; text-align: center;\"><strong>1<\/strong><\/td>\n<td style=\"width: 53.375px; text-align: center;\">1929<\/td>\n<td style=\"width: 70.4375px; text-align: center;\">Best actress<\/td>\n<td style=\"width: 87.5px; text-align: center;\">Janet Gaynor<\/td>\n<td style=\"width: 158.688px; text-align: center;\">7th Heaven<\/td>\n<td style=\"width: 22.0625px; text-align: center;\">22<\/td>\n<td style=\"width: 89.1719px; text-align: center;\">Pennsylvania<\/td>\n<td style=\"width: 54.7656px; text-align: center;\">10<\/td>\n<td style=\"width: 43.375px; text-align: center;\">6<\/td>\n<td style=\"width: 42.6875px; text-align: 
center;\">1906<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 56.9219px; text-align: center;\"><strong>2<\/strong><\/td>\n<td style=\"width: 53.375px; text-align: center;\">1930<\/td>\n<td style=\"width: 70.4375px; text-align: center;\">Best actress<\/td>\n<td style=\"width: 87.5px; text-align: center;\">Mary Pickford<\/td>\n<td style=\"width: 158.688px; text-align: center;\">Coquette<\/td>\n<td style=\"width: 22.0625px; text-align: center;\">37<\/td>\n<td style=\"width: 89.1719px; text-align: center;\">Canada<\/td>\n<td style=\"width: 54.7656px; text-align: center;\">4<\/td>\n<td style=\"width: 43.375px; text-align: center;\">8<\/td>\n<td style=\"width: 42.6875px; text-align: center;\">1892<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 56.9219px; text-align: center;\"><strong>3<\/strong><\/td>\n<td style=\"width: 53.375px; text-align: center;\">1931<\/td>\n<td style=\"width: 70.4375px; text-align: center;\">Best actress<\/td>\n<td style=\"width: 87.5px; text-align: center;\">Norma Shearer<\/td>\n<td style=\"width: 158.688px; text-align: center;\">The Divorcee<\/td>\n<td style=\"width: 22.0625px; text-align: center;\">28<\/td>\n<td style=\"width: 89.1719px; text-align: center;\">Canada<\/td>\n<td style=\"width: 54.7656px; text-align: center;\">8<\/td>\n<td style=\"width: 43.375px; text-align: center;\">10<\/td>\n<td style=\"width: 42.6875px; text-align: center;\">1902<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 56.9219px; text-align: center;\"><strong>4<\/strong><\/td>\n<td style=\"width: 53.375px; text-align: center;\">1932<\/td>\n<td style=\"width: 70.4375px; text-align: center;\">Best actress<\/td>\n<td style=\"width: 87.5px; text-align: center;\">Marie Dressler<\/td>\n<td style=\"width: 158.688px; text-align: center;\">Min and Bill<\/td>\n<td style=\"width: 22.0625px; text-align: center;\">63<\/td>\n<td style=\"width: 89.1719px; text-align: center;\">Canada<\/td>\n<td style=\"width: 54.7656px; text-align: center;\">11<\/td>\n<td style=\"width: 43.375px; text-align: 
center;\">9<\/td>\n<td style=\"width: 42.6875px; text-align: center;\">1868<\/td>\n<\/tr>\n<tr>\n<td style=\"width: 56.9219px; text-align: center;\"><strong>5<\/strong><\/td>\n<td style=\"width: 53.375px; text-align: center;\">1933<\/td>\n<td style=\"width: 70.4375px; text-align: center;\">Best actress<\/td>\n<td style=\"width: 87.5px; text-align: center;\">Helen Hayes<\/td>\n<td style=\"width: 158.688px; text-align: center;\">The Sin of Madelon Claudet<\/td>\n<td style=\"width: 22.0625px; text-align: center;\">32<\/td>\n<td style=\"width: 89.1719px; text-align: center;\">Washington DC<\/td>\n<td style=\"width: 54.7656px; text-align: center;\">10<\/td>\n<td style=\"width: 43.375px; text-align: center;\">10<\/td>\n<td style=\"width: 42.6875px; text-align: center;\">1900<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>The following is the <strong>data dictionary<\/strong> for the variables in the table:<\/p>\n<ul>\n<li><strong><em>oscar_no<\/em><\/strong>: Oscar ceremony number<\/li>\n<li><strong><em>oscar_yr<\/em><\/strong>: Year of the Oscar ceremony<\/li>\n<li><strong><em>award<\/em><\/strong>: Best Actress or Best Actor<\/li>\n<li><strong><em>name<\/em><\/strong>: Name of award recipient<\/li>\n<li><strong><em>movie<\/em><\/strong>: Name of movie<\/li>\n<li><strong><em>age<\/em><\/strong>: Age of award recipient<\/li>\n<li><strong><em>birth_pl<\/em><\/strong>: Birth place of award recipient<\/li>\n<li><strong><em>birth_mo<\/em><\/strong>: Birth month of award recipient<\/li>\n<li><strong><em>birth_d<\/em><\/strong>: Birth day of award recipient<\/li>\n<li><strong><em>birth_y<\/em><\/strong>: Birth year of award recipient<\/li>\n<\/ul>\n<div class=\"textbox tryit\">\n<h3>quantitative versus categorical variables<\/h3>\n<p><span style=\"background-color: #ffff00;\">[Insert a short video ( &lt; 30 seconds) introducing the features of quantitative variables vs categorical in a data table or data dictionary (this extends the understanding obtained in 1C of identifying them 
from a list of words. The confusing variables in the data dictionary above include oscar_yr and birth_mo, which will appear to be numerical to students.]\u00a0<\/span><\/p>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>question 1<\/h3>\n<p><iframe loading=\"lazy\" id=\"ohm240827\" class=\"resizable\" src=\"https:\/\/ohm.lumenlearning.com\/multiembedq.php?id=240827&theme=oea&iframe_resize_id=ohm240827\" width=\"100%\" height=\"150\"><\/iframe><\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q95535\">Hint<\/span><\/p>\n<div id=\"q95535\" class=\"hidden-answer\" style=\"display: none\">Recall that quantitative variables have numerical meaning. For which variable listed could you add the values or find their average?<\/div>\n<\/div>\n<\/div>\n<h2 id=\"QuantDisplay\">Quantitative Displays<\/h2>\n<p>Earlier, you learned which kinds of graphs make good visualizations for categorical data. Just as certain graphs are useful for displaying data across categories (pie chart, bar graph, side-by-side and stacked bar graphs), others are especially well suited to quantitative data distributions. Categorical displays won&#8217;t work for quantitative data and vice-versa.<\/p>\n<h3 id=\"QuantGraph\">Graphs and Charts<\/h3>\n<p>In the future, you may need to choose a display based on the type of data distribution you have, so it is important to know which display works for the type of data you have. Remind yourself in the example below of which graphs and charts you have used to display categorical variables then answer the following question about quantitative displays.<\/p>\n<div class=\"textbox exercises\">\n<h3>Example<\/h3>\n<p>Some graphs and charts can be used to display distributions of categorical variables. Others work for displaying quantitative variables. 
Which of the graphs and charts below did you use in previous sections to display categorical variables?<\/p>\n<ol>\n<li>Pie chart<\/li>\n<li><span style=\"font-size: 1rem; orphans: 1; text-align: initial;\">Bar chart<\/span><\/li>\n<li><span style=\"font-size: 1rem; orphans: 1; text-align: initial;\">Dotplot<\/span><\/li>\n<li><span style=\"font-size: 1rem; orphans: 1; text-align: initial;\">Histogram<\/span><\/li>\n<\/ol>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q382376\">Show Solution<\/span><\/p>\n<div id=\"q382376\" class=\"hidden-answer\" style=\"display: none\">1 and 2. Pie charts and bar charts were used to display categorical variables.<\/div>\n<\/div>\n<\/div>\n<p>Now answer Question 2, about quantitative displays.<\/p>\n<div class=\"textbox key-takeaways\">\n<h3>question 2<\/h3>\n<p><iframe loading=\"lazy\" id=\"ohm240828\" class=\"resizable\" src=\"https:\/\/ohm.lumenlearning.com\/multiembedq.php?id=240828&theme=oea&iframe_resize_id=ohm240828\" width=\"100%\" height=\"150\"><\/iframe><\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q683208\">Hint<\/span><\/p>\n<div id=\"q683208\" class=\"hidden-answer\" style=\"display: none\">Think about which graphs or charts are appropriate for distributions of categorical variables.<\/div>\n<\/div>\n<\/div>\n<p>We know that pie charts and bar charts (and side-by-side and stacked bar charts) are used to display categorical distributions. <strong>Histograms<\/strong> and <strong>dotplots<\/strong> are appropriate for displaying quantitative data.<\/p>\n<ul>\n<li><strong>Dotplots<\/strong> display how many individual observations there are of each value observed. Each observation in the data set appears as its own dot on the graph. 
A large number of observations could overwhelm the display so dotplots work well when the data set is small.<\/li>\n<li><strong>Histograms<\/strong> are good choices for displaying data sets that have a large number of observations since they group observations into equal-size &#8220;bins.&#8221; The bins can include any interval of values desired, so a histogram will not be overwhelmed by a large number of observations in a data set.<\/li>\n<\/ul>\n<h2 id=\"Histograms\">Histograms<\/h2>\n<p>We&#8217;ve seen that a\u00a0<strong>histogram <\/strong>is a graphical display used to visualize the distribution of a quantitative variable, and we know that it is a good choice to use when there are a large number of observations in the data set, which is why histograms are commonly used for quantitative distributions. Let&#8217;s take a closer look at how a histogram is created before using the tool to create one ourselves.<\/p>\n<div class=\"textbox tryit\">\n<h3>creating a histogram<\/h3>\n<p><span style=\"background-color: #99cc00;\">[Perspective Video &#8211; a 3-instructors video demonstrating how to create a histogram for a variable from an copy-and-pasted data set, covering the features of a histogram, especially including binwidth and endpoints. Be sure to point out that after students can also select &#8220;dotplot&#8221; in the tool to change the type of graph. Include a statement or two comparing and contrasting the two graphs. Boxplots have not been studied yet, so there is no need to compare them as well.]<\/span><\/p>\n<\/div>\n<p>We can use the &#8220;Best Actress\/Actor&#8221; data table as a resource to learn more about the features of a histogram. 
Below, see a histogram of the variable\u00a0<em>age<\/em> from the data set.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-965\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2022\/01\/11181832\/Picture101-300x133.png\" alt=\"A bar graph of Best Actress and Best Actors Winners by age. The vertical axis is labeled &quot;Count&quot; and numbered in increments of 10 up to 30 and the horizontal is labeled &quot;Age of Best Actress and Best Actor Winners.&quot; The bar for ages 20-24 goes approximately two thirds of the way to the line at 10. The bar for ages 25-29 goes approximately three quarters of the way to the line at 30. The bar for ages 30-34 goes approximately one fifth of an increment above the line at 30. The bar for ages 35-39 goes approximately one fifth of an increment above the line at 30. The bar for ages 40-44 goes approximately one half of an increment above the line at 30. The bar for ages 45-49 goes approximately one tenth of an increment below the line at 20. The bar for ages 50-54 goes approximately one fifth of an increment above the line at 10. The bar for ages 55-59 goes approximately halfway to the line at 10. The bar for ages 60-64 goes approximately one third of an increment above the line at 10. The bar for ages 65-69 is at zero. The bar for ages 70-74 goes approximately one tenth of an increment above the line at 0. The bar for ages 75-79 goes approximately one tenth of an increment above the line at 0. The bar for ages 80-84 goes approximately one tenth of an increment above the line at 0.\" width=\"873\" height=\"387\" \/><\/p>\n<p>Similar to a bar graph, the height of each bar shows the number of observations within each \u201cbin\u201d (these would be the categories in the bar graph). A <strong>bin <\/strong>is a range of values that the quantitative variable can take. For example, the first bin on the histogram above is [20,25). 
The height of this bar shows there are six actors or actresses with ages that fall in this bin.<\/p>\n<p>A bin can be defined by its <strong>end points<\/strong>, the smallest and largest values of the quantitative variable represented in the bin. For the first bin [20,25), the end points are\u00a0[latex]20[\/latex] and\u00a0[latex]25[\/latex]. The notation [20,25) means this bin includes observations with ages that are at least\u00a0[latex]20[\/latex] and less than\u00a0[latex]25[\/latex].<\/p>\n<p>Questions 3 and 4 below will help further understand the bins of a histogram.<\/p>\n<div class=\"textbox key-takeaways\">\n<h3>question 3<\/h3>\n<p><iframe loading=\"lazy\" id=\"ohm240830\" class=\"resizable\" src=\"https:\/\/ohm.lumenlearning.com\/multiembedq.php?id=240830&theme=oea&iframe_resize_id=ohm240830\" width=\"100%\" height=\"150\"><\/iframe><\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q939927\">Hint<\/span><\/p>\n<div id=\"q939927\" class=\"hidden-answer\" style=\"display: none\">The notation [35,40) indicates an interval of all values between\u00a0[latex]35[\/latex] and\u00a0[latex]40[\/latex]. The bracket is used to include an endpoint. A parenthesis is used to exclude an endpoint.<\/div>\n<\/div>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>question 4<\/h3>\n<p><iframe loading=\"lazy\" id=\"ohm240831\" class=\"resizable\" src=\"https:\/\/ohm.lumenlearning.com\/multiembedq.php?id=240831&theme=oea&iframe_resize_id=ohm240831\" width=\"100%\" height=\"150\"><\/iframe><\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q853664\">Hint<\/span><\/p>\n<div id=\"q853664\" class=\"hidden-answer\" style=\"display: none\">What is the difference between\u00a0[latex]40[\/latex] and\u00a0[latex]35[\/latex]? 
Does it appear that each bin has the same width?<\/div>\n<\/div>\n<\/div>\n<h3 id=\"Using a Data Analysis Tool to Create Histograms\">Creating histograms with technology<\/h3>\n<div class=\"textbox\">\n<p>Go to the <em>Describing and Exploring Quantitative Variables<\/em> tool at <a href=\"https:\/\/istats.shinyapps.io\/EDA_quantitative\/\" target=\"_blank\" rel=\"noopener\">https:\/\/istats.shinyapps.io\/EDA_quantitative\/<\/a> and create a histogram for the distribution of <em>age<\/em> of the\u00a0[latex]184[\/latex] Best Actress\/Actor winners, following the steps below:<\/p>\n<p style=\"padding-left: 30px;\">Step 1) Select the <strong>Single Group<\/strong> tab<\/p>\n<p style=\"padding-left: 30px;\">Step 2) Locate the dropdown under <strong>Enter Data<\/strong> and select <strong>Your Own<\/strong>.<\/p>\n<p style=\"padding-left: 30px;\">Step 3) For <strong>Do you have<\/strong>: select <strong>Individual Observations<\/strong>.<\/p>\n<p style=\"padding-left: 30px;\">Step 4) In the <strong>Name of Variable<\/strong> box, type &#8220;Age<em>&#8220;<\/em>.<\/p>\n<p style=\"padding-left: 30px;\">Step 5) Download the\u00a0<a href=\"https:\/\/docs.google.com\/spreadsheets\/d\/1qitPXlvaiNkCpL4wLkC349rkpnuisCTUoJ87mKkcTFU\/edit#gid=0\" target=\"_blank\" rel=\"noopener\">Oscars_Age spreadsheet<\/a>\u00a0and copy and paste the <em>age<\/em> data.<\/p>\n<p style=\"padding-left: 30px;\">Step 6) Locate <strong>Choose Type of Plot<\/strong>\u00a0and choose <strong>Histogram<\/strong>. 
Unselect any other types.<\/p>\n<p style=\"padding-left: 30px;\">Step 7) <strong>Select Binwidth For Histogram<\/strong> to 5.<\/p>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>question 5<\/h3>\n<p><iframe loading=\"lazy\" id=\"ohm240607\" class=\"resizable\" src=\"https:\/\/ohm.lumenlearning.com\/multiembedq.php?id=240607&theme=oea&iframe_resize_id=ohm240607\" width=\"100%\" height=\"150\"><\/iframe><\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q30112\">Hint<\/span><\/p>\n<div id=\"q30112\" class=\"hidden-answer\" style=\"display: none\">You can scroll through the Individual Observations box to check the accuracy of the data you typed or pasted. Double-check the correct binwidth was selected.<\/div>\n<\/div>\n<\/div>\n<h3>Interpreting histograms<\/h3>\n<div class=\"textbox tryit\">\n<h3>reading and interpreting histograms<\/h3>\n<p><strong><span style=\"background-color: #99cc00;\">[Worked Example &#8212; a 3-instructors worked example of reading and interpreting histograms with different binwidths &#8212; showing which binwidth seems &#8220;better&#8221; for answering certain questions about the distribution. )<\/span><\/strong><\/p>\n<\/div>\n<p>Use the histogram you created to answer the following questions. 
(Hint: Hover over the histogram to get the exact height of each bar.)<\/p>\n<div class=\"textbox key-takeaways\">\n<h3>question 6<\/h3>\n<p><iframe loading=\"lazy\" id=\"ohm240832\" class=\"resizable\" src=\"https:\/\/ohm.lumenlearning.com\/multiembedq.php?id=240832&theme=oea&iframe_resize_id=ohm240832\" width=\"100%\" height=\"150\"><\/iframe><\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q853644\">Hint<\/span><\/p>\n<div id=\"q853644\" class=\"hidden-answer\" style=\"display: none\">Look for the bin that contains the interval of ages [20,25).<\/div>\n<\/div>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>question 7<\/h3>\n<p><iframe loading=\"lazy\" id=\"ohm240833\" class=\"resizable\" src=\"https:\/\/ohm.lumenlearning.com\/multiembedq.php?id=240833&theme=oea&iframe_resize_id=ohm240833\" width=\"100%\" height=\"150\"><\/iframe><\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q796786\">Hint<\/span><\/p>\n<div id=\"q796786\" class=\"hidden-answer\" style=\"display: none\">Look for the bin that contains the interval of ages [50,70).<\/div>\n<\/div>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>question 8<\/h3>\n<p><iframe loading=\"lazy\" id=\"ohm240834\" class=\"resizable\" src=\"https:\/\/ohm.lumenlearning.com\/multiembedq.php?id=240834&theme=oea&iframe_resize_id=ohm240834\" width=\"100%\" height=\"150\"><\/iframe><\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q235142\">Hint<\/span><\/p>\n<div id=\"q235142\" class=\"hidden-answer\" style=\"display: none\">Recall that a proportion is some part out of the whole. 
Hover over the bins to reveal the count of the observations contained.<\/div>\n<\/div>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>question 9<\/h3>\n<p><iframe loading=\"lazy\" id=\"ohm240836\" class=\"resizable\" src=\"https:\/\/ohm.lumenlearning.com\/multiembedq.php?id=240836&theme=oea&iframe_resize_id=ohm240836\" width=\"100%\" height=\"150\"><\/iframe><\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q666392\">Hint<\/span><\/p>\n<div id=\"q666392\" class=\"hidden-answer\" style=\"display: none\">What do <em>you<\/em> think?<\/div>\n<\/div>\n<\/div>\n<h3 id=\"Affects of Bin Widths on Histograms\">Bin Width<\/h3>\n<p>Using a different bin width for the histogram can change the features of the distribution we are able to see from the graphical display.<\/p>\n<div class=\"textbox key-takeaways\">\n<h3>question 10<\/h3>\n<p><iframe loading=\"lazy\" id=\"ohm240608\" class=\"resizable\" src=\"https:\/\/ohm.lumenlearning.com\/multiembedq.php?id=240608&theme=oea&iframe_resize_id=ohm240608\" width=\"100%\" height=\"150\"><\/iframe><\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q106293\">Hint<\/span><\/p>\n<div id=\"q106293\" class=\"hidden-answer\" style=\"display: none\">Set the binwidth to 20.<\/div>\n<\/div>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>question 11<\/h3>\n<p><iframe loading=\"lazy\" id=\"ohm240609\" class=\"resizable\" src=\"https:\/\/ohm.lumenlearning.com\/multiembedq.php?id=240609&theme=oea&iframe_resize_id=ohm240609\" width=\"100%\" height=\"150\"><\/iframe><\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q935450\">Hint<\/span><\/p>\n<div id=\"q935450\" class=\"hidden-answer\" style=\"display: none\">Set the binwidth to 1.<\/div>\n<\/div>\n<\/div>\n<div class=\"textbox 
">
key-takeaways\">\n<h3>question 12<\/h3>\n<p><iframe loading=\"lazy\" id=\"ohm240838\" class=\"resizable\" src=\"https:\/\/ohm.lumenlearning.com\/multiembedq.php?id=240838&theme=oea&iframe_resize_id=ohm240838\" width=\"100%\" height=\"150\"><\/iframe><\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q32745\">Hint<\/span><\/p>\n<div id=\"q32745\" class=\"hidden-answer\" style=\"display: none\">Change the binwidths in the tool to make the determination.<\/div>\n<\/div>\n<\/div>\n<h2>Dotplots<\/h2>\n<p>In a previous activity, you created a dotplot, a graphical display for quantitative data where each dot represents a single observation in a data set.\u00a0Dotplots are useful for visualizing distributions when the data set is small.<\/p>\n<p>There aren&#8217;t as many features to understand about a dotplot as there are with histograms. We&#8217;ll begin our exploration by creating one with the tool and then reading and interpreting it.<\/p>\n<h3 id=\"Using a Data Analysis Tool to Create Dotplots\">Creating dotplots<\/h3>\n<p>We&#8217;ll use a dotplot to visualize the same\u00a0distribution of <em>age<\/em> of Best Actress\/Actor winners.<\/p>\n<div class=\"textbox\">\n<p>With the same tool open that you used to create the histogram (or by following Steps 1 &#8211; 4 above), check the \u201c<strong>Dotplot<\/strong>\u201d box. 
Use <strong>dotsize = 1<\/strong> and <strong>bin width = 1<\/strong>.<\/p>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>question 13<\/h3>\n<p><iframe loading=\"lazy\" id=\"ohm240611\" class=\"resizable\" src=\"https:\/\/ohm.lumenlearning.com\/multiembedq.php?id=240611&theme=oea&iframe_resize_id=ohm240611\" width=\"100%\" height=\"150\"><\/iframe><\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q24307\">Hint<\/span><\/p>\n<div id=\"q24307\" class=\"hidden-answer\" style=\"display: none\">Check that the dotsize and binwidth are both set to 1.<\/div>\n<\/div>\n<\/div>\n<h3 id=\"Dotplots\">Interpreting dotplots<\/h3>\n<p><span style=\"background-color: #ffff00;\">At this point, students will be presented with two datasets. They will be able to choose which one they would like to use to answer example questions before creating dotplots using the data analysis tool.<\/span><\/p>\n<p>Reading a dotplot is much the same as reading a histogram. The horizontal axis contains the range of all possible values of the variable and the vertical axis marks the number of each of those values observed. The difference is that a dotplot shows distinct counts of each value or binned value, one dot per observation. The example below demonstrates how to read and interpret a dotplot.<\/p>\n<div class=\"textbox exercises\">\n<h3>Example<\/h3>\n<p>Let&#8217;s say that a marketing firm is interested in the age in years of the typical automobile driven by college students. Survey responses from 130 college students were collected and are displayed in the dot plot below.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-1019\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5772\/2022\/02\/11201824\/AgeofCar_Dotplot.jpg\" alt=\"a dotplot with age of car in years labeled on horizontal axis, ranging from 1 to 16. 
The following dots are displayed above each of the ages: 1 = 6, 2 = 7, 3 = 14, 4 = 9, 5 = 10, 6 = 21, 7 = 20, 8 = 16, 9 = 7, 10 = 8, 11 = 4, 12 = 4, 13 = 1, 14 = 1, 16 = 2\" width=\"1789\" height=\"590\" \/><\/p>\n<ol>\n<li>What does each dot on the graph represent?<\/li>\n<li>How many students reported driving a car more than 12 years old?<\/li>\n<li>How many students reported driving a car that was 10 years old?<\/li>\n<li>Did more students report driving a car under 2 years old or one 6 years old?<\/li>\n<\/ol>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q945832\">Show Solution<\/span><\/p>\n<div id=\"q945832\" class=\"hidden-answer\" style=\"display: none\">\n<ol>\n<li>Each dot represents one student&#8217;s response with the age of their car in years.<\/li>\n<li>4<\/li>\n<li>8<\/li>\n<li>More reported driving a car that was 6 years old.<\/li>\n<\/ol>\n<\/div>\n<\/div>\n<\/div>\n<p>&nbsp;<\/p>\n<div class=\"textbox key-takeaways\">\n<h3>question 14<\/h3>\n<p><iframe loading=\"lazy\" id=\"ohm240841\" class=\"resizable\" src=\"https:\/\/ohm.lumenlearning.com\/multiembedq.php?id=240841&theme=oea&iframe_resize_id=ohm240841\" width=\"100%\" height=\"150\"><\/iframe><\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q249799\">Hint<\/span><\/p>\n<div id=\"q249799\" class=\"hidden-answer\" style=\"display: none\">Use the dotplot. For example, to determine whether it would be considered &#8220;typical&#8221; for a performer at least\u00a0[latex]70[\/latex] years old to win the award, we would say it is not, because only a small number of observations are located at or above\u00a0[latex]70[\/latex]. 
<\/div>\n<\/div>\n<\/div>\n<h2 id=\"Drawing Conclusions about Larger Populations\">Looking Ahead: Drawing Conclusions about Larger Populations<\/h2>\n<p>Earlier in the course, you saw a brief introduction to statistical inference, the process of making inferences about a <strong>population<\/strong> based on data collected from a <strong>sample<\/strong> of that population. We&#8217;ll study it in greater detail later, but it will be helpful to consider the idea of a <strong>representative sample<\/strong> from time to time along the way. You learned in section 2A that a sampling method is considered biased if it has a tendency to produce samples that are not representative of the population. When that happens, we cannot <strong>generalize<\/strong> our results to the population and can only make statements about the sample itself.<\/p>\n<div class=\"textbox examples\">\n<h3>Recall<\/h3>\n<p>Core skill:<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q908567\">Understand the difference between a sample and a population.\u00a0<\/span><\/p>\n<div id=\"q908567\" class=\"hidden-answer\" style=\"display: none\">\n<p>A <strong>population <\/strong>is the group of individuals or entities that our research or survey question pertains to.<\/p>\n<p>A\u00a0<strong>sample<\/strong> is a group of individuals or entities on which we collect data.<\/p>\n<p>A sample is\u00a0<strong>representative<\/strong> of the population if the characteristics of the sample tend to match the characteristics of the population.<\/p>\n<p>If a sample is not representative of the population, we cannot\u00a0<strong>generalize <\/strong>the result of our analysis\u00a0from the sample to the population.<\/div>\n<\/div>\n<\/div>\n<p>The question below will help you to develop your understanding of when you can use the results of an analysis to make statements about some larger population of which your sample is a 
subset.<\/p>\n<p>To answer this question, consider that the data set we&#8217;ve explored in this section,\u00a0<em>\u201cBest Actress\/Actor\u201d for the\u00a0[latex]184[\/latex] winners from 1929 to 2018,\u00a0<\/em>includes observations on people who won this award over an\u00a0[latex]89[\/latex]-year span. The people about whom data was collected are also members of the set of all Oscar winners in the timespan, which is itself a subset of all Hollywood film actors.<\/p>\n<div class=\"textbox key-takeaways\">\n<h3>question 15<\/h3>\n<p><iframe loading=\"lazy\" id=\"ohm240842\" class=\"resizable\" src=\"https:\/\/ohm.lumenlearning.com\/multiembedq.php?id=240842&theme=oea&iframe_resize_id=ohm240842\" width=\"100%\" height=\"150\"><\/iframe><\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q714373\">Hint<\/span><\/p>\n<div id=\"q714373\" class=\"hidden-answer\" style=\"display: none\">What is the population of interest? What sample was used to make your plots?<\/div>\n<\/div>\n<\/div>\n<p>In the next activity, we&#8217;ll continue this theme by talking about the runtime of well-loved movies. Get ready by thinking about those movies you could watch over and over.\u00a0Look up the \u201cruntime\u201d\u00a0(length of the movie in minutes) of your favorite movies to compare with others in the next activity.<\/p>\n<div><span style=\"font-size: 1rem; orphans: 1; text-align: initial;\">To find the runtime of your favorite movie:<\/span><\/div>\n<div>\n<ul>\n<li>Navigate to <a href=\"https:\/\/www.imdb.com\/\">https:\/\/www.imdb.com\/<\/a>.<\/li>\n<li>Type your favorite movie in the search bar. Select the title.<\/li>\n<li>Convert the runtime into minutes and record that value.<\/li>\n<\/ul>\n<p>For example, if your favorite movie is Happy Gilmore, the runtime is listed as one hour,\u00a0[latex]32[\/latex] minutes. 
Therefore, the runtime that you will record is\u00a0[latex]92[\/latex] minutes.<\/p>\n<div class=\"textbox key-takeaways\">\n<h3>question 16<\/h3>\n<p><iframe loading=\"lazy\" id=\"ohm240843\" class=\"resizable\" src=\"https:\/\/ohm.lumenlearning.com\/multiembedq.php?id=240843&theme=oea&iframe_resize_id=ohm240843\" width=\"100%\" height=\"150\"><\/iframe><\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q690809\">Hint<\/span><\/p>\n<div id=\"q690809\" class=\"hidden-answer\" style=\"display: none\">Look up your favorite movie on IMDb.com.<\/div>\n<\/div>\n<\/div>\n<\/div>\n<h2>Summary<\/h2>\n<p>In this section, you&#8217;ve had a chance to practice the tasks that will be essential to forming deeper connections in the next activity. This is a good time to sum it all up before moving on.<\/p>\n<ul>\n<li>In Questions 1, 2, and 3, you identified quantitative variables and the plots used to visualize their distributions.<\/li>\n<li>In Questions 4, 5, 6, and 14, you used technology to make a plot of the distribution of a quantitative variable.<\/li>\n<li>In Questions 7 &#8211; 10, you used a histogram to describe a distribution.<\/li>\n<li>In Questions 11, 12, and 13, you explored how bin width affects a histogram.<\/li>\n<li>In Question 15, you used a dotplot to describe a distribution.<\/li>\n<li>In Question 16, you identified the population and the sample.<\/li>\n<li>In Question 16, you considered limitations on the scope of analysis based on the sample data.<\/li>\n<\/ul>\n<p>This section gave you an opportunity to see that dotplots and histograms are good ways to visualize quantitative data. You also practiced changing the bin width of a histogram to see how it affects the information displayed. Finally, you needed to differentiate between the population and the sample to discuss possible limitations on the scope of an analysis of sample data. 
If you feel comfortable with these ideas, please move on to the next activity in Forming Connections.<\/p>\n<hr class=\"before-footnotes clear\" \/><div class=\"footnotes\"><ol><li id=\"footnote-247-1\"><em>Oscar winners, 1929 to 2018<\/em>. (n.d.). OpenIntro. Retrieved from https:\/\/www.openintro.org\/data\/index.php?data=oscars <a href=\"#return-footnote-247-1\" class=\"return-footnote\" aria-label=\"Return to footnote 1\">&crarr;<\/a><\/li><\/ol><\/div>","protected":false},"author":175116,"menu_order":14,"template":"","meta":{"_candela_citation":"[]","CANDELA_OUTCOMES_GUID":"","pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[],"contributor":[],"license":[],"class_list":["post-247","chapter","type-chapter","status-publish","hentry"],"part":3,"_links":{"self":[{"href":"https:\/\/courses.lumenlearning.com\/exemplarstatistics\/wp-json\/pressbooks\/v2\/chapters\/247","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/courses.lumenlearning.com\/exemplarstatistics\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/courses.lumenlearning.com\/exemplarstatistics\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/exemplarstatistics\/wp-json\/wp\/v2\/users\/175116"}],"version-history":[{"count":14,"href":"https:\/\/courses.lumenlearning.com\/exemplarstatistics\/wp-json\/pressbooks\/v2\/chapters\/247\/revisions"}],"predecessor-version":[{"id":1196,"href":"https:\/\/courses.lumenlearning.com\/exemplarstatistics\/wp-json\/pressbooks\/v2\/chapters\/247\/revisions\/1196"}],"part":[{"href":"https:\/\/courses.lumenlearning.com\/exemplarstatistics\/wp-json\/pressbooks\/v2\/parts\/3"}],"metadata":[{"href":"https:\/\/courses.lumenlearning.com\/exemplarstatistics\/wp-json\/pressbooks\/v2\/chapters\/247\/metadata\/"}],"wp:attachment":[{"href":"https:\/\/courses.lumenlearning.com\/exemplarstatistics\/wp-json\/wp\/v2\/media?parent=247
"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/exemplarstatistics\/wp-json\/pressbooks\/v2\/chapter-type?post=247"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/exemplarstatistics\/wp-json\/wp\/v2\/contributor?post=247"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/exemplarstatistics\/wp-json\/wp\/v2\/license?post=247"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}