{"id":297,"date":"2021-10-27T21:44:56","date_gmt":"2021-10-27T21:44:56","guid":{"rendered":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/?post_type=chapter&#038;p=297"},"modified":"2022-02-17T20:08:07","modified_gmt":"2022-02-17T20:08:07","slug":"forming-connections-with-comparing-quantitative-distributions-3e","status":"publish","type":"chapter","link":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/chapter\/forming-connections-with-comparing-quantitative-distributions-3e\/","title":{"raw":"Forming Connections in Comparing Quantitative Distributions: 3E - 15","rendered":"Forming Connections in Comparing Quantitative Distributions: 3E &#8211; 15"},"content":{"raw":"<div class=\"textbox learning-objectives\">\r\n<h3>objectives for this activity<\/h3>\r\nDuring this activity, you will:\r\n<ul>\r\n \t<li><a href=\"#Comparing Distributions Across Groups\">Summarize a comparison of quantitative distributions across groups.<\/a><\/li>\r\n<\/ul>\r\nClick on the skill above to jump to its location in this activity.\r\n\r\n<\/div>\r\n<h2>Decisions, Decisions, Decisions<\/h2>\r\n<img class=\"aligncenter wp-image-982\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2022\/01\/11185526\/Picture20-215x300.jpg\" alt=\"A woman in a wheelchair holding a diploma in one hand and raising a graduation cap in the other. \" width=\"231\" height=\"322\" \/>\r\n\r\nNow that you've had a chance to practice using technology to create graphs and compare distributions of quantitative variables, let's put it all together to see how histograms and dotplots can be used to compare distributions of a quantitative variable across groups.\r\n\r\nTo do so, we'll consider median salary levels for recent college graduates. Before we get started, think for a moment about the reasons why a college student might choose a particular major. 
Some may choose a major based primarily on interests, and others choose a major based on its job prospects.\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 1<\/h3>\r\nWhat is your major, or what major are you thinking about choosing? What do you think the job prospects are for your major?\r\n\r\n<\/div>\r\n<div class=\"textbox tryit\">\r\n<h3>video placement<\/h3>\r\n<span style=\"background-color: #e6daf7;\">[<strong>Intro<\/strong>: \"What variables may play a role in a student's choice of major? Maybe what percent of students get a job in their major? Starting salary? Think about how we might be able to visualize data collected from students with different majors who answer these questions.\u00a0 In this activity, we'll see how stacked histograms and side-by-side dotplots can be used to compare distributions of a quantitative variable like median salary levels across several groups, in this case major categories. You will be able to compare the center, shape, and spread of the quantitative variable across the groups using the graphical displays. Before we begin, let's take a look at the dataset together. [display image of the dataset Salary Levels of College Majors as shown on the page below] These are a few lines from the dataset. You can see that each major category, like Business, contains several college majors, like Accounting, Actuarial Science, Finance, and so on. And each of those majors has a median salary associated with it. What do you think the observational units are in this dataset? That is, on what entities are we collecting the information about the major category and the median salary? It may be tempting to put yourself in this picture and think the entities are college graduates who have received their first job. But the observational unit is not a person. You'll give your answer to that question below. 
\"]<\/span>\r\n\r\n<\/div>\r\nIn this activity, you will explore the distribution of median salary levels of college majors across different major categories for recent college graduates in 2011. For each college major in the table, the median salary and major category is listed. A small part of the data is shown in a table below. For example, the Business major category includes college majors such as actuarial science, finance, and business economics. The major categories (<em>Major_category<\/em>) included in the complete dataset are: Agriculture &amp; Natural Resources, Arts, Biology &amp; Life Sciences, Business, Communications &amp; Journalism, Computers &amp; Mathematics, Education, Engineering, Health, Humanities &amp; Liberal Arts, Industrial Arts &amp; Consumer Services, Law &amp; Public Policy, Physical Sciences, Psychology &amp; Social Work, and Social Science.[footnote]American Community Survey 2010-2012 Public Use Microdata Series. n.d.). <em>College majors<\/em>. Github. https:\/\/github.com\/fivethirtyeight\/data\/tree\/master\/college-majors.[\/footnote]\r\n\r\nThe following table displays a subset (a few rows) of the data.\r\n<div align=\"center\">\r\n<table>\r\n<tbody>\r\n<tr>\r\n<td style=\"text-align: center;\" colspan=\"3\"><strong>Salary Levels of College Majors<\/strong><\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center;\"><strong>Major<\/strong><\/td>\r\n<td style=\"text-align: center;\"><strong>Major_category<\/strong><\/td>\r\n<td style=\"text-align: center;\"><strong>Median_salary<\/strong><\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center;\">ACCOUNTING<\/td>\r\n<td style=\"text-align: center;\">Business<\/td>\r\n<td style=\"text-align: center;\">45000<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center;\">ACTUARIAL SCIENCE<\/td>\r\n<td style=\"text-align: center;\">Business<\/td>\r\n<td style=\"text-align: center;\">62000<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center;\">FINANCE<\/td>\r\n<td 
style=\"text-align: center;\">Business<\/td>\r\n<td style=\"text-align: center;\">47000<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center;\">GENERAL BUSINESS<\/td>\r\n<td style=\"text-align: center;\">Business<\/td>\r\n<td style=\"text-align: center;\">40000<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center;\">HOSPITALITY MANAGEMENT<\/td>\r\n<td style=\"text-align: center;\">Business<\/td>\r\n<td style=\"text-align: center;\">33000<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center;\">MARKETING AND MARKETING RESEARCH<\/td>\r\n<td style=\"text-align: center;\">Business<\/td>\r\n<td style=\"text-align: center;\">38000<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center;\">MISCELLANEOUS BUSINESS AND MEDICAL ADMINISTRATION<\/td>\r\n<td style=\"text-align: center;\">Business<\/td>\r\n<td style=\"text-align: center;\">40000<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center;\">OPERATIONS LOGISTICS AND E-COMMERCE<\/td>\r\n<td style=\"text-align: center;\">Business<\/td>\r\n<td style=\"text-align: center;\">50000<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center;\">AEROSPACE ENGINEERING<\/td>\r\n<td style=\"text-align: center;\">Engineering<\/td>\r\n<td style=\"text-align: center;\">60000<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<div style=\"text-align: left;\"><\/div>\r\n<div class=\"textbox examples\">\r\n<h3>recall<\/h3>\r\nRecall from [Forming Connections: 1C] that we record information about variables of interest\u00a0on each observational unit to form the dataset.\r\n\r\nCore skill:\r\n[reveal-answer q=\"407851\"]Define the terms\u00a0<em>observational unit<\/em> and\u00a0<em>variable<\/em>[\/reveal-answer]\r\n[hidden-answer a=\"407851\"]\r\n\r\n<strong>Observational Units<\/strong>: individuals or items whose characteristics we are interested in.\r\n\r\n<strong>Variables<\/strong>: The characteristics we record on the observational units. 
These may be quantitative or categorical variables.[\/hidden-answer]\r\n\r\n<\/div>\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 2<\/h3>\r\nWhat are the observational units in the dataset above?\r\n\r\na) College students\r\n\r\nb) College majors\r\n\r\nc) Median salaries\r\n\r\nd) College graduates\r\n\r\n[reveal-answer q=\"416000\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"416000\"]Note that there are two characteristics noted for each of the observational units in the dataset.[\/hidden-answer]\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 3<\/h3>\r\nWhich type of variable is median salary?\r\n\r\na) categorical\r\nb) quantitative\r\n\r\n[reveal-answer q=\"850781\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"850781\"]See 1C, <a href=\"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/chapter\/summary-of\/\">3A<\/a>, and <a href=\"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/chapter\/summary-of-exploring-the-influence-of-outliers-on-measures-of-center\/\">3C<\/a> for definitions of these types of variables.<span style=\"background-color: #ffff00;\">[This hint can link to the summary pages for 1C, 3A, and 3C]<\/span> [\/hidden-answer]\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 4<\/h3>\r\nWhich type of variable is major category?\r\n\r\na) categorical\r\nb) quantitative\r\n\r\n[reveal-answer q=\"773626\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"773626\"]See 1C, <a href=\"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/chapter\/summary-of\/\">3A<\/a>, and <a href=\"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/chapter\/summary-of-exploring-the-influence-of-outliers-on-measures-of-center\/\">3C<\/a> for definitions of these types of variables. 
<span style=\"background-color: #ffff00;\">[This hint can link to the summary pages for 1C, 3A, and 3C]<\/span>[\/hidden-answer]\r\n\r\n<\/div>\r\n<h2 id=\"Comparing Distributions Across Groups\">Comparing Distributions Across Groups<\/h2>\r\nNext, let's go to the dataset in the data analysis tool and create side-by-side dotplots for all the median salaries for each of the major categories. We'll start with a comparison of median salaries for just Business, Engineering, and Education.\r\n<div class=\"textbox\">\r\n\r\nGo to the <em>Describing and Exploring Quantitative Variables<\/em> tool at <a href=\"https:\/\/dcmathpathways.shinyapps.io\/EDA_quantitative\/\">https:\/\/dcmathpathways.shinyapps.io\/EDA_quantitative\/<\/a>. <span style=\"background-color: #00ffff;\">&lt;-- there is a problem with the horizontal axis labels on the dotplot. They are displaying in scientific notation -- e.g., \"2e+04\" (2x10^4) instead of 20,000.\u00a0<\/span>\r\n<p style=\"padding-left: 30px;\">Step 1) Click the Several Groups tab at the top of the page.<\/p>\r\n<p style=\"padding-left: 30px;\">Step 2) Select dataset \"Recent Grads - Salary.\"<\/p>\r\n<p style=\"padding-left: 30px;\">Step 3) Select \u201cDotplot\u201d and adjust the dot size appropriately.<\/p>\r\n\r\n<\/div>\r\nThis process will create a comparative dotplot of the median salaries for each major category: Business, Engineering, and Education. Use the dotplot to approximate the typical median salary for majors in the Business major category, and the typical median salary for majors in the Engineering major category (we'll look at the Education category a little later).\r\n\r\nFor each of the plots, examine just the dotplot (not the descriptive statistics) to answer the questions below.\r\n<div class=\"textbox examples\">\r\n<h3>recall<\/h3>\r\nWhat does a dot on a dotplot represent? 
In particular, what does each dot on the Business dotplot represent?\r\n\r\nCore skill:[reveal-answer q=\"962558\"]Understand how data are represented on a dotplot[\/reveal-answer]\r\n[hidden-answer a=\"962558\"]The dots on a dotplot represent individual observations made on the variable.\r\n\r\nFor example, each dot on the Business dotplot represents a particular median salary for a college major in the Business major category. The $33,000 median salary for Hospitality Management, for instance, is at the far left of the plot, while $62,000 for Actuarial Science is at the far right. [\/hidden-answer]\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 5<\/h3>\r\nWhich dotplot (Business or Engineering) has a greater center?\r\n\r\n[reveal-answer q=\"303985\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"303985\"] Use the dotplot itself, not the descriptive statistics, to answer.\r\n\r\n[\/hidden-answer]\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 6<\/h3>\r\nDoes either category (Business or Engineering) have any outliers in the distribution of median salaries? If so, identify the outlier(s).\r\n\r\n[reveal-answer q=\"988555\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"988555\"] Do there appear to be any unusual dots lying to the far right or far left in either distribution?\r\n\r\n[\/hidden-answer]\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 7<\/h3>\r\nCompare the shapes of just the distributions of median salaries for Business and Engineering. 
Which of these two major categories (Business or Engineering) has a distribution that is more symmetric?\r\n\r\n[reveal-answer q=\"981781\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"981781\"]Recall that a symmetric shape has roughly equal amounts of data to the right and left of the center.\r\n\r\n[\/hidden-answer]\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 8<\/h3>\r\nCompare the spread in median salaries of the two distributions (Business and Engineering). Which major category has greater variability in median salaries?\r\n\r\n[reveal-answer q=\"47630\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"47630\"] Use just the dotplot and not the descriptive statistics to answer this.\r\n\r\n[\/hidden-answer]\r\n\r\n<\/div>\r\nNow let's include Education majors in the comparison.\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 9<\/h3>\r\nWhich of the three major categories (Business, Engineering, or Education) has the least amount of spread in median salaries?\r\n\r\n[reveal-answer q=\"496859\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"496859\"]Use just the dotplot and not the descriptive statistics to answer this.\r\n\r\n[\/hidden-answer]\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 10<\/h3>\r\nCompare the typical median salary of the three distributions of median salaries. 
Which of the three major categories (Business, Engineering, or Education) has the smallest typical value of median salary?\r\n\r\n[reveal-answer q=\"336155\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"336155\"] Look at the graphs to answer this fairly quickly.\r\n\r\n[\/hidden-answer]\r\n\r\n<\/div>\r\nNow switch the view in the tool from dotplots to histograms.\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 11<\/h3>\r\nWhat features of the distribution are easier to see with a dotplot?\r\n\r\n[reveal-answer q=\"620535\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"620535\"]What do <em>you<\/em> think?[\/hidden-answer]\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 12<\/h3>\r\nWhat features of the distribution are easier to see with a histogram?\r\n\r\n[reveal-answer q=\"853144\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"853144\"]What do <em>you<\/em> think?\r\n\r\n[\/hidden-answer]\r\n\r\n<\/div>\r\n<div class=\"textbox tryit\">\r\n<h3>video placement<\/h3>\r\n<span style=\"background-color: #e6daf7;\">\u00a0[insert a sub-summary: How are you doing so far? The types of questions you've been answering should feel familiar. We recently described distributions in histograms using shape, center, variability (spread) and outliers. The main difference is that now we are comparing the same variable for more than one group at a time. Displaying the groups side by side over the same scale makes it easy to make these quick comparisons. ]<\/span>\r\n\r\n<\/div>\r\nNow let's look at the other major categories. 
Select the dataset \u201cRecent Grads - Salary Many Majors.\u201d This dataset includes the median salaries for recent graduates in Arts, Biology &amp; Life Sciences, Computers &amp; Mathematics, and\u00a0Humanities &amp; Liberal Arts.\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 13<\/h3>\r\nWrite a short paragraph comparing and contrasting the distributions of median salaries across the four major categories.\r\n\r\n[reveal-answer q=\"76550\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"76550\"]\r\n\r\n[\/hidden-answer]\r\n\r\n<\/div>\r\n<div class=\"textbox key-takeaways\">\r\n<h3>question 14<\/h3>\r\nIf you had to choose a major from among these four major categories based solely on median salary, which one would you choose and why?\r\n\r\n[reveal-answer q=\"889232\"]Hint[\/reveal-answer]\r\n[hidden-answer a=\"889232\"]\r\n\r\n[\/hidden-answer]\r\n\r\n<\/div>\r\n<div class=\"textbox tryit\">\r\n<h3>video placement<\/h3>\r\n<span style=\"background-color: #e6daf7;\">[wrap-up: \"What did you choose based solely on median salary in the last question? Of course, there are many other considerations that go into making a choice of major. You do want to have an interest in your field of choice and feel that you could persist in your future career! But answering questions like that helps you to realize how nicely the side-by-side graphical displays enable comparisons of a quantitative variable across groups. In the next part of the course, we'll cover summary statistics for quantitative data. We'll learn about numerical measures for spread, mean and median, and how they relate in differently shaped distributions. 
We'll also learn how to use standard deviation as a measure of spread.\"]<\/span>\r\n\r\n<\/div>","rendered":"<div class=\"textbox learning-objectives\">\n<h3>objectives for this activity<\/h3>\n<p>During this activity, you will:<\/p>\n<ul>\n<li><a href=\"#Comparing Distributions Across Groups\">Summarize a comparison of quantitative distributions across groups.<\/a><\/li>\n<\/ul>\n<p>Click on the skill above to jump to its location in this activity.<\/p>\n<\/div>\n<h2>Decisions, Decisions, Decisions<\/h2>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-982\" src=\"https:\/\/s3-us-west-2.amazonaws.com\/courses-images\/wp-content\/uploads\/sites\/5738\/2022\/01\/11185526\/Picture20-215x300.jpg\" alt=\"A woman in a wheelchair holding a diploma in one hand and raising a graduation cap in the other.\" width=\"231\" height=\"322\" \/><\/p>\n<p>Now that you&#8217;ve had a chance to practice using technology to create graphs and compare distributions of quantitative variables, let&#8217;s put it all together to see how histograms and dotplots can be used to compare distributions of a quantitative variable across groups.<\/p>\n<p>To do so, we&#8217;ll consider median salary levels for recent college graduates. Before we get started, think for a moment about the reasons why a college student might choose a particular major. Some may choose a major based primarily on interests, and others choose a major based on its job prospects.<\/p>\n<div class=\"textbox key-takeaways\">\n<h3>question 1<\/h3>\n<p>What is your major, or what major are you thinking about choosing? What do you think the job prospects are for your major?<\/p>\n<\/div>\n<div class=\"textbox tryit\">\n<h3>video placement<\/h3>\n<p><span style=\"background-color: #e6daf7;\">[<strong>Intro<\/strong>: &#8220;What variables may play a role in a student&#8217;s choice of major? Maybe what percent of students get a job in their major? Starting salary? 
Think about how we might be able to visualize data collected from students with different majors who answer these questions.\u00a0 In this activity, we&#8217;ll see how stacked histograms and side-by-side dotplots can be used to compare distributions of a quantitative variable like median salary levels across several groups, in this case major categories. You will be able to compare the center, shape, and spread of the quantitative variable across the groups using the graphical displays. Before we begin, let&#8217;s take a look at the dataset together. [display image of the dataset Salary Levels of College Majors as shown on the page below] These are a few lines from the dataset. You can see that each major category, like Business, contains several college majors, like Accounting, Actuarial Science, Finance, and so on. And each of those majors has a median salary associated with it. What do you think the observational units are in this dataset? That is, on what entities are we collecting the information about the major category and the median salary? It may be tempting to put yourself in this picture and think the entities are college graduates who have received their first job. But the observational unit is not a person. You&#8217;ll give your answer to that question below. &#8220;]<\/span><\/p>\n<\/div>\n<p>In this activity, you will explore the distribution of median salary levels of college majors across different major categories for recent college graduates in 2011. For each college major in the table, the median salary and major category are listed. A small part of the data is shown in a table below. For example, the Business major category includes college majors such as actuarial science, finance, and business economics. 
The major categories (<em>Major_category<\/em>) included in the complete dataset are: Agriculture &amp; Natural Resources, Arts, Biology &amp; Life Sciences, Business, Communications &amp; Journalism, Computers &amp; Mathematics, Education, Engineering, Health, Humanities &amp; Liberal Arts, Industrial Arts &amp; Consumer Services, Law &amp; Public Policy, Physical Sciences, Psychology &amp; Social Work, and Social Science.<a class=\"footnote\" title=\"American Community Survey 2010-2012 Public Use Microdata Series. n.d.). College majors. Github. https:\/\/github.com\/fivethirtyeight\/data\/tree\/master\/college-majors.\" id=\"return-footnote-297-1\" href=\"#footnote-297-1\" aria-label=\"Footnote 1\"><sup class=\"footnote\">[1]<\/sup><\/a><\/p>\n<p>The following table displays a subset (a few rows) of the data.<\/p>\n<div style=\"margin: auto;\">\n<table>\n<tbody>\n<tr>\n<td style=\"text-align: center;\" colspan=\"3\"><strong>Salary Levels of College Majors<\/strong><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\"><strong>Major<\/strong><\/td>\n<td style=\"text-align: center;\"><strong>Major_category<\/strong><\/td>\n<td style=\"text-align: center;\"><strong>Median_salary<\/strong><\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">ACCOUNTING<\/td>\n<td style=\"text-align: center;\">Business<\/td>\n<td style=\"text-align: center;\">45000<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">ACTUARIAL SCIENCE<\/td>\n<td style=\"text-align: center;\">Business<\/td>\n<td style=\"text-align: center;\">62000<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">FINANCE<\/td>\n<td style=\"text-align: center;\">Business<\/td>\n<td style=\"text-align: center;\">47000<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">GENERAL BUSINESS<\/td>\n<td style=\"text-align: center;\">Business<\/td>\n<td style=\"text-align: center;\">40000<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">HOSPITALITY MANAGEMENT<\/td>\n<td style=\"text-align: 
center;\">Business<\/td>\n<td style=\"text-align: center;\">33000<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">MARKETING AND MARKETING RESEARCH<\/td>\n<td style=\"text-align: center;\">Business<\/td>\n<td style=\"text-align: center;\">38000<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">MISCELLANEOUS BUSINESS AND MEDICAL ADMINISTRATION<\/td>\n<td style=\"text-align: center;\">Business<\/td>\n<td style=\"text-align: center;\">40000<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">OPERATIONS LOGISTICS AND E-COMMERCE<\/td>\n<td style=\"text-align: center;\">Business<\/td>\n<td style=\"text-align: center;\">50000<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">AEROSPACE ENGINEERING<\/td>\n<td style=\"text-align: center;\">Engineering<\/td>\n<td style=\"text-align: center;\">60000<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<div style=\"text-align: left;\"><\/div>\n<div class=\"textbox examples\">\n<h3>recall<\/h3>\n<p>Recall from [Forming Connections: 1C] that we record information about variables of interest\u00a0on each observational unit to form the dataset.<\/p>\n<p>Core skill:<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q407851\">Define the terms\u00a0<em>observational unit<\/em> and\u00a0<em>variable<\/em><\/span><\/p>\n<div id=\"q407851\" class=\"hidden-answer\" style=\"display: none\">\n<p><strong>Observational Units<\/strong>: individuals or items whose characteristics we are interested in.<\/p>\n<p><strong>Variables<\/strong>: The characteristics we record on the observational units. 
These may be quantitative or categorical variables.<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>question 2<\/h3>\n<p>What are the observational units in the dataset above?<\/p>\n<p>a) College students<\/p>\n<p>b) College majors<\/p>\n<p>c) Median salaries<\/p>\n<p>d) College graduates<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q416000\">Hint<\/span><\/p>\n<div id=\"q416000\" class=\"hidden-answer\" style=\"display: none\">Note that there are two characteristics noted for each of the observational units in the dataset.<\/div>\n<\/div>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>question 3<\/h3>\n<p>Which type of variable is median salary?<\/p>\n<p>a) categorical<br \/>\nb) quantitative<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q850781\">Hint<\/span><\/p>\n<div id=\"q850781\" class=\"hidden-answer\" style=\"display: none\">See 1C, <a href=\"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/chapter\/summary-of\/\">3A<\/a>, and <a href=\"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/chapter\/summary-of-exploring-the-influence-of-outliers-on-measures-of-center\/\">3C<\/a> for definitions of these types of variables.<span style=\"background-color: #ffff00;\">[This hint can link to the summary pages for 1C, 3A, and 3C]<\/span> <\/div>\n<\/div>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>question 4<\/h3>\n<p>Which type of variable is major category?<\/p>\n<p>a) categorical<br \/>\nb) quantitative<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q773626\">Hint<\/span><\/p>\n<div id=\"q773626\" class=\"hidden-answer\" style=\"display: none\">See 1C, <a 
href=\"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/chapter\/summary-of\/\">3A<\/a>, and <a href=\"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/chapter\/summary-of-exploring-the-influence-of-outliers-on-measures-of-center\/\">3C<\/a> for definitions of these types of variables. <span style=\"background-color: #ffff00;\">[This hint can link to the summary pages for 1C, 3A, and 3C]<\/span><\/div>\n<\/div>\n<\/div>\n<h2 id=\"Comparing Distributions Across Groups\">Comparing Distributions Across Groups<\/h2>\n<p>Next, let&#8217;s go to the dataset in the data analysis tool and create side-by-side dotplots for all the median salaries for each of the major categories. We&#8217;ll start with a comparison of median salaries for just Business, Engineering, and Education.<\/p>\n<div class=\"textbox\">\n<p>Go to the <em>Describing and Exploring Quantitative Variables<\/em> tool at <a href=\"https:\/\/dcmathpathways.shinyapps.io\/EDA_quantitative\/\">https:\/\/dcmathpathways.shinyapps.io\/EDA_quantitative\/<\/a>. <span style=\"background-color: #00ffff;\">&lt;&#8211; there is a problem with the horizontal axis labels on the dotplot. They are displaying in scientific notation &#8212; e.g., &#8220;2e+04&#8221; (2&#215;10^4) instead of 20,000.\u00a0<\/span><\/p>\n<p style=\"padding-left: 30px;\">Step 1) Click the Several Groups tab at the top of the page.<\/p>\n<p style=\"padding-left: 30px;\">Step 2) Select dataset &#8220;Recent Grads &#8211; Salary.&#8221;<\/p>\n<p style=\"padding-left: 30px;\">Step 3) Select \u201cDotplot\u201d and adjust the dot size appropriately.<\/p>\n<\/div>\n<p>This process will create a comparative dotplot of the median salaries for each major category: Business, Engineering, and Education. 
Use the dotplot to approximate the typical median salary for majors in the Business major category, and the typical median salary for majors in the Engineering major category (we&#8217;ll look at the Education category a little later).<\/p>\n<p>For each of the plots, examine just the dotplot (not the descriptive statistics) to answer the questions below.<\/p>\n<div class=\"textbox examples\">\n<h3>recall<\/h3>\n<p>What does a dot on a dotplot represent? In particular, what does each dot on the Business dotplot represent?<\/p>\n<p>Core skill:<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q962558\">Understand how data are represented on a dotplot<\/span><\/p>\n<div id=\"q962558\" class=\"hidden-answer\" style=\"display: none\">The dots on a dotplot represent individual observations made on the variable.<\/p>\n<p>For example, each dot on the Business dotplot represents a particular median salary for a college major in the Business major category. The $33,000 median salary for Hospitality Management, for instance, is at the far left of the plot, while $62,000 for Actuarial Science is at the far right. <\/p><\/div>\n<\/div>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>question 5<\/h3>\n<p>Which dotplot (Business or Engineering) has a greater center?<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q303985\">Hint<\/span><\/p>\n<div id=\"q303985\" class=\"hidden-answer\" style=\"display: none\"> Use the dotplot itself, not the descriptive statistics, to answer.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>question 6<\/h3>\n<p>Does either category (Business or Engineering) have any outliers in the distribution of median salaries? 
If so, identify the outlier(s).<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q988555\">Hint<\/span><\/p>\n<div id=\"q988555\" class=\"hidden-answer\" style=\"display: none\"> Do there appear to be any unusual dots lying to the far right or far left in either distribution?<\/p>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>question 7<\/h3>\n<p>Compare the shapes of just the distributions of median salaries for Business and Engineering. Which of these two major categories (Business or Engineering) has a distribution that is more symmetric?<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q981781\">Hint<\/span><\/p>\n<div id=\"q981781\" class=\"hidden-answer\" style=\"display: none\">Recall that a symmetric shape has roughly equal amounts of data to the right and left of the center.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>question 8<\/h3>\n<p>Compare the spread in median salaries of the two distributions (Business and Engineering). 
Which major category has greater variability in median salaries?<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q47630\">Hint<\/span><\/p>\n<div id=\"q47630\" class=\"hidden-answer\" style=\"display: none\"> Use just the dotplot and not the descriptive statistics to answer this.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<p>Now let&#8217;s include Education majors in the comparison.<\/p>\n<div class=\"textbox key-takeaways\">\n<h3>question 9<\/h3>\n<p>Which of the three major categories (Business, Engineering, or Education) has the least amount of spread in median salaries?<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q496859\">Hint<\/span><\/p>\n<div id=\"q496859\" class=\"hidden-answer\" style=\"display: none\">Use just the dotplot and not the descriptive statistics to answer this.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>question 10<\/h3>\n<p>Compare the typical median salary of the three distributions of median salaries. 
Which of the three major categories (Business, Engineering, or Education) has the smallest typical value of median salary?<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q336155\">Hint<\/span><\/p>\n<div id=\"q336155\" class=\"hidden-answer\" style=\"display: none\"> Look at the graphs to answer this fairly quickly.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<p>Now switch the view in the tool from dotplots to histograms.<\/p>\n<div class=\"textbox key-takeaways\">\n<h3>question 11<\/h3>\n<p>What features of the distribution are easier to see with a dotplot?<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q620535\">Hint<\/span><\/p>\n<div id=\"q620535\" class=\"hidden-answer\" style=\"display: none\">What do <em>you<\/em> think?<\/div>\n<\/div>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>question 12<\/h3>\n<p>What features of the distribution are easier to see with a histogram?<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q853144\">Hint<\/span><\/p>\n<div id=\"q853144\" class=\"hidden-answer\" style=\"display: none\">What do <em>you<\/em> think?<\/p>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"textbox tryit\">\n<h3>video placement<\/h3>\n<p><span style=\"background-color: #e6daf7;\">\u00a0[insert a sub-summary: How are you doing so far? The types of questions you&#8217;ve been answering should feel familiar. We recently described distributions in histograms using shape, center, variability (spread) and outliers. The main difference is that now we are comparing the same variable for more than one group at a time. Displaying the groups side by side over the same scale makes it easy to make these quick comparisons. ]<\/span><\/p>\n<\/div>\n<p>Now let&#8217;s look at the other major categories. 
Select the dataset \u201cRecent Grads &#8211; Salary Many Majors.\u201d This dataset includes the median salaries for recent graduates in Arts, Biology &amp; Life Sciences, Computers &amp; Mathematics, and\u00a0Humanities &amp; Liberal Arts.<\/p>\n<div class=\"textbox key-takeaways\">\n<h3>question 13<\/h3>\n<p>Write a short paragraph comparing and contrasting the distributions of median salaries across the four major categories.<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q76550\">Hint<\/span><\/p>\n<div id=\"q76550\" class=\"hidden-answer\" style=\"display: none\">\n<\/div>\n<\/div>\n<\/div>\n<div class=\"textbox key-takeaways\">\n<h3>question 14<\/h3>\n<p>If you had to choose a major from among these four major categories based solely on median salary, which one would you choose and why?<\/p>\n<div class=\"qa-wrapper\" style=\"display: block\"><span class=\"show-answer collapsed\" style=\"cursor: pointer\" data-target=\"q889232\">Hint<\/span><\/p>\n<div id=\"q889232\" class=\"hidden-answer\" style=\"display: none\">\n<\/div>\n<\/div>\n<\/div>\n<div class=\"textbox tryit\">\n<h3>video placement<\/h3>\n<p><span style=\"background-color: #e6daf7;\">[wrap-up: &#8220;What did you choose based solely on median salary in the last question? Of course, there are many other considerations that go into making a choice of major. You do want to have an interest in your field of choice and feel that you could persist in your future career! But answering questions like that helps you to realize how nicely the side-by-side graphical displays enable comparisons of a quantitative variable across groups. In the next part of the course, we&#8217;ll cover summary statistics for quantitative data. We&#8217;ll learn about numerical measures for spread, mean and median, and how they relate in differently shaped distributions. 
We&#8217;ll also learn how to use standard deviation as a measure of spread.&#8221;]<\/span><\/p>\n<\/div>\n<hr class=\"before-footnotes clear\" \/><div class=\"footnotes\"><ol><li id=\"footnote-297-1\">American Community Survey 2010-2012 Public Use Microdata Series. (n.d.). <em>College majors<\/em>. GitHub. https:\/\/github.com\/fivethirtyeight\/data\/tree\/master\/college-majors. <a href=\"#return-footnote-297-1\" class=\"return-footnote\" aria-label=\"Return to footnote 1\">&crarr;<\/a><\/li><\/ol><\/div>","protected":false},"author":25777,"menu_order":33,"template":"","meta":{"_candela_citation":"[]","CANDELA_OUTCOMES_GUID":"","pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[],"contributor":[],"license":[],"class_list":["post-297","chapter","type-chapter","status-publish","hentry"],"part":3,"_links":{"self":[{"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/chapters\/297","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/wp\/v2\/users\/25777"}],"version-history":[{"count":35,"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/chapters\/297\/revisions"}],"predecessor-version":[{"id":3306,"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/chapters\/297\/revisions\/3306"}],"part":[{"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/parts\/3"}],"metadata":[{"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/chapters\/297\/m
etadata\/"}],"wp:attachment":[{"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/wp\/v2\/media?parent=297"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/pressbooks\/v2\/chapter-type?post=297"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/wp\/v2\/contributor?post=297"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/courses.lumenlearning.com\/lumen-danacenter-statsmockup\/wp-json\/wp\/v2\/license?post=297"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}