- Describe the distribution of quantitative data using a histogram.
In the next example, we use a histogram to describe the shape, center, and spread of the distribution of a quantitative variable.
Oscar for Best Actress
Here we have the ages of the actresses who won an Oscar for Best Actress from 1970 to 2001.
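To see how the bars of a histogram like this one are built, here is a minimal sketch that counts values in 5-year intervals. The function name `bin_counts`, the choice of left-inclusive intervals (an age of exactly 30 is counted in 30–35), and the sample ages are all assumptions made for illustration; the sample ages are made-up placeholders, not the Oscar data.

```python
def bin_counts(values, start, stop, width):
    """Count how many values fall in each interval [lo, lo + width).

    Assumes every value satisfies start <= value < stop.
    """
    # Create every interval up front so that empty intervals (gaps) stay visible.
    counts = {lo: 0 for lo in range(start, stop, width)}
    for v in values:
        lo = start + width * ((v - start) // width)
        counts[lo] += 1
    return counts


# Made-up placeholder ages, NOT the actual Oscar data set.
sample_ages = [21, 24, 26, 31, 33, 34, 38, 41, 62]

counts = bin_counts(sample_ages, start=20, stop=80, width=5)

# Print a simple text histogram: one star per observation.
for lo, n in counts.items():
    print(f"{lo}-{lo + 5}: {'*' * n}")
```

Keeping the empty intervals in the output matters: gaps between bars are what make unusual values stand out from the rest of the distribution.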
Shape: The distribution of ages appears skewed to the right. Most of the Oscar winners for Best Actress are young. More precisely, we see that 91% (29 of 32) of the winners are under 50 years of age, and 56% (18 of 32) of the winners are under the age of 35.
Center: The distribution of ages appears to be centered between 30 and 35 years; 28% (9 of 32) of the winners are in this age range.
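The percentages quoted for shape and center come from dividing a count by the total of 32 winners. The short sketch below redoes that arithmetic using only the counts stated above; the helper name `percent` and the rounding to the nearest whole percent are our own choices.

```python
TOTAL_WINNERS = 32

# Counts taken directly from the description of the histogram.
stated_counts = {
    "under 50": 29,
    "under 35": 18,
    "30 to 35": 9,
}


def percent(count, total=TOTAL_WINNERS):
    """Return count/total as a percentage rounded to the nearest whole number."""
    return round(100 * count / total)


percentages = {label: percent(n) for label, n in stated_counts.items()}

for label, p in percentages.items():
    print(f"{label}: {p}% ({stated_counts[label]} of {TOTAL_WINNERS})")
```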
Spread: The data range from about 20 to about 80, so the overall range is approximately 60. There is a lot of variability in the ages of actresses who have won the Oscar for Best Actress.
Outliers: Winners older than 60 years are unusual. There are three outliers, one in each of the following intervals: 60–65, 70–75, and 75–80.
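The counts given so far are enough to check that the description is consistent and to see how much the outliers affect the spread. The sketch below is a rough check, not a substitute for the full data set. It assumes that "under 35" includes the 30–35 interval, and it uses the approximate endpoints 20 and 80 and the cutoff of 50 from the description; the group labels and variable names are our own.

```python
TOTAL_WINNERS = 32
under_35, under_50, from_30_to_35 = 18, 29, 9

# Coarse groups derived by subtraction from the stated counts.
groups = {
    "under 30": under_35 - from_30_to_35,
    "30-35": from_30_to_35,
    "35-50": under_50 - under_35,
    "60-65": 1,
    "70-75": 1,
    "75-80": 1,
}

# Winners not covered by any group above (ages 50-60 or 65-70).
unaccounted = TOTAL_WINNERS - sum(groups.values())

# Overall range, using the approximate endpoints read from the histogram.
approx_min, approx_max = 20, 80
overall_range = approx_max - approx_min

# Range of the bulk of the data: every non-outlier is under 50.
bulk_range = 50 - approx_min

outliers = groups["60-65"] + groups["70-75"] + groups["75-80"]
outlier_share = round(100 * outliers / TOTAL_WINNERS)

print(groups)
print("unaccounted:", unaccounted)
print("overall range:", overall_range, "| range without outliers:", bulk_range)
print(f"outliers: {outliers} of {TOTAL_WINNERS} ({outlier_share}%)")
```

If no winners are left unaccounted for, the intervals 50–60 and 65–70 must be empty, which is the gap that separates the three outliers from the rest of the distribution.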
Now we summarize all of these observations in a paragraph:
Between the years of 1970 and 2001, the Oscar for Best Actress was awarded most often to young actresses: 56% (18 of 32) of the winners were under the age of 35, with 28% (9 of 32) of the winners between 30 and 35 years of age. Winners ranged in age from about 20 to about 80, but only 3 of the 32 were over 60.
Here is a paragraph that uses more formal vocabulary to summarize the distribution of ages:
Between the years of 1970 and 2001, the Oscar for Best Actress was awarded most often to young actresses. The distribution of ages is skewed to the right: 56% (18 of 32) of the winners were under the age of 35, with the center of the distribution between 30 and 35 years of age. With winners ranging in age from about 20 to about 80, the overall range of the distribution is about 60. But much of this variability is due to three outliers who were older than 60 when they won the award.