## Measures of Central Tendency

It is often desirable to use a few numbers to summarize a distribution. One important aspect of a distribution is where its center is located. Measures of central tendency are discussed first. A second aspect of a distribution is how spread out it is; in other words, how much the data in the distribution vary from one another. The second section describes measures of variability.

Let’s begin by trying to find the most “typical” value of a data set.

Note that we just used the word “typical” although in many cases you might think of using the word “average.” We need to be careful with the word “average” as it means different things to different people in different contexts.  One of the most common uses of the word “average” is what mathematicians and statisticians call the arithmetic mean, or just plain old mean for short.  “Arithmetic mean” sounds rather fancy, but you have likely calculated a mean many times without realizing it; the mean is what most people think of when they use the word “average”.

### Mean

The mean of a set of data is the sum of the data values divided by the number of values.

### Example 14

Marci’s exam scores for her last math class were: 79, 86, 82, 94. The mean of these values would be:

$\frac{79+86+82+94}{4}=85.25$. Typically we round means to one more decimal place than the original data had. In this case, we would round 85.25 to 85.3.
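The same calculation can be checked with a few lines of Python. This is a minimal sketch using the standard library's `statistics` module; the variable names `scores` and `by_hand` are illustrative choices, not part of the example.

```python
from statistics import mean

# Marci's four exam scores from Example 14
scores = [79, 86, 82, 94]

# Sum of the data values divided by the number of values
by_hand = sum(scores) / len(scores)

print(by_hand)        # 85.25
print(mean(scores))   # 85.25
```

Both approaches apply the definition directly; `mean` simply packages the sum-and-divide step.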

### Example 15

The numbers of touchdown (TD) passes thrown by each of the 31 teams in the National Football League in the 2000 season are shown below.

37 33 33 32 29 28 28 23 22 22 22 21 21 21 20

20 19 19 18 18 18 18 16 15 14 14 14 12 12 9 6

Adding these values, we get 634 total TDs. Dividing by 31, the number of data values, we get 634/31 = 20.4516.   It would be appropriate to round this to 20.5.

It would be most correct for us to report that “The mean number of touchdown passes thrown in the NFL in the 2000 season was 20.5 passes,” but it is not uncommon to see the more casual word “average” used in place of “mean.”
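With 31 values, it is easy to miscount or mis-add by hand. The sketch below, which assumes the data are typed into a Python list named `td_passes` (an illustrative name), confirms the count, the total, and the mean from Example 15.

```python
# Touchdown passes for the 31 NFL teams in the 2000 season (Example 15)
td_passes = [
    37, 33, 33, 32, 29, 28, 28, 23, 22, 22, 22, 21, 21, 21, 20,
    20, 19, 19, 18, 18, 18, 18, 16, 15, 14, 14, 14, 12, 12, 9, 6,
]

total = sum(td_passes)    # total touchdown passes
count = len(td_passes)    # number of data values (teams)
mean_td = total / count

print(total, count)       # 634 31
print(f"{mean_td:.4f}")   # 20.4516
```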

The price of a jar of peanut butter at 5 stores was: \$3.29, \$3.59, \$3.79, \$3.75, and \$3.99. Find the mean price.

### Example 16

The one hundred families in a particular neighborhood are asked their annual household income, to the nearest \$5 thousand. The results are summarized in a frequency table below.

| Income (thousands of dollars) | Frequency |
|---|---|
| 15 | 6 |
| 20 | 8 |
| 25 | 11 |
| 30 | 17 |
| 35 | 19 |
| 40 | 20 |
| 45 | 12 |
| 50 | 7 |

Calculating the mean by hand could get tricky if we try to type in all 100 values:

$\frac{\overbrace{15+\cdots+15}^{\text{6 terms}}+\overbrace{20+\cdots+20}^{\text{8 terms}}+\overbrace{25+\cdots+25}^{\text{11 terms}}+\cdots}{100}$

We could calculate this more easily by noticing that adding 15 to itself six times is the same as $15\cdot6 = 90$. Using this simplification, we get

$\frac{15\cdot6+20\cdot8+25\cdot11+30\cdot17+35\cdot19+40\cdot20+45\cdot12+50\cdot7}{\text{100}}=\frac{3390}{100}=33.9$

The mean household income of our sample is 33.9 thousand dollars (\$33,900).

### Example 17

Extending the last example, suppose a new family with a household income of \$5 million (\$5000 thousand) moves into the neighborhood. Adding this to our sample, our mean is now:

$\frac{15\cdot6+20\cdot8+25\cdot11+30\cdot17+35\cdot19+40\cdot20+45\cdot12+50\cdot7+5000\cdot1}{101}=\frac{8390}{101}=83.069$

While 83.1 thousand dollars (\$83,069) is the correct mean household income, it no longer represents a “typical” value.
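The frequency-table shortcut from Examples 16 and 17 can be sketched in Python as follows. The dictionary `income_freq` and the helper `mean_from_frequencies` are hypothetical names introduced only for this illustration.

```python
# Frequency table from Example 16:
# income (thousands of dollars) -> number of families
income_freq = {15: 6, 20: 8, 25: 11, 30: 17, 35: 19, 40: 20, 45: 12, 50: 7}

def mean_from_frequencies(freq):
    """Return the mean of data summarized as {value: frequency}."""
    total = sum(value * count for value, count in freq.items())
    n = sum(freq.values())  # number of data values
    return total / n

print(mean_from_frequencies(income_freq))              # 33.9

# Example 17: add one family earning $5 million (5000 thousand)
with_outlier = dict(income_freq)
with_outlier[5000] = 1
print(round(mean_from_frequencies(with_outlier), 3))   # 83.069
```

Multiplying each value by its frequency replaces the 100-term sum, and a single extreme value is enough to move the result from 33.9 to about 83.1.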

Imagine the data values on a see-saw or balance scale. The mean is the value that keeps the data in balance, like in the picture below.

### Example 20

Let us return now to our original household income data:

| Income (thousands of dollars) | Frequency |
|---|---|
| 15 | 6 |
| 20 | 8 |
| 25 | 11 |
| 30 | 17 |
| 35 | 19 |
| 40 | 20 |
| 45 | 12 |
| 50 | 7 |

Here we have 100 data values. If we didn’t already know that, we could find it by adding the frequencies. Since 100 is an even number, we need to find the mean of the middle two data values – the 50th and 51st data values. To find these, we start counting up from the bottom:

There are 6 data values of \$15 thousand, so values 1 to 6 are \$15 thousand.

The next 8 data values are \$20 thousand, so values 7 to (6+8)=14 are \$20 thousand.

The next 11 data values are \$25 thousand, so values 15 to (14+11)=25 are \$25 thousand.

The next 17 data values are \$30 thousand, so values 26 to (25+17)=42 are \$30 thousand.

The next 19 data values are \$35 thousand, so values 43 to (42+19)=61 are \$35 thousand.

From this we can tell that values 50 and 51 will be \$35 thousand, and the mean of these two values is \$35 thousand. The median income in this neighborhood is \$35 thousand.

### Example 21

If we add in the new neighbor with a \$5 million household income, then there will be 101 data values, and the 51st value will be the median. As we discovered in the last example, the 51st value is \$35 thousand. Notice that the new neighbor did not affect the median in this case. The median is not swayed as much by outliers as the mean is.
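The counting procedure used in Examples 20 and 21 can be sketched in Python by keeping a running total of the frequencies. The names `income_freq`, `value_at_position`, and `median_from_frequencies` are hypothetical, chosen only for this illustration.

```python
# Frequency table from Example 16:
# income (thousands of dollars) -> number of families
income_freq = {15: 6, 20: 8, 25: 11, 30: 17, 35: 19, 40: 20, 45: 12, 50: 7}

def value_at_position(freq, position):
    """Return the value at a 1-based position when the data are sorted."""
    running_total = 0
    for value in sorted(freq):
        running_total += freq[value]
        if position <= running_total:
            return value
    raise ValueError("position is larger than the number of data values")

def median_from_frequencies(freq):
    """Return the median of data summarized as {value: frequency}."""
    n = sum(freq.values())
    if n % 2 == 1:
        # Odd count: the single middle value
        return value_at_position(freq, (n + 1) // 2)
    # Even count: mean of the two middle values
    lower = value_at_position(freq, n // 2)
    upper = value_at_position(freq, n // 2 + 1)
    return (lower + upper) / 2

print(median_from_frequencies(income_freq))    # 35.0

# Example 21: add one family earning $5 million (5000 thousand)
with_outlier = dict(income_freq)
with_outlier[5000] = 1
print(median_from_frequencies(with_outlier))   # 35
```

With 100 values the function averages the 50th and 51st values; with 101 values it returns the 51st. Either way the result is 35, matching the two examples.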

In addition to the mean and the median, there is one other common measurement of the “typical” value of a data set: the mode.

### Mode

The mode is the element of the data set that occurs most frequently.

The mode is fairly useless with data like weights or heights where there are a large number of possible values. The mode is most commonly used for categorical data, for which median and mean cannot be computed.

### Example 22

In our vehicle color survey, we collected the data

For this data, Green is the mode, since it is the data value that occurred the most frequently.

It is possible for a data set to have more than one mode if several categories have the same highest frequency, or no mode if every category occurs only once.
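Finding the mode from a frequency table amounts to looking for the largest frequency. The sketch below applies that idea to the household income table from Example 16. The helper `modes_from_frequencies` is a hypothetical name, and the `tied` counts are invented purely to illustrate a tie; they are not survey data from this section.

```python
def modes_from_frequencies(freq):
    """Return every value whose frequency equals the largest frequency."""
    highest = max(freq.values())
    return [value for value, count in freq.items() if count == highest]

# Household income table from Example 16 (thousands of dollars)
income_freq = {15: 6, 20: 8, 25: 11, 30: 17, 35: 19, 40: 20, 45: 12, 50: 7}
print(modes_from_frequencies(income_freq))   # [40]

# Hypothetical counts, invented only to show a tie between two categories
tied = {"Red": 2, "Blue": 2, "Green": 1}
print(modes_from_frequencies(tied))          # ['Red', 'Blue']
```

Note that this sketch returns every category when each occurs exactly once, whereas the convention described above treats that situation as having no mode.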

### Try it Now 6

Reviewers were asked to rate a product on a scale of 1 to 5. Using the frequency table below, find:

a. The mean rating

b. The median rating

c. The mode rating

| Rating | Frequency |
|---|---|
| 1 | 4 |
| 2 | 8 |
| 3 | 7 |
| 4 | 3 |
| 5 | 1 |