Measures of the Location of the Data

Learning Outcomes

  • Calculate the median and quartiles for a set of data
  • Find the interquartile range and use it to identify outliers

The common measures of location are quartiles and percentiles.

Quartiles are special percentiles. The first quartile, [latex]Q_1[/latex], is the same as the [latex]25[/latex]th percentile, and the third quartile, [latex]Q_3[/latex], is the same as the [latex]75[/latex]th percentile. The median, [latex]M[/latex], is called both the second quartile and the [latex]50[/latex]th percentile.


To calculate quartiles and percentiles, the data must be ordered from smallest to largest. Quartiles divide ordered data into quarters. Percentiles divide ordered data into hundredths. Scoring in the [latex]90[/latex]th percentile of an exam does not necessarily mean that you received [latex]90[/latex]% on the test. It means that [latex]90[/latex]% of test scores are the same as or less than your score and [latex]10[/latex]% of the test scores are the same as or greater than your score.
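As a minimal sketch of this interpretation, the following Python snippet computes the percentage of scores that are the same as or less than a given score. The function name `percent_at_or_below` and the ten exam scores are hypothetical, made up purely for illustration.

```python
def percent_at_or_below(scores, value):
    """Return the percentage of scores that are the same as or less than value."""
    count = sum(1 for s in scores if s <= value)
    return 100 * count / len(scores)


# Hypothetical exam scores (not taken from the text).
exam_scores = [52, 58, 61, 64, 67, 70, 73, 75, 78, 81]

# Nine of the ten scores are at or below 78, so a score of 78 sits at the
# 90th percentile of this group even though 78 is not a 90% exam score.
rank = percent_at_or_below(exam_scores, 78)
print(rank)
```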

Percentiles are useful for comparing values. For this reason, universities and colleges use percentiles extensively. One instance in which colleges and universities use percentiles is when SAT results are used to determine a minimum testing score that will be used as an acceptance factor. For example, suppose Duke accepts SAT scores at or above the [latex]75[/latex]th percentile. That translates into a score of at least [latex]1220[/latex].

Percentiles are mostly used with very large populations. Therefore, if you were to say that [latex]90[/latex]% of the test scores are less (and not the same or less) than your score, it would be acceptable because removing one particular data value is not significant.

Recall: Ordering Decimals

It is helpful to use a number line to order decimals. The best way to think about a decimal is as a form of a fraction. Divide each unit interval on the number line into as many equal parts as the denominator of that fraction.

Example

Locate [latex]0.4[/latex] on a number line.

Solution
The decimal [latex]0.4[/latex] is equivalent to [latex]{\Large\frac{4}{10}}[/latex], so [latex]0.4[/latex] is located between [latex]0[/latex] and [latex]1[/latex]. On a number line, divide the interval between [latex]0[/latex] and [latex]1[/latex] into [latex]10[/latex] equal parts and place marks to separate the parts.

Label the marks [latex]0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0[/latex]. We write [latex]0[/latex] as [latex]0.0[/latex] and [latex]1[/latex] as [latex]1.0[/latex], so that the numbers are consistently in tenths. Finally, mark [latex]0.4[/latex] on the number line.

A number line is shown with 0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, and 1.0 labeled. There is a red dot at 0.4.

The median is a number that measures the “center” of the data. You can think of the median as the “middle value,” but it does not actually have to be one of the observed values. It is a number that separates ordered data into halves. Half the values are the same as or smaller than the median, and half the values are the same as or larger than the median. For example, consider the following data:

[latex]1[/latex]; [latex]11.5[/latex]; [latex]6[/latex]; [latex]7.2[/latex]; [latex]4[/latex]; [latex]8[/latex]; [latex]9[/latex]; [latex]10[/latex]; [latex]6.8[/latex]; [latex]8.3[/latex]; [latex]2[/latex]; [latex]2[/latex]; [latex]10[/latex]; [latex]1[/latex]

Ordered from smallest to largest:

[latex]1[/latex]; [latex]1[/latex]; [latex]2[/latex]; [latex]2[/latex]; [latex]4[/latex]; [latex]6[/latex]; [latex]6.8[/latex]; [latex]7.2[/latex]; [latex]8[/latex]; [latex]8.3[/latex]; [latex]9[/latex]; [latex]10[/latex]; [latex]10[/latex]; [latex]11.5[/latex]

Since there are [latex]14[/latex] observations, the median is between the seventh value, [latex]6.8[/latex], and the eighth value, [latex]7.2[/latex]. To find the median, add the two values together and divide by two.

[latex]\displaystyle\frac{{{6.8}+{7.2}}}{{2}}={7}[/latex]
The median is seven. Half of the values are smaller than seven and half of the values are larger than seven.
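The calculation above can be sketched in Python. This is one straightforward way to write it, not the only one; the function name `median` is our own choice.

```python
def median(values):
    """Return the middle value of the data, averaging the two middle values if needed."""
    ordered = sorted(values)          # the data must be ordered first
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # odd count: a single middle value
        return ordered[mid]
    # even count: average the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2


data = [1, 11.5, 6, 7.2, 4, 8, 9, 10, 6.8, 8.3, 2, 2, 10, 1]
print(median(data))  # averages the 7th and 8th ordered values, 6.8 and 7.2
```

Python's standard library also provides `statistics.median`, which applies the same rule.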

Recall: Multiply Decimal Numbers

  1. Determine the sign of the product
  2. Write the numbers in vertical format, lining up the numbers on the right
  3. Multiply the numbers as if they were whole numbers, temporarily ignoring the decimal points
  4. Place the decimal point. The number of decimal places in the product is the sum of the number of decimal places in the factors. If needed, use zeros as placeholders.
  5. Write the product with the appropriate sign

 

Quartiles are numbers that separate the data into quarters. Quartiles may or may not be part of the data. To find the quartiles, first find the median or second quartile. The first quartile, [latex]Q_1[/latex], is the middle value of the lower half of the data, and the third quartile, [latex]Q_3[/latex], is the middle value, or median, of the upper half of the data. To get the idea, consider the same data set:

[latex]1[/latex]; [latex]1[/latex]; [latex]2[/latex]; [latex]2[/latex]; [latex]4[/latex]; [latex]6[/latex]; [latex]6.8[/latex]; [latex]7.2[/latex]; [latex]8[/latex]; [latex]8.3[/latex]; [latex]9[/latex]; [latex]10[/latex]; [latex]10[/latex]; [latex]11.5[/latex]

The median or second quartile is seven. The lower half of the data are [latex]1[/latex], [latex]1[/latex], [latex]2[/latex], [latex]2[/latex], [latex]4[/latex], [latex]6[/latex], [latex]6.8[/latex]. The middle value of the lower half is two.

[latex]1[/latex]; [latex]1[/latex]; [latex]2[/latex]; [latex]2[/latex]; [latex]4[/latex]; [latex]6[/latex]; [latex]6.8[/latex]

The number two, which is part of the data, is the first quartile. One-fourth of the entire set of values are the same as or less than two and three-fourths of the values are more than two.

The upper half of the data is [latex]7.2[/latex], [latex]8[/latex], [latex]8.3[/latex], [latex]9[/latex], [latex]10[/latex], [latex]10[/latex], [latex]11.5[/latex]. The middle value of the upper half is nine.

The third quartile, [latex]Q_3[/latex], is nine. Three-fourths ([latex]75[/latex]%) of the ordered data set are less than nine. One-fourth ([latex]25[/latex]%) of the ordered data set are greater than nine. The third quartile is part of the data set in this example.
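The "median of each half" method described above can be sketched in Python as follows. The function names are our own. The sketch assumes that when the number of values is odd, the median itself is left out of both halves; the data set in the text has an even count, so that case does not arise here. Statistical software often uses other quartile conventions and may report slightly different values.

```python
def median(ordered):
    """Median of a list that is already sorted."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves method."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # Assumption: for an odd count, skip the median when forming the upper half.
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return median(lower), median(ordered), median(upper)


data = [1, 1, 2, 2, 4, 6, 6.8, 7.2, 8, 8.3, 9, 10, 10, 11.5]
q1, q2, q3 = quartiles(data)
print(q1, q2, q3)
```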

The interquartile range (IQR) is a number that indicates the spread of the middle half or the middle [latex]50[/latex]% of the data. It is the difference between the third quartile ([latex]Q_3[/latex]) and the first quartile ([latex]Q_1[/latex]).

[latex]IQR[/latex] = [latex]Q_3[/latex] – [latex]Q_1[/latex]

The IQR can help to determine potential outliers. A value is suspected to be a potential outlier if it lies more than (1.5)(IQR) below the first quartile or more than (1.5)(IQR) above the third quartile; that is, if it is less than [latex]Q_1 - 1.5 \times IQR[/latex] or greater than [latex]Q_3 + 1.5 \times IQR[/latex]. Potential outliers always require further investigation.

NOTE

A potential outlier is a data point that is significantly different from the other data points. These special data points may be errors or some kind of abnormality or they may be a key to understanding the data.
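As a rough sketch, the outlier rule can be applied to the 14-value data set used earlier, whose quartiles were found to be [latex]Q_1 = 2[/latex] and [latex]Q_3 = 9[/latex]. The function name `outlier_fences` is hypothetical, and "fences" is our own shorthand for the two cutoff values.

```python
def outlier_fences(q1, q3):
    """Return the lower and upper cutoffs for potential outliers."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


data = [1, 1, 2, 2, 4, 6, 6.8, 7.2, 8, 8.3, 9, 10, 10, 11.5]

# Quartiles taken from the worked example in the text.
low, high = outlier_fences(2, 9)

# Any value outside the two cutoffs is a potential outlier.
suspects = [x for x in data if x < low or x > high]
print(low, high, suspects)
```

Here the IQR is 7, so the cutoffs are 2 - 10.5 = -8.5 and 9 + 10.5 = 19.5. Every value lies between them, so this data set has no potential outliers.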

Example

For the following [latex]13[/latex] real estate prices, calculate the [latex]IQR[/latex] and determine if any prices are potential outliers. Prices are in dollars.
[latex]389{,}950[/latex]; [latex]230{,}500[/latex]; [latex]158{,}000[/latex]; [latex]479{,}000[/latex]; [latex]639{,}000[/latex]; [latex]114{,}950[/latex]; [latex]5{,}500{,}000[/latex]; [latex]387{,}000[/latex]; [latex]659{,}000[/latex]; [latex]529{,}000[/latex]; [latex]575{,}000[/latex]; [latex]488{,}800[/latex]; [latex]1{,}095{,}000[/latex]
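One way to work through this example is sketched below in Python. It assumes the median-of-halves method for quartiles, with the median left out of both halves because there is an odd number of prices. The variable names are our own.

```python
from statistics import median

prices = [389950, 230500, 158000, 479000, 639000, 114950, 5500000,
          387000, 659000, 529000, 575000, 488800, 1095000]

ordered = sorted(prices)
n = len(ordered)              # 13 prices, so the median is the 7th ordered value
half = n // 2

lower = ordered[:half]        # the six prices below the median
upper = ordered[half + 1:]    # the six prices above the median (median skipped)

q1 = median(lower)
q3 = median(upper)
iqr = q3 - q1

low_cutoff = q1 - 1.5 * iqr
high_cutoff = q3 + 1.5 * iqr
outliers = [p for p in ordered if p < low_cutoff or p > high_cutoff]

print(q1, q3, iqr)
print(low_cutoff, high_cutoff)
print(outliers)
```

Under these assumptions, [latex]Q_1 = 308{,}750[/latex], [latex]Q_3 = 649{,}000[/latex], and [latex]IQR = 340{,}250[/latex]. No price can fall below the lower cutoff of [latex]-201{,}625[/latex], and only [latex]5{,}500{,}000[/latex] exceeds the upper cutoff of [latex]1{,}159{,}375[/latex], so it is the one potential outlier.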

Try It

For the following [latex]11[/latex] salaries, calculate the [latex]IQR[/latex] and determine if any salaries are outliers. The salaries are in dollars.

[latex]$33{,}000[/latex], [latex]$64{,}500[/latex], [latex]$28{,}000[/latex], [latex]$54{,}000[/latex], [latex]$72{,}000[/latex], [latex]$68{,}500[/latex], [latex]$69{,}000[/latex], [latex]$42{,}000[/latex], [latex]$54{,}000[/latex], [latex]$120{,}000[/latex], [latex]$40{,}500[/latex]

Example

For the two data sets in the test scores example, find the following:

  1. The interquartile range. Compare the two interquartile ranges.
  2. Any outliers in either set.

Try It

Find the interquartile range for the following two data sets and compare them.

Test Scores for Class A

[latex]69[/latex]; [latex]96[/latex]; [latex]81[/latex]; [latex]79[/latex]; [latex]65[/latex]; [latex]76[/latex]; [latex]83[/latex]; [latex]99[/latex]; [latex]89[/latex]; [latex]67[/latex]; [latex]90[/latex]; [latex]77[/latex]; [latex]85[/latex]; [latex]98[/latex]; [latex]66[/latex]; [latex]91[/latex]; [latex]77[/latex]; [latex]69[/latex]; [latex]80[/latex]; [latex]94[/latex]

Test Scores for Class B

[latex]90[/latex]; [latex]72[/latex]; [latex]80[/latex]; [latex]92[/latex]; [latex]90[/latex]; [latex]97[/latex]; [latex]92[/latex]; [latex]75[/latex]; [latex]79[/latex]; [latex]68[/latex]; [latex]70[/latex]; [latex]80[/latex]; [latex]99[/latex]; [latex]95[/latex]; [latex]78[/latex]; [latex]73[/latex]; [latex]71[/latex]; [latex]68[/latex]; [latex]95[/latex]; [latex]100[/latex]

Example

Fifty statistics students were asked how much sleep they get per school night (rounded to the nearest hour). The results were:

Amount of Sleep per School Night (Hours) | Frequency | Relative Frequency | Cumulative Relative Frequency
[latex]4[/latex] | [latex]2[/latex] | [latex]0.04[/latex] | [latex]0.04[/latex]
[latex]5[/latex] | [latex]5[/latex] | [latex]0.10[/latex] | [latex]0.14[/latex]
[latex]6[/latex] | [latex]7[/latex] | [latex]0.14[/latex] | [latex]0.28[/latex]
[latex]7[/latex] | [latex]12[/latex] | [latex]0.24[/latex] | [latex]0.52[/latex]
[latex]8[/latex] | [latex]14[/latex] | [latex]0.28[/latex] | [latex]0.80[/latex]
[latex]9[/latex] | [latex]7[/latex] | [latex]0.14[/latex] | [latex]0.94[/latex]
[latex]10[/latex] | [latex]3[/latex] | [latex]0.06[/latex] | [latex]1.00[/latex]

Find the [latex]28[/latex]th percentile. Notice the [latex]0.28[/latex] in the “cumulative relative frequency” column. Twenty-eight percent of [latex]50[/latex] data values is [latex]14[/latex] values. There are [latex]14[/latex] values less than the [latex]28[/latex]th percentile. They include the two [latex]4[/latex]s, the five [latex]5[/latex]s, and the seven [latex]6[/latex]s. The [latex]28[/latex]th percentile is between the last six and the first seven. The [latex]28[/latex]th percentile is [latex]6.5[/latex].

Find the median. Look again at the “cumulative relative frequency” column and find [latex]0.52[/latex]. The median is the [latex]50[/latex]th percentile or the second quartile. [latex]50[/latex]% of [latex]50[/latex] is [latex]25[/latex]. There are [latex]25[/latex] values less than the median. They include the two [latex]4[/latex]s, the five [latex]5[/latex]s, the seven [latex]6[/latex]s, and eleven of the [latex]7[/latex]s. The median or [latex]50[/latex]th percentile is between the [latex]25[/latex]th, or seven, and [latex]26[/latex]th, or seven, values. The median is seven.

Find the third quartile. The third quartile is the same as the [latex]75[/latex]th percentile. You can “eyeball” this answer. If you look at the “cumulative relative frequency” column, you find [latex]0.52[/latex] and [latex]0.80[/latex]. When you have all the [latex]4[/latex]s, [latex]5[/latex]s, [latex]6[/latex]s, and [latex]7[/latex]s, you have [latex]52[/latex]% of the data. When you include all the [latex]8[/latex]s, you have [latex]80[/latex]% of the data. The [latex]75[/latex]th percentile, then, must be an eight. Another way to look at the problem is to find [latex]75[/latex]% of [latex]50[/latex], which is [latex]37.5[/latex], and round up to [latex]38[/latex]. The third quartile, [latex]Q_3[/latex], is the [latex]38[/latex]th value, which is an [latex]8[/latex]. You can check this answer by counting the values. (There are [latex]37[/latex] values before it in the ordered list and [latex]12[/latex] values after it.)
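The counting procedure used in the three calculations above can be sketched in Python. The rule coded here is inferred from those calculations: take the given percent of the number of values; if the result is a whole number, average that value and the next one; otherwise round up and take the value at that position. The function name `percentile_from_table` is hypothetical, and the sketch handles only whole-number percentiles strictly between 0 and 100.

```python
def percentile_from_table(table, p):
    """table: list of (value, frequency) pairs; p: whole-number percentile, 0 < p < 100."""
    # Expand the frequency table into the full ordered list of values.
    ordered = []
    for value, freq in sorted(table):
        ordered.extend([value] * freq)
    n = len(ordered)

    # position is the whole part of p% of n; remainder shows whether p% of n is whole.
    position, remainder = divmod(p * n, 100)
    if remainder == 0:
        # Whole number: average that value and the next (1-based positions).
        return (ordered[position - 1] + ordered[position]) / 2
    # Not whole: round up; 1-based position + 1 is index `position` in 0-based terms.
    return ordered[position]


sleep = [(4, 2), (5, 5), (6, 7), (7, 12), (8, 14), (9, 7), (10, 3)]

p28 = percentile_from_table(sleep, 28)
p50 = percentile_from_table(sleep, 50)
p75 = percentile_from_table(sleep, 75)
print(p28, p50, p75)
```

For the sleep data this reproduces the values found above: 6.5 for the 28th percentile, 7 for the median, and 8 for the third quartile.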

Try It

Forty bus drivers were asked how many hours they spend each day running their routes (rounded to the nearest hour). Find the [latex]65[/latex]th percentile.

Amount of time spent on route (hours) | Frequency | Relative Frequency | Cumulative Relative Frequency
[latex]2[/latex] | [latex]12[/latex] | [latex]0.30[/latex] | [latex]0.30[/latex]
[latex]3[/latex] | [latex]14[/latex] | [latex]0.35[/latex] | [latex]0.65[/latex]
[latex]4[/latex] | [latex]10[/latex] | [latex]0.25[/latex] | [latex]0.90[/latex]
[latex]5[/latex] | [latex]4[/latex] | [latex]0.10[/latex] | [latex]1.00[/latex]

Example

Amount of Sleep per School Night (Hours) | Frequency | Relative Frequency | Cumulative Relative Frequency
[latex]4[/latex] | [latex]2[/latex] | [latex]0.04[/latex] | [latex]0.04[/latex]
[latex]5[/latex] | [latex]5[/latex] | [latex]0.10[/latex] | [latex]0.14[/latex]
[latex]6[/latex] | [latex]7[/latex] | [latex]0.14[/latex] | [latex]0.28[/latex]
[latex]7[/latex] | [latex]12[/latex] | [latex]0.24[/latex] | [latex]0.52[/latex]
[latex]8[/latex] | [latex]14[/latex] | [latex]0.28[/latex] | [latex]0.80[/latex]
[latex]9[/latex] | [latex]7[/latex] | [latex]0.14[/latex] | [latex]0.94[/latex]
[latex]10[/latex] | [latex]3[/latex] | [latex]0.06[/latex] | [latex]1.00[/latex]
  1. Find the [latex]80[/latex]th percentile.
  2. Find the [latex]90[/latex]th percentile.
  3. Find the first quartile. What is another name for the first quartile?

Try It

Amount of time spent on route (hours) | Frequency | Relative Frequency | Cumulative Relative Frequency
[latex]2[/latex] | [latex]12[/latex] | [latex]0.30[/latex] | [latex]0.30[/latex]
[latex]3[/latex] | [latex]14[/latex] | [latex]0.35[/latex] | [latex]0.65[/latex]
[latex]4[/latex] | [latex]10[/latex] | [latex]0.25[/latex] | [latex]0.90[/latex]
[latex]5[/latex] | [latex]4[/latex] | [latex]0.10[/latex] | [latex]1.00[/latex]

Find the third quartile. What is another name for the third quartile?

Collaborative Exercise

Your instructor or a member of the class will ask everyone in class how many sweaters they own. Answer the following questions:

  1. How many students were surveyed?
  2. What kind of sampling did you do?
  3. Construct two different histograms. For each, starting value = _____ ending value = ____.
  4. Find the median, first quartile, and third quartile.
  5. Construct a table of the data to find the following:
    1. the 10th percentile
    2. the 70th percentile
    3. the percent of students who own fewer than four sweaters