Data Exploration: Test Scores

Learning Outcomes

  • Use a spreadsheet to explore measures of central tendency of a data set
  • Use a spreadsheet to explore measures of spread and variation of a data set

A teacher has given her math class a test. The 25 scores are listed below.

45 49 50 57 62
67 68 72 72 73
78 79 79 80 81
81 81 84 85 85
88 90 91 92 100

Hands-on Spreadsheet: Explore the Data

Step 1: Test your knowledge

Now it’s your turn to show what you can do. In your existing Excel workbook, at the bottom of the window, click the plus sign next to the tab named Speed Cube Times to open a new page.Rename this tab Test Scores

  1. Store the data
  2. Obtain the descriptive statistics. What are the mean, median, and mode? What is the standard deviation? Does the data appear to be tightly centered or more spread out? What is the skewness? Does the mean appear to have been pulled left or right of center by any extra large or extra small values?

  3. Now create a frequency table and histogram. Use class intervals that correspond with letter grades: 40-59, 60-69, and so on.

Step 2: Create a 5 number summary

You may recall from a previous section in the text that a 5 number summary is a way of summarizing quantitative data using quartiles.

Quartiles

Quartiles are values that divide the data in quarters.

The first quartile (Q1) is the value so that 25% of the data values are below it; the third quartile (Q3) is the value so that 75% of the data values are below it. You may have guessed that the second quartile is the same as the median, since the median is the value so that 50% of the data values are below it.

This divides the data into quarters; 25% of the data is between the minimum and Q1, 25% is between Q1 and the median, 25% is between the median and Q3, and 25% is between Q3 and the maximum value.

Five number summary

The five number summary takes this form:

Minimum, Q1, Median, Q3, Maximum

taking care in the details

The real power of using a spreadsheet isn’t revealed the first time you use it to run an analysis or computation for you. Designing and creating a spreadsheet often takes longer than just doing the calculation by hand. The real power is revealed in using the sheet repeatedly. A few tedious long moments spent storing functions and formulas and testing them carefully can free up precious time in the future. As you type the formulas in, practice being careful and accurate.

Excel can find the 5 number summary for us.

  1. Type the label “5 number summary” into cell C22.
  2. Type the words Min, Q1, Median, Q3, and Max into cells C23, C24, C25, C26, and C27, respectively.
  3. We are going to use Excel’s QUARTILE.INC() function to find the 5 numbers. The formula takes as its input, the data list, followed by a comma, followed by a quantity. The quantity for the Min is 0. For Q1, it’s 1. For the Median, it’s 2. For Q3, it’s 3. And for the Max, it’s 4. You could also use the MIN() and MAX() functions to get those numbers. The array of data is A1:A25.
  4. Into cell D23, type =QUARTILE.INC(A1:A25,0) and enter. Notice that the cell is populated with the minimum data element, 45.
  5. Into cell D24, type =QUARTILE.INC(A1:A25,1)
  6. Into cell D25, type =QUARTILE.INC(A1:A25,2)
  7. Into cell D26, type =QUARTILE.INC(A1:A25,3)
  8. Into cell D26, type =QUARTILE.INC(A1:A25,4) 
  9. Highlight the data in A1 – A25 then click Insert. Under charts, look for the icon that resembles a histogram, Statistical Charts, and click it. Then choose Box and Whisker and click it.
  10. Note the box plot is vertical. You can change the Chart Title or delete it. The horizontal axis is labeled “1” because we just have 1 data series in the graph. You can delete that or change it. If you right-click on the blue box then choose data labels, you’ll see the 5 number summary show up in the graph.

Step 3: Analyze the data representations

Analyzing the graphical representation of the data, we can see that, even though several students did well on the test, the mean seems to have been pulled to the left of center by the 4 failing scores. The teacher can use all of these representations — the list of scores, the table, and the graphs — to make decisions about possible adjustments to her learning environment.