Learning Outcomes
- Create and interpret frequency tables.
Once you have a set of data, you will need to organize it so that you can analyze how frequently each datum occurs in the set. When calculating frequencies, you may also need to round your answers, and the way you round determines how much precision you keep.
Answers and Rounding Off
A simple way to round off answers is to carry your final answer one more decimal place than was present in the original data. Round off only the final answer. Do not round off any intermediate results, if possible. If it becomes necessary to round off intermediate results, carry them to at least twice as many decimal places as the final answer. For example, the average of the three quiz scores four, six, and nine is [latex]6.3[/latex], rounded off to the nearest tenth, because the data are whole numbers. Most answers will be rounded off in this manner.
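The rounding rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `round_final_answer` and its `data_decimals` parameter are hypothetical names chosen for this example, not part of any library.

```python
def round_final_answer(values, data_decimals=0):
    """Average the values, then round once at the end.

    data_decimals (a hypothetical parameter) is the number of decimal
    places in the original data; the result keeps one more than that.
    """
    mean = sum(values) / len(values)        # intermediate result, not rounded
    return round(mean, data_decimals + 1)   # round only the final answer


quiz_scores = [4, 6, 9]                     # whole numbers: zero decimal places
print(round_final_answer(quiz_scores))      # 6.3
```

The intermediate mean is kept at full floating-point precision, and rounding happens exactly once, on the final answer.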
Levels of Measurement
The way a set of data is measured is called its level of measurement. Correct statistical procedures depend on a researcher being familiar with levels of measurement. Not every statistical operation can be used with every set of data. Data can be classified into four levels of measurement. They are (from lowest to highest level):
- Nominal scale level
- Ordinal scale level
- Interval scale level
- Ratio scale level
Data that is measured using a nominal scale is qualitative. Categories, colors, names, labels and favorite foods along with yes or no responses are examples of nominal level data. Nominal scale data are not ordered. For example, trying to rank people according to their favorite food does not make any sense. Putting pizza first and sushi second is not meaningful.
Smartphone companies are another example of nominal scale data. Some examples are Sony, Motorola, Nokia, Samsung and Apple. This is just a list and there is no agreed upon order. Some people may favor Apple but that is a matter of opinion. Nominal scale data cannot be used in calculations.
Data that is measured using an ordinal scale is similar to nominal scale data, with one big difference: ordinal scale data can be ordered. An example of ordinal scale data is a list of the top five national parks in the United States. The parks can be ranked from one to five, but we cannot measure the differences between the ranks.
Another example of using the ordinal scale is a cruise survey where the responses to questions about the cruise are “excellent,” “good,” “satisfactory,” and “unsatisfactory.” These responses are ordered from the most desired response to the least desired. But the differences between two pieces of data cannot be measured. Like the nominal scale data, ordinal scale data cannot be used in calculations.
Data that is measured using the interval scale is similar to ordinal level data because it has a definite ordering, but unlike ordinal data, the differences between interval scale data can be measured. Interval scale data does not, however, have a true starting point.
Temperature scales like Celsius (C) and Fahrenheit (F) are measured by using the interval scale. In both temperature measurements, [latex]40°[/latex] is equal to [latex]100°[/latex] minus [latex]60°[/latex]. Differences make sense. But [latex]0[/latex] degrees does not make sense as a starting point because, in both scales, [latex]0[/latex] is not the absolute lowest temperature. Temperatures like [latex]-10°[/latex] F and [latex]-15°[/latex] C exist and are colder than [latex]0[/latex].
Interval level data can be used in calculations, but one type of comparison cannot be done. [latex]80°[/latex] C is not four times as hot as [latex]20°[/latex] C (nor is [latex]80°[/latex] F four times as hot as [latex]20°[/latex] F). There is no meaning to the ratio of [latex]80[/latex] to [latex]20[/latex] (or four to one).
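One way to see why the ratio is not meaningful is to re-express the same two temperatures on a scale whose zero is the lowest possible temperature. The sketch below uses the Kelvin scale, which is not mentioned in the text above and is brought in here only as an illustration; the helper name `celsius_to_kelvin` is likewise hypothetical.

```python
def celsius_to_kelvin(celsius):
    """Convert Celsius to Kelvin, a scale whose zero is absolute zero."""
    return celsius + 273.15


hot_c, cool_c = 80, 20

naive_ratio = hot_c / cool_c                                         # ratio of the Celsius readings
kelvin_ratio = celsius_to_kelvin(hot_c) / celsius_to_kelvin(cool_c)  # ratio measured from absolute zero

print(naive_ratio)             # 4.0
print(round(kelvin_ratio, 2))  # 1.2
```

The difference between the two temperatures is [latex]60[/latex] degrees on either scale, but the "four to one" ratio disappears once the zero point is moved, which is why ratios of interval scale data carry no meaning.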
Data that is measured using the ratio scale takes care of the ratio problem and gives you the most information. Ratio scale data is like interval scale data, but it has a [latex]0[/latex] point and ratios can be calculated. For example, four multiple choice statistics final exam scores are [latex]80[/latex], [latex]68[/latex], [latex]20[/latex] and [latex]92[/latex] (out of a possible [latex]100[/latex] points). The exams are machine-graded.
The data can be put in order from lowest to highest: [latex]20[/latex], [latex]68[/latex], [latex]80[/latex], [latex]92[/latex].
The differences between the data have meaning. The score [latex]92[/latex] is more than the score [latex]68[/latex] by [latex]24[/latex] points. Ratios can be calculated because the smallest possible score is [latex]0[/latex]. So [latex]80[/latex] is four times [latex]20[/latex]. The score of [latex]80[/latex] is four times better than the score of [latex]20[/latex].
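The three operations that ratio scale data support (ordering, differences, and ratios) can be checked directly on the four exam scores. The following is a minimal sketch; the variable names are chosen for this example only.

```python
exam_scores = [80, 68, 20, 92]

ordered = sorted(exam_scores)   # ordering is meaningful
difference = 92 - 68            # differences are meaningful
ratio = 80 / 20                 # ratios are meaningful because 0 is a true starting point

print(ordered)     # [20, 68, 80, 92]
print(difference)  # 24
print(ratio)       # 4.0
```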
Frequency
Twenty students were asked how many hours they worked per day. Their responses, in hours, are as follows: [latex]5[/latex], [latex]6[/latex], [latex]3[/latex], [latex]3[/latex], [latex]2[/latex], [latex]4[/latex], [latex]7[/latex], [latex]5[/latex], [latex]2[/latex], [latex]3[/latex], [latex]5[/latex], [latex]6[/latex], [latex]5[/latex], [latex]4[/latex], [latex]4[/latex], [latex]3[/latex], [latex]5[/latex], [latex]2[/latex], [latex]5[/latex], [latex]3[/latex].
The following table lists the different data values in ascending order and their frequencies.
DATA VALUE | FREQUENCY |
---|---|
[latex]2[/latex] | [latex]3[/latex] |
[latex]3[/latex] | [latex]5[/latex] |
[latex]4[/latex] | [latex]3[/latex] |
[latex]5[/latex] | [latex]6[/latex] |
[latex]6[/latex] | [latex]2[/latex] |
[latex]7[/latex] | [latex]1[/latex] |
A frequency is the number of times a value of the data occurs. According to the table, there are three students who work two hours, five students who work three hours, and so on. The sum of the values in the frequency column, [latex]20[/latex], represents the total number of students included in the sample.
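The frequency table can be reproduced with a short Python sketch. It uses `collections.Counter` from the standard library to tally each value; the variable names are chosen for this example only.

```python
from collections import Counter

# The twenty responses, in hours worked per day
hours_worked = [5, 6, 3, 3, 2, 4, 7, 5, 2, 3,
                5, 6, 5, 4, 4, 3, 5, 2, 5, 3]

frequency = Counter(hours_worked)   # maps each data value to its count

# List the data values in ascending order with their frequencies
for value in sorted(frequency):
    print(value, frequency[value])

total = sum(frequency.values())     # total number of students in the sample
print("total:", total)              # total: 20
```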
A relative frequency is the ratio (fraction or proportion) of the number of times a value of the data occurs in the set of all outcomes to the total number of outcomes. To find the relative frequencies, divide each frequency by the total number of students in the sample, which in this case is [latex]20[/latex]. Relative frequencies can be written as fractions, percents, or decimals.
DATA VALUE | FREQUENCY | RELATIVE FREQUENCY |
---|---|---|
[latex]2[/latex] | [latex]3[/latex] | [latex]\displaystyle\frac{3}{20}[/latex] or [latex]0.15[/latex] |
[latex]3[/latex] | [latex]5[/latex] | [latex]\displaystyle\frac{5}{20}[/latex] or [latex]0.25[/latex] |
[latex]4[/latex] | [latex]3[/latex] | [latex]\displaystyle\frac{3}{20}[/latex] or [latex]0.15[/latex] |
[latex]5[/latex] | [latex]6[/latex] | [latex]\displaystyle\frac{6}{20}[/latex] or [latex]0.30[/latex] |
[latex]6[/latex] | [latex]2[/latex] | [latex]\displaystyle\frac{2}{20}[/latex] or [latex]0.10[/latex] |
[latex]7[/latex] | [latex]1[/latex] | [latex]\displaystyle\frac{1}{20}[/latex] or [latex]0.05[/latex] |
The sum of the values in the relative frequency column of the previous table is [latex]\frac{20}{20}[/latex], or [latex]1[/latex].
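As a sketch of the same calculation in Python, each frequency is divided by the sample size. `fractions.Fraction` from the standard library is used here so the sum can be checked exactly; that choice, like the variable names, is an assumption of this example rather than a required method.

```python
from collections import Counter
from fractions import Fraction

hours_worked = [5, 6, 3, 3, 2, 4, 7, 5, 2, 3,
                5, 6, 5, 4, 4, 3, 5, 2, 5, 3]

frequency = Counter(hours_worked)
n = len(hours_worked)               # total number of students: 20

# Relative frequency = frequency / total number of outcomes
relative_frequency = {
    value: Fraction(frequency[value], n) for value in sorted(frequency)
}

for value, rel in relative_frequency.items():
    print(f"{value}  {frequency[value]}/{n}  {float(rel):.2f}")

print(sum(relative_frequency.values()))   # 1
```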
Cumulative relative frequency is the accumulation of the previous relative frequencies. To find the cumulative relative frequencies, add all the previous relative frequencies to the relative frequency for the current row, as shown in the table below.
DATA VALUE | FREQUENCY | RELATIVE FREQUENCY | CUMULATIVE RELATIVE FREQUENCY |
---|---|---|---|
[latex]2[/latex] | [latex]3[/latex] | [latex]\displaystyle\frac{3}{20}[/latex] or [latex]0.15[/latex] | [latex]0.15[/latex] |
[latex]3[/latex] | [latex]5[/latex] | [latex]\displaystyle\frac{5}{20}[/latex] or [latex]0.25[/latex] | [latex]0.15 + 0.25 = 0.40[/latex] |
[latex]4[/latex] | [latex]3[/latex] | [latex]\displaystyle\frac{3}{20}[/latex] or [latex]0.15[/latex] | [latex]0.40 + 0.15 = 0.55[/latex] |
[latex]5[/latex] | [latex]6[/latex] | [latex]\displaystyle\frac{6}{20}[/latex] or [latex]0.30[/latex] | [latex]0.55 + 0.30 = 0.85[/latex] |
[latex]6[/latex] | [latex]2[/latex] | [latex]\displaystyle\frac{2}{20}[/latex] or [latex]0.10[/latex] | [latex]0.85 + 0.10 = 0.95[/latex] |
[latex]7[/latex] | [latex]1[/latex] | [latex]\displaystyle\frac{1}{20}[/latex] or [latex]0.05[/latex] | [latex]0.95 + 0.05 = 1.00[/latex] |
The last entry of the cumulative relative frequency column is one, indicating that one hundred percent of the data has been accumulated.
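The running total in the last column can be sketched with `itertools.accumulate`, which yields the sum of all items seen so far. The variable names below are chosen for this example only.

```python
from collections import Counter
from fractions import Fraction
from itertools import accumulate

hours_worked = [5, 6, 3, 3, 2, 4, 7, 5, 2, 3,
                5, 6, 5, 4, 4, 3, 5, 2, 5, 3]

frequency = Counter(hours_worked)
n = len(hours_worked)
values = sorted(frequency)                          # data values in ascending order

relative = [Fraction(frequency[v], n) for v in values]
cumulative = list(accumulate(relative))             # running sum of relative frequencies

for v, rel, cum in zip(values, relative, cumulative):
    print(f"{v}  {float(rel):.2f}  {float(cum):.2f}")
```

Because the data values are sorted first, each cumulative entry is the proportion of students who worked that many hours or fewer.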
NOTE
Because of rounding, the relative frequency column may not always sum to one, and the last entry in the cumulative relative frequency column may not be one. However, they each should be close to one.
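A small illustration of this note, using hypothetical frequencies that are not part of the example above: three data values that each occur once have relative frequencies of one third, and rounding each to two decimal places loses a little of the total.

```python
counts = [1, 1, 1]                  # hypothetical frequencies, for illustration only
total = sum(counts)

# Round each relative frequency to two decimal places
rounded = [round(count / total, 2) for count in counts]

print(rounded)                      # [0.33, 0.33, 0.33]
print(round(sum(rounded), 2))       # 0.99
```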
Concept Review
Some calculations generate numbers that are artificially precise. It is not necessary to report a value to eight decimal places when the measures that generated that value were only accurate to the nearest tenth. Round off your final answer to one more decimal place than was present in the original data. This means that if you have data measured to the nearest tenth of a unit, report the final statistic to the nearest hundredth.
In addition to rounding your answers, you can classify your data using the following four levels of measurement.
- Nominal scale level: data that cannot be ordered and cannot be used in calculations.
- Ordinal scale level: data that can be ordered; the differences cannot be measured.
- Interval scale level: data with a definite ordering but no starting point; the differences can be measured, but ratios are not meaningful.
- Ratio scale level: data with a starting point that can be ordered; the differences have meaning and ratios can be calculated.
When organizing data, it is important to know how many times a value appears. How many statistics students study five hours or more for an exam? What percent of families on our block own two pets? Frequency, relative frequency, and cumulative relative frequency are measures that answer questions like these.
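A question of the "five hours or more" kind can be answered directly from a frequency count. The sketch below reuses the hours-worked responses from the Frequency section as stand-in data; the cutoff variable `threshold` is a name chosen for this example.

```python
from collections import Counter
from fractions import Fraction

hours_worked = [5, 6, 3, 3, 2, 4, 7, 5, 2, 3,
                5, 6, 5, 4, 4, 3, 5, 2, 5, 3]

frequency = Counter(hours_worked)
n = len(hours_worked)

threshold = 5                       # "five hours or more"
at_least = sum(count for value, count in frequency.items() if value >= threshold)
share = Fraction(at_least, n)       # relative frequency of values at or above the cutoff

print(at_least)                     # 9
print(f"{float(share):.0%}")        # 45%
```

The same share is one minus the cumulative relative frequency for four hours, [latex]1 - 0.55 = 0.45[/latex].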