Putting It Together: Collecting and Classifying Data

You’ve been learning about ways to make sense of the overload of data that constantly surrounds you. Now that you’ve completed the module, you are better equipped to decide how to collect data and to judge what the data truly represent.

At some point, you have probably taken part in an attempt to collect data. Perhaps you were asked to take a “quick” survey about your attitude toward a political or consumer issue, and you agreed.

Sketch of a red rotary phone that looks like it's ringing.

The survey took longer than promised, but at least you helped the person collect random and unbiased data. Or did you? In order to appreciate the data collected, you need to ask yourself a few questions:

  • Is it possible to get a truly random sample from a phone survey?
  • What population was the sample drawn from?
  • What are possible sources of bias?
  • Is the data being collected from you a statistic or a parameter? Is it quantitative or qualitative?
  • Is this an experiment or observational study?

Calling a list of phone numbers is partly random, but there is potential to leave out part of the population.  For example, a phone survey misses people who don’t have a landline and use only a cell phone.  It also misses people who monitor calls and let the answering machine pick up. So the sample you were part of wasn’t truly random.
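The effect of leaving out part of the population can be illustrated with a small simulation. The sketch below is only an illustration: the population, the 40% landline rate, and the assumption that landline owners rate their recycling differently from everyone else are all made up, and the names `population`, `mean_rating`, and `phone_sample` are hypothetical. It compares a simple random sample of everyone with a sample drawn only from people a landline survey could reach.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: every number here is invented for illustration.
# Each person is a pair (has_landline, recycling rating from 1 to 5).
population = []
for _ in range(10_000):
    has_landline = random.random() < 0.4  # assumed landline ownership rate
    # Assumption: landline owners tend to give themselves higher ratings.
    weights = [1, 2, 3, 3, 2] if has_landline else [3, 3, 2, 1, 1]
    rating = random.choices([1, 2, 3, 4, 5], weights=weights)[0]
    population.append((has_landline, rating))


def mean_rating(people):
    """Average rating of a list of (has_landline, rating) pairs."""
    return sum(rating for _, rating in people) / len(people)


# A simple random sample can select anyone in the population.
random_sample = random.sample(population, 500)

# A landline phone survey can only reach people who have a landline.
reachable = [person for person in population if person[0]]
phone_sample = random.sample(reachable, 500)

print(f"population mean rating:    {mean_rating(population):.2f}")
print(f"random sample mean rating: {mean_rating(random_sample):.2f}")
print(f"phone sample mean rating:  {mean_rating(phone_sample):.2f}")
```

Under these assumptions, the random sample should land close to the population mean, while the phone sample drifts toward the landline owners' ratings. If the two groups did not differ on the question being asked, the missing cell-phone-only users would matter much less.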

 

Let’s consider some of the results, which are mailed to you at a later date.  The first thing you see is this bar graph.  What does it suggest?

Bar graph is titled “Recycling Efforts”. The x-axis shows values 1, 2, 3, 4, 5 whereas the y-axis extends from 0 to 12. The y-axis is labeled “Number of Respondents”. The bar at 1 has a height of 9. The bar at 2 has a height of 8. The bar at 3 has a height of 10. The bar at 4 has a height of 6. The bar at 5 has a height of 4.

1 = never, 5 = as much as possible

 

When you look closely, you notice that the data does not record a specific quantity, such as how many times a week the respondent recycles. Instead, the survey takes qualitative data (each respondent's own assessment of how much they recycle) and quantifies that information on a scale of 1 to 5.

You can also see that the majority of respondents (27 of 37) rate their recycling at or below 3, the midpoint of the scale. The graph also shows that the category selected by the most survey-takers is 3, halfway between never recycling and recycling as much as possible. But is this what the general public truly thinks about recycling?
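These observations can be checked from the bar heights given for the graph. The short sketch below uses only those five counts; the variable names are illustrative. It also computes a mean rating, which assumes the 1 to 5 labels can be treated as evenly spaced numbers, something the survey itself does not guarantee.

```python
# Bar heights from the "Recycling Efforts" graph: rating -> number of respondents
counts = {1: 9, 2: 8, 3: 10, 4: 6, 5: 4}

total = sum(counts.values())

# Respondents who chose the midpoint (3) or lower
at_or_below_mid = sum(n for rating, n in counts.items() if rating <= 3)
share = at_or_below_mid / total

# The mode is the rating chosen by the most respondents
mode = max(counts, key=counts.get)

# Mean rating, treating the ordinal labels as numbers (an assumption)
mean = sum(rating * n for rating, n in counts.items()) / total

print(f"respondents: {total}")                                # 37
print(f"rated 3 or lower: {at_or_below_mid} ({share:.0%})")   # 27 (73%)
print(f"most common rating: {mode}")                          # 3
print(f"mean rating: {mean:.2f}")                             # 2.68
```

About 73% of the 37 respondents chose 3 or lower, which is what supports the word "majority" here. Note that this describes only the people who answered the survey, not the general public.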

To find out, you need to take this research into your own hands by considering additional methods of data collection. If you are interested in qualitative data, be prepared for more time-consuming research. The main methods of conducting this research are individual interviews, focus groups, and direct observation. You might ask people at random around town. Or you might prefer quantitative data and count the number of recycling bins placed outside homes on collection day. But be careful not to bias your results. If you include data collected in a town that does not have recycling pickup, you will most likely obtain different results than in an area where recycling pickup is easy and free.

Collecting unbiased, useful data is a challenging task.  You must always be sure to take possible errors into account and design your data collection method to minimize them.  How would you design a method to collect data about society’s dedication to recycling?