Statistics as Evidence

Learning Objectives

Evaluate statistics as evidence


Should farm workers earn more money? To make the argument one way or another, you’d need to use statistics to back up your claims.

Statistical evidence supports arguments by grounding claims in verifiable numbers. Consider the claim “farm workers should be paid more.” There are a number of moral or social reasons one could give to back this claim up, but these reasons are unlikely to be convincing to a neutral party, let alone someone who disagrees. By bringing data into the mix, however, one can both clarify claims and anticipate objections. A 2020 article in Rural Migration News, based at the University of California, Davis, combines statistics from the US Bureau of Labor Statistics’ Consumer Expenditure Survey with data about farm wages to make an argument about food prices and farm wages:

“If average farm worker earnings rose by 40 percent, and the increase were passed on to consumers, average spending on fresh fruits and vegetables for a typical household would rise by $25 a year (4 percent x $615 = $24.60). A 40 percent wage increase, on the other hand, would raise the average earnings of seasonal farm workers from $14,000 for 1,000 hours of work to $19,600, lifting the earnings of a farm worker household of four from half of the federal poverty line of $25,750 in 2019 to three-fourths of the poverty line.” (“Food Spending: 2019.”)

In this paragraph, statistics do a number of things. The breakdown of prices and wages explains how farm workers’ income could be improved; the explanation of the relatively minor increase in consumer prices argues against those who claim that wage increases would make food too expensive; and the data on workers’ incomes relative to the poverty line argues (implicitly) that current wages are too low.
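
One useful habit when you meet statistics like these is to redo the arithmetic yourself. The short Python sketch below does that for the quoted passage. Every number comes from the quotation; the variable names are our own and are purely illustrative.

```python
# Figures taken from the Rural Migration News passage quoted above.
# The variable names are illustrative, not from the source.
produce_spending = 615.00    # average yearly household spending on fresh produce ($)
spending_increase = 0.04     # the 4 percent rise in spending cited in the passage
current_earnings = 14_000    # seasonal farm worker earnings for 1,000 hours ($)
wage_increase = 0.40         # the proposed 40 percent raise
poverty_line_2019 = 25_750   # federal poverty line for a household of four ($)

# Cost to a typical household if the raise is passed on to consumers.
extra_cost = produce_spending * spending_increase

# Worker earnings after the raise, and earnings as a share of the poverty line.
new_earnings = current_earnings * (1 + wage_increase)
share_before = current_earnings / poverty_line_2019
share_after = new_earnings / poverty_line_2019

print(f"Extra household cost per year: ${extra_cost:.2f}")
print(f"Earnings after the raise: ${new_earnings:,.0f}")
print(f"Share of poverty line before: {share_before:.0%}")
print(f"Share of poverty line after: {share_after:.0%}")
```

The results match the passage: about $24.60 a year for the household and $19,600 for the worker. The check also shows that “half” and “three-fourths” of the poverty line are rounded descriptions of roughly 54 percent and 76 percent, a reminder that summary phrases often simplify the underlying numbers.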

Using Data Sources

Using data as sources can improve your research in a variety of ways.
  • Learn more background information.
    • Data and statistics can help give your research context and background. When we’re looking at a problem, for instance, we tend to have questions about the scale of the problem. How many people are affected? What percentage of the total population does this represent?  How long has this been going on?
  • Answer your research question.
    • Some research questions, as we’ll see below, can best be answered with the help of statistical data. The evidence these data provide can help you decide on the best answer for your question.
  • Convince your audience that your answer is correct.
    • Data often give you—and your audience—evidence that your answer to your research question is correct or at least a reasonable answer.
  • Report what others have said about your research question.
    • The kind of statistics that have been collected about a research area or question can tell us a lot about what people have studied or find interesting about a topic.

Sometimes data is actually necessary to answer research questions, particularly in the social sciences and life and physical sciences. For instance, data would be necessary to support or rule out these hypotheses:

  • More women than men voted in the last presidential election in a majority of states.
  • A certain drug shows promising results in the treatment of pancreatic cancer.
  • Listening to certain genres of music lowers blood pressure.
  • People of certain religious denominations are more likely to find a specific television program objectionable.
  • The average weight of house cats in the United States has increased over the past 30 years.
  • The average square footage of supermarkets in the United States has increased in the past 20 years.
  • More tomatoes were consumed per person in the United Kingdom in 2015 than in 1962.
  • Exploding volcanoes can help cool the planet by spewing sulfur dioxide, which combines with water vapor to make reflective aerosols.

Using numeric data in the portions of your final product that require evidence can strengthen your answer to your research question. Even where data are not strictly necessary, numbers can be persuasive and can sharpen your points in other portions of your final product, such as those describing the situation surrounding your research question.

For example, for a term paper about the research question “Why is there a gap in the number of people who qualify for food from foodbanks and the number of people who use foodbanks?” you could find data on the website of Feeding America, the nation’s largest network of foodbanks. Those data might include the number of people who get food from a foodbank annually, with the numbers of seniors and children broken out. Such data won’t answer your research question, but they will help you describe the situation around that question and give your audience a fuller understanding.

There are two ways of obtaining data:

  • Obtain data that already has been collected and analyzed. That’s what this section will cover.
  • Collect data yourself. This can include activities such as making observations about your environment, conducting surveys or interviews, directly recording measurements in a lab or in the field, or even receiving electronic data recorded by computers/machines that gather the data. We won’t cover these techniques in depth here, but you may encounter them in other courses.

Finding Data in Articles, Books, Web Pages, and More

Numeric research data can be found all over the place. A lot of it appears as part of another source, such as a book; a journal, newspaper, or magazine article; or a web page. In these cases, the data do not stand alone as a distinct element, but instead are part of the larger work.

Many scholarly sources you turn up are likely to contain data. Once you find potential sources, skim them for tables, graphs, or charts. These items are displays or illustrations of data gathered by researchers.  However, sometimes data and interpretations are solely in the body of the narrative text and may be included in sections called “Results” or “Findings.” (That shouldn’t keep you from displaying the data in charts, graphs, or tables as you like in your own work, though.)

Depending on your research question, you may need to gather data from multiple sources to get everything you need to answer your research question and make your argument for it.

For instance, in our example related to foodbanks above, we suggested where you could find statistics about the number of people who get food from American foodbanks. But with that research question (“Why is there a gap in the number of people who qualify for food from foodbanks and the number of people who use foodbanks?”), you would also need to find out from another source how many people qualify for foodbanks based on their income and compare that number with how many people actually use foodbanks.

Finding Data, Data Depositories, and Directories

Sometimes the numeric research data you need may not be in the articles, books, and web sites that you’ve found. But that doesn’t mean that it hasn’t been collected and packaged in a useable format. Governments and research institutions often publish data they have collected in discipline-specific data depositories that make data available online. Here are some examples:

Don’t know of a depository that could contain data in your discipline? Check out a data directory such as re3data.org.

Evaluating Data as Sources

Evaluating data for relevance and credibility is just as important as evaluating any other source. And as with any other source, no data set is ever 100% perfect: you’ll have to make educated guesses (inferences) about whether the data are good enough for your purpose.

To evaluate data, you’ll need to find out how the data were collected. If the book, article, or web page you are reading got the data from somewhere else, evaluate that original source in the same way. The book, article, or web page should cite where the data came from; if it doesn’t, that is an argument against using the data.

If the data are in a research journal article, read the entire article, including the section called Methodology, which tells how the data were collected. Then determine the data’s relevance to your research question by considering such questions as:

  • Were the data collected recently enough?
  • Are the data cross-sectional (based on information collected from people at a single point in time) or longitudinal (based on information collected from the same people over time)? If one is more appropriate for your research question than the other, is there information that you can still logically infer from the data you have?
  • Were the types of people from whom the data were collected the same type of people your research question addresses? The more representative the study’s sample is of the group your research question addresses, the more confident you can be in using the data to make your argument in your final product.
  • Was the data analysis done at the right level for your research question? For instance, it may have been done at the individual, family, business, state, or zip code level. But if that doesn’t relate to your research question, can you still logically make inferences that will help your argument? Here’s an example: Imagine that your research question asks whether participation in high school sports in Columbus City Schools is positively associated with enrolling in college. But the data you are evaluating is analyzed at the state level. So you have data about the whole state of Ohio’s schools and not Columbus in particular. In this case, ask yourself whether there is still any inference you can make from the data.

Tips for Using Statistics

Whenever you present statistics, it is important to examine them with a careful and critical eye. Statistics can tell a powerful story, but bad statistics can mislead your audience, weaken your argument, and damage your credibility. To avoid the pitfalls of bad statistics:

  • Use reputable sources for statistics, such as government websites, academic institutions and reputable research organizations, and policy/research think tanks.
  • Rely on statistics drawn from a large enough sample to be trustworthy (for example, a survey that asked only four people is unlikely to represent the population’s viewpoint).
  • Ensure that data are not biased. Survey answers can be manipulated by wording a question so that it nudges respondents toward a particular answer.
  • Remember, correlation does not equal causation. For instance, although there is a correlation between increased ice cream consumption and forest fires, that doesn’t mean that ice cream causes forest fires—it just means that both increase during the summer.
  • Use statistics that are easily understood. Many people understand what an average is, but far fewer are familiar with more complex ideas such as variation and standard deviation.
  • When presenting graphs, make sure that the key points are highlighted and that the graphs do not misrepresent the values presented.
  • Statistics can be intimidating. When presenting statistical ideas or even using numbers in your argument, be sure to thoroughly explain what the numbers mean and use visual aids to help you explain.
  • Tables, graphs, and maps should relate directly to the argument, support statements made in the text, summarize relevant sections of the data analysis, and be clearly labeled.
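
To see why sample size matters in the tips above, it can help to estimate how far a survey result might stray from the true value. The sketch below uses the standard textbook approximation for the 95 percent margin of error of a surveyed proportion. It assumes a simple random sample and the worst-case proportion of 0.5, and the function name `margin_of_error` is our own. Treat it as a rough illustration, not a substitute for a proper statistical analysis; the approximation is especially crude for very small samples.

```python
import math


def margin_of_error(sample_size: int, proportion: float = 0.5, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a proportion from a simple random sample.

    Uses the normal approximation z * sqrt(p * (1 - p) / n).
    proportion=0.5 is the worst case; z=1.96 corresponds to 95% confidence.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be a positive integer")
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


# Compare a four-person survey with larger samples.
for n in (4, 100, 1000):
    print(f"n = {n:>4}: about +/- {margin_of_error(n):.1%}")
```

Under these assumptions, a four-person survey carries a margin of error of roughly 49 percentage points, so it tells you almost nothing about the wider population. A sample of 1,000 narrows the margin to about 3 points.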
