Reading: Secondary Data Analysis

While sociologists often engage in original research studies, they also contribute knowledge to the discipline through secondary data analysis. Secondary data do not come from firsthand research on primary sources; they are the already completed work of other researchers. Sociologists might study works written by historians, economists, teachers, or early sociologists. They might search through periodicals, newspapers, or magazines from any period in history.

[Image: a nearly illegible handwritten chart of census information listing households, names, ages, and addresses]

This 1930 Chicago census record is an example of secondary data.

Using available information not only saves time and money but can also add depth to a study. Sociologists often interpret findings in a new way, a way that was not part of an author’s original purpose or intention. To study how women were encouraged to act and behave in the 1960s, for example, a researcher might watch movies, television shows, and situation comedies from that period. Or to research changes in behavior and attitudes due to the emergence of television in the late 1950s and early 1960s, a sociologist would rely on new interpretations of secondary data. Decades from now, researchers will most likely conduct similar studies on the advent of mobile phones, the Internet, or Facebook.

Content Analysis of Poor in Magazines

Martin Gilens (1996) wanted to find out why survey research shows that the American public substantially exaggerates the percentage of African Americans among the poor. He examined whether media representations influence public perceptions and did a content analysis of photographs of poor people in American news magazines. He coded and then systematically recorded three variables for each photograph: (1) Race: white, black, indeterminate; (2) Employed: working, not working; and (3) Age. Gilens discovered that not only were African Americans markedly overrepresented in news magazine photographs of poverty, but that the photos also tended to underrepresent “sympathetic” subgroups of the poor (the elderly and the working poor) while overrepresenting less sympathetic groups (unemployed, working-age adults). Gilens concluded that by providing a distorted representation of poverty, U.S. news magazines “reinforce negative stereotypes of blacks as mired in poverty and contribute to the belief that poverty is primarily a ‘black problem’” (1996).
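The coding step in a content analysis like this one can be pictured as a simple tally. The Python sketch below is illustrative only: the records, the age labels, and the function name `tally_codes` are hypothetical, and none of it reproduces Gilens's data or procedure. Only the three variable names mirror those listed above.

```python
from collections import Counter

# Hypothetical coded photographs: made-up records for illustration only,
# not data from Gilens (1996).
coded_photos = [
    {"race": "black", "employed": "not working", "age": "working age"},
    {"race": "white", "employed": "working", "age": "working age"},
    {"race": "black", "employed": "not working", "age": "working age"},
    {"race": "indeterminate", "employed": "not working", "age": "elderly"},
]


def tally_codes(photos, variable):
    """Return each category's share of all photos for one coded variable."""
    counts = Counter(photo[variable] for photo in photos)
    total = sum(counts.values())
    return {category: count / total for category, count in counts.items()}


# Share of photographs falling into each race category.
race_shares = tally_codes(coded_photos, "race")
```

A researcher would then set shares like these against an outside benchmark, such as official poverty statistics, to judge whether a group is over- or underrepresented.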

Social scientists also learn by analyzing the research of a variety of agencies. Governmental departments and global groups, like the U.S. Bureau of Labor Statistics or the World Health Organization, publish studies with findings that are useful to sociologists. A public statistic like the foreclosure rate might be useful for studying the effects of the 2008 recession; a racial demographic profile might be compared with data on education funding to examine the resources accessible to different groups.
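Comparing two published sources usually starts with lining them up on a shared unit, such as a school district. The following Python sketch shows one way that pairing might look. The district names, the figures, and the function name `combine_sources` are all hypothetical and do not come from any agency's data.

```python
# Hypothetical figures for three made-up districts (illustration only).
# Share of residents belonging to a given demographic group:
demographic_share = {"District A": 0.62, "District B": 0.35, "District C": 0.18}
# Per-pupil education funding in dollars:
funding_per_pupil = {"District A": 9800, "District B": 11200, "District C": 13500}


def combine_sources(shares, funding):
    """Pair the two measures for every district found in both sources."""
    shared_districts = sorted(shares.keys() & funding.keys())
    return [(name, shares[name], funding[name]) for name in shared_districts]


# One row per district: (name, demographic share, per-pupil funding).
rows = combine_sources(demographic_share, funding_per_pupil)
```

Keeping only the districts that appear in both sources avoids comparing a figure against a missing value, which is a common snag when two agencies cover slightly different areas.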

One of the advantages of using secondary data is that it is nonreactive research (or unobtrusive research), meaning that it does not include direct contact with subjects and will not alter or influence people’s behaviors. Unlike studies requiring direct contact with people, using previously published data doesn’t require entering a population, and it avoids the investment and risks inherent in that research process.

Using available data does have its challenges. Public records are not always easy to access, and a researcher will need to do some legwork to track them down and gain permission to use them. To guide the search through a vast library of materials and avoid wasting time reading unrelated sources, sociologists employ content analysis, applying a systematic approach to record and value information gleaned from secondary data as it relates to the study at hand.

But, in some cases, there is no way to verify the accuracy of existing data. It is easy to count how many drunk drivers, for example, are pulled over by the police. But how many are not? While it’s possible to discover the percentage of teenage students who drop out of high school, it might be more challenging to determine the number who return to school or get their GED later.

Another problem arises when data are unavailable in the exact form needed or do not include the precise angle the researcher seeks. For example, the average salaries paid to professors at a public school are public record. But those figures don’t necessarily reveal how long it took each professor to reach the salary range, what their educational backgrounds are, or how long they’ve been teaching.

When conducting content analysis, it is important to consider the date of publication of an existing source and to take into account attitudes and common cultural ideals that may have influenced the research. For example, Robert S. Lynd and Helen Merrell Lynd gathered research for their book Middletown: A Study in Modern American Culture in the 1920s. Attitudes and cultural norms were vastly different then, and beliefs about gender roles, race, education, and work have changed significantly since. At the time, the study’s purpose was to reveal the truth about small U.S. communities. Today, it is an illustration of 1920s attitudes and values.


1. Which materials are considered secondary data?

  1. Photos and letters given to you by another person
  2. Books and articles written by other authors about their studies
  3. Information that you have gathered and now have included in your results
  4. Responses from participants whom you both surveyed and interviewed

2. Using secondary data is considered an unobtrusive or ________ research method.

  1. nonreactive
  2. nonparticipatory
  3. nonrestrictive
  4. non-confrontive