Chapter 4 Theories in Scientific Research

As we know from previous chapters, science is knowledge represented as a collection of “theories” derived using the scientific method. In this chapter, we examine what a theory is, why we need theories in research, what the building blocks of a theory are, how to evaluate theories, and how to apply theories in research. We also present illustrative examples of five theories frequently used in social science research.

Theories

Theories are explanations of a natural or social behavior, event, or phenomenon. More formally, a scientific theory is a system of constructs (concepts) and propositions (relationships between those constructs) that collectively presents a logical, systematic, and coherent explanation of a phenomenon of interest within some assumptions and boundary conditions (Bacharach 1989). [1]

Theories should explain why things happen, rather than just describe or predict. Note that it is possible to predict events or behaviors using a set of predictors, without necessarily explaining why such events are taking place. For instance, market analysts predict fluctuations in the stock market from market announcements, earnings reports of major companies, and new data from the Federal Reserve and other agencies, relying on previously observed correlations. Prediction requires only correlations. In contrast, explanations require causations, or understanding of cause-effect relationships. Establishing causation requires three conditions: (1) correlations between two constructs, (2) temporal precedence (the cause must precede the effect in time), and (3) rejection of alternative hypotheses (through testing). Scientific theories are different from theological, philosophical, or other explanations in that scientific theories can be empirically tested using scientific methods.

Explanations can be idiographic or nomothetic. Idiographic explanations are those that explain a single situation or event in idiosyncratic detail. For example, you did poorly on an exam because: (1) you forgot that you had an exam on that day, (2) you arrived late to the exam due to a traffic jam, (3) you panicked midway through the exam, (4) you had to work late the previous evening and could not study for the exam, or even (5) your dog ate your textbook. The explanations may be detailed, accurate, and valid, but they may not apply to other similar situations, even involving the same person, and are hence not generalizable. In contrast, nomothetic explanations seek to explain a class of situations or events rather than a specific situation or event. For example, students who do poorly in exams do so because they did not spend adequate time preparing for exams or because they suffer from nervousness, attention deficit, or some other medical disorder. Because nomothetic explanations are designed to be generalizable across situations, events, or people, they tend to be less precise, less complete, and less detailed. However, they explain economically, using only a few explanatory variables. Because theories are also intended to serve as generalized explanations for patterns of events, behaviors, or phenomena, theoretical explanations are generally nomothetic in nature.

While understanding theories, it is also important to understand what theory is not. Theory is not data, facts, typologies, taxonomies, or empirical findings. A collection of facts is not a theory, just as a pile of stones is not a house. Likewise, a collection of constructs (e.g., a typology of constructs) is not a theory, because theories must go well beyond constructs to include propositions, explanations, and boundary conditions. Data, facts, and findings operate at the empirical or observational level, while theories operate at a conceptual level and are based on logic rather than observations.

There are many benefits to using theories in research. First, theories provide the underlying logic of the occurrence of natural or social phenomena by explaining what the key drivers and key outcomes of the target phenomenon are and why, and what underlying processes are responsible for driving that phenomenon. Second, they aid in sense-making by helping us synthesize prior empirical findings within a theoretical framework and reconcile contradictory findings by discovering contingent factors influencing the relationship between two constructs in different studies. Third, theories provide guidance for future research by helping identify constructs and relationships that are worthy of further research. Fourth, theories can contribute to cumulative knowledge building by bridging gaps between other theories and by causing existing theories to be reevaluated in a new light.

However, theories can also have their own share of limitations. As simplified explanations of reality, theories may not always provide adequate explanations of the phenomenon of interest based on a limited set of constructs and relationships. Theories are designed to be simple and parsimonious explanations, while reality may be significantly more complex. Furthermore, theories may impose blinders or limit researchers’ “range of vision,” causing them to miss out on important concepts that are not defined by the theory.

Building Blocks of a Theory

David Whetten (1989) suggests that there are four building blocks of a theory: constructs, propositions, logic, and boundary conditions/assumptions. Constructs capture the “what” of theories (i.e., what concepts are important for explaining a phenomenon), propositions capture the “how” (i.e., how these concepts are related to each other), logic represents the “why” (i.e., why these concepts are related), and boundary conditions/assumptions examine the “who, when, and where” (i.e., under what circumstances these concepts and relationships will work). Though constructs and propositions were previously discussed in Chapter 2, we describe them again here for the sake of completeness.

Constructs are concepts specified at a high level of abstraction that are chosen specifically to explain the phenomenon of interest. Recall from Chapter 2 that constructs may be unidimensional (i.e., embody a single concept), such as weight or age, or multi-dimensional (i.e., embody multiple underlying concepts), such as personality or culture. While some constructs, such as age, education, and firm size, are easy to understand, others, such as creativity, prejudice, and organizational agility, may be more complex and abstruse, and still others, such as trust, attitude, and learning, may represent temporal tendencies rather than steady states. Nevertheless, all constructs must have clear and unambiguous operational definitions that specify exactly how the construct will be measured and at what level of analysis (individual, group, organizational, etc.). Measurable representations of abstract constructs are called variables. For instance, intelligence quotient (IQ score) is a variable that is purported to measure an abstract construct called intelligence. As noted earlier, scientific research proceeds along two planes: a theoretical plane and an empirical plane. Constructs are conceptualized at the theoretical plane, while variables are operationalized and measured at the empirical (observational) plane. Furthermore, variables may be independent, dependent, mediating, or moderating, as discussed in Chapter 2. The distinction between constructs (conceptualized at the theoretical level) and variables (measured at the empirical level) is shown in Figure 4.1.

Flowchart showing the theoretical plane with construct A leading to a proposition of construct B, then the empirical plane below with the independent variable leading to a hypothesis about the dependent variable.

Figure 4.1. Distinction between theoretical and empirical concepts

Propositions are associations postulated between constructs based on deductive logic. Propositions are stated in declarative form and should ideally indicate a cause-effect relationship (e.g., if X occurs, then Y will follow). Note that propositions may be conjectural but MUST be testable, and should be rejected if they are not supported by empirical observations. However, like constructs, propositions are stated at the theoretical level, and they can only be tested by examining the corresponding relationship between measurable variables of those constructs. The empirical formulations of propositions, stated as relationships between variables, are called hypotheses. The distinction between propositions (formulated at the theoretical level) and hypotheses (tested at the empirical level) is depicted in Figure 4.1.

The third building block of a theory is the logic that provides the basis for justifying the propositions as postulated. Logic acts like a “glue” that connects the theoretical constructs and provides meaning and relevance to the relationships between these constructs. Logic also represents the “explanation” that lies at the core of a theory. Without logic, propositions will be ad hoc, arbitrary, and meaningless, and cannot be tied into a cohesive “system of propositions” that is the heart of any theory.

Finally, all theories are constrained by assumptions about values, time, and space, and boundary conditions that govern where the theory can be applied and where it cannot be applied. For example, many economic theories assume that human beings are rational (or boundedly rational) and employ utility maximization based on cost and benefit expectations as a way of understanding human behavior. In contrast, political science theories assume that people are more political than rational, and try to position themselves in their professional or personal environment in a way that maximizes their power and control over others. Given the nature of their underlying assumptions, economic and political theories are not directly comparable, and researchers should not use economic theories if their objective is to understand the power structure or its evolution in an organization. Likewise, theories may have implicit cultural assumptions (e.g., whether they apply to individualistic or collective cultures), temporal assumptions (e.g., whether they apply to early stages or later stages of human behavior), and spatial assumptions (e.g., whether they apply to certain localities but not to others). If a theory is to be properly used or tested, all of its implicit assumptions that form the boundaries of that theory must be properly understood. Unfortunately, theorists rarely state their implicit assumptions clearly, which leads to frequent misapplications of theories to problem situations in research.

Attributes of a Good Theory

Theories are simplified and often partial explanations of complex social reality. As such, there can be good explanations or poor explanations, and consequently, there can be good theories or poor theories. How can we evaluate the “goodness” of a given theory? Different criteria have been proposed by different researchers, the more important of which are listed below:

  • Logical consistency: Are the theoretical constructs, propositions, boundary conditions, and assumptions logically consistent with each other? If some of these “building blocks” of a theory are inconsistent with each other (e.g., a theory assumes rationality, but some constructs represent non-rational concepts), then the theory is a poor theory.
  • Explanatory power: How much does a given theory explain (or predict) reality? Good theories explain the target phenomenon better than rival theories, as often measured by the variance explained (R-square) value in regression equations.
  • Falsifiability: British philosopher Karl Popper stated in the 1940s that for theories to be valid, they must be falsifiable. Falsifiability ensures that the theory is potentially disprovable if empirical data do not match its theoretical propositions, which allows for empirical testing by researchers. In other words, theories cannot be theories unless they are empirically testable. Tautological statements, such as “a day with high temperatures is a hot day,” are not empirically testable because a hot day is defined (and measured) as a day with high temperatures, and hence, such statements cannot be viewed as theoretical propositions. Falsifiability requires the presence of rival explanations, ensures that the constructs are adequately measurable, and so forth. However, note that saying that a theory is falsifiable is not the same as saying that a theory should be falsified. If a theory is indeed falsified based on empirical evidence, then it was probably a poor theory to begin with!
  • Parsimony: Parsimony examines how much of a phenomenon is explained with how few variables. The concept is attributed to the 14th-century English logician Father William of Ockham (and hence called “Ockham’s razor” or “Occam’s razor”), which states that among competing explanations that sufficiently explain the observed evidence, the simplest theory (i.e., one that uses the smallest number of variables or makes the fewest assumptions) is the best. Explanation of a complex social phenomenon can always be increased by adding more and more constructs. However, such an approach defeats the purpose of having theories, which are intended to be “simplified” and generalizable explanations of reality. Parsimony relates to the degrees of freedom in a given theory. Parsimonious theories have higher degrees of freedom, which allow them to be more easily generalized to other contexts, settings, and populations.
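The trade-off between explanatory power and parsimony can be made concrete with a small numeric sketch. The scores and predictions below are hypothetical, and adjusted R-square is used here only as one common way of penalizing extra predictors; it is an assumption of this example, not a measure prescribed by the criteria above.

```python
def r_squared(observed, predicted):
    """Share of variance in the observed values accounted for by the predictions."""
    mean_observed = sum(observed) / len(observed)
    ss_total = sum((y - mean_observed) ** 2 for y in observed)
    ss_residual = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1 - ss_residual / ss_total


def adjusted_r_squared(r2, n_observations, n_predictors):
    """R-square penalized for the number of predictors (fewer residual degrees of freedom)."""
    residual_df = n_observations - n_predictors - 1
    return 1 - (1 - r2) * (n_observations - 1) / residual_df


# Hypothetical outcome scores for ten cases.
observed = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Hypothetical predictions from two rival models of the same outcome.
simple_predictions = [1.5, 2.5, 2.5, 4.5, 4.5, 6.5, 6.5, 8.5, 8.5, 10.5]  # assumed: 2 predictors
complex_predictions = [1.5, 2.5, 2.5, 4.5, 4.5, 6.5, 6.5, 8.5, 9, 10]     # assumed: 6 predictors

r2_simple = r_squared(observed, simple_predictions)
r2_complex = r_squared(observed, complex_predictions)
adj_simple = adjusted_r_squared(r2_simple, len(observed), 2)
adj_complex = adjusted_r_squared(r2_complex, len(observed), 6)

print(f"simple:  R2={r2_simple:.3f}, adjusted R2={adj_simple:.3f}")
print(f"complex: R2={r2_complex:.3f}, adjusted R2={adj_complex:.3f}")
```

In this made-up case the six-predictor model explains slightly more variance, but once the extra predictors are penalized the two-predictor model comes out ahead, which is the intuition behind preferring the more parsimonious explanation.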

Approaches to Theorizing

How do researchers build theories? Steinfield and Fulk (1990) [2] recommend four such approaches. The first approach is to build theories inductively based on observed patterns of events or behaviors. Such an approach is often called “grounded theory building”, because the theory is grounded in empirical observations. This technique is heavily dependent on the observational and interpretive abilities of the researcher, and the resulting theory may be subjective and non-confirmable. Furthermore, observing certain patterns of events will not necessarily make a theory, unless the researcher is able to provide consistent explanations for the observed patterns. We will discuss the grounded theory approach in a later chapter on qualitative research.

The second approach to theory building is to conduct a bottom-up conceptual analysis to identify different sets of predictors relevant to the phenomenon of interest using a predefined framework. One such framework may be a simple input-process-output framework, where the researcher may look for different categories of inputs, such as individual, organizational, and/or technological factors potentially related to the phenomenon of interest (the output), and describe the underlying processes that link these factors to the target phenomenon. This is also an inductive approach that relies heavily on the inductive abilities of the researcher, and interpretation may be biased by the researcher’s prior knowledge of the phenomenon being studied.

The third approach to theorizing is to extend or modify existing theories to explain a new context, such as by extending theories of individual learning to explain organizational learning. While making such an extension, certain concepts, propositions, and/or boundary conditions of the old theory may be retained and others modified to fit the new context. This deductive approach leverages the rich inventory of social science theories developed by prior theoreticians, and is an efficient way of building new theories by building on existing ones.

The fourth approach is to apply existing theories in entirely new contexts by drawing upon the structural similarities between the two contexts. This approach relies on reasoning by analogy, and is probably the most creative way of theorizing using a deductive approach. For instance, Markus (1987) [3] used analogic similarities between a nuclear explosion and uncontrolled growth of networks or network-based businesses to propose a critical mass theory of network growth. Just as a nuclear explosion requires a critical mass of radioactive material to sustain itself, Markus suggested that a network requires a critical mass of users to sustain its growth, and without such critical mass, users may leave the network, causing an eventual demise of the network.

Examples of Social Science Theories

In this section, we present brief overviews of a few illustrative theories from different social science disciplines. These theories explain different types of social behaviors, using a set of constructs, propositions, boundary conditions, assumptions, and underlying logic. Note that the following represents just a simplistic introduction to these theories; readers are advised to consult the original sources of these theories for more details and insights on each theory.

Agency Theory. Agency theory (also called principal-agent theory), a classic theory in the organizational economics literature, was originally proposed by Ross (1973) [4] to explain two-party relationships (such as those between an employer and its employees, between organizational executives and shareholders, and between buyers and sellers) whose goals are not congruent with each other. The goal of agency theory is to specify optimal contracts and the conditions under which such contracts may help minimize the effect of goal incongruence. The core assumptions of this theory are that human beings are self-interested individuals, boundedly rational, and risk-averse, and the theory can be applied at the individual or organizational level.

The two parties in this theory are the principal and the agent; the principal employs the agent to perform certain tasks on its behalf. While the principal’s goal is quick and effective completion of the assigned task, the agent’s goal may be working at its own pace, avoiding risks, and seeking self-interest (such as personal pay) over corporate interests. Hence, the goal incongruence. Compounding the nature of the problem may be information asymmetry problems caused by the principal’s inability to adequately observe the agent’s behavior or accurately evaluate the agent’s skill sets. Such asymmetry may lead to agency problems where the agent may not put forth the effort needed to get the task done (the moral hazard problem) or may misrepresent its expertise or skills to get the job but not perform as expected (the adverse selection problem). Typical contracts that are behavior-based, such as a monthly salary, cannot overcome these problems. Hence, agency theory recommends using outcome-based contracts, such as commissions or a fee payable upon task completion, or mixed contracts that combine behavior-based and outcome-based incentives. An employee stock option plan is an example of an outcome-based contract, while employee pay is a behavior-based contract. Agency theory also recommends tools that principals may employ to improve the efficacy of behavior-based contracts, such as investing in monitoring mechanisms (such as hiring supervisors) to counter the information asymmetry caused by moral hazard, designing renewable contracts contingent on the agent’s performance (performance assessment makes the contract partially outcome-based), or improving the structure of the assigned task to make it more programmable and therefore more observable.

Theory of Planned Behavior. Postulated by Ajzen (1991) [5], the theory of planned behavior (TPB) is a generalized theory of human behavior in the social psychology literature that can be used to study a wide range of individual behaviors. It presumes that individual behavior represents conscious reasoned choice, and is shaped by cognitive thinking and social pressures. The theory postulates that behaviors are based on one’s intention regarding that behavior, which in turn is a function of the person’s attitude toward the behavior, subjective norm regarding that behavior, and perception of control over that behavior (see Figure 4.2). Attitude is defined as the individual’s overall positive or negative feelings about performing the behavior in question, which may be assessed as a summation of one’s beliefs regarding the different consequences of that behavior, weighted by the desirability of those consequences.

Subjective norm refers to one’s perception of whether people important to that person expect the person to perform the intended behavior, and is represented as a weighted combination of the expected norms of different referent groups such as friends, colleagues, or supervisors at work. Behavioral control is one’s perception of internal or external controls constraining the behavior in question. Internal controls may include the person’s ability to perform the intended behavior (self-efficacy), while external control refers to the availability of external resources needed to perform that behavior (facilitating conditions). TPB also suggests that sometimes people may intend to perform a given behavior but lack the resources needed to do so, and therefore posits that behavioral control can have a direct effect on behavior, in addition to the indirect effect mediated by intention.

TPB is an extension of an earlier theory called the theory of reasoned action, which included attitude and subjective norm as key drivers of intention, but not behavioral control. The latter construct was added by Ajzen in TPB to account for circumstances when people may have incomplete control over their own behaviors (such as not having high-speed Internet access for web surfing).
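The weighted-sum descriptions of attitude and subjective norm can be sketched in a few lines of code. The function names, rating scales, and numbers below are hypothetical choices made for illustration; they are not Ajzen’s measurement instrument.

```python
def attitude(beliefs):
    """Sum of beliefs about each consequence, weighted by that consequence's desirability.

    `beliefs` is a list of (belief_strength, desirability) pairs on assumed rating scales.
    """
    return sum(strength * desirability for strength, desirability in beliefs)


def subjective_norm(referents):
    """Weighted combination of what each referent group is perceived to expect.

    `referents` is a list of (perceived_expectation, weight) pairs, where the weight
    stands for how much that group's opinion matters to the person (an assumption).
    """
    return sum(expectation * weight for expectation, weight in referents)


# Hypothetical example: a person deciding whether to start exercising daily.
exercise_beliefs = [
    (3, 2),   # "I will get healthier": strong belief, desirable outcome
    (2, -1),  # "I will have less free time": moderate belief, undesirable outcome
    (1, 3),   # "I will meet new people": weak belief, very desirable outcome
]
exercise_referents = [
    (2, 3),   # friends expect it, and their opinion matters a lot
    (-1, 2),  # colleagues mildly discourage it
    (1, 1),   # supervisor mildly encourages it
]

print("attitude score:", attitude(exercise_beliefs))
print("subjective norm score:", subjective_norm(exercise_referents))
```

How these scores, together with perceived behavioral control, combine into intention would have to be estimated from data, so the sketch stops at the two component scores.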

Flowchart of the theory of planned behavior showing a consequence leading to attitude, a norm leading to subjective norms, control leading to behavioral control, and all of these things leading to the intention and then the behavior.

Figure 4.2. Theory of planned behavior

Innovation Diffusion Theory. Innovation diffusion theory (IDT) is a seminal theory in the communications literature that explains how innovations are adopted within a population of potential adopters. The concept was first studied by French sociologist Gabriel Tarde, but the theory was developed by Everett Rogers in 1962 based on observations of 508 diffusion studies. The four key elements in this theory are: innovation, communication channels, time, and social system. Innovations may include new technologies, new practices, or new ideas, and adopters may be individuals or organizations. At the macro (population) level, IDT views innovation diffusion as a process of communication where people in a social system learn about a new innovation and its potential benefits through communication channels (such as mass media or prior adopters) and are persuaded to adopt it. Diffusion is a temporal process; the diffusion process starts off slow among a few early adopters, then picks up speed as the innovation is adopted by the mainstream population, and finally slows down as the adopter population reaches saturation. The cumulative adoption pattern is therefore an S-shaped curve, as shown in Figure 4.3, and the adopter distribution represents a normal distribution. Not all adopters are identical, and adopters can be classified into innovators, early adopters, early majority, late majority, and laggards based on the time of their adoption. The rate of diffusion also depends on characteristics of the social system such as the presence of opinion leaders (experts whose opinions are valued by others) and change agents (people who influence others’ behaviors).
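The link between a normal adopter distribution and an S-shaped cumulative curve can be checked numerically. The sketch below assumes that the five adopter categories are separated at two and one standard deviations before the mean adoption time, at the mean, and at one standard deviation after it; these cut points are an assumption of the example and are not stated in the text above.

```python
from statistics import NormalDist

# Adoption times standardized so that the mean adoption time is 0 and the
# standard deviation is 1 (an assumed normalization for illustration).
adoption_time = NormalDist(mu=0.0, sigma=1.0)

# Assumed upper cut point (in standard deviations) for each category but the last.
CATEGORY_CUTOFFS = [
    ("innovators", -2.0),
    ("early adopters", -1.0),
    ("early majority", 0.0),
    ("late majority", 1.0),
]


def adopter_shares():
    """Share of the population falling into each adopter category."""
    shares = {}
    cumulative_so_far = 0.0
    for name, cutoff in CATEGORY_CUTOFFS:
        cumulative = adoption_time.cdf(cutoff)
        shares[name] = cumulative - cumulative_so_far
        cumulative_so_far = cumulative
    shares["laggards"] = 1.0 - cumulative_so_far
    return shares


def cumulative_adoption(time_point):
    """Fraction of the population that has adopted by a given (standardized) time."""
    return adoption_time.cdf(time_point)


shares = adopter_shares()
for name, share in shares.items():
    print(f"{name:15s} {share:6.1%}")

# The cumulative curve rises slowly, then quickly, then slowly again (S-shape).
s_curve = [cumulative_adoption(t) for t in (-3, -2, -1, 0, 1, 2, 3)]
print([round(value, 3) for value in s_curve])
```

Under these assumptions the shares come out at roughly 2.3%, 13.6%, 34.1%, 34.1%, and 15.9%, which is close to the rounded percentages usually quoted for the five categories.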

At the micro (adopter) level, Rogers (1995) [6] suggests that innovation adoption is a process consisting of five stages: (1) knowledge: when adopters first learn about an innovation from mass-media or interpersonal channels, (2) persuasion: when they are persuaded by prior adopters to try the innovation, (3) decision: their decision to accept or reject the innovation, (4) implementation: their initial utilization of the innovation, and (5) confirmation: their decision to continue using it to its fullest potential (see Figure 4.4). Five innovation characteristics are presumed to shape adopters’ innovation adoption decisions: (1) relative advantage: the expected benefits of an innovation relative to prior innovations, (2) compatibility: the extent to which the innovation fits with the adopter’s work habits, beliefs, and values, (3) complexity: the extent to which the innovation is difficult to learn and use, (4) trialability: the extent to which the innovation can be tested on a trial basis, and (5) observability: the extent to which the results of using the innovation can be clearly observed. The last two characteristics have since been dropped from many innovation studies. Complexity is negatively correlated to innovation adoption, while the other four factors are positively correlated. Innovation adoption also depends on personal factors such as the adopter’s risk-taking propensity, education level, cosmopolitanism, and communication influence. Early adopters are venturesome, well educated, and rely more on mass media for information about the innovation, while later adopters rely more on interpersonal sources (such as friends and family) as their primary source of information.

IDT has been criticized for having a “pro-innovation bias,” that is, for presuming that all innovations are beneficial and will be eventually diffused across the entire population, and because it does not allow for inefficient innovations such as fads or fashions to die off quickly without being adopted by the entire population or being replaced by better innovations.

S-shaped diffusion curve showing the comparison with the traditional bell-shaped curve with 2.5% as innovators, 13.5% as early adopters, 34% as early majority, 34% as the late majority, and 16% as laggards.

Figure 4.3. S-shaped diffusion curve

 

Innovation adoption process showing knowledge then persuasion then decision then implementation and then confirmation.

Figure 4.4. Innovation adoption process

Elaboration Likelihood Model. Developed by Petty and Cacioppo (1986) [7], the elaboration likelihood model (ELM) is a dual-process theory of attitude formation or change in the psychology literature. It explains how individuals can be influenced to change their attitude toward a certain object, event, or behavior and the relative efficacy of such change strategies. The ELM posits that one’s attitude may be shaped by two “routes” of influence, the central route and the peripheral route, which differ in the amount of thoughtful information processing or “elaboration” required of people (see Figure 4.5). The central route requires a person to think about issue-related arguments in an informational message and carefully scrutinize the merits and relevance of those arguments, before forming an informed judgment about the target object. In the peripheral route, subjects rely on external “cues” such as number of prior users, endorsements from experts, or likeability of the endorser, rather than on the quality of arguments, in framing their attitude towards the target object. The latter route is less cognitively demanding, and the routes of attitude change are typically operationalized in the ELM using the argument quality and peripheral cues constructs respectively.

Argument quality (central route), motivation and ability (elaboration likelihood) and source credibility (peripheral route) all lead to attitude change

Figure 4.5. Elaboration likelihood model

Whether people will be influenced by the central or peripheral routes depends upon their ability and motivation to elaborate the central merits of an argument. This ability and motivation to elaborate is called elaboration likelihood. People in a state of high elaboration likelihood (high ability and high motivation) are more likely to thoughtfully process the information presented and are therefore more influenced by argument quality, while those in the low elaboration likelihood state are more influenced by peripheral cues. Elaboration likelihood is a situational characteristic and not a personal trait. For instance, a doctor may employ the central route for diagnosing and treating a medical ailment (by virtue of his or her expertise in the subject), but may rely on peripheral cues from auto mechanics to understand the problems with his or her car. As such, the theory has widespread implications about how to enact attitude change toward new products or ideas and even social change.

General Deterrence Theory. Two utilitarian philosophers of the eighteenth century, Cesare Beccaria and Jeremy Bentham, formulated General Deterrence Theory (GDT) as both an explanation of crime and a method for reducing it. GDT examines why certain individuals engage in deviant, anti-social, or criminal behaviors. This theory holds that people are fundamentally rational (for both conforming and deviant behaviors), and that they freely choose deviant behaviors based on a rational cost-benefit calculation. Because people naturally choose utility-maximizing behaviors, deviant choices that engender personal gain or pleasure can be controlled by increasing the costs of such behaviors in the form of punishments (countermeasures) as well as increasing the probability of apprehension. Swiftness, severity, and certainty of punishments are the key constructs in GDT.
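The cost-benefit reasoning that GDT attributes to a would-be offender can be written as a toy calculation. The formula below (gain minus probability of punishment times its severity) is a simplifying assumption made for illustration, not a formula given by Beccaria or Bentham, and it leaves out swiftness, which would need some form of time discounting. All names and numbers are hypothetical.

```python
def expected_net_gain(benefit, severity, certainty):
    """Expected payoff of a deviant act under a simple expected-cost assumption.

    benefit   -- perceived gain from the act (arbitrary units)
    severity  -- perceived cost of the punishment (same units)
    certainty -- perceived probability of being apprehended and punished (0 to 1)
    """
    if not 0.0 <= certainty <= 1.0:
        raise ValueError("certainty must be a probability between 0 and 1")
    return benefit - certainty * severity


def is_deterred(benefit, severity, certainty):
    """A rational actor in this sketch refrains when the expected net gain is not positive."""
    return expected_net_gain(benefit, severity, certainty) <= 0


# Hypothetical case: the same act and punishment under low and high certainty.
low_certainty = expected_net_gain(benefit=100, severity=300, certainty=0.25)
high_certainty = expected_net_gain(benefit=100, severity=300, certainty=0.5)

print("net gain, low certainty: ", low_certainty, "deterred:", is_deterred(100, 300, 0.25))
print("net gain, high certainty:", high_certainty, "deterred:", is_deterred(100, 300, 0.5))
```

With the punishment held fixed, raising the perceived chance of apprehension is enough to turn a positive expected gain into a negative one, which is why the theory treats certainty as a lever alongside severity.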

While classical positivist research in criminology seeks generalized causes of criminal behaviors, such as poverty, lack of education, and psychological conditions, and recommends strategies to rehabilitate criminals, such as by providing them job training and medical treatment, GDT focuses on the criminal decision-making process and situational factors that influence that process. Hence, a criminal’s personal situation (such as his personal values, his affluence, and his need for money) and the environmental context (such as how protected the target is, how efficient the local police are, and how likely criminals are to be apprehended) play key roles in this decision-making process. The focus of GDT is not how to rehabilitate criminals and avert future criminal behaviors, but how to make criminal activities less attractive and therefore prevent crimes. To that end, the following are effective in preventing crimes: “target hardening,” such as installing deadbolts and building self-defense skills; legal deterrents, such as eliminating parole for certain crimes, “three strikes” laws (mandatory incarceration for three offenses, even if the offenses are minor and not worth imprisonment), and the death penalty; increasing the chances of apprehension using means such as neighborhood watch programs, special task forces on drugs or gang-related crimes, and increased police patrols; and educational programs, such as highly visible notices like “Trespassers will be prosecuted.” This theory has interesting implications not only for traditional crimes, but also for contemporary white-collar crimes such as insider trading, software piracy, and illegal sharing of music.

[1] Bacharach, S. B. (1989). “Organizational Theories: Some Criteria for Evaluation,” Academy of Management Review (14:4), 496-515.

[2] Steinfield, C. W. and Fulk, J. (1990). “The Theory Imperative,” in Organizations and Communications Technology, J. Fulk and C. W. Steinfield (eds.), Newbury Park, CA: Sage Publications.

[3] Markus, M. L. (1987). “Toward a ‘Critical Mass’ Theory of Interactive Media: Universal Access, Interdependence, and Diffusion,” Communication Research (14:5), 491-511.

[4] Ross, S. A. (1973). “The Economic Theory of Agency: The Principal’s Problem,” American Economic Review (63:2), 134-139.

[5] Ajzen, I. (1991). “The Theory of Planned Behavior,” Organizational Behavior and Human Decision Processes (50), 179-211.

[6] Rogers, E. (1962). Diffusion of Innovations. New York: The Free Press. Other editions 1983, 1996, 2005.

[7] Petty, R. E., and Cacioppo, J. T. (1986). Communication and Persuasion: Central and Peripheral Routes to Attitude Change. New York: Springer-Verlag.