Observational Studies

What are Observational Studies?

An observational study is one in which no variables can be manipulated or controlled by the investigator.

Learning Objectives

Identify situations in which observational studies are necessary and the challenges that arise in their interpretation.

Key Takeaways

Key Points

  • An observational study is in contrast with experiments, such as randomized controlled trials, where each subject is randomly assigned to a treated group or a control group.
  • Variables may be uncontrollable because 1) a randomized experiment would violate ethical standards, 2) the investigator may simply lack the requisite influence, or 3) a randomized experiment may be impractical.
  • Observational studies can never identify causal relationships because even though two variables are related both might be caused by a third, unseen, variable.
  • A major challenge in conducting observational studies is to draw inferences that are acceptably free from influences by overt biases, as well as to assess the influence of potential hidden biases.
  • A major challenge in conducting observational studies is to draw inferences that are acceptably free from influences by overt biases, as well as to assess the influence of potential hidden biases.

Key Terms

  • causality: the relationship between an event (the cause) and a second event (the effect), where the second event is understood as a consequence of the first
  • observational study: a study drawing inferences about the possible effect of a treatment on subjects, where the assignment of subjects into a treated group versus a control group is outside the control of the investigator

A common goal in statistical research is to investigate causality, which is the relationship between an event (the cause) and a second event (the effect), where the second event is understood as a consequence of the first. There are two major types of causal statistical studies: experimental studies and observational studies. An observational study draws inferences about the possible effect of a treatment on subjects, where the assignment of subjects into a treated group versus a control group is outside the control of the investigator. This is in contrast with experiments, such as randomized controlled trials, where each subject is randomly assigned to a treated group or a control group. In other words, observational studies have no independent variables — nothing is manipulated by the experimenter. Rather, observations have the equivalent of two dependent variables.

In an observational study, the assignment of treatments may be beyond the control of the investigator for a variety of reasons:

  1. A randomized experiment would violate ethical standards: Suppose one wanted to investigate the abortion – breast cancer hypothesis, which postulates a causal link between induced abortion and the incidence of breast cancer. In a hypothetical controlled experiment, one would start with a large subject pool of pregnant women and divide them randomly into a treatment group (receiving induced abortions) and a control group (bearing children), and then conduct regular cancer screenings for women from both groups. Needless to say, such an experiment would run counter to common ethical principles. The published studies investigating the abortion–breast cancer hypothesis generally start with a group of women who already have received abortions. Membership in this “treated” group is not controlled by the investigator: the group is formed after the “treatment” has been assigned.
  2. The investigator may simply lack the requisite influence: Suppose a scientist wants to study the public health effects of a community-wide ban on smoking in public indoor areas. In a controlled experiment, the investigator would randomly pick a set of communities to be in the treatment group. However, it is typically up to each community and/or its legislature to enact a smoking ban. The investigator can be expected to lack the political power to cause precisely those communities in the randomly selected treatment group to pass a smoking ban. In an observational study, the investigator would typically start with a treatment group consisting of those communities where a smoking ban is already in effect.
  3. A randomized experiment may be impractical: Suppose a researcher wants to study the suspected link between a certain medication and a very rare group of symptoms arising as a side effect. Setting aside any ethical considerations, a randomized experiment would be impractical because of the rarity of the effect. There may not be a subject pool large enough for the symptoms to be observed in at least one treated subject. An observational study would typically start with a group of symptomatic subjects and work backwards to find those who were given the medication and later developed the symptoms

Usefulness and Reliability of Observational Studies

Observational studies can never identify causal relationships because even though two variables are related both might be caused by a third, unseen, variable. Since the underlying laws of nature are assumed to be causal laws, observational findings are generally regarded as less compelling than experimental findings.

Observational studies can, however:

  1. Provide information on “real world” use and practice
  2. Detect signals about the benefits and risks of the use of practices in the general population
  3. Help formulate hypotheses to be tested in subsequent experiments
  4. Provide part of the community-level data needed to design more informative pragmatic clinical trials
  5. Inform clinical practice

A major challenge in conducting observational studies is to draw inferences that are acceptably free from influences by overt biases, as well as to assess the influence of potential hidden biases.

image

Observational Studies: Nature Observation and Study Hall in The Natural and Cultural Gardens, The Expo Memorial Park, Suita city, Osaka, Japan. Observational studies are a type of experiments in which the variables are outside the control of the investigator.

The Clofibrate Trial

The Clofibrate Trial was a placebo-controlled study to determine the safety and effectiveness of drugs treating coronary heart disease in men.

Learning Objectives

Outline how the use of placebos in controlled experiments leads to more reliable results.

Key Takeaways

Key Points

  • Clofibrate was one of four lipid-modifying drugs tested in an observational study known as the Coronary Drug Project.
  • Placebo -controlled studies are a way of testing a medical therapy in which, in addition to a group of subjects that receives the treatment to be evaluated, a separate control group receives a sham “placebo” treatment which is specifically designed to have no real effect.
  • The purpose of the placebo group is to account for the placebo effect — that is, effects from treatment that do not depend on the treatment itself.
  • Appropriate use of a placebo in a clinical trial often requires, or at least benefits from, a double-blind study design, which means that neither the experimenters nor the subjects know which subjects are in the “test group” and which are in the “control group. “.
  • The use of placebos is a standard control component of most clinical trials which attempt to make some sort of quantitative assessment of the efficacy of medicinal drugs or treatments.

Key Terms

  • placebo: an inactive substance or preparation used as a control in an experiment or test to determine the effectiveness of a medicinal drug
  • placebo effect: the tendency of any medication or treatment, even an inert or ineffective one, to exhibit results simply because the recipient believes that it will work
  • regression to the mean: the phenomenon by which extreme examples from any set of data are likely to be followed by examples which are less extreme; a tendency towards the average of any sample

Clofibrate (tradename Atromid-S) is an organic compound that is marketed as a fibrate. It is a lipid-lowering agent used for controlling the high cholesterol and triacylglyceride level in the blood. Clofibrate was one of four lipid-modifying drugs tested in an observational study known as the Coronary Drug Project. Also known as the World Health Organization Cooperative Trial on Primary Prevention of Ischaemic Heart Disease, the study was a randomized, multi-center, double-blind, placebo-controlled trial that was intended to study the safety and effectiveness of drugs for long-term treatment of coronary heart disease in men.

Placebo-Controlled Observational Studies

Placebo-controlled studies are a way of testing a medical therapy in which, in addition to a group of subjects that receives the treatment to be evaluated, a separate control group receives a sham “placebo” treatment which is specifically designed to have no real effect. Placebos are most commonly used in blinded trials, where subjects do not know whether they are receiving real or placebo treatment.

The purpose of the placebo group is to account for the placebo effect — that is, effects from treatment that do not depend on the treatment itself. Such factors include knowing one is receiving a treatment, attention from health care professionals, and the expectations of a treatment’s effectiveness by those running the research study. Without a placebo group to compare against, it is not possible to know whether the treatment itself had any effect.

Appropriate use of a placebo in a clinical trial often requires, or at least benefits from, a double-blind study design, which means that neither the experimenters nor the subjects know which subjects are in the “test group” and which are in the “control group. ” This creates a problem in creating placebos that can be mistaken for active treatments. Therefore, it can be necessary to use a psychoactive placebo, a drug that produces physiological effects that encourage the belief in the control groups that they have received an active drug.

Patients frequently show improvement even when given a sham or “fake” treatment. Such intentionally inert placebo treatments can take many forms, such as a pill containing only sugar, a surgery where nothing is actually done, or a medical device (such as ultrasound) that is not actually turned on. Also, due to the body’s natural healing ability and statistical effects such as regression to the mean, many patients will get better even when given no treatment at all. Thus, the relevant question when assessing a treatment is not “does the treatment work? ” but “does the treatment work better than a placebo treatment, or no treatment at all? ”

Therefore, the use of placebos is a standard control component of most clinical trials which attempt to make some sort of quantitative assessment of the efficacy of medicinal drugs or treatments.

Results of The Coronary Drug Project

Those in the placebo group who adhered to the placebo treatment (took the placebo regularly as instructed) showed nearly half the mortality rate as those who were not adherent. A similar study of women found survival was nearly 2.5 times greater for those who adhered to their placebo. This apparent placebo effect may have occurred because:

  1. Adhering to the protocol had a psychological effect, i.e. genuine placebo effect.
  2. People who were already healthier were more able or more inclined to follow the protocol.
  3. Compliant people were more diligent and health-conscious in all aspects of their lives.

The Coronary Drug Project found that subjects using clofibrate to lower serum cholesterol observed excess mortality in the clofibrate-treated group despite successful cholesterol lowering (47% more deaths during treatment with clofibrate and 5% after treatment with clofibrate) than the non-treated high cholesterol group. These deaths were due to a wide variety of causes other than heart disease, and remain “unexplained”.

Clofibrate was discontinued in 2002 due to adverse affects.

image

Placebo-Controlled Observational Studies: Prescription placebos used in research and practice.

Confounding

A confounding variable is an extraneous variable in a statistical model that correlates with both the dependent variable and the independent variable.

Learning Objectives

Break down why confounding variables may lead to bias and spurious relationships and what can be done to avoid these phenomenons.

Key Takeaways

Key Points

  • A perceived relationship between an independent variable and a dependent variable that has been misestimated due to the failure to account for a confounding factor is termed a spurious relationship.
  • Confounding by indication – the most important limitation of observational studies – occurs when prognostic factors cause bias, such as biased estimates of treatment effects in medical trials.
  • Confounding variables may also be categorised according to their source: such as operational confounds, procedural confounds or person confounds.
  • A reduction in the potential for the occurrence and effect of confounding factors can be obtained by increasing the types and numbers of comparisons performed in an analysis.
  • Moreover, depending on the type of study design in place, there are various ways to modify that design to actively exclude or control confounding variables.

Key Terms

  • peer review: the scholarly process whereby manuscripts intended to be published in an academic journal are reviewed by independent researchers (referees) to evaluate the contribution, i.e. the importance, novelty and accuracy of the manuscript’s contents
  • placebo effect: the tendency of any medication or treatment, even an inert or ineffective one, to exhibit results simply because the recipient believes that it will work
  • prognostic: a sign by which a future event may be known or foretold
  • confounding variable: an extraneous variable in a statistical model that correlates (positively or negatively) with both the dependent variable and the independent variable

Confounding Variables

A confounding variable is an extraneous variable in a statistical model that correlates (positively or negatively) with both the dependent variable and the independent variable. A perceived relationship between an independent variable and a dependent variable that has been misestimated due to the failure to account for a confounding factor is termed a spurious relationship, and the presence of misestimation for this reason is termed omitted-variable bias.

As an example, suppose that there is a statistical relationship between ice cream consumption and number of drowning deaths for a given period. These two variables have a positive correlation with each other. An individual might attempt to explain this correlation by inferring a causal relationship between the two variables (either that ice cream causes drowning, or that drowning causes ice cream consumption). However, a more likely explanation is that the relationship between ice cream consumption and drowning is spurious and that a third, confounding, variable (the season) influences both variables: during the summer, warmer temperatures lead to increased ice cream consumption as well as more people swimming and, thus, more drowning deaths.

Types of Confounding

Confounding by indication has been described as the most important limitation of observational studies. Confounding by indication occurs when prognostic factors cause bias, such as biased estimates of treatment effects in medical trials. Controlling for known prognostic factors may reduce this problem, but it is always possible that a forgotten or unknown factor was not included or that factors interact complexly. Randomized trials tend to reduce the effects of confounding by indication due to random assignment.

Confounding variables may also be categorised according to their source:

  • The choice of measurement instrument (operational confound) – This type of confound occurs when a measure designed to assess a particular construct inadvertently measures something else as well.
  • Situational characteristics (procedural confound) – This type of confound occurs when the researcher mistakenly allows another variable to change along with the manipulated independent variable.
  • Inter-individual differences (person confound) – This type of confound occurs when two or more groups of units are analyzed together (e.g., workers from different occupations) despite varying according to one or more other (observed or unobserved) characteristics (e.g., gender).

Decreasing the Potential for Confounding

A reduction in the potential for the occurrence and effect of confounding factors can be obtained by increasing the types and numbers of comparisons performed in an analysis. If a relationship holds among different subgroups of analyzed units, confounding may be less likely. That said, if measures or manipulations of core constructs are confounded (i.e., operational or procedural confounds exist), subgroup analysis may not reveal problems in the analysis.

Peer review is a process that can assist in reducing instances of confounding, either before study implementation or after analysis has occurred. Similarly, study replication can test for the robustness of findings from one study under alternative testing conditions or alternative analyses (e.g., controlling for potential confounds not identified in the initial study). Also, confounding effects may be less likely to occur and act similarly at multiple times and locations.

Moreover, depending on the type of study design in place, there are various ways to modify that design to actively exclude or control confounding variables:

  1. Case-control studies assign confounders to both groups, cases and controls, equally. In case-control studies, matched variables most often are age and sex.
  2. In cohort studies, a degree of matching is also possible, and it is often done by only admitting certain age groups or a certain sex into the study population. this creates a cohort of people who share similar characteristics; thus, all cohorts are comparable in regard to the possible confounding variable.
  3. Double blinding conceals the experiment group membership of the participants from the trial population and the observers. By preventing the participants from knowing if they are receiving treatment or not, the placebo effect should be the same for the control and treatment groups. By preventing the observers from knowing of their membership, there should be no bias from researchers treating the groups differently or from interpreting the outcomes differently.
  4. A randomized controlled trial is a method where the study population is divided randomly in order to mitigate the chances of self-selection by participants or bias by the study designers. Before the experiment begins, the testers will assign the members of the participant pool to their groups (control, intervention, parallel) using a randomization process such as the use of a random number generator.

Sex Bias in Graduate Admissions

The Berkeley study is one of the best known real life examples of an experiment suffering from a confounding variable.

Learning Objectives

Illustrate how the phenomenon of confounding can be seen in practice via Simpson’s Paradox.

Key Takeaways

Key Points

  • A study conducted in the aftermath of a law suit filed against the University of California, Berkeley showed that men applying were more likely than women to be admitted.
  • Examination of the aggregate data on admissions showed a blatant, if easily misunderstood, pattern of gender discrimination against applicants.
  • When examining the individual departments, it appeared that no department was significantly biased against women.
  • The study concluded that women tended to apply to competitive departments with low rates of admission even among qualified applicants, whereas men tended to apply to less-competitive departments with high rates of admission among the qualified applicants.
  • Simpson’s Paradox is a paradox in which a trend that appears in different groups of data disappears when these groups are combined, and the reverse trend appears for the aggregate data.

Key Terms

  • Simpson’s paradox: a paradox in which a trend that appears in different groups of data disappears when these groups are combined, and the reverse trend appears for the aggregate data
  • partition: a part of something that had been divided, each of its results
  • aggregate: a mass, assemblage, or sum of particulars; something consisting of elements but considered as a whole

Women have traditionally had limited access to higher education. Moreover, when women began to be admitted to higher education, they were encouraged to major in less-intellectual subjects. For example, the study of English literature in American and British colleges and universities was instituted as a field considered suitable to women’s “lesser intellects”.

However, since 1991 the proportion of women enrolled in college in the U.S. has exceeded the enrollment rate for men, and that gap has widened over time. As of 2007, women made up the majority — 54 percent — of the 10.8 million college students enrolled in the U.S.

This has not negated the fact that gender bias exists in higher education. Women tend to score lower on graduate admissions exams, such as the Graduate Record Exam (GRE) and the Graduate Management Admissions Test (GMAT). Representatives of the companies that publish these tests have hypothesized that greater number of female applicants taking these tests pull down women’s average scores. However, statistical research proves this theory wrong. Controlling for the number of people taking the test does not account for the scoring gap.

Sex Bias at the University of California, Berkeley

On February 7, 1975, a study was published in the journal Science by P.J. Bickel, E.A. Hammel, and J.W. O’Connell entitled “Sex Bias in Graduate Admissions: Data from Berkeley. ” This study was conducted in the aftermath of a law suit filed against the University, citing admission figures for the fall of 1973, which showed that men applying were more likely than women to be admitted, and the difference was so large that it was unlikely to be due to chance.

Examination of the aggregate data on admissions showed a blatant, if easily misunderstood, pattern of gender discrimination against applicants.

Aggregate Data:

Men: 8,442 applicants – 44% admitted

Women: 4,321 applicants – 35% admitted

When examining the individual departments, it appeared that no department was significantly biased against women. In fact, most departments had a small but statistically significant bias in favor of women. The data from the six largest departments are listed below.

Department A

Men: 825 applicants – 62% admitted

Women: 108 applicants – 82% admitted

Department B

Men: 560 applicants – 63% admitted

Women: 25 applicants – 68% admitted

Department C

Men: 325 applicants – 37% admitted

Women: 593 applicants – 34% admitted

Department D

Men: 417 applicants – 33% admitted

Women: 375 applicants – 35% admitted

Department E

Men: 191 applicants – 28% admitted

Women: 393 applicants – 24% admitted

Department F

Men: 272 applicants – 6% admitted

Women: 341 applicants – 7% admitted

The research paper by Bickel et al. concluded that women tended to apply to competitive departments with low rates of admission even among qualified applicants (such as in the English Department), whereas men tended to apply to less-competitive departments with high rates of admission among the qualified applicants (such as in engineering and chemistry). The study also concluded that the graduate departments that were easier to enter at the University, at the time, tended to be those that required more undergraduate preparation in mathematics. Therefore, the admission bias seemed to stem from courses previously taken.

Confounding Variables and Simpson’s Paradox

The above study is one of the best known real life examples of an experiment suffering from a confounding variable. In this particular case, we can see an occurrence of Simpson’s Paradox. Simpson’s Paradox is a paradox in which a trend that appears in different groups of data disappears when these groups are combined, and the reverse trend appears for the aggregate data. This result is often encountered in social-science and medical-science statistics, and is particularly confounding when frequency data are unduly given causal interpretations.

image

Simpson’s Paradox: An illustration of Simpson’s Paradox. For a full explanation of the figure, visit: http://en.wikipedia.org/wiki/Simpson’s_paradox#Description

The practical significance of Simpson’s paradox surfaces in decision making situations where it poses the following dilemma: Which data should we consult in choosing an action, the aggregated or the partitioned? The answer seems to be that one should sometimes follow the partitioned and sometimes the aggregated data, depending on the story behind the data; with each story dictating its own choice.

As to why and how a story, not data, should dictate choices, the answer is that it is the story which encodes the causal relationships among the variables. Once we extract these relationships we can test algorithmically whether a given partition, representing confounding variables, gives the correct answer.

image

Confounding Variables in Practice: One of the best real life examples of the presence of confounding variables occurred in a study regarding sex bias in graduate admissions here, at the University of California, Berkeley.