6.6.4 Making decisions in the face of uncertainty

Uncertainty need not lead to inaction.

The fourth and final area where I expect researchers to struggle is making decisions in the face of uncertainty. That is, after all the philosophizing and balancing, research ethics involves making decisions about what to do and what not to do. Unfortunately, these decisions often must be made based on incomplete information. For example, when designing Encore, researchers might wish to know the probability that it will cause someone to be visited by the police. Or, when designing Emotional Contagion, researchers might wish to know the probability that it could trigger depression in some participants. These probabilities are likely extremely low, but they are unknown before the research takes place. And, because neither project publicly tracked information about adverse events, these probabilities are not generally known even after the projects were completed.

Uncertainties are not unique to social research in the digital age. The Belmont Report, when describing the systematic assessment of risks and benefits, explicitly acknowledges these will be difficult to quantify exactly. These uncertainties, however, are more severe in the digital age, in part because we have less experience, and in part because of the characteristics of digital age social research.

Given these uncertainties, some people seem to advocate for something like “better safe than sorry,” which is a colloquial version of the Precautionary Principle. While this approach appears reasonable, perhaps even wise, it can actually cause harm; it is chilling to research; and it causes people to think in the wrong way (Sunstein 2005). In order to understand the problems with the Precautionary Principle, let’s consider Emotional Contagion. The experiment was planned to involve about 700,000 people, and there was certainly some chance that people in the experiment would suffer harm. But there was also some chance the experiment could yield knowledge that would be beneficial to Facebook users and to society. Thus, while allowing the experiment is a risk (as has been amply discussed), preventing the experiment is also a risk because the experiment could have produced valuable knowledge. Of course, the choice is not between doing the experiment as it occurred and not doing the experiment; there are many possible modifications to the design that might have brought it into a different ethical balance. However, at some point, researchers will have the choice between doing a study and not doing a study, and there are risks in both action and inaction. It is inappropriate to focus only on the risks of action. Quite simply, there is no risk-free approach.

Moving beyond the Precautionary Principle, one important way to think about making decisions given uncertainty is the minimal risk standard. The minimal risk standard attempts to benchmark the risk of a particular study against the risks that participants undertake in their daily lives, such as playing sports and driving cars (Wendler et al. 2005). This approach is valuable because assessing whether something is minimal risk is easier than assessing the actual level of risk. For example, in Emotional Contagion, before the research started, the researchers could have compared the emotional content on naturally occurring News Feeds to the emotional content that participants would see in the experiment (Meyer 2015). If the News Feeds under the treatment were similar to those that naturally occur on Facebook, then the researchers could conclude that the experiment is minimal risk. And they could make this decision even if they don’t know the absolute level of risk. The same approach could be applied to Encore. Initially, Encore triggered requests to websites that were known to be sensitive, such as websites of banned political groups in countries with repressive governments. As such, it was not minimal risk for participants in certain countries. However, the revised version of Encore, which only triggered requests to Twitter, Facebook, and YouTube, is minimal risk because requests to those sites are triggered during normal web browsing (Narayanan and Zevenbergen 2015).
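One way to make the News Feed comparison concrete is sketched below in Python. This is only an illustration under stated assumptions, not the procedure used by any of the researchers discussed here: the function name, the choice of "share of negative posts" as the measure of emotional content, and the 5th to 95th percentile band are all hypothetical choices made for the example. The idea is simply to ask whether the content a participant would see under treatment falls inside the range that already occurs naturally.

```python
from statistics import quantiles


def within_natural_range(natural_values, treated_value, lower_pct=5, upper_pct=95):
    """Return True if treated_value lies inside the central band of natural_values.

    natural_values: one number per naturally occurring feed (hypothetical
        measure, e.g. the share of posts classified as negative).
    treated_value: the same measure for a feed under the proposed treatment.
    lower_pct, upper_pct: percentile band treated as "naturally occurring"
        (an assumption of this sketch, not an established threshold).
    """
    # quantiles(..., n=100) returns 99 cut points; cuts[k - 1] is the k-th percentile.
    cuts = quantiles(natural_values, n=100)
    return cuts[lower_pct - 1] <= treated_value <= cuts[upper_pct - 1]


# Hypothetical data: shares of negative posts observed across 101 natural feeds.
natural = [i / 100 for i in range(101)]

print(within_natural_range(natural, 0.50))  # a treated feed near the middle
print(within_natural_range(natural, 0.99))  # a treated feed at the extreme
```

A check like this says nothing about the absolute level of risk; it only indicates whether the treatment stays within what participants already encounter, which is the question the minimal risk standard asks.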

A second important idea when making decisions about studies with unknown risk is power analysis, which allows researchers to calculate an appropriate size for their study (Cohen 1988). That is, if your study might expose participants to risk, even minimal risk, then the principle of Beneficence suggests that you want to impose the smallest amount of risk needed to achieve your research goals. (Think back to the Reduce principle that I discussed in Chapter 4.) Even though some researchers have an obsession with making their studies as big as possible, research ethics suggests that we should make our studies as small as possible. Thus, even if you don’t know the exact level of risk your study involves, a power analysis can help you ensure that it is as small as possible. Power analysis is not new, of course, but there is an important difference between the way that it was used in the analog age and how it should be used today. In the analog age, researchers generally did power analysis to make sure that their study was not too small (i.e., under-powered). Now, however, researchers should do power analysis to make sure that their study is not too big (i.e., over-powered). If you do a power analysis and your study appears to require an enormous number of people, then that may be a sign that the effect you are studying is tiny. If so, you should ask whether this small effect is sufficiently important to justify exposing a large number of people to risks of an unknown size. In many situations the answer is probably no (Prentice and Miller 1992).
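To see how quickly the required study size grows as the effect shrinks, here is a minimal Python sketch of a power calculation. It assumes a simple two-group comparison of means with equal group sizes and uses the standard normal approximation, n per group ≈ 2((z_{1-α/2} + z_{power}) / d)^2, where d is the standardized effect size. The function name, the default significance level of 0.05, and the default power of 0.8 are illustrative assumptions, and an exact calculation based on the t distribution would give slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(effect_size, alpha=0.05, power=0.8):
    """Approximate participants needed per group for a two-sided, two-group test.

    effect_size: standardized difference in means (Cohen's d), assumed positive.
    alpha: two-sided significance level (illustrative default).
    power: desired probability of detecting the effect (illustrative default).
    """
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the desired power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# Hypothetical effect sizes, from moderate to tiny.
for d in (0.5, 0.2, 0.02):
    print(f"d = {d}: about {sample_size_per_group(d)} participants per group")
```

Under these assumptions, a moderate effect (d = 0.5) needs about 63 people per group, whereas a tiny effect (d = 0.02) needs on the order of 39,000 per group. A calculation like this supports both uses of power analysis: checking that a study is not too small, and checking that it exposes no more people to risk than the research goal requires.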

The minimal risk standard and power analysis help you reason about and design studies, but they don’t provide you any new information about how participants might feel about your study and what risks they might experience from participating in your study. Another way to deal with uncertainty is to collect additional information, which leads to ethical-response surveys and staged trials.

In ethical-response surveys, researchers present a brief description of a proposed research project and then ask two questions:

  • (Q1) “If someone you cared about were a candidate participant for this experiment, would you want that person to be included as a participant?”: [Yes], [I have no preferences], [No]
  • (Q2) “Do you believe that the researchers should be allowed to proceed with this experiment?”: [Yes], [Yes, but with caution], [I’m not sure], [No]

Following each question, respondents are provided a space in which they can explain their answer. Finally, respondents, who could be potential participants or people recruited from a microtask labor market (e.g., Amazon Mechanical Turk), answer some basic demographic questions (Schechter and Bravo-Lillo 2014).

Ethical-response surveys have two features that I find particularly attractive. First, they happen before a study has been conducted, and therefore can prevent problems before the research starts (as opposed to approaches that monitor for adverse reactions). Second, ethical-response surveys enable researchers to pose multiple versions of a research project in order to assess the perceived ethical balance of different versions of the same project. One limitation of ethical-response surveys, however, is that it is not clear how to decide between different research designs given the survey results. Even so, in cases of extreme uncertainty this kind of information might help guide researchers’ decisions; in fact, Schechter and Bravo-Lillo (2014) report abandoning a planned study in response to concerns raised by participants in an ethical-response survey.

While ethical-response surveys can be helpful for assessing reactions to proposed research, they cannot measure the probability or severity of adverse events. One way that medical researchers deal with uncertainty in high-risk settings is staged trials, an approach that might be helpful in some social research.

When testing the effectiveness of a new drug, researchers do not immediately jump to a large randomized clinical trial. Rather, they run two types of studies first. Initially, in a Phase I trial, researchers are particularly focused on finding a safe dose, and these studies involve a small number of people. Once a safe dose is discovered, Phase II trials assess the efficacy of the drug, that is, its ability to work in a best-case situation (Singal, Higgins, and Waljee 2014). Only after Phase I and II studies is a new drug allowed to be assessed in a large randomized controlled trial. While the exact structure of staged trials used in the development of new drugs may not be a good fit for social research, when faced with uncertainty, researchers could run smaller studies explicitly designed to assess safety and efficacy. For example, with Encore, you could imagine the researchers starting with participants in countries with a strong rule of law.

Together these four approaches—the minimal risk standard, power analysis, ethical-response surveys, and staged trials—can help you proceed in a sensible way, even in the face of uncertainty. Uncertainty need not lead to inaction.