2.4.3.1 Natural experiments

Natural experiments take advantage of random events in the world: random event + always-on data system = natural experiment.

The key to randomized controlled experiments enabling fair comparison is the randomization. However, occasionally something happens in the world that essentially assigns people randomly or nearly randomly to different treatments. One of the clearest examples of the strategy of using natural experiments comes from the research of Angrist (1990), which measures the effect of military service on earnings.

During the war in Vietnam, the United States increased the size of its armed forces through a draft. In order to decide which citizens would be called into service, the US government held a lottery. Every birthdate was represented on a piece of paper, and these papers were placed in a large glass jar. As shown in Figure 2.5, these slips of paper were drawn from the jar one at a time to determine the order that young men would be called to serve (young women were not subject to the draft). Based on the results, men born on September 14 were called first, men born on April 24 were called second, and so on. Ultimately, in this lottery, men born on 195 different days were called to service while men born on 171 days were not called.

Figure 2.5: Congressman Alexander Pirnie (R-NY) drawing the first capsule for the Selective Service draft on December 1, 1969. Joshua Angrist (1990) combined the draft lottery with earnings data from the Social Security Administration to estimate the effect of military service on earnings. This is an example of research using a natural experiment. Source: Wikimedia Commons

Although it might not be immediately apparent, a draft lottery has a critical similarity to a randomized controlled experiment: in both situations participants are randomly assigned to receive a treatment. In the case of the draft lottery, if we are interested in learning about the effects of draft eligibility and military service on subsequent labor market earnings, we can compare outcomes for people whose birthdates were drawn before the lottery cutoff (e.g., September 14, April 24, etc.) with outcomes for people whose birthdates were drawn after the cutoff (e.g., February 20, December 2, etc.).
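The comparison described above can be sketched in a few lines of Python. This is only a minimal illustration, not Angrist's actual procedure: the record layout, the name `difference_in_means`, and the toy outcome values are all hypothetical, and the default cutoff of 195 is simply the number of birthdates that were called in this lottery.

```python
from statistics import mean

# Hypothetical records: (lottery_number, outcome). The lottery number is the
# order in which a birthdate was drawn; the outcome stands in for something
# like later earnings. These values are made up to show the mechanics only.
TOY_RECORDS = [
    (1, 10.0), (50, 12.0), (195, 11.0),     # drawn at or before the cutoff
    (196, 13.0), (300, 12.0), (366, 14.0),  # drawn after the cutoff
]

def difference_in_means(records, cutoff=195):
    """Mean outcome of the draft-eligible minus mean outcome of everyone else."""
    eligible = [y for number, y in records if number <= cutoff]
    ineligible = [y for number, y in records if number > cutoff]
    return mean(eligible) - mean(ineligible)

print(difference_in_means(TOY_RECORDS))
```

Note that this difference estimates the effect of being draft-eligible, which is not necessarily the same as the effect of actually serving.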

Given that the treatment of being drafted was randomly assigned, we can measure its effect on any outcome that has been recorded. For example, Angrist (1990) combined the information about who was randomly selected in the draft with earnings data collected by the Social Security Administration to conclude that the earnings of white veterans were about 15% less than the earnings of comparable non-veterans. Other researchers have used a similar trick. For example, Conley and Heerwig (2011) combined the information about who was randomly selected in the draft with household data collected from the 2000 Census and the 2005 American Community Survey and found that, decades after the draft, there was little long-term effect of military service on a variety of outcomes such as housing tenure (owning versus renting) and residential stability (likelihood of having moved in the previous five years).

As this example illustrates, sometimes social, political, or natural forces create experiments or near-experiments that can be leveraged by researchers. Often natural experiments are the best way to estimate cause-and-effect relationships in settings where it is not ethical or practical to run randomized controlled experiments. They are an important strategy for discovering fair comparisons in non-experimental data. This research strategy can be summarized by this equation:

$\text{random (or as if random) event} + \text{always-on data stream} = \text{natural experiment}\qquad(2.1)$

However, the analysis of natural experiments can be quite tricky. For example, in the case of the Vietnam draft, not everyone who was draft-eligible ended up serving (there were a variety of exemptions). At the same time, some people who were not draft-eligible volunteered for service. It was as if, in a clinical trial of a new drug, some people in the treatment group did not take their medicine and some people in the control group somehow received the drug. This problem, called two-sided noncompliance, and many other problems are described in greater detail in some of the recommended readings at the end of this chapter.
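To make two-sided noncompliance concrete, here is a rough sketch of one standard way to handle it: treat draft eligibility as an instrument for military service and compute the Wald estimator, which divides the difference in mean outcomes by the difference in service rates. The choice of this estimator is an assumption made here for illustration, since the text defers the details to the recommended readings. The estimator is only interpretable under further assumptions (for example, that eligibility affects outcomes only through service). The name `wald_estimate` and the toy records are hypothetical.

```python
from statistics import mean

# Hypothetical records: (eligible, served, outcome), with 0/1 indicators.
# Some eligible men do not serve and some ineligible men volunteer, which is
# the two-sided noncompliance described above. Values are made up.
TOY_RECORDS = [
    (1, 1, 8.0), (1, 1, 8.0), (1, 1, 8.0), (1, 0, 12.0),
    (0, 1, 8.0), (0, 0, 12.0), (0, 0, 12.0), (0, 0, 12.0),
]

def wald_estimate(records):
    """Effect of eligibility on the outcome, scaled by its effect on service."""
    y1 = [y for z, d, y in records if z == 1]
    y0 = [y for z, d, y in records if z == 0]
    d1 = [d for z, d, y in records if z == 1]
    d0 = [d for z, d, y in records if z == 0]
    intent_to_treat = mean(y1) - mean(y0)  # outcome gap by eligibility
    first_stage = mean(d1) - mean(d0)      # service-rate gap by eligibility
    if first_stage == 0:
        raise ValueError("eligibility does not change the service rate")
    return intent_to_treat / first_stage

print(wald_estimate(TOY_RECORDS))
```

In the toy data, eligibility raises the service rate by only half, so the raw gap in outcomes between eligible and ineligible men understates the gap associated with service itself; the ratio corrects for that.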

The strategy of taking advantage of naturally occurring random assignment predates the digital age, but the prevalence of big data makes this strategy much easier to use. Once you realize some treatment has been assigned randomly, big data sources can provide the outcome data that you need in order to compare the results for people in the treatment and control conditions. For example, in his study of the effects of the draft and military service, Angrist made use of earnings records from the Social Security Administration; without this outcome data, his study would not have been possible. In this case, the Social Security Administration is the always-on big data source. As more and more automatically collected data sources come into existence, we will have more outcome data that can measure the effects of changes created by exogenous variation.

To illustrate this strategy in the digital age, let’s consider Mas and Moretti’s (2009) elegant research on the effect of peers on productivity. Although on the surface it might look different from Angrist’s study of the effects of the Vietnam draft, in structure both follow the pattern in eq. 2.1.

Mas and Moretti measured how peers affect the productivity of workers. On the one hand, having a hard-working peer might lead workers to increase their productivity because of peer pressure. On the other hand, a hard-working peer might lead other workers to slack off even more. The clearest way to study peer effects on productivity would be a randomized controlled experiment in which workers are randomly assigned to shifts with workers of different productivity levels and the resulting productivity is then measured for everyone. Researchers, however, do not control the schedule of workers in any real business, and so Mas and Moretti had to rely on a natural experiment that took place in a supermarket.

Just like eq. 2.1, their study had two parts. First, they used the logs from the supermarket checkout system to obtain a precise, individual, and always-on measure of productivity: the number of items scanned per second. Second, because of the way that scheduling was done at this supermarket, they had a near-random composition of peers. In other words, even though the scheduling of cashiers was not determined by a lottery, it was essentially random. In practice, the confidence we have in natural experiments frequently hinges on the plausibility of this “as-if” random claim. Taking advantage of this random variation, Mas and Moretti found that working with higher-productivity peers increases productivity. Further, Mas and Moretti used the size and richness of their dataset to move beyond the estimation of cause-and-effect to explore two more important and subtle issues: heterogeneity of this effect (for which kinds of workers is the effect larger) and the mechanism behind the effect (why does having high-productivity peers lead to higher productivity). We will return to these two important issues, heterogeneity of treatment effects and mechanisms, in Chapter 5 when we discuss experiments in more detail.
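One common way to probe an “as-if” random claim is a balance check: if assignment really was as good as random, characteristics measured before treatment should look similar across groups. The sketch below computes a standardized mean difference for one such characteristic. This check is an illustrative assumption rather than something the studies above are described as doing; the covariate (months of job tenure), the name `standardized_mean_difference`, and the numbers are all hypothetical. A small difference makes the claim more plausible but cannot prove it.

```python
from math import sqrt
from statistics import mean, variance

def standardized_mean_difference(group_a, group_b):
    """Difference in group means, in units of the pooled standard deviation."""
    pooled_sd = sqrt((variance(group_a) + variance(group_b)) / 2)
    return (mean(group_a) - mean(group_b)) / pooled_sd

# Hypothetical pre-treatment covariate: months of job tenure for cashiers who
# ended up on shifts with high-productivity versus low-productivity peers.
tenure_high_peers = [10, 12, 14]
tenure_low_peers = [9, 11, 13]

print(standardized_mean_difference(tenure_high_peers, tenure_low_peers))
```

Values near zero suggest the groups were comparable on that characteristic before treatment; large values are a warning sign that something other than chance shaped who ended up in which group.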

Generalizing from the study of the effect of the Vietnam draft on earnings and the study of the effect of peers on productivity, Table 2.3 summarizes other studies that have this same structure: using an always-on data source to measure the impact of some event. As Table 2.3 makes clear, natural experiments are everywhere if you just know how to look for them.

Table 2.3: Examples of natural experiments using big data sources. All these studies follow the same basic recipe: random (or as if random) event + always-on data system. See Dunning (2012) for more examples.
Substantive focus	Source of natural experiment	Always-on data source	Citation
Peer effects on productivity	scheduling process	checkout data	Mas and Moretti (2009)
Friendship formation	hurricanes	Facebook	Phan and Airoldi (2015)
Spread of emotions	rain	Facebook	Coviello et al. (2014)
Peer-to-peer economic transfers	earthquake	mobile money data	Blumenstock, Fafchamps, and Eagle (2011)
Personal consumption behavior	2013 US government shutdown	personal finance data	Baker and Yannelis (2015)
Economic impact of recommender systems	various	browsing data at Amazon	Sharma, Hofman, and Watts (2015)
Effect of stress on unborn babies	2006 Israel–Hezbollah war	birth records	Torche and Shwed (2015)
Reading behavior on Wikipedia	Snowden revelations	Wikipedia logs	Penney (2016)

In practice, researchers use two different strategies for finding natural experiments, both of which can be fruitful. Some researchers start with the always-on data source and look for random events in the world; others start with random events in the world and look for data sources that capture their impact. Finally, notice that the strength of natural experiments comes not from the sophistication of the statistical analysis, but from the care in discovering a fair comparison created by a fortunate accident of history.