The key to running large experiments is to drive your variable cost to zero. The best ways to do this are automation and designing enjoyable experiments.
Digital experiments can have dramatically different cost structures, and this enables researchers to run experiments that were impossible in the past. One way to think about this difference is to note that experiments generally have two types of costs: fixed costs and variable costs. Fixed costs are costs that remain unchanged regardless of the number of participants. For example, in a lab experiment, fixed costs might be the costs of renting space and buying furniture. Variable costs, on the other hand, change depending on the number of participants. For example, in a lab experiment, variable costs might come from paying staff and participants. In general, analog experiments have low fixed costs and high variable costs, while digital experiments have high fixed costs and low variable costs (figure 4.19). Even though digital experiments have low variable costs, you can create a lot of exciting opportunities when you drive the variable cost all the way to zero.
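The fixed/variable distinction above can be made concrete with a simple linear cost model: total cost equals the fixed cost plus the variable cost per participant times the number of participants. The sketch below is only an illustration. The function names and the dollar figures in `HYPOTHETICAL_ANALOG` and `HYPOTHETICAL_DIGITAL` are invented for the example and are not taken from any study.

```python
def total_cost(fixed, variable_per_participant, n_participants):
    """Total cost under a simple linear model: fixed + variable * n."""
    return fixed + variable_per_participant * n_participants


def break_even_participants(analog, digital):
    """Smallest n at which the digital design costs no more than the analog one.

    Each design is a (fixed, variable_per_participant) pair.
    Returns None if the digital design never catches up.
    """
    fixed_a, var_a = analog
    fixed_d, var_d = digital
    if fixed_d <= fixed_a:
        return 0  # already no more expensive with zero participants
    if var_d >= var_a:
        return None  # higher fixed cost and no per-participant saving
    gap = fixed_d - fixed_a      # extra fixed cost of the digital design
    saving = var_a - var_d       # saving per additional participant
    return -(-gap // saving)     # ceiling division


# Hypothetical numbers, for illustration only.
HYPOTHETICAL_ANALOG = (1_000, 20)    # low fixed cost, high variable cost
HYPOTHETICAL_DIGITAL = (50_000, 0)   # high fixed cost, zero variable cost
```

Under these invented numbers, the digital design costs the same whether it has one participant or one million, which is why driving the variable cost to zero removes the usual ceiling on sample size.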
There are two main elements of variable cost—payments to staff and payments to participants—and each of these can be driven to zero using different strategies. Payments to staff stem from the work that research assistants do recruiting participants, delivering treatments, and measuring outcomes. For example, the analog field experiment of Schultz and colleagues (2007) on electricity usage required research assistants to travel to each home to deliver the treatment and read the electric meter (figure 4.3). All of this effort by research assistants meant that adding a new household to the study would have added to the cost. On the other hand, for the digital field experiment of Restivo and van de Rijt (2012) on the effect of awards on Wikipedia editors, researchers could add more participants at virtually no cost. A general strategy for reducing variable administrative costs is to replace human work (which is expensive) with computer work (which is cheap). Roughly, you can ask yourself: Can this experiment run while everyone on my research team is sleeping? If the answer is yes, you’ve done a great job of automation.
The second main type of variable cost is payments to participants. Some researchers have used Amazon Mechanical Turk and other online labor markets to decrease the payments that are needed for participants. To drive variable costs all the way to zero, however, a different approach is needed. For a long time, researchers have designed experiments that are so boring they have to pay people to participate. But what if you could create an experiment that people want to be in? This may sound far-fetched, but I’ll give you an example below from my own work, and there are more examples in table 4.4. Note that this idea of designing enjoyable experiments echoes some of the themes in chapter 3 regarding designing more enjoyable surveys and in chapter 5 regarding the design of mass collaboration. Thus, I think that participant enjoyment—what might also be called user experience—will be an increasingly important part of research design in the digital age.
|Compensation|References|
|---|---|
|Website with health information|Centola (2010)|
|Exercise program|Centola (2011)|
|Free music|Salganik, Dodds, and Watts (2006); Salganik and Watts (2008); Salganik and Watts (2009b)|
|Fun game|Kohli et al. (2012)|
|Movie recommendations|Harper and Konstan (2015)|
If you want to create experiments with zero variable cost data, you’ll need to ensure that everything is fully automated and that participants don’t require any payment. In order to show how this is possible, I’ll describe my dissertation research on the success and failure of cultural products.
My dissertation was motivated by the puzzling nature of success for cultural products. Hit songs, best-selling books, and blockbuster movies are much, much more successful than average. Because of this, the markets for these products are often called “winner-take-all” markets. Yet, at the same time, which particular song, book, or movie will become successful is incredibly unpredictable. The screenwriter William Goldman (1989) elegantly summed up lots of academic research by saying that, when it comes to predicting success, “nobody knows anything.” The unpredictability of winner-take-all markets made me wonder how much of success is a result of quality and how much is just luck. Or, expressed slightly differently, if we could create parallel worlds and have them all evolve independently, would the same songs become popular in each world? And, if not, what might be a mechanism that causes these differences?
In order to answer these questions, Peter Dodds, Duncan Watts (my dissertation advisor), and I ran a series of online field experiments. In particular, we built a website called MusicLab where people could discover new music, and we used it for a series of experiments. We recruited participants by running banner ads on a teen-interest website (figure 4.20) and through mentions in the media. Participants arriving at our website provided informed consent, completed a short background questionnaire, and were randomly assigned to one of two experimental conditions: independent and social influence. In the independent condition, participants made decisions about which songs to listen to, given only the names of the bands and the songs. While listening to a song, participants were asked to rate it, after which they had the opportunity (but not the obligation) to download the song. In the social influence condition, participants had the same experience, except they could also see how many times each song had been downloaded by previous participants. Furthermore, participants in the social influence condition were randomly assigned to one of eight parallel worlds, each of which evolved independently (figure 4.21). Using this design, we ran two related experiments. In the first, we presented the songs to the participants in an unsorted grid, which provided them with a weak signal of popularity. In the second experiment, we presented the songs in a ranked list, which provided a much stronger signal of popularity (figure 4.22).
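The assignment logic described above can be sketched in a few lines. This is a minimal illustration and not the actual MusicLab code: the class and method names are hypothetical, and the `p_independent` parameter is an assumption, since the text does not say what share of participants went to each condition. The point it illustrates is that each world keeps its own download counts, so the worlds evolve independently.

```python
import random
from collections import Counter

N_WORLDS = 8  # eight parallel social influence worlds, as described in the text


class ParallelWorldsExperiment:
    """Hypothetical sketch of two-stage random assignment with isolated worlds."""

    def __init__(self, n_worlds=N_WORLDS, p_independent=0.5, seed=None):
        self.rng = random.Random(seed)
        self.p_independent = p_independent      # assumed value, not from the source
        self.independent_downloads = Counter()  # recorded but never displayed
        self.world_downloads = [Counter() for _ in range(n_worlds)]

    def assign(self):
        """Return ("independent", None) or ("social", world_index)."""
        if self.rng.random() < self.p_independent:
            return ("independent", None)
        return ("social", self.rng.randrange(len(self.world_downloads)))

    def visible_counts(self, assignment):
        """Download counts a participant may see: none in the independent condition."""
        condition, world = assignment
        if condition == "independent":
            return None
        return dict(self.world_downloads[world])

    def record_download(self, assignment, song):
        """Add a download to the participant's own world only."""
        condition, world = assignment
        if condition == "independent":
            self.independent_downloads[song] += 1
        else:
            self.world_downloads[world][song] += 1
```

Because assignment, display, and recording all happen in code, a design like this needs no staff time per participant, which is what allows the variable administrative cost to approach zero.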
We found that the popularity of the songs differed across the worlds, suggesting that luck played an important role in success. For example, in one world the song “Lockdown” by 52Metro came in 1st out of 48 songs, while in another world it came in 40th. This was exactly the same song competing against all the same other songs, but in one world it got lucky and in the others it did not. Further, by comparing results across the two experiments, we found that social influence increases the winner-take-all nature of these markets, which perhaps suggests the importance of skill. But, looking across the worlds (which can’t be done outside of this kind of parallel worlds experiment), we found that social influence actually increased the importance of luck. Further, surprisingly, it was the songs of highest appeal where luck mattered most (figure 4.23).
MusicLab was able to run at essentially zero variable cost because of the way that it was designed. First, everything was fully automated, so it was able to run while I was sleeping. Second, the compensation was free music, so there was no variable participant compensation cost. The use of music as compensation also illustrates how there is sometimes a trade-off between fixed and variable costs. Using music increased the fixed costs because I had to spend time securing permission from the bands and preparing reports for them about participants’ reactions to their music. But in this case, increasing fixed costs in order to decrease variable costs was the right thing to do; that’s what enabled us to run an experiment that was about 100 times larger than a standard lab experiment.
Further, the MusicLab experiments show that zero variable cost does not have to be an end in itself; rather, it can be a means to running a new kind of experiment. Notice that we did not use all of our participants to run a standard social influence lab experiment 100 times. Instead, we did something different, which you could think of as switching from a psychological experiment to a sociological one (Hedström 2006). Rather than focusing on individual decision-making, we focused our experiment on popularity, a collective outcome. This switch to a collective outcome meant that we required about 700 participants to produce a single data point (there were 700 people in each of the parallel worlds). That scale was only possible because of the cost structure of the experiment. In general, if researchers want to study how collective outcomes arise from individual decisions, group experiments such as MusicLab are very exciting. In the past, they have been logistically difficult, but those difficulties are fading because of the possibility of zero variable cost data.
In addition to illustrating the benefits of zero variable cost data, the MusicLab experiments also show a challenge with this approach: high fixed costs. In my case, I was extremely lucky to be able to work with a talented web developer named Peter Hausel for about six months to construct the experiment. This was only possible because my advisor, Duncan Watts, had received a number of grants to support this kind of research. Technology has improved since we built MusicLab in 2004, so it would be much easier to build an experiment like this now. But strategies with high fixed costs are only possible for researchers who can somehow cover those costs.
In conclusion, digital experiments can have dramatically different cost structures than analog experiments. If you want to run really large experiments, you should try to decrease your variable cost as much as possible and ideally all the way to zero. You can do this by automating the mechanics of your experiment (e.g., replacing human time with computer time) and designing experiments that people want to be in. Researchers who can design experiments with these features will be able to run new kinds of experiments that were not possible in the past. However, the ability to create zero variable cost experiments can raise new ethical questions, the topic that I shall now address.