5.1 Introduction

Wikipedia is amazing. A mass collaboration of volunteers created a fantastic encyclopedia that is available to everyone. The key to Wikipedia’s success was not new knowledge; rather, it was a new form of collaboration. The digital age, fortunately, enables many new forms of collaboration. Thus, we should now ask: What massive scientific problems—problems that we could not solve individually—can we now tackle together?

Collaboration in research is nothing new, of course. What is new, however, is that the digital age enables collaboration with a much larger and more diverse set of people: the billions of people around the world with Internet access. I expect that these new mass collaborations will yield amazing results not just because of the number of people involved but also because of their diverse skills and perspectives. How can we incorporate everyone with an Internet connection into our research process? What could you do with 100 research assistants? What about 100,000 skilled collaborators?

There are many forms of mass collaboration, and computer scientists typically organize them into a large number of categories based on their technical characteristics (Quinn and Bederson 2011). In this chapter, however, I’m going to categorize mass collaboration projects based on how they can be used for social research. In particular, I think it is helpful to roughly distinguish between three types of projects: human computation, open call, and distributed data collection (figure 5.1).

I’ll describe each of these types in greater detail later in the chapter, but for now let me describe each one briefly. Human computation projects are ideally suited for easy-task-big-scale problems such as labeling a million images. These are projects that in the past might have been performed by undergraduate research assistants. Contributions don’t require task-related skills, and the final output is typically an average of all of the contributions. A classic example of a human computation project is Galaxy Zoo, where a hundred thousand volunteers helped astronomers classify a million galaxies. Open call projects, on the other hand, are ideally suited for problems where you are looking for novel and unexpected answers to clearly formulated questions. These are projects that in the past might have involved asking colleagues. Contributions come from people who have special task-related skills, and the final output is usually the best of all of the contributions. A classic example of an open call project is the Netflix Prize, where thousands of scientists and hackers worked to develop new algorithms to predict customers’ ratings of movies. Finally, distributed data collection projects are ideally suited for large-scale data collection. These are projects that in the past might have been performed by undergraduate research assistants or survey research companies. Contributions typically come from people who have access to locations that researchers do not, and the final product is a simple collection of the contributions. A classic example of a distributed data collection project is eBird, in which hundreds of thousands of volunteers contribute reports about birds they see.

Figure 5.1: Mass collaboration schematic. This chapter is organized around three main forms of mass collaboration: human computation, open call, and distributed data collection. More generally, mass collaboration combines ideas from fields such as citizen science, crowdsourcing, and collective intelligence.

Mass collaboration has a long, rich history in fields such as astronomy (Marshall, Lintott, and Fletcher 2015) and ecology (Dickinson, Zuckerberg, and Bonter 2010), but it is not yet common in social research. However, by describing successful projects from other fields and providing a few key organizing principles, I hope to convince you of two things. First, mass collaboration can be harnessed for social research. And, second, researchers who use mass collaboration will be able to solve problems that had previously seemed impossible. Although mass collaboration is often promoted as a way to save money, it is much more than that. As I will show, mass collaboration doesn’t just allow us to do research more cheaply; it allows us to do research better.

In the previous chapters, you have seen what can be learned by engaging with people in three different ways: observing their behavior (Chapter 2), asking them questions (Chapter 3), and enrolling them in experiments (Chapter 4). In this chapter, I’ll show you what can be learned by engaging people as research collaborators. For each of the three main forms of mass collaboration, I will describe a prototypical example, illustrate important additional points with further examples, and finally describe how this form of mass collaboration might be used for social research. The chapter will conclude with five principles that can help you design your own mass collaboration project.