5.3.4 Conclusion

Open calls enable you to find solutions to problems that you can state clearly but that you cannot solve yourself.

In all three open call projects—Netflix Prize, Foldit, Peer-to-Patent—researchers posed questions of a specific form, solicited solutions, and then picked the best solutions. The researchers didn’t even need to know the best expert to ask, and sometimes the good ideas came from unexpected places.

Now I can also highlight two important differences between open call projects and human computation projects. First, in open call projects the researcher specifies a goal (e.g., predicting movie ratings), whereas in human computation, the researcher specifies a microtask (e.g., classifying a galaxy). Second, in open calls, the researchers want the best contribution—such as the best algorithm for predicting movie ratings, the lowest-energy configuration of a protein, or the most relevant piece of prior art—not some sort of simple combination of all of the contributions.
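To make this contrast concrete, here is a minimal sketch in Python. The function names and data structures are my own illustration, not drawn from any of these projects: an open call scores complete submissions and keeps only the best one, whereas a human computation project combines many small contributions.

```python
import statistics

# Open call: each submission is a complete candidate solution; the
# organizer scores each one (e.g., on held-out data) and keeps the best.
def open_call_winner(submissions, score_on_holdout):
    """Return the single best submission by its held-out score."""
    return max(submissions, key=score_on_holdout)

# Human computation: each contribution is one microtask answer; the
# organizer combines all of them, for example by majority vote.
def human_computation_result(labels):
    """Return the most common classification of one galaxy image."""
    return statistics.mode(labels)
```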

Given the general template for open calls and these three examples, what kinds of problems in social research might be suitable for this approach? At this point, I should acknowledge that there have not been many successful examples yet (for reasons that I’ll explain in a moment). In terms of direct analogs, one could imagine a Peer-to-Patent-style open call being used by a historical researcher searching for the earliest document to mention a specific person or idea. An open call approach to this kind of problem could be especially valuable when the potentially relevant documents are not in a single archive but are widely distributed.

More generally, many governments and companies have problems that might be amenable to open calls because open calls can generate algorithms that can be used for predictions, and these predictions can be an important guide for action (Provost and Fawcett 2013; Kleinberg et al. 2015). For example, just as Netflix wanted to predict ratings of movies, governments might want to predict outcomes such as which restaurants are most likely to have health-code violations in order to allocate inspection resources more efficiently. Motivated by this kind of problem, Edward Glaeser and colleagues (2016) used an open call to help the City of Boston predict restaurant hygiene and sanitation violations from Yelp reviews combined with historical inspection records. They estimated that the predictive model that won the open call would improve the productivity of restaurant inspectors by about 50%.
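As a rough sketch of the kind of predictive model such an open call solicits, consider the following. The file name and feature names are hypothetical, and the winning Boston entries were more sophisticated; this only illustrates the train, score, and rank pattern.

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Hypothetical table: one row per restaurant, with features derived from
# Yelp reviews and past inspections, and a binary violation label.
df = pd.read_csv("inspections.csv")  # hypothetical file
features = ["avg_yelp_rating", "n_recent_reviews", "past_violations"]
X, y = df[features], df["violation"]

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = GradientBoostingClassifier().fit(X_train, y_train)

# Evaluate on held-out data, as an open call organizer would when
# comparing submissions, then rank restaurants by predicted risk so
# that inspectors can visit the most likely violators first.
risk = model.predict_proba(X_test)[:, 1]
print("held-out AUC:", roc_auc_score(y_test, risk))
```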

Open calls can also potentially be used to compare and test theories. For example, the Fragile Families and Child Wellbeing Study has tracked about 5,000 children since birth in 20 different US cities (Reichman et al. 2001). Researchers have collected data about these children, their families, and their broader environment at birth and at ages 1, 3, 5, 9, and 15 years. Given all the information about these children, how well could researchers predict outcomes such as who will graduate from college? Or, expressed in a way that would be more interesting to some researchers, which data and theories would be most effective in predicting these outcomes? Since none of these children are currently old enough to go to college, this would be a true forward-looking prediction, and there are many different strategies that researchers might employ. A researcher who believes that neighborhoods are critical in shaping life outcomes might take one approach, while a researcher who focuses on families might do something completely different. Which of these approaches would work better? We don’t know, and in the process of finding out, we might learn something important about families, neighborhoods, education, and social inequality.

Further, these predictions might be used to guide future data collection. Imagine that there were a small number of college graduates who were not predicted to graduate by any of the models; these people would be ideal candidates for follow-up qualitative interviews and ethnographic observation. Thus, in this kind of open call, the predictions are not the end; rather, they provide a new way to compare, enrich, and combine different theoretical traditions. This kind of open call is not specific to using data from the Fragile Families and Child Wellbeing Study to predict who will go to college; it could be used to predict any outcome that will eventually be collected in any longitudinal social data set.
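Here is a minimal sketch of how such a comparison might be organized. The file and column names below are invented for illustration, not actual Fragile Families variables: each theoretical tradition is expressed as a feature set, the resulting models are compared on out-of-sample predictions, and the graduates that no model predicted are flagged for qualitative follow-up.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

# Hypothetical longitudinal data: one row per child, binary outcome.
df = pd.read_csv("longitudinal_study.csv")  # hypothetical file
y = df["graduated_college"]

# Each theoretical tradition is expressed as a feature set (invented names).
feature_sets = {
    "neighborhoods": ["neighborhood_poverty", "local_school_quality"],
    "families": ["parental_education", "household_income", "family_structure"],
}

# Out-of-sample predictions for each tradition via cross-validation.
predictions = pd.DataFrame({
    theory: cross_val_predict(LogisticRegression(max_iter=1000), df[cols], y)
    for theory, cols in feature_sets.items()
})

# Compare traditions by out-of-sample accuracy ...
print(predictions.eq(y, axis=0).mean())

# ... and flag graduates whom no model predicted: ideal candidates for
# follow-up interviews and ethnographic observation.
missed_by_all = df[(y == 1) & (predictions == 0).all(axis=1)]
```

In a real open call, of course, the outcomes would be withheld from participants entirely and the scoring would be done by the organizers.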

As I wrote earlier in this section, there have not been many examples of social researchers using open calls. I think that this is because open calls are not well suited to the way that social scientists typically ask their questions. Returning to the Netflix Prize, social scientists wouldn’t usually ask about predicting tastes; rather, they would ask about how and why cultural tastes differ for people from different social classes (see, e.g., Bourdieu 1987). Such “how” and “why” questions do not lead to easily verifiable solutions, and therefore seem poorly suited to open calls. Thus, it appears that open calls are more appropriate for questions of prediction than questions of explanation. Recent theorists, however, have called on social scientists to reconsider the dichotomy between explanation and prediction (Watts 2014). As the line between prediction and explanation blurs, I expect that open calls will become increasingly common in social research.