5.4 Distributed data collection

Mass collaboration can also help with data collection, but it is tricky to ensure data quality and systematic approaches to sampling.

In addition to creating human computation projects and open calls, researchers can also create distributed data collection projects. In fact, much of quantitative social science already relies on distributed data collection in the form of surveys administered by paid staff. For example, to collect the data for the General Social Survey, a company is hired that in turn hires interviewers who go to respondents' homes to collect information from them. But what if we could somehow enlist volunteers as data collectors?

As the examples below from ornithology and computer science show, distributed data collection enables researchers to collect data more frequently and in more places than was previously possible. Further, given appropriate protocols, this data can be reliable enough to be used for scientific research. In fact, for certain research questions, distributed data collection is better than anything that would realistically be possible with paid data collectors.