5.4 Distributed data collection

Mass collaboration can also help with data collection, but ensuring data quality and a systematic approach to sampling is tricky.

In addition to creating human computation and open call projects, researchers can also create distributed data collection projects. In fact, much of quantitative social science already relies on distributed data collection using paid staff. For example, to collect the data for the General Social Survey, a company hires interviewers to collect information from respondents. But what if we could somehow enlist volunteers as data collectors?

As the examples below from ornithology and computer science show, distributed data collection enables researchers to collect data more frequently and in more places than was previously possible. Further, given appropriate protocols, these data can be reliable enough to be used for scientific research. In fact, for certain research questions, distributed data collection is better than anything that would realistically be possible with paid data collectors.