5.5.2 Leverage heterogeneity

Once you have motivated a lot of people to work on a real scientific problem, you will discover that your participants will be heterogeneous in two main ways: they will vary in both their skill and their level of effort. The first reaction of many social researchers is to fight against this heterogeneity by trying to exclude low-quality participants and then attempting to collect a fixed amount of information from everyone left. This is the wrong way to design a mass collaboration project. Instead of fighting heterogeneity, you should leverage it.

First, there is no reason to exclude low-skilled participants. In open calls, low-skilled participants cause no problems; their contributions don’t hurt anyone and they don’t require any time to evaluate. In human computation and distributed data collection projects, moreover, the best form of quality control comes through redundancy, not through a high bar for participation. In fact, rather than excluding low-skilled participants, a better approach is to help them make better contributions, much as the researchers at eBird have done.
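To make the idea of quality control through redundancy concrete, here is a minimal sketch in Python. It assumes that each item is shown to several participants and that the final answer is taken by simple majority vote; the function name `majority_vote`, the item names, and the labels are all hypothetical and chosen only for illustration. Real projects may well use more sophisticated aggregation rules, so treat this as one possible approach rather than a description of any particular project.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most common label, or None if there is no clear winner.

    Assumption: a simple plurality rule; a tie between the top two
    labels is treated as "no consensus" rather than broken arbitrarily.
    """
    if not labels:
        return None
    ranked = Counter(labels).most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None  # top two labels are tied
    return ranked[0][0]


# Hypothetical responses: each item was labeled by several participants.
responses = {
    "item_a": ["spiral", "spiral", "elliptical", "spiral"],
    "item_b": ["elliptical", "spiral"],
}

# Aggregate the redundant labels into one consensus answer per item.
consensus = {item: majority_vote(labels) for item, labels in responses.items()}
print(consensus)
```

In this sketch, a single mistaken label on `item_a` is outvoted by the other three, while `item_b` has no consensus and could simply be shown to more participants. The point is that errors from any one participant are absorbed by the design, so there is no need to screen people out in advance.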

Second, there is no reason to collect a fixed amount of information from each participant. Participation in many mass collaboration projects is incredibly unequal (Sauermann and Franzoni 2015), with a small number of people contributing a lot—sometimes called the fat head—and a lot of people contributing a little—sometimes called the long tail. If you don’t collect information from the fat head and the long tail, you are leaving masses of information uncollected. For example, if Wikipedia accepted 10 and only 10 edits per editor, it would lose about 95% of edits (Salganik and Levy 2015). Thus, with mass collaboration projects, it is best to leverage heterogeneity rather than try to eliminate it.
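The cost of a fixed cap can be illustrated with a small calculation. The sketch below, in Python, uses made-up contribution counts (they are not Wikipedia data) in which most participants contribute once and a few contribute a great deal; the function name `fraction_lost` and the numbers are assumptions chosen only to show the arithmetic.

```python
def fraction_lost(contributions, cap):
    """Fraction of all contributions discarded if each person is capped."""
    total = sum(contributions)
    if total == 0:
        return 0.0
    # Each participant's contribution is truncated at the cap.
    kept = sum(min(count, cap) for count in contributions)
    return (total - kept) / total


# Hypothetical, synthetic counts: 90 people contribute once (long tail),
# 8 people contribute 10 times, and 2 people contribute a lot (fat head).
contributions = [1] * 90 + [10] * 8 + [500, 2000]

lost = fraction_lost(contributions, cap=10)
print(f"Share of contributions lost with a cap of 10: {lost:.1%}")
```

With these invented numbers, a cap of 10 keeps 190 of 2,670 contributions, so about 93% are lost, almost all of them from the two most active participants. The exact share depends entirely on the assumed counts, but the pattern is the same whenever participation is highly unequal.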