5.2.2 Crowd-coding of political manifestos

Coding political manifestos, something typically done by experts, can be performed by a human computation project, resulting in greater reproducibility and flexibility.

Similar to Galaxy Zoo, there are many situations where social researchers want to code, classify, or label an image or piece of text. An example of this kind of research is the coding of political manifestos. During elections, political parties produce manifestos describing their policy positions and guiding philosophies. For example, here’s a piece of the manifesto of the Labour Party in Great Britain from 2010:

“Millions of people working in our public services embody the best values of Britain, helping empower people to make the most of their own lives while protecting them from the risks they should not have to bear on their own. Just as we need to be bolder about the role of government in making markets work fairly, we also need to be bold reformers of government.”

These manifestos contain valuable data for political scientists, particularly those studying elections and the dynamics of policy debates. In order to systematically extract information from these manifestos, researchers created the Manifesto Project, which organized political scientists to code 4,000 manifestos from nearly 1,000 parties in 50 countries. Each sentence in each manifesto has been coded by an expert using a 56-category scheme. The result of this collaborative effort is a massive dataset summarizing the information embedded in these manifestos, and this dataset has been used in more than 200 scientific papers.

Kenneth Benoit and colleagues (2015) decided to take the manifesto coding task that had previously been performed by experts and turn it into a human computation project. As a result, they created a coding process that is more reproducible and more flexible, not to mention cheaper and faster.

Working with 18 manifestos generated during six recent elections in the UK, Benoit and colleagues used the split-apply-combine strategy with workers from a micro-task labor market (Amazon Mechanical Turk and CrowdFlower are examples of micro-task labor markets; for more on micro-task labor markets, see Chapter 4). The researchers took each manifesto and split it into sentences. Next, human rating was applied to each sentence: if the sentence involved a policy statement, it was coded along two dimensions, economic (from very left to very right) and social (from liberal to conservative) (Figure 5.5). Each sentence was coded by about five different people. Finally, these ratings were combined using a statistical model that accounted for both individual rater effects and sentence-difficulty effects. In all, Benoit and colleagues collected 200,000 ratings from about 1,500 workers.
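To make this split-apply-combine workflow concrete, here is a minimal sketch in Python. The manifesto text, worker identifiers, and scores are invented for illustration, and a plain average stands in for the statistical model that Benoit and colleagues actually used to combine the ratings.

# A minimal sketch of the split-apply-combine strategy with hypothetical data.
# A simple mean replaces the model-based aggregation described in the text.
from collections import defaultdict
from statistics import mean

def split_into_sentences(manifesto_text):
    # Split: break a manifesto into codable units (naive split on periods).
    return [s.strip() for s in manifesto_text.split(".") if s.strip()]

manifesto = "We will cut taxes for working families. We will expand public services."
sentences = split_into_sentences(manifesto)

# Apply: each (sentence index, worker id, score) triple is one judgment from a
# micro-task worker on the economic dimension, from very left (-2) to very right (+2).
raw_ratings = [
    (0, "worker_a", 1), (0, "worker_b", 2), (0, "worker_c", 1),
    (1, "worker_a", -1), (1, "worker_d", -2), (1, "worker_e", -1),
]

# Combine: pool the ratings each sentence received and average them.
ratings_by_sentence = defaultdict(list)
for sentence_index, worker_id, score in raw_ratings:
    ratings_by_sentence[sentence_index].append(score)

for index, scores in sorted(ratings_by_sentence.items()):
    print(sentences[index], "->", round(mean(scores), 2))

Averaging treats every rater and every sentence the same; the model used in the paper instead adjusted for the fact that some raters are systematically harsher and some sentences are harder to judge.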

Figure 5.5: Coding scheme from Benoit et al. (2015) (Fig 1).

In order to assess the quality of the crowd coding, Benoit and colleagues also had about 10 experts—professors and graduate students in political science—rate the same manifestos using a similar procedure. Although the ratings from members of the crowd were more variable than the ratings from the experts, the consensus crowd rating had remarkable agreement with the consensus expert rating (Figure 5.6). This comparison shows that, as with Galaxy Zoo, human computation projects can produce high-quality results.
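The logic of this comparison can be sketched in a few lines of code: take one consensus score per manifesto from the crowd and one from the experts, then measure how closely the two sets of scores track each other. The scores below are invented for illustration; they are not the published estimates.

from math import sqrt

def pearson(xs, ys):
    # Pearson correlation between two equal-length lists of consensus scores.
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

# One hypothetical consensus score per manifesto, on the same left-right scale.
crowd_consensus = [-1.2, -0.4, 0.9, 1.5]
expert_consensus = [-1.0, -0.5, 1.1, 1.4]

print(round(pearson(crowd_consensus, expert_consensus), 2))  # about 0.99 for these made-up numbers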

Figure 5.6: Expert estimates (x-axis) and crowd estimates (y-axis) were in remarkable agreement when coding 18 party manifestos from Great Britain (Benoit et al. 2015). The manifestos coded were from three political parties (Conservative, Labour, Liberal Democrats) and six elections (1987, 1992, 1997, 2001, 2005, 2010).

Building on this result, Benoit and colleagues used their crowd-coding system to do research that was impossible with the Manifesto Project. For example, the Manifesto Project did not code the manifestos on the topic of immigration because that was not a salient topic when the coding scheme was developed in the mid-1980s. And, at this point, it is logistically infeasible for the Manifesto Project to go back and re-code their manifestos to capture this information. Therefore, it would appear that researchers interested in studying the politics of immigration are out of luck. However, Benoit and colleagues were able to use their human computation system to do this coding—customized to their research question—quickly and easily.

In order to study immigration policy, they coded the manifestos for eight parties in the 2010 election in Great Britain. Each sentence in each manifesto was coded as to whether it related to immigration, and, if so, whether it was pro-immigration, neutral, or anti-immigration. Within five hours of launching their project, the results were in. They had collected more than 22,000 responses at a total cost of $360. Further, the estimates from the crowd showed remarkable agreement with an earlier survey of experts. Then, as a final test, two months later, the researchers reproduced their crowd-coding. Within a few hours, they had created a new crowd-coded dataset that closely matched their original crowd-coded dataset. In other words, human computation enabled them to generate coding of political texts that agreed with expert evaluations and was reproducible. Further, because the human computation was quick and cheap, it was easy for them to customize their data collection to their specific research question about immigration.
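The structure of this immigration coding, and of the reproducibility check, can be sketched as follows. The party names, codes, and function name here are hypothetical and chosen only to illustrate how sentence-level codes roll up into party-level positions that can be compared across two independent rounds of crowd-coding.

from statistics import mean

def party_positions(responses):
    # Aggregate sentence-level codes into one immigration score per party.
    # Codes: -1 = anti-immigration, 0 = neutral, +1 = pro-immigration;
    # sentences judged unrelated to immigration are simply not included.
    by_party = {}
    for party, code in responses:
        by_party.setdefault(party, []).append(code)
    return {party: mean(codes) for party, codes in by_party.items()}

# Two hypothetical rounds of crowd-coding, collected months apart.
first_round = [("Party A", 1), ("Party A", 0), ("Party B", -1), ("Party B", 0)]
second_round = [("Party A", 0), ("Party A", 1), ("Party B", -1), ("Party B", -1)]

print(party_positions(first_round))   # {'Party A': 0.5, 'Party B': -0.5}
print(party_positions(second_round))  # {'Party A': 0.5, 'Party B': -1}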