6.2.3 Encore

Researchers caused people’s computers to secretly attempt to visit websites that were thought to be blocked by repressive governments.

In March 2014, researchers launched Encore, a system to provide real-time and global measurements of Internet censorship. To understand how it worked, let’s think about it in the context of your personal webpage (if you don’t have one, you can imagine your friend’s). One way to think about your webpage is as a computer program written in the HTML language. When a user visits your website, her computer downloads your HTML program and then renders it on her screen. Thus, your webpage is a program that can induce other people’s computers to follow certain sets of instructions. Building on this idea, the researchers, Sam Burnett and Nick Feamster, who were at Georgia Tech, encouraged website owners to install a small code snippet into their webpages:

<iframe src="//encore.noise.gatech.edu/task.html" 
        width="0" height="0"
        style="display: none"></iframe>

If you visit a webpage with this code snippet in it, here’s what will happen. While your web browser is rendering the webpage, the code snippet will cause your computer to try to contact a website that the researchers are monitoring. For example, it could be the website of a banned political party or persecuted religious group. Then, your computer will report back to the researchers about whether it was able to contact the potentially blocked website (Figure 6.2). Further, all of this would be invisible to you unless you checked the HTML source file of the webpage. Such invisible third-party page requests are actually quite common on the web (Narayanan and Zevenbergen 2015), but they rarely involve explicit attempts to measure censorship.
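To make the mechanism concrete, here is a hypothetical sketch, in Python rather than the browser-side code Encore actually used, of the two steps the measurement page performs: attempt to reach a target website, then package the outcome for the researchers’ collection server. All function and field names here are invented for illustration; this is not Encore’s actual code.

```python
import socket
import urllib.error
import urllib.request


def can_reach(target_url: str, timeout: float = 5.0) -> bool:
    """Try to fetch target_url; return True if any HTTP response comes back."""
    try:
        urllib.request.urlopen(target_url, timeout=timeout)
        return True
    except urllib.error.HTTPError:
        # The server answered, even if with an error status, so it is reachable.
        return True
    except (urllib.error.URLError, socket.timeout, ValueError):
        # Blocked, unreachable, or malformed: the request never completed.
        return False


def build_report(origin: str, target: str, reachable: bool) -> dict:
    """The kind of result a client might send back to the collection server."""
    return {"origin": origin, "target": target, "reachable": reachable}
```

In the real system this logic runs invisibly in the visitor’s browser, and the report is what lets the researchers aggregate reachability results across many countries.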

Figure 6.2: Schematic of the research design of Encore. The origin website sends you a webpage written in HTML with a small code snippet embedded in it (step 1). Your computer renders the webpage, which triggers the measurement task (step 2). Your computer attempts to access a measurement target, which could be the website of a banned political group (step 3). A censor, such as a government, may then block your access to the measurement target (step 4). Finally, your computer reports the results of this request to the researchers (not shown in the figure). Figure from Burnett and Feamster (2015).

This approach has some very attractive technical properties for measuring censorship. If enough websites add this code snippet, then the researchers can have a real-time, global-scale measure of which websites are censored by which countries. Before launching the project, the researchers conferred with the IRB at Georgia Tech, which declined to review the project because it was not “human subjects research” under the Common Rule (the Common Rule is the set of regulations governing most federally funded research in the US; for more information, see the Historical Appendix at the end of this chapter).

Soon after Encore was launched, however, the researchers were contacted by Ben Zevenbergen, then a graduate student, who raised questions about the ethics of the project. In particular, there was a concern that people in certain countries could be exposed to risk if their computers attempted to visit certain sensitive websites, and these people had not consented to participate in the study. Based on these conversations, the Encore team modified the project to attempt to measure the censorship of only Facebook, Twitter, and YouTube, because third-party attempts to access these sites are common during normal web browsing (e.g., every webpage with a Facebook Like button triggers a third-party request to Facebook).

After collecting data using this modified design, a paper describing the methodology and some results was submitted to SIGCOMM, a prestigious computer science conference. The program committee appreciated the technical contribution of the paper, but expressed concern about the lack of informed consent from participants. Ultimately, the program committee decided to publish the paper, but with a signing statement expressing ethical concerns (Burnett and Feamster 2015). Such a signing statement had never been used before at SIGCOMM, and this case has led to additional debate by computer scientists about the nature of ethics in their research (Narayanan and Zevenbergen 2015).