6.3 Digital is different

Social research in the digital age has different characteristics and therefore raises different ethical questions.

In the analog age, most social research had a relatively limited scale and operated within a set of reasonably clear rules. Social research in the digital age is different. Researchers—often in collaboration with companies and governments—have more power over participants than in the past, and the rules about how that power should be used are not yet clear. By power, I mean simply the ability to do things to people without their consent or even awareness. The kinds of things that researchers can do to people include observing their behavior and enrolling them in experiments. As the power of researchers to observe and perturb is increasing, there has not been an equivalent increase in clarity about how that power should be used. In fact, researchers must decide how to exercise their power based on inconsistent and overlapping rules, laws, and norms. This combination of powerful capabilities and vague guidelines creates difficult situations.

One set of powers that researchers now have is the ability to observe people’s behavior without their consent or awareness. Researchers could, of course, do this in the past, but in the digital age, the scale is completely different, a fact that has been proclaimed repeatedly by many fans of big data sources. In particular, if we move from the scale of an individual student or professor and instead consider the scale of a company or government—institutions with which researchers increasingly collaborate—the potential ethical issues become complex. One metaphor that I think helps people visualize the idea of mass surveillance is the panopticon. Originally proposed by Jeremy Bentham as an architecture for prisons, the panopticon is a circular building with cells built around a central watchtower (figure 6.3). Whoever occupies this watchtower can observe the behavior of all the people in the cells without being seen herself. The person in the watchtower is thus an unseen seer (Foucault 1995). To some privacy advocates, the digital age has moved us into a panoptic prison where tech companies and governments are constantly watching and recording our behavior.

Figure 6.3: Design for the panopticon prison, first proposed by Jeremy Bentham. In the center, there is an unseen seer who can observe the behavior of everyone but cannot be observed. Drawing by Willey Reveley, 1791 (Source: Wikimedia Commons).

To carry this metaphor a bit further, when many social researchers think about the digital age, they imagine themselves inside of the watchtower, observing behavior and creating a master database that could be used to do all kinds of exciting and important research. But now, rather than imagining yourself in the watchtower, imagine yourself in one of the cells. That master database starts to look like what Paul Ohm (2010) has called a database of ruin, which could be used in unethical ways.

Some readers of this book are lucky enough to live in countries where they trust their unseen seers to use their data responsibly and to protect it from adversaries. Other readers are not so lucky, and I’m sure that issues raised by mass surveillance are very clear to them. But I believe that even for the lucky readers there is still an important concern raised by mass surveillance: unanticipated secondary use. That is, a database created for one purpose—say targeting ads—might one day be used for a very different purpose. A horrific example of unanticipated secondary use happened during the Second World War, when government census data were used to facilitate the genocide that was taking place against Jews, Roma, and others (Seltzer and Anderson 2008). The statisticians who collected the data during peaceful times almost certainly had good intentions, and many citizens trusted them to use the data responsibly. But, when the world changed—when the Nazis came to power—these data enabled a secondary use that was never anticipated. Quite simply, once a master database exists, it is hard to anticipate who may gain access to it and how it will be used. In fact, William Seltzer and Margo Anderson (2008) have documented 18 cases in which population data systems have been involved or potentially involved in human rights abuses (table 6.1). Further, as Seltzer and Anderson point out, this list is almost certainly an underestimate because most abuses happen in secret.

Table 6.1: Cases where Population Data Systems Have Been Involved or Potentially Involved in Human Rights Abuses. See Seltzer and Anderson (2008) for more information about each case and inclusion criteria. Some, but not all, of these cases involved unanticipated secondary use.
Place | Time | Targeted individuals or groups | Data system | Human rights violation or presumed state intention
Australia | 19th and early 20th century | Aborigines | Population registration | Forced migration, elements of genocide
China | 1966-76 | Bad-class origin during cultural revolution | Population registration | Forced migration, instigated mob violence
France | 1940-44 | Jews | Population registration, special censuses | Forced migration, genocide
Germany | 1933-45 | Jews, Roma, and others | Numerous | Forced migration, genocide
Hungary | 1945-46 | German nationals and those reporting German mother tongue | 1941 population census | Forced migration
Netherlands | 1940-44 | Jews and Roma | Population registration systems | Forced migration, genocide
Norway | 1845-1930 | Samis and Kvens | Population censuses | Ethnic cleansing
Norway | 1942-44 | Jews | Special census and proposed population register | Genocide
Poland | 1939-43 | Jews | Primarily special censuses | Genocide
Romania | 1941-43 | Jews and Roma | 1941 population census | Forced migration, genocide
Rwanda | 1994 | Tutsi | Population registration | Genocide
South Africa | 1950-93 | African and “Colored” populations | 1951 population census and population registration | Apartheid, voter disenfranchisement
United States | 19th century | Native Americans | Special censuses, population registers | Forced migration
United States | 1917 | Suspected draft law violators | 1910 census | Investigation and prosecution of those avoiding registration
United States | 1941-45 | Japanese Americans | 1940 census | Forced migration and internment
United States | 2001-08 | Suspected terrorists | NCES surveys and administrative data | Investigation and prosecution of domestic and international terrorists
United States | 2003 | Arab-Americans | 2000 census | Unknown
USSR | 1919-39 | Minority populations | Various population censuses | Forced migration, punishment of other serious crimes

Ordinary social researchers are very, very far from anything like participating in human rights abuses through the secondary use of data. I’ve chosen to discuss it, however, because I think it will help you understand how some people might react to your work. As an example, let’s return to the Tastes, Ties, and Time project. By merging together complete and granular data from Facebook with complete and granular data from Harvard, the researchers created an amazingly rich view of the social and cultural life of the students (Lewis et al. 2008). To many social researchers, this seems like the master database, which could be used for good. But to some others, it looks like the beginning of the database of ruin, which could be used unethically. In fact, it is probably a bit of both.

In addition to mass surveillance, researchers—again in collaboration with companies and governments—can increasingly intervene in people’s lives in order to create randomized controlled experiments. For example, in Emotional Contagion, researchers enrolled 700,000 people in an experiment without their consent or awareness. As I described in chapter 4, this kind of secret conscription of participants into experiments is not uncommon, and it does not require the cooperation of large companies. In fact, in chapter 4, I taught you how to do it.

In the face of this increased power, researchers are subject to inconsistent and overlapping rules, laws, and norms. One source of this inconsistency is that the capabilities of the digital age are changing more quickly than rules, laws, and norms. For example, the Common Rule (the set of regulations governing most government-funded research in the United States) has not changed much since 1981. A second source of inconsistency is that norms around abstract concepts such as privacy are still being actively debated by researchers, policy makers, and activists. If specialists in these areas cannot reach a uniform consensus, we should not expect empirical researchers or participants to do so. A third and final source of inconsistency is that digital-age research is increasingly mixed into other contexts, which leads to potentially overlapping norms and rules. For example, Emotional Contagion was a collaboration between a data scientist at Facebook and a professor and graduate student at Cornell. At that time, it was common at Facebook to run large experiments without third-party oversight, as long as the experiments complied with Facebook’s terms of service. At Cornell, the norms and rules are quite different; virtually all experiments must be reviewed by the Cornell IRB. So, which set of rules should govern Emotional Contagion—Facebook’s or Cornell’s? When there are inconsistent and overlapping rules, laws, and norms, even well-meaning researchers might have trouble doing the right thing. In fact, because of the inconsistency, there might not even be a single right thing.

Overall, these two features—increasing power and lack of agreement about how that power should be used—mean that researchers working in the digital age are going to be facing ethical challenges for the foreseeable future. Fortunately, when dealing with these challenges, it is not necessary to start from scratch. Instead, researchers can draw wisdom from previously developed ethical principles and frameworks, the topics of the next two sections.