Activities


Key:

degree of difficulty: easy, medium, hard, very hard
requires math
requires coding
data collection
my favorites

[] In arguing against the Emotional Contagion experiment, Kleinsman and Buckley (2015) wrote:

“Even if it is true that the risks for the Facebook experiment were low and even if, in hindsight, the results are judged to be useful, there is an important principle at stake here that must be upheld. In the same way that stealing is stealing no matter what amounts are involved, so we all have a right not to be experimented on without our knowledge and consent, whatever the nature of the research.”
1. Which of the two ethical frameworks discussed in this chapter—consequentialism or deontology—is this argument most clearly associated with?
2. Now, imagine that you wanted to argue against this position. How would you argue the case to a reporter for The New York Times?
3. How, if at all, would your argument be different if you were discussing this with a colleague?
[] Maddock, Mason, and Starbird (2015) consider the question of whether researchers should use tweets that have been deleted. Read their paper to learn about the background.
1. Analyze this decision from a deontological perspective.
2. Analyze the exact same decision from a consequentialist perspective.
3. Which do you find more convincing in this case?
[] In an article on the ethics of field experiments, Humphreys (2015) proposed the following hypothetical experiment to highlight the ethical challenges of interventions that are done without the consent of all impacted parties and that harm some and help others.

“Say a researcher is contacted by a set of community organizations that want to figure out whether placing street lights in slums will reduce violent crime. In this research the subjects are the criminals: seeking informed consent of the criminals would likely compromise the research and it would likely not be forthcoming anyhow (violation of respect for persons); the criminals will likely bear the costs of the research without benefiting (violation of justice); and there will be disagreement regarding the benefits of the research – if it is effective, the criminals in particular will not value it (producing a difficulty for assessing benevolence). . . . The special issues here are not just around the subjects however. Here there are also risks that obtain to non-subjects, if for example criminals retaliate against the organizations putting the lamps in place. The organization may be very aware of these risks but be willing to bear them because they erroneously put faith in the ill-founded expectations of researchers from wealthy universities who are themselves motivated in part to publish.”
1. Write an email to the community organization offering your ethical assessment of the experiment as designed. Would you help them do the experiment as proposed? What factors might impact your decision?
2. Are there some changes that might improve your assessment of the ethics of this experimental design?
[] In the 1970s, 60 men participated in a field experiment that took place in the men’s bathroom at a university in the midwestern part of the US (the researchers don’t name the university) (Middlemist, Knowles, and Matter 1976). The researchers were interested in how people respond to violations of their personal space, which Sommer (1969) defined as the “area with invisible boundaries surrounding a person’s body into which intruders may not come.” More specifically, the researchers chose to study how a man’s urination was impacted by the presence of others nearby. After conducting a purely observational study, the researchers conducted a field experiment. Participants were forced to use the left-most urinal in a three-urinal bathroom (the researchers do not explain exactly how this happened). Next, participants were assigned to one of three levels of interpersonal distance. For some men a confederate used a urinal right next to them, for some men a confederate used a urinal one space away from them, and for some men no confederate entered the bathroom. The researchers measured their outcome variables (delay time and persistence) by stationing a research assistant inside the toilet stall adjacent to the participant’s urinal. Here’s how the researchers described the measurement procedure:

“An observer was stationed in the toilet stall immediately adjacent to the subjects’ urinal. During pilot tests of these procedures it became clear that auditory cues could not be used to signal the initiation and cessation of [urination]. . . . Instead, visual cues were used. The observer used a periscopic prism imbedded in a stack of books lying on the floor of the toilet stall. An 11-inch (28-cm) space between the floor and the wall of the toilet stall provided a view, through the periscope, of the user’s lower torso and made possible direct visual sightings of the stream of urine. The observer, however, was unable to see a subject’s face. The observer started two stop watches when a subject stepped up to the urinal, stopped one when urination began, and stopped the other when urination was terminated.”

The researchers found that decreased physical distance leads to increased delay of onset and decreased persistence (Figure 6.7).
1. Do you think the participants were harmed by this experiment?
2. Do you think that the researchers should have conducted this experiment?
3. What changes, if any, would you recommend to improve the ethical balance?
Figure 6.7: Results from Middlemist, Knowles, and Matter (1976). Men who entered the bathroom were assigned to one of three conditions: close distance (a confederate was placed in the immediately adjacent urinal), moderate distance (a confederate was placed one urinal removed), or no confederate used a urinal. An observer stationed in a toilet stall used a custom-built periscope to observe and time the delay and persistence of urination. Standard errors around estimates are not available.
[] In August 2006, about 10 days prior to a primary election, 20,000 people living in Michigan received a mailing that showed their voting behavior and the voting behavior of their neighbors (Figure 6.8). (As discussed in the chapter, in the US, state governments keep records of who votes in each election and this information is available to the public.) This particular treatment produced the largest effect ever seen up to that point for a single-piece mailing: it increased the turnout rate by 8.1 percentage points (Gerber, Green, and Larimer 2008). To put this in context, single-piece mailings typically produce increases of about one percentage point (Gerber, Green, and Larimer 2008). The effect was so large that a political operative named Hal Malchow offered Donald Green $100,000 not to publish the result of the experiment (presumably so that Malchow could make use of this information himself) (Issenberg 2012, p 304). But Alan Gerber, Donald Green, and Christopher Larimer did publish the paper in 2008 in the American Political Science Review.

When you carefully inspect the mailer in Figure 6.8 you may notice that the researchers’ names do not appear on it. Rather, the return address is to Practical Political Consulting. In the acknowledgment to the paper the authors explain: “Special thanks go to Mark Grebner of Practical Political Consulting, who designed and administered the mail program studied here.”
1. Please assess the use of this treatment in terms of the four ethical principles described in this chapter.
2. What changes, if any, would you recommend to this experiment?
3. Write an ethical appendix that could appear with this paper when it was published.
Figure 6.8: Neighbor mailer from Gerber, Green, and Larimer (2008). This mailer increased turnout rates by 8.1 percentage points, the largest effect that had ever been observed for a single-piece mailer.
[] Building on the previous question, once these 20,000 mailers were sent (Figure 6.8), as well as 60,000 other potentially less sensitive mailers, there was a backlash from participants. Issenberg (2012, p 198) reports that “Grebner [the director of Practical Political Consulting] was never able to calculate how many people took the trouble to complain by phone, because his office answering machine filled so quickly that new callers were unable to leave a message.” Grebner also noted that the backlash could have been even larger if they had scaled up the treatment. He said to Alan Gerber, one of the researchers, “Alan if we had spent five hundred thousand dollars and covered the whole state you and I would be living with Salman Rushdie” (Issenberg 2012, p 200).
1. Does this information change your answers to the previous question?
2. What strategies for making decisions in the face of uncertainty would you recommend for future studies that are similar?
[] In practice, most ethical debate occurs about studies where researchers do not have true informed consent from participants (e.g., the three case studies in this chapter). However, ethical debate can also occur for studies that have true informed consent. Design a hypothetical study where you would have true informed consent from participants, but which you still think would be unethical. (Hint: If you are struggling, you can try reading Emanuel, Wendler, and Grady (2000).)
[] Researchers often struggle to describe their ethical thinking to each other and to the general public. After it was discovered that the data from Taste, Ties, and Time had been re-identified, Jason Kaufman, the leader of the research team, made a few public comments about the ethics of the project. Read Zimmer (2010) and then rewrite Kaufman’s comments using the principles and ethical frameworks that are described in this chapter.
[] Banksy is one of the most famous contemporary artists in the United Kingdom, and he is known for politically oriented street graffiti (Figure 6.9). His precise identity, however, is a mystery. Banksy has a personal website, so he could make his identity public if he wanted, but he has chosen not to. In 2008 the Daily Mail, a newspaper, published an article claiming to identify Banksy’s real name. Then in 2016, Michelle Hauge, Mark Stevenson, D. Kim Rossmo and Steven C. Le Comber (2016) attempted to verify this claim using a Dirichlet process mixture model of geographic profiling. More specifically, they collected the geographic locations of Banksy’s public graffiti in Bristol and London. Next, by searching through old newspaper articles and public voting records, Hauge and colleagues found past addresses of the named individual, his wife, and his football (i.e., soccer) team. The authors summarize the findings of their paper as follows:

“With no other serious ‘suspects’ [sic] to investigate, it is difficult to make conclusive statements about Banksy’s identity based on the analysis presented here, other than saying the peaks of the geoprofiles in both Bristol and London include addresses known to be associated with [name redacted].”

Following Metcalf and Crawford (2016), I have decided not to include the name of the individual when discussing this study.
1. Assess this study using the principles and ethical frameworks in this chapter.
2. Would you have done this study?
3. The authors justify this study in the abstract of the paper with the following sentence: “More broadly, these results support previous suggestions that the analysis of minor terrorism-related acts (e.g., graffiti) could be used to help locate terrorist bases before more serious incidents occur, and provides a fascinating example of the application of the model to a complex, real-world problem.” Does this change your opinion of the paper? If so, how?
4. The authors included the following ethical note at the end of their paper: “The authors are aware of, and respectful of, the privacy of [name redacted] and his relatives and have thus only used data in the public domain. We have deliberately omitted precise addresses.” Does this change your opinion of the paper? If so, how? Do you think the public/private dichotomy makes sense in this case?
Figure 6.9: Street art by Banksy in Cheltenham, England. Photo by Brian Robert Marshall. Source: Wikimedia Commons.
[] In an interesting article Metcalf (2016) makes the argument that “publicly available datasets containing private data are among the most interesting to researchers and most risky to subjects.”
1. What are two concrete examples that support this claim?
2. In this same article, Metcalf also claims that it is anachronistic to assume that “any information harm has already been done by a public dataset”. Give one example of where this could be the case.

[ medium ] In this chapter I proposed the rule of thumb that all data is potentially identifiable and all data is potentially sensitive. Table 6.5 provides a list of examples of data that has no obviously personally identifying information but which can still be linked to specific people.

1. Pick two of these examples and describe how the de-anonymization attack in both cases has a similar structure.
2. For each of the two examples in part 1, describe how the data could reveal sensitive information about the people in the dataset.
3. Now pick a third dataset from the table. Write an email to someone considering releasing it. Explain to them how this data could be potentially identifiable and potentially sensitive.

Table 6.5: List of examples of social data that does not have any obvious personally identifying information, but which can still be linked to specific people.
Data	Citation
Health insurance records	Sweeney (2002)
Credit card transaction data	Montjoye et al. (2015)
Netflix movie rating data	Narayanan and Shmatikov (2008)
Phone call meta-data	Mayer, Mutchler, and Mitchell (2016)
Search log data	Barbaro and Zeller Jr (2006)
Demographic, administrative, and social data about students	Zimmer (2010)
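
One structure that several of the examples in Table 6.5 share is a linkage attack: a release with names removed is joined to an identified auxiliary source on attributes that appear in both. The sketch below illustrates that structure in Python. All records, the field names (`zip`, `birth_date`, `sex`), and the function names are made up for illustration; it is not the method of any paper cited in the table.

```python
from collections import defaultdict

# Hypothetical "de-identified" release: no names, but quasi-identifiers remain.
released = [
    {"zip": "00001", "birth_date": "1970-01-01", "sex": "M", "diagnosis": "condition A"},
    {"zip": "00002", "birth_date": "1980-02-02", "sex": "F", "diagnosis": "condition B"},
    {"zip": "00002", "birth_date": "1980-02-02", "sex": "F", "diagnosis": "condition C"},
]

# Hypothetical auxiliary data that does include names (e.g., a public roll).
auxiliary = [
    {"name": "Person 1", "zip": "00001", "birth_date": "1970-01-01", "sex": "M"},
    {"name": "Person 2", "zip": "00002", "birth_date": "1980-02-02", "sex": "F"},
]

# Attributes assumed to be present in both sources.
QUASI_IDENTIFIERS = ("zip", "birth_date", "sex")


def key(record):
    """Build the join key from the shared quasi-identifiers."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def link(released_records, auxiliary_records):
    """Map each named person to a released record when the match is unique."""
    by_key = defaultdict(list)
    for record in released_records:
        by_key[key(record)].append(record)

    matches = {}
    for person in auxiliary_records:
        candidates = by_key.get(key(person), [])
        if len(candidates) == 1:  # an ambiguous match does not identify anyone
            matches[person["name"]] = candidates[0]
    return matches


reidentified = link(released, auxiliary)
for name, record in reidentified.items():
    print(name, "->", record["diagnosis"])
```

In this toy data only one person is uniquely matched; the other shares a key with two released records. How often real keys are unique is exactly what makes a given dataset more or less risky.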

[] Putting yourself in everyone’s shoes means considering the perspectives of your participants and the general public, not just your peers. This distinction is illustrated in the case of the Jewish Chronic Disease Hospital (Katz, Capron, and Glass 1972, Ch. 1; Lerner 2004; Arras 2008).

Dr. Chester M. Southam was a distinguished physician and researcher at Sloan-Kettering Institute for Cancer Research and an Associate Professor of Medicine at the Cornell University Medical College. On July 16, 1963, Southam and two colleagues injected live cancer cells into the bodies of 22 debilitated patients at the Jewish Chronic Disease Hospital in New York. These injections were part of Southam’s research to understand the immune system of cancerous patients. In earlier research, Southam had found that healthy volunteers were able to reject injected cancer cells in roughly 4 to 6 weeks, whereas it took patients who already had cancer much longer. Southam wondered whether the delayed response in the cancer patients was because they had cancer or because they were elderly and debilitated already. To address these possibilities, Southam decided to inject live cancer cells into a group of people who were elderly and debilitated but who did not have cancer. When word of the study spread, triggered in part by the resignation of three physicians who were asked to participate, some made comparisons to the Nazi Concentration Camp Experiments, but others—based in part on assurances by Southam—found the research unproblematic. Eventually, the New York State Board of Regents reviewed the case in order to decide if Southam should be able to continue to practice medicine. 
Southam argued in his defense that he was acting in “the best tradition of responsible clinical practice.” Southam’s defense was based on a number of claims, which were all supported by several distinguished experts who testified on his behalf: (1) his research was of high scientific and social merit; (2) there were no appreciable risks to participants, a claim based in part on Southam’s 10 years of prior experience with more than 600 subjects; (3) the level of disclosure should be adjusted according to the level of risk posed by the research; and (4) the research was in conformity with the standard of medical practice at that time. Ultimately, the Board of Regents found Southam guilty of fraud, deceit, and unprofessional conduct, and suspended his medical license for one year. Yet, just a few years later, Southam was elected president of the American Association for Cancer Research.
1. Assess Southam’s study using the four principles in this chapter.
2. It appears that Southam took the perspective of his colleagues and correctly anticipated how they might respond to his work; in fact, many of them testified on his behalf. But, he was unable or unwilling to understand how his research might be troubling to the public. What role do you think public opinion—which could be distinct from the opinions of participants—should have in research ethics? What should happen if popular opinion and peer opinion differ?
[] In a paper titled “Crowdseeding in Eastern Congo: Using Cell Phones to Collect Conflict Events Data in Real Time”, Van der Windt and Humphreys (2016) describe a distributed data collection system (see Chapter 5) that they created in Eastern Congo. Describe how the researchers dealt with the uncertainty about possible harms to participants.

[ medium ] In October 2014, three political scientists sent mailers to 102,780 registered voters in Montana as part of an experiment to measure whether voters who are given more information are more likely to vote. The mailers, which were labeled 2014 Montana General Election Voter Information Guide, placed the candidates for Montana Supreme Court Justice, which is a nonpartisan election, on a scale from liberal to conservative that included Barack Obama and Mitt Romney as comparisons. The mailer also included a reproduction of the Great Seal of the State of Montana (Figure 6.10).

The mailers generated complaints from Montana voters, and they caused Linda McCulloch, Montana’s Secretary of State, to file a formal complaint with the Montana state government. The universities that employed the researchers (Dartmouth and Stanford) sent a letter to everyone who had received the mailer, apologizing for any potential confusion and making clear that the mailer “was not affiliated with any political party, candidate or organization, and was not intended to influence any race.” The letter also clarified that the ranking “relied upon public information about who had donated to each of the campaigns” (Figure 6.11).

In May 2015, the Commissioner of Political Practices of the State of Montana, Jonathan Motl, determined that the researchers violated Montana law: “The Commissioner determines that there are sufficient facts to show that Stanford, Dartmouth and/or its researchers violated Montana campaign practice laws requiring registration, reporting and disclosure of independent expenditures” (Sufficient Finding Number 3 in Motl (2015)). The Commissioner also recommended that the County Attorney investigate whether the unauthorized use of the Great Seal of Montana violated Montana state law (Motl 2015).

Stanford and Dartmouth disagreed with Motl’s ruling. A Stanford spokeswoman named Lisa Lapin said “Stanford…does not believe any election laws were violated” and that the mailing “did not contain any advocacy supporting or opposing any candidate.” She pointed out that the mailer explicitly stated that it “is nonpartisan and does not endorse any candidate or party.” (Richman 2015)

1. Assess this study using the four principles and two frameworks described in this chapter.
2. Assume that the mailers were sent to a random sample of voters (but more on that in a moment). Under what conditions might this mailing have altered the outcome of the Supreme Court Justice election?
3. In fact, the mailers were not sent to a random sample of voters. According to a report by Jeremy Johnson (a political scientist who assisted in the investigation), mailers “were sent to 64,265 voters identified as likely liberal to centrist leaning in Democratic leaning precincts and 39,515 voters identified as conservative to centrist in Republican leaning precincts. The researchers justified the disparity between Democratic and Republican numbers on grounds that they anticipated turnout to be significantly lower among Democratic voters.” Does this change your assessment of the research design? If so, how?
4. In response to the investigation, the researchers said that they picked this election in part because “neither judicial race had been closely contested in the primary. Based on an analysis of the 2014 primary election results in the context of previous Montana judicial elections, the researchers determined that the research study as designed would not change the outcome of either contest.” (Motl 2015) Does this change your assessment of the research? If so, how?
5. In fact, the election turned out to be not particularly close (Table 6.6). Does this change your assessment of the research? If so, how?
6. It turns out that a study was submitted to the Dartmouth IRB by one of the researchers, but it differed substantially from the actual Montana study. The mailer used in Montana was never submitted to the IRB. The study was never submitted to the Stanford IRB. Does this change your assessment of the research? If so, how?
7. It also turns out that the researchers sent similar election materials to 143,000 voters in California and 66,000 in New Hampshire. As far as I know, there were no formal complaints triggered by these approximately 200,000 additional mailers. Does this change your assessment of the research? If so, how?
8. What, if anything, would you have done differently if you were the principal investigators? How would you have designed the study if you were interested in exploring whether additional information increases voter turnout in nonpartisan races?

Table 6.6: Results from the 2014 Montana Supreme Court Justice elections. Source: Webpage of Montana Secretary of State.
Candidates	Votes received	Percentage
Supreme Court Justice #1
W. David Herbert	65,404	21.59%
Jim Rice	236,963	78.22%
Supreme Court Justice #2
Lawrence VanDyke	134,904	40.80%
Mike Wheat	195,303	59.06%
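
When thinking about whether the mailing could have altered an outcome, a back-of-the-envelope comparison of each race's margin with the number of mailers can be a useful starting point. The Python sketch below uses only the vote counts in Table 6.6 and the 102,780 mailers reported in the activity. The notion of a "switch" (one vote moving from the winner to the runner-up) and the function names are simplifying assumptions made for illustration; they say nothing about how the mailer actually affected anyone.

```python
MAILERS_SENT = 102_780  # mailers sent in Montana, as reported in the activity

# Vote counts copied from Table 6.6.
races = {
    "Supreme Court Justice #1": {"W. David Herbert": 65_404, "Jim Rice": 236_963},
    "Supreme Court Justice #2": {"Lawrence VanDyke": 134_904, "Mike Wheat": 195_303},
}


def margin(votes):
    """Votes separating the winner from the runner-up."""
    top, runner_up = sorted(votes.values(), reverse=True)[:2]
    return top - runner_up


def switches_needed(votes):
    """Smallest number of winner-to-runner-up switches that flips the result.

    Each switch closes the gap by two votes, and the runner-up must end
    strictly ahead.
    """
    return margin(votes) // 2 + 1


for race, votes in races.items():
    needed = switches_needed(votes)
    print(
        f"{race}: margin {margin(votes):,} votes; "
        f"{needed:,} switches needed "
        f"({needed / MAILERS_SENT:.1%} of mailers sent)"
    )
```

The same comparison could be rerun with other assumptions, for example changes in turnout rather than switched votes, which would close the gap by one vote per person instead of two.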

Figure 6.10: Mailer sent by three political scientists to 102,780 registered voters in Montana as part of an experiment to measure whether voters who are given more information are more likely to vote. The sample size in this experiment was roughly 15% of eligible voters in the state.

Figure 6.11: Apology letter that was sent to the 102,780 registered voters in Montana who had received the mailer in Figure 6.10. The letter was sent by the Presidents of Dartmouth and Stanford, the universities that employed the researchers who sent the mailer.

[] On May 8, 2016, two researchers—Emil Kirkegaard and Julius Bjerrekaer—scraped information from the online dating site OkCupid and publicly released a dataset of about 70,000 users, including variables of username, age, gender, location, religion-related opinions, astrology-related opinions, dating interests, number of photos, etc., as well as answers given to the top 2600 questions on the site. In a draft paper accompanying the released data, the authors stated that “Some may object to the ethics of gathering and releasing this data. However, all the data found in the dataset are or were already publicly available, so releasing this dataset merely presents it in a more useful form.”

In response to the data release, one of the authors was asked on Twitter: “This data set is highly re-identifiable. Even includes usernames? Was any work at all done to anonymize it?” His response was “No. Data is already public.” (Zimmer 2016; Resnick 2016)
1. Assess this data release using the principles and ethical frameworks discussed in this chapter.
2. Would you use this data for your own research?
3. What if you scraped it yourself?
[] In 2010 an intelligence analyst with the U.S. Army gave 250,000 classified diplomatic cables to the organization WikiLeaks, and they were subsequently posted online. Gill and Spirling (2015) argue that “the WikiLeaks disclosure potentially represents a trove of data that might be tapped to test subtle theories in international relations”, and then statistically characterize the sample of leaked documents. For example, the authors estimate that they represent about 5% of all diplomatic cables during that time period, but that this proportion varies from embassy to embassy (see Figure 1 of their paper).
1. Read the paper, and then write an ethical appendix to it.
2. The authors did not analyze the content of any of the leaked documents. Is there any project using these cables that you would conduct? Is there any project using these cables that you would not conduct?
[] In order to study how companies respond to complaints, a researcher sent fake complaint letters to 240 high-end restaurants in New York City. Here’s an excerpt from the fictitious letter.

“I am writing this letter to you because I am outraged about a recent experience I had at your restaurant. Not long ago, my wife and I celebrated our first anniversary. … The evening became soured when the symptoms began to appear about four hours after eating. Extended nausea, vomiting, diarrhea, and abdominal cramps all pointed to one thing: food poisoning. It makes me furious just thinking that our special romantic evening became reduced to my wife watching me curl up in a fetal position on the tiled floor of our bathroom in between rounds of throwing up. …Although it is not my intention to file any reports with the Better Business Bureau or the Department of Health, I want you, [name of the restaurateur], to understand what I went through in anticipation that you will respond accordingly.”
1. Evaluate this study using the principles and ethical frameworks described in this chapter. Given your assessment, would you do the study?
2. Here’s how the restaurants who received the letter reacted: “It was culinary chaos as owners, managers and chefs searched through computers for [name redacted] reservations or credit card records, reviewed menus and produce deliveries for possibly spoiled food, and questioned kitchen workers about possible lapses, all spurred by what both the university and the professor now concede was the business school study from hell.” (Kifner 2001) Does this information change how you assess the study?
3. As far as I know, this study was not reviewed by an IRB or any other third party. Does that change how you assess the study? Why or why not?
[] Building on the previous question, I’d like you to compare this study to a completely different study that also involved restaurants. In this other study, Neumark and colleagues (1996) sent two male and two female college students with fabricated resumes to apply for jobs as waiters and waitresses at 65 restaurants in Philadelphia, in order to investigate sex discrimination in restaurant hiring. The 130 applications led to 54 interviews and 39 job offers. The study found statistically significant evidence of sex discrimination against women in high-price restaurants.
1. Write an ethical appendix for this study.
2. Do you think this study is ethically different from the one described in the previous question? If so, how?
[] Some time around 2010, 6,548 professors in the United States received emails similar to this one.

“Dear Professor Salganik,

I am writing you because I am a prospective Ph.D. student with considerable interest in your research. My plan is to apply to Ph.D. programs this coming fall, and I am eager to learn as much as I can about research opportunities in the meantime.

I will be on campus today, and although I know it is short notice, I was wondering if you might have 10 minutes when you would be willing to meet with me to briefly talk about your work and any possible opportunities for me to get involved in your research. Any time that would be convenient for you would be fine with me, as meeting with you is my first priority during this campus visit.

Thank you in advance for your consideration.

Sincerely, Carlos Lopez”

These emails were part of a field experiment to measure whether professors were more likely to respond to the email depending on (1) the time frame (today vs. next week) and (2) the name of the sender, which was varied to signal ethnicity and gender (e.g., Meredith Roberts, Raj Singh). The researchers found that when the requests were to meet in one week, Caucasian males were granted access to faculty members about 25% more often than were women and minorities. But when the fictitious students requested meetings that same day, these patterns were essentially eliminated (Milkman, Akinola, and Chugh 2012).
1. Assess this experiment according to the principles and frameworks in this chapter.
2. After the study was over, the researchers sent the following debriefing email to all participants.
“Recently, you received an email from a student asking for 10 minutes of your time to discuss your Ph.D. program (the body of the email appears below). We are emailing you today to debrief you on the actual purpose of that email, as it was part of a research study. We sincerely hope our study did not cause you any disruption and we apologize if you were at all inconvenienced. Our hope is that this letter will provide a sufficient explanation of the purpose and design of our study to alleviate any concerns you may have about your involvement. We want to thank you for your time and for reading further if you are interested in understanding why you received this message. We hope you will see the value of the knowledge we anticipate producing with this large academic study.”

After explaining the purpose and design of the study, they further noted that:

“As soon as the results of our research are available, we will post them on our websites. Please rest assured that no identifiable data will ever be reported from this study, and our between subject design ensures that we will only be able to identify email responsiveness patterns in aggregate – not at the individual level. No individual or university will be identifiable in any of the research or data we publish. Of course, any one individual email response is not meaningful as there are multiple reasons why an individual faculty member might accept or decline a meeting request. All data has already been de-identified and the identifiable email responses have already been deleted from our databases and related server. In addition, during the time when the data was identifiable, it was protected with strong and secure passwords. And as is always the case when academics conduct research involving human subjects, our research protocols were approved by our universities’ Institutional Review Boards (the Columbia University Morningside IRB and the University of Pennsylvania IRB).

If you have any questions about your rights as a research subject, you may contact the Columbia University Morningside Institutional Review Board at 212-851-7040 or by email at askirb@columbia.edu and/or the University of Pennsylvania Institutional Review Board at 215-898-2614.

Thank you again for your time and understanding of the work we are doing.”
3. What are the arguments for debriefing in this case? What are the arguments against? Do you think that the researchers should have debriefed the participants in this case?
4. In the supporting online materials, the researchers have a section titled “Human Subjects Protections.” Please read this section. Is there anything that you would add or remove?
5. What was the cost of this experiment to the researchers? What was the cost of this experiment to participants? Andrew Gelman (2010) has argued that participants in this study could have been compensated for their time after the experiment was over. Do you agree? Try to make your argument using the principles and ethical frameworks in the chapter.