When Science, Customer Service, and Human Subjects Research Collide. Now What?

My brothers and sisters in data science, computational social science, and all of us studying and building the Internet of things inside or outside corporate firewalls, to improve a product, explore a scientific question, or both: we are now, officially, doing human subjects research.

I’m frustrated that the state of public intellectualism allows us, individually, to jump into the conversation about the recently published Facebook “Emotions” Study [1]. What we—from technology builders and interface designers to data scientists and ethnographers working in industry and at universities alike—really (really) need right now is to sit down together and talk. Pointing the finger or pontificating doesn’t move us closer to the discussions we need to have, from data sharing and users’ rights to the drop in public funding for basic research itself. We need a dialogue—a thoughtful, compassionate conversation among those who are or will be training the next generation of researchers studying social media. And, like all matters of ethics, this discussion will become a personal one as we reflect on our doubts, disagreements, missteps, and misgivings. But the stakes are high. Why should the Public trust social media researchers and the platforms that make social media a thing? It is our collective job to earn and maintain the Public’s trust so that future research and social media builders have a fighting chance to learn and create more down the line. Science, in particular, is an investment in questions that precede and will live beyond the horizon of individual careers.

As more and more of us crisscross disciplines and work together to study or build better social media, we are pressed to rethink our basic methods and the ethical obligations pinned to them. Indeed “ethical dilemmas” are often signs that our methodological techniques are stretched too thin and failing us. When is something a “naturalistic experiment” if the data are always undergoing A/B tweaks? How do we determine consent if we are studying an environment that is at once controllable, like a lab, but deeply social, like a backyard BBQ? When do we need to consider someone’s information “private” if we have no way to know, for sure, what they want us to do with what we can see them doing? When, if ever, is it ok to play with someone’s data if there’s no evident harm but we have no way to clearly test the long-term impact on a nebulous number of end users?

There is nothing obvious about how to design and execute ethical research that examines people’s individual or social lives. The reality is, when it comes to studying human interaction or behavior (for profit or scientific glory), it is no more (or less) complicated whether we’re interviewing someone in their living room, watching them in a lab, testing them at the screen, or examining the content they post online. There is no clearer sign of this than the range of reactions to the news (impeccably curated here by James Grimmelmann) that for one week, back in January 2012, researchers manipulated (in the scientific sense) what 689,003 Facebook users read in their individual News Feeds. Facebook’s researchers fed some users a diet containing fewer posts with “happy” and positive words than their usual News Feed; other users received a smaller-than-average allotment of posts ladled with sad words. Cornell-based researchers came in after the experiment was over to help sift through and crunch the massive data set. Here’s what the team found: By the experiment’s last day (which, coincidentally, landed on the day of the SOPA online protests! Whoops), it turned out that a negligible (but statistically detectable) number of people produced fewer positive posts and more negative ones if their Feed included fewer positive posts from friends; when the researchers scaled back the number of posts with negative cues from friends, people posted fewer negative and more positive posts. This interesting, even if small, finding was published in the June 2014 issue of the Proceedings of the National Academy of Sciences (PNAS). That’s how Science works: one small finding at a time.

At issue: the lead author, Facebook data scientist Adam Kramer, never told users in the study that their News Feeds were part of this experiment, either before or after that week in January. And Cornell University’s researchers examining the secondary data set (fancy lingo for the digital records of more than half a million people’s interactions with each other) weren’t, technically, on the hook for explaining that to subjects either. Mind you, it’s often acceptable in human subjects research to conduct experiments without prior consent, as long as everyone discussing the case agrees that the experiment does not impose greater risk on the person than they might experience in a typical day. But even in those cases, at some point the research subjects are told (“debriefed”) about their participation in the study and given the option to withdraw data collected about them from the study. Researchers also have a chance to study the impact of the stimulus they introduced into the system. So, the question of the hour is: Do we cross a line when testing a product also asks a scientifically relevant question? If researchers or systems designers are “just” testing a product on end users (aka humans) and another group has access to all that luscious data, whose ethics apply? When does “testing” end and “real research” begin in the complicated world of “The Internet”?

Canonical Science teaches us that the greater the distance between researchers and our subjects (often framed as objectivity), the easier it is for us to keep trouble at arm’s length. Having carried out what we call “human subjects research” for much of my scholarly life—all of it under the close scrutiny of Institutional Review Boards (IRBs)—I feel professionally qualified to say, “researching people ain’t easy.” And, you know what makes it even harder? We are only about 10 years into this thing we call “social media”—which can morph into a telephone, newspaper, reality TV show, or school chalkboard, depending on who’s wielding it and when we’re watching them in action. Online, we are just as likely to be passionately interacting with each other, skimming prose, or casually channel-surfing, depending on our individual context. Unfortunately, it’s hard for anyone studying the digital signs of humans interacting online to know what people mean for us to see—unless we ask them. We don’t have the methods (yet) to robustly study social media as sites of always-on, dynamic human interaction. So, to date, we’ve treated the Internet as a massive stack of flat, text files to scrape and mine. We have not had a reason to collectively question this common, methodological practice as long as we maintained users’ privacy. But is individual privacy really the issue?

My brothers and sisters in data science, computational social science, and all of us studying and building the Internet of things inside or outside corporate firewalls, to improve a product, explore a scientific question, or both: We are now, officially, doing human subjects research. Here’s some background to orient us and the people who pay our research bills (and salaries) to this new reality.

Genealogy of Human Subjects Research Oversight in the United States

In 1966, the New England Journal of Medicine published an article by Harvard research physician Henry Beecher chronicling 22 ethically questionable scientific studies conducted between 1945 and 1965 (Rothman, 2003: 70-84). Dr. Beecher’s review wasn’t exposing fringe science on the margins. Federally and industry-funded experiments conducted by luminaries of biomedicine accounted for most of the work cited in his review. Even if today we feel like it’s a no-brainer to call ethical foul on the studies Beecher cited, keep in mind that it took DECADES for people to reach consensus on what not to do. Take, for example, Beecher’s mention of Dr. Saul Krugman. From 1958 to 1964, Krugman injected children with live hepatitis virus at Willowbrook State School on New York’s Staten Island, a publicly funded institution for children with intellectual disabilities. The Office of the Surgeon General, U.S. Armed Forces Epidemiological Board, and New York State Department of Mental Hygiene funded and approved his research. Krugman directed staff to put the feces of infected children into milkshakes later fed to newly admitted children, to track the spread of the disease. Krugman pressed poor families to include their children in what he called “treatments” to secure their admission to Willowbrook, the only option for poor families with children suffering from mental disabilities. After infecting the children, Krugman experimented with their antibodies to develop what would later become the vaccines for the disease. Krugman was never called out for the lack of consent or for his failure to provide for the children he infected with the virus, who were now at risk of dying from liver disease. Indeed, he received the prestigious Lasker Prize for Medicine for developing the Hepatitis A and B vaccines and, in 1972, became the President of the American Pediatric Society. Pretty shocking. But, at the time, and for decades after that, Willowbrook did not register as unequivocally unethical.

My point here is not to draw one-to-one comparisons between Willowbrook and the Facebook Emotions study. They are not even close to comparable. I bring up Willowbrook to point out that no matter how ethically egregious something might seem in hindsight, often such studies do not appear so at the time, especially when weighed against the good they might seem to offer in the moment. Those living in the present are never in the best position to judge what will or will not seem “obviously wrong.”

News accounts of risky experiments carried out without prior or clear consent, often targeting marginalized communities with little power, catalyzed political will for federal regulations for biomedical and behavioral researchers’ experiments (Rothman, 2003: 183-184). Everyone agreed: there’s a conflict of interest when individual researchers are given unfettered license to decide if their research (and their reputations) are more valuable to Science than an individual’s rights to opt out of research, no matter how cool and important the findings might be. The balance between the greater good and individual risk of research involving human subjects must be adjudicated by a separate review committee, made up of peers and community members, with nothing to be gained by approving or denying a researcher’s proposed project.

The Belmont Report

The National Research Act of 1974 created the Commission for the Protection of Human Subjects of Biomedical and Behavioral Research [2]. Five years later, the Commission released The Belmont Report: The Ethical Principles and Guidelines for the Protection of Human Subjects of Research. The Belmont Report codified the call for “respect for persons, beneficence, and justice” (The Belmont Report, 1979). More concretely, it spelled out what newly mandated university and publicly funded agency-based IRBs should expect their researchers to do to safeguard subjects’ informed consent, address the risks and benefits their participation might accrue, and more fairly distribute science’s “burdens and benefits” (The Belmont Report, 1979). The Belmont Report now guides how we define human subjects research and the attendant ethical obligations of those who engage in it.

Put simply, the Belmont Report put a Common Rule in place to manage ethics through a procedure focused on rooting out bad apples before something egregious happens or is uncovered, after the fact. But it did not—and we have not—positioned ethics as an on-going, complicated discussion among researchers actively engaging fellow researchers and the human subjects we study. And we’ve only now recognized that human subjects research is core to technology companies’ product development and, by extension, bottom lines. However, there is an element of the Belmont Report that we could use to rethink guidance for technology companies, data scientists, and social media researchers alike: the lines drawn in the Belmont Report between “practice and research.”

The fine line between practice and research

The Belmont Report drew a clear line demarcating the “boundaries between biomedical and behavioral research and the accepted and routine practice of medicine,” that is, the difference between research and therapeutic intervention (The Belmont Report, 1979). This mandate, which was in fact the Report’s first order of business, indexes the Commission’s most pressing anxiety: how to rein in biomedicine’s professional tendencies to experiment in therapeutic contexts. The history of biomedical breakthroughs, from Walter Reed’s discovery of the causes of yellow fever to Jonas Salk’s polio vaccines, attests to the profession’s culture of experimentation (Halpern, 2004: 41-96). However, this professional image of the renegade (mad) scientist pioneering medical advances was increasingly at odds with the need, pressing by the 1970s, for a more restrained and cautious scientific community driven first by an accountability to the public and only second by a desire for discovery.

In redrawing the boundaries between research and practice, the Belmont Report positioned ethics as a wedge between competing interests. If a practitioner simply wanted to tweak a technique to see if it could improve an individual subject’s experience, the experiment did not meet the threshold of “real scientific inquiry” and could be excused from more formal procedures of consent, debriefing, and peer review. Why? Practitioners already have guiding codes of ethics (“do no harm”) and, as importantly, ongoing relationships built on communication and trust with the people in their care (at least, in theory). The assumption was that practitioners and “their” subjects could hold each other mutually accountable.

But, once a researcher tests something out for testing’s sake or to work on, more broadly, a scientific puzzle, they are in the realm of research and must consider a new set of questions: Cui bono, who benefits? Will the risk or harm to an individual outweigh the benefits for the greater good? What if that researcher profits from the greater good? The truth is, in most cases, the researcher will benefit, whether they make money or not, because they will gain credibility and status through the experience of their research. Can we say the same for the individual contributing their experiences to our experiments? If not, that’s, typically, an ethical dilemma.

Constructing ethical practice in a social media world

Social media platforms and the technology companies that produce our shared social playgrounds blur the boundaries between practice and research. They (we?) have to, in many cases, to improve the products that companies provide users. That’s no easy thing if you’re in the business of providing a social experience through your technology! But that does not exempt companies, any more than it exempts researchers, from extending respect, beneficence, and justice to individuals sharing their daily interactions with us. So we need to, collectively, rethink when “testing a feature” transitions from improving customer experience to more than minimally impacting someone’s social life.

Ethical stances on methodological practices are inextricably linked to how we conceptualize our objects of study. Issues of consent hinge on whether researchers believe they are studying texts or people’s private interactions. Who needs to be solicited for consent also depends on whether researchers feel they are engaged in a single site study or dealing with an infrastructure that crosses multiple boundaries. What ethical obligations, then, should I adhere to as I read people’s posts—particularly on commercial venues such as Facebook that are often considered “public domain”—even when they may involve participants who share personal details about their lives from the walled garden of their privacy settings? Are these obligations different from those I should heed with individuals not directly involved in my research? How can I use this information and in what settings? Does consent to use information from interviews with participants include the information they publicly post about themselves online? These questions are not easily grouped as solely methods issues or strictly ethical concerns.

For me, the most pragmatic ethical practice follows from the reality that I will work with many of the people I meet through my fieldwork for years to come. And, importantly, if I burn bridges in my work, I am, literally, shutting out researchers who might want to follow in my footsteps. I can give us all a bad reputation that lasts a human subject’s lifetime. I, therefore, treat online materials as the voices of the people with whom I work. In the case of materials I would like to cite, I email the authors, tell them about my research, and ask if I may include their web pages in my analyses. I tread lightly and carefully.

The Facebook Emotions study could have included a follow-up email to all those in the study, sharing the cool results with participants and offering them a link to the happy and sad moments that they missed in their News Feed while the experiment was underway (tip of the hat to Tarleton Gillespie for those ideas). And, with more than half a million people participating, I’m sure a few hundred thousand would have opted in to Science and let Facebook keep the results.

We do not always have the benefit of personal relationships, built over time with research participants, to guide our practices. And, unfortunately, our personal identities or affinities with research participants do not safeguard us from making unethical decisions in our research. We have only just started (like, last week) to think through what might be comparable practices for data scientists or technology designers, who often never directly talk with the people they study. That means that clear, ethical frameworks will be even more vital as we build new toolkits to study social media as sites of human interaction and social life.

Conclusion

Considering that more and more of social media research links universities and industry-based labs, we must coordinate our methodologies and ethics no matter who pays us to do our research. None of us should be relieved from duty when it comes to making sure all facets of our collaborations are conducted with an explicit, ethical plan of action. There are, arguably, no secondary data sets in this new world.

The Belmont Report was put in place to ensure that we have conversations with the Public, among ourselves, and with our institutions about the risks of the scientific enterprise. It’s there to help us come to some agreement as to how to address those risks and create contingency plans. While IRBs as classification systems can provide, and have provided, researchers with reflexive and sometimes necessary intervention, bureaucratic mechanisms and their notions of proper science are not the only or even the best source of good ethics for our work. Ongoing and reflexive conversations among researchers and practitioners sharing their work with invested peers and participants are.

Whether from the comfort of a computer or in the thick of a community gathering, studying what people do in their everyday lives is challenging. The seeming objectivity of a lab setting or the God’s eye view of a web scraping script may seem to avoid biases and desires that could, otherwise, interfere with the social situations playing out in front of us that we want to observe. But, no matter how removed we are, our presence as researchers does not evaporate when we come into contact with human interaction. One of the values of sustained, ethnographic engagement with people as we research their lives: it keeps researchers constantly accountable not only to our own scientific (and self) interests but also to the people we encounter in any observation, experiment, or engagement.

Some of my peers argue that bothering people with requests for consent or efforts to debrief them will either “contaminate the data” or “seem creepy” after the fact. They argue that it’s less intrusive and more scientifically powerful to just study “the data” from a distance or adjust the interface design on the fly. I get it. It is not easy to talk with people about what they’re doing online. Keep in mind that by the end of USENET’s long life as the center of the Internet’s social world, many moderated newsgroups blocked two kinds of lurkers: journalists. And researchers. In the long run, keeping a distance can leave the general public more suspicious of companies’, designers’, and researchers’ intentions. People may also be less likely to talk to us down the road when we want to get a richer sense of what they’re doing online. Let’s move away from this legalistic, officious discussion of consent and frame this debate as a matter of trust.

None of us would accept someone surreptitiously recording our conversations with others to learn what we’re thinking or feeling just because “it’s easier” or because it’s not clear that we would be interested in sharing them if asked outright. We would all want to understand what someone wants to know about us and why they want to study what we’re doing: what do they hope to learn and why does it matter? Those are completely reasonable questions. All of us have a right to be asked if we want to share our lives with strangers (even researchers or technology companies studying the world or providing a service) so that we have a chance to say, “nah, not right now, I’m going through a bad breakup.” What would it look like for all of us, from LOLcat enthusiasts and hardcore gamers to researchers and tech companies, to (re)build trust and move toward a collective enterprise of explicitly opting in to understand this rich, social world that we call “The Internet”?

Scientists and technology companies scrutinizing data bubbling up from the tweets, posts, driving patterns, or check-ins of people are coming to realize that we are also studying moments of humans interacting with each other. These moments call for respect, trust, mutuality. By default. Every time we even think we see social interactions online. Is working from this premise too much to ask of researchers or the companies and universities that employ us? I don’t think so.

Addendum (added June 13, 2014)

I realized after posting my thoughts on how to think about social media as a site of human interaction (and all the ethical and methodological implications of doing so) that I forgot to leave links to what are, bar none, the best resources on the planet for policy makers, researchers, and the general public thinking through all this stuff.

Run, don’t walk, to download copies of the following must-reads:

Charles Ess and the AOIR Ethics Committee (2002). Ethical decision-making and Internet research: Recommendations from the AoIR ethics working committee. Approved by the Association of Internet Researchers, November 27, 2002. Available at: http://aoir.org/reports/ethics.pdf

Annette Markham and Elizabeth Buchanan (2012). Ethical decision-making and Internet research: Recommendations from the AoIR ethics working committee (version 2.0). Approved by the Association of Internet Researchers, December 2012. Available at: http://aoir.org/reports/ethics2.pdf

Notes/Bibliography/Additional Reading

[1] I wanted to thank my fellow MSR Ethics Advisory Board members, MSR New England Lab, and the Social Media Collective, as well as the following people for their thoughts on drafts of this essay: danah boyd, Henry Cohn, Kate Crawford, Tarleton Gillespie, James Grimmelmann, Jeff Hancock, Jaron Lanier, Tressie McMillan Cottom, Kate Miltner, Christian Sandvig, Kat Tiidenberg, Duncan Watts, and Kate Zyskowski.

[2] The United States Department of Health, Education and Welfare (HEW) was a cabinet-level U.S. governmental department from 1953 to 1979. In 1979, HEW was reorganized into two separate cabinet-level departments: the Department of Education and the Department of Health and Human Services (HHS). HHS is in charge of all research integrity and compliance including research involving human subjects.

Bowker, Geoffrey C., and Susan Leigh Star

1999 Sorting Things Out: Classification and Its Consequences, Inside Technology. Cambridge, Mass.: MIT Press.

Brenneis, Donald

2006 Partial Measures. American Ethnologist 33(4): 538-40.

Brenneis, Donald

1994 Discourse and Discipline at the National Research Council: A Bureaucratic Bildungsroman. Cultural Anthropology 9(1): 23-36.

Epstein, Steven

2007 Inclusion: The Politics of Difference in Medical Research. Chicago: University of Chicago Press.

Gieryn, Thomas F.

1983 Boundary-Work and the Demarcation of Science from Non-Science: Strains and Interests in Professional Ideologies of Scientists. American Sociological Review 48(6): 781-95.

Halpern, Sydney A.

2004 Lesser Harms: The Morality of Risk in Medical Research. Chicago: University of Chicago Press.

Lederman, Rena

2006 The Perils of Working at Home: IRB “Mission Creep” as Context and Content for an Ethnography of Disciplinary Knowledges. American Ethnologist 33(4): 482-91.

Rothman, David J.

2003 Strangers at the Bedside: A History of How Law and Bioethics Transformed Medical Decision Making. 2nd pbk. ed, Social Institutions and Social Change. New York: Aldine de Gruyter.

Schrag, Zachary M.

2010 Ethical Imperialism: Institutional Review Boards and the Social Sciences, 1965-2009. Baltimore: Johns Hopkins University Press.

Stark, Laura

2012 Behind Closed Doors: IRBs and the Making of Ethical Research. Chicago: University of Chicago Press.

Strathern, Marilyn

2000 Audit Cultures: Anthropological Studies in Accountability, Ethics, and the Academy. London; New York: Routledge.

United States. National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research.

1978 Report and Recommendations: Institutional Review Boards. [Washington]: U.S. Dept. of Health, Education, and Welfare : for sale by the Supt. of Docs., U.S. Govt. Print. Off.

This essay has been cross-posted from Ethnography Matters.

Author

Mary Gray