Preface

For me, this book began in 2005, when I was working on my dissertation. I was running an online experiment, which I’ll tell you all about in Chapter 4, but now I’m going to tell you something that is not in any academic paper. And, it’s something that fundamentally changed how I think about research. One morning, when I checked the web-server, I discovered that overnight about 100 people from Brazil had participated in my experiment. This experience had a profound impact on me. At that time, I had friends who were running traditional lab experiments, and I knew how hard they had to work to recruit, supervise, and pay people to participate in their experiments; if they could run 10 people in a single day, that was good progress. But, with my online experiment, 100 people participated while I was sleeping. Doing your research while you are sleeping might sound too good to be true, but it isn’t. Changes in technology—specifically the transition from the analog age to the digital age—mean that we can now collect and analyze social data in new ways. This book is about doing social research in these new ways.

This book is for two different communities. It is for social scientists who want to do more data science, and it is for data scientists who want to do more social science. I spend time in both of these communities, and this book is my attempt to bring their ideas together in a way that avoids the quirks and jargon of either. Given the communities that this book is for, it should go without saying that this book is not just for students and professors. I’ve worked some in government (at the US Census Bureau) and in the tech industry (at Microsoft Research), and I know that there is lots of exciting research happening outside of universities. So, if you think of what you are doing as social research, then this book is for you, no matter where you work or what kind of techniques you currently use.

We are still in the early days of social research in the digital age, and I’ve seen some misunderstandings that are so fundamental and so common that it makes the most sense for me to address them here, in the preface. From data scientists, I’ve seen two common misunderstandings. The first is thinking that more data automatically solves problems. But, for social research, that has not been my experience. In fact, for social research, new types of data, as opposed to more of the same data, seem to be most helpful. The second misunderstanding that I’ve seen from data scientists is thinking that social science is just a bunch of fancy-talk wrapped around common sense. Of course, as a social scientist—more specifically as a sociologist—I don’t agree with that; I think that social science has a lot to offer. Smart people have been working hard to understand human behavior for a long time, and it seems unwise to ignore the wisdom that has accumulated from this effort. My hope is that this book will offer you some of that wisdom in a way that is easy to understand.

From social scientists, I’ve also seen two common misunderstandings. First, I’ve seen some people write off the entire idea of social research using the tools of the digital age based on a few bad papers. If you are reading this book, you have probably already read a bunch of papers that use social media data in ways that are banal or wrong (or both). I have too. However, it would be a serious mistake to conclude from these examples that all digital age social research is bad. In fact, you’ve probably also read a bunch of papers that use survey data in ways that are banal or wrong, but you don’t write off all research using surveys. That’s because you know that there is great research done with survey data, and in this book, I’m going to show you that there is also great research done with the tools of the digital age.

The second common misunderstanding that I’ve seen from social scientists is to confuse the present with the future. When assessing social research in the digital age—the research that I’m going to describe in this book—it is important to ask two distinct questions:

  • How well does this style of research work now?
  • How well will this style of research work in the future as the data landscape changes and as researchers devote more attention to these problems?

Even though researchers are trained to answer the first question, for this book, I think the second question is more important. That is, even though social research in the digital age has not yet produced massive, paradigm-changing intellectual contributions, the rate of improvement of digital age research is incredibly rapid. It is this rate of change, more than the current level, that makes digital age research so exciting to me.

Even though that last paragraph seemed to offer you potential riches at some unspecified time in the future, my goal in this book is not to sell you on any particular type of research. I don’t personally own shares in Twitter, Facebook, Google, Microsoft, Apple or any other tech company (although, for the sake of full disclosure, I have worked at or received research funding from Microsoft, Google, and Facebook). If you are happy with the research that you are already doing: great, keep doing what you are doing. But, if you have a sense that the digital age means that new and different things are possible, then I’d like to show you those possibilities. Thus, throughout the book my goal is to remain a credible narrator, telling you about all the exciting new stuff that is possible, while guiding you away from a few pitfalls that I’ve seen others fall into. I hope that this will help improve your research and help you better evaluate the research of others.

As you might have noticed already, the tone of this book is a bit different from some other academic books. That’s intentional. This book emerged from a graduate seminar that I have taught at Princeton in the Department of Sociology, and I’d like this book to capture some of the energy and excitement from that seminar. In particular, I want this book to have three characteristics: helpful, optimistic, and future-oriented.

Helpful: My goal is to write a book that is helpful for you. Therefore, I’m going to write in an open and informal style. That’s because the most important thing that I want to convey is a certain way of thinking about social research. And, my experience from teaching suggests that the best way to convey this way of thinking is informally and with lots of examples.

Optimistic: The two communities that this book engages—social scientists and data scientists—have very different styles. Data scientists are generally excited; they tend to see the glass as half full. Social scientists, on the other hand, are generally more critical; they tend to see the glass as half empty. In this book, I’m going to adopt the optimistic tone of a data scientist, even though my training is as a social scientist. So, when I present examples, I’m going to tell you what I love about these examples. And, when I do point out problems with the examples—and I will do this because no research is perfect—I’m going to try to point out these problems in a way that is positive and optimistic. I’m not going to be critical for the sake of being critical. I’m going to be critical so that I can help you create more beautiful research.

Future-oriented: I hope that this book will help you do social research using the digital systems that exist today and the digital systems that will be created in the future. I started doing this kind of research in 2003, and since then I’ve seen a lot of changes. I remember that when I was in graduate school people were very excited about using MySpace for social research. And, when I taught my first class on what I then called “web-based social research,” people were very excited about virtual worlds such as Second Life. I’m sure that in the future much of what people are talking about today will seem silly and outdated. The trick to staying relevant in the face of this rapid change is abstraction. Therefore, this is not going to be a book that teaches you exactly how to use the Twitter API; instead, it is going to be a book that teaches you how to learn from digital traces (Chapter 2). This is not going to be a book that gives you step-by-step instructions for running experiments on Amazon Mechanical Turk; instead, it is going to teach you how to design and interpret experiments that rely on digital age infrastructure (Chapter 4). Through the use of abstraction, I hope this will be a timeless book on a timely topic.

I think this is the most exciting time ever to be a social researcher, and I’m going to try to convey that excitement in a way that is precise. That is, it is time to move beyond vague generalities about the magical powers of new data. It is time to get specific.