3.6 Surveys linked to big data sources

Linking surveys to big data sources enables you to produce estimates that would be impossible with either data source individually.

Most surveys are stand-alone, self-contained efforts. They don’t build on each other, and they don’t take advantage of all of the other data that exists in the world. This will change. There is just too much to be gained by linking survey data to the big data sources discussed in chapter 2. By combining these two types of data, it is often possible to do something that was impossible with either one individually.

There are a couple of different ways in which survey data can be combined with big data sources. In this section, I’ll describe two approaches that are useful and distinct, and I’ll call them enriched asking and amplified asking (figure 3.12). Although I’m going to illustrate each approach with a detailed example, you should recognize that these are general recipes that could be used with different types of survey data and different types of big data. Further, you should notice that each of these examples could be viewed in two different ways. Thinking back to the ideas in chapter 1, some people will view these studies as examples of “custommade” survey data enhancing “readymade” big data, and others will view them as examples of “readymade” big data enhancing “custommade” survey data. You should be able to see both views. Finally, you should notice how these examples clarify that surveys and big data sources are complements and not substitutes.

Figure 3.12: Two ways to combine big data sources and survey data. In enriched asking (section 3.6.1), the big data source has a core measure of interest and the survey data builds the necessary context around it. In amplified asking (section 3.6.2), the big data source does not have a core measure of interest, but it is used to amplify the survey data.
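To make the two recipes in figure 3.12 concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the person IDs, the field names (`calls`, `voted`, `interest`, `wealth`), the toy values, and the two helper functions are assumptions made for illustration, not data or methods from the studies discussed later. Enriched asking is shown as a simple record linkage on a shared ID. Amplified asking is shown as a one-variable least-squares fit on the linked respondents, which is then used to predict the survey measure for everyone in the big data source. Real applications would typically need more careful linkage and a richer model.

```python
from statistics import mean

# Hypothetical big data source: one record per person, keyed by an ID.
big_data = {
    "p1": {"calls": 10, "voted": 1},
    "p2": {"calls": 20, "voted": 0},
    "p3": {"calls": 30, "voted": 1},
    "p4": {"calls": 40, "voted": 1},
}

# Hypothetical survey: answers from a small sample of the same people.
survey = {
    "p1": {"interest": "low", "wealth": 1.0},
    "p3": {"interest": "high", "wealth": 3.0},
}


def enriched_asking(big_data, survey):
    """Link survey answers (context) to the core measure in the big data."""
    return {
        pid: {**big_data[pid], **answers}
        for pid, answers in survey.items()
        if pid in big_data
    }


def amplified_asking(big_data, survey, feature, target):
    """Fit a least-squares line on linked respondents, then predict the
    survey measure for every person in the big data source."""
    linked = [pid for pid in survey if pid in big_data]
    xs = [big_data[pid][feature] for pid in linked]
    ys = [survey[pid][target] for pid in linked]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    return {pid: intercept + slope * rec[feature] for pid, rec in big_data.items()}


linked_records = enriched_asking(big_data, survey)
predicted_wealth = amplified_asking(big_data, survey, "calls", "wealth")
```

In the first function, the measure of interest (`voted`) already lives in the big data source and the survey only adds context. In the second, the measure of interest (`wealth`) exists only in the survey, and the big data source is used to extend it to people who were never surveyed.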
