2.2 Big data

Big data are created and collected by companies and governments for purposes other than research. Using these data for research therefore requires repurposing.

The first way that many people encounter social research in the digital age is through what is often called big data. Despite the widespread use of this term, there is no consensus about what big data even is. However, one of the most common definitions of big data focuses on the "3 Vs": Volume, Variety, and Velocity. Roughly, there is a lot of data, in a variety of formats, and it is being created constantly. Some fans of big data add other "Vs" such as Veracity and Value, whereas some critics add Vs such as Vague and Vacuous. Rather than the 3 "Vs" (or the 5 "Vs" or the 7 "Vs"), for the purposes of social research, I think a better place to start is the 5 "Ws": Who, What, Where, When, and Why. In fact, I think that many of the challenges and opportunities created by big data sources follow from just one "W": Why.

In the analog age, most of the data used for social research were created for the purpose of doing research. In the digital age, however, companies and governments are creating a huge amount of data for purposes other than research, such as providing services, generating profit, and administering laws. Creative people, however, have realized that you can repurpose these corporate and government data for research. Thinking back to the art analogy in chapter 1, just as Duchamp repurposed a found object to create art, scientists can now repurpose found data to create research.

While there are undoubtedly huge opportunities for repurposing, using data that were not created for the purposes of research also presents new challenges. Compare, for example, a social media service, such as Twitter, with a traditional public opinion survey, such as the General Social Survey. Twitter's main goals are to provide a service to its users and to make a profit. The General Social Survey, on the other hand, is focused on creating general-purpose data for social research, particularly for public opinion research. This difference in goals means that the data created by Twitter and the data created by the General Social Survey have different properties, even though both can be used for studying public opinion. Twitter operates at a scale and speed that the General Social Survey cannot match, but, unlike the General Social Survey, Twitter does not carefully sample users and does not work hard to maintain comparability over time. Because these two data sources are so different, it does not make sense to say that the General Social Survey is better than Twitter or vice versa. If you want hourly measures of global mood (e.g., Golder and Macy (2011)), Twitter is the best choice. On the other hand, if you want to understand long-term changes in the polarization of attitudes in the United States (e.g., DiMaggio, Evans, and Bryson (1996)), then the General Social Survey is the best choice. More generally, rather than trying to argue that big data sources are better or worse than other types of data, this chapter will try to clarify for which kinds of research questions big data sources have attractive properties and for which kinds of questions they might not be ideal.

When thinking about big data sources, many researchers immediately focus on online data created and collected by companies, such as search engine logs and social media posts. However, this narrow focus leaves out two other important sources of big data. First, corporate big data sources increasingly come from digital devices in the physical world. For example, in this chapter, I'll tell you about a study that repurposed supermarket check-out data to study how a worker's productivity is affected by the productivity of her peers (Mas and Moretti 2009). Then, in later chapters, I'll tell you about researchers who used call records from mobile phones (Blumenstock, Cadamuro, and On 2015) and billing data created by electric utilities (Allcott 2015). As these examples illustrate, corporate big data sources are about more than just online behavior.

The second important source of big data missed by a narrow focus on online behavior is data created by governments. These government data, which researchers call government administrative records, include things such as tax records, school records, and vital statistics records (e.g., registries of births and deaths). Governments have been creating these kinds of data for, in some cases, hundreds of years, and social scientists have been using them for nearly as long as there have been social scientists. What has changed, however, is digitization, which has made it dramatically easier for governments to collect, transmit, store, and analyze data. For example, in this chapter, I'll tell you about a study that repurposed data from digital taxi meters in New York City in order to address a fundamental debate in labor economics (Farber 2015). Then, in later chapters, I'll tell you about how government-collected voting records were used in a survey (Ansolabehere and Hersh 2012) and an experiment (Bond et al. 2012).

I think the idea of repurposing is fundamental to learning from big data sources, and so, before talking more specifically about the properties of big data sources (section 2.3) and how these can be used in research (section 2.4), I'd like to offer two pieces of general advice about repurposing. First, it can be tempting to think of the contrast that I've set up as being between "found" data and "designed" data. That's close, but it's not quite right. Even though, from the perspective of researchers, big data sources are "found," they don't just fall from the sky. Instead, data sources that are "found" by researchers were designed by someone for some purpose. Because "found" data are designed by someone, I always recommend that you try to understand as much as possible about the people and processes that created your data. Second, when you are repurposing data, it is often extremely helpful to imagine the ideal dataset for your problem and then compare that ideal dataset with the one that you are using. If you didn't collect your data yourself, there are likely to be important differences between what you want and what you have. Noticing these differences will help clarify what you can and cannot learn from the data you have, and it might suggest new data that you should collect.

In my experience, social scientists and data scientists tend to approach repurposing very differently. Social scientists, who are accustomed to working with data designed for research, are typically quick to point out the problems with repurposed data while ignoring its strengths. On the other hand, data scientists are typically quick to point out the benefits of repurposed data while ignoring its weaknesses. Naturally, the best approach is a hybrid. That is, researchers need to understand the characteristics of big data sources, both good and bad, and then figure out how to learn from them. And that is the plan for the remainder of this chapter. In the next section, I will describe ten common characteristics of big data sources. Then, in the following section, I will describe three research approaches that can work well with such data.