3.6.1 Enriched asking

In enriched asking, survey data build context around a big data source that contains some important measurements but lacks others.

One way to combine survey data and big data sources is a process that I'll call enriched asking. In enriched asking, a big data source contains some important measurements but lacks others, so the researcher collects the missing measurements in a survey and then links the two data sources together. One example of enriched asking is the study by Burke and Kraut (2014) about whether interacting on Facebook increases friendship strength, which I described in section 3.2. In that case, Burke and Kraut combined survey data with Facebook log data.

The setting in which Burke and Kraut were working, however, meant that they did not have to deal with two big problems that researchers doing enriched asking typically face. First, actually linking together the individual-level data sets, a process called record linkage, can be difficult if there is no unique identifier in both data sources that can be used to ensure that the correct record in one data set is matched with the correct record in the other. The second main problem with enriched asking is that the quality of the big data source will frequently be difficult for researchers to assess, because the process through which the data are created may be proprietary and could be susceptible to many of the problems described in chapter 2. In other words, enriched asking will frequently involve error-prone linking of surveys to black-box data sources of unknown quality. Despite these problems, enriched asking can be used to conduct important research, as was demonstrated by Stephen Ansolabehere and Eitan Hersh (2012) in their research on voting patterns in the United States.

Voter turnout has been the subject of extensive research in political science, and, in the past, researchers' understanding of who votes and why has generally been based on the analysis of survey data. Voting in the United States, however, is an unusual behavior in that the government records whether each citizen has voted (of course, the government does not record whom each citizen votes for). For many years, these governmental voting records were available only on paper forms, scattered across local government offices around the country. This made it very difficult, but not impossible, for political scientists to form a complete picture of the electorate and to compare what people say in surveys about voting with their actual voting behavior (Ansolabehere and Hersh 2012).

But these voting records have now been digitized, and a number of private companies have systematically collected and merged them to produce comprehensive master voting files that contain the voting behavior of all Americans. Ansolabehere and Hersh partnered with one of these companies, Catalist LCC, in order to use its master voting file to help develop a better picture of the electorate. Further, because their study relied on digital records collected and curated by a company that had invested substantial resources in data collection and harmonization, it offered a number of advantages over previous efforts that had been made without the aid of companies and with analog records.

Like many of the big data sources in chapter 2, the Catalist master file did not include much of the demographic, attitudinal, and behavioral information that Ansolabehere and Hersh needed. In fact, they were particularly interested in comparing voting behavior reported in surveys with validated voting behavior (i.e., the information in the Catalist database). So Ansolabehere and Hersh collected the data that they wanted through a large social survey, the CCES, mentioned earlier in this chapter. They then gave their data to Catalist, and Catalist gave them back a merged data file that included validated voting behavior (from Catalist), self-reported voting behavior (from the CCES), and the demographics and attitudes of respondents (from the CCES) (figure 3.13). In other words, Ansolabehere and Hersh combined the voting records with survey data in order to do research that was not possible with either data source individually.

Figure 3.13: Schematic of the study by Ansolabehere and Hersh (2012). To create the master datafile, Catalist combines and harmonizes information from many different sources. This process of merging, no matter how careful, will propagate errors in the original data sources and will introduce new errors. A second source of errors is the record linkage between the survey data and the master datafile. If every person had a stable, unique identifier in both data sources, then linkage would be trivial. But Catalist had to do the linkage using imperfect identifiers, in this case name, gender, birth year, and home address. Unfortunately, for many cases there could be incomplete or inaccurate information; a voter named Homer Simpson might appear as Homer Jay Simpson, Homie J Simpson, or even Homer Sampsin. Despite the potential for errors in the Catalist master datafile and errors in the record linkage, Ansolabehere and Hersh were able to build confidence in their estimates through several different types of checks.
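To make the record linkage problem in the caption concrete, here is a minimal Python sketch of linking on imperfect identifiers. It is not Catalist's procedure, which is proprietary; the field names, the `link_record` helper, the similarity measure (the standard library's `difflib.SequenceMatcher`), the 0.6 threshold, and all of the records are assumptions made up for illustration.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase, drop periods, and collapse whitespace before comparing."""
    return " ".join(name.lower().replace(".", "").split())


def name_similarity(a, b):
    """Similarity in [0, 1] between two names after normalization."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def link_record(survey_row, master_rows, threshold=0.6):
    """Return the best-matching master record, or None if nothing is close.

    Candidates must agree exactly on birth year and gender ("blocking");
    among those, the record with the most similar name wins, provided its
    similarity reaches the (hypothetical) threshold.
    """
    candidates = [
        m for m in master_rows
        if m["birth_year"] == survey_row["birth_year"]
        and m["gender"] == survey_row["gender"]
    ]
    best, best_score = None, 0.0
    for m in candidates:
        score = name_similarity(survey_row["name"], m["name"])
        if score > best_score:
            best, best_score = m, score
    return best if best_score >= threshold else None


# Hypothetical master file with a validated-turnout field.
master = [
    {"name": "Homer Jay Simpson", "birth_year": 1956, "gender": "M", "voted": True},
    {"name": "Ned Flanders", "birth_year": 1956, "gender": "M", "voted": True},
    {"name": "Marge Simpson", "birth_year": 1956, "gender": "F", "voted": False},
]

# Hypothetical survey respondents; the first has a misspelled name.
misspelled = {"name": "Homer Sampsin", "birth_year": 1956, "gender": "M"}
unlisted = {"name": "Barney Gumble", "birth_year": 1956, "gender": "M"}

match_for_misspelled = link_record(misspelled, master)
match_for_unlisted = link_record(unlisted, master)
```

In this toy setting the misspelled respondent is linked to the intended record, while the respondent who is not in the master file is left unlinked. Lowering the threshold would link more respondents but would also raise the risk of false matches, which is the trade-off that makes linkage error-prone in practice.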


With their combined data file, Ansolabehere and Hersh came to three important conclusions. First, over-reporting of voting is rampant: almost half of the nonvoters reported voting, and if someone reported voting, there was only an 80% chance that they actually voted. Second, over-reporting is not random: it is more common among high-income, well-educated partisans who are engaged in public affairs. In other words, the people who are most likely to vote are also the most likely to lie about voting. Third, and most critically, because of the systematic nature of over-reporting, the actual differences between voters and nonvoters are smaller than they appear from surveys alone. For example, those with a bachelor's degree are about 22 percentage points more likely to report voting, but only 10 percentage points more likely to actually vote. It turns out, perhaps not surprisingly, that existing resource-based theories of voting are much better at predicting who will report voting (which is the data that researchers have used in the past) than at predicting who actually votes. Thus, the empirical finding of Ansolabehere and Hersh (2012) calls for new theories to understand and predict voting.
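The arithmetic behind these three conclusions can be sketched with a toy merged file. The counts below are invented for illustration and are not the study's data; they are chosen only so that the pattern (over-reporting concentrated among degree holders) is visible. The `Cell` class and the `share` helper are likewise hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    """A group of respondents in a hypothetical merged survey/voter file."""
    degree: bool      # holds a bachelor's degree (from the survey)
    validated: bool   # voted according to the voter file
    reported: bool    # said they voted in the survey
    n: int            # number of respondents in this group


# Invented counts: 100 respondents with a degree, 100 without.
cells = [
    Cell(degree=True, validated=True, reported=True, n=70),
    Cell(degree=True, validated=False, reported=True, n=20),
    Cell(degree=True, validated=False, reported=False, n=10),
    Cell(degree=False, validated=True, reported=True, n=60),
    Cell(degree=False, validated=False, reported=True, n=10),
    Cell(degree=False, validated=False, reported=False, n=30),
]


def share(cells, event, given):
    """Fraction of respondents satisfying `given` who also satisfy `event`."""
    denominator = sum(c.n for c in cells if given(c))
    numerator = sum(c.n for c in cells if given(c) and event(c))
    return numerator / denominator


# Conclusion 1: how common is over-reporting?
overreport_rate = share(cells, lambda c: c.reported, lambda c: not c.validated)
reported_precision = share(cells, lambda c: c.validated, lambda c: c.reported)

# Conclusion 3: the education gap in reported versus validated turnout.
reported_gap = (share(cells, lambda c: c.reported, lambda c: c.degree)
                - share(cells, lambda c: c.reported, lambda c: not c.degree))
validated_gap = (share(cells, lambda c: c.validated, lambda c: c.degree)
                 - share(cells, lambda c: c.validated, lambda c: not c.degree))

print(f"nonvoters who reported voting: {overreport_rate:.0%}")
print(f"reporters who actually voted:  {reported_precision:.0%}")
print(f"education gap, reported:       {reported_gap * 100:.0f} points")
print(f"education gap, validated:      {validated_gap * 100:.0f} points")
```

With these made-up counts, the reported education gap is twice the validated gap, purely because nonvoters with a degree over-report more often. This is the mechanism by which survey-only analyses can overstate the differences between voters and nonvoters.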

But how much should we trust these results? Remember, these results depend on error-prone linking to black-box data with unknown amounts of error. More specifically, the results hinge on two key steps: (1) the ability of Catalist to combine many disparate data sources to produce an accurate master datafile and (2) the ability of Catalist to link the survey data to its master datafile. Each of these steps is difficult, and errors in either step could lead researchers to the wrong conclusions. However, both data processing and linking are critical to the continued existence of Catalist as a company, so it can invest resources in solving these problems, often at a scale that no academic researcher can match. In their paper, Ansolabehere and Hersh go through a number of steps to check the results of these two steps, even though some of them are proprietary, and these checks might be helpful for other researchers wishing to link survey data to black-box big data sources.

What general lessons can researchers draw from this study? First, there is tremendous value both in enriching big data sources with survey data and in enriching survey data with big data sources (you can see this study either way). By combining these two data sources, the researchers were able to do something that was impossible with either one individually. The second general lesson is that although aggregated, commercial data sources, such as the data from Catalist, should not be considered "ground truth," in some cases they can be useful. Skeptics sometimes compare these aggregated, commercial data sources with absolute Truth and point out that the data sources fall short. However, in this case, the skeptics are making the wrong comparison: all data that researchers use fall short of absolute Truth. Instead, it is better to compare aggregated, commercial data sources with other available data sources (e.g., self-reported voting behavior), which invariably have errors as well. Finally, the third general lesson of Ansolabehere and Hersh's study is that in some situations, researchers can benefit from the huge investments that many private companies are making in collecting and harmonizing complex social data sets.