2.2 Big data

Big data are created and collected by companies and governments for purposes other than research. Using this data for research, therefore, requires repurposing.

The first way that many people encounter social research in the digital age is through what is often called big data. Despite the widespread use of this term, there is no consensus about what big data even is. However, one of the most common definitions of big data focuses on the "3 Vs": Volume, Variety, and Velocity. Roughly, there is a lot of data, in a variety of formats, and it is being created constantly. Some fans of big data also add other "Vs" such as Veracity and Value, whereas some critics add Vs such as Vague and Vacuous. Rather than the 3 "Vs" (or the 5 "Vs" or the 7 "Vs"), for the purposes of social research, I think a better place to start is the 5 "Ws": Who, What, Where, When, and Why. In fact, I think that many of the challenges and opportunities created by big data sources follow from just one "W": Why.

In the analog age, most of the data used for social research were created for the purpose of doing research. In the digital age, however, a huge amount of data is being created by companies and governments for purposes other than research, such as providing services, generating profit, and administering laws. Creative people, however, have realized that you can repurpose this corporate and government data for research. Thinking back to the art analogy in chapter 1, just as Duchamp repurposed a found object to create art, scientists can now repurpose found data to create research.

While there are undoubtedly huge opportunities for repurposing, using data that were not created for the purposes of research also presents new challenges. Compare, for example, a social media service, such as Twitter, with a traditional public opinion survey, such as the General Social Survey. Twitter's main goals are to provide a service to its users and to make a profit. The General Social Survey, on the other hand, is focused on creating general-purpose data for social research, particularly for public opinion research. This difference in goals means that the data created by Twitter and the data created by the General Social Survey have different properties, even though both can be used for studying public opinion. Twitter operates at a scale and speed that the General Social Survey cannot match, but, unlike the General Social Survey, Twitter does not carefully sample users and does not work hard to maintain comparability over time. Because these two data sources are so different, it does not make sense to say that the General Social Survey is better than Twitter or vice versa. If you want hourly measures of global mood (e.g., Golder and Macy (2011)), Twitter is best. On the other hand, if you want to understand long-term changes in the polarization of attitudes in the United States (e.g., DiMaggio, Evans, and Bryson (1996)), then the General Social Survey is the best choice. More generally, rather than trying to argue that big data sources are better or worse than other types of data, this chapter will try to clarify for which kinds of research questions big data sources have attractive properties and for which kinds of questions they might not be ideal.

When thinking about big data sources, many researchers immediately focus on online data created and collected by companies, such as search engine logs and social media posts. However, this narrow focus leaves out two other important sources of big data. First, corporate big data sources increasingly come from digital devices in the physical world. For example, in this chapter, I'll tell you about a study that repurposed supermarket check-out data to study how a worker's productivity is affected by the productivity of her peers (Mas and Moretti 2009). Then, in later chapters, I'll tell you about researchers who used call records from mobile phones (Blumenstock, Cadamuro, and On 2015) and billing data created by electric utilities (Allcott 2015). As these examples illustrate, corporate big data sources are about more than just online behavior.

The second important source of big data missed by a narrow focus on online behavior is data created by governments. These government data, which researchers call government administrative records, include things such as tax records, school records, and vital statistics records (e.g., registries of births and deaths). Governments have been creating these kinds of data for, in some cases, hundreds of years, and social scientists have been exploiting them for nearly as long as there have been social scientists. What has changed, however, is digitization, which has made it dramatically easier for governments to collect, transmit, store, and analyze data. For example, in this chapter, I'll tell you about a study that repurposed data from the New York City government's digital taxi meters in order to address a fundamental debate in labor economics (Farber 2015). Then, in later chapters, I'll tell you about how government-collected voting records were used in a survey (Ansolabehere and Hersh 2012) and an experiment (Bond et al. 2012).

I think the idea of repurposing is fundamental to learning from big data sources, and so, before talking more specifically about the properties of big data sources (section 2.3) and how these can be used in research (section 2.4), I'd like to offer two pieces of general advice about repurposing. First, it can be tempting to think about the contrast that I've set up as being between "found" data and "designed" data. That's close, but it's not quite right. Even though, from the perspective of researchers, big data sources are "found," they don't just fall from the sky. Instead, data sources that are "found" by researchers are designed by someone for some purpose. Because "found" data are designed by someone, I always recommend that you try to understand as much as possible about the people and processes that created your data. Second, when you are repurposing data, it is often extremely helpful to imagine the ideal dataset for your problem and then compare that ideal dataset with the one that you are using. If you didn't collect your data yourself, there are likely to be important differences between what you want and what you have. Noticing these differences will help clarify what you can and cannot learn from the data you have, and it might suggest new data that you should collect.

In my experience, social scientists and data scientists tend to approach repurposing very differently. Social scientists, who are accustomed to working with data designed for research, are typically quick to point out the problems with repurposed data while ignoring its strengths. On the other hand, data scientists are typically quick to point out the benefits of repurposed data while ignoring its weaknesses. Naturally, the best approach is a hybrid. That is, researchers need to understand the characteristics of big data sources, both good and bad, and then figure out how to learn from them. And that is the plan for the remainder of this chapter. In the next section, I will describe ten common characteristics of big data sources. Then, in the following section, I will describe two research approaches that can work well with such data.