2.2 Big data

Big data are created and collected by companies and governments for purposes other than research. Using these data for research, therefore, requires repurposing.

The idealized view of social research pictures a scientist having an idea and then collecting data to test that idea. This style of research leads to a tight fit between the research question and the data, but it is limited because an individual researcher often lacks the resources needed to collect the data they need, such as large, rich, and nationally representative data. Therefore, a lot of social research in the past has used large-scale social surveys, such as the General Social Survey (GSS), the American National Election Study (ANES), and the Panel Study of Income Dynamics (PSID). These large-scale surveys are generally run by a team of researchers, and they are designed to create data that can be used by many researchers. Because of the goals of these large-scale surveys, great care is put into designing the data collection and preparing the resulting data for use by researchers. These data are by researchers and for researchers.

Most social research using digital-age sources, however, is different. Instead of using data collected by researchers and for researchers, it uses data sources that were created and collected by businesses and governments for their own purposes, such as making a profit, providing a service, or enforcing a law. These business and government data sources have come to be called big data. Doing research with big data is different from doing research with data that were created for research. Compare, for example, a social media website, such as Twitter, with a traditional public opinion survey, such as the General Social Survey (GSS). The main goals of Twitter are to provide a service to its users and to make a profit. In the process of achieving these goals, Twitter creates data that might be useful for studying certain aspects of public opinion. But, unlike the General Social Survey (GSS), Twitter is not primarily focused on social research.

The term big data is frustratingly vague, and it groups together many different things. For the purposes of social research, I think it is helpful to distinguish between two kinds of big data sources: government administrative records and business administrative records. Government administrative records are data created by governments as part of their routine activities. These kinds of records have been used by researchers in the past (for example, by demographers studying birth, marriage, and death records), but governments are increasingly collecting and releasing detailed records in analyzable forms. For example, the New York City government installed digital meters inside every taxi in the city. These meters record all kinds of data about each taxi ride, including the driver, the start time and location, the stop time and location, and the fare. In a study that I'll tell you about later in this chapter, Henry Farber (2015) repurposed these data to address a fundamental debate in labor economics about the relationship between hourly wages and the number of hours worked.

The second main kind of big data for social research is business administrative records. These are data that businesses create and collect as part of their routine activities. These business administrative records are often digital traces, and they include things like search engine query logs, social media posts, and call records from mobile phones. Critically, these business administrative records are not just about online behavior. For example, stores that use check-out scanners are creating real-time measures of worker productivity. In a study that I'll tell you about later in this chapter, Alexandre Mas and Enrico Moretti (2009) repurposed this supermarket check-out data to study how the productivity of workers was affected by the productivity of their peers.

As both of these examples illustrate, the idea of repurposing is fundamental to learning from big data. In my experience, social scientists and data scientists approach this repurposing very differently. Social scientists, who are accustomed to working with data designed for research, are quick to point out the problems with repurposed data while ignoring its strengths. On the other hand, data scientists are quick to point out the benefits of repurposed data while ignoring its weaknesses. Naturally, the best approach is a hybrid. That is, researchers need to understand the characteristics of these new sources of data, both good and bad, and then figure out how to learn from them. And that is the plan for the remainder of this chapter. Next, I will describe ten common characteristics of business and government administrative data. Then, I will describe three research approaches that can be used with these data, approaches that are well suited to their characteristics.