5.2.1 Galaxy Zoo

Galaxy Zoo combined the efforts of many non-expert volunteers to classify a million galaxies.

Galaxy Zoo grew out of a problem faced by Kevin Schawinski, a graduate student in astronomy at the University of Oxford, in 2007. Simplifying quite a bit, Schawinski was interested in galaxies, and galaxies can be classified by their morphology (elliptical or spiral) and by their color (blue or red). At the time, the conventional wisdom among astronomers was that spiral galaxies, like our Milky Way, were blue in color (indicating youth) and that elliptical galaxies were red in color (indicating old age). Schawinski doubted this conventional wisdom. He suspected that while this pattern might be true in general, there were probably a sizable number of exceptions, and that by studying lots of these unusual galaxies, the ones that did not fit the expected pattern, he could learn something about the process through which galaxies formed.

Thus, what Schawinski needed in order to overturn conventional wisdom was a large set of morphologically classified galaxies; that is, galaxies that had been classified as either spiral or elliptical. The problem, however, was that existing algorithmic methods for classification were not yet good enough to be used for scientific research; in other words, classifying galaxies was, at that time, a problem that was hard for computers. Therefore, what was needed was a large number of human-classified galaxies. Schawinski undertook this classification problem with the enthusiasm of a graduate student. In a marathon session of seven 12-hour days, he was able to classify 50,000 galaxies. While 50,000 galaxies may sound like a lot, it is actually only about 5% of the almost one million galaxies that had been photographed in the Sloan Digital Sky Survey. Schawinski realized that he needed a more scalable approach.

Fortunately, it turns out that the task of classifying galaxies does not require advanced training in astronomy; you can teach someone to do it pretty quickly. In other words, even though classifying galaxies is a task that was hard for computers, it was pretty easy for humans. So, while sitting in a pub in Oxford, Schawinski and fellow astronomer Chris Lintott dreamed up a website where volunteers would classify images of galaxies. A few months later, Galaxy Zoo was born.

At the Galaxy Zoo website, volunteers would undergo a few minutes of training; for example, learning the difference between a spiral and an elliptical galaxy (Figure 5.2). After this training, each volunteer had to pass a relatively easy quiz, correctly classifying 11 of 15 galaxies with known classifications, and would then begin real classification of unknown galaxies through a simple web-based interface (Figure 5.3). The transition from volunteer to astronomer would take place in less than 10 minutes and only required passing the lowest of hurdles, a simple quiz.

Figure 5.2: Examples of the two main types of galaxies: spiral and elliptical. The Galaxy Zoo project used more than 100,000 volunteers to categorize more than 900,000 images. Source: www.galaxyzoo.org.

Figure 5.3: Input screen where volunteers were asked to classify a single image. Source: www.galaxyzoo.org.

Galaxy Zoo attracted its initial volunteers after the project was featured in a news article, and in about six months the project grew to involve more than 100,000 citizen scientists, people who participated because they enjoyed the task and wanted to help advance astronomy. Together, these 100,000 volunteers contributed a total of more than 40 million classifications, with the majority of the classifications coming from a relatively small, core group of participants (Lintott et al. 2008).

Researchers who have experience hiring undergraduate research assistants might immediately be skeptical about data quality. While this skepticism is reasonable, Galaxy Zoo shows that when volunteer contributions are correctly cleaned, debiased, and aggregated, they can produce high-quality results (Lintott et al. 2008). An important trick for getting the crowd to create professional-quality data is redundancy; that is, having the same task performed by many different people. In Galaxy Zoo, there were about 40 classifications per galaxy; researchers using undergraduate research assistants could never afford this level of redundancy and would therefore need to be much more concerned with the quality of each individual classification. What the volunteers lacked in training, they made up for with redundancy.
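To make the value of redundancy concrete, here is a minimal sketch under assumptions that are mine rather than Galaxy Zoo's: the choice is binary (spiral or elliptical), volunteers make their mistakes independently, and every volunteer is correct with the same probability (the value 0.7 is invented for illustration). The function `majority_accuracy` is a hypothetical helper that computes how often a simple majority vote lands on the right answer.

```python
from math import comb

def majority_accuracy(p: float, n: int) -> float:
    """Probability that a majority vote of n independent volunteers is correct,
    when each volunteer is correct with probability p (an assumed value).
    An exact tie is resolved by a fair coin flip."""
    total = 0.0
    for k in range(n + 1):  # k = number of correct votes
        prob_k = comb(n, k) * p**k * (1 - p) ** (n - k)
        if 2 * k > n:
            total += prob_k
        elif 2 * k == n:
            total += 0.5 * prob_k
    return total

# One volunteer, a handful, and roughly the redundancy reported for Galaxy Zoo.
for n in (1, 5, 40):
    print(n, round(majority_accuracy(0.7, n), 4))
```

Under these assumptions, accuracy rises quickly with the number of independent votes. Real volunteers are neither identical nor fully independent, so this is best read as an upper-level intuition rather than a description of the actual data.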

Even with multiple classifications per galaxy, however, combining the set of volunteer classifications to produce a consensus classification was tricky. Because very similar challenges arise in most human computation projects, it is helpful to briefly review the three steps that the Galaxy Zoo researchers used to produce their consensus classifications. First, the researchers "cleaned" the data by removing bogus classifications. For example, people who repeatedly classified the same galaxy (something that would happen if they were trying to manipulate the results) had all their classifications discarded. This and other similar cleaning removed about 4% of all classifications.
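The cleaning rule described above can be sketched in a few lines. This is only an illustration: the record layout (dictionaries with `volunteer`, `galaxy`, and `label` keys), the function name `clean`, and the `max_repeats` threshold are all hypothetical, since the text does not say how many repeats triggered removal.

```python
from collections import Counter

def clean(classifications, max_repeats=1):
    """Discard ALL classifications from any volunteer who classified the same
    galaxy more than max_repeats times (threshold is an assumption)."""
    counts = Counter((c["volunteer"], c["galaxy"]) for c in classifications)
    flagged = {vol for (vol, _galaxy), n in counts.items() if n > max_repeats}
    return [c for c in classifications if c["volunteer"] not in flagged]

# Hypothetical records: volunteer "v2" classifies galaxy "g1" three times.
raw = [
    {"volunteer": "v1", "galaxy": "g1", "label": "spiral"},
    {"volunteer": "v2", "galaxy": "g1", "label": "elliptical"},
    {"volunteer": "v2", "galaxy": "g1", "label": "elliptical"},
    {"volunteer": "v2", "galaxy": "g1", "label": "elliptical"},
    {"volunteer": "v2", "galaxy": "g2", "label": "spiral"},
    {"volunteer": "v3", "galaxy": "g2", "label": "spiral"},
]
cleaned = clean(raw)
print(len(raw), "->", len(cleaned))
```

Note that the rule removes the flagged volunteer's classification of `g2` as well, which matches the description that all of that person's classifications were discarded.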

Second, after cleaning, the researchers needed to remove systematic biases in the classifications. Through a series of bias detection studies embedded within the original project (for example, showing some volunteers the galaxy in monochrome instead of color), the researchers discovered several systematic biases, such as a systematic bias to classify faraway spiral galaxies as elliptical galaxies (Bamford et al. 2009). Adjusting for these systematic biases is extremely important because averaging many contributions does not remove systematic bias; it only removes random error.
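The difference between random error and systematic bias can be seen in a small simulation. This is a sketch with invented numbers: I assume a galaxy that is truly a spiral, "noisy" volunteers who call it a spiral 70% of the time, and "biased" volunteers who, perhaps because the galaxy is far away, call it a spiral only 40% of the time. The helper `spiral_vote_share` is hypothetical.

```python
import random

def spiral_vote_share(p_spiral: float, n_votes: int, rng: random.Random) -> float:
    """Fraction of n_votes that say 'spiral' when each vote is 'spiral'
    with probability p_spiral (an assumed value)."""
    votes = sum(rng.random() < p_spiral for _ in range(n_votes))
    return votes / n_votes

rng = random.Random(0)  # fixed seed so the run is repeatable
for n in (40, 4_000, 400_000):
    noisy = spiral_vote_share(0.7, n, rng)   # random error only
    biased = spiral_vote_share(0.4, n, rng)  # systematic bias toward 'elliptical'
    print(f"{n:>7} votes: noisy share={noisy:.3f}, biased share={biased:.3f}")
```

With more votes, the noisy share settles near 0.7 and the majority is right, while the biased share settles near 0.4 and the majority is confidently wrong. Adding volunteers sharpens the estimate but does not move it toward the truth, which is why a separate debiasing step is needed.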

Finally, after debiasing, the researchers needed a method to combine the individual classifications to produce a consensus classification. The simplest way to combine classifications for each galaxy would be to choose the most common classification. However, this approach would give each volunteer equal weight, and the researchers suspected that some volunteers were better at classification than others. Therefore, the researchers developed a more complex iterative weighting procedure that attempts to automatically detect the best classifiers and give them more weight.
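The text does not spell out the weighting procedure, so the following is not the Galaxy Zoo algorithm. It is a minimal sketch of one simple iterative scheme in the same spirit: start with equal weights, compute a weighted consensus, then reweight each volunteer by how often they agree with that consensus, and repeat. The function name, the `(volunteer, galaxy, label)` tuple layout, and the toy votes are all assumptions.

```python
from collections import defaultdict

def weighted_consensus(votes, n_iter=10):
    """votes: list of (volunteer, galaxy, label) tuples.
    Returns (consensus label per galaxy, final weight per volunteer)."""
    weights = {v: 1.0 for v, _, _ in votes}  # start with equal weights
    consensus = {}
    for _ in range(n_iter):
        # Weighted tally of labels for each galaxy.
        tally = defaultdict(lambda: defaultdict(float))
        for v, g, label in votes:
            tally[g][label] += weights[v]
        # Ties are broken alphabetically so the result is deterministic.
        consensus = {g: max(sorted(t), key=t.get) for g, t in tally.items()}
        # New weight = share of a volunteer's votes that match the consensus.
        agree, total = defaultdict(int), defaultdict(int)
        for v, g, label in votes:
            total[v] += 1
            agree[v] += label == consensus[g]
        weights = {v: agree[v] / total[v] for v in total}
    return consensus, weights

S, E = "spiral", "elliptical"
votes = []
# Hypothetical volunteers A, B, E_ are reliable on g1..g3; C and D are always wrong.
for g, truth, wrong in [("g1", S, E), ("g2", E, S), ("g3", S, E)]:
    votes += [(v, g, truth) for v in ("A", "B", "E_")]
    votes += [(v, g, wrong) for v in ("C", "D")]
# g4 (truly a spiral) is seen only by A, C, and D, so a plain majority gets it wrong.
votes += [("A", "g4", S), ("C", "g4", E), ("D", "g4", E)]

consensus, weights = weighted_consensus(votes)
print(consensus["g4"], weights)
```

In this toy example an unweighted majority would label `g4` as elliptical, but after one round of reweighting the reliable volunteer outweighs the two unreliable ones. A scheme like this can also fail, for instance if unreliable volunteers form the majority everywhere, so it is no substitute for the cleaning and debiasing steps.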

Thus, after a three-step process (cleaning, debiasing, and weighting), the Galaxy Zoo research team had converted 40 million volunteer classifications into a set of consensus morphological classifications. When these Galaxy Zoo classifications were compared with three previous smaller-scale attempts by professional astronomers, including the classification by Schawinski that helped to inspire Galaxy Zoo, there was strong agreement. Thus, the volunteers, in aggregate, were able to provide high-quality classifications at a scale that the researchers could not match (Lintott et al. 2008). In fact, by having human classifications for such a large number of galaxies, Schawinski, Lintott, and others were able to show that only about 80% of galaxies follow the expected pattern (blue spirals and red ellipticals), and numerous papers have been written about this discovery (Fortson et al. 2011).

Given this background, we can now see how Galaxy Zoo follows the split-apply-combine recipe, the same recipe that is used for most human computation projects. First, a big problem is split into chunks. In this case, the problem of classifying a million galaxies was split into a million problems of classifying one galaxy. Next, an operation is applied to each chunk independently. In this case, a volunteer classified each galaxy as either spiral or elliptical. Finally, the results are combined to produce a consensus result. In this case, the combine step included the cleaning, debiasing, and weighting needed to produce a consensus classification for each galaxy. Even though most projects use this general recipe, each of the steps needs to be customized to the specific problem being addressed. For example, in the human computation project described below, the same recipe will be followed, but the apply and combine steps will be quite different.
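The split-apply-combine recipe can be written down as three small functions. This is a schematic sketch only: the volunteers are simulated by hypothetical Python functions, the `TRUE_SHAPE` table is invented, and the combine step is a plain majority vote standing in for the cleaning, debiasing, and weighting described earlier.

```python
from collections import Counter

# Invented ground truth, used only to simulate volunteers.
TRUE_SHAPE = {"g1": "spiral", "g2": "elliptical", "g3": "spiral"}

def split(galaxy_ids):
    """Split: one microtask per galaxy."""
    return [{"galaxy": g} for g in galaxy_ids]

def apply_task(task, volunteers):
    """Apply: each volunteer independently labels the galaxy in this task."""
    return [(task["galaxy"], classify(task["galaxy"])) for classify in volunteers]

def combine(labelled):
    """Combine: most common label per galaxy (a stand-in for the real
    cleaning, debiasing, and weighting steps)."""
    by_galaxy = {}
    for galaxy, label in labelled:
        by_galaxy.setdefault(galaxy, Counter())[label] += 1
    return {g: c.most_common(1)[0][0] for g, c in by_galaxy.items()}

def careful(galaxy):        # hypothetical volunteer who is always right
    return TRUE_SHAPE[galaxy]

def always_spiral(galaxy):  # hypothetical volunteer who answers 'spiral' every time
    return "spiral"

volunteers = [careful, careful, always_spiral]
labelled = [pair for task in split(TRUE_SHAPE) for pair in apply_task(task, volunteers)]
result = combine(labelled)
print(result)
```

Because each task is independent, the apply step can be handed to any number of people in any order, which is what lets a project like this spread across many volunteers.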

For the Galaxy Zoo team, this first project was just the beginning. Very quickly they realized that even though they were able to classify close to a million galaxies, this scale is not enough to work with newer digital sky surveys, which can produce images of about 10 billion galaxies (Kuminski et al. 2014). To handle an increase from 1 million to 10 billion, a factor of 10,000, Galaxy Zoo would need to recruit roughly 10,000 times more participants. Even though the number of volunteers on the Internet is large, it is not infinite. Therefore, the researchers realized that if they were going to handle ever-growing amounts of data, a new, even more scalable, approach was needed.
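A quick back-of-the-envelope calculation shows why recruiting more people cannot keep up. It uses the round figures from the text and one assumption of mine: that each new volunteer contributes about as much as the original volunteers did.

```python
galaxies_classified = 1_000_000        # roughly what Galaxy Zoo handled
galaxies_upcoming = 10_000_000_000     # about 10 billion in newer surveys
volunteers_so_far = 100_000            # citizen scientists in the first project

scale_factor = galaxies_upcoming // galaxies_classified
# Assumes effort per volunteer stays the same as in the first project.
volunteers_needed = volunteers_so_far * scale_factor

print(f"scale factor: {scale_factor:,}")
print(f"volunteers needed: {volunteers_needed:,}")
```

Under that assumption, the project would need on the order of a billion participants, which is clearly out of reach for a volunteer website.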

Therefore, Manda Banerji, working with Kevin Schawinski, Chris Lintott, and other members of the Galaxy Zoo team, started teaching computers to classify galaxies. More specifically, using the human classifications created by Galaxy Zoo, Banerji et al. (2010) built a machine learning model that could predict the human classification of a galaxy based on the characteristics of the image. If this machine learning model could reproduce the human classifications with high accuracy, then it could be used by Galaxy Zoo researchers to classify an essentially infinite number of galaxies.

The core of Banerji and colleagues' approach is actually pretty similar to techniques commonly used in social research, although that similarity might not be clear at first glance. First, Banerji and colleagues converted each image into a set of numerical features that summarized its properties. For example, for images of galaxies there could be three features: the amount of blue in the image, the variance in the brightness of the pixels, and the proportion of non-white pixels. The selection of the correct features is an important part of the problem, and it generally requires subject-area expertise. This first step, commonly called feature engineering, results in a data matrix with one row per image and three columns describing that image. Given the data matrix and the desired output (e.g., whether the image was classified by a human as an elliptical galaxy), the researcher estimates the parameters of a statistical model (for example, something like a logistic regression) that predicts the human classification based on the features of the image. Finally, the researcher uses the parameters in this statistical model to produce estimated classifications of new galaxies (Figure 5.4). To think of a social analog, imagine that you had demographic information about a million students, and you know whether they graduated from college or not. You could fit a logistic regression to this data, and then you could use the resulting model parameters to predict whether new students are going to graduate from college. In machine learning, this approach, using labeled examples to create a statistical model that can then label new data, is called supervised learning (Hastie, Tibshirani, and Friedman 2009).
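Here is a minimal sketch of that supervised learning workflow using the three toy features and a logistic regression fitted by plain gradient descent. Everything about the data is invented: the feature values come from a made-up generator (`make_galaxy`), not from real images, and the "human labels" are simply the simulated galaxy types. The model is also the toy one from the text, not the model Banerji et al. actually used.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=1.0, epochs=2000):
    """Batch gradient descent on the mean log loss.
    Returns parameters [intercept, w_blue, w_variance, w_nonwhite]."""
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            err = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

def predict(w, x):
    """Label a galaxy from its features using the fitted parameters."""
    p_elliptical = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))
    return "elliptical" if p_elliptical >= 0.5 else "spiral"

def make_galaxy(rng, kind):
    """Invented feature generator: [blue amount, brightness variance, non-white share]."""
    clip = lambda v: min(1.0, max(0.0, v))
    blue = 0.7 if kind == "spiral" else 0.2       # assumed: spirals are bluer
    variance = 0.6 if kind == "spiral" else 0.4   # assumed: spirals vary more
    return [clip(rng.gauss(blue, 0.1)), clip(rng.gauss(variance, 0.15)),
            clip(rng.gauss(0.5, 0.1))]

rng = random.Random(42)
kinds = [rng.choice(["spiral", "elliptical"]) for _ in range(300)]
X = [make_galaxy(rng, k) for k in kinds]           # data matrix: one row per image
y = [1 if k == "elliptical" else 0 for k in kinds]

w = fit_logistic(X[:100], y[:100])                 # train on the "human-labeled" subset
predicted = [predict(w, x) for x in X[100:]]       # label the remaining galaxies
accuracy = sum(p == k for p, k in zip(predicted, kinds[100:])) / len(predicted)
print(f"accuracy on the remaining galaxies: {accuracy:.2f}")
```

The split into a labeled subset and a remainder mirrors the workflow in the text: humans label some images, the model learns from those, and the fitted parameters are then applied to images no human has looked at.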

Figure 5.4: Simplified description of how Banerji et al. (2010) used the Galaxy Zoo classifications to train a machine learning model to do galaxy classification. Images of galaxies were converted into a matrix of features. In this simplified example there are three features (the amount of blue in the image, the variance in the brightness of the pixels, and the proportion of non-white pixels). Then, for a subset of the images, the Galaxy Zoo labels are used to train a machine learning model. Finally, the machine learning model is used to estimate classifications for the remaining galaxies. I'll call this kind of project a second-generation human computation project because, rather than having humans solve a problem, it has humans build a dataset that can be used to train a computer to solve the problem. The advantage of this computer-assisted approach is that it enables you to handle essentially infinite amounts of data using only a finite amount of human effort.

The features in the Banerji et al. (2010) machine learning model were more complex than those in my toy example (for example, she used features like "de Vaucouleurs fit axial ratio"), and her model was not logistic regression; it was an artificial neural network. Using her features, her model, and the consensus Galaxy Zoo classifications, she was able to create weights on each feature and then use these weights to make predictions about the classification of galaxies. For example, her analysis found that images with a low "de Vaucouleurs fit axial ratio" were more likely to be spiral galaxies. Given these weights, she was able to predict the human classification of a galaxy with reasonable accuracy.

The work of Banerji et al. (2010) turned Galaxy Zoo into what I would call a second-generation human computation system. The best way to think about these second-generation systems is that rather than having humans solve a problem, they have humans build a dataset that can be used to train a computer to solve the problem. The amount of data needed to train the computer can be so large that it requires a human mass collaboration to create. In the case of Galaxy Zoo, the neural networks used by Banerji et al. (2010) required a very large number of human-labeled examples in order to build a model that was able to reliably reproduce the human classifications.

The advantage of this computer-assisted approach is that it enables you to handle essentially infinite amounts of data using only a finite amount of human effort. For example, a researcher with a million human-classified galaxies can build a predictive model that can then be used to classify a billion or even a trillion galaxies. If there are enormous numbers of galaxies, then this kind of human-computer hybrid is really the only possible solution. This infinite scalability is not free, however. Building a machine learning model that can correctly reproduce the human classifications is itself a hard problem, but fortunately there are already excellent books dedicated to this topic (Hastie, Tibshirani, and Friedman 2009; Murphy 2012; James et al. 2013).

Galaxy Zoo shows the evolution of many human computation projects. First, a researcher attempts the project by herself or with a small team of research assistants (e.g., Schawinski's initial classification effort). If this approach does not scale well, the researcher can move to a human computation project where many people contribute classifications. But, for a certain volume of data, pure human effort will not be enough. At that point, researchers need to build second-generation systems where human classifications are used to train a machine learning model that can then be applied to virtually unlimited amounts of data.