5.2.1 Galaxy Zoo

Galaxy Zoo combined the efforts of many non-expert volunteers to classify a million galaxies.

Galaxy Zoo grew out of a problem faced by Kevin Schawinski, a graduate student in astronomy at the University of Oxford in 2007. Simplifying quite a bit, Schawinski was interested in galaxies, and galaxies can be classified by their morphology (elliptical or spiral) and by their color (blue or red). At the time, the conventional wisdom among astronomers was that spiral galaxies, like our Milky Way, were blue in color (indicating youth) and that elliptical galaxies were red in color (indicating old age). Schawinski doubted this conventional wisdom. He suspected that while this pattern might be true in general, there were probably a sizable number of exceptions, and that by studying lots of these unusual galaxies, the ones that did not fit the expected pattern, he could learn something about the process through which galaxies formed.

Thus, what Schawinski needed in order to overturn the conventional wisdom was a large set of morphologically classified galaxies; that is, galaxies that had been classified as either spiral or elliptical. The problem, however, was that existing algorithmic methods for classification were not yet good enough to be used for scientific research; in other words, classifying galaxies was, at that time, a problem that was hard for computers. Therefore, what was needed was a large number of human-classified galaxies. Schawinski undertook this classification problem with the enthusiasm of a graduate student. In a marathon session of seven 12-hour days, he was able to classify 50,000 galaxies. While 50,000 galaxies may sound like a lot, it is actually only about 5% of the almost one million galaxies that had been photographed in the Sloan Digital Sky Survey. Schawinski realized that he needed a more scalable approach.

Fortunately, it turns out that the task of classifying galaxies does not require advanced training in astronomy; you can teach someone to do it pretty quickly. In other words, even though classifying galaxies was a task that was hard for computers, it was pretty easy for humans. So, while sitting in a pub in Oxford, Schawinski and fellow astronomer Chris Lintott dreamed up a website where volunteers would classify images of galaxies. A few months later, Galaxy Zoo was born.

At the Galaxy Zoo website, volunteers would undergo a few minutes of training; for example, learning the difference between a spiral and an elliptical galaxy (Figure 5.2). After this training, each volunteer had to pass a relatively easy quiz, correctly classifying 11 of 15 galaxies with known classifications, and would then begin real classification of unknown galaxies through a simple web-based interface (Figure 5.3). The transition from volunteer to astronomer would take place in less than 10 minutes and only required passing the lowest of hurdles, a simple quiz.

Figure 5.2: Examples of the two main types of galaxies: spiral and elliptical. The Galaxy Zoo project used more than 100,000 volunteers to categorize more than 900,000 images. Source: www.galaxyzoo.org.

Figure 5.3: Input screen where volunteers were asked to classify a single image. Source: www.galaxyzoo.org.

Galaxy Zoo attracted its initial volunteers after the project was featured in a news article, and in about six months the project grew to involve more than 100,000 citizen scientists, people who participated because they enjoyed the task and wanted to help advance astronomy. Together, these 100,000 volunteers contributed a total of more than 40 million classifications, with the majority of the classifications coming from a relatively small, core group of participants (Lintott et al. 2008).

Researchers who have experience hiring undergraduate research assistants might immediately be skeptical about data quality. While this skepticism is reasonable, Galaxy Zoo shows that when volunteer contributions are correctly cleaned, debiased, and aggregated, they can produce high-quality results (Lintott et al. 2008). An important trick for getting the crowd to create professional-quality data is redundancy; that is, having the same task performed by many different people. In Galaxy Zoo, there were about 40 classifications per galaxy; researchers using undergraduate research assistants could never afford this level of redundancy and therefore would need to be much more concerned with the quality of each individual classification. What the volunteers lacked in training, they made up for with redundancy.
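
To see why redundancy can substitute for training, consider a rough sketch under assumptions that are purely illustrative and do not come from the Galaxy Zoo papers: every volunteer classifies a given galaxy correctly with the same probability (0.7 here), and volunteers make their errors independently. The hypothetical function `majority_correct` computes the chance that a strict majority of the votes is correct.

```python
from math import comb

def majority_correct(n_votes, p_correct):
    """Probability that more than half of n_votes independent votes are right.

    Assumes every volunteer is right with the same probability p_correct
    (an illustrative simplification, not a Galaxy Zoo estimate).
    """
    total = 0.0
    for k in range(n_votes // 2 + 1, n_votes + 1):
        total += comb(n_votes, k) * p_correct**k * (1 - p_correct) ** (n_votes - k)
    return total

# One vote, a handful of votes, and roughly the Galaxy Zoo level of redundancy.
for n_votes in (1, 5, 40):
    print(n_votes, round(majority_correct(n_votes, 0.7), 3))
```

Under these assumptions the majority becomes far more reliable than any single volunteer as the number of votes grows. The independence assumption is doing a lot of work, however, which is one reason the cleaning and debiasing steps described next matter.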

Even with multiple classifications per galaxy, however, combining the set of volunteer classifications to produce a consensus classification was tricky. Because very similar challenges arise in most human computation projects, it is helpful to briefly review the three steps that the Galaxy Zoo researchers used to produce their consensus classifications. First, the researchers "cleaned" the data by removing bogus classifications. For example, people who repeatedly classified the same galaxy, something that would happen if they were trying to manipulate the results, had all of their classifications discarded. This and other similar cleaning removed about 4% of all classifications.

Second, after cleaning, the researchers needed to remove systematic biases in the classifications. Through a series of bias-detection studies embedded within the original project (for example, showing some volunteers a galaxy in monochrome instead of color), the researchers discovered several systematic biases, such as a systematic bias to classify faraway spiral galaxies as elliptical galaxies (Bamford et al. 2009). Adjusting for these systematic biases is extremely important because averaging many contributions does not remove systematic bias; it only removes random error.
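
The difference between random error and systematic bias can be illustrated with a small simulation. This is a sketch with invented numbers, not an analysis of Galaxy Zoo data: it assumes a single volunteer labels a nearby spiral galaxy as "spiral" with probability 0.7, but labels a faraway spiral as "spiral" with probability only 0.4 because faint spirals tend to look elliptical. The function `share_voting_spiral` is a hypothetical helper.

```python
import random

def share_voting_spiral(p_spiral, n_votes, rng):
    """Fraction of n_votes simulated volunteers who answer 'spiral'."""
    hits = sum(1 for _ in range(n_votes) if rng.random() < p_spiral)
    return hits / n_votes

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical probabilities that one volunteer calls a true spiral "spiral".
P_NEARBY = 0.7   # random error only: most volunteers get it right
P_FARAWAY = 0.4  # systematic bias: most volunteers get it wrong

for n_votes in (10, 100, 10_000):
    near = share_voting_spiral(P_NEARBY, n_votes, rng)
    far = share_voting_spiral(P_FARAWAY, n_votes, rng)
    print(f"{n_votes:>6} votes: nearby {near:.2f}, faraway {far:.2f}")

# With many votes, each share settles near its underlying probability.
near_share = share_voting_spiral(P_NEARBY, 10_000, rng)
far_share = share_voting_spiral(P_FARAWAY, 10_000, rng)
```

As the number of votes grows, the share for the faraway galaxy settles near 0.4 rather than moving toward the truth, so a majority vote stays wrong no matter how many volunteers are added. More votes shrink the noise around the biased answer; they do not correct it.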

Finally, after debiasing, the researchers needed a method to combine the individual classifications to produce a consensus classification. The simplest way to combine the classifications for each galaxy would have been to choose the most common classification. However, this approach would have given each volunteer equal weight, and the researchers suspected that some volunteers were better at classification than others. Therefore, the researchers developed a more complex iterative weighting procedure that attempted to automatically detect the best classifiers and give them more weight.
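
The general idea of iterative weighting can be sketched as follows. This is a simplified stand-in, not the procedure in Lintott et al. (2008): it assumes that a volunteer's weight is simply the fraction of their votes that agree with the current consensus, and it alternates between recomputing the consensus and recomputing the weights. The function name, the data, and the number of rounds are all hypothetical.

```python
from collections import defaultdict

def weighted_consensus(votes, n_rounds=5):
    """votes: list of (volunteer, galaxy, label) tuples.

    Returns (consensus, weights). Every volunteer starts with weight 1.0,
    so the first round is a plain majority vote.
    """
    weights = {volunteer: 1.0 for volunteer, _, _ in votes}
    consensus = {}
    for _ in range(n_rounds):
        # Step 1: weighted vote for each galaxy.
        tallies = defaultdict(lambda: defaultdict(float))
        for volunteer, galaxy, label in votes:
            tallies[galaxy][label] += weights[volunteer]
        # Ties are broken alphabetically so the result is deterministic.
        consensus = {g: max(sorted(t), key=t.get) for g, t in tallies.items()}
        # Step 2: new weight = share of votes that match the consensus.
        agree, total = defaultdict(int), defaultdict(int)
        for volunteer, galaxy, label in votes:
            total[volunteer] += 1
            agree[volunteer] += label == consensus[galaxy]
        weights = {v: agree[v] / total[v] for v in total}
    return consensus, weights

# Invented votes: A and B always agree; C, D, and E are erratic.
S, E = "spiral", "elliptical"
table = {
    "g1": {"A": S, "B": S, "C": S, "D": E, "E": E},
    "g2": {"A": E, "B": E, "C": S, "D": E, "E": S},
    "g3": {"A": S, "B": S, "C": E, "D": E, "E": S},
    "g4": {"A": E, "B": E, "C": E, "D": S, "E": S},
    "g5": {"A": S, "B": S, "C": E, "D": E, "E": E},
}
votes = [(v, g, label) for g, row in table.items() for v, label in row.items()]

consensus, weights = weighted_consensus(votes)
print(consensus["g5"], weights)
```

In this invented example a plain majority vote labels `g5` as elliptical, but after a few rounds the two consistent volunteers outweigh the three erratic ones and the consensus for `g5` becomes spiral. Agreement with the consensus is only a proxy for skill, so a scheme like this could also entrench a shared mistake, which is another reason debiasing comes before weighting.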

Thus, after a three-step process of cleaning, debiasing, and weighting, the Galaxy Zoo research team had converted 40 million volunteer classifications into a set of consensus morphological classifications. When these Galaxy Zoo classifications were compared with three previous smaller-scale attempts by professional astronomers, including the classification by Schawinski that helped to inspire Galaxy Zoo, there was strong agreement. Thus, the volunteers, in aggregate, were able to provide high-quality classifications at a scale that the researchers could not match (Lintott et al. 2008). In fact, by having human classifications for such a large number of galaxies, Schawinski, Lintott, and others were able to show that only about 80% of galaxies follow the expected pattern of blue spirals and red ellipticals, and numerous papers have been written about this discovery (Fortson et al. 2011).

Given this background, we can now see how Galaxy Zoo follows the split-apply-combine recipe, the same recipe that is used for most human computation projects. First, a big problem is split into chunks. In this case, the problem of classifying a million galaxies was split into a million problems of classifying one galaxy. Next, an operation is applied to each chunk independently. In this case, volunteers classified each galaxy as either spiral or elliptical. Finally, the results are combined to produce a consensus result. In this case, the combine step included the cleaning, debiasing, and weighting needed to produce a consensus classification for each galaxy. Even though most projects use this general recipe, each of the steps needs to be customized to the specific problem being addressed. For example, in the human computation project described below, the same recipe will be followed, but the apply and combine steps will be quite different.
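
The split-apply-combine recipe can be written down as a very small sketch. Everything here is hypothetical: volunteers are modeled as functions that map a galaxy identifier to a label, and the combine step is a plain majority vote standing in for the cleaning, debiasing, and weighting that Galaxy Zoo actually used.

```python
from collections import Counter

def split(galaxy_ids):
    """Split: turn one big job into one small task per galaxy."""
    return list(galaxy_ids)

def apply_step(task, volunteers):
    """Apply: each volunteer classifies the same galaxy independently."""
    return [classify(task) for classify in volunteers]

def combine(labels):
    """Combine: majority vote (a stand-in for clean, debias, and weight)."""
    return Counter(labels).most_common(1)[0][0]

# Invented "true" labels, used only to simulate volunteer behavior.
truth = {"g1": "spiral", "g2": "elliptical", "g3": "spiral"}

# Three simulated volunteers: two careful, one who always answers "spiral".
volunteers = [
    lambda galaxy: truth[galaxy],
    lambda galaxy: truth[galaxy],
    lambda galaxy: "spiral",
]

consensus = {task: combine(apply_step(task, volunteers)) for task in split(truth)}
print(consensus)
```

Because each task is handled independently, the apply step can be spread over any number of people, and only the combine step needs to know how to reconcile their answers. Customizing a project to a new problem mostly means replacing `apply_step` and `combine`.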

For the Galaxy Zoo team, this first project was just the beginning. Very quickly they realized that even though they were able to classify close to a million galaxies, this scale is not enough to work with newer digital sky surveys, which could produce images of about 10 billion galaxies (Kuminski et al. 2014). To handle an increase from 1 million to 10 billion, a factor of 10,000, Galaxy Zoo would need to recruit roughly 10,000 times more participants. Even though the number of volunteers on the Internet is large, it is not infinite. Therefore, the researchers realized that if they were going to handle ever-growing amounts of data, a new, even more scalable, approach was needed.

Therefore, Manda Banerji, working with Kevin Schawinski, Chris Lintott, and other members of the Galaxy Zoo team, started teaching computers to classify galaxies. More specifically, using the human classifications created by Galaxy Zoo, Banerji et al. (2010) built a machine learning model that could predict the human classification of a galaxy based on the characteristics of the image. If this machine learning model could reproduce the human classifications with high accuracy, then it could be used by Galaxy Zoo researchers to classify an essentially infinite number of galaxies.

The core of Banerji and colleagues' approach is actually pretty similar to techniques commonly used in social research, although that similarity might not be clear at first glance. First, Banerji and colleagues converted each image into a set of numerical features that summarize its properties. For example, for images of galaxies there could be three features: the amount of blue in the image, the variance in the brightness of the pixels, and the proportion of non-white pixels. The selection of the correct features is an important part of the problem, and it generally requires subject-area expertise. This first step, commonly called feature engineering, results in a data matrix with one row per image and three columns describing that image. Given the data matrix and the desired output (e.g., whether the image was classified by a human as an elliptical galaxy), the researcher estimates the parameters of a statistical model, for example something like a logistic regression, that predicts the human classification based on the features of the image. Finally, the researcher uses the parameters of this statistical model to produce estimated classifications of new galaxies (Figure 5.4). To think of a social analog, imagine that you had demographic information about a million students, and you know whether each of them graduated from college or not. You could fit a logistic regression to these data, and then you could use the resulting model parameters to predict whether new students are going to graduate from college. In machine learning, this approach of using labeled examples to create a statistical model that can then label new data is called supervised learning (Hastie, Tibshirani, and Friedman 2009).
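
The toy version of this workflow can be sketched in a few lines. This is an illustration only: the feature values and labels below are invented, the three features are the toy ones named above rather than the features used by Banerji et al. (2010), and the model is a small logistic regression fitted by gradient descent rather than their actual model. The function names and training settings are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, features):
    """Estimated probability that an image would be labeled elliptical."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)

def fit_logistic(rows, labels, learning_rate=0.5, epochs=2000):
    """Fit logistic regression by batch gradient descent on the log loss."""
    n_features = len(rows[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for features, label in zip(rows, labels):
            error = predict(weights, bias, features) - label
            grad_b += error
            for j, x in enumerate(features):
                grad_w[j] += error * x
        bias -= learning_rate * grad_b / len(rows)
        weights = [w - learning_rate * g / len(rows)
                   for w, g in zip(weights, grad_w)]
    return weights, bias

# Made-up data matrix: one row per image, three columns
# (amount of blue, variance in pixel brightness, proportion of non-white pixels).
rows = [
    (0.10, 0.20, 0.30), (0.15, 0.25, 0.35), (0.20, 0.15, 0.25),
    (0.80, 0.60, 0.55), (0.75, 0.70, 0.60), (0.90, 0.65, 0.50),
]
labels = [1, 1, 1, 0, 0, 0]  # 1 = humans said elliptical, 0 = spiral

weights, bias = fit_logistic(rows, labels)

# Estimated classifications for two new, unlabeled images.
p_low_blue = predict(weights, bias, (0.12, 0.22, 0.28))
p_high_blue = predict(weights, bias, (0.85, 0.60, 0.50))
print(round(p_low_blue, 2), round(p_high_blue, 2))
```

The fitted weights play the role of the model parameters described above: once they are estimated from the human-labeled rows, `predict` can be applied to any number of new images without further human effort.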

Figure 5.4: Simplified description of how Banerji et al. (2010) used the Galaxy Zoo classifications to train a machine learning model to do galaxy classification. Images of galaxies were converted into a matrix of features. In this simplified example there are three features (the amount of blue in the image, the variance in the brightness of the pixels, and the proportion of non-white pixels). Then, for a subset of the images, the Galaxy Zoo labels are used to train a machine learning model. Finally, the machine learning model is used to estimate classifications for the remaining galaxies. I call this kind of project a second-generation human computation project because, rather than having humans solve a problem, it has humans build a dataset that can be used to train a computer to solve the problem. The advantage of this computer-assisted approach is that it enables you to handle essentially infinite amounts of data using only a finite amount of human effort.

The features in the machine learning model of Banerji et al. (2010) were more complex than those in my toy example (for example, she used features like "de Vaucouleurs fit axial ratio"), and her model was not logistic regression; it was an artificial neural network. Using her features, her model, and the consensus Galaxy Zoo classifications, she was able to create weights on each feature and then use those weights to make predictions about the classification of galaxies. For example, her analysis found that images with a low "de Vaucouleurs fit axial ratio" were more likely to be spiral galaxies. Given these weights, she was able to predict the human classification of a galaxy with reasonable accuracy.

The work of Banerji et al. (2010) turned Galaxy Zoo into what I would call a second-generation human computation system. The best way to think about these second-generation systems is that rather than having humans solve a problem, they have humans build a dataset that can be used to train a computer to solve the problem. The amount of data needed to train the computer can be so large that it requires a human mass collaboration to create. In the case of Galaxy Zoo, the neural networks used by Banerji et al. (2010) required a very large number of human-labeled examples in order to build a model that was able to reliably reproduce the human classifications.

The advantage of this computer-assisted approach is that it enables you to handle essentially infinite amounts of data using only a finite amount of human effort. For example, a researcher with a million human-classified galaxies can build a predictive model that can then be used to classify a billion or even a trillion galaxies. If there are enormous numbers of galaxies, then this kind of human-computer hybrid is really the only possible solution. This infinite scalability is not free, however. Building a machine learning model that can correctly reproduce the human classifications is itself a hard problem, but fortunately there are already excellent books dedicated to this topic (Hastie, Tibshirani, and Friedman 2009; Murphy 2012; James et al. 2013).

Galaxy Zoo shows the evolution of many human computation projects. First, a researcher attempts the project alone or with a small team of research assistants (e.g., Schawinski's initial classification effort). If this approach does not scale well, the researcher can move to a human computation project where many people contribute classifications. But, for a certain volume of data, pure human effort will not be enough. At that point, researchers need to build second-generation systems in which human classifications are used to train a machine learning model that can then be applied to virtually unlimited amounts of data.