5.2.1 Galaxy Zoo

Galaxy Zoo combined the efforts of many non-expert volunteers to classify a million galaxies.

Galaxy Zoo grew out of a problem faced by Kevin Schawinski, a graduate student in Astronomy at the University of Oxford in 2007. Simplifying quite a bit, Schawinski was interested in galaxies, and galaxies can be classified by their morphology (elliptical or spiral) and by their color (blue or red). At the time, the conventional wisdom among astronomers was that spiral galaxies, like our Milky Way, were blue in color (indicating youth) and that elliptical galaxies were red in color (indicating old age). Schawinski doubted this conventional wisdom. He suspected that while this pattern might be true in general, there were probably a sizable number of exceptions, and that by studying many of these unusual galaxies, the ones that did not fit the expected pattern, he could learn something about the process through which galaxies formed.

Thus, what Schawinski needed in order to overturn conventional wisdom was a large set of morphologically classified galaxies; that is, galaxies that had been classified as either spiral or elliptical. The problem, however, was that existing algorithmic methods for classification were not yet good enough to be used for scientific research; in other words, classifying galaxies was, at that time, a problem that was hard for computers. Therefore, what was needed was a large number of human-classified galaxies. Schawinski undertook this classification problem with the enthusiasm of a graduate student. In a marathon session of seven 12-hour days, he was able to classify 50,000 galaxies. While 50,000 galaxies may sound like a lot, it is actually only about 5% of the almost one million galaxies that had been photographed in the Sloan Digital Sky Survey. Schawinski realized that he needed a more scalable approach.

Fortunately, it turns out that the task of classifying galaxies does not require advanced training in astronomy; you can teach someone to do it pretty quickly. In other words, even though classifying galaxies is a task that was hard for computers, it was pretty easy for humans. So, while sitting in a pub in Oxford, Schawinski and fellow astronomer Chris Lintott dreamed up a website where volunteers would classify images of galaxies. A few months later, Galaxy Zoo was born.

At the Galaxy Zoo website, volunteers would undergo a few minutes of training; for example, learning the difference between a spiral and an elliptical galaxy (Figure 5.2). After this training, each volunteer had to pass a relatively easy quiz, correctly classifying 11 of 15 galaxies with known classifications, and would then begin real classification of unknown galaxies through a simple web-based interface (Figure 5.3). The transition from volunteer to astronomer would take place in less than 10 minutes and required only passing the lowest of hurdles, a simple quiz.
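As a rough check on how low this hurdle is, the sketch below computes the chance that a volunteer who guesses at random would still pass the quiz. It assumes, purely for illustration, that each quiz galaxy has two equally likely answers and that guesses are independent; the real quiz interface may have offered more options.

```python
from math import comb

QUIZ_SIZE = 15   # galaxies in the quiz (from the text)
PASS_MARK = 11   # correct answers needed to pass (from the text)
P_GUESS = 0.5    # assumption: two equally likely answers per galaxy


def pass_probability(n, k_min, p):
    """Binomial probability of at least k_min correct answers out of n."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))


p_pass = pass_probability(QUIZ_SIZE, PASS_MARK, P_GUESS)
print(f"Chance that pure guessing passes the quiz: {p_pass:.1%}")
```

Under these assumptions a pure guesser passes about 6% of the time, so the quiz filters out most random clicking while staying easy for anyone who followed the training.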

Figure 5.2: Examples of the two main types of galaxies: spiral and elliptical. The Galaxy Zoo project used more than 100,000 volunteers to categorize more than 900,000 images. Source: www.galaxyzoo.org.

Figure 5.3: Input screen where volunteers were asked to classify a single image. Source: www.galaxyzoo.org.

Galaxy Zoo attracted its initial volunteers after the project was featured in a news article, and within about six months the project grew to involve more than 100,000 citizen scientists, people who participated because they enjoyed the task and wanted to help advance astronomy. Together, these 100,000 volunteers contributed a total of more than 40 million classifications, with the majority of the classifications coming from a relatively small, core group of participants (Lintott et al. 2008).

Researchers who have experience hiring undergraduate research assistants might immediately be skeptical about data quality. While this skepticism is reasonable, Galaxy Zoo shows that when volunteer contributions are correctly cleaned, debiased, and aggregated, they can produce high-quality results (Lintott et al. 2008). An important trick for getting the crowd to create professional-quality data is redundancy; that is, having the same task performed by many different people. In Galaxy Zoo, there were about 40 classifications per galaxy; researchers using undergraduate research assistants could never afford this level of redundancy and would therefore need to be much more concerned with the quality of each individual classification. What the volunteers lacked in training, they made up for with redundancy.
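To see why redundancy can compensate for limited training, here is a small simulation sketch. It assumes, hypothetically, that each volunteer labels a galaxy correctly 70% of the time and that volunteers err independently; neither number comes from Galaxy Zoo. It then compares a single classification with a simple majority over 40 classifications.

```python
import random


def majority_vote_accuracy(n_votes, p_correct, n_galaxies, rng):
    """Share of galaxies whose majority label is right when each vote is
    independently correct with probability p_correct (ties count as wrong)."""
    hits = 0
    for _ in range(n_galaxies):
        correct_votes = sum(rng.random() < p_correct for _ in range(n_votes))
        if correct_votes > n_votes / 2:
            hits += 1
    return hits / n_galaxies


rng = random.Random(0)   # fixed seed so the run is repeatable
P_CORRECT = 0.7          # hypothetical accuracy of one volunteer

single = majority_vote_accuracy(1, P_CORRECT, 2000, rng)
crowd = majority_vote_accuracy(40, P_CORRECT, 2000, rng)
print(f"one classification per galaxy: {single:.1%} correct")
print(f"40 classifications per galaxy: {crowd:.1%} correct")
```

The gain depends on the independence assumption: if volunteers tend to make the same mistake, extra votes help much less.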

Even with multiple classifications per galaxy, however, combining the set of volunteer classifications to produce a consensus classification was tricky. Because very similar challenges arise in most human computation projects, it is helpful to briefly review the three steps that the Galaxy Zoo researchers used to produce their consensus classifications. First, the researchers "cleaned" the data by removing bogus classifications. For example, people who repeatedly classified the same galaxy, something that would happen if they were trying to manipulate the results, had all of their classifications discarded. This and other similar cleaning removed about 4% of all classifications.
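The cleaning rule described above can be sketched in a few lines. This is an illustration only: the record layout `(volunteer, galaxy, label)`, the function name `discard_repeat_classifiers`, and the `max_repeats` threshold are hypothetical and not taken from the Galaxy Zoo pipeline.

```python
from collections import Counter


def discard_repeat_classifiers(classifications, max_repeats=2):
    """Drop every classification made by a volunteer who labeled the same
    galaxy more than max_repeats times (hypothetical threshold)."""
    per_pair = Counter((volunteer, galaxy) for volunteer, galaxy, _ in classifications)
    flagged = {volunteer for (volunteer, _), n in per_pair.items() if n > max_repeats}
    return [row for row in classifications if row[0] not in flagged]


raw = [
    ("ann", "g1", "spiral"),
    ("bob", "g1", "spiral"),
    ("bob", "g2", "elliptical"),
    ("eve", "g3", "elliptical"),
    ("eve", "g3", "elliptical"),
    ("eve", "g3", "elliptical"),   # third vote on g3: eve gets flagged
    ("eve", "g4", "spiral"),       # removed too, since all of eve's votes go
]

cleaned = discard_repeat_classifiers(raw)
print(f"kept {len(cleaned)} of {len(raw)} classifications")
```

Note that the rule removes all of a flagged volunteer's classifications, not only the repeated ones, which matches the description in the text.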

Second, after cleaning, the researchers needed to remove systematic biases in the classifications. Through a series of bias-detection studies embedded within the original project (for example, showing some volunteers the galaxy in monochrome instead of color), the researchers discovered several systematic biases, such as a systematic bias to classify faraway spiral galaxies as elliptical galaxies (Bamford et al. 2009). Adjusting for these systematic biases is extremely important because averaging many contributions does not remove systematic bias; it only removes random error.
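The difference between random error and systematic bias can be illustrated with a short simulation. The error rates below are invented for the example: volunteers are assumed to mislabel a nearby spiral 30% of the time (random error) and a faraway spiral 60% of the time (a systematic pull toward "elliptical").

```python
import random


def majority_correct_share(p_wrong, n_votes, n_galaxies, rng):
    """Share of galaxies whose majority label is correct when each vote is
    wrong independently with probability p_wrong."""
    hits = 0
    for _ in range(n_galaxies):
        wrong_votes = sum(rng.random() < p_wrong for _ in range(n_votes))
        if wrong_votes < n_votes / 2:
            hits += 1
    return hits / n_galaxies


rng = random.Random(0)
ERROR_RATES = {"nearby": 0.3, "faraway": 0.6}   # hypothetical values

results = {}
for kind, p_wrong in ERROR_RATES.items():
    for n_votes in (5, 40, 400):
        results[(kind, n_votes)] = majority_correct_share(p_wrong, n_votes, 1000, rng)
        print(f"{kind:8s} {n_votes:4d} votes: {results[(kind, n_votes)]:.1%} correct")
```

In this toy setup more votes push the nearby galaxies toward the right answer and the faraway galaxies toward the wrong one, which is why debiasing has to be a separate step.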

Finally, after debiasing, the researchers needed a method to combine the individual classifications to produce a consensus classification. The simplest way to combine the classifications for each galaxy would have been to choose the most common classification. However, this approach would have given each volunteer equal weight, and the researchers suspected that some volunteers were better at classification than others. Therefore, the researchers developed a more complex iterative weighting procedure that attempted to automatically detect the best classifiers and give them more weight.
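The sketch below shows one simple way an iterative weighting scheme can work. It is not the procedure used by the Galaxy Zoo team, whose details are not given here. In this assumed version, each round computes a weighted vote per galaxy and then resets each volunteer's weight to the share of their votes that agree with the current consensus.

```python
from collections import defaultdict


def weighted_consensus(votes, n_iter=5):
    """votes: list of (volunteer, galaxy, label). Returns (consensus, weights)."""
    weights = {volunteer: 1.0 for volunteer, _, _ in votes}   # start equal
    consensus = {}
    for _ in range(n_iter):
        # Step 1: weighted vote per galaxy (ties broken alphabetically).
        tally = defaultdict(lambda: defaultdict(float))
        for volunteer, galaxy, label in votes:
            tally[galaxy][label] += weights[volunteer]
        consensus = {g: max(sorted(t), key=t.get) for g, t in tally.items()}
        # Step 2: a volunteer's new weight is their agreement rate.
        agree, total = defaultdict(int), defaultdict(int)
        for volunteer, galaxy, label in votes:
            total[volunteer] += 1
            agree[volunteer] += label == consensus[galaxy]
        weights = {v: agree[v] / total[v] for v in total}
    return consensus, weights


# Toy data: a, b, c are reliable; d and e always give the opposite label.
truth = {"g1": "spiral", "g2": "elliptical", "g3": "spiral"}
flip = {"spiral": "elliptical", "elliptical": "spiral"}
votes = []
for galaxy, label in truth.items():
    votes += [(v, galaxy, label) for v in ("a", "b", "c")]
    votes += [(v, galaxy, flip[label]) for v in ("d", "e")]
# g4 was seen only by one reliable and two unreliable volunteers.
votes += [("a", "g4", "spiral"), ("d", "g4", "elliptical"), ("e", "g4", "elliptical")]

consensus, weights = weighted_consensus(votes)
print(consensus["g4"], weights)
```

With equal weights, g4 would be labeled elliptical by a 2-to-1 vote. After reweighting, the two volunteers who disagreed with the consensus everywhere else lose their influence and g4 is labeled spiral.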

Thus, through a three-step process of cleaning, debiasing, and weighting, the Galaxy Zoo research team converted 40 million volunteer classifications into a set of consensus morphological classifications. When these Galaxy Zoo classifications were compared with three previous smaller-scale attempts by professional astronomers, including the classification by Schawinski that helped to inspire Galaxy Zoo, there was strong agreement. Thus, the volunteers, in aggregate, were able to provide high-quality classifications at a scale that the researchers could not match (Lintott et al. 2008). In fact, by having human classifications for such a large number of galaxies, Schawinski, Lintott, and others were able to show that only about 80% of galaxies follow the expected pattern of blue spirals and red ellipticals, and numerous papers have been written about this discovery (Fortson et al. 2011).

Given this background, we can now see how Galaxy Zoo follows the split-apply-combine recipe, the same recipe that is used for most human computation projects. First, a big problem is split into chunks. In this case, the problem of classifying a million galaxies was split into a million problems of classifying one galaxy. Next, an operation is applied to each chunk independently. In this case, a volunteer classified each galaxy as either spiral or elliptical. Finally, the results are combined to produce a consensus result. In this case, the combine step included the cleaning, debiasing, and weighting to produce a consensus classification for each galaxy. Even though most projects use this general recipe, each of the steps needs to be customized to the specific problem being addressed. For example, in the human computation project described below, the same recipe will be followed, but the apply and combine steps will be quite different.
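The split-apply-combine recipe can be written down as three small functions. In this sketch the "apply" step, which in Galaxy Zoo is human work, is replaced by a lookup in a hypothetical table of volunteer labels, and the "combine" step is a plain majority vote standing in for cleaning, debiasing, and weighting.

```python
from collections import Counter

# Hypothetical stand-in for human work: labels volunteers gave each galaxy.
VOLUNTEER_LABELS = {
    "g1": ["spiral", "spiral", "elliptical"],
    "g2": ["elliptical", "elliptical", "elliptical"],
    "g3": ["spiral", "elliptical", "spiral"],
}


def split_catalog(catalog):
    """Split: turn one big problem into one chunk per galaxy."""
    return list(catalog)


def apply_to_chunk(galaxy_id):
    """Apply: gather the volunteer labels for a single galaxy."""
    return galaxy_id, VOLUNTEER_LABELS[galaxy_id]


def combine_results(results):
    """Combine: majority vote per galaxy (placeholder for clean/debias/weight)."""
    return {gid: Counter(labels).most_common(1)[0][0] for gid, labels in results}


chunks = split_catalog(VOLUNTEER_LABELS)
consensus = combine_results(apply_to_chunk(chunk) for chunk in chunks)
print(consensus)
```

Because each chunk is handled independently, the apply step can be spread across as many volunteers as are available; only the combine step needs to see all of the results.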

For the Galaxy Zoo team, this first project was just the beginning. Very quickly they realized that even though they were able to classify close to a million galaxies, this scale is not enough to work with newer digital sky surveys, which can produce images of about 10 billion galaxies (Kuminski et al. 2014). To handle an increase from 1 million to 10 billion, a factor of 10,000, Galaxy Zoo would need to recruit roughly 10,000 times more participants. Even though the number of volunteers on the Internet is large, it is not infinite. Therefore, the researchers realized that if they were going to handle ever-growing amounts of data, a new, even more scalable, approach was needed.
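The scaling argument is simple arithmetic, sketched below with the round numbers quoted in this section. It assumes that each volunteer would do the same amount of work as in the first project and that the same redundancy of about 40 classifications per galaxy would be kept.

```python
galaxies_first_project = 1_000_000      # close to a million galaxies
galaxies_new_surveys = 10_000_000_000   # about 10 billion galaxies
volunteers_first_project = 100_000      # volunteers in the first project
classifications_per_galaxy = 40         # redundancy in the first project

# How much larger the new surveys are than the first project.
scale_factor = galaxies_new_surveys // galaxies_first_project

# Volunteers needed if each one contributes as much as before (assumption).
volunteers_needed = volunteers_first_project * scale_factor

# Classifications needed at the same redundancy (assumption).
classifications_needed = galaxies_new_surveys * classifications_per_galaxy

print(f"scale factor:           {scale_factor:,}")
print(f"volunteers needed:      {volunteers_needed:,}")
print(f"classifications needed: {classifications_needed:,}")
```

Under these assumptions the project would need on the order of a billion volunteers, which makes concrete why a purely human approach cannot keep up.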

Therefore, Manda Banerji, working with Kevin Schawinski, Chris Lintott, and other members of the Galaxy Zoo team, started teaching computers to classify galaxies. More specifically, using the human classifications created by Galaxy Zoo, Banerji et al. (2010) built a machine learning model that could predict the human classification of a galaxy based on the characteristics of the image. If this machine learning model could reproduce the human classifications with high accuracy, then it could be used by Galaxy Zoo researchers to classify an essentially infinite number of galaxies.

The core of Banerji and colleagues' approach is actually pretty similar to techniques commonly used in social research, although that similarity might not be clear at first glance. First, Banerji and colleagues converted each image into a set of numerical features that summarize its properties. For example, for images of galaxies there could be three features: the amount of blue in the image, the variance in the brightness of the pixels, and the proportion of non-white pixels. The selection of the correct features is an important part of the problem, and it generally requires subject-area expertise. This first step, commonly called feature engineering, results in a data matrix with one row per image and three columns describing that image. Given the data matrix and the desired output (e.g., whether the image was classified by a human as an elliptical galaxy), the researcher estimates the parameters of a statistical model, for example, something like a logistic regression, that predicts the human classification based on the features of the image. Finally, the researcher uses the parameters in this statistical model to produce estimated classifications of new galaxies (Figure 5.4). To think of a social analog, imagine that you had demographic information about a million students, and you know whether they graduated from college or not. You could fit a logistic regression to these data, and then you could use the resulting model parameters to predict whether new students are going to graduate from college. In machine learning, this approach of using labeled examples to create a statistical model that can then label new data is called supervised learning (Hastie, Tibshirani, and Friedman 2009).
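The toy version of this workflow can be sketched end to end in plain Python. Everything numeric here is invented for illustration: the three feature distributions, the 300/200 split between labeled and unlabeled galaxies, and the gradient-descent settings. The model is the logistic regression of the toy example, not the model used by Banerji et al. (2010).

```python
import math
import random


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_galaxy(rng):
    """One synthetic galaxy as (features, label); label 1 means elliptical.
    Features: [amount of blue, brightness variance, proportion non-white].
    The distributions are made up for this example."""
    elliptical = rng.random() < 0.5
    blue = rng.gauss(0.3 if elliptical else 0.7, 0.1)
    variance = rng.gauss(0.3 if elliptical else 0.6, 0.1)
    nonwhite = rng.uniform(0.2, 0.8)   # carries no signal in this toy setup
    return [blue, variance, nonwhite], int(elliptical)


def fit_logistic(X, y, lr=1.0, epochs=1000):
    """Fit logistic regression by batch gradient descent on the log loss."""
    n, k = len(X), len(X[0])
    w, b = [0.0] * k, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * k, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(k):
                grad_w[j] += err * xi[j]
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    """Predicted label: 1 (elliptical) if the fitted probability is >= 0.5."""
    return int(sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) >= 0.5)


rng = random.Random(42)
galaxies = [make_galaxy(rng) for _ in range(500)]   # the data matrix plus labels
train, rest = galaxies[:300], galaxies[300:]

# Estimate parameters on the "human-labeled" subset, then label the rest.
w, b = fit_logistic([x for x, _ in train], [y for _, y in train])
accuracy = sum(predict(w, b, x) == y for x, y in rest) / len(rest)
print(f"weights={[round(v, 2) for v in w]}, intercept={b:.2f}")
print(f"agreement with held-out labels: {accuracy:.1%}")
```

In a real project the held-out labels would be human classifications, so the agreement figure measures how well the model reproduces human judgment rather than some ground truth.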

Figure 5.4: Simplified description of how Banerji et al. (2010) used the Galaxy Zoo classifications to train a machine learning model to do galaxy classification. Images of galaxies were converted into a matrix of features. In this simplified example there are three features (the amount of blue in the image, the variance in the brightness of the pixels, and the proportion of non-white pixels). Then, for a subset of the images, the Galaxy Zoo labels are used to train a machine learning model. Finally, the machine learning model is used to estimate classifications for the remaining galaxies. I call this kind of project a second-generation human computation project because, rather than having humans solve a problem, they have humans build a dataset that can be used to train a computer to solve the problem. The advantage of this computer-assisted approach is that it enables you to handle essentially infinite amounts of data using only a finite amount of human effort.

The features in the machine learning model of Banerji et al. (2010) were more complex than those in my toy example; for example, she used features like "de Vaucouleurs fit axial ratio". Her model was also not a logistic regression; it was an artificial neural network. Using her features, her model, and the consensus Galaxy Zoo classifications, she was able to create weights on each feature and then use these weights to make predictions about the classification of galaxies. For example, her analysis found that images with a low "de Vaucouleurs fit axial ratio" were more likely to be spiral galaxies. Given these weights, she was able to predict the human classification of a galaxy with reasonable accuracy.

The work of Banerji et al. (2010) turned Galaxy Zoo into what I would call a second-generation human computation system. The best way to think about these second-generation systems is that rather than having humans solve a problem, they have humans build a dataset that can be used to train a computer to solve the problem. The amount of data needed to train the computer can be so large that it requires a human mass collaboration to create. In the case of Galaxy Zoo, the neural networks used by Banerji et al. (2010) required a very large number of human-labeled examples in order to build a model that was able to reliably reproduce the human classification.

The advantage of this computer-assisted approach is that it enables you to handle essentially infinite amounts of data using only a finite amount of human effort. For example, a researcher with a million human-classified galaxies can build a predictive model that can then be used to classify a billion or even a trillion galaxies. If there are enormous numbers of galaxies, then this kind of human-computer hybrid is really the only possible solution. This infinite scalability is not free, however. Building a machine learning model that can correctly reproduce the human classifications is itself a hard problem, but fortunately there are already excellent books dedicated to this topic (Hastie, Tibshirani, and Friedman 2009; Murphy 2012; James et al. 2013).

Galaxy Zoo shows the evolution of many human computation projects. First, a researcher attempts the project alone or with a small team of research assistants (e.g., Schawinski's initial classification effort). If this approach does not scale well, the researcher can move to a human computation project in which many people contribute classifications. But for a certain volume of data, pure human effort will not be enough. At that point, researchers need to build second-generation systems in which human classifications are used to train a machine learning model that can then be applied to virtually unlimited amounts of data.