5.3.1 Netflix Prize

The Netflix Prize uses an open call to predict which movies people will like.

The most well-known open call project is the Netflix Prize. Netflix is an online movie rental company, and in 2000 it launched Cinematch, a service to recommend movies to customers. For example, Cinematch might notice that you liked Star Wars and The Empire Strikes Back and then recommend that you watch Return of the Jedi. Initially, Cinematch worked poorly. But, over the course of many years, it continued to improve its ability to predict which movies customers would enjoy. By 2006, however, progress on Cinematch had plateaued. The researchers at Netflix had tried pretty much everything they could think of, but, at the same time, they suspected that there were other ideas that might help them improve their system. Thus, they came up with what was, at the time, a radical solution: an open call.

Critical to the eventual success of the Netflix Prize was how the open call was designed, and this design has important lessons for how open calls can be used for social research. Netflix did not just put out an unstructured request for ideas, which is what many people imagine when they first consider an open call. Rather, Netflix posed a clear problem with a simple evaluation procedure: they challenged people to use a set of 100 million movie ratings to predict 3 million held-out ratings (ratings that users had made but that Netflix did not release). The first person to create an algorithm that predicted the 3 million held-out ratings 10% better than Cinematch would win a million dollars. This clear and easy-to-apply evaluation procedure (comparing predicted ratings with held-out ratings) meant that the Netflix Prize was framed in such a way that solutions were easier to check than to generate; it turned the challenge of improving Cinematch into a problem suitable for an open call.

In October of 2006, Netflix released a dataset containing 100 million movie ratings from about 500,000 customers (we will consider the privacy implications of this data release in chapter 6). The Netflix data can be conceptualized as a huge matrix that is approximately 500,000 customers by 20,000 movies. Within this matrix, there were about 100 million ratings on a scale from one to five stars (table 5.2). The challenge was to use the observed data in the matrix to predict the 3 million held-out ratings.

Table 5.2: Schematic of Data from the Netflix Prize
Movie 1 Movie 2 Movie 3 ... Movie 20,000
Customer 1 2 5 ... ?
Customer 2 2 ? ... 3
Customer 3 ? 2 ...
\(\vdots\) \(\vdots\) \(\vdots\) \(\vdots\) \(\vdots\)
Customer 500,000 ? 2 ... 1
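To make the matrix picture concrete, the sketch below stores ratings sparsely, keyed by (customer, movie), and sets some of them aside as held-out ratings. The customer and movie names, the rating values, and the helper `split_held_out` are all hypothetical, and sampling the held-out ratings uniformly at random is an assumption made for illustration; the text does not say how Netflix chose them.

```python
import random

def split_held_out(ratings, n_held_out, seed=0):
    """Split a sparse ratings dict into (observed, held_out).

    `ratings` maps (customer, movie) -> stars. Uniform random sampling
    is an assumption for illustration, not Netflix's actual procedure.
    """
    rng = random.Random(seed)
    keys = sorted(ratings)  # sort for a reproducible sample
    held_keys = set(rng.sample(keys, n_held_out))
    observed = {k: v for k, v in ratings.items() if k not in held_keys}
    held_out = {k: ratings[k] for k in held_keys}
    return observed, held_out

# Hypothetical toy data: most (customer, movie) pairs have no rating at all,
# which corresponds to the "?" cells in the schematic.
ratings = {
    ("customer_1", "movie_1"): 2,
    ("customer_1", "movie_2"): 5,
    ("customer_2", "movie_1"): 2,
    ("customer_2", "movie_4"): 3,
    ("customer_3", "movie_3"): 2,
    ("customer_4", "movie_2"): 2,
    ("customer_4", "movie_4"): 1,
}

observed, held_out = split_held_out(ratings, n_held_out=2)
```

Participants would see only `observed`; the organizer keeps `held_out` private and uses it to score submitted predictions.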

Researchers and hackers around the world were drawn to the challenge, and by 2008 more than 30,000 people were working on it (Thompson 2008). Over the course of the contest, Netflix received more than 40,000 proposed solutions from more than 5,000 teams (Netflix 2009). Obviously, Netflix could not read and understand all these proposed solutions. The whole thing ran smoothly, however, because the solutions were easy to check. Netflix could just have a computer compare the predicted ratings with the held-out ratings using a prespecified metric (the particular metric they used was the square root of the mean squared error). It was this ability to quickly evaluate solutions that enabled Netflix to accept solutions from everyone, which turned out to be important because good ideas came from some surprising places. In fact, the winning solution was submitted by a team started by three researchers who had no prior experience building movie recommendation systems (Bell, Koren, and Volinsky 2010).
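The check described above is simple enough to sketch in a few lines. The example computes the square root of the mean squared error for two sets of predictions and the relative improvement of one over the other. The ratings, the two sets of predictions, and the function names are hypothetical, and reading "10% better" as a 10% lower error is an assumption made here for illustration.

```python
import math

def rmse(predicted, held_out):
    """Square root of the mean squared error over the held-out pairs."""
    squared_errors = [
        (predicted[key] - actual) ** 2 for key, actual in held_out.items()
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def relative_improvement(baseline_rmse, candidate_rmse):
    """Fraction by which the candidate lowers the baseline's error."""
    return (baseline_rmse - candidate_rmse) / baseline_rmse

# Hypothetical held-out ratings and two sets of predictions.
held_out = {("u1", "m1"): 4, ("u2", "m1"): 2, ("u3", "m2"): 5}
baseline = {("u1", "m1"): 3, ("u2", "m1"): 3, ("u3", "m2"): 3}
candidate = {("u1", "m1"): 4, ("u2", "m1"): 3, ("u3", "m2"): 4}

baseline_score = rmse(baseline, held_out)
candidate_score = rmse(candidate, held_out)
meets_threshold = relative_improvement(baseline_score, candidate_score) >= 0.10
```

Nothing in this check depends on how the predictions were produced or who produced them, which is why a computer could score tens of thousands of submissions.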

One beautiful aspect of the Netflix Prize is that it enabled all the proposed solutions to be evaluated fairly. That is, when people uploaded their predicted ratings, they did not need to upload their academic credentials, their age, race, gender, sexual orientation, or anything about themselves. The predicted ratings of a famous professor from Stanford were treated exactly the same as those from a teenager in her bedroom. Unfortunately, this is not true in most social research. That is, for most social research, evaluation is very time-consuming and partially subjective. So, most research ideas are never seriously evaluated, and when ideas are evaluated, it is hard to detach those evaluations from the creator of the ideas. Open call projects, on the other hand, have easy and fair evaluation, so they can discover ideas that would otherwise be missed.

For example, at one point during the Netflix Prize, someone with the screen name Simon Funk posted on his blog a proposed solution based on a singular value decomposition, an approach from linear algebra that had not been used previously by other participants. Funk's blog post was simultaneously technical and weirdly informal. Was this blog post describing a good solution or was it a waste of time? Outside of an open call project, the solution might never have received serious evaluation. After all, Simon Funk was not a professor at MIT; he was a software developer who, at the time, was backpacking around New Zealand (Piatetsky 2007). If he had emailed this idea to an engineer at Netflix, it almost certainly would not have been read.

Fortunately, because the evaluation criteria were clear and easy to apply, his predicted ratings were evaluated, and it was instantly clear that his approach was very powerful: he rocketed to fourth place in the competition, a tremendous result given that other teams had already been working for months on the problem. In the end, parts of his approach were used by virtually all serious competitors (Bell, Koren, and Volinsky 2010).
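For readers curious what a decomposition-based predictor can look like, here is a minimal sketch. It is not Funk's code. It is a generic latent-factor model, fitted by stochastic gradient descent on the observed ratings only, which is one common way to approximate a low-rank decomposition when most entries of the matrix are missing. The data, the function names, and all parameter values (`n_factors`, `n_epochs`, `lr`, `reg`) are assumptions chosen for illustration.

```python
import random

def factorize(observed, n_factors=2, n_epochs=500, lr=0.05, reg=0.02, seed=0):
    """Fit rating ~ mean + dot(p[customer], q[movie]) on observed ratings."""
    rng = random.Random(seed)
    customers = sorted({c for c, _ in observed})
    movies = sorted({m for _, m in observed})
    # Small random starting values for each customer and movie factor vector.
    p = {c: [rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for c in customers}
    q = {m: [rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for m in movies}
    mean = sum(observed.values()) / len(observed)
    for _ in range(n_epochs):
        for (c, m), rating in observed.items():
            pred = mean + sum(pc * qm for pc, qm in zip(p[c], q[m]))
            err = rating - pred
            for f in range(n_factors):
                pc, qm = p[c][f], q[m][f]
                # Gradient step on squared error with L2 regularization.
                p[c][f] += lr * (err * qm - reg * pc)
                q[m][f] += lr * (err * pc - reg * qm)
    return mean, p, q

def predict(model, customer, movie):
    """Predicted stars, clipped to the one-to-five scale."""
    mean, p, q = model
    raw = mean + sum(pc * qm for pc, qm in zip(p[customer], q[movie]))
    return min(5.0, max(1.0, raw))

# Hypothetical toy ratings: two groups of customers with opposite tastes.
observed = {
    ("ann", "A"): 5, ("ann", "B"): 4, ("ann", "C"): 1,
    ("bob", "A"): 4, ("bob", "B"): 5, ("bob", "D"): 1,
    ("cat", "C"): 5, ("cat", "D"): 4, ("cat", "A"): 1,
    ("dan", "C"): 4, ("dan", "D"): 5,
}

model = factorize(observed)
# A rating that was never observed: what might dan think of movie A?
guess = predict(model, "dan", "A")
```

The fitted factor vectors can be combined for any customer and movie pair, including pairs with no observed rating, which is what makes this kind of model usable for predicting held-out ratings.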

The fact that Simon Funk chose to write a blog post explaining his approach, rather than trying to keep it secret, also illustrates that many participants in the Netflix Prize were not exclusively motivated by the million-dollar prize. Rather, many participants also seemed to enjoy the intellectual challenge and the community that developed around the problem (Thompson 2008), feelings that I expect many researchers can understand.

The Netflix Prize is a classic example of an open call. Netflix posed a question with a specific goal (predicting movie ratings) and solicited solutions from many people. Netflix was able to evaluate all these solutions because they were easier to check than to create, and ultimately Netflix picked the best solution. Next, I'll show you how this same approach can be used in biology and law, and without a million-dollar prize.