## Mathematical notes

In this appendix, I will describe some of the ideas from the chapter in a slightly more mathematical form. The goal here is to help you get comfortable with the notation and mathematical framework used by survey researchers so that you can transition to some of the more technical material written on these topics. I will start by introducing probability sampling, then move to probability sampling with nonresponse, and finally, non-probability sampling.

Probability sampling

As a running example, let's consider the goal of estimating the unemployment rate in the United States. Let $$U = \{1, \ldots, k, \ldots, N\}$$ be the target population and let $$y_k$$ be the value of the outcome variable for person $$k$$. In this example, $$y_k$$ is whether person $$k$$ is unemployed. Finally, let $$F = \{1, \ldots, k, \ldots, N\}$$ be the frame population, which for the sake of simplicity is assumed to be the same as the target population.

A basic sampling design is simple random sampling without replacement. In this case, each person is equally likely to be included in the sample $$s = \{1, \ldots, i, \ldots, n\}$$. When the data are collected with this sampling design, a researcher can estimate the population unemployment rate with the sample mean:

$\hat{\bar{y}} = \frac{\sum_{i \in s} y_i}{n} \qquad(3.1)$

where $$\bar{y}$$ is the unemployment rate in the population and $$\hat{\bar{y}}$$ is the estimate of the unemployment rate (the $$\hat{ }$$ is commonly used to indicate an estimator).
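As a minimal sketch of eq. 3.1, the following Python draws simple random samples without replacement from a made-up population and averages the resulting sample means. The population (1,000 people, 60 of them unemployed), the sample size, and the function name `sample_mean_srs` are all assumptions chosen for illustration.

```python
import random

def sample_mean_srs(y, n, rng):
    """Draw a simple random sample without replacement and return its mean (eq. 3.1)."""
    sample = rng.sample(y, n)  # every person is equally likely to be included
    return sum(sample) / n

# Hypothetical population: 1 = unemployed, 0 = not unemployed.
rng = random.Random(0)
population = [1] * 60 + [0] * 940
true_rate = sum(population) / len(population)

# A single estimate from one sample of 200 people.
estimate = sample_mean_srs(population, 200, rng)

# Averaging many independent estimates should land close to the true rate,
# which is what unbiasedness of the sample mean means under this design.
estimates = [sample_mean_srs(population, 200, rng) for _ in range(2000)]
average_estimate = sum(estimates) / len(estimates)
```

Any single `estimate` can miss the true rate by a percentage point or more; it is the average over repeated samples that settles near it.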

In reality, researchers rarely use simple random sampling without replacement. For a variety of reasons (one of which I'll describe in a moment), researchers often create samples with unequal probabilities of inclusion. For example, researchers might select people in Florida with a higher probability of inclusion than people in California. In this case, the sample mean (eq. 3.1) might not be a good estimator. Instead, when there are unequal probabilities of inclusion, researchers use

$\hat{\bar{y}} = \frac{1}{N} \sum_{i \in s} \frac{y_i}{\pi_i} \qquad(3.2)$

where $$\hat{\bar{y}}$$ is the estimate of the unemployment rate and $$\pi_i$$ is person $$i$$'s probability of inclusion. Following standard practice, I'll call the estimator in eq. 3.2 the Horvitz-Thompson estimator. The Horvitz-Thompson estimator is extremely useful because it leads to unbiased estimates for any probability sampling design (Horvitz and Thompson 1952). Because the Horvitz-Thompson estimator comes up so frequently, it is helpful to notice that it can be rewritten as

$\hat{\bar{y}} = \frac{1}{N} \sum_{i \in s} w_i y_i \qquad(3.3)$

where $$w_i = 1 / \pi_i$$. As eq. 3.3 reveals, the Horvitz-Thompson estimator is a weighted sample mean where the weights are inversely related to the probability of selection. In other words, the less likely a person is to be included in the sample, the more weight that person should get in the estimate.
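The sketch below illustrates eqs. 3.2 and 3.3 in Python under stated assumptions: a made-up population in which "Florida" residents have inclusion probability 0.20 and "California" residents 0.05, and Poisson sampling (each person enters the sample independently with probability $$\pi_i$$) as one convenient unequal-probability design. The group sizes, unemployment counts, and function names are hypothetical.

```python
import random

def horvitz_thompson_mean(y_sample, pi_sample, N):
    """Eq. 3.2: (1/N) times the sum of y_i / pi_i over sampled people."""
    return sum(y / pi for y, pi in zip(y_sample, pi_sample)) / N

def weighted_form(y_sample, pi_sample, N):
    """Eq. 3.3: the same estimator written with weights w_i = 1 / pi_i."""
    weights = [1 / pi for pi in pi_sample]
    return sum(w * y for w, y in zip(weights, y_sample)) / N

# Hypothetical population of (y_i, pi_i) pairs.
florida = [(1, 0.20)] * 50 + [(0, 0.20)] * 450       # oversampled group
california = [(1, 0.05)] * 60 + [(0, 0.05)] * 1440   # undersampled group
population = florida + california
N = len(population)
true_rate = sum(y for y, _ in population) / N

def draw_sample(population, rng):
    # Poisson sampling (an assumption): include person i with probability pi_i.
    return [(y, pi) for y, pi in population if rng.random() < pi]

rng = random.Random(1)
ht_estimates, naive_estimates = [], []
for _ in range(3000):
    s = draw_sample(population, rng)
    ys = [y for y, _ in s]
    pis = [pi for _, pi in s]
    ht_estimates.append(horvitz_thompson_mean(ys, pis, N))
    naive_estimates.append(sum(ys) / len(ys))  # unweighted sample mean (eq. 3.1)

mean_ht = sum(ht_estimates) / len(ht_estimates)
mean_naive = sum(naive_estimates) / len(naive_estimates)
```

In this setup the unweighted mean drifts above the true rate because the oversampled group has higher unemployment, while the average of the Horvitz-Thompson estimates stays close to `true_rate`.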

As described earlier, researchers often sample people with unequal probabilities of inclusion. One example of a design that can lead to unequal probabilities of inclusion is stratified sampling, which is important to understand because it is closely related to the estimation procedure called post-stratification. In stratified sampling, a researcher splits the target population into $$H$$ mutually exclusive and exhaustive groups. These groups are called strata and are indicated as $$U_1, \ldots, U_h, \ldots, U_H$$. In this example, the strata are states. The sizes of the groups are indicated as $$N_1, \ldots, N_h, \ldots, N_H$$. A researcher might want to use stratified sampling in order to make sure that she has enough people in each state to make state-level estimates of unemployment.

Once the population has been split up into strata, assume that the researcher selects a simple random sample without replacement of size $$n_h$$, independently from each stratum. Further, assume that everyone selected in the sample becomes a respondent (I'll handle nonresponse in the next section). In this case, the probability of inclusion is

$\pi_i = \frac{n_h}{N_h} \mbox{ for all } i \in h \qquad(3.4)$

Because these probabilities can vary from person to person, when making an estimate from this sampling design, researchers need to weight each respondent by the inverse of their probability of inclusion using the Horvitz-Thompson estimator (eq. 3.2).
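To make eq. 3.4 concrete, here is a small Python sketch that draws a stratified sample and plugs $$\pi_i = n_h / N_h$$ into the Horvitz-Thompson estimator of eq. 3.2. The two strata, their sizes, and the equal allocation of 200 people per stratum are invented for illustration. The final lines check a consequence of the algebra: with these inclusion probabilities, the estimator equals the stratum sample means weighted by $$N_h / N$$.

```python
import random

def stratified_sample(strata, sizes, rng):
    """Draw an SRS without replacement of size n_h independently within each stratum."""
    return {h: rng.sample(strata[h], sizes[h]) for h in strata}

def inclusion_probabilities(strata, sizes):
    """Eq. 3.4: pi_i = n_h / N_h for everyone in stratum h."""
    return {h: sizes[h] / len(strata[h]) for h in strata}

def ht_mean_stratified(samples, pi, N):
    """Eq. 3.2 applied to a stratified sample."""
    return sum(y / pi[h] for h, ys in samples.items() for y in ys) / N

# Hypothetical strata: 1 = unemployed, 0 = not unemployed.
rng = random.Random(3)
strata = {
    "small_state": [1] * 80 + [0] * 920,
    "large_state": [1] * 360 + [0] * 8640,
}
sizes = {"small_state": 200, "large_state": 200}  # equal allocation (an assumption)
N = sum(len(people) for people in strata.values())

samples = stratified_sample(strata, sizes, rng)
pi = inclusion_probabilities(strata, sizes)
ht_estimate = ht_mean_stratified(samples, pi, N)

# Same number, written as stratum sample means weighted by N_h / N.
stratum_weighted = sum(
    (len(strata[h]) / N) * (sum(samples[h]) / sizes[h]) for h in strata
)
```

Equal allocation gives people in the small stratum a much higher inclusion probability (0.2 versus about 0.022), which is exactly the situation where ignoring the weights would distort the estimate.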

Even though the Horvitz-Thompson estimator is unbiased, researchers can produce more accurate (i.e., lower variance) estimates by combining the sample with auxiliary information. Some people find it surprising that this is true even when there is perfectly executed probability sampling. These techniques using auxiliary information are particularly important because, as I will show later, auxiliary information is critical for making estimates from probability samples with nonresponse and from non-probability samples.

One common technique for using auxiliary information is post-stratification. Imagine, for example, that a researcher knows the number of men and women in each of the 50 states; we can denote these group sizes as $$N_1, N_2, \ldots, N_{100}$$. To combine this auxiliary information with the sample, the researcher can split the sample into $$H$$ groups (in this case 100), make an estimate for each group, and then create a weighted average of these group means:

$\hat{\bar{y}}_{post} = \sum_{h \in H} \frac{N_h}{N} \hat{\bar{y}}_h \qquad(3.5)$

Roughly, the estimator in eq. 3.5 is likely to be more accurate because it uses the known population information, the $$N_h$$, to correct estimates if an unbalanced sample happens to be selected. One way to think about it is that post-stratification is like approximating stratification after the data have already been collected.
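A worked example may help with eq. 3.5. The Python sketch below assumes a made-up population split evenly between two groups and a sample that happens to be unbalanced (30 men, 70 women). The group labels, counts, and the function name `post_stratified_mean` are illustrative assumptions.

```python
def post_stratified_mean(sample, group_sizes):
    """Eq. 3.5: weight each group's sample mean by its known population share N_h / N.

    sample is a list of (group, y) pairs; group_sizes maps group -> N_h.
    """
    N = sum(group_sizes.values())
    estimate = 0.0
    for h, N_h in group_sizes.items():
        ys = [y for g, y in sample if g == h]
        if not ys:
            # The estimator is undefined if a group has no sampled members.
            raise ValueError(f"no sampled people in group {h!r}")
        estimate += (N_h / N) * (sum(ys) / len(ys))
    return estimate

# Known population sizes (auxiliary information, hypothetical numbers).
group_sizes = {"men": 500, "women": 500}

# An unbalanced sample: 30 men (6 unemployed) and 70 women (7 unemployed).
sample = (
    [("men", 1)] * 6 + [("men", 0)] * 24
    + [("women", 1)] * 7 + [("women", 0)] * 63
)

naive = sum(y for _, y in sample) / len(sample)       # 13 / 100
post = post_stratified_mean(sample, group_sizes)      # 0.5 * 0.2 + 0.5 * 0.1
```

Here the unweighted mean is 0.13, while the post-stratified estimate is 0.15, because men (whose sample unemployment rate is higher) are underrepresented in the sample relative to their known population share.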

In conclusion, this section has described a few sampling designs: simple random sampling without replacement, sampling with unequal probability, and stratified sampling. It has also described two main ideas about estimation: the Horvitz-Thompson estimator and post-stratification. For a more formal definition of probability sampling designs, see chapter 2 of Särndal, Swensson, and Wretman (2003). For a more formal and complete treatment of stratified sampling, see section 3.7 of Särndal, Swensson, and Wretman (2003). For a technical description of the properties of the Horvitz-Thompson estimator, see Horvitz and Thompson (1952), Overton and Stehman (1995), or section 2.8 of Särndal, Swensson, and Wretman (2003). For a more formal treatment of post-stratification, see Holt and Smith (1979), Smith (1991), Little (1993), or section 7.6 of Särndal, Swensson, and Wretman (2003).

Probability sampling with nonresponse

Almost all real surveys have nonresponse; that is, not everyone in the sample population answers every question. There are two main kinds of nonresponse: item nonresponse and unit nonresponse. In item nonresponse, some respondents don't answer some items (e.g., sometimes respondents don't want to answer questions that they consider sensitive). In unit nonresponse, some people who are selected for the sample population don't respond to the survey at all. The two most common reasons for unit nonresponse are that the sampled person cannot be contacted and that the sampled person is contacted but refuses to participate. In this section, I will focus on unit nonresponse; readers interested in item nonresponse should see Little and Rubin (2002).

Researchers often think about surveys with unit nonresponse as a two-stage sampling process. In the first stage, the researcher selects a sample $$s$$ such that each person has a probability of inclusion $$\pi_i$$ (where $$0 < \pi_i \leq 1$$). Then, in the second stage, people who are selected into the sample respond with probability $$\phi_i$$ (where $$0 < \phi_i \leq 1$$). This two-stage process results in the final set of respondents $$r$$. An important difference between these two stages is that researchers control the process of selecting the sample, but they don't control which of those sampled people become respondents. Putting these two processes together, the probability that someone will be a respondent is

$pr(i \in r) = \pi_i \phi_i \qquad(3.6)$
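The two-stage process behind eq. 3.6 can be sketched as a small simulation. The Python below assumes simple random sampling without replacement in the first stage (so $$\pi_i = n/N$$ for everyone) and two made-up groups with response propensities 0.8 and 0.4. All numbers and names are hypothetical.

```python
import random

def two_stage_respondents(phi, n, rng):
    """Stage 1: SRS without replacement of size n.
    Stage 2: sampled person i responds with probability phi[i]."""
    sampled = rng.sample(range(len(phi)), n)
    return [i for i in sampled if rng.random() < phi[i]]

rng = random.Random(2)
phi = [0.8] * 500 + [0.4] * 500   # hypothetical response propensities
N, n = len(phi), 100
pi = n / N                        # inclusion probability under SRS

reps = 2000
counts = [0] * N                  # how often each person ends up a respondent
for _ in range(reps):
    for i in two_stage_respondents(phi, n, rng):
        counts[i] += 1

# Empirical probability of being a respondent, averaged within each group.
rate_high = sum(counts[:500]) / (500 * reps)
rate_low = sum(counts[500:]) / (500 * reps)
```

Under these assumptions, `rate_high` should be close to $$\pi_i \phi_i = 0.1 \times 0.8$$ and `rate_low` close to $$0.1 \times 0.4$$, matching eq. 3.6.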

For the sake of simplicity, I'll consider the case where the original sample design is simple random sampling without replacement. If a researcher selects a sample of size $$n_s$$ that yields $$n_r$$ respondents, and if the researcher ignores nonresponse and uses the mean of the respondents, then the bias of the estimate will be approximately:

$\mbox{bias of sample mean} = \frac{cor(\phi, y) S(y) S(\phi)}{\bar{\phi}} \qquad(3.7)$

where $$cor(\phi, y)$$ is the population correlation between the response propensity and the outcome (e.g., unemployment status), $$S(y)$$ is the population standard deviation of the outcome (e.g., unemployment status), $$S(\phi)$$ is the population standard deviation of the response propensity, and $$\bar{\phi}$$ is the population mean response propensity (Bethlehem, Cobben, and Schouten 2011, sec. 2.2.4).
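To see eq. 3.7 with numbers, the Python sketch below evaluates it for a made-up population of 1,000 people in which the 100 unemployed people respond with propensity 0.6 and everyone else with propensity 0.3. It then cross-checks the result against the propensity-weighted mean $$\sum \phi_k y_k / \sum \phi_k$$, which is the usual approximation to the expected respondent mean. The population and the function names are assumptions for illustration.

```python
from statistics import fmean, pstdev

def correlation(x, y):
    """Population correlation between two equal-length lists."""
    mx, my = fmean(x), fmean(y)
    cov = fmean([(a - mx) * (b - my) for a, b in zip(x, y)])
    return cov / (pstdev(x) * pstdev(y))

def nonresponse_bias(phi, y):
    """Eq. 3.7: cor(phi, y) * S(y) * S(phi) / mean(phi), using population moments."""
    return correlation(phi, y) * pstdev(y) * pstdev(phi) / fmean(phi)

# Hypothetical population: 1 = unemployed, 0 = not unemployed.
y = [1] * 100 + [0] * 900
phi = [0.6] * 100 + [0.3] * 900   # unemployed people respond more often

bias = nonresponse_bias(phi, y)

# Cross-check against the propensity-weighted mean minus the true mean.
expected_respondent_mean = sum(p * v for p, v in zip(phi, y)) / sum(phi)
direct_bias = expected_respondent_mean - fmean(y)
```

In this example the true unemployment rate is 0.10, but the respondent mean is expected to be about 0.18, so the bias is roughly 0.08: because unemployed people are more likely to respond, the unadjusted estimate of the unemployment rate is pushed upward.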

Eq. 3.7 shows that nonresponse will not introduce bias if any of the following conditions are met:

• There is no variation in unemployment status $$(S(y) = 0)$$.
• There is no variation in response propensities $$(S(\phi) = 0)$$.
• There is no correlation between response propensity and unemployment status $$(cor(\phi, y) = 0)$$.

Unfortunately, none of these conditions seems likely. It seems implausible that there will be no variation in employment status or that there will be no variation in response propensities. Thus, the key term in eq. 3.7 is the correlation: $$cor(\phi, y)$$. For example, if people who are unemployed are more likely to respond, then the estimated unemployment rate will be biased upward.

The trick to making estimates when there is nonresponse is to use auxiliary information. For example, one way in which you can use auxiliary information is post-stratification (recall eq. 3.5 from above). It turns out that the bias of the post-stratification estimator is:

$bias(\hat{\bar{y}}_{post}) = \frac{1}{N} \sum_{h=1}^H \frac{N_h cor(\phi, y)^{(h)} S(y)^{(h)} S(\phi)^{(h)}}{\bar{\phi}^{(h)}} \qquad(3.8)$

where $$cor(\phi, y)^{(h)}$$, $$S(y)^{(h)}$$, $$S(\phi)^{(h)}$$, and $$\bar{\phi}^{(h)}$$ are defined as above but restricted to people in group $$h$$ (Bethlehem, Cobben, and Schouten 2011, sec. 8.2.1). Thus, the overall bias will be small if the bias in each post-stratification group is small. There are two ways that I like to think about making the bias small in each post-stratification group. First, you want to try to form homogeneous groups where there is little variation in response propensity ($$S(\phi)^{(h)} \approx 0$$) and the outcome ($$S(y)^{(h)} \approx 0$$). Second, you want to form groups where the people that you see are like the people that you don't see ($$cor(\phi, y)^{(h)} \approx 0$$). Comparing eq. 3.7 and eq. 3.8 helps clarify when post-stratification can reduce the bias caused by nonresponse.
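The comparison between eq. 3.7 and eq. 3.8 can be sketched numerically. The Python below assumes a made-up population with two groups, labeled "A" and "B", in which response propensity differs between groups but is constant within each group, so $$S(\phi)^{(h)} = 0$$. Because $$cor(\phi, y) S(y) S(\phi)$$ equals the population covariance of $$\phi$$ and $$y$$, the code works with the covariance directly, which avoids dividing by zero when a standard deviation is zero. Group labels, sizes, and function names are hypothetical.

```python
from statistics import fmean

def covariance(x, y):
    """Population covariance; equals cor(x, y) * S(x) * S(y)."""
    mx, my = fmean(x), fmean(y)
    return fmean([(a - mx) * (b - my) for a, b in zip(x, y)])

def bias_unadjusted(phi, y):
    """Eq. 3.7, written with the covariance in the numerator."""
    return covariance(phi, y) / fmean(phi)

def bias_post_stratified(groups):
    """Eq. 3.8: groups maps a label h to a (phi, y) pair of lists for group h."""
    N = sum(len(y) for _, y in groups.values())
    return sum(len(y) * bias_unadjusted(phi, y) for phi, y in groups.values()) / N

# Hypothetical groups: propensity varies across groups but not within them.
groups = {
    "A": ([0.6] * 200, [1] * 60 + [0] * 140),   # 30% unemployed, responds often
    "B": ([0.3] * 800, [1] * 40 + [0] * 760),   # 5% unemployed, responds less
}

phi_all = [p for phi, _ in groups.values() for p in phi]
y_all = [v for _, y in groups.values() for v in y]

overall_bias = bias_unadjusted(phi_all, y_all)   # ignores the groups
post_bias = bias_post_stratified(groups)         # uses the groups
```

Without adjustment the bias is about 0.033 on a true rate of 0.10. After post-stratifying on these groups it is essentially zero, because within each group the people who respond look like the people who do not.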

In conclusion, this section has provided a model for probability sampling with nonresponse and shown the bias that nonresponse can introduce both without and with post-stratification adjustments. Bethlehem (1988) offers a derivation of the bias caused by nonresponse for more general sampling designs. For more on using post-stratification to adjust for nonresponse, see Smith (1991) and Gelman and Carlin (2002). Post-stratification is part of a more general family of techniques called calibration estimators; see Zhang (2000) for an article-length treatment and Särndal and Lundström (2005) for a book-length treatment. For more on other weighting methods for adjusting for nonresponse, see Kalton and Flores-Cervantes (2003), Brick (2013), and Särndal and Lundström (2005).

Non-probability sampling

Non-probability sampling includes a huge variety of designs (Baker et al. 2013). Focusing specifically on the sample of Xbox users by Wang and colleagues (W. Wang et al. 2015), you can think of that kind of sample as one where the key part of the sampling design is not the $$\pi_i$$ (the researcher-driven probability of inclusion) but the $$\phi_i$$ (the respondent-driven response propensities). Naturally, this is not ideal because the $$\phi_i$$ are unknown. But, as Wang and colleagues showed, this kind of opt-in sample, even one drawn from a sampling frame with enormous coverage error, need not be catastrophic if the researcher has good auxiliary information and a good statistical model to account for these problems.

Bethlehem (2010) extends many of the above derivations about post-stratification to include both nonresponse and coverage errors. In addition to post-stratification, other techniques for working with non-probability samples include sample matching (Ansolabehere and Rivers 2013; ???), propensity score weighting (Lee 2006; Schonlau et al. 2009), and calibration (Lee and Valliant 2009). One common theme among these techniques is the use of auxiliary information.