## Mathematical notes

In this appendix, I will describe some of the ideas from the chapter in a slightly more mathematical form. The goal here is to help you get comfortable with the notation and mathematical framework used by survey researchers so that you can transition to some of the more technical material written on these topics. I will start by introducing probability sampling, then move to probability sampling with nonresponse, and finally, non-probability sampling.

Probability sampling

As a running example, let's consider the goal of estimating the unemployment rate in the United States. Let $$U = \{1, \ldots, k, \ldots, N\}$$ be the target population and let $$y_k$$ be the value of the outcome variable for person $$k$$ . In this example $$y_k$$ is whether person $$k$$ is unemployed. Finally, let $$F = \{1, \ldots, k, \ldots, N\}$$ be the frame population, which for the sake of simplicity is assumed to be the same as the target population.

A basic sampling design is simple random sampling without replacement. In this case, each person is equally likely to be included in the sample $$s = \{1, \ldots, i, \ldots, n\}$$ . When the data are collected with this sampling design, researchers can estimate the population unemployment rate with the sample mean:

$\hat{\bar{y}} = \frac{\sum_{i \in s} y_i}{n} \qquad(3.1)$

where $$\bar{y}$$ is the unemployment rate in the population and $$\hat{\bar{y}}$$ is the estimate of the unemployment rate (the $$\hat{ }$$ is commonly used to indicate an estimator).

In reality, researchers rarely use simple random sampling without replacement. For a variety of reasons (one of which I'll describe in a moment), researchers often create samples with unequal probabilities of inclusion. For example, researchers might select people in Florida with a higher probability of inclusion than people in California. In this case, the sample mean (eq. 3.1) might not be a good estimator. Instead, when there are unequal probabilities of inclusion, researchers use

$\hat{\bar{y}} = \frac{1}{N} \sum_{i \in s} \frac{y_i}{\pi_i} \qquad(3.2)$

where $$\hat{\bar{y}}$$ is the estimate of the unemployment rate and $$\pi_i$$ is person $$i$$ 's probability of inclusion. Following standard practice, I'll call the estimator in eq. 3.2 the Horvitz-Thompson estimator. The Horvitz-Thompson estimator is extremely useful because it leads to unbiased estimates for any probability sampling design (Horvitz and Thompson 1952) . Because the Horvitz-Thompson estimator comes up so frequently, it is helpful to notice that it can be rewritten as

$\hat{\bar{y}} = \frac{1}{N} \sum_{i \in s} w_i y_i \qquad(3.3)$

where $$w_i = 1 / \pi_i$$ . As eq. 3.3 reveals, the Horvitz-Thompson estimator is a weighted sample mean where the weights are inversely related to the probability of selection. In other words, the less likely a person is to be included in the sample, the more weight that person should get in the estimate.
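As a minimal sketch of eq. 3.2 and eq. 3.3, the following Python computes a Horvitz-Thompson estimate of a population mean. The function name `horvitz_thompson_mean` and all of the numbers are illustrative assumptions, not part of the original text. The last lines check that, when every inclusion probability equals $$n / N$$ , the estimator reduces to the sample mean of eq. 3.1.

```python
def horvitz_thompson_mean(y, pi, N):
    """Estimate a population mean as in eq. 3.2 / eq. 3.3.

    y  : outcomes for the sampled people
    pi : inclusion probabilities for the same people (0 < pi_i <= 1)
    N  : population size
    """
    if len(y) != len(pi):
        raise ValueError("y and pi must have the same length")
    weights = [1.0 / p for p in pi]  # w_i = 1 / pi_i
    return sum(w * yi for w, yi in zip(weights, y)) / N


# Hypothetical sample: 1 = unemployed, 0 = not unemployed.
y_sample = [1, 1, 0, 0]
pi_sample = [0.5, 0.5, 0.25, 0.25]  # assumed unequal inclusion probabilities
N = 12                               # assumed population size

ht_estimate = horvitz_thompson_mean(y_sample, pi_sample, N)
sample_mean = sum(y_sample) / len(y_sample)

# With equal inclusion probabilities n / N, eq. 3.2 reduces to eq. 3.1.
n = len(y_sample)
srs_estimate = horvitz_thompson_mean(y_sample, [n / N] * n, N)

print(ht_estimate, sample_mean, srs_estimate)
```

In this made-up sample the two unemployed people were also the ones most likely to be selected, so the unweighted sample mean is higher than the weighted estimate.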

As described earlier, researchers often sample people with unequal probabilities of inclusion. One example of a design that can lead to unequal probabilities of inclusion is stratified sampling, which is important to understand because it is closely related to the estimation procedure called post-stratification. In stratified sampling, a researcher splits the target population into $$H$$ mutually exclusive and exhaustive groups. These groups are called strata and are indicated as $$U_1, \ldots, U_h, \ldots, U_H$$ . In this example, the strata are states. The sizes of the groups are indicated as $$N_1, \ldots, N_h, \ldots, N_H$$ . A researcher might want to use stratified sampling in order to make sure that there are enough people in each state to make state-level estimates of unemployment.

Once the population has been split up into strata, assume that the researcher selects a simple random sample without replacement of size $$n_h$$ , independently from each stratum. Further, assume that everyone selected in the sample becomes a respondent (I'll handle nonresponse in the next section). In this case, the probability of inclusion is

$\pi_i = \frac{n_h}{N_h} \mbox{ for all } i \in h \qquad(3.4)$

Because these probabilities can vary from person to person, when making an estimate from this sampling design, researchers need to weight each respondent by the inverse of their probability of inclusion using the Horvitz-Thompson estimator (eq. 3.2).
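To make eq. 3.4 concrete, here is a small sketch with two made-up strata. The stratum labels, sizes, and outcomes are assumptions chosen for illustration only. It computes the inclusion probabilities $$n_h / N_h$$ and then applies the Horvitz-Thompson estimator from eq. 3.2.

```python
# Hypothetical strata: population sizes N_h and sample sizes n_h.
N_h = {"A": 100, "B": 300}
n_h = {"A": 10, "B": 10}

# Eq. 3.4: everyone in stratum h has inclusion probability n_h / N_h.
pi_h = {h: n_h[h] / N_h[h] for h in N_h}

# Hypothetical sample of (stratum, outcome) pairs, where 1 = unemployed.
sample = (
    [("A", 1)] * 4 + [("A", 0)] * 6
    + [("B", 1)] * 1 + [("B", 0)] * 9
)

N = sum(N_h.values())

# Eq. 3.2: weight each respondent by the inverse of their inclusion probability.
ht_estimate = sum(y / pi_h[h] for h, y in sample) / N

# The unweighted mean ignores that stratum A was sampled at a higher rate.
unweighted_mean = sum(y for _, y in sample) / len(sample)

print(pi_h, ht_estimate, unweighted_mean)
```

Because stratum A is sampled at three times the rate of stratum B here, each person from B stands in for three times as many people in the weighted estimate.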

Even though the Horvitz-Thompson estimator is unbiased, researchers can produce more accurate (i.e., lower variance) estimates by combining the sample with auxiliary information. Some people find it surprising that this is true even when the probability sampling is perfectly executed. These techniques using auxiliary information are particularly important because, as I will show later, auxiliary information is critical for making estimates from probability samples with nonresponse and from non-probability samples.

One common technique for using auxiliary information is post-stratification. Imagine, for example, that a researcher knows the number of men and women in each of the 50 states; we can denote these group sizes as $$N_1, N_2, \ldots, N_{100}$$ . To combine this auxiliary information with the sample, the researcher can split the sample into $$H$$ groups (in this case 100), make an estimate for each group, and then create a weighted average of these group means:

$\hat{\bar{y}}_{post} = \sum_{h \in H} \frac{N_h}{N} \hat{\bar{y}}_h \qquad(3.5)$

Roughly, the estimator in eq. 3.5 is likely to be more accurate because it uses the known population information (the $$N_h$$ ) to correct estimates if an unbalanced sample happens to be selected. One way to think about it is that post-stratification is like approximating stratification after the data have already been collected.
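The following sketch shows eq. 3.5 on a toy example. The function name `post_stratified_mean`, the two groups, and all counts are hypothetical. It assumes a simple random sample, so the plain mean within each group serves as the group estimate $$\hat{\bar{y}}_h$$ .

```python
from collections import defaultdict


def post_stratified_mean(sample, N_h):
    """Eq. 3.5: weighted average of group means, with weights N_h / N.

    sample : list of (group, outcome) pairs
    N_h    : dict mapping each group to its known population size
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for group, y in sample:
        totals[group] += y
        counts[group] += 1
    missing = [g for g in N_h if counts[g] == 0]
    if missing:
        raise ValueError(f"no sampled people in groups: {missing}")
    N = sum(N_h.values())
    return sum(N_h[g] / N * totals[g] / counts[g] for g in N_h)


# Hypothetical population: equal numbers of men and women.
N_h = {"men": 500, "women": 500}

# Hypothetical unbalanced sample: 6 men (3 unemployed), 4 women (1 unemployed).
sample = (
    [("men", 1)] * 3 + [("men", 0)] * 3
    + [("women", 1)] * 1 + [("women", 0)] * 3
)

post_estimate = post_stratified_mean(sample, N_h)
sample_mean = sum(y for _, y in sample) / len(sample)

print(post_estimate, sample_mean)
```

The sample over-represents men, who have the higher unemployment rate in this made-up data, so the raw sample mean (0.4) sits above the post-stratified estimate (0.375), which reweights the groups back to their known population shares.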

In conclusion, this section has described a few sampling designs: simple random sampling without replacement, sampling with unequal probability, and stratified sampling. It has also described two main ideas about estimation: the Horvitz-Thompson estimator and post-stratification. For a more formal definition of probability sampling designs, see chapter 2 of Särndal, Swensson, and Wretman (2003) . For a more formal and complete treatment of stratified sampling, see section 3.7 of Särndal, Swensson, and Wretman (2003) . For a technical description of the properties of the Horvitz-Thompson estimator, see Horvitz and Thompson (1952) , Overton and Stehman (1995) , or section 2.8 of Särndal, Swensson, and Wretman (2003) . For a more formal treatment of post-stratification, see Holt and Smith (1979) , Smith (1991) , Little (1993) , or section 7.6 of Särndal, Swensson, and Wretman (2003) .

Probability sampling with nonresponse

Almost all real surveys have nonresponse; that is, not everyone in the sample population answers every question. There are two main kinds of nonresponse: item nonresponse and unit nonresponse. In item nonresponse, some respondents don't answer some items (e.g., sometimes respondents don't want to answer questions that they consider sensitive). In unit nonresponse, some people who are selected for the sample population don't respond to the survey at all. The two most common reasons for unit nonresponse are that the sampled person cannot be contacted and that the sampled person is contacted but refuses to participate. In this section, I will focus on unit nonresponse; readers interested in item nonresponse should see Little and Rubin (2002) .

Researchers often think about surveys with unit nonresponse as a two-stage sampling process. In the first stage, the researcher selects a sample $$s$$ such that each person has a probability of inclusion $$\pi_i$$ (where $$0 < \pi_i \leq 1$$ ). Then, in the second stage, people who are selected into the sample respond with probability $$\phi_i$$ (where $$0 < \phi_i \leq 1$$ ). This two-stage process results in the final set of respondents $$r$$ . An important difference between these two stages is that researchers control the process of selecting the sample, but they don't control which of the sampled people become respondents. Putting these two processes together, the probability that someone will be a respondent is

$pr(i \in r) = \pi_i \phi_i \qquad(3.6)$

For the sake of simplicity, I'll consider the case where the original sample design is simple random sampling without replacement. If a researcher selects a sample of size $$n_s$$ that yields $$n_r$$ respondents, and if the researcher ignores nonresponse and uses the mean of the respondents, then the bias of the estimate will be:

$\mbox{bias of sample mean} = \frac{cor(\phi, y) S(y) S(\phi)}{\bar{\phi}} \qquad(3.7)$

where $$cor(\phi, y)$$ is the population correlation between the response propensity and the outcome (e.g., unemployment status), $$S(y)$$ is the population standard deviation of the outcome (e.g., unemployment status), $$S(\phi)$$ is the population standard deviation of the response propensity, and $$\bar{\phi}$$ is the population mean response propensity (Bethlehem, Cobben, and Schouten 2011, sec. 2.2.4) .

Eq. 3.7 shows that nonresponse will not introduce bias if any of the following conditions is met:

• There is no variation in unemployment status $$(S(y) = 0)$$ .
• There is no variation in response propensities $$(S(\phi) = 0)$$ .
• There is no correlation between response propensity and unemployment status $$(cor(\phi, y) = 0)$$ .

Unfortunately, none of these conditions seems likely. It seems implausible that there will be no variation in employment status or that there will be no variation in response propensities. Thus, the key term in eq. 3.7 is the correlation: $$cor(\phi, y)$$ . For example, if people who are unemployed are more likely to respond, then the estimated unemployment rate will be biased upward.
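As a rough numerical illustration of eq. 3.7, the sketch below evaluates the bias formula on a made-up population of four people in which unemployed people have higher response propensities. The helper names and the data are assumptions. The comparison quantity, a propensity-weighted mean of $$y$$ , is used here as an approximation to the expected respondent mean; it is an assumption of this sketch rather than something stated in the text.

```python
from math import sqrt


def mean(xs):
    return sum(xs) / len(xs)


def population_sd(xs):
    # Population standard deviation (divides by N, not N - 1).
    m = mean(xs)
    return sqrt(sum((x - m) ** 2 for x in xs) / len(xs))


def correlation(xs, ys):
    mx, my = mean(xs), mean(ys)
    cov = mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
    return cov / (population_sd(xs) * population_sd(ys))


def nonresponse_bias(phi, y):
    """Eq. 3.7: cor(phi, y) * S(y) * S(phi) / mean(phi)."""
    s_y, s_phi = population_sd(y), population_sd(phi)
    if s_y == 0 or s_phi == 0:
        return 0.0  # no variation, so the numerator of eq. 3.7 is zero
    return correlation(phi, y) * s_y * s_phi / mean(phi)


# Hypothetical population: 1 = unemployed; unemployed people respond more often.
y = [1, 1, 0, 0]
phi = [0.8, 0.8, 0.4, 0.4]

bias = nonresponse_bias(phi, y)

# Assumed approximation to the expected respondent mean.
expected_respondent_mean = sum(p * v for p, v in zip(phi, y)) / sum(phi)

print(bias, expected_respondent_mean - mean(y))
```

Both printed values are about 0.167: with a true unemployment rate of 0.5, the respondent mean would be expected to land near 0.667 in this toy population.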

The trick to making estimates when there is nonresponse is to use auxiliary information. For example, one way in which you can use auxiliary information is post-stratification (recall eq. 3.5 from above). It turns out that the bias of the post-stratification estimator is:

$bias(\hat{\bar{y}}_{post}) = \frac{1}{N} \sum_{h=1}^H \frac{N_h cor(\phi, y)^{(h)} S(y)^{(h)} S(\phi)^{(h)}}{\bar{\phi}^{(h)}} \qquad(3.8)$

where $$cor(\phi, y)^{(h)}$$ , $$S(y)^{(h)}$$ , $$S(\phi)^{(h)}$$ , and $$\bar{\phi}^{(h)}$$ are defined as above but restricted to people in group $$h$$ (Bethlehem, Cobben, and Schouten 2011, sec. 8.2.1) . Thus, the overall bias will be small if the bias in each post-stratification group is small. There are two ways that I like to think about making the bias small in each post-stratification group. First, you want to try to form homogeneous groups where there is little variation in response propensity ( $$S(\phi)^{(h)} \approx 0$$ ) and the outcome ( $$S(y)^{(h)} \approx 0$$ ). Second, you want to form groups where the people that you see are like the people that you don't see ( $$cor(\phi, y)^{(h)} \approx 0$$ ). Comparing eq. 3.7 and eq. 3.8 helps clarify when post-stratification can reduce the bias caused by nonresponse.
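The comparison between eq. 3.7 and eq. 3.8 can be sketched numerically. In the made-up population below, response propensity differs between two groups but is constant within each group, so every within-group term of eq. 3.8 is zero even though the pooled bias from eq. 3.7 is not. The function names and data are hypothetical; the code relies on the identity that $$cor(\phi, y) S(y) S(\phi)$$ equals the population covariance of $$\phi$$ and $$y$$ .

```python
def mean(xs):
    return sum(xs) / len(xs)


def bias_term(phi, y):
    """cor(phi, y) * S(y) * S(phi) / mean(phi), computed via the covariance.

    Applied to the whole population this is eq. 3.7; applied to one
    group it is that group's term in eq. 3.8.
    """
    m_phi, m_y = mean(phi), mean(y)
    cov = mean([(p - m_phi) * (v - m_y) for p, v in zip(phi, y)])
    return cov / m_phi


def post_stratified_bias(groups):
    """Eq. 3.8: (1 / N) * sum over groups of N_h * (within-group bias term).

    groups : dict mapping a group name to (phi list, y list) for everyone in it
    """
    N = sum(len(y) for _, y in groups.values())
    return sum(len(y) * bias_term(phi, y) for phi, y in groups.values()) / N


# Hypothetical population: propensity varies between groups, not within them.
groups = {
    "g1": ([0.5, 0.5, 0.5], [1, 1, 0]),
    "g2": ([0.25, 0.25, 0.25], [1, 0, 0]),
}

bias_post = post_stratified_bias(groups)

# Pool everyone and apply eq. 3.7 to the whole population.
all_phi = [p for phi, _ in groups.values() for p in phi]
all_y = [v for _, y in groups.values() for v in y]
bias_unadjusted = bias_term(all_phi, all_y)

print(bias_unadjusted, bias_post)
```

Here the unadjusted bias is about 0.056 while the post-stratified bias is 0, which is the first of the two situations described above: groups that are homogeneous in response propensity.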

In conclusion, this section has provided a model for probability sampling with nonresponse and shown the bias that nonresponse can introduce both without and with post-stratification adjustments. Bethlehem (1988) offers a derivation of the bias caused by nonresponse for more general sampling designs. For more on using post-stratification to adjust for nonresponse, see Smith (1991) and Gelman and Carlin (2002) . Post-stratification is part of a more general family of techniques called calibration estimators; see Zhang (2000) for an article-length treatment and Särndal and Lundström (2005) for a book-length treatment. For more on other weighting methods for adjusting for nonresponse, see Kalton and Flores-Cervantes (2003) , Brick (2013) , and Särndal and Lundström (2005) .

Non-probability sampling

Non-probability sampling includes a huge variety of designs (Baker et al. 2013) . Focusing specifically on the sample of Xbox users by Wang and colleagues (W. Wang et al. 2015) , you can think of that kind of sample as one where the key part of the sampling design is not the $$\pi_i$$ (the researcher-driven probability of inclusion) but the $$\phi_i$$ (the respondent-driven response propensities). Naturally, this is not ideal because the $$\phi_i$$ are unknown. But, as Wang and colleagues showed, this kind of opt-in sample, even from a sampling frame with enormous coverage error, need not be catastrophic if the researcher has good auxiliary information and a good statistical model to account for these problems.

Bethlehem (2010) extends many of the above derivations about post-stratification to include both nonresponse and coverage errors. In addition to post-stratification, other techniques for working with non-probability samples, and with probability samples that have coverage errors and nonresponse, include sample matching (Ansolabehere and Rivers 2013) , propensity score weighting (Lee 2006; Schonlau et al. 2009) , and calibration (Lee and Valliant 2009) . One common theme among these techniques is the use of auxiliary information.