Survey Research (조사연구), Research Note

Using Paradata in Nonresponse Adjustment for a Household Interview Survey: Case Study of Survey on Program for the International Assessment of Adult Competencies
We consider the use of level-of-effort (LOE) paradata to model the nonresponse mechanism in surveys and to adjust for non-ignorable nonresponse bias. We also review the call-back model for household interview surveys and the unconditional maximum likelihood estimation procedure proposed by Biemer et al. (2013). The nonresponse adjustment method is then applied to, and evaluated on, a large-scale field survey, the Survey on Program for the International Assessment of Adult Competencies (PIAAC). We found that the call-back model based on LOE paradata was effective in reducing nonresponse bias.

Key words: call-back model, non-ignorable nonresponse, paradata, household survey, PIAAC

Ⅰ. Introduction
Ⅱ. Call-back Model Using Paradata
Ⅲ. Overview of the Programme for the International Assessment of Adult Competencies (PIAAC) and Status of Its Paradata

1. Survey Overview
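2. Household Contact Status and Paradata

<Table 1> Number of contacts by survey stage

| Number of contacts | Total | Screener | Background questionnaire | Assessment | Completed |
|---|---|---|---|---|---|
| | 4.9 | 3.8 | 1.9 | 1.1 | 4.0 |
| | 4 | 3 | 1 | 1 | 3 |
| | 22 | 21 | 16 | 11 | 20 |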
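<Table 2> Distribution of sample households by number of contacts

| Contacts | All (A1): Freq. | % | Cum. freq. | Cum. % | Eligible sample (A2): Freq. | % | Cum. freq. | Cum. % | Final (A): Code | Freq. | % | Cum. freq. | Cum. % |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 2,220 | 25.2 | 2,220 | 25.2 | 2,115 | 24.9 | 2,115 | 24.9 | 1 | 2,115 | 24.9 | 2,115 | 24.9 |
| 2 | 1,708 | 19.4 | 3,928 | 44.5 | 1,648 | 19.4 | 3,763 | 44.2 | 2 | 1,648 | 19.4 | 3,763 | 44.2 |
| 3 | 1,320 | 15.0 | 5,248 | 59.5 | 1,284 | 15.1 | 5,047 | 59.3 | 3 | 2,217 | 26.1 | 5,980 | 70.3 |
| 4 | 971 | 11.0 | 6,219 | 70.5 | 933 | 11.0 | 5,980 | 70.3 | | | | | |
| 5 | 645 | 7.3 | 6,864 | 77.8 | 633 | 7.4 | 6,613 | 77.8 | 4 | 1,493 | 20.0 | 7,473 | 87.9 |
| 6 | 455 | 5.2 | 7,319 | 82.9 | 433 | 5.1 | 7,046 | 82.9 | | | | | |
| 7 | 441 | 5.0 | 7,760 | 87.9 | 427 | 5.0 | 7,473 | 87.9 | | | | | |
| 8 | 321 | 3.6 | 8,081 | 91.6 | 310 | 3.6 | 7,783 | 91.5 | 5 | 664 | 8.2 | 8,137 | 95.7 |
| 9 | 197 | 2.2 | 8,278 | 93.8 | 191 | 2.3 | 7,974 | 93.8 | | | | | |
| 10 | 167 | 1.9 | 8,445 | 95.7 | 163 | 1.9 | 8,137 | 95.7 | | | | | |
| 11 | 146 | 1.7 | 8,591 | 97.4 | 143 | 1.7 | 8,280 | 97.4 | 6 | 368 | 4.3 | 8,505 | 100.0 |
| 12 | 76 | 0.9 | 8,667 | 98.2 | 72 | 0.9 | 8,352 | 98.2 | | | | | |
| 13 | 71 | 0.8 | 8,738 | 99.0 | 68 | 0.8 | 8,420 | 99.0 | | | | | |
| 14 | 40 | 0.5 | 8,778 | 99.5 | 40 | 0.5 | 8,460 | 99.5 | | | | | |
| 15 | 22 | 0.3 | 8,800 | 99.7 | 22 | 0.3 | 8,482 | 99.7 | | | | | |
| 16 | 13 | 0.2 | 8,813 | 99.9 | 13 | 0.2 | 8,495 | 99.9 | | | | | |
| 17 | 4 | 0.1 | 8,817 | 99.9 | 3 | 0.0 | 8,498 | 99.9 | | | | | |
| 18 | 6 | 0.1 | 8,823 | 100.0 | 5 | 0.1 | 8,503 | 100.0 | | | | | |
| 19 | 1 | 0.0 | 8,824 | 100.0 | 1 | 0.0 | 8,504 | 100.0 | | | | | |
| 20 | 1 | 0.0 | 8,825 | 100.0 | 1 | 0.0 | 8,505 | 100.0 | | | | | |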
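The final block of Table 2 groups the raw number of contacts into six categories. The following is a minimal sketch of that grouping, assuming the cut points implied by the table's frequencies (1, 2, 3 to 4, 5 to 7, 8 to 10, 11 or more). The function and variable names are illustrative and do not come from the paper.

```python
# Eligible-sample frequencies by raw number of contacts (column A2 of Table 2).
eligible_freq = {
    1: 2115, 2: 1648, 3: 1284, 4: 933, 5: 633, 6: 433, 7: 427,
    8: 310, 9: 191, 10: 163, 11: 143, 12: 72, 13: 68, 14: 40,
    15: 22, 16: 13, 17: 3, 18: 5, 19: 1, 20: 1,
}


def collapse_contacts(n_contacts: int) -> int:
    """Map a raw contact count to the six-level code A.

    The cut points are an assumption read off Table 2, not a stated rule.
    """
    if n_contacts <= 2:
        return n_contacts
    if n_contacts <= 4:
        return 3
    if n_contacts <= 7:
        return 4
    if n_contacts <= 10:
        return 5
    return 6


# Sum the raw frequencies within each collapsed category.
collapsed = {}
for n, freq in eligible_freq.items():
    code = collapse_contacts(n)
    collapsed[code] = collapsed.get(code, 0) + freq

# Cumulative percentages over the collapsed categories.
total = sum(collapsed.values())
running = 0
cumulative_pct = {}
for code in sorted(collapsed):
    running += collapsed[code]
    cumulative_pct[code] = round(100 * running / total, 1)

print(collapsed)
print(cumulative_pct)
```

Under these cut points the collapsed frequencies and cumulative percentages agree with the Final (A) block of the table.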
<Figure> Status codes of the PIAAC survey (panels: screener; background questionnaire, final status codes; ICT module and direct assessment, temporary codes)
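<Table 3> Distribution of sample households by contact result

| Code | All (D1): Freq. | % | Cum. freq. | Cum. % | Eligible sample (D2): Freq. | % | Cum. freq. | Cum. % | Final (D): Code | Freq. | % | Cum. freq. | Cum. % |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 01 | 7,341 | 83.1 | 7,341 | 83.1 | 7,341 | 86.3 | 7,341 | 86.3 | 1 | 7,341 | 86.3 | 7,341 | 86.3 |
| 03 | 11 | 0.1 | 7,352 | 83.3 | 11 | 0.1 | 7,352 | 86.4 | 2 | 785 | 9.2 | 8,126 | 95.5 |
| 04 | 738 | 8.4 | 8,090 | 91.6 | 738 | 8.7 | 8,090 | 95.1 | | | | | |
| 05 | 17 | 0.2 | 8,107 | 91.8 | 17 | 0.2 | 8,107 | 95.3 | | | | | |
| 07 | 5 | 0.1 | 8,112 | 91.9 | 5 | 0.1 | 8,112 | 95.3 | | | | | |
| 14 | 1 | 0.0 | 8,113 | 91.9 | 1 | 0.0 | 8,113 | 95.3 | | | | | |
| 17 | 13 | 0.2 | 8,126 | 92.0 | 13 | 0.2 | 8,126 | 95.5 | | | | | |
| 21 | 356 | 4.0 | 8,482 | 96.1 | 356 | 4.2 | 8,482 | 99.7 | 3 | 384 | 4.5 | 8,510 | 100.0 |
| 24 | 28 | 0.3 | 8,510 | 96.4 | 28 | 0.3 | 8,510 | 100.0 | | | | | |
| 19 | 147 | 1.7 | 8,657 | 98.0 | | | | | | | | | |
| 20 | 4 | 0.1 | 8,661 | 98.1 | | | | | | | | | |
| 22 | 9 | 0.1 | 8,670 | 98.2 | | | | | | | | | |
| 26 | 129 | 1.5 | 8,799 | 99.6 | | | | | | | | | |
| 27 | 2 | 0.0 | 8,801 | 99.7 | | | | | | | | | |
| 28 | 29 | 0.3 | 8,830 | 100.0 | | | | | | | | | |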
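The three-level final code D in Table 3 can be reproduced from the eligible-sample frequencies. This sketch assumes the grouping that the table's layout suggests (01 to final code 1; 03, 04, 05, 07, 14 and 17 to final code 2; 21 and 24 to final code 3). The dictionary names are illustrative and not part of the paper.

```python
# Eligible-sample frequencies by raw status code (column D2 of Table 3).
status_freq = {
    "01": 7341, "03": 11, "04": 738, "05": 17, "07": 5,
    "14": 1, "17": 13, "21": 356, "24": 28,
}

# Assumed grouping of raw status codes into the final code D,
# read off the layout of Table 3 rather than a stated rule.
final_code = {
    "01": 1,
    "03": 2, "04": 2, "05": 2, "07": 2, "14": 2, "17": 2,
    "21": 3, "24": 3,
}

# Sum the raw frequencies within each final code.
d_freq = {}
for raw_code, freq in status_freq.items():
    d = final_code[raw_code]
    d_freq[d] = d_freq.get(d, 0) + freq

# Percentage share of each final code among eligible households.
d_total = sum(d_freq.values())
d_pct = {d: round(100 * f / d_total, 1) for d, f in d_freq.items()}

print(d_freq)
print(d_pct)
```

The resulting frequencies and percentages match the Final (D) block of the table.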
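<Table 4> Contingency table

| A (number of contacts) | D (contact result) | G (household size): 1 person | 2 persons | 3 persons | 4 or more | Unknown | Total ( ) |
|---|---|---|---|---|---|---|---|
| 1 | 1 | 206 | 464 | 532 | 904 | - | 2,106 |
| | 2 | - | - | - | - | 4 | 4 |
| | 3 | - | - | - | - | 5 | 5 |
| | Subtotal | 206 | 464 | 532 | 904 | 9 | 2,115 |
| 2 | 1 | 207 | 374 | 440 | 612 | - | 1,633 |
| | 2 | - | - | - | - | 11 | 11 |
| | 3 | - | - | - | - | 4 | 4 |
| | Subtotal | 207 | 374 | 440 | 612 | 15 | 1,648 |
| 3 | 1 | 326 | 505 | 535 | 773 | - | 2,139 |
| | 2 | - | - | - | - | 58 | 58 |
| | 3 | - | - | - | - | 20 | 20 |
| | Subtotal | 326 | 505 | 535 | 773 | 78 | 2,217 |
| 4 | 1 | 246 | 247 | 269 | 318 | - | 1,080 |
| | 2 | - | - | - | - | 281 | 281 |
| | 3 | - | - | - | - | 132 | 132 |
| | Subtotal | 246 | 247 | 269 | 318 | 413 | 1,493 |
| 5 | 1 | 89 | 67 | 60 | 68 | - | 284 |
| | 2 | - | - | - | - | 248 | 248 |
| | 3 | - | - | - | - | 132 | 132 |
| | Subtotal | 89 | 67 | 60 | 68 | 380 | 664 |
| 6 | 1 | 30 | 22 | 28 | 14 | - | 94 |
| | 2 | - | - | - | - | 183 | 183 |
| | 3 | - | - | - | - | 91 | 91 |
| | Subtotal | 30 | 22 | 28 | 14 | 274 | 368 |
| Total | 1 | 1,104 | 1,679 | 1,864 | 2,689 | - | 7,336 |
| | 2 | - | - | - | - | 785 | 785 |
| | 3 | - | - | - | - | 384 | 384 |
| | Total | 1,104 | 1,679 | 1,864 | 2,689 | 1,169 | 8,505 |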
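<Table 5> Weighted contingency table (household design weights applied)

| A (number of contacts) | D (status code) | G (household size): 1 person | 2 persons | 3 persons | 4 or more | Unknown | Total ( ) |
|---|---|---|---|---|---|---|---|
| 1 | 1 | 208.1 | 467.6 | 528.6 | 895.4 | - | 2,099.7 |
| | 2 | - | - | - | - | 4.1 | 4.1 |
| | 3 | - | - | - | - | 5.7 | 5.7 |
| | Subtotal | 208.1 | 467.6 | 528.6 | 895.4 | 9.7 | 2,109.4 |
| 2 | 1 | 204.7 | 377.6 | 438.9 | 607.1 | - | 1,628.4 |
| | 2 | - | - | - | - | 11.6 | 11.6 |
| | 3 | - | - | - | - | 4.0 | 4.0 |
| | Subtotal | 204.7 | 377.6 | 438.9 | 607.1 | 15.6 | 1,643.9 |
| 3 | 1 | 326.3 | 508.7 | 535.5 | 773.1 | - | 2,143.6 |
| | 2 | - | - | - | - | 59.3 | 59.3 |
| | 3 | - | - | - | - | 18.7 | 18.7 |
| | Subtotal | 326.3 | 508.7 | 535.5 | 773.1 | 78.0 | 2,221.6 |
| 4 | 1 | 246.9 | 247.2 | 271.3 | 324.7 | - | 1,090.0 |
| | 2 | - | - | - | - | 283.0 | 283.0 |
| | 3 | - | - | - | - | 129.3 | 129.3 |
| | Subtotal | 246.9 | 247.2 | 271.3 | 324.7 | 412.2 | 1,502.3 |
| 5 | 1 | 85.0 | 67.5 | 58.1 | 72.1 | - | 282.6 |
| | 2 | - | - | - | - | 250.8 | 250.8 |
| | 3 | - | - | - | - | 134.6 | 134.6 |
| | Subtotal | 85.0 | 67.5 | 58.1 | 72.1 | 385.4 | 668.0 |
| 6 | 1 | 27.8 | 20.8 | 26.2 | 14.1 | - | 88.9 |
| | 2 | - | - | - | - | 181.7 | 181.7 |
| | 3 | - | - | - | - | 89.2 | 89.2 |
| | Subtotal | 27.8 | 20.8 | 26.2 | 14.1 | 270.9 | 359.8 |
| Total | 1 | 1,098.7 | 1,689.4 | 1,858.6 | 2,686.6 | - | 7,333.2 |
| | 2 | - | - | - | - | 790.3 | 790.3 |
| | 3 | - | - | - | - | 381.4 | 381.4 |
| | Total | 1,098.7 | 1,689.4 | 1,858.6 | 2,686.6 | 1,171.0 | 8,505.0 |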
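3. Application of Sample Design Weights

Ⅳ. Call-back Model Specification and Case Study Results

1. Preliminary Analysis for Model Specification and Initial Parameter Estimates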
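<Table 6> Proportion of households by household size

| Category | G (household size): 1 person | 2 persons | 3 persons | 4 or more |
|---|---|---|---|---|
| 2010 | 0.201 | 0.223 | 0.236 | 0.340 |
| | 0.151 | 0.229 | 0.254 | 0.367 |
| ( ) | 0.150 | 0.230 | 0.253 | 0.366 |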
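As an arithmetic check, household-size shares can be computed from the weighted D = 1 totals in Table 5. This is a minimal sketch under two assumptions that the surviving text does not confirm: that D = 1 denotes households with a completed interview, and that the last row of Table 6 is this design-weighted distribution. Variable names are illustrative.

```python
# Weighted D = 1 counts by household size (totals block of Table 5).
weighted_d1 = {"1": 1098.7, "2": 1689.4, "3": 1858.6, "4+": 2686.6}

# Share of each household-size group among the weighted D = 1 households.
weighted_total = sum(weighted_d1.values())
shares = {g: round(w / weighted_total, 3) for g, w in weighted_d1.items()}

print(shares)
```

Rounded to three decimals, these shares coincide with the last row of Table 6, which is consistent with the assumption but does not prove it.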
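<Table 7> Contact success ratio by number of contacts and household size

| A (number of contacts) | G (household size): 1 | 2 | 3 | 4 | Average ( ) |
|---|---|---|---|---|---|
| 1 | 0.1565 | 0.2401 | 0.2481 | 0.2907 | 0.2339 |
| 2 | 0.1841 | 0.2567 | 0.2754 | 0.2795 | 0.2489 |
| 3 | 0.3690 | 0.4744 | 0.4732 | 0.5041 | 0.4552 |
| 4 | 0.4295 | 0.4305 | 0.4467 | 0.4195 | 0.4316 |
| 5 | 0.5532 | 0.5257 | 0.4904 | 0.5088 | 0.5195 |
| 6 | 0.7818 | 0.7550 | 0.7666 | 0.7142 | 0.7544 |
| ( ) | 0.4124 | 0.4470 | 0.4501 | 0.4528 | 0.4406 |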
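<Table 8> Proportion of households completing the survey by number of contacts

| A (number of contacts) | G (household size): 1 | 2 | 3 | 4 | Average ( ) |
|---|---|---|---|---|---|
| 1 | 0.9961 | 0.9981 | 0.9982 | 0.9985 | 0.9977 |
| 2 | 0.9888 | 0.9932 | 0.9938 | 0.9936 | 0.9923 |
| 3 | 0.9647 | 0.9746 | 0.9746 | 0.9746 | 0.9721 |
| 4 | 0.8125 | 0.7963 | 0.8028 | 0.7715 | 0.7958 |
| 5 | 0.6274 | 0.5464 | 0.4958 | 0.4582 | 0.5319 |
| 6 | 0.4319 | 0.3392 | 0.3802 | 0.1854 | 0.3342 |
| ( ) | 0.8735 | 0.9054 | 0.9090 | 0.9091 | 0.8992 |

<Table 9> Censoring ratio by number of contacts and household size

| A (number of contacts) | G (household size): 1 | 2 | 3 | 4 | Average ( ) |
|---|---|---|---|---|---|
| 1 | 0.0010 | 0.0009 | 0.0008 | 0.0009 | 0.0009 |
| 2 | 0.0009 | 0.0008 | 0.0008 | 0.0009 | 0.0008 |
| 3 | 0.0065 | 0.0072 | 0.0072 | 0.0081 | 0.0073 |
| 4 | 0.0960 | 0.1095 | 0.1130 | 0.1244 | 0.1107 |
| 5 | 0.2476 | 0.2699 | 0.2604 | 0.3012 | 0.2698 |
| ( ) | 0.0704 | 0.0776 | 0.0765 | 0.0871 | |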
2. Call-back Model Specification and Parameter Estimation
3. Comparison of Estimates by Model
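<Table 10> Comparison of estimated group proportions by model

| Category | 1 person | 2 persons | 3 persons | 4 or more |
|---|---|---|---|---|
| 2010 | 0.201 | 0.223 | 0.236 | 0.340 |
| ( ) | 0.150 | 0.230 | 0.253 | 0.366 |
| [1] | 0.150 | 0.230 | 0.253 | 0.366 |
| [2] | 0.167 | 0.229 | 0.252 | 0.352 |
| [3] | 0.177 | 0.230 | 0.258 | 0.335 |

Ⅴ. Conclusion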
References

Alho, J. 1990. Adjusting for Nonresponse Bias Using Logistic Regression. Biometrika 77: 617-624.

Biemer, P., P. Chen, and K. Wang. 2013. Using Level-of-effort Paradata in Non-response Adjustments with Application to Field Surveys. Journal of the Royal Statistical Society A 176: Part 1, 147-168.

Biemer, P. and M. Link. 2007. Evaluating and Modeling Early Cooperator Bias in RDD Surveys. In J.M. Lepkowski, C. Tucker, J.M. Brick, E.D. De Leeuw, L. Japec, P.J. Lavrakas, M.W. Link, and R.L. Sangster (eds). Advances in Telephone Survey Methodology. Hoboken: Wiley: 587-617.

Couper, M.P. 1998. Measuring Survey Quality in a CASIC Environment. In Proceedings of the Survey Research Methods Section, American Statistical Association: 42-49.

Dempster, A.P., N.M. Laird, and D.B. Rubin. 1977. Maximum Likelihood from Incomplete Data via the EM Algorithm. Journal of the Royal Statistical Society B 39: Part 1, 1-38.

Drew, J.H. and W.A. Fuller. 1980. Modeling Nonresponse in Surveys with Callbacks. In Proceedings of the Survey Research Methods Section, American Statistical Association: 639-642.

Göksel, H., D. Judkins, and W. Mosher. 1992. Nonresponse Adjustments for a Telephone Follow-up to a National In-person Survey. Journal of Official Statistics 8: 417-431.

Groves, R.M. and M.P. Couper. 1998. Nonresponse in Household Interview Surveys. New York: Wiley.

Groves, R.M. and R.L. Kahn. 1979. Surveys by Telephone: A National Comparison with Personal Interviews. New York: Academic Press.

Kim, J.K. and J. Im. 2014. Propensity Score Weighting Adjustment with Several Follow-ups. Biometrika 101: 439-448.

Little, R.J.A. and D.B. Rubin. 2002. Statistical Analysis with Missing Data (2nd ed.). New York: Wiley.

Organisation for Economic Co-operation and Development. 2013. Technical Report of the Survey of Adult Skills (PIAAC). OECD PIAAC. (Available from http://www.oecd.org/site/piaac/_technical%20report_17oct13.pdf)

Politz, A. and W. Simmons. 1949. An Attempt to Get the Not at Homes into the Sample without Callbacks. Journal of the American Statistical Association 44: 9-16.

Rubin, D.B. 1976. Inference and Missing Data. Biometrika 63: 581-592.

Wagner, J., R. Valliant, F. Hubbard, and C.L. Jiang. 2014. Level-of-Effort Paradata and Nonresponse Adjustment Models. Journal of Survey Statistics and Methodology 2(4): 410-432.

Wang, K., J. Murphy, R. Baxter, and J. Aldworth. 2005. Are Two Feet in the Door Better Than One?: Using Process Data to Examine Interviewer Effort and Nonresponse Bias. In Proceedings of the Federal Committee on Statistical Methodology Conference. Arlington, Nov. (Available from http://www.fcsm.gov/05papers/wang-aldworth-etal-vib.pdf)

Wood, A.M., I.R. White, and M. Hotopf. 2006. Using Number of Failed Contact Attempts to Adjust for Nonignorable Non-response. Journal of the Royal Statistical Society A 169: 525-542.

<: 2015/2/1, 2015/2/17>