
DOI: 10.5124/jkma.2011.54.4.419
pISSN: 1975-8456, eISSN: 2093-5951
http://jkma.org
Healthcare Policy

Tools for assessing quality and risk of bias by levels of evidence

Sun Mi Lim, PhD 1, Ein Soon Shin, PhD 2*, Sun Hee Lee, MD 2, Kyung Hwa Seo, MBA 1, Yu Min Jung, RN 2, Ji Eun Jang, BS 2
1 Research Institute for Healthcare Policy, Korean Medical Association, 2 Department of Preventive Medicine, Ewha Womans University School of Medicine, Seoul, Korea
*Corresponding author: Ein Soon Shin, E-mail: shin1121@ewha.ac.kr
Received January 30, 2011. Accepted February 17, 2011.

Abstract
Tools for assessing methodological quality or risk of bias in randomized controlled trials (RCTs) and non-randomized studies (NRS) were reviewed. The van Tulder scale and Cochrane's assessment of risk of bias are the two most useful methodological quality evaluation tools for RCTs. Cochrane's tool includes sequence generation, allocation sequence concealment, blinding, incomplete outcome data, selective outcome reporting, and other potential sources of bias. The Cochrane Collaboration Group recommends the Downs and Black instrument and the Newcastle-Ottawa Scale for evaluating the quality of NRS. In conclusion, this study offers useful information to physicians about tools for assessing the quality of evidence in clinical guidelines. Further research is needed to provide an essential core for evidence-based decision making regarding levels and/or grades of recommendations.

Keywords: Jadad's scale; van Tulder scale; Cochrane's tool; Downs and Black instrument; Newcastle-Ottawa Scale

Introduction

In health care, a field that deals with human life, systematically summarizing the various kinds of evidence in use and assessing the quality of that evidence against comprehensive, explicit criteria [1-3] is a core methodology of evidence-based decision making [4-9] and evidence-based medicine [10-14]. In particular, assessing the quality of evidence plays a key role [16-19] in establishing a scientific culture of care [15] in clinical practice and thereby ultimately raises the quality of care [20]. By identifying unnecessary or flawed evidence and applying that finding in practice, it may also help reduce health care costs.
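Quality assessment is also a core component of clinical practice guideline development, which has recently drawn great interest in Korea, and it has become an important methodological issue because it can decisively affect the quality of a guideline. Most of the literature and evidence produced in Korea consists of non-randomized studies (NRS), yet no quality assessment tool for such studies is in place. It is therefore essential to identify quality assessment tools by level of evidence [21-22] that are needed to produce Korean guidelines suited to local conditions, covering NRS as well as randomized controlled trials (prospective randomized controlled double blind clinical trial, RCT), and to review systematically whether they can validly be applied. Given this need, reviewing the international tools that can be used to assess the quality of the literature at each level of evidence, and proposing tools and methods of use that fit the evidence mainly produced in Korea, is important work toward establishing a culture of evidence-based health care.

© Korean Medical Association. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Journal of the Korean Medical Association

Quality assessment tools for evidence and literature from RCTs

Among the quality assessment tools for RCTs, those of Jadad, van Tulder, Cochrane, Newell, the Scottish Intercollegiate Guidelines Network (SIGN), and the National Institute for Health and Clinical Excellence (NICE) are validated scales or checklists in wide international use. Taking as the reference the 25 items of the consolidated standards of reporting trials (CONSORT) checklist, which was developed from empirical evidence to improve the internal validity of RCTs, we compared the components covered by these six tools. The SIGN checklist covered the largest share of items and thus gave the most comprehensive assessment of RCT reports (Table 1).

The tools most widely used in Korea and abroad are the Jadad scale and the Cochrane Collaboration's assessment method. The Jadad scale has few items and is easy to apply, which explains its wide use, but it has no item on concealment of the random allocation sequence, the most important item in assessing an RCT, so the Cochrane Collaboration does not recommend it. A simple and valid tool for RCT methodology that is recommended in the Cochrane Collaboration's systematic review methodology is the van Tulder scale.

1. Jadad scale

The Jadad scale, also known as the Oxford quality scoring system, is a five-item scale for assessing RCT reports [23]. Two items concern randomization, two concern blinding, and one concerns withdrawals. A report receives 1 point if randomization is mentioned and 1 additional point if the method of randomization is appropriate; an inappropriate method is scored -1, so the randomization score ranges from 0 to 2. Blinding is scored in the same way: 1 point if double blinding is mentioned, 1 additional point if the method of double blinding is appropriate, and -1 if it is inappropriate, for a maximum of 2 points. Finally, 1 point is given if withdrawals are described in the report. Each of the five items is thus worth 1 point, for a maximum total of 5. A total of 0-2 points indicates low quality and 3-5 points indicates high quality (Table 2).

2. van Tulder scale

The van Tulder scale [24] is one of the tools the Cochrane Collaboration recommends for assessing the quality of RCT reports. It has 11 components: adequacy of randomization, concealment of treatment allocation, similarity of baseline characteristics, blinding of patients, blinding of care providers, blinding of outcome assessors, co-interventions, compliance, drop-outs and drop-out rate, timing of outcome assessment, and intention-to-treat analysis. Each item is answered "yes", "no", or "don't know". A report that meets the criteria for five or more items (5 points or more) is rated as high quality (Table 3).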
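The two scoring rules described above (Jadad, and the van Tulder scale) can be sketched in a few lines of Python. This is only an illustrative sketch: the class name JadadReport, the function names, and the string codes for the reviewer's judgements are assumptions made for this example and do not come from the original instruments. The sketch also assumes that the method of randomization or blinding is only scored when the report mentions it.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass
class JadadReport:
    """Reviewer's judgements about one RCT report (hypothetical structure)."""
    randomization_mentioned: bool
    randomization_method: str  # "appropriate", "inappropriate" or "not described"
    blinding_mentioned: bool
    blinding_method: str       # same three codes as above
    withdrawals_described: bool


def _jadad_domain(mentioned: bool, method: str) -> int:
    """Score one two-point domain (randomization or blinding), range 0-2."""
    if not mentioned:
        return 0
    score = 1                  # 1 point for mentioning it
    if method == "appropriate":
        score += 1             # additional point for an appropriate method
    elif method == "inappropriate":
        score -= 1             # deduct 1 point, never below 0
    return score


def jadad_score(report: JadadReport) -> int:
    """Total Jadad score, 0-5."""
    return (
        _jadad_domain(report.randomization_mentioned, report.randomization_method)
        + _jadad_domain(report.blinding_mentioned, report.blinding_method)
        + (1 if report.withdrawals_described else 0)
    )


def jadad_quality(score: int) -> str:
    """0-2 points: low quality; 3-5 points: high quality."""
    return "high" if score >= 3 else "low"


# The 11 van Tulder criteria are labelled A to K in Table 3.
VAN_TULDER_ITEMS = tuple("ABCDEFGHIJK")


def van_tulder_score(answers: Mapping[str, str]) -> int:
    """Count the criteria answered "yes"; missing answers count as "don't know"."""
    return sum(1 for item in VAN_TULDER_ITEMS if answers.get(item) == "yes")


def van_tulder_quality(score: int) -> str:
    """Five or more criteria met: high quality."""
    return "high" if score >= 5 else "low"


if __name__ == "__main__":
    report = JadadReport(True, "appropriate", True, "not described", True)
    total = jadad_score(report)
    print(total, jadad_quality(total))
```

Keeping the score and the high/low cut-off in separate functions makes it easy to report the raw score alongside the quality rating.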

Table 1. Quality assessment tools of randomized controlled trials

Columns: CONSORT checklist item; Jadad [23]; van Tulder [24]; Cochrane [1]; Newell's [25]; SIGN [26]; NICE [27]

CONSORT checklist items (rows):
1. Title and abstract
   Identification as a randomized trial in the title
   Structured summary of trial design, methods, results, and conclusions
2. Introduction
   Background and objectives
3. Methods
   Trial design
   Participants
   Interventions
   Outcomes
   Sample size
   Randomization: sequence generation
   Randomization: allocation concealment
   Randomization: implementation
   Blinding
   Statistical methods
4. Results
   Participant flow / numbers analysed / drop-outs
   Recruitment and follow-up
   Baseline data
   Outcomes and estimation
   Ancillary analyses
   Harms
5. Discussion
   Limitations
   Generalisability
   Interpretation
6. Other information
   Registration
   Protocol
   Funding

CONSORT, consolidated standards of reporting trials; SIGN, Scottish Intercollegiate Guidelines Network; NICE, National Institute for Health and Clinical Excellence.

Table 2. Jadad scale

Item                        Maximum points   Description
Randomization               2                1 point if randomization is mentioned
                                             1 additional point if the method of randomization is appropriate
                                             Deduct 1 point if the method of randomization is inappropriate (minimum 0)
Blinding                    2                1 point if blinding is mentioned
                                             1 additional point if the method of blinding is appropriate
                                             Deduct 1 point if the method of blinding is inappropriate (minimum 0)
An account of all patients  1                The fate of all patients in the trial is known. If there are no data the reason is stated

From Jadad AR, et al. Control Clin Trials 1996;17:1-12, Appendix with permission from Elsevier [23].

Table 3. van Tulder scale [24]

Criteria (each answered Yes / no / don't know)
A  Was the method of randomization adequate?
B  Was the treatment allocation concealed?
C  Were the groups similar at baseline regarding the most important prognostic indicators?
D  Was the patient blinded to the intervention?
E  Was the care provider blinded to the intervention?
F  Was the outcome assessor blinded to the intervention?
G  Were co-interventions avoided or similar?
H  Was the compliance acceptable in all groups?
I  Was the drop-out rate described and acceptable?
J  Was the timing of the outcome assessment in all groups similar?
K  Did the analysis include an intention-to-treat analysis?

3. Cochrane's assessment of risk of bias

The Cochrane Collaboration assesses RCT evidence in six domains: sequence generation, allocation concealment, blinding, incomplete outcome data, selective outcome reporting, and other potential threats to validity [1]. Each domain is judged "yes", "no", or "unclear", and the reviewer makes each judgement against detailed criteria for yes, no, and unclear in that domain. "Yes" means low risk of bias, "no" means high risk of bias, and "unclear" means that the information is insufficient (Table 4).

Quality assessment tools for evidence and literature from NRS

In a systematic review of tools for assessing the methodological quality of non-randomized studies, Deeks et al. [21] identified 213 tools; epidemiological studies, surveys of adverse effects of interventions, case-control studies, and uncontrolled studies were excluded from that review. The specific criteria and sub-items of the 12 core domains that Deeks et al. developed for comparing NRS assessment tools are shown in Table 5. In the present study we compared the items of the Downs and Black instrument, the Newcastle-Ottawa scale, and the Reisch, Thomas, and Zaza tools against these criteria and sub-items, and found that the Downs and Black instrument and the Reisch tool matched the NRS assessment criteria most closely (Table 5). The Cochrane Collaboration recommends the Downs and Black instrument as a useful tool for assessing the quality of non-randomized studies and the Newcastle-Ottawa scale for assessing the quality of case-control studies.
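For the Cochrane assessment described in section 3, the meaning of the three answers is easy to misread, because "yes" stands for low risk of bias. A minimal sketch of that mapping follows; the names DOMAINS, RISK_LABEL, and risk_of_bias are hypothetical and chosen only for this illustration, and the sketch deliberately produces no overall score because the text describes judgements per domain only.

```python
from typing import Dict, Mapping

# The six domains listed in section 3.
DOMAINS = (
    "sequence generation",
    "allocation concealment",
    "blinding",
    "incomplete outcome data",
    "selective outcome reporting",
    "other potential threats to validity",
)

# Meaning of each answer as stated in the text.
RISK_LABEL = {
    "yes": "low risk of bias",
    "no": "high risk of bias",
    "unclear": "unclear (insufficient information)",
}


def risk_of_bias(judgements: Mapping[str, str]) -> Dict[str, str]:
    """Translate yes/no/unclear judgements into risk labels, domain by domain.

    Raises ValueError if a domain has not been judged and KeyError if an
    answer is not one of yes, no or unclear.
    """
    missing = [d for d in DOMAINS if d not in judgements]
    if missing:
        raise ValueError("missing judgement for: " + ", ".join(missing))
    return {d: RISK_LABEL[judgements[d].strip().lower()] for d in DOMAINS}


if __name__ == "__main__":
    example = {d: "yes" for d in DOMAINS}
    example["blinding"] = "no"
    example["incomplete outcome data"] = "unclear"
    for domain, label in risk_of_bias(example).items():
        print(f"{domain}: {label}")
```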
1) Downs and Black instrument

The Downs and Black instrument was developed from the basic principles of epidemiology, reviews of study design, and existing checklists for assessing RCTs [28]. It consists of 27 items in five subscales: 10 items on reporting, 3 on external validity, 7 on internal validity (bias), 6 on internal validity (confounding and selection bias), and 1 on power to detect a clinically important effect (Table 6). Each item is scored 0 or 1, with two exceptions: one reporting item can be given up to 2 points, and the power item can be given up to 5 points, so the maximum score is 32. The Downs and Black instrument is a high-quality tool in terms of validity and reliability. It is not, however, suitable for assessing the quality of case-control studies.

2) Newcastle-Ottawa scale

The Newcastle-Ottawa scale can be used for most observational studies, which are non-randomized, but not for assessing RCTs [29]. It has been developed in two versions, one for case-control studies and one for cohort studies. The case-control version assesses selection of cases and controls, comparability of cases and controls, and ascertainment of exposure (Table 7A). The cohort version assesses selection of cohorts, comparability of cohorts, and assessment of outcome (Table 7B). For each item, a star is awarded when the quality of the evidence is high. Each item in the selection and exposure/outcome categories can receive at most one star, and the comparability item can receive at most two stars. With only eight items, the Newcastle-Ottawa scale is a relatively simple scale that is useful for assessing the quality of case-control studies.

Table 4. Cochrane's assessment of risk of bias [1]

Sequence generation
   Yes: referring to a random number table, using a computer random number generator, coin tossing, shuffling cards or envelopes, throwing dice, drawing of lots, minimization*
   No: sequence generated by odd or even date of birth, sequence generated by some rule based on date (or day) of admission, sequence generated by some rule based on hospital or clinic record number
Allocation concealment
   Yes: central allocation (including telephone, web-based and pharmacy-controlled randomization), sequentially numbered drug containers of identical appearance, sequentially numbered, opaque, sealed envelopes
   No: using an open random allocation schedule (e.g. a list of random numbers), assignment envelopes were used without appropriate safeguards, alternation or rotation, date of birth, case record number, any other explicitly unconcealed procedure
Blinding
   Yes: No blinding, but the review authors judge that the outcome and the outcome measurement are not likely to be influenced by lack of blinding; Blinding of participants and key study personnel ensured, and unlikely that the blinding could have been broken; Either participants or some key study personnel were not blinded, but outcome assessment was blinded and the non-blinding of others unlikely to introduce bias
   No: No blinding or incomplete blinding, and the outcome or outcome measurement is likely to be influenced by lack of blinding; Blinding of key study participants and personnel attempted, but likely that the blinding could have been broken; Either participants or some key study personnel were not blinded, and the non-blinding of others likely to introduce bias
Incomplete outcome data
   Yes: No missing outcome data; Reasons for missing outcome data unlikely to be related to true outcome; Missing outcome data balanced in numbers across intervention groups, with similar reasons for missing data across groups; For dichotomous outcome data, the proportion of missing outcomes compared with observed event risk not enough to have a clinically relevant impact on the intervention effect estimate; For continuous outcome data, plausible effect size among missing outcomes not enough to have a clinically relevant impact on observed effect size; Missing data have been imputed using appropriate methods
   No: Reason for missing outcome data likely to be related to true outcome, with either imbalance in numbers or reasons for missing data across intervention groups; For dichotomous outcome data, the proportion of missing outcomes compared with observed event risk enough to induce clinically relevant bias in intervention effect estimate; For continuous outcome data, plausible effect size among missing outcomes enough to induce clinically relevant bias in observed effect size; As-treated analysis done with substantial departure of the intervention received from that assigned at randomization; Potentially inappropriate application of simple imputation
Selective outcome reporting
   Yes: The study protocol is available and all of the study's pre-specified (primary and secondary) outcomes that are of interest in the review have been reported in the pre-specified way; The study protocol is not available but it is clear that the published reports include all expected outcomes, including those that were pre-specified
   No: Not all of the study's pre-specified primary outcomes have been reported; One or more primary outcomes is reported using measurements, analysis methods or subsets of the data (e.g., subscales) that were not pre-specified; One or more reported primary outcomes were not pre-specified (unless clear justification for their reporting is provided, such as an unexpected adverse effect); One or more outcomes of interest in the review are reported incompletely so that they cannot be entered in a meta-analysis; The study report fails to include results for a key outcome that would be expected to have been reported for such a study
Other potential threats to validity
   No: Had a potential source of bias related to the specific study design used; or Stopped early due to some data-dependent process (including a formal-stopping rule); or Had extreme baseline imbalance; or Has been claimed to have been fraudulent; or Had some other problem
   Unclear: Insufficient information to assess whether an important risk of bias exists; or Insufficient rationale or evidence that an identified problem will introduce bias

* Minimization may be implemented without a random element, and this is considered to be equivalent to being random.

Table 5. Quality assessment tools of non-randomized studies

Columns: Evaluation criteria (Deeks et al. [21]); Downs and Black [28]; Newcastle-Ottawa [29]; Reisch et al. [30]; Thomas [31]; Zaza et al. [32]

Evaluation criteria (rows):
1. Background
   Background information provided
2. Sample
   Retrospective/prospective
   Inclusion/exclusion criteria
   Sample size
   Representative
3. Interventions
   Clear specification of interventions
4. Outcomes
   Clear specification of outcomes
5. Creation of groups
   Generation of random sequence
   Concealment of allocation
   How allocation occurred
   Balance groups by design
6. Blinding
   Blind (or double-blind) administration
   Blind outcome assessment
7. Ascertainment
   Receipt of the intervention
   Attributable outcomes
8. Follow-up
   Equal follow-up between groups
   Completeness of follow-up
9. Comparability
   Baseline comparability assessed
   Prognostic factors identified
   Case-mix adjustment
10. Analysis
   Intention-to-treat analysis
   Appropriate analysis methods
11. Interpretation
   Appropriately based on results
   Assessed strength of evidence
   Application/implications
12. Presentation
   Completeness, clarity, structure

Table 6. Downs and Black scale: checklist for measuring study quality [28]

Reporting: Yes=1, No=0
1. Is the hypothesis/aim/objective of the study clearly described?
2. Are the main outcomes to be measured clearly described in the Introduction or Methods section?
3. Are the characteristics of the patients included in the study clearly described?
4. Are the interventions of interest clearly described?
5. Are the distributions of principal confounders in each group of subjects to be compared clearly described? (Yes=2, Partially=1, No=0)
6. Are the main findings of the study clearly described?
7. Does the study provide estimates of the random variability in the data for the main outcomes?
8. Have all important adverse events that may be a consequence of the intervention been reported?
9. Have the characteristics of patients lost to follow-up been described?
10. Have actual probability values been reported (e.g., 0.035 rather than <0.05) for the main outcomes except where the probability value is less than 0.001?

External validity: Yes=1, No=0, Unable to determine=0
11. Were the subjects asked to participate in the study representative of the entire population from which they were recruited?
12. Were those subjects who were prepared to participate representative of the entire population from which they were recruited?
13. Were the staff, places, and facilities where the patients were treated, representative of the treatment the majority of patients receive?

Internal validity - bias: Yes=1, No=0, Unable to determine=0
14. Was an attempt made to blind study subjects to the intervention they have received?
15. Was an attempt made to blind those measuring the main outcomes of the intervention?
16. If any of the results of the study were based on "data dredging", was this made clear?
17. In trials and cohort studies, do the analyses adjust for different lengths of follow-up of patients, or in case-control studies, is the time period between the intervention and outcome the same for cases and controls?
18. Were the statistical tests used to assess the main outcomes appropriate?
19. Was compliance with the intervention/s reliable?
20. Were the main outcome measures used accurate (valid and reliable)?

Internal validity - confounding (selection bias): Yes=1, No=0, Unable to determine=0
21. Were the patients in different intervention groups (trials and cohort studies) or were the cases and controls (case-control studies) recruited from the same population?
22. Were study subjects in different intervention groups (trials and cohort studies) or were the cases and controls (case-control studies) recruited over the same period of time?
23. Were study subjects randomised to intervention groups?
24. Was the randomised intervention assignment concealed from both patients and health care staff until recruitment was complete and irrevocable?
25. Was there adequate adjustment for confounding in the analyses from which the main findings were drawn?
26. Were losses of patients to follow-up taken into account?

Power
27. Did the study have sufficient power to detect a clinically important effect where the probability value for a difference being due to chance is less than 5%? Sample sizes have been calculated to detect a difference of x% and y%.

      Size of smallest intervention group   Score
   A  <n1                                   0
   B  n1-n2                                 1
   C  n3-n4                                 2
   D  n5-n6                                 3
   E  n7-n8                                 4
   F  n8+                                   5

Table 7A. Newcastle-Ottawa quality assessment scale: case-control studies [29]

NEWCASTLE - OTTAWA QUALITY ASSESSMENT SCALE: CASE CONTROL STUDIES
Note: A study can be awarded a maximum of one star for each numbered item within the Selection and Exposure categories. A maximum of two stars can be given for Comparability.

Selection
1) Is the case definition adequate?
   a) yes, with independent validation
   b) yes, eg record linkage or based on self reports
   c) no description
2) Representativeness of the cases
   a) consecutive or obviously representative series of cases
   b) potential for selection biases or not stated
3) Selection of controls
   a) community controls
   b) hospital controls
   c) no description
4) Definition of controls
   a) no history of disease (endpoint)
   b) no description of source

Comparability
1) Comparability of cases and controls on the basis of the design or analysis
   a) study controls for _____ (Select the most important factor.)
   b) study controls for any additional factor (This criteria could be modified to indicate specific control for a second important factor.)

Exposure
1) Ascertainment of exposure
   a) secure record (eg surgical records)
   b) structured interview where blind to case/control status
   c) interview not blinded to case/control status
   d) written self report or medical record only
   e) no description
2) Same method of ascertainment for cases and controls
   a) yes
   b) no
3) Non-response rate
   a) same rate for both groups
   b) non respondents described
   c) rate different and no designation

Conclusion

When basic materials such as evidence-based clinical practice guidelines are used for evidence-based decision making, a core competency of evidence-based medicine, the quality of the retrieved literature must be assessed and the result reflected systematically in the grading of recommendations. Learning how to apply the valid scales in international use and how to interpret their results can therefore help practicing physicians both directly and indirectly. In Korea in particular, most of the evidence available for generating recommendations comes from non-randomized studies, so when decisions must rest on such studies, basic research on generally valid methods of analyzing and assessing evidence is of the utmost importance. Systematically reviewing the international tools that can be used to assess the quality of the literature by level of evidence, whether RCTs or non-randomized studies, and identifying tools that suit the evidence mainly produced in Korea, are simple to use, and meet international standards, together with proposing how to use them, is one of the basic studies needed to establish a culture of evidence-based health care.
In this study we therefore reviewed the international tools that can be used to assess the quality of the literature by level of evidence, a core component of evidence-based guideline development, and looked for tools suited to conditions in Korea. For assessing the quality of evidence from randomized controlled trials, the van Tulder scale and the Cochrane Collaboration's assessment of risk of bias can be recommended as valid tools. For assessing the quality of evidence from non-randomized studies, the Downs and Black instrument is useful, and the Cochrane Collaboration recommends the Newcastle-Ottawa scale for assessing the quality of case-control studies.

Table 7B. Newcastle-Ottawa quality assessment scale: cohort studies [29]

NEWCASTLE - OTTAWA QUALITY ASSESSMENT SCALE: COHORT STUDIES
Note: A study can be awarded a maximum of one star for each numbered item within the Selection and Outcome categories. A maximum of two stars can be given for Comparability.

Selection
1) Representativeness of the exposed cohort
   a) truly representative of the average _____ (describe) in the community
   b) somewhat representative of the average _____ in the community
   c) selected group of users eg nurses, volunteers
   d) no description of the derivation of the cohort
2) Selection of the non exposed cohort
   a) drawn from the same community as the exposed cohort
   b) drawn from a different source
   c) no description of the derivation of the non exposed cohort
3) Ascertainment of exposure
   a) secure record (eg surgical records)
   b) structured interview
   c) written self report
   d) no description
4) Demonstration that outcome of interest was not present at start of study
   a) yes
   b) no

Comparability
1) Comparability of cohorts on the basis of the design or analysis
   a) study controls for _____ (select the most important factor)
   b) study controls for any additional factor (This criteria could be modified to indicate specific control for a second important factor.)

Outcome
1) Assessment of outcome
   a) independent blind assessment
   b) record linkage
   c) self report
   d) no description
2) Was follow-up long enough for outcomes to occur
   a) yes (select an adequate follow up period for outcome of interest)
   b) no
3) Adequacy of follow up of cohorts
   a) complete follow up - all subjects accounted for
   b) subjects lost to follow up unlikely to introduce bias - small number lost - > ___ % (select an adequate %) follow up, or description provided of those lost
   c) follow up rate < ___ % (select an adequate %) and no description of those lost
   d) no statement
Given that most of the literature and evidence produced in Korea consists of non-randomized studies, an important task for the future is to use these quality assessment tools appropriately to single out the higher-quality evidence among non-randomized studies, and to reflect the quality of non-randomized studies in the criteria for grading recommendations, so that Korean research findings can serve as an important basis for clinical decision making.

Key words: Jadad scale; van Tulder scale; Cochrane tool; Downs and Black instrument; Newcastle-Ottawa scale

REFERENCES

1. Higgins JP, Green S, editors. Cochrane handbook for systematic reviews of interventions. Ver. 5.1.0. The Cochrane Collaboration; 2011 [cited 2011 Jan 7]. Available from: http://www.cochrane-handbook.org.
2. Chalmers TC, Smith H Jr, Blackburn B, Silverman B, Schroeder B, Reitman D, Ambroz A. A method for assessing the quality of a randomized control trial. Control Clin Trials 1981;2:31-49.
3. Moher D, Jadad AR, Nichol G, Penman M, Tugwell P, Walsh S. Assessing the quality of randomized controlled trials: an annotated bibliography of scales and checklists. Control Clin Trials 1995;16:62-73.

4. Barratt A. Evidence based medicine and shared decision making: the challenge of getting both evidence and preferences into health care. Patient Educ Couns 2008;73:407-412.
5. Kranke P. Evidence-based practice: how to perform and use systematic reviews for clinical decision-making. Eur J Anaesthesiol 2010;27:763-772.
6. Parrilla-Castellar ER, Almeyda R, Nogales E, Velez M, Ramos M, Rivera JE, Da Vila B, Torres V, Capriles J, Adamsons K. Evidence-based medicine as a tool for clinical decision-making in Puerto Rico. P R Health Sci J 2008;27:135-140.
7. Tiburi MF. Evidence-based medicine as viewed by key decision-makers of health plans in southern Brazil. Health Serv Manage Res 2008;21:185-191.
8. Stolba N, Nguyen TM, Tjoa AM. Towards sustainable decision-support system facilitating EBM. Conf Proc IEEE Eng Med Biol Soc 2007;2007:4355-4358.
9. Wyer PC, Silva SA. Where is the wisdom? I. a conceptual history of evidence-based medicine. J Eval Clin Pract 2009;15:891-898.
10. Collins J. Evidence-based medicine. J Am Coll Radiol 2007;4:551-554.
11. Anastasiu M, Strambu V, Popa F. Evidence-based medicine, conceptual challenge or the future of daily practice? Chirurgia (Bucur) 2007;102:527-530.
12. Carter MJ. Evidence-based medicine: an overview of key concepts. Ostomy Wound Manage 2010;56:68-85.
13. Maluf-Filho F. The importance of evidence-based medicine concepts for the clinical practitioner. Arq Gastroenterol 2009;46:87-89.
14. Borgerson K. Valuing evidence: bias and the evidence hierarchy of evidence-based medicine. Perspect Biol Med 2009;52:218-233.
15. Isaac CA, Franceschi A. EBM: evidence to practice and practice to evidence. J Eval Clin Pract 2008;14:656-659.
16. Kruer MC, Steiner RD. The role of evidence-based medicine and clinical trials in rare genetic disorders. Clin Genet 2008;74:197-207.
17. Rogers W, Ballantyne A. Justice in health research: what is the role of evidence-based medicine? Perspect Biol Med 2009;52:188-202.
18. Kitto S, Petrovic A, Gruen RL, Smith JA. Evidence-based medicine training and implementation in surgery: the role of surgical cultures. J Eval Clin Pract 2010 Aug 4 [Epub]. DOI: 10.1111/j.1365-2753.2010.01526.x.
19. Carretier J, Bataillard A, Fervers B. The patient's role in evidence-based medicine. J Chir (Paris) 2009;146:537-544.
20. Soll RF. Evaluating the medical evidence for quality improvement. Clin Perinatol 2010;37:11-28.
21. Deeks JJ, Dinnes J, D'Amico R, Sowden AJ, Sakarovitch C, Song F, Petticrew M, Altman DG; International Stroke Trial Collaborative Group; European Carotid Surgery Trial Collaborative Group. Evaluating non-randomised intervention studies. Health Technol Assess 2003;7:iii-x, 1-173.
22. MacLehose RR, Reeves BC, Harvey IM, Sheldon TA, Russell IT, Black AM. A systematic review of comparisons of effect sizes derived from randomised and non-randomised studies. Health Technol Assess 2000;4:1-154.
23. Jadad AR, Moore RA, Carroll D, Jenkinson C, Reynolds DJ, Gavaghan DJ, McQuay HJ. Assessing the quality of reports of randomized clinical trials: is blinding necessary? Control Clin Trials 1996;17:1-12.
24. van Tulder M, Furlan A, Bombardier C, Bouter L; Editorial Board of the Cochrane Collaboration Back Review Group. Updated method guidelines for systematic reviews in the Cochrane Collaboration Back Review Group. Spine (Phila Pa 1976) 2003;28:1290-1299.
25. Newell SA, Sanson-Fisher RW, Savolainen NJ. Systematic review of psychological therapies for cancer patients: overview and recommendations for future research. J Natl Cancer Inst 2002;94:558-584.
26. Scottish Intercollegiate Guidelines Network. SIGN 50: a guideline developer's handbook 2008 [Internet]. Edinburgh: Scottish Intercollegiate Guidelines Network; 2008 [cited 2011 Jan 28]. Available from: http://www.sign.ac.uk/pdf/sign50.pdf.
27. National Institute for Health and Clinical Excellence. The guidelines manual 2009 [Internet]. London: National Institute for Health and Clinical Excellence; 2009 [cited 2011 Jan 27]. Available from: http://www.nice.org.uk/media/5f2/44/the_guidelines_manual_2009_-_all_chapters.pdf.
28. Downs SH, Black N. The feasibility of creating a checklist for the assessment of the methodological quality both of randomised and non-randomised studies of health care interventions. J Epidemiol Community Health 1998;52:377-384.
29. Wells GA, Shea B, O'Connell D, Peterson J, Welch V, Losos M, Tugwell P. The Newcastle-Ottawa Scale (NOS) for assessing the quality of nonrandomised studies in meta-analyses [Internet]. Ottawa: Ottawa Hospital Research Institute; c1996-2010 [cited 2011 Mar 28]. Available from: http://www.ohri.ca/programs/clinical_epidemiology/oxford.asp.
30. Reisch JS, Tyson JE, Mize SG. Aid to the evaluation of therapeutic studies. Pediatrics 1989;84:815-827.
31. Thomas H. Quality assessment tool for quantitative studies: effective public health practice project. Hamilton: McMaster University; 2003.
32. Zaza S, Wright-De Agüero LK, Briss PA, Truman BI, Hopkins DP, Hennessy MH, Sosin DM, Anderson L, Carande-Kulis VG, Teutsch SM, Pappaioanou M. Data collection instrument and procedure for systematic reviews in the Guide to Community Preventive Services. Task Force on Community Preventive Services. Am J Prev Med 2000;18(1 Suppl):44-74.

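Peer Reviewers' Commentary

Evidence-based medicine means systematically reviewing the research literature on the cause, diagnosis, treatment, and prevention of a given disease and using scientifically valid and rational findings in patient care. The clinical practice guidelines now being produced by professional societies are likewise written on the basis of evidence-based medicine. In that process, assessing the quality of the evidence in the retrieved literature is very important. Quality assessment helps to eliminate or minimize the bias that can arise in the design, conduct, and analysis of a study. It is an essential means of deciding whether to accept the conclusions of a paper and of judging whether further research is needed. This paper reviews international tools for assessing the quality of retrieved literature and for systematically reflecting the results in the grading of recommendations, and it proposes assessment methods suited to conditions in Korea. It should help to standardize the methodology of evidence-based medicine, and we look forward to further research that advances evidence-based medicine in Korea.

[Summarized by the Editorial Board]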