Table 1. ACR BI-RADS US Lexicon, Simplified for This Study

1. Masses
   Shape:                        Oval; Round; Irregular
   Orientation:                  Parallel; Not parallel
   Margin:                       Circumscribed; Not circumscribed (Indistinct, Angular, Microlobulated, Spiculated)
   Lesion boundary:              Abrupt interface; Echogenic halo
   Echo pattern:                 Anechoic; Hyperechoic; Complex; Hypoechoic; Isoechoic
   Posterior acoustic features:  No posterior acoustic features; Enhancement; Shadowing; Combined pattern
2. Calcifications (if present)
   Macrocalcifications; Microcalcifications out of mass; Microcalcifications in mass
3. Final assessment category
   Category 2: benign finding
   Category 3: probably benign finding
   Category 4: suspicious abnormality
   Category 5: highly suggestive of malignancy
Table 2. Interobserver Agreement for BI-RADS US Lexicon for 60 Solid Breast Masses

BI-RADS descriptors                   κ value (SE)
Shape                                 0.61 (0.09)
Orientation                           0.65 (0.10)
Margin                                0.47 (0.09)
Lesion Boundary                       0.43 ( 0.49, 0.20)*
Echo Pattern                          0.44 (0.12)
Posterior Acoustic Features           0.42 (0.10)
Microcalcifications in mass           0.49 (0.15)
BI-RADS final assessment category     0.46 (0.13)
Patient management                    0.49 (0.09)

Note. SE: standard errors
* overall proportion of agreement with 95% confidence interval

Table 3. Proportion of Agreement between Readers for Detailed Items of Descriptors

Descriptor / item             POA (%)    95% CI
Shape
  Oval                        61.3       42.2-77.6
  Round                       44.4       15.3-77.4
  Irregular                   67.7       49.4-82.0
Orientation
  Parallel                    75.6       59.4-87.1
  Not parallel                65.5       45.7-81.4
Margin
  Circumscribed               66.7       47.1-82.1
  Indistinct                  35.7       19.3-55.9
  Angular                     00
  Microlobulated              36.8       17.2-61.4
  Spiculated                  00
Lesion boundary
  Abrupt interface            43.3       30.8-56.7
  Echogenic halo              00
Echo pattern
  Anechoic                    N/A        N/A
  Hyperechoic                 00
  Complex                     20.0       01.1-70.1
  Hypoechoic                  78.9       64.9-88.5
  Isoechoic                   30.8       10.4-61.1
Post. acoustic features
  No post feature             59.1       43.3-73.3
  Enhancement                 50.0       28.8-71.2
  Shadowing                   40.0       07.3-83.0
  Combined pattern            11.1       00.6-49.3
Microcalcifications
  In mass (−)                 85.5       72.8-93.1
  In mass (+)                 38.5       15.1-67.7
Final assessment
  Category 2, 3               52.9       04.0-72.6
  Category 4, 5               61.9       32.3-68.4
Management
  Biopsy                      61.9       45.7-76.0
  F/U                         52.9       35.4-69.8

Note. POA: proportion of agreement; CI: confidence interval; N/A: no account
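The κ values in Table 2 summarize agreement between two readers after correcting for chance. As an illustration only, the following is a minimal sketch of Cohen's κ for two readers, with a simple large-sample approximation of its standard error and the descriptive labels of Landis and Koch (20). The ratings below are hypothetical and are not the study data; the function names and the choice of the approximate SE formula are assumptions, since the source does not say how its standard errors were computed.

```python
from collections import Counter
import math


def cohen_kappa(ratings_a, ratings_b):
    """Return (kappa, observed agreement, chance-expected agreement)."""
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("both readers must rate the same non-empty set of cases")
    n = len(ratings_a)
    # Observed agreement: fraction of cases given the same label by both readers.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the two readers' marginal label frequencies.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    labels = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in labels) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected), observed, expected


def approx_kappa_se(observed, expected, n):
    """Simple large-sample approximation (an assumption, not the paper's method)."""
    return math.sqrt(observed * (1.0 - observed) / (n * (1.0 - expected) ** 2))


def landis_koch_label(kappa):
    """Descriptive strength-of-agreement labels after Landis and Koch."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical orientation ratings for 10 masses (NOT the study data).
reader_1 = ["parallel"] * 6 + ["not parallel"] * 4
reader_2 = ["parallel"] * 5 + ["not parallel", "parallel"] + ["not parallel"] * 3

kappa, observed, expected = cohen_kappa(reader_1, reader_2)
se = approx_kappa_se(observed, expected, len(reader_1))
print(f"kappa={kappa:.2f} (SE {se:.2f}), raw agreement={observed:.2f}, "
      f"label={landis_koch_label(kappa)}")
```

In this hypothetical sample the readers agree on 8 of 10 cases, yet κ is noticeably lower than the raw agreement because half of that agreement is expected by chance. The same function applies to intraobserver agreement if the two lists hold one reader's first and second interpretations.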
Table 4. Intraobserver Agreement for BI-RADS US Lexicon for 60 Solid Breast Masses

                                      Reader 1        Reader 2
BI-RADS descriptors                   κ value (SE)    κ value (SE)
Shape                                 0.77 (0.08)     0.74 (0.08)
Orientation                           0.87 (0.07)     0.83 (0.07)
Margin                                0.71 (0.07)     0.72 (0.07)
Lesion Boundary                       0.70 (0.09)     0.57 (0.14)
Echo Pattern                          0.62 (0.10)     0.57 (0.14)
Posterior Acoustic Features           0.59 (0.09)     0.65 (0.10)
Microcalcifications in mass           0.90 (0.07)     0.82 (0.13)
BI-RADS final assessment category     0.72 (0.09)     0.65 (0.08)
Patient management                    0.83 (0.08)     0.77 (0.08)

Note. SE: standard errors

Table 5. Proportion of Agreement Within the Reader for Detailed Items of Descriptors

                              Reader 1                Reader 2
Descriptor / item             POA (%)    95% CI       POA (%)    95% CI
Shape
  Oval                        77.3       54.2-91.3    70.0       50.4-84.6
  Round                       62.5       25.9-89.8    71.4       30.3-94.9
  Irregular                   79.0       62.2-89.9    78.1       59.6-90.1
Orientation
  Parallel                    88.6       72.3-96.3    87.2       71.8-95.2
  Not parallel                86.2       67.4-95.5    80.8       60.0-92.7
Margin
  Circumscribed               76.9       55.9-90.3    71.4       51.1-86.1
  Indistinct                  71.4       47.7-87.8    63.6       40.8-82.0
  Angular                     0                       25.0       01.3-78.1
  Microlobulated              61.9       38.7-81.1    73.3       44.8-91.1
  Spiculated                  0                       66.7       12.5-98.2
Lesion boundary
  Abrupt interface            76.3       59.4-88.0    87.0       74.5-94.2
  Echogenic halo              71.0       51.8-85.1    46.2       20.4-73.9
Echo pattern
  Anechoic                    N/A                     N/A
  Hyperechoic                 50.0       09.2-90.8    0
  Complex                     50.0       14.0-86.1    33.3       01.8-87.5
  Hypoechoic                  83.3       69.2-92.0    84.9       71.9-92.8
  Isoechoic                   41.7       16.5-71.4    50.0       22.3-77.7
Post. features
  No feature                  64.1       47.2-78.3    78.3       63.2-88.6
  Enhancement                 56.5       34.9-76.1    66.7       41.2-85.6
  Shadowing                   40.0       07.3-83.0    25.0       01.3-78.1
  Combined pattern            62.5       25.9-89.8    50.0       26.6-97.3
Microcalcifications
  In mass (−)                 95.9       84.9-99.3    96.4       86.4-99.4
  In mass (+)                 84.6       53.7-97.3    71.4       30.3-94.9
Final assessment
  Category 2, 3               77.8       38.6-87.5    85.3       36.9-87.2
  Category 4, 5               91.3       68.0-92.5    83.9       57.3-88.4
Management
  Biopsy                      91.3       78.3-97.2    78.1       59.7-90.1
  F/U                         77.8       51.9-92.6    80.0       62.5-90.0

Note. POA: proportion of agreement; CI: confidence interval; N/A: no account
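Each proportion of agreement in Tables 3 and 5 is reported with a 95% confidence interval. The source does not state which interval method was used, so the following is only a sketch under the assumption of a Wilson score interval with continuity correction, a common choice for proportions from small samples. The function name and the counts in the example (19 agreements among 31 ratings) are hypothetical and are not taken from the study.

```python
import math


def wilson_cc_interval(successes, n, z=1.959964):
    """Wilson score interval with continuity correction for a binomial proportion.

    `successes` is the number of agreements and `n` the number of ratings;
    z = 1.959964 corresponds to a two-sided 95% interval.
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    q = 1.0 - p
    denom = 2.0 * (n + z * z)
    lower_root = z * z - 2.0 - 1.0 / n + 4.0 * p * (n * q + 1.0)
    upper_root = z * z + 2.0 - 1.0 / n + 4.0 * p * (n * q - 1.0)
    lower = (2.0 * n * p + z * z - 1.0 - z * math.sqrt(lower_root)) / denom
    upper = (2.0 * n * p + z * z + 1.0 + z * math.sqrt(upper_root)) / denom
    # By convention the bounds are pinned to 0 and 1 at the extreme proportions.
    if successes == 0:
        lower = 0.0
    if successes == n:
        upper = 1.0
    return max(0.0, lower), min(1.0, upper)


# Hypothetical example: 19 agreements out of 31 ratings (NOT the study counts).
agreements, total = 19, 31
low, high = wilson_cc_interval(agreements, total)
print(f"POA = {100 * agreements / total:.1f}%, "
      f"95% CI {100 * low:.1f}-{100 * high:.1f}")
```

With small denominators the interval is wide, which is consistent with the broad intervals shown in the tables for rarely used items such as shadowing or complex echo pattern.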
Fig. 1. Two cases showing complete inter- and intraobserver agreement in both the description and the final assessment category.
A. Transverse sonogram in a 46-year-old woman having a screening-detected mass. Both observers agreed completely: oval shape, parallel orientation, circumscribed margin, abrupt interface of lesion boundary, hypoechoic echo pattern, no posterior acoustic features, and no calcification; category 2, requiring routine follow-up. This case was confirmed as fibrocystic change.
B. Longitudinal sonogram in a 74-year-old woman with a palpable mass. Both observers agreed completely: irregular shape, not parallel orientation, microlobulated margin, abrupt interface of lesion boundary, hypoechoic echo pattern, no posterior acoustic features, and no calcification; category 4, requiring biopsy. Gun biopsy revealed invasive ductal carcinoma.
Eun Hye Lee et al.: BI-RADS US Lexicon and Final Assessment Category for Solid Breast Masses

studies are few (14, 19). Baker et al. (14), in an observer variability study of the sonographic terminology proposed by Stavros (17), reported that interobserver agreement was moderate to substantial (κ = 0.40-0.80) and that intraobserver agreement was substantial (κ = 0.62-0.79). However, inter- and intraobserver agreement for the final assessment was only moderate (κ = 0.51; κ = 0.63), and they concluded that consistency in the use of sonographic terminology was, overall, still insufficient. Lazarus et al. (19) [reported that] interobserver agreement for the BI-RADS US lexicon varied

Fig. 2. Two malignant cases showing the interobserver variability.
A. Transverse sonogram in a 46-year-old woman having a screening-detected mass. Both observers agreed completely on the description of the mass: oval shape, parallel orientation, microlobulated margin, abrupt interface of lesion boundary, hypoechoic echo pattern, no posterior acoustic features, and no calcification. One observer assessed the mass as category 4 and the other as category 3. Gun biopsy revealed invasive ductal carcinoma.
B. Transverse sonogram in a 41-year-old woman having a screening-detected mass (arrows). One reader described the mass as having an irregular shape and microlobulated margin and assessed it as category 4. The other described the mass as having an irregular shape and indistinct margin, but assessed it as category 3. This case was confirmed as invasive ductal carcinoma.

Fig. 3. Two malignant cases showing the intraobserver variability.
A. Transverse sonogram in a 38-year-old woman with a palpable mass, confirmed as invasive ductal carcinoma. One reader initially described the mass as having an oval shape, with a final assessment of category 3. Four weeks later, the same reader described the mass as having an irregular shape, with a final assessment of category 4.
B. Transverse sonogram in a 36-year-old woman with a palpable mass, confirmed as ductal carcinoma in situ. One reader initially described the mass as having an indistinct margin, with a final assessment of category 3. Four weeks later, the same reader described the mass as having a circumscribed margin, with a final assessment of category 4.
1. Baker JA, Kornguth PJ, Lo JY, Williford ME, Floyd CE Jr. Breast cancer: prediction with artificial neural network based on BI-RADS standardized lexicon. Radiology 1995;196:817-822
2. Liberman L, Abramson AF, Squires FB, Glassman JR, Morris EA, Dershaw DD. The breast imaging reporting and data system: positive predictive value of mammographic features and final assessment categories. AJR Am J Roentgenol 1998;171:35-40
3. Orel SG, Kay N, Reynolds C, Sullivan DC. BI-RADS categorization as a predictor of malignancy. Radiology 1999;211:845-850
4. Beam CA, Layde PM, Sullivan DC. Variability in the interpretation of screening mammograms by US radiologists. Findings from a national sample. Arch Intern Med 1996;156:209-213
5. Baker JA, Kornguth PJ, Floyd CE Jr. Breast imaging reporting and data system standardized mammography lexicon: observer variability in lesion description. AJR Am J Roentgenol 1996;166:773-778
6. Kerlikowske K, Grady D, Barclay J, Frankel SD, Ominsky SH, Sickles EA, et al. Variability and accuracy in mammographic interpretation using the American College of Radiology Breast Imaging Reporting and Data System. J Natl Cancer Inst 1998;90:1801-1809
7. Caplan LS, Blackman D, Nadel M, Monticciolo DL. Coding mammograms using the classification probably benign finding--short interval follow-up suggested. AJR Am J Roentgenol 1999;172:339-342
8. Berg WA, Campassi C, Langenberg P, Sexton MJ. Breast imaging reporting and data system: inter- and intraobserver variability in feature analysis and final assessment. AJR Am J Roentgenol 2000;174:1769-1777
9. Lehman C, Holt S, Peacock S, White E, Urban N. Use of the American College of Radiology BI-RADS guidelines by community radiologists: concordance of assessments and recommendations assigned to screening mammograms. AJR Am J Roentgenol 2002;179:15-20
10. Pijnappel RM, Peeters PH, Hendricks JH, Mali WP. Reproducibility of mammographic classifications for non-palpable suspect lesions with microcalcifications. Br J Radiol 2004;7:312-314
11. Ciatto S, Houssami N, Apruzzese A, Bassetti E, Brancato B, Carozzi F, et al. Reader variability in reporting breast imaging according to BI-RADS assessment categories (the Florence experience). Breast 2006;15:44-51
12. ,,,,,.. 2004;51:351-356
13. Skaane P, Engedal K, Skjennald A. Interobserver variation in the interpretation of breast imaging. Comparison of mammography, ultrasonography, and both combined in the interpretation of palpable noncalcified breast masses. Acta Radiol 1997;38:497-502
14. Baker JA, Kornguth PJ, Soo MS, Walsh R, Mengoni P. Sonography of solid breast lesions: observer variability of lesion description and assessment. AJR Am J Roentgenol 1999;172:1621-1625
15. Skaane P, Olsen JB, Sager EM, Abdelnoor M, Berger A, Kullmann G, et al. Variability in the interpretation of ultrasonography in patients with palpable noncalcified breast tumors. Acta Radiol 1999;40:169-175
16. Arger PH, Sehgal CM, Conant EF, Zuckerman J, Rowling SE, Patton JA. Interreader variability and predictive value of US descriptions of solid breast masses: pilot study. Acad Radiol 2001;8:335-342
17. Stavros AT, Thickman D, Rapp CL, Dennis MA, Parker SH, Sisney GA. Solid breast nodules: use of sonography to distinguish between benign and malignant lesions. Radiology 1995;196:123-134
18. American College of Radiology. Breast imaging reporting and data system, breast imaging atlas. 4th ed. Reston, Va: American College of Radiology, 2003
19. Lazarus E, Mainiero MB, Schepps B, Koelliker SL, Livingston LS. BI-RADS lexicon for US and mammography: interobserver variability and positive predictive value. Radiology 2006;239:385-391
20. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics 1977;33:159-174
21. Feinstein AR, Cicchetti DV. High agreement but low Kappa: I. The problems of two paradoxes. J Clin Epidemiol 1990;43:543-549
22. Cicchetti DV, Feinstein AR. High agreement but low Kappa: II. The problems of two paradoxes. J Clin Epidemiol 1990;43:551-558
23. Lantz CA, Nebenzahl E. Behavior and interpretation of the κ statistic: resolution of the two paradoxes. J Clin Epidemiol 1996;49:431-434
24. Lehman CD, Miller L, Rutter CM, Tsu V. Effect of training with the American college of radiology breast imaging reporting and data system lexicon on mammographic interpretation skills in developing countries. Acad Radiol 2001;8:647-650
25. Berg WA, D'Orsi CJ, Jackson VP, Bassett LW, Beam CA, Lewis RS, et al. Does training in the breast imaging reporting and data system (BI-RADS) improve biopsy recommendations or feature analysis agreement with experienced breast imagers at mammography? Radiology 2002;224:871-880
Breast Imaging Reporting and Data System (BI-RADS) US Lexicon and Final Assessment Category for Solid Breast Masses: the Rates of Inter- and Intraobserver Agreement 1

Eun Hye Lee, M.D., Joo Hee Cha, M.D., Byung Jae Cho, M.D. 2, Young Hwan Koh, M.D., Byung Jae Youn, M.D., Woo Kyung Moon, M.D. 3

1 Department of Radiology, Seoul Municipal Boramae Hospital
2 Department of Radiology, Cheil General Hospital & Women's Healthcare Center
3 Department of Radiology and Clinical Research Institute, Seoul National University Hospital

Purpose: To evaluate the rates of inter- and intraobserver agreement of the BI-RADS US lexicon.

Materials and Methods: Two radiologists reviewed 60 sonograms of solid breast masses to evaluate interobserver agreement. After four weeks, the radiologists reinterpreted the series to evaluate intraobserver agreement. The radiologists described shape, orientation, margin, lesion boundary, echo pattern, posterior acoustic features and microcalcifications. A final assessment category and a management plan were suggested for each case. The rates of inter- and intraobserver agreement were measured with kappa statistics.

Results: Interobserver agreement was highest for orientation (κ = 0.65) and shape (κ = 0.61) and lowest for posterior acoustic features (κ = 0.42). Interobserver agreement was moderate for the final assessment category (κ = 0.46) and management (κ = 0.49). Intraobserver agreement was highest for microcalcifications in mass (κ = 0.90, 0.82) and orientation (κ = 0.87, 0.83) and lowest for echo pattern (κ = 0.62, 0.57) and posterior acoustic features (κ = 0.59, 0.65). For the final assessment category and management, intraobserver agreement was substantial or nearly complete (κ = 0.65-0.83).

Conclusion: Inter- and intraobserver agreement varied across the descriptors of the BI-RADS US lexicon for solid breast masses. Among the descriptors, margin and lesion boundary showed lower agreement. A modification of the BI-RADS US lexicon with more detailed guidelines, followed by continuous education, is suggested.

Index words: Breast, US; Breast neoplasms, US; Images, interpretation; Ultrasound (US), quality assurance

Address reprint requests to: Byung Jae Cho, M.D., Department of Radiology, Cheil General Hospital & Women's Healthcare Center, 1-19, Mookjung-dong, Chung-gu, Seoul 100-380, Korea. Tel. 82-2-2000-7382 Fax. 82-2-2000-7389 E-mail: bj31.cho@cgh.co.kr