KISEP Head and Neck
Korean J Otolaryngol 1999;42:1160-8

The Vocal Tract and Speech Intelligibility of Tracheoesophageal Shunt Patients after Total Laryngectomy

Cheul-Su Kim, MD 1, Soo-Geun Wang, MD 1, Woo-Young Shim, MD 1, Hyung-Jin Park, MD 1, Chang-Su Kim, MD 1, Jung-Hwan Park, MD 1, Hyeong-Jun Jang, MD 1, Suk-Hun Lee, MD 1, Suck-Hong Lee, MD 2, Byung-Gon Yang, PhD 3, Moo-Jin Baek, MD 4 and Cheol-Woo Jo, PhD 5

1 Department of Otolaryngology and 2 Department of Diagnostic Radiology, College of Medicine, Pusan National University, Pusan
3 Department of English, College of Humanities, Dongeui University, Pusan
4 Department of Otolaryngology, College of Medicine, Inje University, Pusan Paik Hospital, Pusan
5 Department of Control and Instrumentation Engineering, Changwon National University, Changwon, Korea

ABSTRACT

Background and Objectives: This paper proposes that voice rehabilitation after total laryngectomy can be guided by accurate estimation and simulation of the patient's vocal tract.

Material and Methods: The authors studied the shape of the vocal tract during phonation of five Korean vowels /u, o, a, e, i/ in tracheoesophageal shunt patients using magnetic resonance imaging (MRI). The vocal tract was determined for each vowel from the MR images. First, speech produced by the patients was analyzed and checked for speech intelligibility. Then the authors synthesized vowels from the vocal tract area of each vowel and from vocal tracts with an expanded pharyngeal section.

Results: The results were as follows.
1) In the patients' actual speech, /a/, /e/, and /i/ were similar to natural sounds. The sound of /o/ was heard as //. The sound of /u/ was heard as a strained /u/.
2) The vowels /a/ and /e/ synthesized from MRI were heard as natural sounds. The sounds of /u/, /o/, and /i/ were heard as other sounds.
3) For the vowel /o/, the vowel synthesized with the pharyngeal section expanded threefold sounded more natural than the one synthesized with twofold expansion. The vowel synthesized from Formfrek sounded more natural than the one from AreatoFormant.

Conclusion: Some of the sounds synthesized from MRI disagree with the actual sounds produced by the subjects. This could be best identified by synthesis from the area data. Future MRI studies should consider this problem to obtain more accurate measurements. In addition, pharyngeal areas of various sizes should be tested to secure better speech output, because a correct vocal tract shape ensures correct vowel pronunciation.

KEY WORDS: Laryngectomy · Vocal tract · Tracheoesophageal shunt.
Analysis of actual speech and speech intelligibility

MRI acquisition of the vocal tract
… The time required for imaging was 19 seconds.

Dental correction
Because interference from the teeth makes it difficult to obtain clear MR images, coronal CT was also performed in cases II and III, who had teeth. The space occupied by the teeth, which appears clearly on CT, was measured and applied as a correction to the MRI cross-sectional area measurements for all vowels. Cases I and IV had no teeth (Fig. 2).

Calculation of cross-sectional area
The cross section of each segment on the developed MRI film was entered into a computer with a scanner, and the air space of the vocal tract was selected. This region was then passed to the area calculation program Area properties (V3.2), which automatically computed the cross-sectional area of each segment (Fig. 3).

Formant calculation from cross-sectional area by MRI
Using the computer program (Area properties, V3.2) …

Fig. 1. Midsagittal section of vocal tract on MRI in case III.
Fig. 2. Cross section of midoral cavity on CT (a) and MRI (b) in case II.
Fig. 3. Cross sectional area in case III. a: A-A', b: B-B', c: C-C'.
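The step from a measured area function to formant frequencies can be illustrated with a lossless concatenated-tube model. The following is a minimal sketch and not the AreatoFormant or Formfrek program used in the study. It assumes equal-length sections listed from glottis to lips, an ideal open termination at the lips, a sound speed of 35,000 cm/s, and no wall or radiation losses. The function names and the uniform test tube are illustrative only.

```python
import math

SPEED_OF_SOUND = 35000.0  # cm/s; assumed value for warm, humid air


def chain_d(freq_hz, areas_cm2, section_len_cm):
    """D element of the lossless chain (ABCD) matrix from glottis to lips.

    With zero pressure at the lips, glottal flow = D * lip flow, so the
    resonances (formants) are the frequencies where D = 0.
    """
    kl = 2.0 * math.pi * freq_hz * section_len_cm / SPEED_OF_SOUND
    cos_kl, sin_kl = math.cos(kl), math.sin(kl)
    # The running product is kept as [[a, j*b], [j*c, d]] with real a, b, c, d.
    a, b, c, d = 1.0, 0.0, 0.0, 1.0
    for area in areas_cm2:
        # Section impedance is proportional to 1/area; the constant rho*c
        # cancels in D and is therefore omitted.
        sb, sc = sin_kl / area, sin_kl * area
        a, b, c, d = (a * cos_kl - b * sc, a * sb + b * cos_kl,
                      c * cos_kl + d * sc, d * cos_kl - c * sb)
    return d


def formants(areas_cm2, section_len_cm, f_max=5000.0, step=5.0):
    """Frequencies (Hz) below f_max at which D changes sign."""
    found = []
    prev_f = step
    prev_d = chain_d(prev_f, areas_cm2, section_len_cm)
    f = prev_f + step
    while f < f_max:
        d = chain_d(f, areas_cm2, section_len_cm)
        if prev_d == 0.0:
            found.append(prev_f)
        elif prev_d * d < 0.0:
            # Linear interpolation between the two grid points.
            found.append(prev_f - prev_d * (f - prev_f) / (d - prev_d))
        prev_f, prev_d = f, d
        f += step
    return found


# Uniform 17.5 cm tube (35 sections of 0.5 cm), a quarter-wave resonator.
uniform = formants([3.0] * 35, 0.5)
```

For the uniform tube, the values in `uniform` should fall close to the quarter-wave resonances c/4L, 3c/4L, and so on (about 500, 1500, and 2500 Hz for the first three). Replacing the constant areas with a measured area function gives the corresponding estimate for a real vocal tract shape, within the limits of the lossless assumption.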
Acoustic synthesis and analysis (from formants calculated by AF and FF)

Table 1. Synthetic parameters for vowel /o/ (SenSyn 1.0)

Total number of waveform samples: 8000
Current configuration: 60 parameters

SYM  V/C  MIN   VAL    MAX        SYM  V/C  MIN   VAL    MAX
DU   C    30    400    5000       UI   C    1     5      20
SR   C    5000  20000  20000      NF   C    1     6      6
SS   C    1     2      3          RS   C    1     8      8191
SB   C    0     1      1          CP   C    0     0      1
OS   C    0     0      20         GV   C    0     60     80
GH   C    0     60     80         GF   C    0     60     80
F0   V    0     1000   5000       AV   C    0     60     80
OQ   V    10    50     99         SQ   V    100   200    500
TL   V    0     0      41         FL   V    0     0      100
DI   V    0     0      100        AH   V    0     0      80
AF   V    0     0      80         F1   V    180   540    1300
B1   V    30    90     1000       DF1  V    0     0      100
DB1  V    0     0      400        F2   V    550   900    3000
B2   V    40    110    1000       F3   V    1200  2600   4800
B3   V    60    150    1000       F4   V    2400  3500   4990
B4   V    100   100    1000       F5   V    3000  3700   4990
B5   V    100   100    1500       F6   V    3000  4500   4990
B6   V    100   100    4000       FNP  V    180   280    500
BNP  V    40    900    1000       FNZ  V    180   280    800
BNZ  V    40    900    1000       FTP  V    300   2150   3000
BTP  V    40    900    1000       FTZ  V    300   2150   3000
BTZ  V    40    900    2000       A2F  V    0     0      80
A3F  V    0     0      80         A4F  V    0     0      80
A5F  V    0     0      80         A6F  V    0     0      80
AB   V    0     0      80         B2F  V    40    250    1000
B3F  V    60    300    1000       B4F  V    100   320    1000
B5F  V    100   360    1500       B6F  V    100   1500   80
ANV  V    0     0      80         A1V  V    0     60     80
A2V  V    0     60     80         A3V  V    0     60     80
A4V  V    0     60     80         ATV  V    0     0      80

Varied parameters:

Time  F0    AV        Time  F0    AV
0     1766  30        50    1750  48
5     1767  31        :     :     :
10    1769  33        360   1390  35
15    1771  34        365   1390  34
20    1773  36        370   1390  32
25    1770  37        375   1390  29
30    1776  41        380   1350  26
35    1762  42        385   1350  23
40    1758  45        390   1350  20
45    1754  47        395   1350  17
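How a formant synthesizer turns values such as F1 to F3 and B1 to B3 in Table 1 into a waveform can be sketched with a cascade of second-order digital resonators, the basic building block of Klatt-type synthesizers. This is a simplified illustration and not SenSyn itself. It uses a plain impulse train as the voice source, only three formants, and a constant F0 of 176.6 Hz, which assumes the F0 column of Table 1 is given in tenths of a hertz. All function and variable names are hypothetical.

```python
import math

SAMPLE_RATE = 20000   # SR in Table 1 (samples per second)
N_SAMPLES = 8000      # total number of waveform samples in Table 1
# (formant frequency, bandwidth) in Hz for /o/: F1/B1, F2/B2, F3/B3 from Table 1
O_FORMANTS = [(540.0, 90.0), (900.0, 110.0), (2600.0, 150.0)]


def resonator_coeffs(freq_hz, bw_hz, sample_rate):
    """Coefficients of y[n] = a*x[n] + b*y[n-1] + c*y[n-2] (unity gain at DC)."""
    t = 1.0 / sample_rate
    c = -math.exp(-2.0 * math.pi * bw_hz * t)
    b = 2.0 * math.exp(-math.pi * bw_hz * t) * math.cos(2.0 * math.pi * freq_hz * t)
    a = 1.0 - b - c
    return a, b, c


def cascade(signal, formant_list, sample_rate):
    """Pass the signal through one resonator per formant, in series."""
    out = list(signal)
    for freq_hz, bw_hz in formant_list:
        a, b, c = resonator_coeffs(freq_hz, bw_hz, sample_rate)
        y1 = y2 = 0.0
        filtered = []
        for x in out:
            y = a * x + b * y1 + c * y2
            filtered.append(y)
            y2, y1 = y1, y
        out = filtered
    return out


def impulse_train(f0_hz, n_samples, sample_rate):
    """Crude voice source: one unit impulse per pitch period."""
    period = round(sample_rate / f0_hz)
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]


# Assumed constant pitch of 176.6 Hz (first F0 entry, read as tenths of Hz).
source = impulse_train(176.6, N_SAMPLES, SAMPLE_RATE)
vowel = cascade(source, O_FORMANTS, SAMPLE_RATE)
```

The list `vowel` holds 8000 samples that could be scaled and written to a sound file. A full synthesizer would additionally shape the source, vary F0 and AV over time as in the lower part of Table 1, and include the higher formants.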
Table 2. Results of sound from patients' speech (by case and vowel)

Synthesized sounds from cross-sectional areas

Table 3. Results of formant from Formfrek (Hz)

Case    Vowel  F1    F2    F3
Normal  /u/    370   730   2600
        /o/    540   900   2600
        /a/    800   1345  2710
        /e/    550   2100  2900
        /i/    330   2520  3230
I       /u/    880   1539  2589
        /o/    952   1525  3002
        /a/    539   1214  2595
        /e/    644   1494  2595
        /i/    604   1345  2391
II      /u/    464   1920  2530
        /o/    593   1899  2415
        /a/    724   1632  2515
        /e/    576   1688  2903
        /i/    300   2224  3357
III     /u/    827   1456  2426
        /o/    918   1336  2463
        /a/    883   1414  2563
        /e/    878   1747  2679
        /i/    619   1769  2565
IV      /u/    590   1867  2551
        /o/    660   1539  2531
        /a/    800   1483  2816
        /e/    611   1635  2450
        /i/    430   1736  2894

*F: formant

Table 4. Results of formant from AreatoFormant (Hz)

Case    Vowel  F1    F2    F3    B1   B2    B3
Normal  /u/    370   730   2600  80   90    60
        /o/    540   900   2600  90   110   150
        /a/    800   1345  2710  110  120   120
        /e/    550   2100  2900  60   90    150
        /i/    330   2520  3230  150  70    80
I       /u/    929   1794  2476  110  552   314
        /o/    983   1844  3104  97   2146  404
        /a/    820   1231  2403  91   83    98
        /e/    929   1685  2733  80   118   169
        /i/    690   1635  2312  69   216   153
II      /u/    497   2371  3035  61   169   1808
        /o/    612   2062  2915  64   194   2270
        /a/    756   1932  2662  70   340   1122
        /e/    598   1653  2924  94   143   240
        /i/    353   2136  3357  70   94    134
III     /u/    981   1686  2440  69   102   120
        /o/    1036  1591  2386  74   154   126
        /a/    1003  1472  2524  90   145   148
        /e/    1007  1730  2783  86   104   172
        /i/    683   1690  2764  61   85    128
IV      /u/    654   1942  2731  61   96    118
        /o/    757   1971  2789  63   111   139
        /a/    843   1749  2940  72   247   179
        /e/    706   1763  2426  61   92    107
        /i/    527   1629  2600  60   81    118

*F: formant, B: formant bandwidth
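One rough way to read the formant tables is to ask which normal vowel a computed F1/F2 pair lies closest to, since a mismatch suggests the synthesized vowel may be heard as a different sound. The sketch below does this with a Euclidean distance on log frequencies. The distance measure is an assumption chosen for illustration and is not the listening test used in the study. It also assumes that the five normal rows of Table 3 follow the vowel order /u, o, a, e, i/ used elsewhere in the paper.

```python
import math

# Normal F1, F2 (Hz) from Table 3, assumed to be in the order /u, o, a, e, i/.
NORMAL = {
    "u": (370.0, 730.0),
    "o": (540.0, 900.0),
    "a": (800.0, 1345.0),
    "e": (550.0, 2100.0),
    "i": (330.0, 2520.0),
}


def log_distance(p, q):
    """Euclidean distance between two (F1, F2) pairs on a log-frequency scale."""
    return math.hypot(math.log(p[0] / q[0]), math.log(p[1] / q[1]))


def nearest_vowel(f1_hz, f2_hz):
    """Normal vowel whose (F1, F2) is closest to the given pair."""
    return min(NORMAL, key=lambda v: log_distance((f1_hz, f2_hz), NORMAL[v]))


# Two rows of Table 3 as examples (Formfrek values).
example_a = nearest_vowel(300.0, 2224.0)
example_b = nearest_vowel(880.0, 1539.0)
```

A two-formant distance ignores F3, bandwidths, and voice quality, so it is only a first screening of the tables and cannot replace perceptual judgment.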
Synthesized sounds after modification of the cross-sectional area

Vowel /u/

Vowel /o/

Vowel /a/

Table 5. Results of formant after expansion of pharyngeal segment in vowel /o/ (Hz)

                Expansion of area 2x     Expansion of area 3x
Case  Method    F1    F2    F3           F1    F2    F3
      AF        901   1820  3117         837   1807  3106
      FF        614   1037  2845         600   987   2805
      AF        921   1614  3049         826   1600  3152
      FF        771   1147  2874         708   1083  2858

*AF: AreatoFormant, FF: Formfrek

Table 6. Results of synthesized vowels (by case and vowel, for AF and FF)

*AF: AreatoFormant, FF: Formfrek

Table 7. Results of synthesized sound after expansion of pharyngeal segment in vowel /o/ (expansion of area 2x and 3x, by case and method)

*AF: AreatoFormant, FF: Formfrek
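The effect of the pharyngeal expansion in Table 5 can be checked numerically by measuring how far each F1/F2 pair lies from the normal /o/ target. The sketch below uses a log-frequency distance, which is an illustrative choice and not part of the study. It assumes that the first three values in each row of Table 5 belong to the 2x expansion and the last three to the 3x expansion, and that normal /o/ is F1 = 540 Hz and F2 = 900 Hz, the values that Table 1 also lists for /o/. The case labels "A" and "B" are placeholders for the two cases in the table.

```python
import math

NORMAL_O = (540.0, 900.0)  # assumed normal /o/ target: F1, F2 in Hz

# (case, method) -> ((F1, F2) at 2x, (F1, F2) at 3x), values from Table 5.
# "A" and "B" are placeholder labels for the two cases in the table.
TABLE5 = {
    ("A", "AF"): ((901.0, 1820.0), (837.0, 1807.0)),
    ("A", "FF"): ((614.0, 1037.0), (600.0, 987.0)),
    ("B", "AF"): ((921.0, 1614.0), (826.0, 1600.0)),
    ("B", "FF"): ((771.0, 1147.0), (708.0, 1083.0)),
}


def log_distance(p, q):
    """Euclidean distance between two (F1, F2) pairs on a log-frequency scale."""
    return math.hypot(math.log(p[0] / q[0]), math.log(p[1] / q[1]))


def distances_to_normal():
    """Map each (case, method) to its distance from normal /o/ at 2x and 3x."""
    return {
        key: (log_distance(two_x, NORMAL_O), log_distance(three_x, NORMAL_O))
        for key, (two_x, three_x) in TABLE5.items()
    }


result = distances_to_normal()
for (case, method), (d2, d3) in sorted(result.items()):
    print(f"case {case} {method}: 2x {d2:.3f}  3x {d3:.3f}")
```

Under these assumptions, every row moves closer to the normal /o/ target when the expansion goes from 2x to 3x, and the Formfrek values lie closer to the target than the AreatoFormant values. Both observations are in line with the listening results summarized in the abstract.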
… was thought to be the reason, and in case I …

Vowel /e/

Vowel /i/

Fig. 4. Cross sectional area in /u/ phonation.
Fig. 5. Cross sectional area in /o/ phonation.
Fig. 6. Cross sectional area in /a/ phonation.
Fig. 7. Cross sectional area in /e/ phonation.
Fig. 8. Cross sectional area in /i/ phonation.