Chapter 2. History of Natural Language Processing

Early History (1): First Attempts
- Warren Weaver: proposed machine translation (1949)
  - Idea: translation is a process of dictionary lookup, plus substitution, plus grammatical reordering.
  - Example: "I must go home" -> "Ich muss nach Hause gehen"
- Early machine translation research
  - W. Weaver and A. D. Booth: English-French (early 1950s)
  - Georgetown University and IBM: Russian-English (1954)

Early History (2): Lessons from Early Machine Translation
- Translation is not really possible without understanding.
  - Example (English -> Russian -> English):
    "The spirit is willing but the flesh is weak." -> "The vodka is strong but the meat is rotten."
  - A great amount of world knowledge is needed; a program has to understand what is being said in order to translate it properly.
  - "The pen is in the box." / "The box is in the pen."
- Syntactic ambiguities
  - They are flying planes.
  - Time flies like an arrow.
  - He saw a man on the hill with a telescope.
  - These problems gave a great deal of impetus to work on syntactic theories.

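Early History (3): Information Retrieval
- IBM: in the late 1950s, began information retrieval research on large collections of research papers
- 1964: MEDLARS, an information retrieval system for medical literature, went into service
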
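Early History (4): Other Related Research
- Automata theory
  - Several automata models were proposed from the late 1950s through the 1960s.
  - They are not only the foundation of the theory of computation but also played an important role as models for language analysis.
- Introduction of the idea of heuristic search: Newell and Simon (1956)
- Introduction of the LISP programming language: John McCarthy (1960)
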
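Early History (5): Linguistic Theory
- Chomsky: Syntactic Structures (1957), Aspects of the Theory of Syntax (1965)
  - Transformational generative grammar
  - The concepts of phrase structure and transformation
  - The basis of a sentence is its phrase structure, and a sentence is a transformation of phrase structure.
- C. Hockett: Grammar for the Hearer (1961)
  - Human language understanding does not wait until a sentence has been heard to the end before attempting syntactic analysis. The hearer understands the syntactic structure built up so far while listening, and anticipates which phrase or sentence structure will be uttered next.
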
Natural Language Processing in the 1960s
- Ideas
  - The use of limited domains for language-understanding systems
  - The use of key words to trigger certain actions
  - The translation of English into formal languages
- Some systems
  - Key-word systems: ELIZA, DOCTOR, PARRY, etc.
  - Translating English into a formal system: STUDENT
  - Database question answering: BASEBALL

BASEBALL (1)
- Bert F. Green, Jr., Alice K. Wolf, Carol Chomsky, and Kenneth Laughery (1963)
- Database question answering system
- Database query generation from English
- A system for querying American professional baseball data in natural language

BASEBALL (2)
BASEBALL's database:

  MONTH  PLACE      DAY  GAME  WINNER/SCORE  LOSER/SCORE
  July   Cleveland  6    95    White Sox/2   Indians/0
  July   Boston     7    96    Red Sox/5     Yankees/3
  July   Detroit    7    97    Tigers/10     Athletics/2

Question: Who did the Yankees play on July 7?
After query generator: (OR (July 7 Yankees/ ?ANSWER/) (July 7 ?ANSWER/ Yankees/))
Answer: Red Sox

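The OR query above can be illustrated with a small Python sketch. This is not how BASEBALL itself was implemented; the dictionary field names and the function `who_played` are hypothetical, and the rows simply mirror the slide's table. The point is only that the question matches a game whether the named team appears as the winner or as the loser.

```python
# Rows mirror the table on the slide; field names are illustrative assumptions.
GAMES = [
    {"month": "July", "place": "Cleveland", "day": 6, "game": 95,
     "winner": "White Sox", "loser": "Indians"},
    {"month": "July", "place": "Boston", "day": 7, "game": 96,
     "winner": "Red Sox", "loser": "Yankees"},
    {"month": "July", "place": "Detroit", "day": 7, "game": 97,
     "winner": "Tigers", "loser": "Athletics"},
]


def who_played(team, month, day, games=GAMES):
    """Return the opponents of `team` on the given date.

    Implements the two branches of the OR query: the team may fill
    either the winner slot or the loser slot.
    """
    opponents = []
    for g in games:
        if g["month"] != month or g["day"] != day:
            continue
        if g["winner"] == team:
            opponents.append(g["loser"])
        elif g["loser"] == team:
            opponents.append(g["winner"])
    return opponents


print(who_played("Yankees", "July", 7))
```
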
ELIZA (1)
- Joseph Weizenbaum (at MIT), 1966
- A dialogue system between a human and a computer
- Uses keyword-based pattern matching
- Appears to be a system with considerable intelligence

  Pattern               Response
  (* computers *)       Do computers frighten you?
  (* mother *)          Tell me more about your family.
  (* I believe that *)  How long have you believed that?
  <nothing matched>     Please go on.

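The pattern/response table can be sketched in a few lines of Python. This is a minimal illustration of keyword-triggered responses, not Weizenbaum's original program; the names `RULES`, `DEFAULT`, and `respond` are assumptions, and regular expressions stand in for the `(* keyword *)` patterns.

```python
import re

# (pattern, response) pairs from the table, tried in order.
RULES = [
    (re.compile(r"\bcomputers\b", re.IGNORECASE), "Do computers frighten you?"),
    (re.compile(r"\bmother\b", re.IGNORECASE), "Tell me more about your family."),
    (re.compile(r"\bI believe that\b", re.IGNORECASE), "How long have you believed that?"),
]
DEFAULT = "Please go on."  # used when nothing matches


def respond(utterance):
    """Return the response of the first rule whose keyword occurs in the input."""
    for pattern, response in RULES:
        if pattern.search(utterance):
            return response
    return DEFAULT


print(respond("Perhaps I could learn to get along with my mother."))
```

The program has no model of what the user means: any sentence containing "mother" triggers the same reply, which is why the apparent intelligence is shallow.
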
ELIZA (2)
User : Men are all alike.
Eliza: In what way?
User : They're always bugging us about something or other.
Eliza: Can you think of a specific example?
User : Well, my boyfriend made me come here.
Eliza: Your boyfriend made you come here.
User : He says I'm depressed much of the time.
Eliza: I am sorry to hear you are depressed.
User : It's true. I am unhappy.
Eliza: Do you think coming here will help you not to be unhappy?
User : I need some help; that much seems certain.
Eliza: What would it mean to you if you got some help?
User : Perhaps I could learn to get along with my mother.
Eliza: Tell me more about your family.

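STUDENT
- Developed by Daniel Bobrow, 1968.
- Translates algebra problems into a set of linear equations
  - "John's age now is two times Mary's age." -> JA = 2 * MA
  - "In three years John will be 6 years older than Mary." -> JA + 3 = MA + 6
- Uses pattern matching
  - The needed elements are inserted into the empty slots of a pattern.
- A program that showed that sentences can be analyzed using only simple pattern matching and a small number of heuristic rules
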
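The slot-filling idea can be sketched for the first example sentence. This is a hedged illustration, not Bobrow's program: the single pattern, the `NUMBER_WORDS` table, and the rule that builds variable names from initials (`variable_name`) are all assumptions made for this example.

```python
import re

NUMBER_WORDS = {"two": 2, "three": 3, "four": 4, "five": 5}

# One pattern with three slots: <lhs> [now] is <n> times <rhs>.
PATTERN = re.compile(r"^(?P<lhs>.+?) (?:now )?is (?P<n>\w+) times (?P<rhs>.+?)\.?$")


def variable_name(phrase):
    """Hypothetical naming rule: initials of the phrase, "John's age" -> "JA"."""
    words = re.findall(r"[A-Za-z]+", phrase.replace("'s", ""))
    return "".join(w[0].upper() for w in words)


def translate(sentence):
    """Fill the pattern's slots and emit an equation, or None if no match."""
    m = PATTERN.match(sentence.strip())
    if m is None:
        return None
    n_text = m.group("n").lower()
    n = NUMBER_WORDS.get(n_text)
    if n is None:
        if not n_text.isdigit():
            return None
        n = int(n_text)
    return f"{variable_name(m.group('lhs'))} = {n} * {variable_name(m.group('rhs'))}"


print(translate("John's age now is two times Mary's age."))
```
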
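Case Grammar
- C. Fillmore (1968)
- Focuses on which case role each major noun phrase of a sentence plays with respect to the predicate verb
- Interprets case relations semantically
  - Agent, object, instrument, etc.
- The following two sentences differ in surface structure but have the same deep cases:
  - He opened the door with the key.
  - A key opened the door.
- Very difficult to process mechanically
  - For each individual verb, the dictionary must describe in detail which semantic cases (noun phrases) the verb requires.
  - Tens to hundreds of semantic primitives have to be defined.
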
Natural Language Processing in the 1970s
- The flowering of semantic information processing and the seeds of cognitive science
- Systems
  - SHRDLU (1972)
  - LUNAR (1972)
  - MARGIE (1973)
  - NLPQ (1974)

SHRDLU
- Terry Winograd (1972)
- Transforms sentences into programs (in the blocks-world domain)
  - These carry out various tasks (e.g., moving blocks on a table), search for information in SHRDLU's database, or generate an answer for the user.
- Can handle sentences exhibiting a wide variety of linguistic phenomena
  - Interpreted declarative sentences as database updates, interrogative sentences as database searches, and imperative sentences as specifications for goals; these goals were achieved
- Linguistic coverage was very broad compared to previous programs
  - Can handle quantification, generate natural-sounding dialogue, and answer questions about the history of its dialogue and plan execution.

LUNAR
- Woods, Kaplan, and Nash-Webber (1972)
- A natural language front end for a database containing moon rock sample analyses
- Uses ATNs (Augmented Transition Networks)
- Very general notion of quantification based on predicate calculus
- Uses sophisticated techniques to translate questions into database queries.

SHRDLU and LUNAR
- Use relatively unconstrained language
- Work in very narrow domains
  - SHRDLU: blocks world
  - LUNAR: moon rock sample analysis
- Have complete, privileged knowledge of their world

MARGIE (1)
- Schank, Goldman, Rieger, and Riesbeck (1973)
- Deals with much more unconstrained language, particularly language about human actions
- Based on Conceptual Dependency (CD) theory (by Schank)
- Every EVENT has:
  - an ACTOR
  - an ACTION performed by that actor
  - an OBJECT that the action is performed upon
  - a DIRECTION in which that action is oriented
- CD primitive actions
  - ATRANS, PTRANS, PROPEL, MTRANS, MBUILD, ATTEND, SPEAK, GRASP, MOVE, INGEST, EXPEL

MARGIE (2)
(e.g.) John gave Mary a book.

  actor      John
  action     ATRANS   /* transfer possession */
  object     book
  direction  FROM John TO Mary

CD diagram: John <=(p)=> ATRANS <-(o)- book <-(R)- TO Mary, FROM John

Lessons of the 1970s
- Knowledge representation
  - Of central importance to all natural language processing
  - Issues
    - How should items in memory be indexed and accessed?
    - How should context be represented?
    - How should memory be updated?
    - How can programs deal with inconsistency?
- Common sense
  - Knowledge of the outside world
  - (e.g.) The city councilmen refused the women a permit because
    - they feared violence.       // they: city councilmen
    - they advocated revolution.  // they: women

FRAMES
- Minsky, 1975
- Structures consisting of a core and slots
- Each slot corresponds to
  - either a facet or participant of the concept embodied in the frame,
  - or a space for a pointer to a related concept
- Provide a neat explanation for default reasoning

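Default reasoning with frames can be sketched as slot lookup that falls back to a related frame. This is a minimal illustration under assumptions: the `Frame` class, the `parent` pointer, and the bird/penguin example are hypothetical and do not come from Minsky's paper.

```python
class Frame:
    """A frame with named slots and an optional pointer to a more general frame."""

    def __init__(self, name, parent=None, **slots):
        self.name = name
        self.parent = parent      # pointer to a related (more general) concept
        self.slots = dict(slots)  # locally filled slots

    def get(self, slot):
        """Return the local slot value, else inherit the default from ancestors."""
        frame = self
        while frame is not None:
            if slot in frame.slots:
                return frame.slots[slot]
            frame = frame.parent
        raise KeyError(slot)


bird = Frame("bird", can_fly=True, legs=2)
penguin = Frame("penguin", parent=bird, can_fly=False)  # overrides the default
tweety = Frame("tweety", parent=bird)                   # inherits every default

print(tweety.get("can_fly"), penguin.get("can_fly"), penguin.get("legs"))
```

A slot left empty in a specific frame takes the value of the general frame, and a locally filled slot overrides it, which is the sense in which frames explain default reasoning.
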
SCRIPTS
- Roger Schank and his collaborators at Yale (1977)
- (e.g.) Track: Coffee Shop

  Props:  Table, Menu, F (Food), Check, Money
  Roles:  S (Customer), W (Waiter), C (Cook), M (Cashier), O (Owner)

Unification-based Grammar Formalisms
- Grammatical theories
  - LFG (Lexical Functional Grammar): Bresnan (1982)
  - GPSG (Generalized Phrase Structure Grammar): Gazdar (1985)
  - HPSG (Head-driven Phrase Structure Grammar): Pollard (1985)
- Grammatical tools
  - DCG (Definite Clause Grammar): Pereira & Warren (1980)
  - FUG (Functional Unification Grammar): Kay (1983)
  - PATR-II: Shieber et al. (1983)

Unification-based Grammar Formalisms
- Augmented phrase structure grammar
  - Context-free-based grammar rules
  - Uses feature structures instead of simple grammar symbols
- Feature structure
  - Complex-feature-based informational elements
  - Associations between features and values
- Unification
  - Information-combining operation
  - The main operation in unification-based grammar formalisms

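Feature Structure
- Feature structures of the noun "철수" (Cheolsu) and the verb "먹다" (to eat), as an HPSG example
[Figure: two attribute-value matrices. Noun: PHON "철수", SYN|LOC|HEAD|MAJ N, plus a LEX feature. Verb: PHON "먹다", SYN|LOC|HEAD|MAJ V, plus a LEX feature and a SUBCAT list of two elements, each with SYN|LOC|HEAD|MAJ N, one with GR SUBJ and one with GR OBJ.]
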
Unification

  FS1 = [ cat: NP ]
  FS2 = [ agreement: [ number: singular, person: third ] ]

  FS1 ⊔ FS2 = FS3 = [ cat: NP,
                      agreement: [ number: singular, person: third ] ]

Unification

  FS3 = [ cat: NP,
          agreement: [ number: singular, person: third ] ]
  FS4 = [ cat: NP,
          agreement: [ number: plural ] ]

  FS3 ⊔ FS4: unification fails

The unification of FS3 and FS4 fails because their values for the agreement: number feature are not the same (conflict).

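The two unification examples can be reproduced with a short Python sketch. It assumes feature structures are plain nested dictionaries with atomic string values and ignores reentrancy (structure sharing), which real unification-based formalisms support; the names `unify` and `UnificationFailure` are hypothetical.

```python
class UnificationFailure(Exception):
    """Raised when two feature structures carry conflicting values."""


def unify(a, b):
    """Unify two feature structures (nested dicts with atomic leaf values).

    Features present in only one structure are copied over; features
    present in both must unify recursively.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for feature, b_value in b.items():
            if feature in result:
                result[feature] = unify(result[feature], b_value)
            else:
                result[feature] = b_value
        return result
    if a == b:
        return a
    raise UnificationFailure(f"conflict: {a!r} vs {b!r}")


FS1 = {"cat": "NP"}
FS2 = {"agreement": {"number": "singular", "person": "third"}}
FS3 = unify(FS1, FS2)
FS4 = {"cat": "NP", "agreement": {"number": "plural"}}

print(FS3)
try:
    unify(FS3, FS4)
except UnificationFailure as err:
    print("Unification failed:", err)
```
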
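Recent Trends in Natural Language Processing Research
- Simpler grammar rules, larger dictionaries
  - Various large-scale analysis dictionaries, thesauri, etc.
- Corpus-based language processing
  - Raw corpora, tagged corpora
  - Extraction of grammatical, lexical, and other linguistic information
  - Statistics-based language processing
  - Machine-learning-based language processing
- Development of practical natural language processing systems
  - Commercial machine translation systems
  - Information retrieval systems
  - Document classification and summarization systems, etc.
- Advances in deep learning
  - Deep learning has achieved the best performance in image recognition and speech recognition.
  - Recently, deep learning has also achieved the best performance in many natural language processing applications.
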
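History of Machine Translation (1)
- GAT
  - Started in 1952 and completed in 1965
  - Russian-English translation system
  - Translation target: papers in physics
  - Word-for-word translation with added idiom handling
  - Translation quality was very poor, but the system was used by the U.S. Atomic Energy Commission until 1979
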
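History of Machine Translation (2)
- CETA
  - Completed in 1967 and used until 1971
  - Started at the University of Grenoble, France
  - Translation based on linguistic theory
  - Interlingua approach (pivot approach)
    - Interlingua: a representation independent of any individual language
- GETA
  - Successor system to CETA
  - Learning from the failure of CETA, adopted the transfer approach
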
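History of Machine Translation (3)
- TAUM
  - Domain: weather forecasts
  - English-French translation system
  - Pure transfer approach
- METEO
  - A fully automatic translation system extending TAUM
  - Translation success rate of 90-95%
  - Most failures are due to causes such as spelling errors
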
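History of Machine Translation (4)
- SYSTRAN
  - The first commercialized machine translation system
  - 1970: used by FTD of the U.S. federal government (Russian-English)
  - 1974: used by NASA (Russian-English)
  - 1976: used by the EC (English-French)
  - 1978: French-English
  - 1979: English-Italian
  - 1985: French-German, English-German
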
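History of Machine Translation (5)
- METAL
  - A bidirectional German-English machine translation system developed in 1982
  - English analysis using GPSG
- EUROTRA
  - Attempted translation among the nine languages of the European Community
  - 1992: phase 1 of the research ended; system development failed
  - Because about 40% of the European Community budget goes to translation costs, research and development is expected to continue
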
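History of Machine Translation (6)
- Research in Japan
  - Started in 1964 by Professor Nagao at Kyoto University
  - As of 1990, more than 20 systems had been commercialized
  - One of the countries most actively pursuing machine translation research
- Research in Korea
  - Research began around 1980 at universities and research institutes
  - English-Korean, Japanese-Korean, and Korean-Japanese translation systems are now commercialized
  - Research and development is centered on universities and companies
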
History of Machine Translation (7)
- Statistical Machine Translation (SMT)
  - Example system: Google Translate
  - Word-based model
    - GIZA++ (IBM Models 1-6)
  - Phrase-based model
    - Moses
  - Pipeline: parallel corpus (sentence-aligned corpus) -> word alignment (GIZA++) -> phrase extraction -> reordering model -> language model (SRILM) -> decoding

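The word alignment step can be illustrated with a toy version of IBM Model 1 trained by EM. This is a sketch under assumptions, not GIZA++ itself: the function name `train_ibm_model1`, the three-sentence German-English corpus, and the omission of the NULL word are simplifications made for this example.

```python
from collections import defaultdict


def train_ibm_model1(pairs, iterations=10):
    """Estimate lexical translation probabilities t[(target, source)] with EM."""
    target_vocab = {tw for _, tgt in pairs for tw in tgt}
    # Uniform initialisation over word pairs that co-occur in some sentence pair.
    t = {}
    for src, tgt in pairs:
        for tw in tgt:
            for sw in src:
                t[(tw, sw)] = 1.0 / len(target_vocab)

    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (target, source) links
        total = defaultdict(float)  # expected count of each source word
        # E-step: distribute each target word over the source words.
        for src, tgt in pairs:
            for tw in tgt:
                norm = sum(t[(tw, sw)] for sw in src)
                for sw in src:
                    frac = t[(tw, sw)] / norm
                    count[(tw, sw)] += frac
                    total[sw] += frac
        # M-step: renormalise the expected counts per source word.
        t = {(tw, sw): c / total[sw] for (tw, sw), c in count.items()}
    return t


# Hypothetical sentence-aligned toy corpus (source, target).
corpus = [
    ("das haus".split(), "the house".split()),
    ("das buch".split(), "the book".split()),
    ("ein buch".split(), "a book".split()),
]
t = train_ibm_model1(corpus)
for source_word in ["das", "haus", "buch", "ein"]:
    best = max((tw for tw, sw in t if sw == source_word),
               key=lambda tw: t[(tw, source_word)])
    print(source_word, "->", best)
```

Even though no sentence pair says which word translates which, co-occurrence across sentence pairs is enough for EM to concentrate probability on the consistent pairings.
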
SMT: example

History of Machine Translation (8)
- Neural Machine Translation (NMT)
  - An end-to-end machine translation system using deep learning
  - Word-based
  - Consists of a Recurrent Neural Network (RNN) encoder and an RNN decoder
  - Pipeline: parallel corpus (sentence-aligned corpus) -> NMT training -> RNN decoding
  - Recently, the attention mechanism has been introduced, yielding even higher performance
  - Outperforms phrase-based MT and hierarchical phrase-based MT

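The attention mechanism can be sketched in plain Python. The slide does not say which scoring function is used, so dot-product scoring is an assumption here, and the encoder and decoder vectors are hand-picked rather than produced by a trained RNN; the function names `softmax` and `attend` are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def attend(decoder_state, encoder_states):
    """Return (weights, context) for one decoding step.

    Each encoder state is scored by its dot product with the decoder
    state (an assumed scoring function); the context vector is the
    weighted average of the encoder states.
    """
    scores = [sum(d * h for d, h in zip(decoder_state, enc))
              for enc in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [sum(w * enc[i] for w, enc in zip(weights, encoder_states))
               for i in range(dim)]
    return weights, context


# One hand-made vector per source word, plus the current decoder state.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [2.0, 0.0]
weights, context = attend(decoder_state, encoder_states)
print(weights, context)
```

Instead of compressing the whole source sentence into one fixed vector, the decoder recomputes these weights at every step and so can focus on different source words for different target words.
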
NMT example