Chapter 4: Natural Language Processing, Artificial Intelligence, Machine Learning


Contents
- Artificial intelligence
- Machine learning

Artificial Intelligence
Definition (Wikipedia)
- Philosophically, artificial intelligence is intelligence created by humans, by beings endowed with intellect, or by systems; that is, man-made intelligence
- It is generally assumed to run on a general-purpose computer
- The term also refers to the scientific field that studies methodologies for creating such intelligence and their feasibility
A wide range of research topics
- Knowledge representation, search, inference, problem solving, learning, cognition, action, natural language processing

Knowledge Representation and Inference
Knowledge representation
- Propositional logic: Prolog, Lisp
- Semantic network: represents the relations between concepts as a network
Inference
- Expert systems
- Theorem provers

Search and Problem Solving
Game theory
- Search: branch and bound, min-max
- Chess, Go, janggi (Korean chess)
- In chess, a computer has beaten the world champion
Optimization and search methods
- Greedy search
- Beam search
- Gradient methods
- Simulated annealing
- Genetic algorithms

Machine Learning
Definition (Wikipedia)
- Machine learning is a branch of artificial intelligence concerned with developing algorithms and techniques that allow computers to learn
- For example, a system can be trained by machine learning to decide whether an incoming e-mail is spam or not
Related fields
- Artificial intelligence, Bayesian methods, computational complexity theory, control theory, information theory, statistics, philosophy, psychology and neurobiology

Natural Language Processing and Artificial Intelligence
NLP as a research area of AI
- Speech recognition, morphological analysis, syntactic analysis, semantic analysis
- Language understanding is itself an AI problem
AI techniques for NLP
- Morphological, syntactic, semantic, and pragmatic linguistic knowledge
- Knowledge representation (WordNet)
- Solving NLP problems with machine learning

WordNet
- A relational network of English words for natural language processing
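As a quick illustration (not from the slides), here is a minimal sketch of browsing WordNet relations through NLTK, assuming the `nltk` package and its WordNet data are installed:

```python
# A sketch of querying WordNet word relations via NLTK.
# Assumes: pip install nltk, then nltk.download('wordnet') once.
from nltk.corpus import wordnet as wn

for synset in wn.synsets('bank')[:3]:          # first few senses of "bank"
    print(synset.name(), '-', synset.definition())
    print('  hypernyms:', [h.name() for h in synset.hypernyms()])
```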

Natural Language Processing and Machine Learning
Machine learning for solving NLP problems
- Automatically learns the knowledge used in NLP
- Statistical and empirical AI techniques

Corpus Data
- Consists of a wide range of sentences drawn from newspapers, magazines, textbooks, etc.
- Carries various linguistic annotations: parts of speech, sentence constituents, parse results
- Korean: KIBS, Sejong Corpus
- English: Brown Corpus, Penn Treebank, ...

The Brown Corpus
(figure: a sample from the tagged corpus)
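Since the deck only shows a snapshot here, a minimal sketch (mine) of inspecting the Brown corpus through NLTK, again assuming `nltk` and its `brown` corpus download:

```python
# A sketch of reading the POS-tagged Brown corpus with NLTK.
# Assumes: pip install nltk, then nltk.download('brown') once.
from nltk.corpus import brown

print(brown.categories()[:4])                      # some of the 15 text categories
for word, tag in brown.tagged_words(categories='news')[:6]:
    print(f'{word}/{tag}')                         # e.g. The/AT Fulton/NP-TL ...
```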

Machine-Learning-Based NLP
Ambiguity resolution as a classification problem
- Structural tagging, part-of-speech tagging, disambiguation, prepositional phrase attachment, etc.
Language acquisition and understanding
- Rule inference, information extraction and retrieval, automatic summarization, machine translation

An Example Taxonomy of Machine Learning Techniques
- Symbolic learning: memory-based learning, decision trees, inductive logic, ...
- Non-symbolic learning: neural networks, genetic algorithms, ...
- Probabilistic learning: Bayesian networks, hidden Markov models, probabilistic grammars, ...
- Others: transformation-based learning, active learning, reinforcement learning, ...

The Classification Problem and Symbolic Learning
Classification
- Deciding the class of a given object from its various features
Symbolic learning
- Describes the relation between features and classes as a handful of rules (if-then rules, etc.)
- Learns the rules from the given data

Symbolic Learning Methods
- Decision trees
- Decision lists
- Transformation-based error-driven learning
- Linear separators
- Memory-based learning

Decision Trees
Decision tree
- A practical method for inductive learning
- Estimating a discrete-valued function = constructing a set of rules
- Easy to build, and a decision tree produced by learning can be understood as a set of rules

Decision Tree Training Data: Play Tennis?

Day  Outlook   Temperature  Humidity  Wind    PlayTennis
D1   Sunny     Hot          High      Weak    No
D2   Sunny     Hot          High      Strong  No
D3   Overcast  Hot          High      Weak    Yes
D4   Rain      Mild         High      Weak    Yes
D5   Rain      Cool         Normal    Weak    Yes
D6   Rain      Cool         Normal    Strong  No
D7   Overcast  Cool         Normal    Strong  Yes
D8   Sunny     Mild         High      Weak    No
D9   Sunny     Cool         Normal    Weak    Yes
D10  Rain      Mild         Normal    Weak    Yes
D11  Sunny     Mild         Normal    Strong  Yes
D12  Overcast  Mild         High      Strong  Yes
D13  Overcast  Hot          Normal    Weak    Yes
D14  Rain      Mild         High      Strong  No

Decision Tree Representation
- <outlook, humidity, wind, playtennis>
- How many different trees could be generated?

(figure: the learned tree)
outlook = sunny:    humidity = high -> No; humidity = normal -> Yes
outlook = overcast: Yes
outlook = rain:     wind = strong -> No; wind = weak -> Yes

Decision Tree Learning
- Top-down greedy search through the space of possible decision trees
- Place the attribute that best classifies the training examples at the root (or at an upper node)
- Uses measures such as entropy and information gain
- ID3 and C4.5 algorithms
- Data fragmentation: when few examples remain at a node, generalization degrades -> pruning
- Decision list: an ordered list of rules in conjunctive form

Entropy
- $\mathrm{Entropy}(S) = -\sum_{i=1}^{c} p_i \log_2 p_i$
- The minimum number of bits of information needed to encode the classification of an arbitrary member of S
- entropy = 0 if all members belong to the same class
- entropy = 1 if there are equally many positive and negative examples

For a boolean classification:
$\mathrm{Entropy}(S) = -p_{\oplus} \log_2 p_{\oplus} - p_{\ominus} \log_2 p_{\ominus}$
$\mathrm{Entropy}([9+,5-]) = -(9/14)\log_2(9/14) - (5/14)\log_2(5/14) = 0.940$

(figure: the entropy curve, Entropy(S) from 0.0 to 1.0 on the vertical axis against the proportion of positive examples $p_{\oplus}$ from 0 to 1.0 on the horizontal axis)

Information Gain
- $\mathrm{Gain}(S, A) = \mathrm{Entropy}(S) - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|}\,\mathrm{Entropy}(S_v)$
- The expected reduction in entropy caused by partitioning the examples according to attribute A
- That is, how much the entropy shrinks once the value of attribute A is known

Information Gain (cont'd)
- $\mathrm{Values}(\mathrm{Wind}) = \{\mathrm{Weak}, \mathrm{Strong}\}$
- $S = [9+,5-]$, $S_{\mathrm{weak}} = [6+,2-]$, $S_{\mathrm{strong}} = [3+,3-]$

$\mathrm{Gain}(S, \mathrm{Wind}) = \mathrm{Entropy}(S) - \sum_{v \in \{\mathrm{Weak},\mathrm{Strong}\}} \frac{|S_v|}{|S|}\,\mathrm{Entropy}(S_v)$
$= \mathrm{Entropy}(S) - (8/14)\,\mathrm{Entropy}(S_{\mathrm{weak}}) - (6/14)\,\mathrm{Entropy}(S_{\mathrm{strong}})$
$= 0.940 - (8/14)(0.811) - (6/14)(1.00) = 0.048$

Which Attribute Is the Best Classifier?
Splitting S: [9+,5-] (E = 0.940) on Humidity:
- High: [3+,4-], E = 0.985
- Normal: [6+,1-], E = 0.592

$\mathrm{Gain}(S, \mathrm{Humidity}) = 0.940 - (7/14)(0.985) - (7/14)(0.592) = 0.151$

Which Attribute Is the Best Classifier? (cont'd)
Splitting S: [9+,5-] (E = 0.940) on Wind:
- Weak: [6+,2-], E = 0.811
- Strong: [3+,3-], E = 1.000

$\mathrm{Gain}(S, \mathrm{Wind}) = 0.940 - (8/14)(0.811) - (6/14)(1.0) = 0.048$

Classifying the examples by Humidity provides more information gain than by Wind.
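To make these numbers concrete, here is a minimal sketch (not from the slides; the function and variable names are mine) that recomputes the entropies and gains from the play-tennis table above:

```python
# A sketch recomputing entropy and information gain for the play-tennis data.
from collections import Counter
from math import log2

# Rows D1..D14; columns: outlook, temperature, humidity, wind, playtennis (label).
data = [
    ('Sunny','Hot','High','Weak','No'),          ('Sunny','Hot','High','Strong','No'),
    ('Overcast','Hot','High','Weak','Yes'),      ('Rain','Mild','High','Weak','Yes'),
    ('Rain','Cool','Normal','Weak','Yes'),       ('Rain','Cool','Normal','Strong','No'),
    ('Overcast','Cool','Normal','Strong','Yes'), ('Sunny','Mild','High','Weak','No'),
    ('Sunny','Cool','Normal','Weak','Yes'),      ('Rain','Mild','Normal','Weak','Yes'),
    ('Sunny','Mild','Normal','Strong','Yes'),    ('Overcast','Mild','High','Strong','Yes'),
    ('Overcast','Hot','Normal','Weak','Yes'),    ('Rain','Mild','High','Strong','No'),
]

def entropy(rows):
    counts = Counter(r[-1] for r in rows)        # class frequencies
    total = len(rows)
    return -sum(c / total * log2(c / total) for c in counts.values())

def gain(rows, col):
    total = len(rows)
    remainder = 0.0
    for v in {r[col] for r in rows}:             # partition on each attribute value
        subset = [r for r in rows if r[col] == v]
        remainder += len(subset) / total * entropy(subset)
    return entropy(rows) - remainder

print(f"Entropy(S)       = {entropy(data):.3f}")   # 0.940
print(f"Gain(S,Humidity) = {gain(data, 2):.3f}")   # 0.151
print(f"Gain(S,Wind)     = {gain(data, 3):.3f}")   # 0.048
```

Running it reproduces the values above; calling `gain(data, col)` for the remaining columns ranks every candidate attribute the same way ID3 does when choosing the root.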

Hypothesis Space Search
- Finds one hypothesis that fits the training examples
- ID3's hypothesis space: the set of possible decision trees
- Hill-climbing search, with information gain as the guide for hill-climbing
- Maintains only a single current hypothesis
- No back-tracking

Pruning
Overfitting problem
- The tree is fit so closely to the training data that it loses generality
Occam's razor
- Prefer the simplest hypothesis that fits the data
- Shorter trees are preferred over larger trees
Pruning by cross-validation
- Measure performance on a validation set, and stop learning (or prune the tree back) once performance on the validation set begins to drop

A Decision Tree Example
(figure: an example of a learned decision tree)

Transformation-Based Error-Driven Learning
- A rule-learning method for corpus-based NLP (Eric Brill, 1990)
- Uses templates (candidate rules)
- Initial tagging by frequency (each word gets its most frequent tag)
- Learns rules in order of how many errors they correct
- Applied to POS tagging, prepositional phrase attachment, parsing, spelling correction, disambiguation, etc.

Example of Learned Rules
The first rules learnt by Brill's POS tagger:

#  From  To  If                                  Example
1  NN    VB  previous tag is TO                  to/TO conflict/NN -> VB
2  VBP   VB  one of the previous 3 tags is MD    might/MD vanish/VBP -> VB
3  NN    VB  one of the previous two tags is MD  might/MD not reply/NN -> VB
4  VB    NN  one of the previous two tags is DT  the/DT amazing play/VB -> NN
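As an illustration of how such a transformation acts on an initial most-frequent-tag tagging, here is a minimal sketch (mine, not Brill's implementation; the rule encoding and helper are hypothetical):

```python
# A sketch of applying one Brill-style transformation rule.
# The rule below is rule 1 from the table: NN -> VB if the previous tag is TO.

def apply_rule(tagged, from_tag, to_tag, prev_tag):
    """Rewrite from_tag to to_tag wherever the preceding token carries prev_tag."""
    out = list(tagged)
    for i in range(1, len(out)):
        word, tag = out[i]
        if tag == from_tag and out[i - 1][1] == prev_tag:
            out[i] = (word, to_tag)
    return out

# Frequency-based initial tagging mistakes "conflict" for a noun:
sentence = [('to', 'TO'), ('conflict', 'NN'), ('with', 'IN')]
print(apply_rule(sentence, 'NN', 'VB', 'TO'))
# [('to', 'TO'), ('conflict', 'VB'), ('with', 'IN')]
```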

Memory-Based Learning
- Stores all of the training data
- Inductive supervised learning
- k-nearest neighbor
- Sensitive to noise; slow at run time
- TiMBL (Tilburg memory-based learning environment)

Linear Separators
- Learned by weight-update methods
- Well suited to noisy, high-dimensional problems
- Spelling correction, POS tagging, document classification
- SNOW (sparse network of Winnows)
- Widrow-Hoff rule, EG (exponentiated gradient)
- Perceptron
- SVM (Support Vector Machine) with a linear kernel

Linear Functions
(figure: linear decision boundaries $w \cdot x = \theta$ and $w \cdot x = 0$ separating positive (+) from negative (-) examples)

The Perceptron
(figure: a perceptron unit mapping the input features through an LTU or sigmoid)

Perceptron Learning Rule
- On-line, mistake-driven algorithm
- Rosenblatt (1959) suggested that when a target output value is provided for a single neuron with fixed input, it can incrementally change its weights and learn to produce the output using the perceptron learning rule
- Perceptron == Linear Threshold Unit

(figure: a single unit with inputs $x_1, \ldots, x_6$ and weights $w_1, \ldots, w_6$ feeding an output unit)
$\hat{y} = \mathrm{sgn}\left(\sum_i w_i x_i\right) = \mathrm{sgn}(w^T x)$

Perceptron Learning Rule (cont'd)
We learn $f: X \to \{-1,+1\}$ represented as $f = \mathrm{sgn}(w \cdot x)$, where $X = \{0,1\}^n$ or $X = R^n$ and $w \in R^n$.

Given labeled examples $\{(x^1, y^1), (x^2, y^2), \ldots, (x^m, y^m)\}$:
1. Initialize $w = 0 \in R^n$
2. Cycle through all examples:
   a. Predict the label of instance $x$ to be $\hat{y} = \mathrm{sgn}(w \cdot x)$
   b. If $\hat{y} \neq y$, update the weight vector: $w = w + r\,y\,x$ ($r$ a constant, the learning rate); otherwise, if $\hat{y} = y$, leave the weights unchanged
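Here is a minimal runnable sketch (my illustration, not from the slides) of exactly this rule; the AND-gate data, learning rate, and epoch count are arbitrary choices:

```python
# A sketch of the perceptron learning rule: w += r * y * x on each mistake.
import numpy as np

def perceptron(X, y, rate=1.0, epochs=20):
    w = np.zeros(X.shape[1])            # step 1: initialize w = 0
    for _ in range(epochs):             # step 2: cycle through all examples
        for xi, yi in zip(X, y):
            if np.sign(w @ xi) != yi:   # mistake-driven: update only on errors
                w += rate * yi * xi
    return w

# Linearly separable toy data: an AND gate, with a constant 1 appended to each
# instance so the threshold folds into the weights (see the footnote slide below).
X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=float)
y = np.array([-1, -1, -1, +1])

w = perceptron(X, y)
print(w, np.sign(X @ w))   # the predictions should match y
```

Because this data is linearly separable, the perceptron convergence theorem guarantees the loop stops making updates after finitely many mistakes.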

Footnote About the Threshold
- On the previous slide, the perceptron has no threshold
- But we don't lose generality: augment each instance with a constant feature and fold the threshold into the weights,
  $x \mapsto (x, 1)$, so that $w \cdot x \geq \theta \iff (w, -\theta) \cdot (x, 1) \geq 0$

Geometric View
(figure-only slides: geometric illustrations of perceptron weight updates)

Perceptron Learnability
- Only linearly separable functions can be learned
- Minsky and Papert (1969) wrote an influential book demonstrating the perceptron's representational limitations
- Parity functions can't be learned (XOR)

Voted Perceptron
Which intermediate weight vector $v_i$ should we use?
- Maybe the last one? Here it has never gotten any test case right! (Experimentally, the classifiers move around a lot.)
- Maybe the best one? But we improved it with later mistakes.

Voted Perceptron (cont'd)
- Idea two: keep the intermediate hypotheses around, and have them vote [Freund and Schapire, 1998]
- At the end, we have a collection of linear separators $w_0, w_1, w_2, \ldots$, along with survival times: $c_n$ = the amount of time that $w_n$ survived
- This $c_n$ is a good measure of the reliability of $w_n$
- To classify a test point $x$, use a weighted majority vote:
  $\hat{y} = \mathrm{sgn}\left(\sum_n c_n\,\mathrm{sgn}(w_n \cdot x)\right)$

Voted Perceptron (cont'd)
- Problem: we need to keep around a lot of $w_n$ vectors
- Solutions: (i) find representatives; (ii) use an alternative prediction rule based on the averaged weight vector:
  $\hat{y} = \mathrm{sgn}(w_{\mathrm{avg}} \cdot x)$, where $w_{\mathrm{avg}} = \sum_n c_n w_n$
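A small sketch (mine) contrasting the two prediction rules, given separators $w_n$ and survival times $c_n$ collected during training; the numbers below are made up:

```python
# A sketch of the two voted-perceptron prediction rules.
import numpy as np

def predict_voted(ws, cs, x):
    """Weighted majority vote over all intermediate separators."""
    return np.sign(sum(c * np.sign(w @ x) for w, c in zip(ws, cs)))

def predict_averaged(ws, cs, x):
    """Single dot product with the survival-time-weighted average weight vector."""
    w_avg = sum(c * w for w, c in zip(ws, cs))
    return np.sign(w_avg @ x)

# Hypothetical separators w_0, w_1, w_2 and their survival times c_n:
ws = [np.array([1.0, -1.0]), np.array([2.0, 0.5]), np.array([1.5, 1.0])]
cs = [1, 3, 5]
x = np.array([0.5, -2.0])
print(predict_voted(ws, cs, x), predict_averaged(ws, cs, x))
```

The averaged rule needs only one stored vector at test time, which is why it is the usual practical compromise.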

From Freund & Schapire, 1998: Classifying Digits with the Voted Perceptron
(figure: experimental results)

Non-Symbolic Learning
Neural networks
- Learning models that try to mimic the information processing of the human brain
- Based on parallel processing
- Applied to regression and classification problems
Genetic algorithms
- A learning method that mimics biological evolution
- The goal is to escape local optima

Neural Network Representation
- Learns a mapping between inputs and outputs: $y = f(x_1, x_2, \ldots, x_n)$

(figure: a feed-forward network with inputs $x_1, x_2, x_3, \ldots, x_n$, hidden units $h_1, h_2, \ldots, h_k$, and output $y$)

Connection Weights
(figure: a single unit with inputs $x_1, \ldots, x_n$ and weights $w_0, w_1, \ldots, w_n$)

$o = w_0 + \sum_{i=1}^{n} w_i x_i, \qquad \mathrm{output} = \frac{1}{1 + \exp(-o)}$
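A one-function sketch (mine) of the weighted sum and sigmoid above:

```python
# A sketch of a single sigmoid unit: o = w0 + sum_i(w_i * x_i), output = 1/(1+exp(-o)).
import math

def sigmoid_unit(x, w):
    """w[0] is the bias w0; w[1:] pairs with the inputs x."""
    o = w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
    return 1.0 / (1.0 + math.exp(-o))

print(sigmoid_unit([1.0, 0.5], [0.1, 0.8, -0.4]))  # o = 0.7, output ~ 0.67
```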

Neural Network Learning
Weight adjustment
- Hebbian learning rule, error backpropagation, Boltzmann methods
Multi-layer perceptron
- A universal approximator
Recurrent networks
- Dynamic (time-dependent) data
Self-organizing maps
- Clustering

Applications of Neural Networks
- Handwritten character recognition, speech recognition, face recognition
Natural language processing
- Character recognition, speech recognition and synthesis
- POS tagging
- Phrase boundary detection, parsing, grammar inference, prepositional phrase attachment, disambiguation, document classification, spelling correction

Genetic Algorithms
- Model the process of biological evolution
- Used for function optimization
- Population; fitness function
- Selection, reproduction, crossover, mutation
- Population-based search
- Stochastic operators
- Seek a global solution

The Evolution Process
(figure: one generation of bit strings)
reproduction: 00010101011101 -> 00010101011101
crossover:    00010100001101 -> 00010100011101
mutation:     0001011111101  -> 1111011111101
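A compact sketch (my illustration) of one generation over bit strings, using fitness-proportional selection, one-point crossover, and bit-flip mutation; the "one-max" fitness (count the 1 bits) is just a stand-in objective:

```python
# A sketch of a genetic algorithm over bit strings.
import random

def fitness(bits):                 # toy "one-max" fitness: count the 1 bits
    return bits.count('1')

def select(pop):                   # fitness-proportional (roulette-wheel) selection
    return random.choices(pop, weights=[fitness(b) + 1 for b in pop], k=1)[0]

def crossover(a, b):               # one-point crossover of two parents
    point = random.randrange(1, len(a))
    return a[:point] + b[point:]

def mutate(bits, rate=0.05):       # independent bit flips
    return ''.join(b if random.random() > rate else '10'[int(b)] for b in bits)

random.seed(0)
pop = [''.join(random.choice('01') for _ in range(14)) for _ in range(20)]
for generation in range(30):       # evolve the population for a few generations
    pop = [mutate(crossover(select(pop), select(pop))) for _ in pop]
best = max(pop, key=fitness)
print(best, fitness(best))
```

Because selection, crossover, and mutation are all stochastic, repeated runs explore different regions of the search space, which is what lets the method escape local optima.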

Applications of Genetic Algorithms
- Optimization problems
- Decision tree learning, neural network training
Natural language processing
- POS tagging, parsing
- Information retrieval, verb classification

Probabilistic Learning
Probabilistic model
- Describes the process that generates the observed data
- Takes the form of a probabilistic network
- Expresses probabilistic dependencies among random variables
- Represents a joint probability distribution

Naïve Bayes Classifier
- Assumes that, given the class of an object, its features are independent of one another

(figure: a network with class node C pointing to feature nodes $a_1, a_2, a_3, \ldots, a_n$)

Probabilistic Inference in the Naïve Bayes Classifier
The class $c^*$ of a data instance $(a_1, \ldots, a_n)$:

$c^* = \operatorname{argmax}_{c_i} P(c_i \mid a_1, a_2, \ldots, a_n)$
$\;\; = \operatorname{argmax}_{c_i} \dfrac{P(a_1, a_2, \ldots, a_n \mid c_i)\, P(c_i)}{P(a_1, a_2, \ldots, a_n)}$
$\;\; = \operatorname{argmax}_{c_i} P(a_1, a_2, \ldots, a_n \mid c_i)\, P(c_i)$
$\;\; = \operatorname{argmax}_{c_i} P(c_i) \prod_{k=1}^{n} P(a_k \mid c_i)$
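A minimal sketch (mine) of this argmax, echoing the spam example from the machine learning definition earlier; the prior and likelihood numbers are made up for illustration:

```python
# A sketch of naive Bayes classification: argmax_c P(c) * prod_k P(a_k | c),
# computed in log space to avoid floating-point underflow.
import math

# Made-up estimates for a two-class word-feature example.
prior = {'spam': 0.4, 'ham': 0.6}
likelihood = {            # P(word | class)
    'spam': {'free': 0.20, 'money': 0.15, 'meeting': 0.01},
    'ham':  {'free': 0.02, 'money': 0.03, 'meeting': 0.10},
}

def classify(features):
    def log_posterior(c):
        return math.log(prior[c]) + sum(math.log(likelihood[c][a]) for a in features)
    return max(prior, key=log_posterior)

print(classify(['free', 'money']))   # 'spam'
print(classify(['meeting']))         # 'ham'
```

Summing logs rather than multiplying probabilities keeps the product of many small factors from underflowing to zero when there are many features.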

Applications of the Naïve Bayes Classifier
- Context-sensitive spelling correction, POS tagging, word sense disambiguation
Document classification
- Document representation: a term vector $(t_1, t_2, \ldots, t_n)$
- Sorting documents by class

Other Machine Learning Methods
Maximum entropy
- Combines and exploits diverse statistical evidence according to the maximum entropy principle
SVM
- Based on computational learning theory
- Document classification
Hidden Markov models
- Speech recognition and synthesis, POS tagging
- Viterbi algorithm (dynamic programming)
Bayesian networks
- Probabilistic graphical models
- Inference about causal relations
Clustering
- Unsupervised learning
Ensemble machines
- Bagging, boosting
- POS tagging, spelling correction

Conclusion
- Artificial intelligence: the development of intelligent machines, which requires natural language processing
- Natural language processing: understanding natural language, which makes use of machine learning
- Machine learning