Deep Learning

Contents: Current State of Deep Learning Technology / Deep Learning / Deep Learning-based Natural Language Processing

Object Recognition https://www.youtube.com/watch?v=n5up_lp9smm

Semantic Segmentation https://youtu.be/zjmtdrbqh40

Semantic Segmentation VGGNet + Deconvolution network

Image Completion https://vimeo.com/38359771

Neural Art Artistic style transfer using CNN

Handwriting by Machine: LSTM RNN handwriting generation demo (give an input text and choose a style): http://www.cs.toronto.edu/~graves/handwriting.html

Music Composition https://highnoongmt.wordpress.com/2015/05/22/lisls-stis-recurr ent-neural-networks-for-folk-music-generation/

Image Caption Generation: multimodal architecture (CNN image embedding fed into a GRU; a softmax predicts the next word W_t+1 from W_t). Example generated Korean captions (translated): "A young girl is standing in a grass-covered field", "A man standing in front of a building", "A small girl wearing a life jacket is smiling", "A woman and a woman with a pink dog".

Visual Question Answering Facebook: Visual Q&A

Word Analogy: King − Man + Woman ≈ Queen; Queen − King + Kings ≈ Queens; Japan − Korean + Hangul = ? http://deeplearner.fz-qqq.net/
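Analogies of this kind are plain vector arithmetic over the embedding table followed by a nearest-neighbor search. A minimal sketch with made-up 4-dimensional vectors (real embeddings are learned and much wider):

```python
import numpy as np

# Toy embedding table (hypothetical values, not real word2vec weights).
emb = {
    "king":  np.array([0.8, 0.1, 0.7, 0.3]),
    "man":   np.array([0.9, 0.0, 0.1, 0.2]),
    "woman": np.array([0.1, 0.9, 0.1, 0.2]),
    "queen": np.array([0.0, 1.0, 0.7, 0.3]),
}

def analogy(a, b, c, emb):
    """Return the word closest (by cosine) to vec(a) - vec(b) + vec(c)."""
    target = emb[a] - emb[b] + emb[c]
    def cos(u, v):
        return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
    # Exclude the query words themselves, as is standard for analogy tests.
    candidates = {w: v for w, v in emb.items() if w not in (a, b, c)}
    return max(candidates, key=lambda w: cos(candidates[w], target))

print(analogy("king", "man", "woman", emb))  # -> "queen" with these toy vectors
```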

Neural Machine Translation. Example attention (alignment) matrix between the Korean source "777 항공편은 3시간 동안 지상에 있겠습니다. </s>" and the English target "flight 777 is on the ground for three hours . </s>"; each target word's row is a probability distribution over the source tokens (e.g. "flight" puts weights 0.5 and 0.4 on the first two source tokens).
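Each row of such an alignment matrix is a softmax distribution over source positions. A minimal sketch of how one row is computed, using dot-product scoring for brevity (Bahdanau et al.'s model scores with a small feed-forward network instead):

```python
import numpy as np

def attention(decoder_state, encoder_states):
    """Compute one row of the alignment matrix and the context vector."""
    scores = encoder_states @ decoder_state      # one score per source position
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                     # softmax: weights sum to 1
    context = weights @ encoder_states           # weighted summary of the source
    return weights, context

src = np.random.randn(13, 8)   # 13 source tokens, hidden size 8 (illustrative)
dec = np.random.randn(8)       # current decoder state
w, c = attention(dec, src)
print(w.round(2), w.sum())     # one alignment-matrix row; sums to 1.0
```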

Neural Conversation Model (Korean customer-service dialog, translated):
Q) My washing machine won't power on.
A) That must have been very inconvenient. / A) Is it a drum washer or a regular top-loader?
Q) When can someone visit?
A) When would a visit be convenient for you? / A) When would you like us to visit?
Q) How much is the service call fee?
A) The on-site inspection fee is 15,000 won. / A) The call-out fee is fifteen thousand won.

Abstractive Text Summarization: RNN_search + input_feeding + CopyNet. Example output (translated from Korean): "A stray cat was captured on camera keeping watch beside its friend killed by a passing car."

Learning to Execute LSTM RNN

Learning Approximate Solutions. Travelling Salesman Problem: NP-hard; a Pointer Network can learn approximate solutions in O(n^2).

One Shot Learning: learning from a few examples. Matching Nets use attention and memory; a(x_1, x_2) is an attention kernel.

Contents: Current State of Deep Learning Technology / Deep Learning / Deep Learning-based Natural Language Processing

Neural Networks

Deep Neural Networks: Deep Neural Network = Neural Network + multiple levels of nonlinear operations.

Why Deep Neural Networks? Similar to the human cognitive process; abstraction: from low-level representations to high-level representations.

Why Deep Neural Networks?: Integrated Learning. Conventional machine-learning methodology: handcrafting features is time-consuming. Deep Neural Network: feature extractor + classifier learned as one model. <See the Winter School '14 Deep Learning materials>

Why Deep Neural Networks?: Unsupervised Feature Learning. Machine learning needs a lot of training data, but labeled data is small and costly and slow to build, while large raw corpora (unlabeled data) are abundant → semi-supervised / unsupervised learning. Deep Neural Networks can learn features from large raw corpora through pre-training: Restricted Boltzmann Machines (RBM), Stacked Autoencoders, Stacked Denoising Autoencoders, Word Embedding (for NLP).

DNN Difficulties → Now. Training used to fail (the back-propagation algorithm alone was not enough) → unsupervised pre-training. Heavy computation required → hardware/GPU advances. Many parameters, over-fitting → pre-training, drop-out, ...

Deep Belief Network [Hinton06]. Key idea: pre-train layers with an unsupervised learning algorithm in phases, then fine-tune the whole network by supervised learning. DBNs are stacks of Restricted Boltzmann Machines (RBMs).

Restricted Boltzmann Machine. A Restricted Boltzmann Machine (RBM) is a generative stochastic neural network that can learn a probability distribution over its set of inputs. Major applications: dimensionality reduction, topic modeling, ...
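A toy numpy sketch of RBM training with one step of contrastive divergence (CD-1); the sizes and data are made up, and practical RBM code adds momentum, weight decay, and mini-batching:

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

# Hypothetical toy RBM: 6 binary visible units, 3 hidden units.
n_vis, n_hid, lr = 6, 3, 0.1
W = 0.01 * rng.standard_normal((n_vis, n_hid))
b, c = np.zeros(n_vis), np.zeros(n_hid)        # visible / hidden biases

data = rng.integers(0, 2, size=(20, n_vis)).astype(float)

for epoch in range(100):
    for v0 in data:
        # Positive phase: hidden probabilities and a sampled hidden state.
        ph0 = sigmoid(v0 @ W + c)
        h0 = (rng.random(n_hid) < ph0).astype(float)
        # Negative phase: one Gibbs step back to a reconstruction.
        pv1 = sigmoid(h0 @ W.T + b)
        ph1 = sigmoid(pv1 @ W + c)
        # CD-1 gradient estimate: <v h>_data - <v h>_model.
        W += lr * (np.outer(v0, ph0) - np.outer(pv1, ph1))
        b += lr * (v0 - pv1)
        c += lr * (ph0 - ph1)

recon = sigmoid(sigmoid(data @ W + c) @ W.T + b)
print("mean reconstruction error:", np.mean((data - recon) ** 2))
```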

Training DBN: Pre-Training. 1. Layer-wise greedy unsupervised pre-training: train the layers in phases, from the bottom layer up.

Training DBN: Fine-Tuning. 2. Supervised fine-tuning for the classification task.

The Back-Propagation Algorithm

Autoencoder. An autoencoder is an NN whose desired output is the same as its input, used to learn a compressed representation (encoding) for a set of data. Find weight vectors A and B that minimize Σ_i (y_i − x_i)². <See the Winter School '14 Deep Learning materials>
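A minimal sketch of that objective in PyTorch, with made-up layer sizes; the encoder weights A and decoder weights B are the two Linear layers:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.rand(256, 30)                       # 256 samples, 30-d inputs (made up)

model = nn.Sequential(
    nn.Linear(30, 10), nn.Sigmoid(),          # encoder A: input -> 10-d code
    nn.Linear(10, 30), nn.Sigmoid(),          # decoder B: code -> reconstruction
)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)

for step in range(500):
    y = model(x)
    loss = ((y - x) ** 2).sum(dim=1).mean()   # sum_i (y_i - x_i)^2, averaged
    opt.zero_grad()
    loss.backward()
    opt.step()

code = model[:2](x)                           # the compressed representation
print(code.shape, float(loss))                # torch.Size([256, 10]), small loss
```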

Stacked Autoencoders. After training, the hidden nodes extract features from the input nodes; stacking autoencoders constructs a deep network. <See the Winter School '14 Deep Learning materials>

Dropout (Hinton12). In training, randomly drop out hidden units with probability p. <See the Winter School '14 Deep Learning materials>
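A sketch of the training-time masking, using the "inverted dropout" variant that rescales by 1/(1−p) during training so test-time activations need no adjustment (the original paper instead scales weights at test time):

```python
import numpy as np

def dropout_forward(h, p, train=True):
    """Zero each hidden unit with probability p during training."""
    if not train or p == 0.0:
        return h
    # Keep-mask divided by (1-p) preserves the expected activation value.
    mask = (np.random.rand(*h.shape) >= p) / (1.0 - p)
    return h * mask

h = np.ones((2, 8))
print(dropout_forward(h, p=0.5))   # roughly half the units zeroed, rest doubled
```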

Non-linearity (Activation Function)
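The slide carries only the title; for reference, a sketch of the three activation functions this deck compares elsewhere (sigmoid, tanh, ReLU):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))   # squashes to (0, 1); saturates at the tails

def tanh(x):
    return np.tanh(x)                 # squashes to (-1, 1); zero-centered

def relu(x):
    return np.maximum(0.0, x)         # identity for x > 0; cheap, non-saturating

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
for f in (sigmoid, tanh, relu):
    print(f.__name__, f(x).round(3))
```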

Convolutional Neural Network (LeCun98). Convolution layer: sparse connectivity, shared weights, multiple feature maps. Sub-sampling layer: average/max pooling (N×N → 1), multiple feature maps. Ex. LeNet.

CNN Architectures

CNN for Audio

Recurrent Neural Network. The recurrent property makes the network a dynamical system over time.
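A sketch of that recurrence: one weight set applied at every time step, with the hidden state carrying information forward. All sizes below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_hid, T = 4, 8, 5
W = rng.standard_normal((d_hid, d_in)) * 0.1    # input weights
U = rng.standard_normal((d_hid, d_hid)) * 0.1   # recurrent weights
b = np.zeros(d_hid)

xs = rng.standard_normal((T, d_in))      # an input sequence of length T
h = np.zeros(d_hid)                      # initial state
for t in range(T):
    h = np.tanh(W @ xs[t] + U @ h + b)   # h(t) = f(W x(t) + U h(t-1) + b)
print(h.round(3))                        # final state summarizes the sequence
```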

Bidirectional RNN: exploits future context as well as past context.

Long Short-Term Memory RNN: LSTM can preserve gradient information over long time spans.
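A sketch of a single LSTM step; the additive cell-state update c = f·c_prev + i·g is the mechanism behind the gradient-preservation claim. Weight shapes are illustrative:

```python
import numpy as np

sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, P):
    """One LSTM step; P holds the four gate weight matrices and biases."""
    z = np.concatenate([x, h_prev])
    i = sigmoid(P["Wi"] @ z + P["bi"])     # input gate
    f = sigmoid(P["Wf"] @ z + P["bf"])     # forget gate
    o = sigmoid(P["Wo"] @ z + P["bo"])     # output gate
    g = np.tanh(P["Wg"] @ z + P["bg"])     # candidate cell content
    c = f * c_prev + i * g                 # additive update preserves gradients
    h = o * np.tanh(c)
    return h, c

rng = np.random.default_rng(0)
d_in, d_hid = 4, 6
P = {f"W{k}": rng.standard_normal((d_hid, d_in + d_hid)) * 0.1 for k in "ifog"}
P.update({f"b{k}": np.zeros(d_hid) for k in "ifog"})

h = c = np.zeros(d_hid)
for x in rng.standard_normal((5, d_in)):   # run over a length-5 sequence
    h, c = lstm_step(x, h, c, P)
print(h.round(3))
```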

Contents: Current State of Deep Learning Technology / Deep Learning / Deep Learning-based Natural Language Processing

Text Representation. One-hot (symbolic) representation, e.g. [0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0]; dimensionality: 50K (PTB), 500K (big vocab), 3M (Google 1T). Problem: Motel [0 0 0 0 0 0 0 0 1 0 0] AND Hotel [0 0 0 0 0 0 1 0 0 0 0] = 0. Continuous representations: Latent Semantic Analysis, random projection; Latent Dirichlet Allocation, HMM clustering; neural word embedding: a dense vector, and adding supervision from other tasks improves the representation.
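The Motel AND Hotel = 0 problem and the dense alternative, in a few lines of numpy (the dense vectors are made up):

```python
import numpy as np

# One-hot vectors for two related words share no dimensions, so their dot
# product (the AND on the slide) is always 0: similarity is invisible.
motel = np.zeros(11); motel[8] = 1
hotel = np.zeros(11); hotel[6] = 1
print(motel @ hotel)                      # 0.0

# Dense embeddings (hypothetical 4-d values) can encode that the words are close.
motel_e = np.array([0.7, 0.2, -0.1, 0.4])
hotel_e = np.array([0.6, 0.3, -0.2, 0.4])
cos = motel_e @ hotel_e / (np.linalg.norm(motel_e) * np.linalg.norm(hotel_e))
print(round(float(cos), 3))               # high cosine similarity
```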

Neural Network Language Model (Bengio00,03). Idea: a word and its context form a positive training sample; a random word in that same context forms a negative training sample; train so that Score(positive) > Score(negative). Training complexity is high because of the hidden-layer output and the softmax in the output layer; remedies: hierarchical softmax, negative sampling, ranking (hinge loss). The lookup table LT (|V| × d, applied to a |V| × 1 one-hot input) has shared weights = the word embedding, e.g. word 1 (boy) → [0.01, 0.2, −0.04, 0.05, −0.3], word 2 (girl) → [0.02, 0.22, −0.05, 0.04, −0.4].
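A sketch of the ranking (hinge-loss) criterion mentioned as a cheap alternative to the full softmax; the scoring function here is a bare dot product purely for illustration:

```python
import numpy as np

def hinge_rank_loss(score_pos, score_neg, margin=1.0):
    """Push score(true word in context) above score(random word in the same
    context) by a margin; zero loss means the pair is already ranked right."""
    return max(0.0, margin - score_pos + score_neg)

# Hypothetical scorer: dot product of a context vector and a word embedding.
rng = np.random.default_rng(0)
context = rng.standard_normal(8)
w_true, w_rand = rng.standard_normal(8), rng.standard_normal(8)
print(hinge_rank_loss(context @ w_true, context @ w_rand))
```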

Korean Word Embedding: NNLM

Transition-based Korean Dependency Parsing: Forward. Transition-based (arc-eager), O(N). Example: CJ그룹이(1) 대한통운(2) 인수계약을(3) 체결했다(4)
[root], [CJ그룹이(1) 대한통운(2) ...], {}
1: Shift → [root CJ그룹이(1)], [대한통운(2) 인수계약을(3) ...], {}
2: Shift → [root CJ그룹이(1) 대한통운(2)], [인수계약을(3) 체결했다(4)], {}
3: Left-arc(NP_MOD) → [root CJ그룹이(1)], [인수계약을(3) 체결했다(4)], {(인수계약을(3) → 대한통운(2))}
4: Shift → [root CJ그룹이(1) 인수계약을(3)], [체결했다(4)], {(인수계약을(3) → 대한통운(2))}
5: Left-arc(NP_OBJ) → [root CJ그룹이(1)], [체결했다(4)], {(체결했다(4) → 인수계약을(3)), ...}
6: Left-arc(NP_SUB) → [root], [체결했다(4)], {(체결했다(4) → CJ그룹이(1)), ...}
7: Right-arc(VP) → [root 체결했다(4)], [], {(root → 체결했다(4)), ...}
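A small executor that replays exactly this transition sequence; the shift/left-arc/right-arc semantics below follow the arc-eager trace above:

```python
from collections import namedtuple

Arc = namedtuple("Arc", "head dep label")

def parse(words, actions):
    """Replay a forward arc-eager transition sequence.
    Words are 1-indexed; index 0 is the artificial root."""
    stack, buf, arcs = [0], list(range(1, len(words) + 1)), []
    for act, label in actions:
        if act == "shift":
            stack.append(buf.pop(0))
        elif act == "left":                   # buffer front governs stack top
            arcs.append(Arc(buf[0], stack.pop(), label))
        elif act == "right":                  # stack top governs buffer front
            arcs.append(Arc(stack[-1], buf[0], label))
            stack.append(buf.pop(0))
    return arcs

words = ["CJ그룹이", "대한통운", "인수계약을", "체결했다"]
actions = [("shift", None), ("shift", None), ("left", "NP_MOD"),
           ("shift", None), ("left", "NP_OBJ"), ("left", "NP_SUB"),
           ("right", "VP")]
lex = ["root"] + words
for a in parse(words, actions):
    print(f"{lex[a.head]} -> {lex[a.dep]} ({a.label})")
# Prints the four arcs from the slide, ending with root -> 체결했다 (VP).
```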

Deep-learning-based Korean Dependency Parsing (한글 및 한국어 14). Transition-based + backward, O(N); Sejong corpus converted to dependency trees, with post-processing for auxiliary and pseudo-auxiliary predicates. Deep learning setup: ReLU (better than Sigmoid) + dropout; Korean word embeddings from NNLM, ranking (hinge, logit), and Word2Vec; feature embeddings: POS tags (stack + buffer, automatically tagged, errors included), dependency labels (stack), distance information, valency information, and mutual information from a large automatically parsed corpus. (Architecture diagram: a word lookup table over S[w_t-2 w_t-1], B[w_t] and a feature lookup table over f_1..f_4, concatenated and fed through Linear → ReLU → Linear to the output.)

Korean Dependency Parsing: Experimental Results. Previous work: UAS 85~88%; Structural SVM baseline: UAS = 89.99%, LAS = 87.74%. Findings: pre-training > no pre-training; dropout > no dropout; ReLU > Sigmoid; MI features > no MI features. Word embeddings ranked by resulting accuracy: 1. NNLM, 2. Ranking (logit loss), 3. Word2Vec, 4. Ranking (hinge loss).

LSTM RNN + CRF: the proposed LSTM-CRF. (Diagram: a sequence labeler with inputs x(t−1), x(t), x(t+1), hidden states h(t−1), h(t), h(t+1), and outputs y(t); the LSTM cell carries input, forget, and output gates i(t), f(t), o(t) and cell state C(t), with the CRF connecting adjacent output labels.)

English Named Entity Recognition (KCC15, journal submitted). English NER on the CoNLL03 data set:

| Model | F1 (dev) | F1 (test) |
|---|---|---|
| SENNA (Collobert) | - | 89.59 |
| Structural SVM (baseline + word embedding feature) | - | 85.58 |
| FFNN (Sigm + Dropout + Word embedding) | 91.58 | 87.35 |
| RNN (Sigm + Dropout + Word embedding) | 91.83 | 88.09 |
| LSTM RNN (Sigm + Dropout + Word embedding) | 91.77 | 87.73 |
| GRU RNN (Sigm + Dropout + Word embedding) | 92.01 | 87.96 |
| CNN+CRF (Sigm + Dropout + Word embedding) | 93.09 | 88.69 |
| RNN+CRF (Sigm + Dropout + Word embedding) | 93.23 | 88.76 |
| LSTM+CRF (Sigm + Dropout + Word embedding) | 93.82 | 90.12 |
| GRU+CRF (Sigm + Dropout + Word embedding) | 93.67 | 89.98 |

Korean Sentiment Analysis: CNN. Mobile data (train: 4543, test: 500); the EMNLP14 CNN model applied, implemented in Matlab; word embeddings: 100K Korean words + 1,420 domain-specific words.

| Data set | Model | Accuracy |
|---|---|---|
| Mobile (train: 4543, test: 500) | SVM (word feature) | 85.58 |
| Mobile (train: 4543, test: 500) | CNN (ReLU, kernel 3, hidden 50) + word embedding | 91.20 |

LSTM RNN-based Korean Sentiment Analysis. LSTM RNN-based encoding: the sentence embedding is fed as input to a fully connected NN that produces the output; GRU encoding behaves similarly.

| Data set | Model | Accuracy |
|---|---|---|
| Mobile (train: 4543, test: 500) | SVM (word feature) | 85.58 |
| Mobile (train: 4543, test: 500) | CNN (ReLU, kernel 3, hidden 50) + word embedding | 91.20 |
| Mobile (train: 4543, test: 500) | GRU encoding + fully connected NN | 91.12 |
| Mobile (train: 4543, test: 500) | LSTM RNN encoding + fully connected NN | 90.93 |
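A minimal PyTorch sketch of this pipeline (RNN encoder, final hidden state as the sentence embedding, fully connected classifier); the vocabulary size, dimensions, and binary-class setup are illustrative, not the paper's settings:

```python
import torch
import torch.nn as nn

class SentenceClassifier(nn.Module):
    def __init__(self, vocab=10000, d_emb=100, d_hid=128, n_cls=2):
        super().__init__()
        self.emb = nn.Embedding(vocab, d_emb)
        self.rnn = nn.GRU(d_emb, d_hid, batch_first=True)  # GRU encoder
        self.fc = nn.Linear(d_hid, n_cls)                  # fully connected NN

    def forward(self, token_ids):               # token_ids: (batch, seq_len)
        _, h_last = self.rnn(self.emb(token_ids))
        return self.fc(h_last[-1])              # final state = sentence embedding

model = SentenceClassifier()
logits = model(torch.randint(0, 10000, (4, 12)))  # a fake batch of 4 sentences
print(logits.shape)                               # torch.Size([4, 2])
```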

Recurrent NN Encoder Decoder for Statistical Machine Translation (EMNLP14)

Sequence to Sequence Learning with Neural Networks (NIPS14, Google). Source vocab: 160,000; target vocab: 80,000; deep LSTMs with 4 layers; training: 7.5 epochs (12M sentences, 10 days on an 8-GPU machine).

Neural MT by Jointly Learning to Align and Translate (ICLR15). GRU RNN + alignment encoding, GRU RNN decoding; vocab: 30,000 (src, tgt); training: 5 days.

J-to-E Neural MT (WAT) 1/2. ASPEC-JE data; Neural MT (RNN-search): GRU RNN + alignment encoding, GRU RNN decoding; vocab size: 20,000 (src, tgt). BLEU (test): 21.63 (beam = 10); WAT14 (Juman) reference systems: PBMT = 18.45, HPBMT = 18.72, NAIST (1st place, forest-to-string) = 23.29.

J-to-E Neural MT experiments (WAT) 2/2. Example source/output pairs (morpheme/POS-tagged, token positions omitted):
Source: 最後/ncc に/ps ,/sl 将来/nca 展望/ncs に/ps つい/vc て/pj 記述/ncs </s>
Output: the/dt future/jj view/nn is/vbz described/vbn ./. </s>
Source: 食物/ncc アレルギー/ncc は/pc アナフィラキシー/ncc の/ps 主要/dc な/vx 原因/ncs 抗原/ncc の/ps 一/nn つ/xnn で/vx ある/vd /op </s>
Output: the/dt food/nn allergy/nn is/vbz one/cd of/in the/dt main/jj causal/jj antigen/nn of/in the/dt anaphylaxis/nn ./. </s>

Neural Conversation Model (Korean customer-service dialog, translated):
Q) My washing machine won't power on.
A) That must have been very inconvenient. / A) Is it a drum washer or a regular top-loader?
Q) When can someone visit?
A) When would a visit be convenient for you? / A) When would you like us to visit?
Q) How much is the service call fee?
A) The on-site inspection fee is 15,000 won. / A) The call-out fee is fifteen thousand won.

Abstractive Text Summarization: RNN_search + input_feeding + CopyNet. Example output (translated from Korean): "A stray cat was captured on camera keeping watch beside its friend killed by a passing car."

Learning to Execute LSTM RNN

Learning Approximate Solutions. Travelling Salesman Problem: NP-hard; a Pointer Network can learn approximate solutions in O(n^2).

End-to-End Neural Speech Recognition (15)

Neural Image Caption Generator (14)

Korean Image Caption Generation: the same multimodal architecture (CNN image embedding fed into a GRU; a softmax predicts the next word W_t+1 from W_t). Example generated captions (translated from Korean): "A young girl is standing in a grass-covered field", "A man standing in front of a building", "A small girl wearing a life jacket is smiling", "A woman and a woman with a pink dog".