Development and Performance Evaluation of a Relation Extraction System Using Semantic Information



\[
\phi : \vec{x}_{pt} \in X \mapsto \phi(\vec{x}_{pt}) \in \Phi \subseteq \mathbb{R}^{N}
\]

\[
\phi(\vec{x}_{pt}) = \bigl( f_1(\vec{x}_{pt}),\ f_2(\vec{x}_{pt}),\ \ldots,\ f_N(\vec{x}_{pt}) \bigr)
\]

f_i(\vec{x}_{pt}) = the number of times subtree s_i \in S appears in \vec{x}_{pt}
S = the set of all unique subtrees of the entire tree set

\[
K_{pt}(\vec{x}_{pt_1}, \vec{x}_{pt_2})
  = \phi(\vec{x}_{pt_1}) \cdot \phi(\vec{x}_{pt_2})
  = \sum_{i=1}^{N} \bigl[ f_i(\vec{x}_{pt_1})\, f_i(\vec{x}_{pt_2}) \bigr]
\]

Δ: ... if the topmost node of ... is ...; otherwise ...

FUNCTION delta(TreeNode n1, TreeNode n2, λ, σ)
    n1 = one node of T1;
    n2 = one node of T2;
    λ  = tree kernel decay factor;
    σ  = substructure division method;   // ST (0) or SST (1)
BEGIN
    nc1 = get_children_number(n1);
    nc2 = get_children_number(n2);

    // Both nodes are leaves: compare their values.
    IF nc1 EQUAL 0 AND nc2 EQUAL 0 THEN
        nv1 = get_node_value(n1);
        nv2 = get_node_value(n2);
        IF nv1 EQUAL nv2 THEN
            RETURN 1;
        END IF
    END IF

    // Different production rules: no common substructure.
    np1 = get_production_rule(n1);
    np2 = get_production_rule(n2);
    IF np1 NOT EQUAL np2 THEN
        RETURN 0;
    END IF

    // Same production rule and a single child each.
    IF np1 EQUAL np2 AND nc1 EQUAL 1 AND nc2 EQUAL 1 THEN
        RETURN λ;
    END IF

    // Otherwise multiply the recursive delta values of the children.
    mult_delta = 1;
    FOR I = 1 TO nc1
        nch1 = I-th child of n1;
        nch2 = I-th child of n2;
        mult_delta = mult_delta × (σ + delta(nch1, nch2, λ, σ));
    END FOR
    RETURN λ × mult_delta;
END
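The recursion above can be sketched in Python. This is a minimal illustration rather than the authors' implementation: the `TreeNode` class, the `production` helper and the choice to sum delta over every pair of nodes (leaves included) are assumptions made for the example, and the single-child test mirrors the pseudocode literally.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class TreeNode:
    """Hypothetical parse tree node: a label and an ordered list of children."""
    label: str
    children: List["TreeNode"] = field(default_factory=list)


def production(node: TreeNode) -> Tuple[str, Tuple[str, ...]]:
    """Production rule at a node: its label plus the labels of its children."""
    return node.label, tuple(child.label for child in node.children)


def delta(n1: TreeNode, n2: TreeNode, lam: float, sigma: int) -> float:
    """Weighted count of common substructures rooted at n1 and n2."""
    # Two leaves match only when their values are equal.
    if not n1.children and not n2.children:
        return 1.0 if n1.label == n2.label else 0.0
    # Different productions share no substructure.
    if production(n1) != production(n2):
        return 0.0
    # Same production with a single child each (n2 has one child as well).
    if len(n1.children) == 1:
        return lam
    result = lam
    for c1, c2 in zip(n1.children, n2.children):
        result *= sigma + delta(c1, c2, lam, sigma)
    return result


def all_nodes(root: TreeNode) -> Iterator[TreeNode]:
    """Yield every node of the tree in pre-order."""
    yield root
    for child in root.children:
        yield from all_nodes(child)


def tree_kernel(t1: TreeNode, t2: TreeNode, lam: float, sigma: int) -> float:
    """Sum delta over all node pairs (assumption: leaves are included)."""
    return sum(delta(a, b, lam, sigma)
               for a in all_nodes(t1) for b in all_nodes(t2))


# Toy tree: (NP (DT the) (NN dog))
np1 = TreeNode("NP", [TreeNode("DT", [TreeNode("the")]),
                      TreeNode("NN", [TreeNode("dog")])])
print(tree_kernel(np1, np1, lam=0.5, sigma=1))
```

With sigma set to 0 only complete subtrees contribute (ST); with sigma set to 1 partial expansions are counted as well (SST), which is why the second setting yields the larger value on the same pair of trees.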


FUNCTION word_sense_disambiguation(word, POS, context, level)
    word    = target word to be disambiguated;
    POS     = part of speech of the word;
    context = neighboring words of the word;
    level   = synset level to be considered when extracting synset words;
BEGIN
    synsets = search_word_in_wordnet(word, POS);
    IF (synsets IS EMPTY) THEN RETURN NULL;

    max_dups = 0;
    max_synset = NULL;
    FOR EACH synset IN synsets retrieved
    BEGIN
        sw = get_synset_words(synset, level);
        dups = get_duplication_count(sw, context);
        IF max_dups < dups THEN
            max_dups = dups;
            max_synset = synset;
        END IF
    END FOR
    RETURN max_synset;
END
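The overlap-counting idea in this procedure can be illustrated with a small Python sketch. It does not call WordNet: `TOY_INVENTORY` is a made-up sense inventory standing in for the synset words that would be retrieved for a given level, and the `level` argument of the original procedure is left out for brevity.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Hypothetical sense inventory: (word, POS) -> {sense id: related synset words}.
TOY_INVENTORY: Dict[Tuple[str, str], Dict[str, List[str]]] = {
    ("bank", "n"): {
        "bank.n.01": ["river", "slope", "water", "land"],
        "bank.n.02": ["money", "deposit", "financial", "institution"],
    }
}


def disambiguate(word: str, pos: str, context: Sequence[str],
                 inventory: Dict[Tuple[str, str], Dict[str, List[str]]] = TOY_INVENTORY
                 ) -> Optional[str]:
    """Return the sense whose related words overlap most with the context."""
    senses = inventory.get((word, pos))
    if not senses:
        return None  # the word is not in the inventory
    context_words = set(context)
    best_sense, best_overlap = None, 0
    for sense, sense_words in senses.items():
        overlap = len(context_words & set(sense_words))
        if overlap > best_overlap:  # strict comparison, as in the pseudocode
            best_sense, best_overlap = sense, overlap
    return best_sense


print(disambiguate("bank", "n", ["she", "made", "a", "deposit", "of", "money"]))
```

As in the pseudocode, a word with no overlapping sense yields no result, so a caller has to decide how to treat undisambiguated words.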

\[
K_{sem}(T_1, T_2, \lambda, \sigma, \alpha)
  = \sum_{n_1 \in N_1} \sum_{n_2 \in N_2} \Delta_{sem}(n_1, n_2, \lambda, \sigma, \alpha)
\]

- λ: tree kernel decay factor (tree depth)
- σ: substructure division method
    - 0: SubTree (ST)
    - 1: SubSet Tree (SST)
- α: abstraction level of the lexical concept
    - 0: the synset selected by WSD
    - 1: the parent synset of that synset
    - 2: the grandparent synset of that synset

Δ_sem(n1, n2, λ, σ, α):

FUNCTION Semantic_Delta(TreeNode n1, TreeNode n2, λ, σ, α)
BEGIN
    IF n1 and n2 are both terminal nodes THEN
        concept1 = get_semantic_concept(n1, α);
        concept2 = get_semantic_concept(n2, α);
        IF concept1 == concept2 THEN RETURN 1;
        RETURN 0;
    END IF

    IF n1 and n2 are from different productions THEN
        RETURN 0;
    END IF

    IF the productions of n1 and n2 are the same THEN
        IF n1 and n2 are pre-terminal nodes THEN RETURN λ;
        RETURN λ × ∏_{j=1}^{nc(n1)} (σ + Semantic_Delta(ch(n1, j), ch(n2, j), λ, σ, α));
    END IF
END
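A minimal Python sketch of the semantic delta follows, under stated assumptions: `Node`, `HYPERNYMS` and `concept` are hypothetical stand-ins for the parse tree structure and for the WordNet lookup, and the kernel sums the delta over all node pairs, so that lexical similarity enters through the terminal-terminal pairs.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Tuple


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)


# Hypothetical hypernym chains: word -> [own concept, parent, grandparent].
HYPERNYMS: Dict[str, List[str]] = {
    "dog": ["dog", "canine", "carnivore"],
    "wolf": ["wolf", "canine", "carnivore"],
    "cat": ["cat", "feline", "carnivore"],
}


def concept(word: str, alpha: int) -> str:
    """Concept of a word at abstraction level alpha (0, 1 or 2)."""
    chain = HYPERNYMS.get(word, [word])
    return chain[min(alpha, len(chain) - 1)]


def production(node: Node) -> Tuple[str, Tuple[str, ...]]:
    return node.label, tuple(c.label for c in node.children)


def is_preterminal(node: Node) -> bool:
    return len(node.children) == 1 and not node.children[0].children


def semantic_delta(n1: Node, n2: Node, lam: float, sigma: int, alpha: int) -> float:
    # Terminal nodes are compared through their (abstracted) concepts.
    if not n1.children and not n2.children:
        return 1.0 if concept(n1.label, alpha) == concept(n2.label, alpha) else 0.0
    if production(n1) != production(n2):
        return 0.0
    if is_preterminal(n1) and is_preterminal(n2):
        return lam
    result = lam
    for c1, c2 in zip(n1.children, n2.children):
        result *= sigma + semantic_delta(c1, c2, lam, sigma, alpha)
    return result


def all_nodes(root: Node) -> Iterator[Node]:
    yield root
    for child in root.children:
        yield from all_nodes(child)


def semantic_kernel(t1: Node, t2: Node, lam: float, sigma: int, alpha: int) -> float:
    return sum(semantic_delta(a, b, lam, sigma, alpha)
               for a in all_nodes(t1) for b in all_nodes(t2))


def noun_phrase(noun: str) -> Node:
    """Toy tree: (NP (DT the) (NN <noun>))."""
    return Node("NP", [Node("DT", [Node("the")]), Node("NN", [Node(noun)])])


for level in (0, 1):
    print(level, semantic_kernel(noun_phrase("dog"), noun_phrase("wolf"), 0.5, 1, level))
```

On this toy pair the kernel value rises when alpha goes from 0 to 1, because "dog" and "wolf" share the parent concept in the made-up hierarchy while their own concepts differ.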

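\[
\begin{aligned}
sim_{lex}(W_1, W_2, \alpha)
  &\equiv sim(W_1, W_2, \alpha) && (9\text{-}1)\\
  &\approx \sum_{w_1 \in W_1,\, c_1 \subset W_1}\ \sum_{w_2 \in W_2,\, c_2 \subset W_2} sim(w_1, c_1, w_2, c_2, \alpha) && (9\text{-}2)\\
  &\approx \sum_{w_1 \in W_1,\, c_1 \subset W_1}\ \sum_{w_2 \in W_2,\, c_2 \subset W_2} I\bigl(concept(w_1, c_1, \alpha),\ concept(w_2, c_2, \alpha)\bigr) && (9\text{-}3)\\
  &\approx \sum_{w_1 \in W_1,\, c_1 \subset W_1}\ \sum_{w_2 \in W_2,\, c_2 \subset W_2} I\bigl(synset(w_1, c_1, \alpha),\ synset(w_2, c_2, \alpha)\bigr) && (9\text{-}4)\\
  &= \sum_{w_1 \in W_1,\, c_1 \subset W_1}\ \sum_{w_2 \in W_2,\, c_2 \subset W_2} I\bigl(synset_{pos}(w_1, c_1, \alpha),\ synset_{pos}(w_2, c_2, \alpha)\bigr) && (9\text{-}5)
\end{aligned}
\]

\[
K_{sem}(T_1, T_2, \lambda, \sigma, \alpha)
  = \sum_{n_1 \in N_1} \sum_{n_2 \in N_2} \Delta_{sem}(n_1, n_2, \lambda, \sigma, \alpha)
  = sim_{syn}(T_1, T_2, \lambda, \sigma) + sim_{lex}(W_1, W_2, \alpha)
\]

\[
sim_{syn}(T_1, T_2, \lambda, \sigma) \equiv
  \sum_{n_1 \in N_1,\, n_1 \notin L_1}\ \sum_{n_2 \in N_2,\, n_2 \notin L_2} \Delta(n_1, n_2, \lambda, \sigma)
\]

\[
K_{sem}(T_1, T_2, \lambda, \sigma, \alpha) \equiv
  \sum_{n_1 \in N_1,\, n_1 \notin L_1}\ \sum_{n_2 \in N_2,\, n_2 \notin L_2} \Delta(n_1, n_2, \lambda, \sigma)
  + \sum_{w_1 \in W_1,\, c_1 \subset W_1}\ \sum_{w_2 \in W_2,\, c_2 \subset W_2} I\bigl(synset_{pos}(w_1, c_1, \alpha),\ synset_{pos}(w_2, c_2, \alpha)\bigr)
\]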

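Tree Pruning Methods | Details
Minimum Complete Tree (MCT) | The minimal complete subtree of the parse tree that contains both entities
Path-enclosed Tree (PT) | The subtree enclosed by the shortest path connecting the two entities
Chunking Tree (CT) | PT with all internal nodes removed except base phrases and part-of-speech information
Context-sensitive PT (CPT) | PT extended with one node to the left of the left entity and one node to the right of the right entity
Context-sensitive CT (CCT) | CT extended with one node to the left of the left entity and one node to the right of the right entity
Flattened PT (FPT) | PT with the nodes that have only one parent and one child removed (part-of-speech nodes excluded)
Flattened CPT (FCPT) | CT with the nodes that have only one parent and one child removed (part-of-speech nodes excluded)

F Ranking: 7 5 3 6 4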

                  | AIMed | BioInfer | HPRD50 | IEPA | LLL
Sentences         | 1955  | 1100     | 145    | 486  | 77
Positive instance | 1000  | 2534     | 163    | 335  | 164
Negative instance | 4834  | 7132     | 270    | 482  | 166

Causal (658) Change (599) Relations and #Instances 56 05 Amount 39 Full-Stop Dynamics Negative 48 (90) Positive 80 Start 7 Unspecified 4 Location 55 8 Physical Assembly 788 Break-Down 4 (90) Modification 44 (00) Condition (3) 3 HUMANMADE (4) 4 IS_A (5) 95 Equality (30) 30 7 Observation (55) Spatial 7 Temporal PART_OF (38) Collection:Member 56 Object:Component 6 RELATE (87) 87 Addition 56

Level 1 2 3 4 5 6 #Total
# Relation Type Classes 4 8 6 8 0 8
# Relation Predicates 4 6 0 7 6 5 68
# Total 8 4 6 5 8 5 96

RCP_L1 6 RCP_L2 RCP_L3 RCO 5

<?xml version="1.0" encoding="euc-kr"?>
<!--
  Overview:
    DOC    := TEXT NRLIST   [attrs: did]
    TEXT   := TI AB
    TI     := S+
    AB     := S+
    S      := (PCDATA | NE)*
    NE     := PCDATA        [attrs: eid co_ref class nn]
    NRLIST := NR*
    NR     := EMPTY         [attrs: rid eid_1 eid_2 rel psv]
-->

<!-- The DOC element consists of a TEXT element and an NRLIST element
     that holds the relations found within the TEXT element. -->
<!ELEMENT DOC (TEXT, NRLIST)>
<!-- did: document identifier -->
<!ATTLIST DOC did CDATA #REQUIRED>

<!-- The TEXT element consists of a TI element (title) and an AB element (abstract). -->
<!ELEMENT TEXT (TI, AB)>

<!-- The TI element consists of S elements (sentences). -->
<!ELEMENT TI (S+)>

<!-- The AB element also consists of S elements. -->
<!ELEMENT AB (S+)>

<!-- An S element mixes text with NE elements (named entities, that is,
     science and technology core entities). -->
<!ELEMENT S (#PCDATA | NE)*>

<!-- An NE element carries the information of a real entity.
     eid: entity identifier; co_ref: coreference identifier;
     class: class of the entity; nn: normalized name -->
<!ELEMENT NE (#PCDATA)>
<!ATTLIST NE
  eid    CDATA #REQUIRED
  co_ref CDATA #IMPLIED
  class  CDATA #REQUIRED
  nn     CDATA #REQUIRED>

<!-- The NRLIST element is a collection of relations and consists of NR elements. -->
<!ELEMENT NRLIST (NR*)>

<!-- An NR element holds the relation between two entities.
     rid: relation identifier; eid_1: first entity id; eid_2: second entity id;
     rel: relation class; psv: active (0) or passive (1) -->
<!ELEMENT NR EMPTY>
<!ATTLIST NR
  rid   CDATA #REQUIRED
  eid_1 CDATA #REQUIRED
  eid_2 CDATA #REQUIRED
  rel   CDATA #REQUIRED
  psv   (0 | 1) #REQUIRED>
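To show how a document in this format might be consumed, the sketch below reads a small sample with Python's standard `xml.etree.ElementTree`. The sample text, the entity classes and the identifiers are invented for illustration, and the element and attribute names (`TEXT`, `TI`, `NRLIST`, `eid_1`, `eid_2`) assume the reading of the DTD in which the characters lost in extraction are restored.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Invented sample document; only its structure follows the DTD.
SAMPLE = """<DOC did="D0001">
  <TEXT>
    <TI><S>A <NE eid="e1" class="Device" nn="gas sensor">gas sensor</NE> study.</S></TI>
    <AB><S>The sensor uses <NE eid="e2" class="Material" nn="graphene">graphene</NE>.</S></AB>
  </TEXT>
  <NRLIST>
    <NR rid="r1" eid_1="e2" eid_2="e1" rel="PART_OF" psv="0"/>
  </NRLIST>
</DOC>"""


def read_relations(xml_text: str) -> List[Tuple[str, str, str, bool]]:
    """Return (relation class, first entity, second entity, is_passive) tuples."""
    doc = ET.fromstring(xml_text)
    # Map each entity identifier to its normalized name.
    names = {ne.get("eid"): ne.get("nn") for ne in doc.iter("NE")}
    return [
        (nr.get("rel"), names[nr.get("eid_1")], names[nr.get("eid_2")],
         nr.get("psv") == "1")
        for nr in doc.iter("NR")
    ]


print(read_relations(SAMPLE))
```

Resolving `eid_1` and `eid_2` through the `NE` elements keeps the relation list independent of surface forms, which is presumably why the format stores a normalized name per entity.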

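Parameter | Details | Range | Number of values
λ | Decay factor of the parse tree kernel | 0.1 ~ 1.0 (step: 0.1) | 10
C | SVM regularization parameter | 1.0 ~ 7.0 (step: 1.0) | 7
α | Generalization level of the lexical concept in the semantic parse tree kernel. 0: use the node concept as is; 1: use the parent of the current node concept; 2: use the grandparent of the current node concept; N: conventional parse tree kernel | 0, 1, 2, N | 4
Total number of systems | | | 280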
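The size of this search space can be checked with a few lines of Python. The sketch assumes the ranges read as λ from 0.1 to 1.0 in steps of 0.1, C from 1.0 to 7.0 in steps of 1.0, and four kernel settings (α = 0, 1, 2 plus the conventional parse tree kernel, written here as `None`); the variable names are illustrative only.

```python
from itertools import product

# Assumed parameter ranges (see the lead-in above).
lambdas = [round(0.1 * i, 1) for i in range(1, 11)]   # 0.1, 0.2, ..., 1.0
c_values = [float(c) for c in range(1, 8)]            # 1.0, 2.0, ..., 7.0
alphas = [0, 1, 2, None]                              # None = conventional kernel

# One system per combination of decay factor, regularization and kernel setting.
grid = list(product(lambdas, c_values, alphas))
print(len(grid))
```

Each tuple in `grid` corresponds to one trained system, so the product of the three range sizes gives the total number of systems.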

Collection | Tree Kernels | Abstraction Level | DF (λ) | Regularization Factor (C) | mi-F | Precision | Recall | ma-F
AImed | SPK | | 0.5 | 7.0 | 89.33 | 84.86 | 77.45 | 80.99
BioInfer | SPK | 0 | 0.5 | 5.0 | 89.00 | 87. | 84.8 | 86.00
IEPA | PK | - | 0.4 | 7.0 | 79.7 | 78.5 | 78.30 | 78.4
HPRD50 | SPK/PK | 0/1/2 | 0.7 | 6.0 | 85. | 84.74 | 83.4 | 84.07
LLL | SPK | | 0.4 | 4.0 | 88.48 | 88.64 | 88.47 | 88.55
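In the AImed row, the ma-F value agrees with the harmonic mean of the Precision and Recall columns. The short Python check below illustrates that relationship; treating ma-F as this harmonic mean is an assumption drawn from the numbers, not a definition stated in the text, and `f_score` is a hypothetical helper name.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (balanced F-measure)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# AImed row: Precision 84.86, Recall 77.45, reported ma-F 80.99.
print(round(f_score(84.86, 77.45), 2))
```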

Collections | PK | SPK (α = 0) | SPK (α = 1) | SPK (α = 2) | SPK (total) | Coverage rate
AImed | 7 | 4 | 5 | 4 | 13 | 65%
BioInfer | 3 | 7 | 5 | 5 | 17 | 85%
IEPA | 12 | 4, 3 (one of the three α counts is missing) | | | 8 | 40%
HPRD50 | 6 | 4 | 4 | 6 | 14 | 70%
LLL | 4 | 5 | 5 | 6 | 16 | 80%

System | AImed | BioInfer | HPRD50 | IEPA | LLL | Average
Airola et al. (2008) [3] | 56.4 | 61.3 | 63.4 | 75.1 | 76.8 | 66.60
Miwa et al. (2009) [4] | 60.8 | 68.1 | 70.9 | 71.7 | 80.1 | 70.32
Our system (PK, λ = 0.4) | 75.4 | 8. | 77.9 | 75. | 85.5 | 79.0
Our system (SPK, α = 0, λ = 0.4) | 75.5 | 8.4 | 77.9 | 75.6 | 85. | 79.
Our system (SPK, α = 1, λ = 0.4) | 75. | 8.3 | 77.9 | 75. | 85. | 78.94
Our system (SPK, α = 2, λ = 0.4) | 74.8 | 8. | 77.9 | 75. | 85.5 | 78.90

Relation Set | Tree Kernels | Abstraction Level | DF (λ) | Regularization Factor (C) | mi-F | Precision | Recall | ma-F
RCP_L1 | SPK | | 0.3 | 5 | 9.63 | 75.05 | 63.03 | 68.5
RCP_L2 | SPK | 0 | 0. | 7 | 90.5 | 76.65 | 60.7 | 67.48
RCP_L3 | SPK | | 0. | 4 | 78.06 | 7.86 | 5.65 | 60.77
RCO | SPK | 0 | 0.4 | 5 | 78.0 | 75.46 | 57.74 | 65.4
Average | SPK | - | - | - | 84.55 | 74.75 | 58.4 | 65.54

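Collections | PK | SPK (α = 0) | SPK (α = 1) | SPK (α = 2) | SPK (total) | Coverage rate
RCP_L1 | 8 | 9 | 7 | 6 | 22 | 73.3%
RCP_L2 | 9 | 8 | 7 | 6 | 21 | 70.0%
RCP_L3 | 8 | 9 | 7 | 6 | 22 | 73.3%
RCO | 5 | 9 | 9 | 7 | 25 | 83.3%

Relation set (settings) | Tree kernel | mi-F | Precision | Recall | ma-F
RCP_L1 (λ=0.3, C=7.0) | PK | 9.94 | 75.68 | 6.3 | 68.30
RCP_L1 (λ=0.3, C=7.0) | SPK (α=0) | 9.90 | 75.63 | 6.37 | 68.36
RCP_L1 (λ=0.3, C=7.0) | SPK (α=1) | 9.7 | 75.55 | 6.9 | 68.8
RCP_L1 (λ=0.3, C=7.0) | SPK (α=2) | 9.68 | 75.0 | 6.7 | 67.7
RCP_L2 (λ=0., C=5.0) | PK | 89.99 | 76.47 | 58.9 | 66.55
RCP_L2 (λ=0., C=5.0) | SPK (α=0) | 89.90 | 76.43 | 58.80 | 66.47
RCP_L2 (λ=0., C=5.0) | SPK (α=1) | 89.90 | 76.6 | 58.88 | 66.58
RCP_L2 (λ=0., C=5.0) | SPK (α=2) | 89.8 | 76.66 | 58.03 | 66.06
RCP_L3 (λ=0., C=4.0) | PK | 78.0 | 7.84 | 5.8 | 60.0
RCP_L3 (λ=0., C=4.0) | SPK (α=0) | 78.0 | 7.65 | 5.73 | 60.75
RCP_L3 (λ=0., C=4.0) | SPK (α=1) | 78.06 | 7.86 | 5.65 | 60.77
RCP_L3 (λ=0., C=4.0) | SPK (α=2) | 77.53 | 7.9 | 5.54 | 59.83
RCO (λ=0.4, C=5.0) | PK | 77.88 | 74.90 | 57.05 | 64.77
RCO (λ=0.4, C=5.0) | SPK (α=0) | 78.0 | 75.46 | 57.74 | 65.4
RCO (λ=0.4, C=5.0) | SPK (α=1) | 77.84 | 75.50 | 57.5 | 65.9
RCO (λ=0.4, C=5.0) | SPK (α=2) | 77.44 | 74.96 | 57.09 | 64.8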
