Overview Decision Tree Director of TEAMLAB Sungchul Choi


How machine learning models learn - Gradient descent based learning - Probability theory based learning - Information theory based learning - Distance/similarity based learning

Overview - Akinator: a twenty-questions game that guesses celebrities and fictional characters

Overview

Decision Tree Classifier - Builds the tree that best separates the data

Guess who
Man   Long hair   Glasses   Name
Yes   No          Yes       Brian
Yes   No          No        John
No    Yes         No        Aphra
No    No          No        Aoife

Guess who: two candidate trees over the same four people. One asks "Wears glasses?" first, then "Is it a man?" and "Does the person have long hair?"; the other asks "Is it a man?" first. Both reach the leaves Brian, John, Aphra, and Aoife.

Building a Decision Tree - Which question will give the most information toward the answer? - In other words, which question reduces the ambiguity of the answer the most? - The splitting point is chosen from such questions → a formula that picks the splitting point based on the remaining information

Guess who: starting from "Wears glasses?": yes → Brian; no → "Is it a man?": yes → John; no → "Does the person have long hair?": yes → Aphra, no → Aoife (the table of the four people is shown alongside the tree).

Human knowledge belongs to the world.

Information theory - Entropy Decision Tree Director of TEAMLAB Sungchul Choi

Entropy - Entropy is a quantity that expresses the number of possible outcomes on the way to a goal → the smaller it is, the fewer the possibilities - Higher Entropy → Higher uncertainty - Lower Entropy → Lower uncertainty. When entropy is small, we have gained more information about the outcome.

Entropy

h(D) = -\sum_{i=1}^{m} p_i \log_2(p_i)

where D is the data set and p_i is the probability of label i. (The slide also plots -\log_2 p(i) against p(i).)

Entropy

h(D) = -\sum_{i=1}^{m} p_i \log_2(p_i)

where D is the data set and p_i is the probability of label i.
- If the probability is 1, the entropy is 0
- The smaller the probability, the larger -\log_2 p(i) becomes

Entropy - Example

h(d) = mx i=1 h(d) = 9 14 log 2 =0.940bits p i log 2 (p i ) 9 14 + 5 14 log 2 5 14

Human knowledge belongs to the world.

Algorithms of Decision Tree Decision Tree Director of TEAMLAB Sungchul Choi

Growing a Decision Tree - An algorithm is needed to grow (build) the decision tree - How do we make the best branches? - Branches are created based on the data's attributes - Which attribute, when used as the split criterion, gives the smallest entropy? - After one split, how do we handle what remains?

Growing a Decision Tree

Growing a Decision Tree: root split on Age with branches youth / middle_aged / senior

Growing a Decision Tree: Age as root; middle_aged → buy; youth → split on Student (yes / no); senior branch still to be split

Growing a Decision Tree: Age as root; middle_aged → buy; youth → split on Student (yes / no); senior → split on Credit (fair / excellent → buy / NOT buy)

Growing a Decision Tree: final tree. Age is the root; middle_aged → buy; youth → Student (yes → buy, no → NOT buy); senior → Credit (fair → buy, excellent → NOT buy)

Growing a Decision Tree - A decision tree is built recursively - The branch attribute is the one that gives the most certain information about the target label - What counts as "the most certain information" differs by algorithm - After the tree is built, pruning is applied to generalize it - For efficiency, binary trees are normally used (a compact sketch of the recursion is given below)
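A compact sketch of that recursive procedure, assuming rows are represented as plain dicts and that a `choose` function (for example, one that maximizes information gain) is supplied by the caller; the names here are illustrative, not the course's own code:

```python
from collections import Counter

def grow_tree(rows, attributes, target, choose):
    """Recursively grow a decision tree (ID3-style sketch).

    rows       -- list of dicts, e.g. {"age": "youth", "student": "no", "buy": "no"}
    attributes -- attribute names still available for splitting
    choose     -- choose(rows, attributes, target) -> best attribute to split on
    """
    labels = [r[target] for r in rows]
    # Stop when the node is pure or no attributes remain: return a majority-vote leaf.
    if len(set(labels)) == 1 or not attributes:
        return Counter(labels).most_common(1)[0][0]

    best = choose(rows, attributes, target)
    rest = [a for a in attributes if a != best]
    subtree = {}
    for value in {r[best] for r in rows}:          # one branch per attribute value
        subset = [r for r in rows if r[best] == value]
        subtree[value] = grow_tree(subset, rest, target, choose)
    return {best: subtree}
```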

Characteristics of Decision Trees - Results are relatively simple and intuitive to read - Training takes a long time and uses a lot of memory - Top-down, recursive, divide-and-conquer technique - Greedy algorithm → optimizes each split locally

Advantages of Decision Trees - The attributes near the top of the tree are the most important predictors → the tree can also be used as an attribute (feature) selection technique - No scaling of attributes is required - Only the order of observations matters, not their absolute values → an advantage with outliers - Automatic partial variable selection ← tree pruning

Algorithms of Decision Tree - There are broadly two families of decision tree algorithms - They differ in how attribute branches are chosen - ID3 → C4.5 (Ross Quinlan), and CART - Regression trees also exist for continuous target variables

Human knowledge belongs to the world.

ID3 & Information Gain Decision Tree Director of TEAMLAB Sungchul Choi

Information Gain - Branch splitting is done by introducing the entropy function - Information Gain: a measure, based on entropy, of the impurity when the data is partitioned by an attribute - The information gain of an attribute is computed as (entropy of the whole data set) - (entropy after partitioning by that attribute)

Information Gain

Info(D) = -\sum_{i=1}^{n} p_i \log_2(p_i)    : information (entropy) of the whole data set D

Info_A(D) = \sum_{j=1}^{v} \frac{|D_j|}{|D|} Info(D_j)    : information after partitioning D by attribute A

Gain(A) = Info(D) - Info_A(D)    : information gain of attribute A
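The three quantities above can be written almost verbatim in Python. A small sketch (illustrative helper names, assuming the data for one attribute is given as parallel lists of attribute values and class labels):

```python
import math
from collections import Counter

def info(labels):
    """Info(D) = -sum_i p_i * log2(p_i) over the class labels of D."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_a(values, labels):
    """Info_A(D) = sum_j |D_j|/|D| * Info(D_j), partitioning D by attribute A."""
    n = len(labels)
    return sum(
        sum(1 for x in values if x == v) / n
        * info([l for x, l in zip(values, labels) if x == v])
        for v in set(values)
    )

def gain(values, labels):
    """Gain(A) = Info(D) - Info_A(D)."""
    return info(labels) - info_a(values, labels)
```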

ID3 Process

Growing a Decision Tree

Growing a Decision Tree
Age: Gain(age) = Info(D) - Info_age(D)
Credit: Gain(credit) = Info(D) - Info_credit(D)
Income: Gain(income) = Info(D) - Info_income(D)
Student: Gain(student) = Info(D) - Info_student(D)

Growing a Decision Tree: computing Gain(age)

Gain(age) = Info(D) - Info_age(D)

Info_age(D) = \sum_{j=1}^{v} \frac{|D_j|}{|D|} Info(D_j), \quad j \in \{youth, middle\_aged, senior\}

Info_age(D) = \frac{5}{14}\Big(-\frac{2}{5}\log_2\frac{2}{5} - \frac{3}{5}\log_2\frac{3}{5}\Big) + \frac{4}{14}\Big(-\frac{4}{4}\log_2\frac{4}{4}\Big) + \frac{5}{14}\Big(-\frac{3}{5}\log_2\frac{3}{5} - \frac{2}{5}\log_2\frac{2}{5}\Big)

Info(D) = -\frac{9}{14}\log_2\frac{9}{14} - \frac{5}{14}\log_2\frac{5}{14} = 0.940 \text{ bits}

Gain(age) = Info(D) - Info_age(D)
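Plugging the counts from this derivation into the formulas gives a quick numeric check (plain Python, fractions read straight from the slide):

```python
import math

def h(*probs):
    """Entropy of a list of probabilities, skipping zero terms."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

info_d = h(9/14, 5/14)                                               # Info(D)     ~ 0.940 bits
info_age = 5/14 * h(2/5, 3/5) + 4/14 * h(4/4) + 5/14 * h(3/5, 2/5)   # Info_age(D) ~ 0.694
gain_age = info_d - info_age                                         # Gain(age)   ~ 0.246
print(round(info_d, 3), round(info_age, 3), round(gain_age, 3))
```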

Growing a Decision Tree: the same comparison is made for every attribute
Gain(age) = Info(D) - Info_age(D)
Gain(credit) = Info(D) - Info_credit(D)
Gain(income) = Info(D) - Info_income(D)
Gain(student) = Info(D) - Info_student(D)
Age has the largest information gain, so it becomes the root split.

Growing a Decision Tree: Age is placed at the root; middle_aged → buy; the youth and senior branches are grown next

Growing a Decision Tree: the same procedure is repeated on the youth partition D_youth
Gain(income) = Info(D_youth) - Info_income(D_youth)
Gain(student) = Info(D_youth) - Info_student(D_youth)
Gain(credit) = Info(D_youth) - Info_credit(D_youth)

Growing a Decision Tree: resulting tree. Age at the root; middle_aged → buy; youth → Student (yes → buy, no → NOT buy); senior → Credit (fair → buy, excellent → NOT buy)

Human knowledge belongs to the world.

C4.5 & Gini Index Decision Tree Director of TEAMLAB Sungchul Choi

Problems with Information Gain
- Information gain tends to prefer attributes with many distinct values

Gain(A) = Info(D) - Info_A(D)
Info_A(D) = \sum_{j=1}^{v} \frac{|D_j|}{|D|} Info(D_j)
Info(D) = -\sum_{i=1}^{n} p_i \log_2(p_i)

- What if an attribute takes a different value for every record? Every partition D_j is then pure, so Info_A(D) = 0 and the gain is maximal even though the split is useless
- A complementary measure is needed (a small numeric illustration follows below)
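A tiny illustration of why this is a problem, using a made-up "id" attribute that is unique per record (not from the slides):

```python
import math
from collections import Counter

def info(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

labels = ["buy"] * 9 + ["not_buy"] * 5
# Partitioning on a unique-per-record attribute gives 14 pure partitions of size 1,
# so Info_id(D) = 0 and Gain(id) = Info(D): the useless split looks best.
info_id = sum((1 / 14) * info([label]) for label in labels)
print(info(labels) - info_id)   # equals Info(D), ~0.940
```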

Gain Ratio
- The measure used in C4.5, the successor algorithm to ID3: http://dl.acm.org/citation.cfm?id=152181
- The gain is normalized by the split information, which is used in place of Info(D)

SplitInfo_A(D) = -\sum_{j=1}^{v} \frac{|D_j|}{|D|} \log_2\frac{|D_j|}{|D|}

GainRatio(A) = \frac{Gain(A)}{SplitInfo_A(D)} = \frac{Info(D) - Info_A(D)}{SplitInfo_A(D)}
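A matching sketch of the normalization (illustrative names; `gain_a` stands for a precomputed Gain(A) value):

```python
import math
from collections import Counter

def split_info(values):
    """SplitInfo_A(D) = -sum_j |D_j|/|D| * log2(|D_j|/|D|)."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def gain_ratio(gain_a, values):
    """GainRatio(A) = Gain(A) / SplitInfo_A(D)."""
    return gain_a / split_info(values)

# An attribute whose three values split 14 records into groups of 5, 4 and 5:
print(split_info(["youth"] * 5 + ["middle_aged"] * 4 + ["senior"] * 5))  # ~1.58
```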

Gini Index
- The split measure of the CART algorithm: https://www.amazon.com/dp/0412048418?tag=inspiredalgor-20
- Measures how impure a partition of the training tuples is

Gini(D) = 1 - \sum_{i=1}^{m} p_i^2 = 1 - \sum_{i=1}^{m} \left(\frac{|C_{i,D}|}{|D|}\right)^2, where C_i is a class

- It estimates how often the target attribute of the data would be misclassified

Gini Index
- In practice the Gini index traces a curve similar to entropy
- Impurity is maximal at 0.5 and is 0 at both extremes

Gini(D) = \sum_{i=1}^{m} p_i(1 - p_i) = 1 - \sum_{i=1}^{m} p_i^2
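A quick numeric check of this claim for the two-class case (probe values chosen for illustration):

```python
import math

def gini_binary(p):
    """Two-class Gini impurity: 1 - p^2 - (1-p)^2."""
    return 1 - p**2 - (1 - p)**2

def entropy_binary(p):
    """Two-class entropy in bits; 0 at the extremes."""
    return 0.0 if p in (0, 1) else -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

for p in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(p, round(gini_binary(p), 3), round(entropy_binary(p), 3))
# Both curves peak at p = 0.5 (Gini = 0.5, entropy = 1.0) and are 0 at p = 0 and p = 1.
```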

Binary Split
- The CART algorithm assumes binary splits
- When k is the number of distinct values of an attribute, 2^{k-1} - 1 candidate splits are generated; for k = 3, 2^{3-1} - 1 = 3

Gini_A(D) = \frac{|D_1|}{|D|} Gini(D_1) + \frac{|D_2|}{|D|} Gini(D_2)

- The split with the smallest Gini value is chosen (https://goo.gl/e7uadc)
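One way to enumerate those 2^{k-1} - 1 candidate splits with itertools (a sketch; mirror-image splits are counted once by keeping the first value on the left side):

```python
from itertools import combinations

def binary_splits(values):
    """Yield every way to split k attribute values into two non-empty groups,
    counting each {left, right} pair once: 2^(k-1) - 1 splits in total."""
    values = list(values)
    first, rest = values[0], values[1:]
    for r in range(len(rest) + 1):
        for combo in combinations(rest, r):
            left = {first, *combo}
            right = set(values) - left
            if right:                        # drop the split with an empty right side
                yield left, right

for left, right in binary_splits(["youth", "middle_aged", "senior"]):
    print(left, "vs", right)                 # k = 3 gives 2^(3-1) - 1 = 3 splits
```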

Growing a Decision Tree

Growing a CART Decision Tree
Age: Gini_age(D)   Credit: Gini_credit(D)   Income: Gini_income(D)   Student: Gini_student(D)

Growing a CART Decision Tree: Age, Gini_age(D)
age ∈ {youth, middle_aged, senior}
age_1: {youth} vs. {middle_aged, senior}
age_2: {middle_aged} vs. {youth, senior}
age_3: {senior} vs. {youth, middle_aged}
The Gini index is computed for all three candidate splits.

Growing a CART Decision Tree: Age, Gini_age(D), age ∈ {youth, middle_aged, senior}

Gini_A(D) = \frac{|D_1|}{|D|} Gini(D_1) + \frac{|D_2|}{|D|} Gini(D_2)

Gini_{age_1}(D) = \frac{5}{14} Gini(D_1) + \frac{9}{14} Gini(D_2)

Gini(D_1) = 1 - \left(\frac{2}{5}\right)^2 - \left(\frac{3}{5}\right)^2 \qquad Gini(D_2) = 1 - \left(\frac{7}{9}\right)^2 - \left(\frac{2}{9}\right)^2

Growing a CART Decision Tree: Age, Gini_age(D). Among the three candidate splits
age_1: {youth} vs. {middle_aged, senior}, age_2: {middle_aged} vs. {youth, senior}, age_3: {senior} vs. {youth, middle_aged},
the best is Min(Gini_{age_i}) = 0.357

Growing a CART Decision Tree
Min(Gini_age) = 0.357   Min(Gini_income) = 0.443   Min(Gini_credit) = 0.429   Min(Gini_student) = 0.367
Age gives the smallest Gini value, so its best binary split is chosen.

CODE

Gini(D) = 1 - \sum_{i=1}^{m} p_i^2 = 1 - \sum_{i=1}^{m} \left(\frac{|C_{i,D}|}{|D|}\right)^2
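The code shown on this slide is not reproduced in the transcript; a plain-Python sketch of the formula would look like this (illustrative, not the original lecture code):

```python
from collections import Counter

def gini(labels):
    """Gini(D) = 1 - sum_i (|C_i,D| / |D|)^2 over the class labels of D."""
    n = len(labels)
    return 1 - sum((c / n) ** 2 for c in Counter(labels).values())

print(round(gini(["buy"] * 9 + ["not_buy"] * 5), 3))  # ~0.459 for the 9-vs-5 data set
```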

CODE

CODE

Gini_A(D) = \frac{|D_1|}{|D|} Gini(D_1) + \frac{|D_2|}{|D|} Gini(D_2)
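And a matching sketch for the binary-split version, reproducing the 0.357 value from the age split above (again illustrative, not the original lecture code):

```python
from collections import Counter

def gini(labels):
    n = len(labels)
    return 1 - sum((c / n) ** 2 for c in Counter(labels).values())

def gini_split(left_labels, right_labels):
    """Gini_A(D) = |D1|/|D| * Gini(D1) + |D2|/|D| * Gini(D2)."""
    n = len(left_labels) + len(right_labels)
    return (len(left_labels) / n) * gini(left_labels) \
         + (len(right_labels) / n) * gini(right_labels)

# Split {middle_aged} vs {youth, senior}: 4 pure "buy" records vs 5 buy / 5 not buy.
print(round(gini_split(["buy"] * 4, ["buy"] * 5 + ["not_buy"] * 5), 3))  # 0.357
```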

Human knowledge belongs to the world.

Tree pruning Decision Tree Director of TEAMLAB Sungchul Choi

Problems that arise while building a Decision Tree - Deciding the leaf nodes for each class - Too many leaf nodes → overfitting - A node can have low impurity or variance yet contain only a single data point - Deciding at what point to prune the tree is important

Pre-pruning - While the tree is being built, a node is not created if it falls below a set criterion → e.g. the number of records in a child node, or the label ratio in a child node (one class above 95%?) - Choosing the threshold is difficult; methods such as CHAID are used - Computationally efficient and works well on small data sets - Attributes can be missed, and the chance of underfitting is higher

Post-pruning - Performed after the tree has been built, to minimize the tree's error rate - Criteria such as minimizing the misclassification rate or the cost complexity are used - A separate validation set is created for this check
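As a concrete illustration of both ideas with scikit-learn (the library is an assumption here; it is not mentioned on the slides): growth limits such as `max_depth` and `min_samples_leaf` act as pre-pruning, while `ccp_alpha` with `cost_complexity_pruning_path` performs cost-complexity post-pruning, validated on a held-out set.

```python
# Illustrative only; any labeled data set could replace load_breast_cancer.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

# Pre-pruning: stop growing when nodes get too deep or too small.
pre = DecisionTreeClassifier(max_depth=4, min_samples_leaf=5).fit(X_train, y_train)

# Post-pruning: compute the cost-complexity path, then pick the alpha that
# minimizes the misclassification rate on the validation set.
path = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(X_train, y_train)
post = max(
    (DecisionTreeClassifier(random_state=0, ccp_alpha=a).fit(X_train, y_train)
     for a in path.ccp_alphas),
    key=lambda tree: tree.score(X_val, y_val),
)
print(pre.score(X_val, y_val), post.score(X_val, y_val))
```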

Minimizing the misclassification rate - Suppose we have validation data like the following

Minimizing the misclassification rate

Human knowledge belongs to the world.