딥러닝 첫걸음 (Deep Learning First Steps) 4. Neural Networks and Classification (MultiClass)
Multiclass Classification Networks
Categorization (classification): the prediction target is a category.
Binary classification: the target has 2 categories; one output node; a multilayer network (covered in Chapter 3).
Multiclass classification: the target has 3 or more categories; 2 or more output nodes; a multilayer network.
Output layer: the Softmax function (paired with the cross-entropy cost, so the output delta equals the error); a minimal sketch follows below.
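A minimal sketch of the idea (illustrative values, not from the slides): the Softmax function turns the raw activations of the output nodes into class probabilities that sum to 1, and the predicted category is the node with the largest probability.

v = [2.0; 1.0; 0.1];            % raw outputs of a 3-node output layer (made-up values)
y = exp(v) / sum(exp(v))        % class probabilities; sum(y) is 1
[~, predicted] = max(y)         % predicted category: 1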
Multiclass Classification Networks: training inputs X1, X2, X3 with one-hot targets
D = [1 0 0; 0 1 0; 0 0 1]
[Figure: network diagram mapping each input image to its one-hot target row]
Multiclass Classification Networks: Input and Output
[Figure: inputs X1-X3, hidden nodes Y^1_1-Y^1_4, outputs Y1-Y3]
Multiclass Classification Networks: Back-Propagation
[Figure: error signals propagated back from the output layer toward the inputs X1-X3]
Multiclass Classification Networks: Weight Update
[Figure: weight adjustments on the connections from the inputs X1-X3]
Multiclass Network Training (parts 1-3)
[Figures: the step-by-step training derivation; equations not recoverable from the extraction]
Softmax Function: its first derivative has the same form as the derivative of the Sigmoid function.
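Stated as formulas (a reconstruction consistent with Softmax.m below, for output activations $v_1, \dots, v_M$):

$$y_i = \frac{e^{v_i}}{\sum_{m=1}^{M} e^{v_m}}, \qquad \frac{\partial y_i}{\partial v_i} = y_i \, (1 - y_i),$$

the same form as the Sigmoid derivative $\sigma'(v) = \sigma(v)\,(1 - \sigma(v))$.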
Softmax.m
function y = Softmax(x)
  ex = exp(x);
  y = ex / sum(ex);
end
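A numerically safer variant (my own sketch, not part of the course files): subtracting max(x) before exponentiating avoids overflow for large activations and returns the same probabilities, since the common factor cancels in the ratio.

function y = SoftmaxStable(x)          % hypothetical helper, not in the slides
  ex = exp(x - max(x));                % shift so the largest exponent is 0
  y = ex / sum(ex);                    % identical result to Softmax(x)
end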
Example: Letter Recognition
Architecture: input layer (25 nodes) -> hidden layer (50 nodes) -> output layer (5 nodes)
[Figure: inputs X1-X25, hidden nodes Y^1_1-Y^1_50, outputs Y1-Y5]
Training Data and Validation Data
[Figure: the 5x5 pixel images used for training and for validation]
training function: Multiclass.m
function [W1, W2] = Multiclass(W1, W2, X, D)
  % 1. Learning rate
  alpha = 0.9;
  % 2. Loop setup: SGD over the N training images
  N = 5;
  for k = 1:N
    % 3. Input -> output (forward pass)
    x  = reshape(X(:, :, k), 25, 1);   % 25x1 input vector
    d  = D(k, :)';                     % 5x1 one-hot target
    v1 = W1 * x;                       % 50x1 = (50x25)(25x1)
    y1 = Sigmoid(v1);
    v  = W2 * y1;                      % 5x1 = (5x50)(50x1)
    y  = Softmax(v);
    % 4. Back-propagation
    e      = d - y;                    % 5x1 error
    delta  = e;                        % softmax + cross-entropy: delta = e
    e1     = W2' * delta;              % 50x1 = (50x5)(5x1)
    delta1 = y1 .* (1 - y1) .* e1;     % 50x1, sigmoid derivative factor
    % 5. Weight update
    dW1 = alpha * delta1 * x';         % 50x25 = (50x1)(1x25)
    W1  = W1 + dW1;
    dW2 = alpha * delta * y1';         % 5x50 = (5x1)(1x50)
    W2  = W2 + dW2;
  end  % end of SGD loop
end  % end of function
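Why delta = e at the output layer (a standard derivation, consistent with the code above): with the cross-entropy cost $E = -\sum_i d_i \log y_i$ and Softmax outputs $y_i$,

$$\frac{\partial E}{\partial v_i} = y_i - d_i \quad\Longrightarrow\quad \delta_i = d_i - y_i = e_i,$$

so no derivative factor is needed at the output layer.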
TestMulticlass.m
clear all
rng(3);
% 1. Data loading: five 5x5 input images and one-hot targets
X = zeros(5, 5, 5);
X(:, :, 1) = [0 1 1 0 0; 0 0 1 0 0; 0 0 1 0 0; 0 0 1 0 0; 0 1 1 1 0];
X(:, :, 2) = [1 1 1 1 0; 0 0 0 0 1; 0 1 1 1 0; 1 0 0 0 0; 1 1 1 1 1];
X(:, :, 3) = [1 1 1 1 0; 0 0 0 0 1; 0 1 1 1 0; 0 0 0 0 1; 1 1 1 1 0];
X(:, :, 4) = [0 0 0 1 0; 0 0 1 1 0; 0 1 0 1 0; 1 1 1 1 1; 0 0 0 1 0];
X(:, :, 5) = [1 1 1 1 1; 1 0 0 0 0; 1 1 1 1 0; 0 0 0 0 1; 1 1 1 1 0];
D = [1 0 0 0 0; 0 1 0 0 0; 0 0 1 0 0; 0 0 0 1 0; 0 0 0 0 1];
% 2. Weight initialization
W1 = 2 * rand(50, 25) - 1;
W2 = 2 * rand(5, 50) - 1;
W10 = W1; W20 = W2;                    % keep the initial weights for reference
% 3. Machine learning
for epoch = 1:1000
  [W1, W2] = Multiclass(W1, W2, X, D);
end
% 4. Estimation
N = 5;
for k = 1:N
  x  = reshape(X(:, :, k), 25, 1);
  v1 = W1 * x;
  y1 = Sigmoid(v1);
  v  = W2 * y1;
  y  = Softmax(v)
end
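A quick check (my own sketch, not from the course files; run after TestMulticlass.m so W1, W2, and X are in the workspace, and assuming the Sigmoid.m helper from Chapter 3 is on the path): collect the predicted class for each training image and compare with the one-hot targets.

pred = zeros(1, 5);                    % one prediction per training image
for k = 1:5
  x = reshape(X(:, :, k), 25, 1);
  y = Softmax(W2 * Sigmoid(W1 * x));   % forward pass with the trained weights
  [~, pred(k)] = max(y);               % index of the most probable class
end
pred                                   % expected after successful training: 1 2 3 4 5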
RealMulticlass.m: slightly distorted input data
clear all
rng(3);
% 1. Machine learning: train the weights first
TestMulticlass;
% 2. Data loading: distorted versions of the training images
X = zeros(5, 5, 5);
X(:, :, 1) = [0 0 1 1 0; 0 0 1 1 0; 0 1 0 1 0; 0 0 0 1 0; 0 1 1 1 0];
X(:, :, 2) = [1 1 1 1 0; 0 0 0 0 1; 0 1 1 1 0; 1 0 0 0 1; 1 1 1 1 1];
X(:, :, 3) = [1 1 1 1 0; 0 0 0 0 1; 0 1 1 1 0; 1 0 0 0 1; 1 1 1 1 0];
X(:, :, 4) = [0 1 1 1 0; 0 1 0 0 0; 0 1 1 1 0; 0 0 0 1 0; 0 1 1 1 0];
X(:, :, 5) = [0 1 1 1 1; 0 1 0 0 0; 0 1 1 1 0; 0 0 0 1 0; 1 1 1 1 0];
% 3. Estimation
N = 5;
for k = 1:N
  x  = reshape(X(:, :, k), 25, 1);
  v1 = W1 * x;
  y1 = Sigmoid(v1);
  v  = W2 * y1;
  y  = Softmax(v)
end
Practice: Matlab => R
For Softmax.m, Multiclass.m, TestMulticlass.m, and RealMulticlass.m: identify the task performed in each .m file => write a task list => write the corresponding R code.
딥러닝 첫걸음 (Deep Learning First Steps) 5. Deep Learning (Deep Neural Networks)
Deep Neural Networks
A deep neural network is a multilayer network with 2 or more hidden layers.
Problems of deep networks: vanishing gradients, overfitting, and computational load.
Vanishing gradient: the output layer's error information fades away before reaching the hidden layers close to the input layer; remedied by using the ReLU function as the activation function (see the sketch after this list).
Overfitting: resolved with Dropout, which randomly excludes some nodes.
Computational load: resolved by advances in hardware plus parallel processing.
Example: letter recognition with input layer (25), hidden layers 1, 2, and 3 (20 nodes each), and output layer (5).
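A rough illustration of the vanishing-gradient point (my own sketch, not from the slides): the Sigmoid derivative y(1-y) is at most 0.25, so an error signal that passes through n sigmoid layers is damped by a factor of at most 0.25^n, while the ReLU derivative is exactly 1 on the active path.

n = 10;                     % layers the error signal passes through
sigmoidBound = 0.25 ^ n     % upper bound on the product of sigmoid slopes (~9.5e-7)
reluFactor = 1 ^ n          % ReLU slope on an active path: no damping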
Example: Letter Recognition, Input and Output
[Figure: forward pass X -> W_1 -> Y_1 -> W_2 -> Y_2 -> W_3 -> Y_3 -> W_4 -> Y]
Example: Letter Recognition, Back-Propagation
[Figure: the output error (e, delta) propagated backward through the transposed weights W_4^T, W_3^T, W_2^T to give (e_3, delta_3), (e_2, delta_2), (e_1, delta_1)]
Example: Letter Recognition, Weight Update
[Figure: weight updates dW_1 through dW_4 formed from the layer outputs (x, y_1, y_2, y_3) and the deltas (delta_1, delta_2, delta_3, delta)]
Deep Network Training (parts 1-4)
[Figures: the step-by-step training derivation for the deep network; equations not recoverable from the extraction]
ReLU function: f(x) = max(0, x); its first derivative is 1 for x > 0 and 0 otherwise.
ReLU.m
function y = ReLU(x)
  y = max(0, x);
end
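A short usage check (illustrative values, not from the slides): ReLU acts elementwise, zeroing the negative entries, and its derivative is the logical mask (v > 0) used later in DeepReLU.m.

v = [-1.5; 0.2; 3.0];       % made-up pre-activations
y = ReLU(v)                 % [0; 0.2; 3.0]
dy = (v > 0)                % elementwise derivative: [0; 1; 1]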
Training Data and Validation Data
[Figure: the 5x5 pixel images used for training and for validation]
training function: DeepReLU.m
function [W1, W2, W3, W4] = DeepReLU(W1, W2, W3, W4, X, D)
  % 1. Learning rate
  alpha = 0.01;
  % 2. SGD over the N training images
  N = 5;
  for k = 1:N
    % 3. Forward pass
    x  = reshape(X(:, :, k), 25, 1);
    v1 = W1 * x;   y1 = ReLU(v1);
    v2 = W2 * y1;  y2 = ReLU(v2);
    v3 = W3 * y2;  y3 = ReLU(v3);
    v  = W4 * y3;  y  = Softmax(v);
    % 4. Back-propagation
    d      = D(k, :)';
    e      = d - y;
    delta  = e;                  % softmax + cross-entropy output delta
    e3     = W4' * delta;
    delta3 = (v3 > 0) .* e3;     % (v > 0) is the ReLU derivative
    e2     = W3' * delta3;
    delta2 = (v2 > 0) .* e2;
    e1     = W2' * delta2;
    delta1 = (v1 > 0) .* e1;
    % 5. Weight update
    dW4 = alpha * delta * y3';   W4 = W4 + dW4;
    dW3 = alpha * delta3 * y2';  W3 = W3 + dW3;
    dW2 = alpha * delta2 * y1';  W2 = W2 + dW2;
    dW1 = alpha * delta1 * x';   W1 = W1 + dW1;
  end  % end of SGD loop
end  % end of function
TestDeepReLU.m
clear all
% 1. Data loading: five 5x5 input images and one-hot targets
X = zeros(5, 5, 5);
X(:, :, 1) = [0 1 1 0 0; 0 0 1 0 0; 0 0 1 0 0; 0 0 1 0 0; 0 1 1 1 0];
X(:, :, 2) = [1 1 1 1 0; 0 0 0 0 1; 0 1 1 1 0; 1 0 0 0 0; 1 1 1 1 1];
X(:, :, 3) = [1 1 1 1 0; 0 0 0 0 1; 0 1 1 1 0; 0 0 0 0 1; 1 1 1 1 0];
X(:, :, 4) = [0 0 0 1 0; 0 0 1 1 0; 0 1 0 1 0; 1 1 1 1 1; 0 0 0 1 0];
X(:, :, 5) = [1 1 1 1 1; 1 0 0 0 0; 1 1 1 1 0; 0 0 0 0 1; 1 1 1 1 0];
D = [1 0 0 0 0; 0 1 0 0 0; 0 0 1 0 0; 0 0 0 1 0; 0 0 0 0 1];
% 2. Weight initialization
W1 = 2 * rand(20, 25) - 1;
W2 = 2 * rand(20, 20) - 1;
W3 = 2 * rand(20, 20) - 1;
W4 = 2 * rand(5, 20) - 1;
% 3. Machine learning
for epoch = 1:1000
  [W1, W2, W3, W4] = DeepReLU(W1, W2, W3, W4, X, D);
end
% 4. Estimation
N = 5;
for k = 1:N
  x  = reshape(X(:, :, k), 25, 1);
  v1 = W1 * x;   y1 = ReLU(v1);
  v2 = W2 * y1;  y2 = ReLU(v2);
  v3 = W3 * y2;  y3 = ReLU(v3);
  v  = W4 * y3;
  y  = Softmax(v)
end
RealReLU.m: distorted input data experiment
clear all
rng(3);
% 1. Machine learning: train the weights first
TestDeepReLU;
% 2. Data loading: distorted versions of the training images
X = zeros(5, 5, 5);
X(:, :, 1) = [0 0 1 1 0; 0 0 1 1 0; 0 1 0 1 0; 0 0 0 1 0; 0 1 1 1 0];
X(:, :, 2) = [1 1 1 1 0; 0 0 0 0 1; 0 1 1 1 0; 1 0 0 0 1; 1 1 1 1 1];
X(:, :, 3) = [1 1 1 1 0; 0 0 0 0 1; 0 1 1 1 0; 1 0 0 0 1; 1 1 1 1 0];
X(:, :, 4) = [0 1 1 1 0; 0 1 0 0 0; 0 1 1 1 0; 0 0 0 1 0; 0 1 1 1 0];
X(:, :, 5) = [0 1 1 1 1; 0 1 0 0 0; 0 1 1 1 0; 0 0 0 1 0; 1 1 1 1 0];
% 3. Estimation
N = 5;
for k = 1:N
  x  = reshape(X(:, :, k), 25, 1);
  v1 = W1 * x;   y1 = ReLU(v1);
  v2 = W2 * y1;  y2 = ReLU(v2);
  v3 = W3 * y2;  y3 = ReLU(v3);
  v  = W4 * y3;
  y  = Softmax(v)
end
Practice: Matlab => R
For ReLU.m, DeepReLU.m, and TestDeepReLU.m: identify the task performed in each .m file => write a task list => write the corresponding R code.
Dropout: randomly exclude some nodes
The overfitting problem: the network absorbs too much incidental detail from the training data.
Dropout: randomly withholds part of the information from the training data during learning.
Dropout procedure (a check of the scaling follows the Dropout.m code below):
1. Choose the nodes to exclude by random sampling (Dropout.m).
2. Run the input -> output pass with the chosen nodes excluded (y).
3. Run the back-propagation pass using the y from step 2 (delta, e).
4. Update the weights using the y and the deltas from steps 2 and 3.
5. Repeat steps 1-4. (Steps 2-5: DeepDropout.m)
(Not covered in the practice session.)
Dropout.m
function ym = Dropout(y1, ratio)
  [m, n] = size(y1);
  ym = zeros(m, n);
  num = round(m * n * (1 - ratio));    % number of surviving nodes
  idx = randperm(m * n, num);          % randomly chosen survivors
  ym(idx) = 1 / (1 - ratio);           % scale survivors to keep the expected activation
end
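A minimal check of the 1/(1-ratio) scaling (my own sketch, not from the course files): because surviving nodes are scaled up, the expected activation after masking matches the original, which is why no rescaling is needed at estimation time.

rng(1);
y1 = ones(20, 1);           % toy hidden-layer activations
ratio = 0.2;                % fraction of nodes dropped
ym = y1 .* Dropout(y1, ratio);
mean(ym)                    % close to mean(y1) = 1; exactly 1 here, since
                            % round(20*0.8) = 16 nodes survive at 1/0.8 = 1.25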
training function: DeepDropout.m
function [W1, W2, W3, W4] = DeepDropout(W1, W2, W3, W4, X, D)
  % 1. Learning rate
  alpha = 0.01;
  % 2. SGD over the N training images
  N = 5;
  for k = 1:N
    % 3. Forward pass with a dropout mask on each hidden layer
    x  = reshape(X(:, :, k), 25, 1);   % input vector (missing in the slide's code)
    v1 = W1 * x;   y1 = Sigmoid(v1);  y1 = y1 .* Dropout(y1, 0.2);
    v2 = W2 * y1;  y2 = Sigmoid(v2);  y2 = y2 .* Dropout(y2, 0.2);
    v3 = W3 * y2;  y3 = Sigmoid(v3);  y3 = y3 .* Dropout(y3, 0.2);
    v  = W4 * y3;  y  = Softmax(v);
    % 4. Back-propagation
    d      = D(k, :)';
    e      = d - y;
    delta  = e;                        % softmax + cross-entropy output delta
    e3     = W4' * delta;   delta3 = y3 .* (1 - y3) .* e3;
    e2     = W3' * delta3;  delta2 = y2 .* (1 - y2) .* e2;
    e1     = W2' * delta2;  delta1 = y1 .* (1 - y1) .* e1;
    % 5. Weight update
    dW4 = alpha * delta * y3';   W4 = W4 + dW4;
    dW3 = alpha * delta3 * y2';  W3 = W3 + dW3;
    dW2 = alpha * delta2 * y1';  W2 = W2 + dW2;
    dW1 = alpha * delta1 * x';   W1 = W1 + dW1;
  end  % end of SGD loop
end  % end of function
TestDeepDropout.m
clear all
% 1. Data loading: five 5x5 input images and one-hot targets
X = zeros(5, 5, 5);
X(:, :, 1) = [0 1 1 0 0; 0 0 1 0 0; 0 0 1 0 0; 0 0 1 0 0; 0 1 1 1 0];
X(:, :, 2) = [1 1 1 1 0; 0 0 0 0 1; 0 1 1 1 0; 1 0 0 0 0; 1 1 1 1 1];
X(:, :, 3) = [1 1 1 1 0; 0 0 0 0 1; 0 1 1 1 0; 0 0 0 0 1; 1 1 1 1 0];
X(:, :, 4) = [0 0 0 1 0; 0 0 1 1 0; 0 1 0 1 0; 1 1 1 1 1; 0 0 0 1 0];
X(:, :, 5) = [1 1 1 1 1; 1 0 0 0 0; 1 1 1 1 0; 0 0 0 0 1; 1 1 1 1 0];
D = [1 0 0 0 0; 0 1 0 0 0; 0 0 1 0 0; 0 0 0 1 0; 0 0 0 0 1];
% 2. Weight initialization
W1 = 2 * rand(20, 25) - 1;
W2 = 2 * rand(20, 20) - 1;
W3 = 2 * rand(20, 20) - 1;
W4 = 2 * rand(5, 20) - 1;
% 3. Machine learning
for epoch = 1:1000
  [W1, W2, W3, W4] = DeepDropout(W1, W2, W3, W4, X, D);
end
% 4. Estimation (no dropout mask at estimation time)
N = 5;
for k = 1:N
  x  = reshape(X(:, :, k), 25, 1);
  v1 = W1 * x;   y1 = Sigmoid(v1);
  v2 = W2 * y1;  y2 = Sigmoid(v2);
  v3 = W3 * y2;  y3 = Sigmoid(v3);
  v  = W4 * y3;
  y  = Softmax(v)
end
RealDropout.m: distorted input data experiment
clear all
rng(3);
% 1. Machine learning: train the weights first
TestDeepDropout;
% 2. Data loading: distorted versions of the training images
X = zeros(5, 5, 5);
X(:, :, 1) = [0 0 1 1 0; 0 0 1 1 0; 0 1 0 1 0; 0 0 0 1 0; 0 1 1 1 0];
X(:, :, 2) = [1 1 1 1 0; 0 0 0 0 1; 0 1 1 1 0; 1 0 0 0 1; 1 1 1 1 1];
X(:, :, 3) = [1 1 1 1 0; 0 0 0 0 1; 0 1 1 1 0; 1 0 0 0 1; 1 1 1 1 0];
X(:, :, 4) = [0 1 1 1 0; 0 1 0 0 0; 0 1 1 1 0; 0 0 0 1 0; 0 1 1 1 0];
X(:, :, 5) = [0 1 1 1 1; 0 1 0 0 0; 0 1 1 1 0; 0 0 0 1 0; 1 1 1 1 0];
% 3. Estimation
N = 5;
for k = 1:N
  x  = reshape(X(:, :, k), 25, 1);
  v1 = W1 * x;   y1 = Sigmoid(v1);
  v2 = W2 * y1;  y2 = Sigmoid(v2);
  v3 = W3 * y2;  y3 = Sigmoid(v3);
  v  = W4 * y3;
  y  = Softmax(v)
end