Caffe Hands-On Tutorial
Pattern Recognition and Computer Intelligence Lab, Graduate School of Convergence Science and Technology, Seoul National University
박성헌, 황지혜, 유재영
Contents
- Installing Caffe
- Training and testing a CNN with Caffe
Deep Learning
- Learning with deep neural networks
- Neuron -> Perceptron -> Multi-layer perceptron -> Deep neural network
Why Deep Learning?
- Performance: state-of-the-art results in many areas, including image and speech recognition
- End-to-end learning: given only the data and the labels, the model learns automatically
- Speed: the structure maps well onto GPGPU computation
- Versatility: the same framework applies to many different domains
Convolutional Neural Network
- Convolutional layers and pooling layers play the key roles
- A neural network well suited to image data
http://parse.ele.tue.nl/cluster/2/cnnarchitecture.jpg
CNN Tutorials
- Stanford CNN Tutorial (Andrew Ng)
  http://deeplearning.stanford.edu/tutorial/
  Covers the basic workings of CNNs and neural networks, with exercises to implement them yourself in Matlab
- Stanford CNN course (Fei-Fei Li & Andrej Karpathy)
  http://cs231n.stanford.edu/
  https://www.youtube.com/playlist?list=plkt2usq6rbvctenovbg1tpcc7OQi31AlC
  Covers CNNs and RNNs from the basics up to recent research
Caffe: Convolutional Architecture for Fast Feature Embedding
Caffe Overview
- Developed by BVLC (Berkeley Vision and Learning Center), UC Berkeley
- 2+ years, 1,000+ citations, 150+ contributors
- Yangqing Jia, Evan Shelhamer, Trevor Darrell, and open-source contributors
Images from BVLC Caffe tutorial
Caffe Overview
- Written in C++ and CUDA (Matlab and Python wrappers also exist)
- Open-source community: https://github.com/bvlc/caffe
- BSD license (modification, redistribution, and commercial use are allowed)
- GPGPU acceleration
- Fast, well-tested code
- Many network models available!
Caffe - How Fast?
- Speed with Krizhevsky's 2012 model:
  - 2 ms/image on a K40 GPU
  - <1 ms inference with Caffe + cuDNN v4 on a Titan X
  - 72 million images/day with batched IO
  - 8-core CPU: ~20 ms/image; Intel optimization in progress
- 9k lines of C++ code (20k with tests)
Slides from BVLC Caffe tutorial
Caffe Installation (Windows)
Requirements
- Visual Studio 2013
- For GPU acceleration:
  - CUDA 7.5 (https://developer.nvidia.com/cudatoolkit)
  - cuDNN v5 (registration required) (https://developer.nvidia.com/cudnn)
  - Copy the cuDNN files into the CUDA install path, e.g. C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5
Image from https://developer.nvidia.com/cudnn
Caffe Installation (Windows)
- Caffe Windows branch from Microsoft: https://github.com/microsoft/caffe
Caffe Installation (Windows)
Settings (CPU build)
- Rename windows/CommonSettings.props.example to CommonSettings.props
- Set the build target to Release / x64
- If you are not using a GPU:
  <CpuOnlyBuild> true
  <UseCuDNN> false
- If warning C4819 occurs ("The file contains a character that cannot be represented in the current code page (949)"):
  change <TreatWarningAsError> from true to false
Caffe Installation (Windows)
Settings (GPU build)
- <CpuOnlyBuild> false
- <UseCuDNN> true when using cuDNN
- <CudaVersion>: the installed CUDA version (7.0, 7.5, etc.)
- <CudaArchitecture>: set to match the installed GPU's CUDA compute capability (check https://developer.nvidia.com/cuda-gpus)
  e.g. compute_35,sm_35 for capability 3.5; compute_52,sm_52 for capability 5.2
Caffe Installation (Windows)
- Build the libcaffe project
  - On the first build, the required 3rd-party packages are downloaded via NuGet
  - Compiling and building then takes about 10 minutes
- Once it completes, build the caffe project
Caffe Installation (Ubuntu)
http://caffe.berkeleyvision.org/install_apt.html
General dependencies
  sudo apt-get install libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev libhdf5-serial-dev protobuf-compiler
  sudo apt-get install --no-install-recommends libboost-all-dev
BLAS
  sudo apt-get install libatlas-base-dev
Remaining dependencies, 14.04 / 15.04 / 16.04
  sudo apt-get install libgflags-dev libgoogle-glog-dev liblmdb-dev
Caffe Installation (Ubuntu)
Remaining dependencies, 12.04
glog
  wget https://google-glog.googlecode.com/files/glog-0.3.3.tar.gz
  tar zxvf glog-0.3.3.tar.gz
  cd glog-0.3.3
  ./configure
  make && make install
gflags
  wget https://github.com/schuhschuh/gflags/archive/master.zip
  unzip master.zip
  cd gflags-master
  mkdir build && cd build
  export CXXFLAGS="-fPIC" && cmake .. && make VERBOSE=1
  make && make install
lmdb
  git clone https://github.com/lmdb/lmdb
  cd lmdb/libraries/liblmdb
  make && make install
Caffe Installation (Ubuntu)
Installing CUDA (GPU version only)
- A CUDA-capable NVIDIA GPU is required: https://developer.nvidia.com/cuda-gpus
- Install the NVIDIA graphics driver: http://www.nvidia.com/download/index.aspx?lang=en-us
- Download the CUDA version matching your GPU: https://developer.nvidia.com/cuda-downloads
  - Download the runfile (local)
  - sudo sh cuda_<version>_linux.run --no-opengl-libs
  - Install nvidia driver [Y/N]: N
- Set the environment variables: open ~/.bashrc and add
  export PATH=/usr/local/cuda-7.5/bin:$PATH
  export LD_LIBRARY_PATH=/usr/local/cuda-7.5/lib64:$LD_LIBRARY_PATH
  then save and run source ~/.bashrc
Caffe Installation (Ubuntu)
Download Caffe
  git clone https://github.com/bvlc/caffe.git
Compilation with Make
  cp Makefile.config.example Makefile.config
  # Adjust Makefile.config (for the CPU version: CPU_ONLY := 1)
  make all
  make test
  make runtest
- make all -j8 speeds up the build; -j[number of parallel threads]
- On Ubuntu 15.04 and later, if the error "hdf5.h: No such file or directory" occurs, edit INCLUDE_DIRS and LIBRARY_DIRS in Makefile.config as follows:
  INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include /usr/include/hdf5/serial/
  LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib /usr/lib/x86_64-linux-gnu/hdf5/serial/
Caffe Installation (Ubuntu)
Example: MNIST
  cd $CAFFE_ROOT
  ./data/mnist/get_mnist.sh
  ./examples/mnist/create_mnist.sh
  ./build/tools/caffe train --solver=examples/mnist/lenet_solver.prototxt
- For the CPU build, change solver_mode in lenet_solver.prototxt to CPU
Caffe Installation (Mac OSX)
See http://caffe.berkeleyvision.org/install_osx.html
Dataset Link
Datasets needed for this tutorial
- MNIST (52MB): http://yann.lecun.com/exdb/mnist/ (download both the training and the test sets)
- BVLC AlexNet caffemodel (233 MB): http://dl.caffe.berkeleyvision.org/bvlc_alexnet.caffemodel
Caffe Hands-On Practice
MNIST Tutorial
MNIST dataset [LeCun et al., 1998]
- Handwritten digit recognition dataset
- 28x28 images, 60,000 training samples and 10,000 test samples
Data Preparation
Input data formats used by Caffe
- LMDB / LevelDB are the most common choices (fast)
- LMDB can be accessed from multiple processes; LevelDB cannot
- Images can also be fed in directly, and HDF5 data can be used as input
MNIST Tutorial
Preparing the data
- Build the convert_mnist_data project
- Open cmd and move to the Caffe root directory
- LevelDB for training data:
  Build/x64/Release/convert_mnist_data.exe --backend=leveldb examples/mnist/train-images.idx3-ubyte examples/mnist/train-labels.idx1-ubyte examples/mnist/mnist_train_leveldb
- LevelDB for test data:
  Build/x64/Release/convert_mnist_data.exe --backend=leveldb examples/mnist/t10k-images.idx3-ubyte examples/mnist/t10k-labels.idx1-ubyte examples/mnist/mnist_test_leveldb
- On Ubuntu (Linux), run build/examples/mnist/convert_mnist_data.bin instead of Build/x64/Release/convert_mnist_data.exe
MNIST Tutorial
examples/mnist/lenet_solver.prototxt
- solver_mode: set to CPU or GPU
examples/mnist/lenet_train_test.prototxt
- Change LMDB to LEVELDB:
- Line 13, change
    data_param { source: "examples/mnist/mnist_train_lmdb" batch_size: 64 backend: LMDB }
  to
    data_param { source: "examples/mnist/mnist_train_leveldb" batch_size: 64 backend: LEVELDB }
- Line 30, change
    data_param { source: "examples/mnist/mnist_test_lmdb" batch_size: 100 backend: LMDB }
  to
    data_param { source: "examples/mnist/mnist_test_leveldb" batch_size: 100 backend: LEVELDB }
MNIST Tutorial
MNIST training with the LeNet model
- Open cmd and move to the Caffe root folder
- Run: Build/x64/Release/caffe.exe train --solver=examples/mnist/lenet_solver.prototxt
- On Ubuntu (Linux), run build/tools/caffe.bin instead of Build/x64/Release/caffe.exe
- After 10,000 iterations: 98~99% accuracy
MNIST Tutorial
Log file location
- Default path (Windows): C:\Users\User_name\AppData\Local\Temp
- Default path (Ubuntu/Linux): /tmp
- The path can be set with the -log_dir flag, or by adding FLAGS_log_dir = "log_folder"; in main() of caffe.cpp
Sample log output:
  Iteration 1200, loss = 0.0691786
  Train net output #0: loss = 0.0691786 (* 1 = 0.0691786 loss)
  Iteration 1200, lr = 0.01
  Iteration 1300, loss = 0.134115
  Train net output #0: loss = 0.134115 (* 1 = 0.134115 loss)
  Iteration 1300, lr = 0.01
  Iteration 1400, loss = 0.165894
  Train net output #0: loss = 0.165894 (* 1 = 0.165894 loss)
  Iteration 1400, lr = 0.01
  Iteration 1500, Testing net (#0)
  Test net output #0: accuracy = 0.9638
  Test net output #1: loss = 0.121973 (* 1 = 0.121973 loss)
  Iteration 1500, loss = 0.104802
  Train net output #0: loss = 0.104802 (* 1 = 0.104802 loss)
  Iteration 1500, lr = 0.01
Understanding Caffe Network
Two files are usually defined for training/testing:
- A solver file
  - Specifies how gradient updates are performed
  - Defines parameters such as the learning rate and weight decay
  - Defines the test interval, snapshot frequency, etc.
- A network definition file
  - Defines the actual CNN structure
- Both must be .prototxt files, based on Google Protocol Buffers (https://developers.google.com/protocol-buffers/)
Understanding Caffe Network
Net
- In Caffe, a CNN (or an RNN, or a plain NN) is defined as a Net
- A Net is a structure of connected Layers
- Any topology can be trained as long as it forms a Directed Acyclic Graph (DAG)
- Examples: LogReg, LeNet, ImageNet (Krizhevsky 2012)
Images from BVLC Caffe tutorial
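To make the Net structure concrete, here is a minimal sketch of the LogReg example as a prototxt Net. The layer names and LevelDB path are illustrative assumptions, not copied from the BVLC repository; the point is only the DAG of data -> inner product -> loss:

  name: "LogReg"
  layer {
    name: "data"
    type: "Data"
    top: "data"
    top: "label"
    data_param { source: "examples/logreg_leveldb" batch_size: 64 backend: LEVELDB }
  }
  layer {
    name: "ip"
    type: "InnerProduct"
    bottom: "data"
    top: "ip"
    inner_product_param { num_output: 10 }
  }
  layer {
    name: "loss"
    type: "SoftmaxWithLoss"
    bottom: "ip"
    bottom: "label"
    top: "loss"
  }

LeNet and the Krizhevsky 2012 net are the same idea with more layers in the graph.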
Understanding Caffe Network
Layer
- One 'layer' of a CNN
- Examples: convolution layers, pooling layers, activation-function layers, input data layers, loss layers
- In the source code, forward and backward propagation are implemented for each layer, in both CPU and GPU versions
Blob
- A chunk of data passing through a Layer
- For images, 4-D data of shape NxCxHxW is typical (N: batch size, C: channel size, W: width, H: height)
- Forward pass: Blob -> Layer -> Blob; backward pass: gradients flow in the reverse direction
Images from BVLC Caffe tutorial
Understanding Caffe Network
Looking inside a protobuf file
- Example: examples/mnist/lenet_train_test.prototxt
Understanding Caffe Network
Input data layers: LevelDB data
- An input layer has two tops (data and label)
- source: path to the LevelDB
- batch_size
- scale: 0.00390625 = 1/256 (rescales pixel values)
- mean_file: subtracts the mean image
- include { phase: TRAIN } lets you specify separate data for train and test

layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  data_param {
    source: "examples/mnist/mnist_train_leveldb"
    backend: LEVELDB
    batch_size: 64
  }
  transform_param {
    scale: 0.00390625
    mean_file: "mean_mnist.binaryproto"
  }
  include: { phase: TRAIN }
}
Understanding Caffe Network
Input data layers: Image data
- Feeds images in directly, without conversion
- Slightly slower than using LevelDB or LMDB
- shuffle: whether to shuffle the image list
- source: a file listing the images, in the same format as the file used to create a LevelDB

layer {
  name: "mnist"
  type: "ImageData"
  top: "data"
  top: "label"
  image_data_param {
    shuffle: true
    source: "examples/mnist/datalist.txt"
    batch_size: 64
  }
  transform_param {
    scale: 0.00390625
    mean_file: "mean_mnist.binaryproto"
  }
  include: { phase: TRAIN }
}
Understanding Caffe Network
Input data layers: HDF5 data
- Allows real-valued data other than images to be used as input

layer {
  name: "mnist"
  type: "HDF5Data"
  top: "data"
  top: "label"
  hdf5_data_param {
    source: "examples/mnist/hdf5list.txt"
    batch_size: 64
  }
  transform_param {
    scale: 0.00390625
    mean_file: "mean_mnist.binaryproto"
  }
  include: { phase: TRAIN }
}
Understanding Caffe Network
Convolution layer
- With input size (i1 x i2 x i3), output size (o1 x o2 x o3), and filter size (f1 x f2), the number of parameters to learn is o3 x i3 x f1 x f2
- $o_1 = (i_1 + 2 \cdot \mathrm{pad\_size} - f_1) / \mathrm{stride} + 1$
- lr_mult: per-layer learning rate multiplier; the solver's learning rate times this value becomes the layer's learning rate. The first param is for the weights, the second for the bias
- num_output: number of feature maps produced by the convolution
- kernel_size: filter size used for the convolution; stride: stride setting; pad: padding setting
- weight_filler: weight initialization ("xavier" here; "gaussian" is also widely used)
- bias_filler: bias initialization; "constant" takes an optional value (default 0)

layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param { lr_mult: 1 }
  param { lr_mult: 2 }
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    pad: 1
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" }
  }
}
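As a worked check of the size and parameter formulas, take conv1 of the MNIST LeNet as it appears in the training log later in these slides (28x28 input, kernel_size 5, stride 1, and pad 0 there, unlike the pad: 1 used for illustration above):

  $o_1 = \frac{28 + 2 \cdot 0 - 5}{1} + 1 = 24$

which matches the log line "Top shape: 64 20 24 24", and the parameter count is $20 \times 1 \times 5 \times 5 = 500$ weights (plus 20 biases, which the formula above does not count).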
Understanding Caffe Network
Pooling layer
- Output size is computed the same way as for convolution
- Max, mean, and stochastic pooling are available

layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
Understanding Caffe Network
Activation layer
- ReLU, sigmoid, tanh, etc. are available
- ReLU supports a negative_slope setting; see the sketch below

layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
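For example, a leaky ReLU only needs the slope added to the same layer; the value 0.1 below is an arbitrary illustration, not one used elsewhere in this tutorial:

layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
  relu_param { negative_slope: 0.1 }   # output = negative_slope * x for x < 0
}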
Understanding Caffe Network
Fully connected layer
- As in an ordinary neural network, every neuron in the bottom blob is connected to every neuron in the top blob
- In Caffe, the fully connected layer type is InnerProduct
- num_output: number of output neurons
- weight_filler: initialization from a Gaussian distribution here

layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool2"
  top: "ip1"
  param { lr_mult: 1 }
  param { lr_mult: 2 }
  inner_product_param {
    num_output: 500
    weight_filler { type: "gaussian" std: 0.005 }
    bias_filler { type: "constant" }
  }
}
Understanding Caffe Network
Loss layer
- The last layer; computes the loss by comparing predictions against the labels
- Various losses are defined: cross entropy, Euclidean distance, etc.
- For classification problems, Softmax is most common (cross-entropy loss)
- A loss layer has two bottoms (predictions and labels)

layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}
Understanding Caffe Network
Accuracy layer
- Mainly used to report accuracy at test time

layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
  include: { phase: TEST }
}
Understanding Caffe Network
Dropout layer
- Introduced in [Hinton et al. NIPS 2012]; helps prevent over-fitting and improves generalization
- Mainly applied to fully connected layers
- A ratio of 0.5 is typical

layer {
  name: "drop7"
  type: "Dropout"
  bottom: "fc7-conv"
  top: "fc7-conv"
  dropout_param { dropout_ratio: 0.5 }
}
Understanding Caffe Network
- Many new layers and options are being added
- http://caffe.berkeleyvision.org/tutorial/layers.html describes the commonly used layers and how to use them
- More than 50 layer types currently exist
- Recently added layers can be found in the caffe.proto file and in discussions on GitHub
Understanding Caffe Solver
Defining a solver
- net: the prototxt file defining the Net structure
- test_iter: iterations per test pass (test_iter x batch_size samples are tested)
- test_interval: run a test every this many training iterations
- type: solver type
- base_lr: learning rate; momentum; weight_decay
- lr_policy, gamma, power: how the learning rate changes over time
- display: print the loss every this many iterations
- max_iter: total number of training iterations
- snapshot: record a snapshot every this many iterations; a .caffemodel and a .solverstate file are written
- snapshot_prefix: name to prepend to snapshot files
- solver_mode: CPU or GPU

net: "examples/mnist/lenet_train_test.prototxt"
test_iter: 100
test_interval: 500
type: "SGD"
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
lr_policy: "inv"
gamma: 0.0001
power: 0.75
display: 100
max_iter: 10000
snapshot: 5000
snapshot_prefix: "examples/mnist/lenet"
solver_mode: GPU
Understanding Caffe Solver
Stochastic gradient descent solver (type: "SGD")
- A few other solvers exist, but SGD is the most widely used
- Update rule ($\mu$: momentum, $\alpha$: learning rate, $\nabla L(W_t)$: computed gradient):

  $V_{t+1} = \mu V_t - \alpha \nabla L(W_t)$
  $W_{t+1} = W_t + V_{t+1}$   (final weight update)
Understanding Caffe Solver
Choosing the learning rate schedule
- "step" is the most commonly used policy
- The DIGITS tool can visualize the schedules: https://github.com/nvidia/digits
- Policies: inv, step, multistep
Understanding Caffe Solver
Choosing the learning rate
- inv:
    lr_policy: "inv"  gamma: 0.1  power: 0.5  base_lr: 0.01
    $\alpha = \mathrm{base\_lr} \cdot (1 + \gamma \cdot \mathrm{iter})^{-\mathrm{power}}$
- step:
    lr_policy: "step"  gamma: 0.1  step: 10000  base_lr: 0.01
    $\alpha = \mathrm{base\_lr} \cdot \gamma^{\lfloor \mathrm{iter}/\mathrm{step} \rfloor}$
- multistep:
    lr_policy: "multistep"  gamma: 0.1  stepvalue: 5000  stepvalue: 8000  base_lr: 0.01
    (like step, but the learning rate drops by a factor of $\gamma$ at each listed stepvalue)
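A quick sanity check of the step policy with the values above (base_lr 0.01, gamma 0.1, step 10000):

  $\alpha(9999)  = 0.01 \cdot 0.1^{\lfloor 9999/10000 \rfloor}  = 0.01 \cdot 0.1^{0} = 0.01$
  $\alpha(10000) = 0.01 \cdot 0.1^{\lfloor 10000/10000 \rfloor} = 0.01 \cdot 0.1^{1} = 0.001$
  $\alpha(25000) = 0.01 \cdot 0.1^{\lfloor 25000/10000 \rfloor} = 0.01 \cdot 0.1^{2} = 0.0001$

so the learning rate drops by a factor of 10 every 10,000 iterations.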
Understanding Caffe Solver
RMSProp (type: "RMSProp")
- If the gradient keeps the same sign, the step size is increased by $\delta$; if the gradient oscillates, the step size is multiplied by $1 - \delta$
- Update rule:

  $(v_t)_i = \begin{cases} (v_{t-1})_i + \delta, & \text{if } \nabla L(W_t)_i \cdot \nabla L(W_{t-1})_i > 0 \\ (v_{t-1})_i \cdot (1 - \delta), & \text{otherwise} \end{cases}$

  $(W_{t+1})_i = (W_t)_i - \alpha (v_t)_i$   (final weight update)
Understanding Caffe Solver
Rule of thumb
- Momentum = 0.9
- Weight decay = 0.0005
- Base learning rate = 0.01
- The other solver optimization algorithms are described at http://caffe.berkeleyvision.org/tutorial/solver.html
Terminal Interface
Training
  caffe.exe train -solver=solver_file.prototxt   (Ubuntu: caffe.bin)
Testing
- Outputs only the results of forward propagation, with no backward pass
  caffe.exe test -gpu=0 -iterations=100 -weights=weight_file.caffemodel -model=net_model.prototxt
- -model takes the net definition file, not the solver file
- -weights takes a pre-trained weight file (.caffemodel extension)
- Runs as many iterations as the -iterations option specifies
- caffe.exe --help shows documentation for the flags
MNIST Tutorial
Reading the log
  I0823 14:33:56.829655  2040 net.cpp:408] mnist -> data
  I0823 14:33:56.832659  2040 net.cpp:408] mnist -> label
  I0823 14:33:56.842664 15152 common.cpp:36] System entropy source not available, using fallback algorithm to generate seed instead.
  I0823 14:33:56.875715 15152 db_leveldb.cpp:18] Opened leveldb examples/mnist/mnist_train_leveldb
  I0823 14:33:56.962749  2040 data_layer.cpp:41] output data size: 64,1,28,28
  I0823 14:33:56.967756  2040 net.cpp:150] Setting up mnist
  I0823 14:33:56.968763  2040 net.cpp:157] Top shape: 64 1 28 28 (50176)
  I0823 14:33:56.971756  2040 net.cpp:157] Top shape: 64 (64)
  I0823 14:33:56.974786  2040 net.cpp:165] Memory required for data: 200960
  I0823 14:33:56.978761  2040 layer_factory.hpp:77] Creating layer conv1
  I0823 14:33:56.982764  2040 net.cpp:100] Creating Layer conv1
  I0823 14:33:56.983767 12856 common.cpp:36] System entropy source not available, using fallback algorithm to generate seed instead.
  I0823 14:33:56.983767  2040 net.cpp:434] conv1 <- data
  I0823 14:33:56.986766  2040 net.cpp:408] conv1 -> conv1
  I0823 14:33:56.990768  2040 net.cpp:150] Setting up conv1
  I0823 14:33:56.990768  2040 net.cpp:157] Top shape: 64 20 24 24 (737280)
  I0823 14:33:56.991770  2040 net.cpp:165] Memory required for data: 3150080
  I0823 14:33:56.992770  2040 layer_factory.hpp:77] Creating layer pool1
  I0823 14:33:56.993770  2040 net.cpp:100] Creating Layer pool1
  I0823 14:33:56.994771  2040 net.cpp:434] pool1 <- conv1
  I0823 14:33:56.994771  2040 net.cpp:408] pool1 -> pool1
  I0823 14:33:56.996780  2040 net.cpp:150] Setting up pool1
  I0823 14:33:56.997779  2040 net.cpp:157] Top shape: 64 20 12 12 (184320)
  I0823 14:33:56.998778  2040 net.cpp:165] Memory required for data: 3887360
Memory Management
CUDA out of memory error (code 2)
- Change the CNN structure or reduce the batch size
- Current GPU memory usage can be checked with C:\Program Files\NVIDIA Corporation\NVSMI\nvidia-smi.exe
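For example, in lenet_train_test.prototxt the training batch size could be halved; 32 here is an arbitrary illustration (the tutorial uses 64):

data_param {
  source: "examples/mnist/mnist_train_leveldb"
  batch_size: 32   # reduced from 64 to cut GPU memory use
  backend: LEVELDB
}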
DIY! Creating Your Own Network
- Adjust the number of filters and the kernel size of the convolution layers
- Stack more convolution and pooling layers
- Try optimizing with different solvers
Data Preparation
Preparing a dataset (build the convert_imageset project)
- Prepare the image data and the ground-truth labels
- Write the labels as a text file in the following form (labels start from 0):
    Subfolder1/file1.JPEG 7
    Subfolder2/file2.JPEG 3
    Subfolder3/file3.JPEG 4
- Options such as shuffle and resize are available
- Usage: executable.exe [FLAGS] ROOTFOLDER/ LISTFILE DB_NAME
- Example: convert_imageset.exe --backend=leveldb --shuffle=true imagedata/ imagelist.txt imagedata_leveldb
- On Ubuntu (Linux): convert_imageset.bin, with the same options
Data Preparation
Computing the mean image (build the compute_image_mean project)
- In most cases the mean image is subtracted from the image data during training and testing
- Computed from the LevelDB or LMDB
- Usage: executable.exe [FLAGS] INPUT_DB [OUTPUT_FILE]
- Example: compute_image_mean.exe --backend=leveldb imagedata_leveldb mean_imagedata.binaryproto
- On Ubuntu (Linux): compute_image_mean.bin, with the same options
- Running it produces a binaryproto file
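Putting the two steps together, a minimal sketch of a Data layer that consumes the LevelDB and the mean file created by the example commands above (the layer name and batch size are illustrative assumptions):

layer {
  name: "mydata"
  type: "Data"
  top: "data"
  top: "label"
  data_param {
    source: "imagedata_leveldb"
    backend: LEVELDB
    batch_size: 64
  }
  transform_param {
    mean_file: "mean_imagedata.binaryproto"
  }
  include: { phase: TRAIN }
}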
Terminal Interface
Resuming an interrupted training run
- Use the .solverstate file left by a snapshot (-snapshot option):
  caffe.exe train -solver=solver.prototxt -snapshot=lenet_iter_5000.solverstate
Fine tuning / transfer learning
- A way to start from a pre-trained model
- Use the .caffemodel file left by a snapshot (-weights option):
  caffe.exe train -solver=solver.prototxt -weights=lenet_iter_5000.caffemodel
- Layer names are compared: layers whose names match take their pre-trained weights from the caffemodel file, while new layers are freshly initialized and trained (see the sketch below)
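A common fine-tuning pattern, sketched here as an assumption rather than something shown in these slides: give the last layer a new name, so it no longer matches the pre-trained model and is freshly initialized, and set num_output to the number of classes in the new task:

# before (pre-trained LeNet): name: "ip2", num_output: 10
layer {
  name: "ip2-finetune"    # new name -> no match in the caffemodel -> fresh initialization
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2-finetune"
  param { lr_mult: 10 }   # optionally let the new layer learn faster than the rest
  param { lr_mult: 20 }
  inner_product_param {
    num_output: 5         # hypothetical number of classes in the new task
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" }
  }
}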
ImageNet Tutorial
ILSVRC2012 (ImageNet) dataset
- About 1.28 million labeled training images, 50,000 validation images, and 100,000 test images
- 1000 classes
http://karpathy.github.io/2014/09/02/what-i-learned-from-competing-against-a-convnet-on-imagenet/
ImageNet Tutorial
AlexNet
- The CNN model that won ImageNet in 2012
- 40.7% top-1 error, 18.2% top-5 error on the validation set
- Five convolution layers, three pooling layers, and three fully connected layers
A Krizhevsky et al., NIPS 2012
ImageNet Tutorial
Image classification using AlexNet
- deploy.prototxt: the network model file, used for handling arbitrary input:

layer {
  name: "data"
  type: "Input"
  top: "data"
  input_param { shape: { dim: 10 dim: 3 dim: 227 dim: 227 } }
}

- alexnet.caffemodel: the pre-trained AlexNet model
- mean.binaryproto: the ImageNet dataset mean file
- label.txt: a text file containing the class label information
- test.jpg: the image file to test
ImageNet Tutorial
Image classification using AlexNet
- Build the classification project
- Run:
  Build/x64/Release/classification.exe models/bvlc_alexnet/deploy.prototxt models/bvlc_alexnet/bvlc_alexnet.caffemodel data/ilsvrc12/imagenet_mean.binaryproto data/ilsvrc12/synset_words.txt data/ilsvrc12/test_image.jpg
- On Ubuntu (Linux), replace Build/x64/Release/classification.exe with build/examples/cpp_classification/classification.bin
- The output is the top-5 predictions: a probability value and the predicted class for each
ImageNet Tutorial
Convolution layer & blob visualization
- Visualization of the first layer (conv1) in AlexNet
- Build the extract_features project
A Krizhevsky et al., NIPS 2012
Tips
Caffe Model Zoo
- https://github.com/bvlc/caffe/wiki/model-zoo
- Network structures used in many papers are posted there
- Models such as Network-in-Network (NIN) and VGGNet (from the ILSVRC 2014 competition) are available as prototxt files, which makes them good references
Tips
Slides from BVLC Caffe tutorial
Tips
- The loss stops decreasing beyond a certain point => make the CNN structure more complex and use more filters
- The training loss decreases but test performance does not improve => overfitting is likely; simplify the CNN structure and reduce the number of filters
- The loss takes a long time to start decreasing => there is a problem with the initialization
Tips
- Caffe has many more features and options beyond these
- Because of the rapid pace of updates, documentation for recently added features is not always friendly
- To check the latest available options, see the src/caffe/proto/caffe.proto file
- Make a habit of searching GitHub pull requests and issues, and the Google group
References
- Caffe GitHub: https://github.com/bvlc/caffe
- Caffe intro & tutorial: http://caffe.berkeleyvision.org/
- Caffe-users Google Group: https://groups.google.com/forum/#!forum/caffe-users
- Caffe BVLC tutorial slides: https://docs.google.com/presentation/d/1uekxvgrvvxg9oudh_uic5g71umscnplvarswer41psu/edit#slide=id.gc2fcdcce7_216_0
- [LeCun et al., 1998] LeCun, Yann, et al. "Gradient-based learning applied to document recognition." Proceedings of the IEEE 86.11 (1998): 2278-2324.
- [Krizhevsky et al., 2012] Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "ImageNet classification with deep convolutional neural networks." Advances in Neural Information Processing Systems. 2012.
- [Dropout] Srivastava, Nitish, et al. "Dropout: a simple way to prevent neural networks from overfitting." Journal of Machine Learning Research 15.1 (2014): 1929-1958.
Thank You / Q&A
MIPAL: Machine Intelligence & Pattern Analysis Laboratory