(Special Paper) JBE Vol. 24, No. 2, March 2019
https://doi.org/10.5909/jbe.2019.24.2.234
ISSN 2287-9137 (Online), ISSN 1226-7953 (Print)

SIFT Image Feature Extraction based on Deep Learning
Jae-Eun Lee a), Won-Jun Moon a), Young-Ho Seo a), and Dong-Wook Kim a)

Abstract
In this paper, we propose a deep neural network which extracts SIFT feature points by determining whether the center pixel of a cropped image is a SIFT feature point. The data set of this network consists of the DIV2K dataset cut into 33×33 size, and it uses RGB images, unlike SIFT, which uses black-and-white images. The ground truth consists of the RobHess SIFT features extracted by setting the octave (scale) to 0, the sigma to 1.6, and the intervals to 3. Based on the VGG-16, we construct increasingly deep networks of 13, 23, and 33 convolution layers, and experiment with changing the method of increasing the image scale. The result of using the sigmoid function as the activation function of the output layer is compared with the result of using the softmax function. Experimental results show that the proposed network not only has more than 99% extraction accuracy but also has high extraction repeatability for distorted images.

Keywords: SIFT feature extraction, Deep learning, VGG, CNN (Convolutional Neural Network), Repeatability

a) Department of Electronic Materials Engineering, Kwangwoon University
Corresponding Author: Dong-Wook Kim, E-mail: dwkim@kw.ac.kr, Tel: +82-2-940-5167, ORCID: https://orcid.org/0000-0002-4668-743x
This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2016R1D1A1B03930691). The present research has been conducted by the Research Grant of Kwangwoon University in 2019.
Manuscript received January 8, 2019; Revised March 6, 2019; Accepted March 8, 2019. Copyright 2019 Korean Institute of Broadcast and Media Engineers. All rights reserved. This is an Open-Access article distributed under the terms of the Creative Commons BY-NC-ND (http://creativecommons.org/licenses/by-nc-nd/3.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited and not altered.
I. Introduction

Image feature extraction finds points in an image that are distinctive and can be located reliably even when the image changes. An early and widely used method is the Harris corner detector [1]. The Harris detector, however, is not invariant to changes in scale. Mikolajczyk and Schmid proposed an interest point detector based on the Laplacian that is robust to scale change [2]. Shi and Tomasi proposed the Shi-Tomasi detector, which selects features that are good to track and considers affine changes [3]. Lowe proposed SIFT (Scale Invariant Feature Transform) [4], which detects features as local extrema of the DoG (Difference of Gaussians) over a scale space and is robust to scale change. Because SIFT is computationally heavy, faster detectors followed: Bay et al. proposed SURF (Speeded Up Robust Features) [5], Rosten and Drummond proposed FAST (Features from Accelerated Segment Test) [6], and Mair et al. proposed AGAST [7]. For applications such as image stitching, the parameters of SIFT have been analyzed to reduce its extraction time [8]. Rublee et al. combined the FAST detector with the BRIEF feature descriptor into ORB (Oriented FAST and Rotated BRIEF) [9]. Nevertheless, SIFT remains one of the most robust feature extractors.

Recently, deep learning has advanced rapidly, driven by GPUs (Graphic Processing Units) and big data, and in this paper we apply it to SIFT feature extraction. Based on VGG-16, we construct deep neural networks (DNNs) that determine whether a pixel is a SIFT feature point, and we evaluate both extraction accuracy and repeatability for distorted images.

II. SIFT Feature Extraction

This section briefly reviews how SIFT extracts feature points. Note that unlike SIFT, which uses grayscale input, the proposed network uses RGB input. SIFT first constructs a scale space: within each octave, the image is repeatedly smoothed with Gaussian filters whose standard deviation σ grows by a constant factor per interval, as in Eq. (1), and adjacent smoothed images are subtracted to form the DoG images of Eq. (2):

L(x, y, σ) = G(x, y, σ) * I(x, y)    (1)
D(x, y, σ) = L(x, y, kσ) - L(x, y, σ)    (2)

where I is the input image, G is a Gaussian kernel with standard deviation σ, * denotes convolution, and k = 2^(1/s) for s intervals per octave.
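The scale-space construction above depends only on the base sigma (1.6 in this paper) and the number of intervals (3). As a minimal sketch, assuming the standard schedule of Lowe [4] (the helper `sigma_schedule` is hypothetical, not the paper's code), the per-octave blur levels can be computed as:

```python
def sigma_schedule(sigma0=1.6, intervals=3):
    """Per-octave Gaussian blur levels for a SIFT scale space.
    Successive images within an octave differ by k = 2^(1/s);
    s + 3 blurred images are produced so that DoG extrema can be
    searched over s scales."""
    k = 2.0 ** (1.0 / intervals)
    return [sigma0 * k ** i for i in range(intervals + 3)]

# The blur level doubles after `intervals` steps, marking the next octave.
print(sigma_schedule())
```

With the paper's settings, the fourth value is exactly 2 × 1.6 = 3.2, i.e., one full octave above the base sigma.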
The DoG images of each octave form a DoG pyramid, as illustrated in Fig. 1.

Fig. 1. The process to form a SIFT DoG pyramid

Next, local extrema of the DoG pyramid are selected as candidate feature points. As shown in Fig. 2, each DoG sample is compared with its 8 neighbors in the same DoG image and the 9 neighbors in each of the two adjacent DoG images, 26 neighbors in total; a sample that is larger or smaller than all 26 becomes a candidate. Candidates with low contrast or lying along an edge are then discarded, and the remaining points are the SIFT feature points.

Fig. 2. SIFT extrema extraction process

III. The Proposed SIFT Feature-Extraction DNN

The procedure above is computationally expensive, which is a burden in applications such as virtual reality (VR) image stitching that extract SIFT features from many images. Therefore, instead of this procedure, we propose a feature-extraction DNN that determines directly whether the center pixel of an image patch is a SIFT feature point. The remainder of this paper describes the training data, the network structures, and the experimental results of this DNN.
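The 26-neighbor comparison can be sketched as follows, assuming the DoG images of one octave are stacked into a 3-D array indexed (scale, y, x); `is_extremum` is an illustrative helper, not the paper's code:

```python
import numpy as np

def is_extremum(dog, s, y, x):
    """True if DoG sample (s, y, x) is strictly larger or strictly
    smaller than all 26 neighbours: 8 in its own scale and 9 in each
    of the two adjacent scales."""
    center = dog[s, y, x]
    cube = dog[s - 1:s + 2, y - 1:y + 2, x - 1:x + 2].ravel()
    neighbours = np.delete(cube, 13)  # index 13 is the centre itself
    return bool((center > neighbours).all() or (center < neighbours).all())
```

In a full SIFT implementation this check runs for every interior sample of every octave, which is part of the computational cost the proposed DNN avoids.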
1. Training Data

The training data are built from the DIV2K dataset [10], which consists of 800 high-resolution images. The labels (ground truth) are the feature points extracted by the open-source RobHess SIFT implementation [11] with the octave set to 0, σ to 1.6, and the number of intervals to 3, the most common hyper-parameter setting for SIFT; Fig. 3 shows an example. Unlike SIFT, which uses grayscale images, patches of size 33×33 are cropped from the RGB images, and each patch is labeled by whether its center pixel is a SIFT feature point. About 215k feature-point patches and 2,152k non-feature-point patches were collected, a ratio of about 1:10 and roughly 2M samples in total. The test set was composed with the same 1:10 ratio of feature to non-feature patches.

Fig. 3. SIFT features extracted with octave set to 0

2. Network Structure

The proposed DNNs are based on VGG-16 [12], which performed excellently in the 2014 ImageNet challenge by replacing 5×5 convolutions with stacks of 3×3 convolutions. Their configurations are listed in Table 1. The input is a 33×33 RGB patch (33×33×3). Network A follows VGG-16 and consists of 13 convolution (Conv) layers, 5 max-pooling layers, and 3 fully connected (FC) layers. Every convolution uses a 3×3 kernel with a 1×1 stride and zero padding, so the spatial size is preserved, while each 2×2 max pooling halves it. The five convolution blocks contain 2, 2, 3, 3, and 3 layers with 64, 128, 256, 512, and 512 channels, respectively. The first two FC layers have 512 nodes each, and the output layer has a single node. The hidden activation function is leaky ReLU, and the output activation is the sigmoid, whose 0-to-1 output is thresholded at 0.5 to decide whether the center pixel is a feature point. In Table 1, convC-D denotes a convolution layer with a C×C kernel and D output channels, and FC-X denotes a fully connected layer with X nodes. Networks B and C deepen network A by adding 10 and 20 convolution layers, respectively. Network D has the same convolution layers as C but places its 2×2 max-pooling layers after the 6th, 12th, 19th, 26th, and 33rd convolution layers.
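Dataset construction can be sketched as below: every 33×33 patch whose center lies far enough from the image border is cut out and labeled by whether its center pixel is a keypoint. The function name and the (y, x) keypoint format are assumptions for illustration, not the paper's code:

```python
import numpy as np

PATCH = 33
HALF = PATCH // 2  # 16 pixels on each side of the centre

def make_samples(image, keypoints):
    """Cut 33x33 patches from `image` (H x W x 3, RGB) and label each
    patch 1 if its centre pixel is a SIFT keypoint, else 0.
    `keypoints` is a set of (y, x) integer coordinates."""
    h, w, _ = image.shape
    patches, labels = [], []
    for y in range(HALF, h - HALF):
        for x in range(HALF, w - HALF):
            patches.append(image[y - HALF:y + HALF + 1,
                                 x - HALF:x + HALF + 1])
            labels.append(1 if (y, x) in keypoints else 0)
    return np.stack(patches), np.array(labels)
```

In the paper, the patches come from the 800 DIV2K images and the feature to non-feature ratio is then balanced to about 1:10; that subsampling step is omitted in this sketch.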
Table 1. Configurations of the proposed DNNs. All networks take a 33×33 RGB image as input; convolution layers are written convC-D (C×C kernel, D output channels) and FC layers FC-X (X nodes). The extra depth of B~E is added as conv3-64 and conv3-128 layers in the first two blocks.

network  depth              convolution layers                      classifier
A        16 weight layers   13 conv layers                          FC-512, FC-512, sigmoid
B        26 weight layers   23 conv layers                          FC-512, FC-512, sigmoid
C        36 weight layers   33 conv layers                          FC-512, FC-512, sigmoid
D        36 weight layers   33 conv layers (repositioned pooling)   FC-512, FC-512, sigmoid
E        36 weight layers   33 conv layers                          FC-512, FC-512, softmax

Network E differs from networks A~D in its output layer, which uses the softmax: while A~D produce a single output value, E produces two values trained with one-hot-encoded labels, 01 and 10.

All networks were implemented in Python, and the experiments were run on a PC with an Intel(R) Xeon(R) CPU E3-1275 v6 @ 3.80 GHz, 64 GB RAM, 64-bit Windows, and a GTX 1080 Ti GPU.
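The configuration of network A can be checked with a short shape and parameter walk-through, assuming zero padding preserves the spatial size and each 2×2 pooling halves it (flooring odd sizes); the counts below are derived from the stated configuration, not reported in the paper:

```python
# Convolution blocks of network A: (number of 3x3 conv layers, channels)
blocks = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]

size, ch, params = 33, 3, 0
for n_conv, out_ch in blocks:
    for _ in range(n_conv):
        params += (3 * 3 * ch + 1) * out_ch  # 3x3 weights + biases
        ch = out_ch
    size //= 2  # 2x2 max pooling after each block: 33-16-8-4-2-1

flat = size * size * ch  # 1 x 1 x 512 after the fifth pooling
for out in (512, 512, 1):  # FC-512, FC-512, single sigmoid output node
    params += (flat + 1) * out
    flat = out

print(size, ch, params)  # final spatial size, channel count, weight count
```

The five poolings shrink the 33×33 input to a single spatial position, so the first FC layer sees a 512-dimensional vector.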
IV. Experimental Results

Training of the DNNs in Table 1 used a mini-batch size of 500 and ran for 300 epochs. Table 2 summarizes their training and test accuracies: the networks reach up to 99.9% training accuracy and up to 99.109% test accuracy. Accuracy improves from A through C as the network deepens, while the three deepest networks differ only slightly; among them, the pooling arrangement of network D gives the best test accuracy, and the softmax output of network E shows no advantage over the sigmoid.

Table 2. Experimental results for the proposed DNNs

DNN   Train accuracy (%)   Test accuracy (%)
A     98.100               96.796
B     99.300               99.002
C     99.900               99.083
D     99.900               99.109
E     99.900               99.084

To examine robustness, we measure the repeatability of the extracted feature points for distorted images. Following [13], repeatability is defined as in Eq. (3): the number of feature points detected in both the original and the distorted image, divided by the smaller (min) of the two numbers of detected points, so that its maximum value is 1:

repeatability = |P_original ∩ P_distorted| / min(|P_original|, |P_distorted|)    (3)

Two kinds of distortion are applied: brightness increases of +25, +50, +75, and +100 (Fig. 4), and Gaussian blur with radius 0.5 to 2.5 (Fig. 5).

Fig. 4. Image examples with varying brightness level; (a) +25, (b) +50, (c) +75, (d) +100
Fig. 5. Example images with varying blur level; (a) original, (b) radius 0.5, (c) radius 1.0, (d) radius 1.5, (e) radius 2.0, (f) radius 2.5
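The repeatability measure of Eq. (3) can be sketched as follows; the 1-pixel matching tolerance is an assumption for illustration, as the exact matching criterion is not restated here:

```python
def repeatability(kps_ref, kps_dist, tol=1):
    """Fraction of feature points found in both the reference and the
    distorted image, relative to the smaller point count, following
    Mikolajczyk and Schmid [13]. Points are (y, x) tuples; two points
    match when both coordinates differ by at most `tol` pixels."""
    matched = sum(
        any(abs(y - v) <= tol and abs(x - u) <= tol for v, u in kps_dist)
        for y, x in kps_ref)
    return matched / min(len(kps_ref), len(kps_dist))
```

Dividing by the smaller of the two counts keeps the measure in [0, 1] even when the distortion changes how many points are detected.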
The repeatability of the proposed DNN and of SIFT was measured for the distorted images of Figs. 4 and 5, and the results are shown in Fig. 6. Fig. 6(a) shows the repeatability under brightness change and Fig. 6(b) under blur; in both cases, the proposed DNN maintains a high repeatability comparable to that of SIFT.

Fig. 6. Results of the feature point repeatability measure for the distorted images; (a) change in brightness, (b) change in blur

As Table 2 and Fig. 6 show, the proposed DNN achieves an extraction accuracy above 99% and a repeatability for distorted images on par with SIFT.

V. Conclusion

In this paper, we proposed a DNN that extracts SIFT feature points by determining whether the center pixel of a cropped patch is a feature point. Based on the VGG-16 structure, five DNNs of increasing depth, with different positions of the 2×2 max-pooling layers and different output activation functions, were constructed and compared. The proposed DNNs achieve more than 99% extraction accuracy and a high repeatability for distorted images comparable to SIFT, and are therefore expected to be usable in place of SIFT feature extraction.
References

[1] C. Harris and M. Stephens, "A combined corner and edge detector," Proceedings of the Alvey Vision Conference, pp. 147-151, 1988.
[2] K. Mikolajczyk and C. Schmid, "Indexing based on scale invariant interest points," Proc. ICCV, Vol. 1, pp. 525-531, 2001.
[3] J. Shi and C. Tomasi, "Good features to track," Proc. IEEE Conference on Computer Vision and Pattern Recognition, 1994.
[4] D. G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, Vol. 60, No. 2, pp. 91-110, 2004.
[5] H. Bay, T. Tuytelaars, and L. Van Gool, "SURF: Speeded up robust features," Proc. European Conference on Computer Vision, May 2006.
[6] E. Rosten and T. Drummond, "Machine learning for high-speed corner detection," Proc. 9th European Conference on Computer Vision (ECCV'06), May 2006.
[7] E. Mair, G. Hager, D. Burschka, M. Suppa, and G. Hirzinger, "Adaptive and generic corner detection based on the accelerated segment test," Computer Vision - ECCV 2010, pp. 183-196, 2010.
[8] W.-J. Moon, Y.-H. Seo, and D.-W. Kim, "Parameter analysis for time reduction in extracting SIFT keypoints in the aspect of image stitching," Journal of Broadcast Engineering, Vol. 23, No. 4, pp. 559-573, July 2018.
[9] E. Rublee, V. Rabaud, K. Konolige, and G. Bradski, "ORB: an efficient alternative to SIFT or SURF," Proc. IEEE International Conference on Computer Vision (ICCV), 2011.
[10] E. Agustsson and R. Timofte, "NTIRE 2017 challenge on single image super-resolution: dataset and study," Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2017.
[11] R. Hess, "An open-source SIFT library," Proc. ACM Multimedia, pp. 1493-1496, 2010.
[12] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," Proc. International Conference on Learning Representations (ICLR), 2015.
[13] K. Mikolajczyk and C. Schmid, "Scale and affine invariant interest point detectors," International Journal of Computer Vision, Vol. 60, No. 1, pp. 63-86, 2004.

Jae-Eun Lee
- Feb. 2019: B.S.
- Mar. 2019 ~ present: M.S. course
- ORCID: https://orcid.org/0000-0001-9760-4801

Won-Jun Moon
- Feb. 2018: B.S.
- Mar. 2018 ~ present: M.S. course
- ORCID: https://orcid.org/0000-0002-9620-9524
- Research interests: Virtual Reality, 2D image processing
Young-Ho Seo
- Feb. 1999: B.S.
- Feb. 2001: M.S.
- Aug. 2004: Ph.D.
- Sep. 2005 ~ Feb. 2008:
- Mar. 2008 ~ present:
- ORCID: http://orcid.org/0000-0003-1046-395x
- Research interests: 2D/3D image processing, SoC

Dong-Wook Kim
- Feb. 1983: B.S.
- Feb. 1985: M.S.
- Sep. 1991: Georgia (Ph.D.)
- Mar. 1992 ~ present:
- ORCID: http://orcid.org/0000-0002-4668-743x
- Research interests: 3D image processing, VLSI Testability, VLSI CAD, DSP, Wireless Communication