JBE Vol. 23, No. 5, September 2018 (Regular Paper)
https://doi.org/10.5909/jbe.2018.23.5.642
ISSN 2287-9137 (Online) ISSN 1226-7953 (Print)

Online Human Tracking Based on Convolutional Neural Network and Self Organizing Map for Occupancy Sensors

Jong In Gil a) and Manbae Kim a)

Abstract

Occupancy sensors installed in buildings and households turn off the light when a space is vacant. PIR (pyroelectric infra-red) motion sensors are currently the usual choice. Because PIR sensors cannot detect stationary people, recent research has turned to camera sensors. Detecting both moving and stationary people is a main function of an occupancy sensor. In this paper, we propose an on-line human occupancy tracking method using a convolutional neural network (CNN) and a self-organizing map (SOM). Training a model offline is well known to require a large number of training samples. To avoid this requirement, we start from an untrained model and update it with training samples collected online, directly from the test sequences. Experiments on videos captured from an overhead camera validate that the proposed method tracks humans effectively.

Keyword : on-line tracking, convolutional neural network, self organizing map, occupancy sensor

Copyright 2016 Korean Institute of Broadcast and Media Engineers. All rights reserved. This is an Open-Access article distributed under the terms of the Creative Commons BY-NC-ND (http://creativecommons.org/licenses/by-nc-nd/3.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited and not altered.
a) Dept. of Computer and Communications Engineering, Kangwon National University
Corresponding Author : Manbae Kim
E-mail: manbae@kangwon.ac.kr
Tel: +82-33-250-6395
ORCID: http://orcid.org/0000-0002-4702-8276
This research was supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2018-0-01433) supervised by the IITP (Institute for Information & communications Technology Promotion), and by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (No. 2017R1D1A3B03028806).
Manuscript received July 6, 2018; Revised August 9, 2018; Accepted August 9, 2018.

Occupancy sensors (motion sensors) switch a light on when a space is occupied and off when it is vacant [1,2]. The PIR (pyroelectric infra-red) sensor, which responds to thermal temperature, is the usual choice, and camera-based approaches have been studied to overcome its limits [3~7]. Benezeth et al. describe CAPTHOM, a sensor for detecting human presence [3]. Han and Bhanu fuse color and infrared video to detect moving humans [4]. Nakashima et al. developed a privacy-preserving sensor for person detection [5]. Amin et al. count people with low-resolution infrared and visual cameras [6]. Gil and Kim detect people occupancy in real time with a camera vision sensor [7], using the MHI (Motion History Image).

Tracking-by-Detection (TbD) treats tracking as detecting the target in every frame with a classifier that is updated as the sequence proceeds. With few training samples, such a classifier is prone to overfitting.
Convolutional Neural Network (CNN) based trackers have been reported in [8-10]. This paper applies a CNN to TbD and combines it with a Self Organizing Map (SOM).

Fig. 1 shows the overall flow of the proposed system. Image patches are taken around the target object (patch ... ± ... 32, 200, 10 ... 32,592 ...) and passed to the CNN and to the Self Organizing Map (SOM).

Fig. 1. Overall flow diagram of proposed system for determining the location and scale of a target object
The SOM was proposed by Kohonen [11]. ...

Training samples are collected online around the tracked object, as shown in Fig. 2. Image patches ... are labeled positive ...

Fig. 2. Positive and negative bounding boxes obtained from a tracked object (red: positive image patch, blue: negative image patch)
... ± ... ± ... image patches are labeled negative ... 2 ... 4 ... 2 ... 6 ...

The Self Organizing Map (SOM), also called the Kohonen map or Kohonen network, is trained by winner-take-all competition (K-...). An SOM is commonly laid out in 2 dimensions ... 1 (1 2 1) 4 ... 5 ... SOM 5 ... 5 ... 5 1 ... 5 ... K-Means ...

Input: ...
Output: ...
1. SOM ... 5 ...
2. ...
3. ... Euclidean distance ...
4. ...
5. ...
6. ... 1, 2 ...
7. ...
8. ...

The SOM ... CNN ... For the SOM, the scale is varied from 0.5 to 1.5 in steps of 0.1, which gives 11 scales, and the SOM is matched by Euclidean distance.

Fig. 3 shows the CNN. It consists of a convolutional layer and fully-connected layers, with the sigmoid as the activation function. The input is a 32x32x3 image patch. The convolutional layer applies 8 filters of size 3x3 and produces a 30x30x8 output, which is reduced to 392 values. The hidden layers have 100 and 80 nodes, and the output layer is a softmax.

The CNN outputs form a probability map BP. As shown in Fig. 4, an Integral Probability Map P is computed from BP.

Fig. 3. Architecture of CNN for visual tracking
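The layer sizes quoted for the CNN can be checked with a few lines of arithmetic. The sketch below assumes a stride-1 convolution without padding, which reproduces the stated 30x30x8 output. It also assumes a 4x4 non-overlapping pooling step, which is not stated in the surviving text and is chosen only because it yields the stated 392 values. The two-unit output follows the 2-class setting of the tracker. All function names are illustrative.

```python
def conv_output_size(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Spatial size after a convolution (standard output-size formula)."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_output_size(size: int, window: int) -> int:
    """Spatial size after non-overlapping pooling (floor division)."""
    return size // window


def dense_params(sizes):
    """Weights plus biases for a chain of fully connected layers."""
    return sum(a * b + b for a, b in zip(sizes, sizes[1:]))


# Values taken from the text: 32x32x3 input, 3x3 kernels, 8 filters.
in_size, in_channels = 32, 3
kernel, filters = 3, 8

conv_size = conv_output_size(in_size, kernel)
conv_shape = (conv_size, conv_size, filters)

# ASSUMPTION: 4x4 pooling, picked only because it gives the 392 values in the text.
pool_size = pool_output_size(conv_size, 4)
flat_features = pool_size * pool_size * filters

# Fully connected part from the text: 392 -> 100 -> 80 -> 2 (two-class softmax).
layer_sizes = [flat_features, 100, 80, 2]

conv_params = kernel * kernel * in_channels * filters + filters
total_params = conv_params + dense_params(layer_sizes)

print(conv_shape, flat_features, total_params)
```

Under these assumptions the network is small, which is consistent with a model that has to be trained online from a handful of patches per frame.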
BP consists of a positive map BP_p and a negative map BP_n. The integral map P_p is computed from BP_p, and P_n from BP_n. The CNN is a 2-class classifier ... (ground truth) ... CNN ... Eqs. (1) to (4) ...

Fig. 4. Procedure of generating Integral Probability Map

... Eq. (1) ... Eq. (2) is the cross entropy, with two logarithmic (ln) terms, where n is ... CNN ... Eq. (4) ... CNN ... 6 ... CNN ... (epoch) ... Eq. (5) ... ln ...
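An integral map lets the sum of probabilities inside any rectangle be read with four lookups, whatever the rectangle size. The following is a minimal sketch that assumes the Integral Probability Map is a standard summed-area table of the probability map. The names `integral_map` and `box_sum` and the small example map `bp_p` are hypothetical, not taken from the paper.

```python
from typing import List

Grid = List[List[float]]


def integral_map(prob: Grid) -> Grid:
    """Summed-area table with one extra zero row and column.

    table[r][c] holds the sum of prob over rows 0..r-1 and columns 0..c-1.
    """
    rows, cols = len(prob), len(prob[0])
    table = [[0.0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0.0
        for c in range(cols):
            row_sum += prob[r][c]
            table[r + 1][c + 1] = table[r][c + 1] + row_sum
    return table


def box_sum(table: Grid, top: int, left: int, bottom: int, right: int) -> float:
    """Sum over rows top..bottom-1 and columns left..right-1 using four lookups."""
    return (table[bottom][right] - table[top][right]
            - table[bottom][left] + table[top][left])


# Hypothetical 3x4 positive-probability map (illustrative values only).
bp_p = [
    [0.1, 0.2, 0.0, 0.0],
    [0.3, 0.9, 0.8, 0.1],
    [0.0, 0.7, 0.6, 0.2],
]
p_p = integral_map(bp_p)

# Total probability inside the 2x2 box covering rows 1..2 and columns 1..2.
score = box_sum(p_p, 1, 1, 3, 3)
print(round(score, 6))
```

With such a table, scoring many candidate boxes of different sizes costs the same per box, which is the usual reason for building an integral map before a search.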
... CNN ... 2D ... CNN ... Fig. 6 ... 1.0 ... CNN ...

Fig. 6. Input images and score maps

Fig. 5. Object search range (yellow) and candidate image patch (blue)

Candidate image patches inside the search range of Fig. 5 are evaluated, and the results form a score map ...

The experiments use 10 test sequences of 1024x786 RGB frames ... 2.7m ... 10
Fig. 7. Qualitative results of the 9 trackers over 4 test sequences. CNN-SOM is the proposed method

Fig. 7 shows results for 4 of the sequences. Rows 1 to 4 correspond to Video1, Video8, Video9, and Video10. The proposed method is compared with 9 trackers: Adaptive Structural Local Sparse Model (ASLA) [12], Circulant Structure of Tracking-by-Detection with Kernel (CSK) [13], Compressive Tracking (CT) [14], Distribution Fields for Tracking (DFT) [15], Incremental Learning for Robust Visual Tracking (IVT) [16], L1 Tracking using Accelerated Proximal Gradient Approach (L1APG) [17], Locally Orderless Tracking (LOT) [18], Multi-task Sparse Learning (MTT) [19], and Sparsity-based Collaborative Model (SCM) [20]. The proposed method is denoted CNN-SOM.

... (Ground Truth) ... FOV (Field Of View) ...

Tracking accuracy is first measured by the Euclidean distance of corners between the tracked box and the ground truth, Eq. (8). Fig. 8 plots this distance ... Y ... 10 ... 200 ...
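The corner-based measure can be illustrated with a short sketch. Eq. (8) itself did not survive in the text, so the version below is one plausible reading: the mean Euclidean distance between the four corresponding corners of the tracked box and the ground-truth box. Whether the original sums or averages the four distances is an assumption here, and the `(x, y, width, height)` box layout and function names are illustrative.

```python
import math
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height), top-left origin


def corners(box: Box) -> List[Tuple[float, float]]:
    """Four corners in a fixed order: top-left, top-right, bottom-left, bottom-right."""
    x, y, w, h = box
    return [(x, y), (x + w, y), (x, y + h), (x + w, y + h)]


def corner_distance(tracked: Box, truth: Box) -> float:
    """Mean Euclidean distance between corresponding corners (ASSUMED form of Eq. (8))."""
    pairs = zip(corners(tracked), corners(truth))
    dists = [math.hypot(px - qx, py - qy) for (px, py), (qx, qy) in pairs]
    return sum(dists) / len(dists)


# A box shifted by (3, 4) pixels with unchanged size: every corner moves by 5 pixels.
shifted = corner_distance((13.0, 14.0, 50.0, 80.0), (10.0, 10.0, 50.0, 80.0))
print(shifted)
```

Unlike a center-distance error, this measure also grows when the box size is wrong, because a scale error moves the corners even if the center is correct.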
Fig. 8. Corner distance of the 9 trackers over 10 test sequences. CNN-SOM is the proposed method

... 400 ... 1024x786 ... 400 (drift) ... 500 ... Table 1 lists the average corner distance error.
Table 1. Average of corner distance error

Video      ASLA     CSK      CT     DFT     IVT   L1APG     LOT     MTT     SCM CNN-SOM
Video1   289.40  158.58  131.74  480.11  536.68  531.22  100.10  444.70  263.21  132.88
Video2   377.09  477.00  382.45  454.95  527.95  631.83  351.01  326.88  408.41  133.64
Video3   138.14  388.04  502.93  327.63  542.45  252.52  278.65  247.08  172.48  132.18
Video4   141.14  156.50  212.85  185.97  400.27  371.70  328.44  131.41  447.97  164.78
Video5   116.37  234.07  115.27  176.70  258.83  343.96  136.51  104.38  371.35   94.92
Video6    71.96  108.42  153.86  313.99   88.15   78.41  100.59   84.03   90.48   88.32
Video7   112.52  170.91  153.20  266.55  295.71  118.03   91.24  167.81  110.06  151.73
Video8   146.64  157.94   84.10  113.21  127.44  152.58  123.06  166.07  177.78  123.03
Video9    88.37  112.89  126.86  915.70  205.54  618.17  341.45  186.01  881.52  218.84
Video10  196.11  206.55  178.12  259.10  290.87  313.76   77.04  156.12  302.11  106.05

The second measure is the overlap ratio, Eq. (9), plotted in Fig. 9. It is 1.0 when the tracked box coincides with the ground truth and 0.0 when the two do not overlap ... 0.5 ... 0.8 ... 0.5 ... FOV ... Table 2 lists the average overlap ratio.

Table 2. Average of overlap ratio

Video      ASLA     CSK      CT     DFT     IVT   L1APG     LOT     MTT     SCM CNN-SOM
Video1   0.1818  0.3728  0.4087  0.1062  0.0821  0.0682  0.4962  0.0730  0.1818  0.4183
Video2   0.3842  0.1834  0.3356  0.2207  0.1095  0.0584  0.3490  0.4048  0.2792  0.4214
Video3   0.4617  0.0967  0.0480  0.2887  0.0922  0.2564  0.2979  0.2773  0.3796  0.4235
Video4   0.4259  0.3982  0.2679  0.3329  0.1332  0.0642  0.1952  0.4464  0.0399  0.3616
Video5   0.5244  0.3732  0.5314  0.4703  0.3635  0.3834  0.5151  0.5591  0.3097  0.5963
Video6   0.6883  0.5716  0.4439  0.2526  0.6666  0.6715  0.5873  0.6402  0.6682  0.6261
Video7   0.4149  0.2913  0.3732  0.1511  0.0740  0.4116  0.4433  0.3204  0.4249  0.3248
Video8   0.4381  0.3989  0.5610  0.4604  0.3758  0.3928  0.3963  0.3874  0.3894  0.4492
Video9   0.5358  0.4799  0.4282  0.0039  0.1935  0.0163  0.2341  0.2653  0.0087  0.2452
Video10  0.2865  0.2699  0.3313  0.1759  0.1364  0.0572  0.5427  0.4164  0.0889  0.5369
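The overlap ratio can be sketched in a few lines. Eq. (9) is not legible in the text, so the code below assumes the standard definition used in tracking benchmarks: the area of the intersection of the two boxes divided by the area of their union. This definition is consistent with the stated range of 0.0 to 1.0. The `(x, y, width, height)` box layout and the function name are illustrative.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height), top-left origin


def overlap_ratio(a: Box, b: Box) -> float:
    """Intersection area over union area of two axis-aligned boxes (ASSUMED form of Eq. (9))."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # Width and height of the intersection, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


same = overlap_ratio((0.0, 0.0, 10.0, 10.0), (0.0, 0.0, 10.0, 10.0))
apart = overlap_ratio((0.0, 0.0, 10.0, 10.0), (20.0, 20.0, 10.0, 10.0))
half_shift = overlap_ratio((0.0, 0.0, 10.0, 10.0), (5.0, 0.0, 10.0, 10.0))
print(same, apart, half_shift)
```

A box shifted by half its width already drops to a ratio of one third, which helps when reading the per-video averages in Table 2.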
Fig. 9. Overlap ratio of the 9 trackers over 10 test sequences
This paper proposed an online human tracking method that combines a CNN and an SOM ... Over the 10 test videos, the rank of the proposed method among the ten trackers was 3, 1, 1, 4, 1, 5, 5, 3, 5, and 2 for corner distance, with a summary rank of 3, and 2, 1, 2, 4, 1, 6, 6, 3, 5, and 2 for overlap ratio, with a summary rank of 2.

(References)
[1] P. Liu, S. Nguang, and A. Partridge, Occupancy inference using pyro-electric infrared sensors through hidden markov model, IEEE Sensors Journal, 16(4), Feb. 2016.
[2] F. Wahl, M. Milenkovic, and O. Amft, A distributed PIR-based approach for estimating people count in office environments, IEEE Conf. on Computational Science and Engineering, 2012.
[3] Y. Benezeth, H. Laurent, B. Emile, and C. Rosenberger, Towards a sensor for detecting human presence and characterizing activity, Energy and Buildings, 43, 2011.
[4] J. Han and B. Bhanu, Fusion of color and infrared video for moving human detection, Pattern Recognition, 40, 2007.
[5] S. Nakashima, Y. Kitazono, L. Zhang, and S. Serikawa, Development of privacy-preserving sensor for person detection, Procedia - Social and Behavioral Sciences, 2(1), 2010.
[6] I. Amin, A. Taylor, F. Junejo, A. Al-Habaibeh, and R. Parkin, Automated people-counting by using low-resolution infrared and visual cameras, Measurement, 41, 2008.
[7] J. Gil and M. Kim, Real-time People Occupancy Detection by Camera Vision Sensor, Journal of Broadcast Engineering, 22(6), pp. 774-784, 2017.
[8] H. Li, Y. Li, and F. Porikli, DeepTrack: Learning Discriminative Feature Representations Online for Robust Visual Tracking, IEEE Trans. on Image Processing, Vol. 25, No. 4, pp. 1834-1848, April 2016.
[9] K. Zhang, Q. Liu, and M. Yang, Robust Visual Tracking via Convolutional Networks Without Training, IEEE Trans. on Image Processing, Vol. 25, No. 4, pp. 1779-1792, April 2016.
[10] X. Zhou, L. Xie, P. Zhang, and Y. Zhang, An Ensemble of Deep Neural Networks for Object Tracking, IEEE Conf. on Image Processing, pp. 843-847, 2014.
[11] T. Kohonen, Self-organized formation of topologically correct feature maps, Biological Cybernetics, 43(1), pp. 59-69, 1982.
[12] X. Jia, H. Lu, and M. H. Yang, Visual Tracking via Adaptive Structural Local Sparse Appearance Model, IEEE Conf. on Computer Vision and Pattern Recognition, pp. 1822-1829, 2012.
[13] J. F. Henriques, R. Caseiro, P. Martins, and J. Batista, Exploiting the Circulant Structure of Tracking-by-Detection with Kernels, European Conf. on Computer Vision, pp. 702-715, 2012.
[14] K. Zhang, L. Zhang, and M. H. Yang, Real-time Compressive Tracking, European Conf. on Computer Vision, pp. 864-877, 2012.
[15] L. Sevilla-Lara and E. Learned-Miller, Distribution Fields for Tracking, IEEE Conf. on Computer Vision and Pattern Recognition, pp. 1910-1917, 2012.
[16] D. A. Ross, J. Lim, R. S. Lin, and M. H. Yang, Incremental Learning for Robust Visual Tracking, International Journal of Computer Vision, Vol. 77, Issue 1-3, pp. 125-141, 2008.
[17] C. Bao, Y. Wu, H. Ling, and H. Ji, Real Time Robust L1 Tracking Using Accelerated Proximal Gradient Approach, IEEE Conf. on Computer Vision and Pattern Recognition, pp. 1830-1837, 2012.
[18] S. Oron, A. Bar-Hillel, D. Levi, and S. Avidan, Locally Orderless Tracking, International Journal of Computer Vision, Vol. 111, No. 2, pp. 213-228, 2015.
[19] T. Zhang, B. Ghanem, S. Liu, and N. Ahuja, Robust Visual Tracking via Multi-task Sparse Learning, IEEE Conf. on Computer Vision and Pattern Recognition, pp. 2042-2049, 2012.
[20] W. Zhong, H. Lu, and M. H. Yang, Robust Object Tracking via Sparsity-based Collaborative Model, IEEE Conf. on Computer Vision and Pattern Recognition, pp. 1838-1845, 2012.
- 2010 8 :
- 2012 8 :
- 2012 9 ~ : IT
- :

- 1983 :
- 1986 : University of Washington, Seattle
- 1992 : University of Washington, Seattle
- 1992 ~ 1998 :
- 1998 ~ :
- 2016 ~ :
- ORCID : http://orcid.org/0000-0002-4702-8276
- : 3D