JBE Vol. 25, No. 2, March 2020 (Special Paper)
https://doi.org/10.5909/jbe.2020.25.2.192
ISSN 2287-9137 (Online) ISSN 1226-7953 (Print)

Towards Real-time Multi-object Tracking in CPU Environment

Kyung Hun Kim a), Jun Ho Heo a), and Suk-Ju Kang a)

Abstract
Recently, the use of object tracking algorithms based on deep learning models has been increasing. A system for tracking multiple objects in an image is typically built as a chain of an object detection algorithm and an object tracking algorithm. However, chain-type systems composed of several modules require a high-performance computing environment, which limits their use in practical applications. In this paper, we propose a method that enables real-time operation in a low-performance computing environment by adjusting the computational process of the object detection module in the object detection-tracking chain-type system.

Keyword : multi object tracking, data association, real time object tracking

a) Dept. of Electronic Engineering, Sogang University
Corresponding Author : Suk-Ju Kang
E-mail: sjkang@sogang.ac.kr
Tel: +82-2-705-8466
ORCID: https://orcid.org/0000-0002-4809-956x

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2018R1D1A1B07048421), by a grant (19PQWO-B153369-01) from the Smart road lighting platform development and empirical study on test-bed Program funded by the Ministry of the Interior and Safety of the Korean government, and by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2020-2018-0-01421) supervised by the IITP (Institute for Information & communications Technology Promotion).

Manuscript received January 14, 2020; Revised March 3, 2020; Accepted March 3, 2020.
Copyright 2020 Korean Institute of Broadcast and Media Engineers. All rights reserved. This is an Open-Access article distributed under the terms of the Creative Commons BY-NC-ND (http://creativecommons.org/licenses/by-nc-nd/3.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited and not altered.
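Object tracking is used in applications such as vehicle tracking [1], visual surveillance [2], and sports video analysis [3]. Tracking several objects in a video at once is called Multi Object Tracking (MOT) or Multi Target Tracking (MTT) [5]. Speed is an important factor in real-time multiple object tracking [4]. Detection-tracking systems such as DEEP SORT [9] use a Convolutional Neural Network (CNN) [9], and CNN computation is a burden in a CPU environment. Fig. 1 shows an example of a multi-object tracking result.

Fig. 1. Multi-object tracking video result.

Neural network research dates back to the 1950s. It stalled on the XOR problem and was later revived by techniques such as the Restricted Boltzmann Machine (RBM). A CNN performs feature extraction on the input image [12]. CNNs were applied to object detection in 2014 with Region-CNN (R-CNN) [13]. Before R-CNN, detectors relied on low-level features such as the Scale Invariant Feature Transform (SIFT) [15], Histogram of Oriented Gradient (HOG) [14], Optical Flow [16], and haar-like features [17]. R-CNN was followed by improved variants: SPP-Net [18], Fast R-CNN [19], and Faster R-CNN [20]. Single-stage detectors include the Single Shot Multi-box Detector (SSD) [21] and YOLO [10].

Fig. 2 gives an overview of a Kalman filter-based multi-object tracking system.

Fig. 2. Kalman filter-based Multi-Object Tracking System Overview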
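The Kalman filter tracks each object with the state (x, y, a, h): the bounding-box center position (x, y), the aspect ratio a, and the height h, as in DEEP SORT [9]. Detections are associated with existing tracks by intersection-over-union (IOU) using the Hungarian method [22], and a match is accepted only above a minimum IOU. DEEP SORT [9] extends SORT [11] by replacing the IOU distance with the Mahalanobis distance and a CNN-based deep cosine metric [23]. YOLO-v3 [10] is used as the object detector.

The CNN-based tracking system of [9] is shown in Fig. 3.

Fig. 3. Deep Learning-based Multi-Object Tracking System Overview

Fig. 4 gives an overview of the tracking method that combines the Kalman filter and deep learning.

Fig. 4. Overview of tracking method using Kalman filter and deep learning

[...] (1 frame) [...] 10 [...] 30 [...] 0.6 [...]. Fig. 5 shows the trade-off between accuracy (%) and speed (frames per second; fps) for detection intervals of one detection every 5, 4, and 3 frames.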
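The detection-interval idea can be illustrated with a minimal sketch. It assumes that frames without detection simply advance the box with a constant-velocity prediction, which is a mean-only stand-in for the Kalman prediction step and not the paper's implementation. The names `track_with_frame_skip`, `detect`, and `interval` are hypothetical, and a single object is tracked to keep the example short.

```python
from typing import Callable, List, Tuple

Box = Tuple[float, float, float, float]  # (x, y, a, h): center, aspect ratio, height


def predict(box: Box, velocity: Box) -> Box:
    """Advance the box by one frame under a constant-velocity assumption."""
    return tuple(p + v for p, v in zip(box, velocity))


def track_with_frame_skip(
    num_frames: int, interval: int, detect: Callable[[int], Box]
) -> Tuple[List[Box], int]:
    """Run the (hypothetical) detector once every `interval` frames.

    On the other frames the box is only predicted, so the expensive
    detector is called roughly num_frames / interval times.
    """
    if interval <= 0:
        raise ValueError("interval must be a positive integer")
    boxes: List[Box] = []
    detector_calls = 0
    box: Box = (0.0, 0.0, 0.0, 0.0)
    velocity: Box = (0.0, 0.0, 0.0, 0.0)
    last_detection = None  # (frame index, box) of the most recent detection
    for t in range(num_frames):
        if t % interval == 0:
            detection = detect(t)
            detector_calls += 1
            if last_detection is not None:
                # Re-estimate velocity from the last two detections
                # (assumption: a simplification of the Kalman update).
                dt = t - last_detection[0]
                velocity = tuple(
                    (d - p) / dt for d, p in zip(detection, last_detection[1])
                )
            box = detection
            last_detection = (t, detection)
        else:
            box = predict(box, velocity)
        boxes.append(box)
    return boxes, detector_calls


if __name__ == "__main__":
    # Synthetic detector: the object moves 2 px per frame along x.
    fake_detect = lambda t: (10.0 + 2.0 * t, 5.0, 0.5, 80.0)
    boxes, calls = track_with_frame_skip(7, 3, fake_detect)
    print(calls, [b[0] for b in boxes])
```

With an interval of 3, the detector runs on frames 0, 3, and 6 only. A larger interval lowers the detector cost further but leaves more frames to prediction alone, which is the accuracy and speed trade-off discussed in the text.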
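Fig. 5. Accuracy and Speed Correlation with Detection Frame Rate

Intervals of one detection per frame and one detection every 2 frames are also considered [...].

Fig. 6 illustrates the proposed update, which also refers to the previous-to-previous frame, in contrast to the existing method [9]. Fig. 7 compares the existing and proposed methods when two objects have similar image features [...] (Frame 3 in Fig. 7) [...]. Equation (1) [...] assigns an id to each object O by comparing its feature with the features stored for each id, using the deep cosine metric [23], and taking the maximum over the objects o.

Fig. 6. Method to update previous to previous frame

Fig. 7. Comparison between existing and proposed methods when two objects have similar image features.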
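One plausible reading of the previous-to-previous frame update can be sketched as follows. This is an assumption about the mechanism, not the authors' code: each id keeps the appearance features from the last two frames, and a new detection takes the id whose stored features give the highest cosine similarity. The names `gallery`, `assign_id`, `update_gallery`, and the threshold `min_similarity` are hypothetical, and plain cosine similarity stands in for the learned deep cosine metric [23].

```python
import math
from typing import Dict, List, Optional, Tuple

Feature = List[float]


def cosine_similarity(a: Feature, b: Feature) -> float:
    """Cosine similarity of two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def update_gallery(
    gallery: Dict[int, List[Feature]], track_id: int, feature: Feature, depth: int = 2
) -> None:
    """Store the newest feature and keep only the last `depth` frames per id."""
    features = gallery.setdefault(track_id, [])
    features.append(feature)
    del features[:-depth]  # depth=2 keeps previous and previous-to-previous


def assign_id(
    feature: Feature, gallery: Dict[int, List[Feature]], min_similarity: float = 0.5
) -> Tuple[Optional[int], float]:
    """Return the id with the highest similarity over all stored features."""
    best_id: Optional[int] = None
    best_similarity = -1.0
    for track_id, features in gallery.items():
        if not features:
            continue
        similarity = max(cosine_similarity(feature, f) for f in features)
        if similarity > best_similarity:
            best_id, best_similarity = track_id, similarity
    if best_similarity < min_similarity:
        return None, best_similarity
    return best_id, best_similarity


if __name__ == "__main__":
    gallery: Dict[int, List[Feature]] = {}
    update_gallery(gallery, 1, [1.0, 0.0])  # clean view two frames ago
    update_gallery(gallery, 1, [0.0, 1.0])  # corrupted view in the previous frame
    print(assign_id([0.9, 0.1], gallery))
```

In this toy case the previous-frame feature alone would fall below the threshold, while the feature from two frames back still matches, which is the situation the previous-to-previous update is meant to handle under the assumptions above.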
The experiments use the MOT16 benchmark [8] and run on an Intel(R) Xeon(R) CPU E5-2687W v4 @ 3.00GHz (12-core CPU). Evaluation follows the CLEAR MOT metrics [6]. The 7 metrics used are listed in Table 1, where GT denotes ground truth.

Table 1. Metrics used for benchmarking

Metric  Better  Perfect  Description
MOTA    higher  100%     Overall Tracking Accuracy.
MOTAL   higher  100%     Log Tracking Accuracy.
MOTP    higher  100%     Percentage alignment of predicted bounding box and ground truth.
FP      lower   0        Number of false positives.
FN      lower   0        Number of false negatives.
IDsw    lower   0        Identity switches, see [80] for details.
FPS     higher  Inf.     Processing speed in frames per second.

Table 2 reports the results of the original method and the two proposed methods.

Table 2. Experimental results applying the original method and the two proposed method

METHOD                                    VIDEO     MOTA  MOTP  MOTAL  FP    FN     IDs  FPS
original                                  MOT16-02  16.5  76.5  16.7   545   14307  40   3.3
                                          MOT16-04  34.9  79.4  35.2   3051  27751  157  3.08
                                          MOT16-05  29.5  77.1  30     332   4440   32   5.3
                                          MOT16-09  47.2  74.8  48.1   481   2244   53   4.5
                                          MOT16-10  32    75.8  32.3   634   7701   36   3.2
                                          MOT16-11  46.1  79.5  46.4   654   4266   23   4.6
                                          MOT16-13  15    74.4  15.1   170   9546   22   3.3
                                          AVERAGE   30.7  78    31.1   5867  70255  363  3.9
frame skip                                MOT16-02  15.9  76.3  16.1   437   14520  41   12.6
                                          MOT16-04  34.2  79.4  34.5   2509  28653  132  11.9
                                          MOT16-05  27.6  76.5  27.9   220   4693   20   49.8
                                          MOT16-09  46.4  74.8  47.2   364   2409   47   12.3
                                          MOT16-10  30    75.7  30.3   486   8099   38   31.8
                                          MOT16-11  45.8  79    46.1   516   4428   25   12.9
                                          MOT16-13  12.7  73.9  12.8   133   9845   18   12.6
                                          AVERAGE   29.7  77.9  30     4664  72647  321  20.6
frame skip + previous to previous frame   MOT16-02  16.6  76.4  16.8   605   14237  39   11.8
                                          MOT16-04  34.1  79.3  34.5   3618  27524  175  11.1
                                          MOT16-05  29.8  76.9  30.2   395   4362   30   49.6
                                          MOT16-09  46.5  74.7  37.4   543   2218   54   11.2
                                          MOT16-10  32.1  75.7  32.3   683   7651   34   12.2
                                          MOT16-11  45.5  79.5  45.7   742   4238   22   12.1
                                          MOT16-13  14.9  74.3  15.1   234   9488   24   12.2
                                          AVERAGE   30.3  77.9  30.7   6820  69718  378  17.2
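The MOTA column can be related to the FP, FN, and IDs columns through the CLEAR MOT definition [6], MOTA = 1 - (FN + FP + IDsw) / GT. The sketch below applies it to the original-method row for MOT16-02. The ground-truth box count of 17,833 is an assumption taken from the published MOT16 sequence statistics, not from this paper, and `mota` is a hypothetical helper name.

```python
def mota(fp: int, fn: int, id_switches: int, gt: int) -> float:
    """CLEAR MOT accuracy in percent: 100 * (1 - (FN + FP + IDsw) / GT)."""
    if gt <= 0:
        raise ValueError("gt must be the positive number of ground-truth boxes")
    return 100.0 * (1.0 - (fn + fp + id_switches) / gt)


# Assumption: MOT16-02 contains 17,833 ground-truth boxes (MOT16 statistics).
GT_MOT16_02 = 17833

if __name__ == "__main__":
    # FP, FN, and IDs from the "original" row for MOT16-02 in Table 2.
    print(f"{mota(fp=545, fn=14307, id_switches=40, gt=GT_MOT16_02):.1f}")  # prints 16.5
```

Because FN dominates the error count in every row of Table 2, most of the MOTA difference between the methods comes from missed objects rather than from false positives or identity switches.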
On the CPU, the original method runs at about 4 fps. Running detection once every 3 frames raises the speed to about 20 fps [...] 0.3 [...] MOT16-13 [...]. There is a trade-off between speed and accuracy (MOTA) [...] (Table 2) [...].

[...] in a CPU environment, with one detection every 3 frames [...].

References

[1] Betke, Margrit, Esin Haritaoglu, and Larry S. Davis. "Real-time multiple vehicle detection and tracking from a moving vehicle." Machine Vision and Applications 12.2 (2000): 69-83.
[2] Hu, Weiming, et al. "A survey on visual surveillance of object motion and behaviors." IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews) 34.3 (2004): 334-352.
[3] Lu, Wei-Lwun, et al. "Learning to track and identify players from broadcast sports videos." IEEE Transactions on Pattern Analysis and Machine Intelligence 35.7 (2013): 1704-1716.
[4] Murray, Samuel. "Real-time multiple object tracking-a study on the importance of speed." arXiv preprint arXiv:1709.03572 (2017).
[5] Luo, Wenhan, et al. "Multiple object tracking: A literature review." arXiv preprint arXiv:1409.7618 (2014).
[6] Bernardin, Keni, and Rainer Stiefelhagen. "Evaluating multiple object tracking performance: the CLEAR MOT metrics." EURASIP Journal on Image and Video Processing 2008 (2008): 1-10.
[7] Ristani, Ergys, et al. "Performance measures and a data set for multi-target, multi-camera tracking." European Conference on Computer Vision. Springer, Cham, 2016.
[8] Milan, Anton, et al. "MOT16: A benchmark for multi-object tracking." arXiv preprint arXiv:1603.00831 (2016).
[9] Wojke, Nicolai, Alex Bewley, and Dietrich Paulus. "Simple online and realtime tracking with a deep association metric." 2017 IEEE International Conference on Image Processing (ICIP). IEEE, 2017. doi: 10.1109/ICIP.2017.8296962
[10] Redmon, Joseph, et al. "You only look once: Unified, real-time object detection." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016.
[11] Bewley, Alex, et al. "Simple online and realtime tracking." 2016 IEEE International Conference on Image Processing (ICIP), Phoenix, AZ, 2016, pp. 3464-3468. doi: 10.1109/ICIP.2016.7533003
[12] LeCun, Yann, et al. "Gradient-based learning applied to document recognition." Proceedings of the IEEE 86.11 (1998): 2278-2324. doi: 10.1109/5.726791
[13] Girshick, Ross, et al. "Rich feature hierarchies for accurate object detection and semantic segmentation." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2014.
[14] Dalal, Navneet, and Bill Triggs. "Histograms of oriented gradients for human detection." 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05). Vol. 1. IEEE, 2005.
[15] Lowe, David G. "Object recognition from local scale-invariant features." Proceedings of the Seventh IEEE International Conference on Computer Vision. Vol. 2. IEEE, 1999.
[16] Horn, Berthold KP, and Brian G. Schunck. "Determining optical flow." Techniques and Applications of Image Understanding. Vol. 281. International Society for Optics and Photonics, 1981.
[17] Viola, Paul, and Michael Jones. "Rapid object detection using a boosted cascade of simple features." Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001. Vol. 1. IEEE, 2001.
[18] He, Kaiming, et al. "Spatial pyramid pooling in deep convolutional networks for visual recognition." IEEE Transactions on Pattern Analysis and Machine Intelligence 37.9 (2015): 1904-1916.
[19] Girshick, Ross. "Fast R-CNN." Proceedings of the IEEE International Conference on Computer Vision. 2015.
[20] Ren, Shaoqing, et al. "Faster R-CNN: Towards real-time object detection with region proposal networks." Advances in Neural Information Processing Systems. 2015.
[21] Liu, Wei, et al. "SSD: Single shot multibox detector." European Conference on Computer Vision. Springer, Cham, 2016.
[22] Kuhn, Harold W. "The Hungarian method for the assignment problem." Naval Research Logistics Quarterly 2.1-2 (1955): 83-97.
[23] Wojke, Nicolai, and Alex Bewley. "Deep cosine metric learning for person re-identification." 2018 IEEE Winter Conference on Applications of Computer Vision (WACV). IEEE, 2018.

Kyung Hun Kim
- ORCID : https://orcid.org/0000-0002-3196-3470

Jun Ho Heo
- ORCID : https://orcid.org/0000-0002-7575-0354

Suk-Ju Kang
- 2011 ~ 2012 : LG Display
- ORCID : https://orcid.org/0000-0002-4809-956x