(JBE Vol. 24, No. 4, July 2019) (Special Paper)


(JBE Vol. 24, No. 4, July 2019) https://doi.org/10.5909/jbe.2019.24.4.564
ISSN 2287-9137 (Online) ISSN 1226-7953 (Print)

Integral Regression Network for Facial Landmark Detection

Do Yeop Kim a) and Ju Yong Chang a)

Abstract

With the development of deep learning, the performance of facial landmark detection methods has greatly improved. Heat map regression, a representative facial landmark detection method, is widely used because it is efficient and robust. However, the landmark coordinates cannot be obtained directly from a single network, and accuracy is lost when the coordinates are determined from the heat map. To solve these problems, we propose combining integral regression with the existing heat map regression method. Experiments on various datasets show that the proposed integral regression network significantly improves the performance of facial landmark detection.

Keywords: Face Alignment, Facial Landmark Detection, Deep Learning

a) Department of Electronics and Communications Engineering, Kwangwoon University
Corresponding Author: Ju Yong Chang
E-mail: jychang@kw.ac.kr
Tel: +82-2-940-5136
ORCID: https://orcid.org/0000-0003-3710-7314
IPIU 2019.
This work was supported by an Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2018-0-00735, Media Interaction Technology based on Human Reaction and Intention to Content in UHD Broadcasting Environment). The present research was conducted with the Research Grant of Kwangwoon University in 2019.
Manuscript received May 7, 2019; Revised July 5, 2019; Accepted July 5, 2019.

Copyright 2016 Korean Institute of Broadcast and Media Engineers. All rights reserved.
This is an Open-Access article distributed under the terms of the Creative Commons BY-NC-ND (http://creativecommons.org/licenses/by-nc-nd/3.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited and not altered.

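(Do Yeop Kim et al.: Integral Regression Network for Facial Landmark Detection)

Facial landmark detection […] locates facial landmarks […]. It is used in applications such as face morphing […].

Recent methods are based on the convolutional neural network (CNN). A representative approach is heat map regression [1], in which a CNN predicts a heat map […]. Such a network consists of convolutional layers […] overfitting […]. The landmark coordinates are then taken from the heat map with argmax. However, the coordinates cannot be obtained directly from a single network, and accuracy is lost when they are determined from the heat map […] argmax […].

To solve these problems, we use integral regression [2], which replaces the argmax with the expectation of the heat map. When the heat map has a Gaussian shape, the expectation corresponds to its mean […] argmax […]. For the expectation to be defined, the heat map values must sum to 1, so a normalization step is required. We set negative values to 0 with ReLU and then apply sum-normalization […].

With the development of deep learning, CNN-based facial landmark detection methods […]. [3] […] CNN […] a 3-level cascaded CNN. [4] […] a 4-level cascaded CNN […] coarse-to-fine […] 50 […]. The cascaded CNN methods […] CNN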

Multi-task learning trains a single network on several related tasks […]. [5] […] facial landmark detection together with tasks such as head pose estimation […]. [6] […] ResNet-101 [7] […]. The multi-task learning methods [5], [6] […] task […].

[1] […] a Stacked Hourglass Network built from Hourglass [17] modules. [1] […] 2D […] 3D […]. […] [1] […] Hourglass […].

Table 1. Detail of the proposed network

module name                  layer name      output size   operation
heat map regression module   conv1           128 × 128     7 × 7, 64, stride 2
                             conv2_x         64 × 64       3 × 3 max pool, stride 2 […]
                             conv3_x         32 × 32       […]
                             conv4_x         16 × 16       […]
                             conv5_x         8 × 8         […]
                             deconv1         16 × 16       […] stride […]
                             deconv2         32 × 32       […] stride […]
                             deconv3         64 × 64       […] stride […]
                             regression      64 × 64       […]
integral regression module   normalization   64 × 64       […]
                             expectation     2             […]

Fig. 1. Overview of proposed facial landmark detection network
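The output sizes listed in Table 1 can be checked with the standard size formulas for convolution and transposed convolution. The sketch below is only an illustration. The padding values, the 4 × 4 deconvolution kernel, and the stride of 2 in conv3_x to conv5_x and in the deconvolution layers are assumptions (typical ResNet-50 style settings), because Table 1 does not preserve them. The helper names `conv_out`, `deconv_out`, and `trace_sizes` are hypothetical.

```python
def conv_out(size, kernel, stride, padding):
    """Output size of a convolution or pooling layer along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1


def deconv_out(size, kernel, stride, padding):
    """Output size of a transposed convolution along one spatial axis."""
    return (size - 1) * stride - 2 * padding + kernel


def trace_sizes(input_size=256):
    """Trace the spatial size through the layers named in Table 1."""
    sizes = {}

    # conv1: 7x7, stride 2 (from Table 1); padding 3 is an assumption.
    s = conv_out(input_size, kernel=7, stride=2, padding=3)
    sizes["conv1"] = s

    # conv2_x starts with a 3x3 max pool, stride 2 (from Table 1);
    # padding 1 is an assumption.
    s = conv_out(s, kernel=3, stride=2, padding=1)
    sizes["conv2_x"] = s

    # Assumption: each later stage halves the resolution once
    # with a stride-2, 3x3 convolution.
    for name in ("conv3_x", "conv4_x", "conv5_x"):
        s = conv_out(s, kernel=3, stride=2, padding=1)
        sizes[name] = s

    # Assumption: each deconvolution is 4x4 with stride 2 and padding 1,
    # which doubles the resolution.
    for name in ("deconv1", "deconv2", "deconv3"):
        s = deconv_out(s, kernel=4, stride=2, padding=1)
        sizes[name] = s

    # Assumption: the regression layer keeps the spatial size.
    sizes["regression"] = s
    return sizes


if __name__ == "__main__":
    for layer, size in trace_sizes(256).items():
        print(f"{layer:>10}: {size} x {size}")
```

Under these assumptions a 256 × 256 input reproduces the sequence 128, 64, 32, 16, 8, 16, 32, 64 given in the output size column.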

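The proposed network takes an RGB image as input and outputs 68 facial landmarks. Fig. 1 and Table 1 show the overview and the details of the network. In the operation column of Table 1 […]. For example, the conv2_x operation is a 3 × 3 max pooling with stride 2 followed by a bottleneck architecture of three convolution layers (1 × 1, 64; 3 × 3, 64; 1 × 1, 256), repeated 3 times [7]. The integral regression module […].

1. […] RGB […]. The heat map regression module is based on ResNet-50 [7]. […] the pooling and linear layers […] to form a fully convolutional network (FCN), and 3 deconvolution layers […]. For a 256 × 256 input, ResNet-50 produces an 8 × 8 output, and the 3 deconvolution layers bring it to 64 × 64.

2. […] sum-normalization. ReLU sets negative values to 0 […]. ReLU […] activation function […]. With Eq. (1), every value lies between 0 and 1 and the values sum to 1. […]

3. […] loss function […] the mean squared error (MSE) […] L1 […]. […] 1.0, 1.0 […]. Eq. (5) is the cost function […].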
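As an illustration of the integral regression step described above, the following pure-Python sketch applies ReLU, sum-normalization, and the expectation to a single heat map and compares the result with argmax. It is a minimal sketch under assumptions, not the authors' implementation. The function names, the synthetic Gaussian heat map, and the form of `combined_loss` (MSE on the heat map plus L1 on the coordinates, each weighted by 1.0) are assumptions made for illustration only.

```python
import math


def gaussian_heatmap(size, cx, cy, sigma):
    """Synthetic size x size heat map with a Gaussian peak at (cx, cy)."""
    return [[math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))
             for x in range(size)]
            for y in range(size)]


def argmax_coords(heatmap):
    """Integer (x, y) position of the largest heat map value."""
    _, x, y = max((v, x, y)
                  for y, row in enumerate(heatmap)
                  for x, v in enumerate(row))
    return float(x), float(y)


def integral_coords(heatmap):
    """ReLU, then sum-normalization, then expectation of the pixel position."""
    # ReLU: negative responses are set to 0.
    relu = [[max(v, 0.0) for v in row] for row in heatmap]
    total = sum(sum(row) for row in relu)
    if total == 0.0:
        raise ValueError("heat map has no positive response")
    # Dividing by the total makes the values lie in [0, 1] and sum to 1,
    # so the weighted sums below are expectations.
    x = sum(xi * v for row in relu for xi, v in enumerate(row)) / total
    y = sum(yi * v for yi, row in enumerate(relu) for v in row) / total
    return x, y


def combined_loss(pred_map, gt_map, pred_xy, gt_xy, w_heat=1.0, w_coord=1.0):
    """Assumed cost: weighted sum of heat map MSE and coordinate L1 error."""
    n = sum(len(row) for row in gt_map)
    mse = sum((p - g) ** 2
              for p_row, g_row in zip(pred_map, gt_map)
              for p, g in zip(p_row, g_row)) / n
    l1 = sum(abs(p - g) for p, g in zip(pred_xy, gt_xy))
    return w_heat * mse + w_coord * l1


if __name__ == "__main__":
    # A landmark at a sub-pixel position on a 64 x 64 heat map.
    hm = gaussian_heatmap(64, cx=20.3, cy=41.6, sigma=2.0)
    print("argmax     :", argmax_coords(hm))
    print("expectation:", integral_coords(hm))
```

Because argmax can only return integer pixel positions, it cannot represent a landmark that lies between pixels, while the expectation can. The expectation is also built from sums and divisions only, so it can be trained end to end with the rest of the network.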

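1. […] ResNet-50, deconvolution […]. ResNet-50 […] ImageNet [8] […]. ResNet-50 […] 1,000 […] 2 […], deconvolution […].

2. […] 300W-LP [9], 300W [10], Menpo [11], and 300-VW [12]. 300W-LP (large-pose) [9] […] 300W […] 61,225 images […] large-pose […]. 300W is the test set of the 300-W Challenge [10] […] Indoor and Outdoor […] 600 images. Menpo […] FDDB [13] and AFLW [14] […]. […] 68 […] 39 landmarks […] 39 […] 68 […]. Of about 9,000 images, […] 68 landmarks […] 6,679 […]. 300-VW [12] consists of 114 videos […] 218,595 frames […] 68 landmarks. Of the 114 videos, 64 […] 50 […]. […] categories 1, 2, and 3 […]. We train on 300W-LP and test on 300-W, Menpo, and 300-VW. Categories 1, 2, and 3 of 300-VW are referred to as cat1, cat2, and cat3.

3. […] Adam [15] […] 32 […]. […] learning rate […] 30 […]. […] PyTorch [16] […] a GTX1080Ti GPU. For evaluation we use the normalized mean error (NME) [1]. NME […]. […] NME […] the area under curve (AUC). AUC […] NME […]. […] baseline […]. We use the Facial Alignment Network (FAN) of [1] as the baseline. FAN […].

4.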

[…] 300W, Menpo, cat1, cat2, and cat3. The compared models are FAN-Trained and FAN, both based on [1], the heat map regression network (ResNet50), and the proposed network with the integral regression module (ResNet50-Int). FAN with the integral regression module added is denoted FAN-Int. Fig. 2 shows the NME curves of the models. […] FAN […] FAN-Trained […].

Fig. 2. NME curves of proposed method (ResNet50-Int) and baseline models
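The NME and AUC metrics used in this evaluation can be sketched as follows. This is a minimal illustration under assumptions. The normalization factor `norm` (for example, a face bounding-box size), the error threshold of 0.07, the reading of AUC as the area under the cumulative error distribution curve, and the function names are not given in the recoverable text and are chosen here for illustration only.

```python
import math


def nme(pred, gt, norm):
    """Normalized mean error of one face.

    pred, gt: lists of (x, y) landmark positions of equal length.
    norm: normalization factor (assumption: e.g. a bounding-box size).
    """
    dists = [math.hypot(px - gx, py - gy)
             for (px, py), (gx, gy) in zip(pred, gt)]
    return sum(dists) / (len(dists) * norm)


def auc(nmes, threshold=0.07, steps=1000):
    """Area under the cumulative error distribution curve up to `threshold`.

    The curve gives, for each error level e, the fraction of images whose
    NME is at most e. The area is divided by `threshold`, so the result
    lies in [0, 1]; multiply by 100 for a percentage.
    """
    n = len(nmes)
    xs = [threshold * i / steps for i in range(steps + 1)]
    ced = [sum(1 for v in nmes if v <= x) / n for x in xs]
    # Trapezoidal rule over the sampled curve.
    area = sum((ced[i] + ced[i + 1]) / 2.0 * (xs[i + 1] - xs[i])
               for i in range(steps))
    return area / threshold


if __name__ == "__main__":
    # Two landmarks, one of them off by 5 pixels, normalized by 10.
    error = nme([(0.0, 0.0), (3.0, 4.0)], [(0.0, 0.0), (0.0, 0.0)], norm=10.0)
    print("NME:", error)
    print("AUC:", auc([0.01, 0.02, 0.05, 0.2]))
```

A lower NME is better, while a higher AUC is better, since it means more images fall below each error level.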

ResNet50 shows a level of performance comparable to that of FAN. However, ResNet50-Int, to which the integral regression module is added, performs better than FAN. FAN-Int in turn improves greatly on FAN and performs best among all models. Table 2 quantifies the performance differences seen in the graph above using AUC. The proposed model, ResNet50-Int, outperforms every model except FAN-Int. The results for ResNet50-Int and FAN-Int demonstrate that applying the integral regression module to a network based on heat map regression greatly improves facial landmark detection performance.

Table 2. AUC (%) of each model

Model          300W    Menpo   cat1    cat2    cat3
ResNet50       65.61   65.36   61.46   61.28   55.54
ResNet50-Int   71.71   73.62   72.31   74.96   64.61
FAN-Trained    63.39   63.66   62.10   60.97   48.61
FAN            66.68   65.97   68.99   61.24   55.79
FAN-Int        72.60   74.54   73.01   75.61   67.38

Next, to evaluate the complexity of the proposed model, we measured the running speed of each model. The results are given in Table 3. The ResNet50-based methods are faster than FAN. This is because the FAN model repeats the Hourglass module [17] 4 times and is therefore computationally heavy by design. The difference in processing time between ResNet50, the heat map regression network, and ResNet50-Int, which adds the integral regression module, is within 1 ms on every test dataset, so the integral regression module has very little effect on the overall complexity. The difference in processing time between FAN and FAN-Int is also within 3 ms and does not greatly affect the complexity of the model.

Table 3. Processing time per frame (ms) of each model

Model          300W    Menpo   cat1    cat2    cat3
ResNet50       6.88    6.62    6.39    6.64    6.46
ResNet50-Int   7.20    7.51    6.66    6.49    6.36
FAN-Trained    19.68   20.17   19.26   19.08   18.77
FAN            18.78   19.35   18.94   18.73   18.84
FAN-Int        20.16   20.01   20.35   20.31   21.30

Finally, Fig. 3 shows qualitative facial landmark results obtained with the proposed method. Fig. 3(a) and (b) show cases where the landmarks were detected successfully and cases where detection failed, respectively. In the successful cases the detected landmarks are very close to the ground truth, and the method is robust to beards, large poses, and occlusion. Detection failed when the face was heavily occluded by other objects or strongly rotated.

Fig. 3. Examples of facial landmarks detected by proposed method (a) success cases (b) failure cases

Ⅴ. Conclusion

In this paper, […] a facial landmark detection method based on heat map regression

[…] state-of-the-art […] 3 […].

References

[1] A. Bulat and G. Tzimiropoulos, "How far are we from solving the 2D & 3D face alignment problem? (and a dataset of 230,000 3D facial landmarks)," IEEE International Conference on Computer Vision, pp. 1021-1030, 2017.
[2] X. Sun, B. Xiao, F. Wei, S. Liang, and Y. Wei, "Integral human pose regression," European Conference on Computer Vision, pp. 529-545, 2018.
[3] Y. Sun, X. Wang, and X. Tang, "Deep convolutional network cascade for facial point detection," IEEE Conference on Computer Vision and Pattern Recognition, 2013.
[4] E. Zhou, H. Fan, Z. Cao, Y. Jiang, and Q. Yin, "Extensive facial landmark localization with coarse-to-fine convolutional network cascade," IEEE International Conference on Computer Vision Workshops, 2013.
[5] Z. Zhang, P. Luo, C. C. Loy, and X. Tang, "Facial landmark detection by deep multi-task learning," European Conference on Computer Vision, 2014.
[6] R. Ranjan, V. M. Patel, and R. Chellappa, "HyperFace: A deep multi-task learning framework for face detection, landmark localization, pose estimation, and gender recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 41, no. 1, pp. 121-135, 2019.
[7] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," IEEE Conference on Computer Vision and Pattern Recognition, pp. 770-778, 2016.
[8] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," Advances in Neural Information Processing Systems, pp. 1097-1105, 2012.
[9] X. Zhu, Z. Lei, X. Liu, H. Shi, and S. Z. Li, "Face alignment across large poses: A 3D solution," IEEE Conference on Computer Vision and Pattern Recognition, pp. 146-155, 2016.
[10] C. Sagonas, E. Antonakos, G. Tzimiropoulos, S. Zafeiriou, and M. Pantic, "300 faces in-the-wild challenge: Database and results," Image and Vision Computing, vol. 47, pp. 3-18, 2016.
[11] S. Zafeiriou, "The Menpo facial landmark localisation challenge," IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2017.
[12] J. Shen, S. Zafeiriou, G. G. Chrysos, J. Kossaifi, G. Tzimiropoulos, and M. Pantic, "The first facial landmark tracking in-the-wild challenge: Benchmark and results," IEEE International Conference on Computer Vision Workshops, 2015.
[13] V. Jain and E. Learned-Miller, "FDDB: A benchmark for face detection in unconstrained settings," UMass Amherst Technical Report, 2010.
[14] M. Köstinger, P. Wohlhart, P. M. Roth, and H. Bischof, "Annotated facial landmarks in the wild: A large-scale, real world database for facial landmark localization," IEEE International Conference on Computer Vision Workshops, 2011.
[15] D. P. Kingma and J. Ba, "Adam: A method for stochastic optimization," International Conference on Learning Representations, 2015.
[16] A. Paszke, S. Gross, S. Chintala, G. Chanan, E. Yang, Z. DeVito, Z. Lin, A. Desmaison, L. Antiga, and A. Lerer, "Automatic differentiation in PyTorch," Advances in Neural Information Processing Systems Workshops, 2017.
[17] A. Newell, K. Yang, and J. Deng, "Stacked hourglass networks for human pose estimation," European Conference on Computer Vision, pp. 483-499, 2016.

Do Yeop Kim
- Feb. 2019 : […]
- Mar. 2019 ~ : […]
- ORCID : https://orcid.org/0000-0002-0624-5469
- Research interests : Face Landmark Detection, 3D Shape Reconstruction

Ju Yong Chang
- Feb. 2001 : […]
- Feb. 2008 : […]
- Feb. 2008 ~ Jan. 2009 : Mitsubishi Electric Research Laboratories (MERL) Postdoctoral Researcher
- Apr. 2009 ~ Jan. 2011 : […] DMC […]
- Apr. 2011 ~ Feb. 2012 : […] BK […]
- Mar. 2012 ~ Feb. 2017 : […]
- Mar. 2017 ~ : […]
- ORCID : https://orcid.org/0000-0003-3710-7314
- Research interests : […]