A Depth-map Coding Method using the Adaptive XOR Operation

Kyung Yong Kim a) and Gwang Hoon Park a)

Abstract
This paper proposes an efficient method for coding depth-maps, whose characteristics differ from those of natural images. A depth-map is very smooth inside objects and in the background, but it has sharp, cliff-like edges on object boundaries. In addition, when a depth-map block is decomposed into bit-planes, bit-planes on object boundaries often match each other perfectly or match in inverted form. The proposed scheme therefore combines a bit-plane unit coding method using the adaptive XOR operation, which codes object-boundary areas efficiently, with a conventional DCT-based coding scheme (for example, H.264/AVC), which codes the interior of objects and the background efficiently. Experimental results show that, compared with the H.264/AVC coding scheme, the proposed algorithm achieves average bit-rate savings (BD-rate) of 11.8% to 20.8% and average PSNR (Peak Signal-to-Noise Ratio) gains (BD-PSNR) of 0.9 dB to 1.5 dB. Compared with the adaptive block-based depth-map coding scheme, it achieves average bit-rate savings of 7.7% to 12.2% and average PSNR gains of 0.5 dB to 0.8 dB. For images synthesized using the decoded depth-map, the subjective quality of the proposed method is better than that of the H.264/AVC coding scheme and similar to that of the adaptive block-based depth-map coding scheme.

Keywords: 3D Video Coding, Depth-map Coding, FTV (Free view-point TV), MVC (Multi-view Video Coding)
[Fig. 1 diagram: left, center, and right views (L, C, R) -> depth-map generation -> 3D video coding system (views L, C, R and depth-maps L, C, R) -> virtual-view image synthesis -> N-view image output]
Fig. 1. N-view system with the 3-view configuration.

a) Media Lab., College of Electronics and Information, Kyung Hee University
Contact: ghpark@khu.ac.kr
Grant: NIPA-2011-(C1090-1111-0001)
Manuscript dates: September 7, 2010; revisions: 1st January 11, 2011, 2nd March 7; March 7, 2011

Fig. 1 shows the N-view (N > 3) 3D video system considered by ISO/IEC MPEG (Moving Picture Experts Group) [1]. In addition to the coded views, this system uses depth information (depth-maps), from which virtual-view images are synthesized.
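Depth-maps are also used to synthesize views for applications such as HM (Head Mount) displays and FTV (Free view-point TV) [2]. Video coding methods based on the DCT (Discrete Cosine Transform) have been applied to depth-map coding [3].

II.

[Fig. 2 panels: (a) original depth-map; (b) detail of (a); (c) image synthesized from (a); (d) detail of (c); (e) decoded depth-map; (f) detail of (e); (g) image synthesized from (e); (h) detail of (g)]
Fig. 2. Comparison of the subjective picture quality of the synthesized images generated using the original depth-map (a) and the decoded depth-map (e) in the "Breakdancers" sequence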
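Depth-maps contain sharp edges on object boundaries, which DCT-based coding does not preserve well. Fig. 2 illustrates this with the MPEG test sequence Breakdancers. Fig. 2(a) is the original Breakdancers depth-map, and Fig. 2(e) is the Breakdancers depth-map coded by H.264 (MPEG-4 Part 10 AVC; Advanced Video Coding) [4] with a quantization parameter (QP) of 37. Fig. 2(c) and Fig. 2(g) are the images synthesized from Fig. 2(a) and Fig. 2(e), respectively, using the MPEG view synthesis reference software (VSRS, View Synthesis Reference Software) [5,6]. Of the 8 views, the image of view 2 was synthesized from the images and depth-maps of views 1 and 3. The decoded depth-map of Fig. 2(e) (detail in Fig. 2(f)) is degraded by the DCT-based coding, and the degradation carries over to the synthesized image of Fig. 2(g) (detail in Fig. 2(h)).

To address this weakness of DCT-based coding, an adaptive block-based depth-map coding method was proposed in [7,8]. For each block it selects either the DCT-based coding method (Fig. 3(A)) or the bit-plane coding method (Fig. 3(B)) using the rate-distortion optimization of H.264/AVC [9]. Fig. 3(A) is the DCT-based method, which is the same as H.264/AVC. Fig. 3(B) consists of Gray-code conversion, bit-plane separation, bit-plane coding, and multiplexing.

[Fig. 3 diagram: depth-map block input -> (A) DCT-based video coding method, or (B) Gray-code conversion -> bit-plane separation -> bit-plane coding; both -> multiplexing -> bitstream]
Fig. 3. Adaptive block-based depth-map coding method [7]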
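The bit-plane coding path converts the depth values to Gray code [10] and separates the block into bit-planes, from the most significant bit-plane (MSBP) to the least significant bit-plane (LSBP). Each bit-plane is coded with the binary shape coding method [12] of MPEG-4 Part-2 Visual (ISO/IEC 14496-2) [11], which uses CAE (Context-based Arithmetic Encoding) [13].

III.

This section analyzes the bit-plane coding used in [7,8]. Fig. 4 shows the bit-planes of object-boundary blocks (16x16) in the Breakdancers depth-map. Fig. 5 compares the binary images of selected bit-planes before Gray coding (Fig. 5(a), 5(b), 5(c)) and after Gray coding (Fig. 5(d), 5(e), 5(f)).

[Fig. 4 panels (a), (b), (c): each shows a bit-plane block and the same bit-plane block converted to Gray code]
Fig. 4. Bit-plane analysis of object-boundary blocks in the depth-map of the Breakdancers sequence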
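To make the Gray-code conversion [10] and the bit-plane separation concrete, here is a minimal Python sketch. It is not the authors' implementation: the helper names `to_gray`, `split_bit_planes`, and `count_mixed_planes` are hypothetical, and the toy block with depth values 127 and 128 is an assumption chosen for illustration. It counts how many bit-planes of a boundary block contain an edge before and after Gray coding.

```python
def to_gray(value):
    """Binary-reflected Gray code of a non-negative integer."""
    return value ^ (value >> 1)


def split_bit_planes(block, n_bits=8):
    """Split a 2-D block of integers into n_bits binary planes (index 0 = MSB)."""
    return [[[(pixel >> bit) & 1 for pixel in row] for row in block]
            for bit in range(n_bits - 1, -1, -1)]


def count_mixed_planes(planes):
    """Count planes that contain both 0s and 1s, i.e. planes that hold an edge."""
    return sum(1 for plane in planes
               if len({bit for row in plane for bit in row}) == 2)


# Toy 4x4 boundary block (assumed values): left half 127, right half 128.
block = [[127, 127, 128, 128] for _ in range(4)]
gray_block = [[to_gray(pixel) for pixel in row] for row in block]

binary_mixed = count_mixed_planes(split_bit_planes(block))
gray_mixed = count_mixed_planes(split_bit_planes(gray_block))
print(binary_mixed, gray_mixed)
```

In this toy block every plain binary bit-plane contains the edge, whereas after Gray coding only the MSB plane does. Other blocks can behave differently, which is why a fixed conversion is not always the best choice.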
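[Fig. 5 panels: (a) bit-plane MSB-5 of Fig. 4(a); (b) bit-plane MSB-6 of Fig. 4(b); (c) bit-plane of Fig. 4(c); (d) Gray-coded MSB-5 of Fig. 4(a); (e) Gray-coded MSB-6 of Fig. 4(b); (f) Gray-coded bit-plane of Fig. 4(c)]
Fig. 5. Comparison of the bit-plane binary images before and after Gray coding

[Fig. 6 flowchart: depth-map block -> (1) bit-plane separation -> (2) adaptive XOR operation -> (3) bit-plane unit coding. In (3): "Are all binary image values in the block identical?" yes: code only the mode information indicating whether the block is all 0 or all 255; no: code the mode information and code the binary image using CAE]
Fig. 6. Bit-plane unit coding method using the adaptive XOR method

The bit-plane unit coding method using the adaptive XOR operation proceeds as follows.
Step 1. The depth-map block is separated into bit-planes.
Step 2. The adaptive XOR operation is applied. For bit-plane MSB-(i), the number of bits generated when it is coded without the XOR operation with bit-plane MSB-(i-1) is measured (A), and the number of bits generated when it is coded after the XOR operation with bit-plane MSB-(i-1) is measured (B). A and B are compared, and the XOR operation is applied only when it produces fewer bits.
Step 3. Each bit-plane is coded as in step 3 of Fig. 6.
Step 3-1. If all binary values in the bit-plane block are identical, go to Step 3-2; otherwise go to Step 3-3.
Step 3-2. Only the mode information indicating whether the block is all 0 (0) or all 255 (1) is coded.
Step 3-3. The mode information is coded, and the binary image is coded using CAE.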
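The decision in Step 2 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the paper's encoder: the real bit counts A and B come from CAE coding, whereas `transition_cost` here is a hypothetical stand-in that counts 0/1 changes between neighbouring pixels, and the XOR is assumed to be taken with the original adjacent more significant plane. The names `xor_planes` and `adaptive_xor` are likewise hypothetical.

```python
def transition_cost(plane):
    """Stand-in for coded size (assumption): 0/1 changes between neighbours."""
    rows, cols = len(plane), len(plane[0])
    cost = 0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols and plane[r][c] != plane[r][c + 1]:
                cost += 1
            if r + 1 < rows and plane[r][c] != plane[r + 1][c]:
                cost += 1
    return cost


def xor_planes(a, b):
    """Pixel-wise XOR of two binary planes of equal size."""
    return [[x ^ y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def adaptive_xor(planes, cost=transition_cost):
    """planes[0] is the MSB plane. Returns (planes to code, per-plane XOR flags)."""
    coded, flags = [planes[0]], [False]       # the MSB plane has no upper neighbour
    for i in range(1, len(planes)):
        candidate = xor_planes(planes[i], planes[i - 1])
        bits_a = cost(planes[i])              # A: plane coded as it is
        bits_b = cost(candidate)              # B: plane coded after XOR
        use_xor = bits_a > bits_b             # apply XOR only when it is cheaper
        coded.append(candidate if use_xor else planes[i])
        flags.append(use_xor)
    return coded, flags


edge = [[0, 0, 1, 1], [0, 0, 1, 1]]
inverted = [[1, 1, 0, 0], [1, 1, 0, 0]]       # inverted match of `edge`
flat = [[0, 0, 0, 0], [0, 0, 0, 0]]
planes = [edge, edge, inverted, flat]         # toy planes, MSB first
coded, flags = adaptive_xor(planes)
print(flags)
```

With these toy planes, the perfectly matching plane becomes all zeros and the inverted plane becomes all ones after XOR, so both can be signalled by mode information alone, while the already flat plane is left untouched.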
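The bit-plane unit coding method using the adaptive XOR operation is suited to object-boundary blocks, whereas DCT-based coding is suited to the smooth interior of objects and the background. The two are therefore combined.

IV.

Fig. 7 shows the encoder of the proposed method. For each block, the encoder selects either the DCT-based coding method (Fig. 7(A)) or the bit-plane coding method (Fig. 7(B)) using the rate-distortion optimization of H.264/AVC.

[Fig. 7 diagram. Input: depth-map block. (A) DCT-based coding: motion estimation, motion compensation (inter), intra prediction (intra), transform, quantization, entropy coding, inverse quantization, inverse transform, deblocking filter, image buffer, reconstructed image buffer, reference image buffer. (B) Bit-plane coding: bit-plane separation, bit-rate control, XOR operation, bit-plane unit coding; reconstruction path: bit-plane unit decoding, XOR operation, bit-plane combination, depth-map construction. Both paths -> multiplexing -> bitstream]
Fig. 7. Encoder block diagram of the proposed method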
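The DCT-based coding method of Fig. 7(A) is the same as H.264/AVC and processes the depth-map in 16x16 blocks, in either intra or inter mode.
(1) Prediction. In intra mode, the prediction block is formed by intra prediction in Fig. 7(A). In inter mode, motion estimation in Fig. 7(A) finds a motion vector, and motion compensation forms the prediction block.
(2) The residual block, that is, the difference between the input block and the prediction block, is coded through the following steps.
(2-1) The DCT converts the residual block from the spatial domain to the frequency domain and outputs transform coefficients.
(2-2) Quantization outputs quantized coefficients.
(2-3) The quantized coefficients are scanned in zigzag order, starting from the DC coefficient.
(2-4) Entropy coding is applied, using CAVLC (Context-adaptive Variable Length Coding) or CABAC (Context-based Adaptive Binary Arithmetic Coding) [15].
(3) The quantized coefficients are inverse quantized and inverse transformed to reconstruct the block.
(4)-(5) For inter-frame prediction, the reconstructed image passes through the deblocking filter, which reduces blocking artifacts, and is stored in the reference image buffer.

The bit-plane coding method of Fig. 7(B) uses the adaptive XOR operation and also processes 16x16 blocks. It consists of bit-plane separation, bit-rate control, XOR operation, bit-plane unit coding, and the corresponding reconstruction steps.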
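(1) The N-bit depth-map block is separated into N bit-planes.
(2) Bit-rate control selects the bit-planes to be coded, in conjunction with the rate-distortion optimization of H.264/AVC.
(3) The n selected bit-planes (MSB, MSB-1, MSB-2, MSB-3, MSB-4, MSB-5, MSB-6, ...) are coded in order.
(3-1) XOR operation. Whether to perform the XOR operation on the current bit-plane is decided as shown in Fig. 8. The number of bits generated by coding the current bit-plane as it is is measured (A). Then the number of bits generated by coding the bit-plane after the XOR operation is measured (B). If A is larger than B, the XOR operation is performed. The decision is signalled by XOR flag information (1 bit): '1' if the XOR operation was performed and '0' if it was not.
(3-2) Bit-plane unit coding uses the binary shape coding method [12] of MPEG-4 Part-2 Visual (ISO/IEC 14496-2) [11], which is based on CAE (Context-based Arithmetic Encoding) [13].
(4)-(5) The coded bit-planes are reconstructed; m denotes the number of coded bit-planes.
(6) The XOR operation is applied again to each bit-plane whose XOR flag indicates that it was XORed, which restores the original bit-plane.
(7) The m bit-planes are combined into an M-bit block.
(8) The M-bit block is expanded to the N-bit depth-map block by setting the remaining lower bit-planes to 0 (0).
(9) The output of the DCT-based coding of Fig. 7(A) or of the bit-plane coding of Fig. 7(B) is multiplexed into the bitstream.

[Fig. 8 flowchart: start -> measure the number of bits after binary coding of the current bit-plane (A) -> measure the number of bits after binary coding of the XORed bit-plane (B) -> A > B? yes: perform the XOR operation; no: skip -> end]
Fig. 8. Method for deciding whether to perform the XOR operation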
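Steps (7) and (8) above, combining the coded bit-planes and filling the uncoded lower planes with zeros, can be sketched as follows. This is a minimal Python sketch under the assumption that the m most significant planes are kept; the function names are hypothetical and the 2x2 block values are made up for illustration.

```python
def split_bit_planes(block, n_bits=8):
    """Split a 2-D block of integers into n_bits binary planes (index 0 = MSB)."""
    return [[[(pixel >> bit) & 1 for pixel in row] for row in block]
            for bit in range(n_bits - 1, -1, -1)]


def combine_bit_planes(planes):
    """Combine binary planes (MSB first) into integers of len(planes) bits."""
    rows, cols = len(planes[0]), len(planes[0][0])
    values = [[0] * cols for _ in range(rows)]
    for plane in planes:
        for r in range(rows):
            for c in range(cols):
                values[r][c] = (values[r][c] << 1) | plane[r][c]
    return values


def reconstruct_depth(kept_planes, n_bits=8):
    """Rebuild N-bit depth values from the kept planes; missing low bits become 0."""
    m = len(kept_planes)
    m_bit = combine_bit_planes(kept_planes)
    return [[value << (n_bits - m) for value in row] for row in m_bit]


block = [[200, 37], [128, 255]]               # assumed 8-bit depth values
planes = split_bit_planes(block)
coarse = reconstruct_depth(planes[:4])        # only the 4 upper planes coded
exact = reconstruct_depth(planes)             # all 8 planes coded
print(coarse, exact)
```

Keeping fewer planes lowers the bit-rate at the cost of a bounded quantization error in the depth values; keeping all N planes is lossless.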
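The bit-plane unit coding of Fig. 7 uses the binary shape coding method [12] of MPEG-4 Part-2 Visual (ISO/IEC 14496-2) [11], as shown in Fig. 9.

[Fig. 9 diagram: current bit-plane block -> mode decision -> CAE encoding -> multiplexing -> bitstream; reference path: reconstructed image -> bit-plane separation -> XOR operation -> CAE encoding]
Fig. 9. Block diagram of the bit-plane encoding method

The blocks of Fig. 9 operate as follows.
(1) Mode decision. If all pixel values in the current bit-plane block are 255 (1), the mode is all_1; if all are 0 (0), the mode is all_0. Otherwise the block is coded by CAE, for example in intraCAE mode [13].
(2) CAE encoding [13]. Each pixel is coded by binary arithmetic coding, using a context built from its neighbouring pixels.
(3) The reference bit-plane is obtained from the reconstructed image by bit-plane separation and the XOR operation, and is supplied to the CAE encoding.

[Fig. 10 diagram: the reference bit-planes are combined by successive XOR operations to form the reference bit-plane for the bit-plane to be coded]
Fig. 10. Construction method of the reference bit-plane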
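The mode decision of step (1) can be sketched in a few lines of Python. The mode names all_0, all_1, and intraCAE follow the text; the function names and the toy planes are hypothetical, and the arithmetic coder itself is deliberately left out.

```python
def decide_mode(plane):
    """Return the coding mode of one binary bit-plane block."""
    values = {bit for row in plane for bit in row}
    if values == {0}:
        return "all_0"       # signalled by mode information only
    if values == {1}:
        return "all_1"       # signalled by mode information only
    return "intraCAE"        # pixels must be coded by CAE


def pixels_needing_cae(planes):
    """Count the pixels that still have to go through the arithmetic coder."""
    return sum(len(plane) * len(plane[0])
               for plane in planes if decide_mode(plane) == "intraCAE")


# Toy 2x2 planes (assumed values).
planes = [
    [[0, 0], [0, 0]],
    [[1, 1], [1, 1]],
    [[0, 1], [0, 1]],
]
modes = [decide_mode(plane) for plane in planes]
remaining = pixels_needing_cae(planes)
print(modes, remaining)
```

The more planes the adaptive XOR operation turns into uniform blocks, the fewer pixels remain for CAE, which is where the bit-rate saving on object boundaries comes from.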
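The reconstructed image used as the reference in Fig. 9 may have been coded by either the DCT-based method (Fig. 7(A)) or the bit-plane method (Fig. 7(B)). The reference bit-plane must therefore be constructed with the same XOR operation as the bit-plane to be coded. Fig. 10 shows the case in which bit-plane MSB-3 is coded by CAE after the XOR operation: because the bit-plane to be coded has been XORed, the reference bit-plane for MSB-3 is also formed by applying the XOR operation to the bit-planes of the reference image.

The bit-plane unit decoding of Fig. 7 also uses the binary shape coding method [12] of MPEG-4 Part-2 Visual (ISO/IEC 14496-2) [11], as shown in Fig. 11. The blocks of Fig. 11 operate as follows.
(1) The reference bit-plane is obtained from the reconstructed image by bit-plane separation and the XOR operation, and is supplied to the CAE decoding.
(2) Demultiplexing. If the mode is all_0 or all_1, same-level block decoding is performed. If the mode is intraCAE or another CAE mode, CAE decoding is performed [13].
(3) Same-level block decoding. If the mode is all_0, all pixel values of the block are set to 0 (0). If the mode is all_1, they are set to 255 (1).
(4) CAE decoding reconstructs the binary image of the bit-plane block. If the bit-plane was XORed in the encoder, the XOR operation is applied again to restore it.

[Fig. 11 diagram: bitstream -> demultiplexing -> CAE decoding or same-level block decoding -> reconstructed bit-plane block; reference path: reconstructed image -> bit-plane separation -> XOR operation -> CAE decoding]
Fig. 11. Decoder block diagram of the bit-plane decoding method

Fig. 12 shows the decoder of the proposed method. It consists of the bit-plane decoding path of Fig. 12(C) and the DCT-based decoding path of Fig. 12(D). The path of Fig. 12(C) includes the XOR operation.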
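The decoding steps of Fig. 11 and the XOR step of Fig. 12(C) can be sketched together in Python. This is a sketch under stated assumptions rather than the actual decoder: CAE decoding is replaced by passing the already decoded binary plane as `payload`, the XOR is assumed to be taken with the previously reconstructed more significant plane, and the names `decode_plane` and `decode_block` are hypothetical.

```python
def decode_plane(mode, payload, rows, cols):
    """Rebuild one coded plane; `payload` stands in for the CAE decoder output."""
    if mode == "all_0":
        return [[0] * cols for _ in range(rows)]
    if mode == "all_1":
        return [[1] * cols for _ in range(rows)]
    return [row[:] for row in payload]


def decode_block(coded, xor_flags, rows, cols, n_bits=8):
    """Decode planes (MSB first), undo the XOR, and rebuild N-bit depth values."""
    planes = []
    for (mode, payload), flag in zip(coded, xor_flags):
        plane = decode_plane(mode, payload, rows, cols)
        if flag and planes:                   # XOR with the previous restored plane
            previous = planes[-1]
            plane = [[a ^ b for a, b in zip(row, prev_row)]
                     for row, prev_row in zip(plane, previous)]
        planes.append(plane)
    block = [[0] * cols for _ in range(rows)]
    for plane in planes:
        for r in range(rows):
            for c in range(cols):
                block[r][c] = (block[r][c] << 1) | plane[r][c]
    shift = n_bits - len(planes)              # uncoded lower planes are set to 0
    return [[value << shift for value in row] for row in block]


# Toy 1x2 boundary block (127, 128) as an XOR-based encoder might signal it.
coded = [("intraCAE", [[0, 1]]), ("all_1", None)] + [("all_0", None)] * 6
xor_flags = [False] + [True] * 7
decoded = decode_block(coded, xor_flags, rows=1, cols=2)
partial = decode_block(coded[:2], xor_flags[:2], rows=1, cols=2)
print(decoded, partial)
```

In this example only the MSB plane needs CAE; the other seven planes are carried by mode information and XOR flags. Decoding just the first two planes gives a coarser block whose lower bits are zero.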
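[Fig. 12 diagram. Bitstream -> demultiplexing. (C) Bit-plane decoding: bit-plane decoding -> XOR operation -> bit-plane combination -> depth-map construction. (D) DCT-based decoding: entropy decoding, inverse quantization, inverse transform, intra prediction (intra), motion compensation (inter), deblocking filter, reconstructed image buffer, reference image buffer, image buffer. Output: reconstructed depth-map block]
Fig. 12. Decoder block diagram of the proposed method

In Fig. 12(C), the m coded bit-planes (m: number of coded bit-planes) are decoded by the bit-plane decoding method of Fig. 11. The XOR operation is applied to each of the m bit-planes whose XOR flag indicates that it was XORed. The bit-planes are then combined into an M-bit block, and the M-bit block is expanded to the N-bit depth-map block by setting the remaining lower bit-planes to 0 (0). Depending on the coding mode of the block, either Fig. 12(D) or Fig. 12(C) is used. Fig. 12(D) is the DCT-based decoding (H.264/AVC), which consists of entropy decoding, inverse quantization, inverse transform, prediction, and deblocking filtering.

V.

The experiments used the H.264/AVC reference software JM (Joint Model) 13.2 [14].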
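[Fig. 13 panels: real images of (a) breakdancers, (b) ballet, (c) champagne_tower; depth-map images of (d) breakdancers, (e) ballet, (f) champagne_tower]
Fig. 13. Real images and depth-map images

The proposed method was compared with the DCT-based method (H.264/AVC) and the adaptive block-based method [7] on the sequences of Fig. 13. Table 1 lists the test conditions. Two configurations were tested: the I-P-P-P structure with CAVLC, and the Hierarchical B structure with CABAC (Context-based Adaptive Binary Arithmetic Coding) [15]. Bit-planes were coded with CAE.

Table 1. Test conditions
Sequences:          Ballet, Breakdancers (1024x768, 15 Hz); Champagne Tower (1280x960, 30 Hz)
Number of frames:   100
Color format:       YUV 4:0:0
QP:                 22, 27, 32, 37
Coding structure:   I-P-P-P, Hierarchical B
Entropy coding:     CAVLC, CABAC, CAE

Figs. 14, 15, and 16 show the RD (rate-distortion) curves, in terms of PSNR (Peak Signal-to-Noise Ratio), of the DCT-based method (H.264/AVC), the adaptive block-based method [7], and the proposed method for the Breakdancers, Ballet, and Champagne_tower sequences. Table 2 summarizes the results of Figs. 14, 15, and 16.

In terms of PSNR, measured by BD-PSNR [16], the proposed method improves on the DCT-based method by an average of 1.3 dB with I-P-P-P and 1.0 dB with Hierarchical B. It improves on the adaptive block-based method [7] by an average of 0.8 dB with I-P-P-P and 0.6 dB with Hierarchical B. Bit-rate savings were measured by BD-bitrate [16].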
Fig. 14. Comparison of PSNR results for the depth-map (view #1) of the Breakdancers sequence: left (CAVLC), right (CABAC)
Fig. 15. Comparison of PSNR results for the depth-map (view #1) of the Ballet sequence: left (CAVLC), right (CABAC)
Fig. 16. Comparison of PSNR results for the depth-map (view #39) of the Champagne_tower sequence: left (CAVLC), right (CABAC)
Table 2. Coding efficiency comparison of the depth-map coding methods (gain of the proposed method over each reference method)

                   CAVLC, I-P-P-P                                CABAC, Hierarchical B
                   vs. DCT (H.264/AVC)    vs. [7]                vs. DCT (H.264/AVC)    vs. [7]
Sequence           BD-PSNR   BD-rate      BD-PSNR   BD-rate      BD-PSNR   BD-rate      BD-PSNR   BD-rate
                   (dB)      (%)          (dB)      (%)          (dB)      (%)          (dB)      (%)
ballet             1.5       -20.8        0.8       -11.7        0.9       -13.3        0.5       -7.7
breakdancers       1.3       -18.6        0.8       -10.3        0.9       -14.5        0.5       -8.5
champagne tower    1.0       -11.8        0.7       -8.5         1.1       -16.9        0.8       -12.2
Average            1.3       -17.1        0.8       -10.2        1.0       -14.9        0.6       -9.5

Compared with the DCT-based method (H.264/AVC), the proposed method reduces the bit-rate by an average of 17.1% with I-P-P-P and 14.9% with Hierarchical B. Compared with the adaptive block-based method [7], it reduces the bit-rate by an average of 10.2% with I-P-P-P and 9.5% with Hierarchical B.

Table 3 compares the PSNR of the synthesized images for the DCT-based method, the adaptive block-based method [7], and the proposed method. The PSNR differences are small: for each sequence, all values in Table 3 lie within 1 dB of one another.

Table 3. Comparison of the PSNR (dB) of the synthesized images

                      Ballet                        Breakdancers                  Champagne tower
QP                    H.264/AVC  [7]     Proposed   H.264/AVC  [7]     Proposed   H.264/AVC  [7]     Proposed
Original depth-map    29.98                         29.76                         32.06
CAVLC, I-P-P-P
37                    29.27      29.23   29.18      28.94      28.88   28.87      32.05      32.07   32.17
32                    29.57      29.54   29.54      29.03      29.00   29.00      32.14      32.23   32.27
27                    29.74      29.76   29.78      29.08      29.05   29.04      32.18      32.19   32.19
22                    29.82      29.84   29.86      29.12      29.12   29.12      32.11      32.15   32.11
CABAC, Hierarchical B
37                    29.38      29.53   29.30      28.89      28.88   28.86      32.23      32.26   32.24
32                    29.58      29.66   29.64      29.03      29.03   28.99      32.28      32.34   32.37
27                    29.76      29.78   29.79      29.09      29.10   29.06      32.28      32.28   32.28
22                    29.81      29.87   29.86      29.12      29.14   29.11      32.24      32.21   32.16
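For reference, the PSNR reported in Tables 2 and 3 follows the usual definition for 8-bit images, PSNR = 10 log10(255^2 / MSE). A minimal Python sketch is given below; the function name `psnr` and the sample pixel values are assumptions for illustration, and the BD-PSNR and BD-rate averaging of [16] is not reproduced here.

```python
import math


def psnr(reference, test, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized 2-D images."""
    squared_errors = [(a - b) ** 2
                      for ref_row, test_row in zip(reference, test)
                      for a, b in zip(ref_row, test_row)]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf                       # identical images
    return 10.0 * math.log10(peak * peak / mse)


reference = [[10, 20], [30, 40]]              # assumed pixel values
test = [[11, 19], [31, 39]]                   # every pixel is off by one
value = psnr(reference, test)
print(round(value, 2))
```

Because every pixel differs by exactly one level, the mean squared error is 1 and the result is 10 log10(65025), about 48.13 dB.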
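View synthesis is based on 3D warping, so the synthesized image does not coincide exactly with the real image. In Ballet and Breakdancers, in particular, the cameras are arranged along an arc. Fig. 17 illustrates this. The synthesized image (Fig. 17(b)) was generated with the MPEG view synthesis reference software (VSRS, View Synthesis Reference Software) [5,6]: the image of view 2 was synthesized from the images and depth-maps of views 1 and 3. Fig. 17(c) is the difference between the real image (Fig. 17(a)) and the synthesized image (Fig. 17(b)). Fig. 17(a) and Fig. 17(b) differ noticeably, as Fig. 17(c) shows. An objective measure (PSNR) is therefore not well suited to evaluating the synthesized images, which explains the small differences in Table 3.

For this reason the subjective quality of the images synthesized with the DCT-based method, the adaptive block-based method [7], and the proposed method was compared, as shown in Fig. 18. In Fig. 18, the image of view 2 was synthesized from the images and decoded depth-maps of views 1 and 3. With the DCT-based method (Fig. 18(c), 18(f), 18(i)), the object boundaries of the synthesized image are degraded. With the proposed method (Fig. 18(e), 18(h), 18(k)), the object boundaries are better preserved than with the DCT-based method.

[Fig. 17 panels: (a) real image of view 2; (b) synthesized image of view 2; (c) difference image of view 2]
Fig. 17. Comparison of the real image, the synthesized image, and the difference image of the two in the Breakdancers sequence (view #2)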
[Fig. 18 panels (a) to (i); panels (a), (d), and (g) are labeled DCT]
Fig. 18. Comparison of the subjective quality of the synthesized image (view #2) in the Breakdancers sequence

The subjective quality of the proposed method is similar to that of the adaptive block-based method [7] (Fig. 18(d), 18(g), 18(j)).

VI.
This paper proposed a depth-map coding method that codes object-boundary areas with a bit-plane unit coding method using the adaptive XOR operation and codes the interior of objects and the background with the DCT-based method. Compared with the DCT-based method (H.264/AVC) and the adaptive block-based method, the proposed method improved both the bit-rate and the PSNR, and the subjective quality of the synthesized images was better than with the DCT-based method.

References
[1] ISO/IEC JTC1/SC29/WG11, Draft Report on Experimental Framework for 3D Video Coding, N11478, Geneva, Switzerland, July 2010.
[2] M. Tanimoto, Overview of Free Viewpoint Television, Signal Processing: Image Communication, Vol. 21, No. 6, pp. 454-461, July 2006.
[3] A. Smolic, K. Mueller, N. Stefanoski, J. Ostermann, A. Gotchev, G.B. Akar, G.A. Triantafyllidis, and A. Koz, Coding Algorithms for 3DTV - A Survey, IEEE Trans. on Circuits and Systems for Video Technology, Vol. 17, No. 11, pp. 1606-1621, November 2007.
[4] ITU-T Recommendation H.264 and ISO/IEC 14496-10 (MPEG-4 Part 10 AVC), Advanced Video Coding for Generic Audiovisual Services, Version 1: March 2003, Version 2: May 2004, Version 3: March 2005, Version 4: September 2005, Version 5 and Version 6: June 2006, Version 7: April 2007, Version 8: July 2007.
[5] Y. Mori, N. Fukushima, T. Yendo, T. Fujii, and M. Tanimoto, View generation with 3D warping using depth information for FTV, Signal Processing: Image Communication, Vol. 24, No. 1-2, pp. 65-72, January 2009.
[6] M. Tanimoto and T. Fujii, View synthesis algorithm in view synthesis reference software 2.0 (VSRS 2.0), ISO/IEC JTC 1/SC29/WG11 M16090, Lausanne, Switzerland, February 2009.
[7] K.Y. Kim, G.H. Park, and D.Y. Suh, Adaptive Depth-map Coding for 3D-Video, IEICE Trans. on INF. & SYST., Vol. E93-D, No. 8, pp. 2262-2272, August 2010.
[8] K.Y. Kim, G.H. Park, and D.Y. Suh, Bitplane-based lossless depth-map coding, SPIE Optical Engineering, Vol. 49, No. 6, 067403, June 2010.
[9] T. Wiegand, H. Schwarz, A. Joch, F. Kossentini, and G.J. Sullivan, Rate-Constrained Coder Control and Comparison of Video Coding Standards, IEEE Transactions on Circuits and Systems for Video Technology, Vol. 13, No. 7, pp. 688-703, July 2003.
[10] J.R. Bitner, G. Ehrlich, and E.M. Reingold, Efficient generation of the binary reflected gray code and its applications, Commun. ACM, Vol. 19, No. 9, pp. 517-521, September 1976.
[11] ISO/IEC 14496-2 (MPEG-4 Visual), Coding of Audio-Visual Objects - Part 2: Visual, Version 1: April 1999, Version 2: February 2000, Version 3: May 2004.
[12] N. Brady and F. Bossen, Shape compression of moving objects using context-based arithmetic encoding, Signal Processing: Image Communication, Vol. 15, No. 7, pp. 601-617, May 2000.
[13] N. Brady, F. Bossen, and N. Murphy, Context-based arithmetic encoding of 2D shape sequences, Special session on shape coding (ICIP '97), Vol. 1, pp. 29-32, 1997.
[14] F. Heinrich-Hertz-Institut, H.264 Reference Software Version JM13.2, http://iphome.hhi.de/suehring/tml, May 2008.
[15] D. Marpe, H. Schwarz, and T. Wiegand, Context-Based Adaptive Binary Arithmetic Coding in the H.264/AVC Video Compression Standard, IEEE Transactions on Circuits and Systems for Video Technology, Vol. 13, No. 7, pp. 620-636, July 2003.
[16] G. Bjøntegaard, Calculation of average PSNR differences between RD-curves, ITU-T SG16 Q.6, VCEG-M33, Texas, USA, April 2001.
- 2007.2:
- 2009.2:
- 2009.3 ~ :

- 1985.2:
- 1987.7:
- 1991.1: Case Western Reserve University, Dept. of EEAP
- 1995.1: Case Western Reserve University, Dept. of EEAP
- 1995.3 ~ 1997.2:
- 1997.3 ~ 2001.2:
- 2001.3 ~ :