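MPEG 3D Video Compression Coding Standardization Activities
  2010. 08. 18.  Yo-Sung Ho
Outline
  Advances in broadcasting technology and 3D TV
  MPEG standardization activities for 3D video compression
    3D Audio Visual (3DAV)
    Multi-view Video Coding (MVC)
    Free Viewpoint TV (FTV)
    3D Video Coding (3DVC)
  Concluding remarks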
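Trends in Broadcasting Technology
  [Chart: quality versus interactivity over time]
  Time periods: up to 2001; 2002 to 2006; 2007 to 2011; 2012 onward (high-quality realistic TV)
  Technologies shown: audio (Radio, DAB); black-and-white TV; color TV; digital TV; DMB; DCATV; IPTV; intelligent broadcasting; digital HDTV; UDTV broadcasting; stereoscopic TV broadcasting; realistic broadcasting
  Interactivity stages: simple viewing; information selection; information customization; information creation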
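The Revolution in TV Technology
  3DTV; black-and-white TV; color TV; digital TV; HDTV; UHDTV
Next-Generation Broadcasting Technology
  Realistic broadcasting service = high-quality broadcasting + 3D broadcasting
The Era of 3D Video
  Commercialization of 3D display devices
  Box-office success of 3D movies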
3D Audio-Visual
3D Audio-Visual (3DAV)
  3D Video
    Representation methods and coding formats for 3D video
    Includes geometric information about the image acquisition devices
  Key features
    Interactivity: free change of viewpoint
    Photo-realistic images
3DAV Standardization Activities
  Timeline
    2001/12  First Proposal on 3D Video
    2002/05  EEs on 3DAV
    2002/12  3DAV Seminar
    2003/10  CfC on 3DAV
    2004/10  CfE on MVC
  3DAV activities on 3D video
    Applications and requirements
    Representation format and camera parameters
    Exploration Experiments (EEs) on 3DAV
3DAV Exploration Experiments
  Tested interoperability with existing MPEG technologies
  EE1: Omni-directional video
  EE2: FTV / free viewpoint video
  EE3: Stereoscopic video (coding efficiency test)
  EE4: Stereoscopic video (depth-based rendering)
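EE1: Omni-directional Video
  Studied methods for representing omni-directional video using a 3D mesh
  Representation format = video + texture mapping information
  [Figure labels: mirror, camera, polyhedral mesh model, omni-directional camera, omni-directional image]
Omni-directional Video
  [Figure labels: hyperboloidal mirror, focal point, mirror, image plane, camera, camera center, video cameras, omni-directional image]
Concentric Mosaic
  Extension of the planar 2D image plane to a spherical or cylindrical image plane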
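EE2: FTV / FVV
  Free viewpoint TV / free viewpoint video
  Image-based rendering: narrow camera spacing
  Model-based rendering: wide camera spacing
Free Viewpoint Video
Ray-Space Representation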
EE3: Stereoscopic Video
  [Figure labels: depth range, positive parallax, screen, negative parallax, convergence point, left image, right image]
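The parallax terms on this slide can be illustrated with a short sketch. It assumes a simple viewing geometry that the slide does not state: two eyes separated by `eye_sep`, a screen at distance `screen_dist`, and a point perceived at distance `point_dist`. Under that assumption, similar triangles give the screen parallax as eye_sep * (point_dist - screen_dist) / point_dist. The function names are hypothetical.

```python
def screen_parallax(eye_sep, screen_dist, point_dist):
    """Horizontal screen parallax of a point perceived at point_dist.

    All arguments share one length unit. By similar triangles:
    parallax = eye_sep * (point_dist - screen_dist) / point_dist
    """
    if eye_sep <= 0 or screen_dist <= 0 or point_dist <= 0:
        raise ValueError("all distances must be positive")
    return eye_sep * (point_dist - screen_dist) / point_dist


def parallax_type(parallax, eps=1e-9):
    """Name the sign of the parallax using the terms on the slide."""
    if parallax > eps:
        return "positive parallax (behind the screen)"
    if parallax < -eps:
        return "negative parallax (in front of the screen)"
    return "zero parallax (on the screen plane)"


if __name__ == "__main__":
    # Assumed example values: 6.5 cm eye separation, screen at 200 cm.
    for z in (100.0, 200.0, 400.0):
        p = screen_parallax(6.5, 200.0, z)
        print(z, p, parallax_type(p))
```

Points farther away than the screen yield positive parallax and points nearer than the screen yield negative parallax, which matches the depth range drawn around the screen in the figure.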
Stereoscopic Video Coding
  Experiment on stereoscopic video coding performance using the MPEG-4 MAC system
  [Block diagram]
    Encoder side: left image, right image, disparity estimation, disparity map, AC[0], MPEG-4 encoder
    Decoder side: MPEG-4 decoder, reconstructed left image, reconstructed disparity map, reconstructed right image
    Quality measure: PSNR
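The diagram names PSNR as the quality measure. As a reminder of what that number means, here is a minimal sketch of PSNR between an original and a reconstructed image. It assumes 8-bit samples (peak value 255) stored as nested lists; the function name is hypothetical and this is not the evaluation tool used in the experiment.

```python
import math


def psnr(original, reconstructed, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    flat_o = [p for row in original for p in row]
    flat_r = [p for row in reconstructed for p in row]
    if len(flat_o) != len(flat_r) or not flat_o:
        raise ValueError("images must be non-empty and have the same size")
    # Mean squared error over all samples.
    mse = sum((a - b) ** 2 for a, b in zip(flat_o, flat_r)) / len(flat_o)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)
```

In a setup like the one on the slide, the measure would typically be taken between the input right image and the reconstructed right image, although the slide does not spell this out.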
2D Video + Depth Map
  [Figure panels: color video, depth map]
EE4: Depth Map Coding
  Depth map coding experiment using the MPEG-4 system
  [Figure panels: color image, mask, depth map]
Multi-view Video Coding
Multi-view Video Coding (MVC)
  [Diagram] Multiple cameras -> multiview video encoder -> MVC bitstream -> multiview video decoder -> decoded multiple views -> displays
    Scope of MPEG standardization: marked on the diagram
    Displays: 2D TV/HDTV, stereo display, multi-view display
    Camera layout drawn on x and z axes, viewed from the Y axis
MVC Standardization Activities
  Timeline
    2005/01  First Draft CfP
    2005/04  Second Draft CfP
    2005/07  CfP on MVC
    2006/01  Evaluation of Proposals
    2006/04  Core Experiments
    2006/07  MVC Work in JVT
  Phases: MVC work in MPEG, followed by MVC work in JVT
  Activities: test data, fixing test conditions, MVC work
Multi-view Video Camera Arrangements
  1D parallel with 8 cameras
  2D parallel with 5 cameras
  Convergent with 4 cameras
  1D arc with 8 cameras
  2D array with 128 cameras
  Divergent with 4 cameras
Application: 3DTV
  [Diagram] VIEW-1, VIEW-2, VIEW-3, ..., VIEW-N -> multi-view video encoder -> channel -> multi-view video decoder -> displays
    Displays: TV/HDTV, stereo system, multi-view 3DTV
Application: Free Viewpoint TV
  Provides the ability to change viewpoint freely
  Multiple views are available
  Renders one view (real or virtual) to a legacy 2D display
  Useful for surveillance, broadcast TV, and stored interactive video
MVC Technical Issues
  Image acquisition
    N cameras (N = 2 to 128 or more)
    How to calibrate multiple cameras?
    How to translate and rotate each camera?
    An elaborate camera control system (hardware) is required
    All camera parameters should be stored
  Data size
    Huge amount of data
    Raw data rate with no compression (example): XGA (1024 x 768) color video, 8 views, 30 fps, 10 sec.
      1024 x 768 x 3 bytes (R, G, and B) x 8 views x 30 fps x 10 sec. = 5,662,310,400 bytes (about 5.66 Gbytes)
      About 8 CDs are required for one multi-view video
    What if the image resolution is HD (1920 x 1080)?
      Over 14 Gbytes for a single 10 sec. multi-view video
      More than 3 DVDs for storage only
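The raw-size arithmetic on this slide can be reproduced with a few lines. The sketch below multiplies out the figures given on the slide; the disc capacities (700 MiB per CD, 4.7 x 10^9 bytes per DVD) are assumptions added here, and the function names are hypothetical.

```python
def raw_size_bytes(width, height, views, fps, seconds, bytes_per_pixel=3):
    """Uncompressed size of a multi-view sequence in bytes."""
    return width * height * bytes_per_pixel * views * fps * seconds


def discs_needed(size_bytes, disc_capacity_bytes):
    """Smallest whole number of discs that can hold size_bytes."""
    return -(-size_bytes // disc_capacity_bytes)  # integer ceiling


CD_BYTES = 700 * 1024 * 1024   # assumed capacity: 700 MiB
DVD_BYTES = 4_700_000_000      # assumed capacity: 4.7 GB single layer

# The two examples from the slide: 8 views, 30 fps, 10 seconds.
xga = raw_size_bytes(1024, 768, views=8, fps=30, seconds=10)
hd = raw_size_bytes(1920, 1080, views=8, fps=30, seconds=10)

if __name__ == "__main__":
    print(xga, discs_needed(xga, CD_BYTES))
    print(hd, discs_needed(hd, DVD_BYTES))
```

Under these capacity assumptions the 1024 x 768 example fills 8 CDs and the HD example needs a fourth DVD, consistent with the "8 CDs" and "more than 3 DVDs" statements.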
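MVC Technical Issues
  Color mismatch between views
  Synchronization between cameras
  Transmission of a huge amount of data
  Real-time rendering and fast algorithms
  Free viewpoint rendering using multi-view video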
MVC Requirements
  Requirements for MVC, each classified on the slide as shall (mandatory) or should (desirable)
    Compression efficiency
    View scalability / free viewpoint scalability
    Backward compatibility
    Low delay
    Resolution, bit depth, chroma sampling format
    Temporal random access / view random access
    Resource management
    Parallel processing
    Spatial / temporal / SNR scalability
    Resource consumption
    Robustness
    Picture quality among views
    Spatial random access
MVC Requirements
  System requirements
    shall (mandatory): synchronization
    should (desirable): view generation; non-planar imaging and display systems; camera parameters
Multi-view Video Sequences
  Test sequences (8)
  Data set             | Sequences         | Image property                     | Camera arrangement
  ---------------------+-------------------+------------------------------------+------------------------------------------------------------------------
  MERL (m12077)        | Ballroom and Exit | 640x480, 25 fps (rectified)        | 8 cameras with 20 cm spacing; 1D/parallel
  HHI (m11894)         | Uli               | 1024x768, 25 fps (non-rectified)   | 8 cameras with 20 cm spacing; 1D/parallel convergent
  KDDI (m10533)        | Race1             | 640x480, 30 fps (non-rectified)    | 8 cameras with 20 cm spacing; 1D/parallel
  KDDI (m10533)        | Flamenco2         | 640x480, 30 fps (non-rectified)    | 5 cameras with 20 cm spacing; 2D/parallel (cross)
  Microsoft Research   | Breakdancers      | 1024x768, 15 fps (non-rectified)   | 8 cameras with 20 cm spacing; 1D/arc
  Nagoya Univ. (m12022)| Rena              | 640x480, 30 fps (rectified)        | 100 cameras with 5 cm spacing; 1D/parallel
  Nagoya Univ. (m12022)| Akko&Kayo         | 640x480, 30 fps (non-rectified)    | 100 cameras with 5 cm horizontal and 20 cm vertical spacing; 2D array
MVC Prediction Structure
  MVC using H.264/AVC
    Fully compatible with H.264/MPEG-4 AVC
    Uses hierarchical B pictures combined in the inter-view and temporal dimensions
    Reorganizes the input images into a single stream prior to encoding
  Inter-view-temporal prediction structure based on AVC, using hierarchical B pictures
  Associated reordering of the multi-view input for compression with AVC
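To make the term "hierarchical B pictures" concrete, the sketch below lists one common dyadic coding order for a single view. It is only an illustration under stated assumptions: the group-of-pictures size is a power of two, frame 0 is the previous key picture, and the inter-view references of the actual MVC structure are not modeled. The function name is hypothetical.

```python
def hierarchical_b_order(gop_size):
    """Return (frame, temporal_level, reference_frames) in coding order.

    Frame 0 is assumed to be the already coded previous key picture.
    Level 0 is the key picture; each deeper level holds B pictures that
    are predicted from the two nearest pictures of shallower levels.
    """
    if gop_size < 1 or gop_size & (gop_size - 1):
        raise ValueError("gop_size must be a power of two")
    order = [(gop_size, 0, (0,))]  # key picture, predicted from frame 0
    step, level = gop_size, 1
    while step > 1:
        half = step // 2
        for mid in range(half, gop_size, step):
            order.append((mid, level, (mid - half, mid + half)))
        step, level = half, level + 1
    return order


if __name__ == "__main__":
    for frame, level, refs in hierarchical_b_order(8):
        print(frame, level, refs)
```

For a group of 8 pictures this codes the key picture first, then the middle picture, then the quarter positions, and finally the odd frames, so every B picture has both of its references available when it is coded.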
Coding Results (1)
  Performance curves
    Anchor: AVC anchor coding results, as described in the CfP (green curve)
    Simulcast: results using hierarchical B pictures only (blue curve)
    Inter-view prediction: results of the MVC reference model combining inter-view prediction with hierarchical B pictures (red curve)
  [Plots: Ballroom, Exit]
Coding Results (2)
Coding Results (3)
Summary: Coding Results
3D Video Coding
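3D Video (3DV)
  Main functions of 3DV
    Free selection of the viewing position
    Use of auto-stereoscopic display devices
  Coding technology for multi-view video and depth
    Uses multi-view depth images
  Generation of virtual view images that match the display format
    Viewpoint shifting using 3D warping
Basic Structure of 3DV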
Vision of 3D Video
  3D video format
    Includes more advanced, interoperable technologies that support multi-view displays as well as stereo displays
    Must serve both stereo displays and multi-view displays using a limited number of cameras
  [Diagram] Limited camera inputs -> data format at a constrained rate (based on distribution) -> displays
    Stereoscopic displays (left, right): variable stereo baseline; adjustable depth perception
    Auto-stereoscopic N-view displays: wide viewing angle; large number of output views
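2D + Depth vs. Multi-view Video
  Performance comparison of 2D + depth and multi-view video
    2D + depth: compatible with current video formats, but the viewing angle is narrow and occlusion areas are difficult to handle
    Multi-view video: provides a wide viewing angle, but the amount of data grows in proportion to the number of cameras, so efficient coding is essential
  3DV should be compatible with
    Existing standards
    Mono and stereo devices
    Existing or planned infrastructure
  [Chart: bit rate versus 3D rendering capability, with points for 2D, 2D+Depth, 3DV, MVC, and Simulcast]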
3DV Standardization Activities
  Timeline marks: 2007/04, 2008/01, 2008/04, 2008/07, 2010/04
  Milestones: Request for FTV Work; Call for 3D Test Data; EEs on 3DV; Vision on FTV/3DV; Preparing 3DV for CfP
  Activities
    Applications and requirements on 3DV
    Viewing test for evaluation
    Updating DERS and VSRS
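Main Functions of 3D Video
  Free selection of the viewing position
  Stereoscopic rendering on multi-view display devices
  [Diagram] Decoded MVD data (V1 + D1, V5 + D5, V9 + D9) -> DIBR -> views V1 to V9 -> MV 3D display
    Viewer positions Pos1, Pos2, and Pos3 each receive a left (L) and a right (R) view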
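Key Issues in FTV Standardization
  FTV data format: hardware-independent FTV data format
  Decoder module: light decoder
  Interpolation module: to guarantee QoS (Quality of Service)
Status of FTV Standardization
  Currently building the environment for technical evaluation toward standardization
  Reference software for the key technologies has been requested
  Call for Test Material (CfT)
    3DV needs multi-view video and its depth video
    Test data: multi-view video
    Depth map generation: DERS
    View synthesis: VSRS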
Multi-view Video Acquisition Process
  [Diagram steps: camera setting, camera calibration, capturing, color correction, image rectification, image cropping, multi-view video]
  Multiple camera array
    1D parallel camera rig
    Camera distance: 5 to 6.5 cm
3D Video Test Sequences
  13 sequences from 6 organizations
  Sequence        | Provider     | Views | Resolution | Remarks
  ----------------+--------------+-------+------------+------------------------
  Pantomime       | Nagoya Univ. | 80    | 1280x960   |
  Champagne_tower | Nagoya Univ. | 80    | 1280x960   |
  Book_arrival    | HHI          | 16    | 1024x768   |
  Lovebird1       | ETRI         | 12    | 1024x768   |
  Newspaper       | GIST         | 9     | 1024x768   |
  Mobile          | Philips      | -     | 720x540    |
  Beer Garden     | Philips      | -     | 1920x1080  |
  Kendo           | Nagoya Univ. | 8     | 1024x768   | Moving camera
  Balloons        | Nagoya Univ. | 8     | 1024x768   | Moving camera
  Café            | GIST         | 5     | 1920x1080  | Multiview color + depth
  Poznan_hall     | Poznan Univ. | 9     | 1920x1080  | Moving camera
  Poznan_street   | Poznan Univ. | 9     | 1920x1080  |
  Poznan_carpark  | Poznan Univ. | 9     | 1920x1080  |
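3DV Test Sequence Download
  Access information for the 3DV test sequences
  Sequence        | Provider     | Download site                                              | Access
  ----------------+--------------+------------------------------------------------------------+---------------------------------
  Pantomime       | Nagoya Univ. | http://www.tanimoto.nuee.nagoya-u.ac.jp/mpeg/mpeg_ftv.html | usr: mpegftv, password: fngoyftv
  Champagne_tower | Nagoya Univ. | http://www.tanimoto.nuee.nagoya-u.ac.jp/mpeg/mpeg_ftv.html | usr: mpegftv, password: fngoyftv
  Book_arrival    | HHI          | ftp.hhi.de/hhimpeg3dv/                                     | Usr: mpeg3dv, Pwd: Cah#K9xu
  Lovebird1       | ETRI         | ftp://203.253.128.142                                      | Usr: 3DV, Pwd: 3dvkr
  Newspaper       | GIST         | ftp://203.253.128.142                                      | Usr: 3DV, Pwd: 3dvkr
  Mobile          | Philips      | ftp.ehv.campus.philips.com/ee4                             | Pwd: Dmki724
  Beer Garden     | Philips      | ftp.ehv.campus.philips.com/ee4                             | Pwd: Dmki724
  Kendo           | Nagoya Univ. | (TBD)                                                      | (TBD)
  Balloons        | Nagoya Univ. | (TBD)                                                      | (TBD)
  Café            | GIST         | ftp://203.253.128.142                                      | Usr: 3DV, Pwd: 3dvkr
  Poznan_hall     | Poznan Univ. | (TBD)                                                      | (TBD)
  Poznan_street   | Poznan Univ. | (TBD)                                                      | (TBD)
  Poznan_carpark  | Poznan Univ. | (TBD)                                                      | (TBD)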
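3DV Exploration Experiments
  EE1: Depth Estimation
    Development of depth map generation methods
    Depth estimation reference software (DERS) 5.0 is currently distributed; performance is still being improved
  EE2: View Synthesis
    Intermediate view synthesis using depth images
    View synthesis reference software (VSRS) 3.5 is currently distributed
  EE3: Layered Depth Image
    Intermediate view synthesis using layered depth images
    Currently discontinued
  EE4: Coding Experiments
    Preliminary experiments for the Call for Proposals (CfP) are in progress
    More test sequences are being collected for these experiments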
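3DV Standardization Plan
  91st meeting (January 2010)
    Depth estimation and intermediate view synthesis experiments
    Coding experiments on the test sequences
    Re-evaluation of all test sequences
  93rd meeting (July 2010)
    Final selection of test sequences (color video + depth video)
    Preliminary anchor coding
  94th meeting (October 2010)
    Completion of anchor coding for the test sequences
    Draft CfP
  95th meeting (January 2011)
    Final CfP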
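Depth Map Generation
  Depth estimation using images from two or more viewpoints
  Multiview depth map estimation
  [Diagram] Multiview camera system -> multiview images -> multiview depth map
Importance of Depth Estimation
  Essential for arbitrary view synthesis
  Practical limitations of free-viewpoint functionality
    Distance between cameras
    Huge cost of cameras
    Unnatural viewpoint change
Depth Map Generation Software
  Depth image
    Distance of each pixel from the camera
    Used to generate intermediate views for free viewpoint selection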
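A depth image stores distance as a sample value, so some mapping between the two is needed. The slide does not say which mapping is used; the sketch below assumes a commonly used 8-bit convention in which the value is linear in inverse distance between a near plane `z_near` and a far plane `z_far`, with 255 meaning nearest. The function names and the clipping-plane parameters are assumptions.

```python
def depth_value_to_distance(value, z_near, z_far):
    """Convert an 8-bit depth sample to a distance (assumed convention)."""
    if not 0 <= value <= 255:
        raise ValueError("value must be in 0..255")
    if not 0 < z_near < z_far:
        raise ValueError("need 0 < z_near < z_far")
    # Linear in 1/z: 255 maps to z_near, 0 maps to z_far.
    inv_z = (value / 255.0) * (1.0 / z_near - 1.0 / z_far) + 1.0 / z_far
    return 1.0 / inv_z


def distance_to_depth_value(z, z_near, z_far):
    """Inverse mapping, rounded to the nearest 8-bit sample."""
    if not 0 < z_near <= z <= z_far:
        raise ValueError("need 0 < z_near <= z <= z_far")
    scale = (1.0 / z - 1.0 / z_far) / (1.0 / z_near - 1.0 / z_far)
    return int(round(255.0 * scale))
```

Quantizing inverse distance spends more of the 256 levels on nearby objects, where a given depth error causes a larger shift in a synthesized view.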
Depth Map Generation Software
  Nagoya University integrated all of the software
    Temporal enhancement
    Segmentation-based depth estimation
    Semi-automatic depth estimation
  Uses a stereo matching algorithm
  Refines disparity values using a graph cuts algorithm
  DERS software download
    http://wg11.sc29.org/svn/repos/mpeg-4/test/trunk/3D/depth_estimation/DERS/DERS
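As a rough illustration of the stereo matching step, the sketch below estimates disparity by winner-take-all block matching with a sum of absolute differences (SAD) cost. This is a simplification chosen for brevity, not the DERS method: DERS refines disparity with graph cuts, which is not reproduced here. It assumes rectified grayscale images stored as nested lists, with a point at column x in the left image appearing at column x - d in the right image. The function name is hypothetical.

```python
def block_matching_disparity(left, right, max_disp, radius=1):
    """Per-pixel disparity of the left image by SAD block matching."""
    h, w = len(left), len(left[0])
    disp = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            best_cost, best_d = None, 0
            # Only test shifts that keep the matched pixel inside the image.
            for d in range(0, min(max_disp, x) + 1):
                cost = 0
                for dy in range(-radius, radius + 1):
                    for dx in range(-radius, radius + 1):
                        yy = min(max(y + dy, 0), h - 1)       # clamp rows
                        xl = min(max(x + dx, 0), w - 1)       # clamp columns
                        xr = min(max(x + dx - d, 0), w - 1)
                        cost += abs(left[yy][xl] - right[yy][xr])
                if best_cost is None or cost < best_cost:
                    best_cost, best_d = cost, d
            disp[y][x] = best_d
    return disp
```

Winner-take-all matching decides each pixel independently, which is one reason real systems add a global refinement such as the graph cuts step named on the slide.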
Temporal Consistency Enhancement
  Technique for increasing the temporal correlation of depth maps
  Depth values are estimated separately, frame by frame
  Different depth values for the same region cause flickering artifacts
  [Figure panels: 11th frame, 12th frame, 13th frame]
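One simple way to reduce this kind of flicker is to keep the previous depth wherever the scene has not changed. The sketch below is an assumption made for illustration, not the temporal enhancement implemented in DERS: it treats a pixel as static when its grayscale value changes by at most `threshold` between frames, and reuses the previous depth there. All names are hypothetical.

```python
def stabilize_depth(prev_color, cur_color, prev_depth, cur_depth, threshold=2):
    """Reuse the previous depth at pixels whose color did not change."""
    out = []
    for y in range(len(cur_depth)):
        row = []
        for x in range(len(cur_depth[y])):
            static = abs(cur_color[y][x] - prev_color[y][x]) <= threshold
            # Static pixel: keep the old depth to avoid flicker.
            # Moving pixel: trust the newly estimated depth.
            row.append(prev_depth[y][x] if static else cur_depth[y][x])
        out.append(row)
    return out
```

A hard switch like this can leave stale depth behind slow motion, so the threshold trades flicker suppression against responsiveness.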
Depth Map Generation Using Image Segmentation
  Techniques applied
    Mean shift algorithm
    Pyramid segmentation
    K-means clustering
  [Figure panels: mean shift algorithm, pyramid segmentation, K-means clustering]
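Of the three segmentation techniques listed, K-means is the easiest to show compactly. The sketch below clusters scalar intensities into `k` groups with Lloyd's iteration; the evenly spaced initial centers and the fixed iteration count are choices made here for determinism, not details from the slide, and the function name is hypothetical.

```python
def kmeans_1d(values, k, iterations=20):
    """Cluster scalar intensities; returns (labels, centers)."""
    if k < 1 or not values:
        raise ValueError("need k >= 1 and at least one value")
    lo, hi = min(values), max(values)
    # Deterministic start: centers spread evenly over the value range.
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: nearest center for every value.
        labels = [min(range(k), key=lambda c: abs(v - centers[c]))
                  for v in values]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centers[c] = sum(members) / len(members)
    return labels, centers
```

In segmentation-based depth estimation, pixels that share a label can then be encouraged to take similar depth values, which helps keep depth edges aligned with object boundaries.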
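Referencing the Depth of Adjacent Views
  The depth image already estimated for an adjacent view is used when estimating the depth image of the current view
  [Figure labels: depth image of the target view, adjacent view, current view]
Intermediate View Generation
  An intermediate view is generated from two views
  Reference views: #1, 3, 5, 7; synthesized views: #2, 4, 6
  Synthesized images are evaluated on a stereo monitor
  [Diagram] View #i-1 -> view synthesis -> virtual view <- View #i -> view synthesis -> virtual view <- View #i+1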
View Synthesis Procedure
  DIBR-based view synthesis method
  Inputs: texture image and depth image of the left view and of the right view
  Steps
    Depth preprocessing
    Depth-based 3D warping (left and right)
    Depth-based histogram matching (left and right)
    Base and assistant view blending
    Depth-based in-painting
  Output: final synthesized view
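The blending step can be pictured with a small sketch. It assumes both reference views have already been warped to the virtual viewpoint, that pixels with no warped sample are marked `None`, and that the two views are mixed by the position `alpha` of the virtual camera between the left (0.0) and right (1.0) cameras. The actual base/assistant blending rule in VSRS may differ; the names here are hypothetical.

```python
def blend_warped_rows(left_warped, right_warped, alpha):
    """Blend two warped rows; None marks a hole in a warped view."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    out = []
    for l, r in zip(left_warped, right_warped):
        if l is None and r is None:
            out.append(None)          # still a hole: left for in-painting
        elif l is None:
            out.append(r)             # only the right view sees this pixel
        elif r is None:
            out.append(l)             # only the left view sees this pixel
        else:
            out.append((1.0 - alpha) * l + alpha * r)
    return out
```

Pixels that remain `None` after blending are the ones the in-painting step of the procedure has to fill.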
3D Warping
  Depth-based 3D warping
    Direct texture warping produces contour artifacts caused by round-off errors
    The depth image is 3D warped instead of the texture image
    Median filtering is applied to the 3D-warped depth image
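The idea of warping the depth image first and then median filtering it can be sketched on a single image row. The sketch assumes a 1D parallel camera setup, so warping reduces to a horizontal shift; `disparity_of` is a hypothetical callback that converts a depth sample to an integer shift, larger depth values are assumed to be nearer, and the median is applied only at hole pixels, which is a simplification of filtering the whole warped depth image.

```python
from statistics import median

HOLE = -1  # marker for target pixels that received no sample


def warp_depth_row(depth_row, disparity_of, direction=1):
    """Forward-warp one depth row; the nearest sample wins a conflict."""
    width = len(depth_row)
    warped = [HOLE] * width
    for x, value in enumerate(depth_row):
        tx = x + direction * disparity_of(value)
        # Larger value is assumed nearer, so it overwrites farther samples.
        if 0 <= tx < width and value > warped[tx]:
            warped[tx] = value
    return warped


def median_fill_row(row, radius=1):
    """Fill holes with the median of the valid neighbors in a window."""
    out = list(row)
    for x, value in enumerate(row):
        if value == HOLE:
            window = row[max(0, x - radius): x + radius + 1]
            valid = [v for v in window if v != HOLE]
            if valid:
                out[x] = int(median(valid))
    return out
```

On a slanted surface neighboring samples land on non-adjacent target pixels, leaving one-pixel cracks; filling them in the depth domain gives a complete depth map from which texture can then be fetched for every target pixel.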
View Synthesis Software
  Two approaches
    3D warping based view synthesis: VSRS 1.0 -> VSRS 2.0
    Disparity based view synthesis: ViSBD 1.0 -> ViSBD 2.1
  Integrated version: VSRS 3.0
  Latest version: VSRS 3.5
  Download: http://wg11.sc29.org/svn/repos/mpeg-4/test/tags/3D/view_synthesis/VSRS_3_5
                            | General Mode      | 1D Mode
  --------------------------+-------------------+------------------------------------
  Provider of prototype     | Nagoya University | Thomson
  Viewpoint shifting method | 3D warping        | Disparity based viewpoint shifting
  Sub-pel precision         | Valid             | Valid
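Summary
  3D Audio-Visual (3DAV)
    Audio and video technologies that provide a three-dimensional, stereoscopic experience
    Omni-directional video and stereoscopic video coding
  Multi-view Video Coding (MVC)
    Various techniques that exploit inter-view correlation were examined
    The hierarchical B picture prediction structure is used as the basis
  3D Video (3DV)
    3D video coding that includes depth maps
    Depth map estimation and intermediate view synthesis techniques are under study
    A standard is expected to be established within the next two years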
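Thank You
  Prof. Yo-Sung Ho: 062-970-2211, 010-3162-3669, hoyo@gist.ac.kr, http://vclab.gist.ac.kr/
  Realistic Broadcasting Research Center: 062-970-2263, http://rbrc.gist.ac.kr/
  Gwangju Institute of Science and Technology (GIST): http://www.gist.ac.kr/
References
  Dooyangsa: 02-3417-4417, www.dooyangsa.co.kr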