Microsoft PowerPoint - Image Processing_Standards.ppt


International Standards for Image/Video Coding
Committee

Multimedia Everywhere
Towards Multimedia: Computer, Consumer Electronics, Telecommunication, Broadcasting

Still Picture Compression Standards
- 1980: ITU-T T.4: G3 fax for PSTN. Modified Huffman and Modified READ coding.
- 1984: ITU-T T.6: G4 fax for ISDN. Modified Modified READ (MMR) coding.
- 1992: JPEG (ISO/IEC 10918, ITU-T T.81): color still pictures
  - Used for color fax, electronic still cameras, color printers, computer applications, etc.
  - Lossless/lossy modes, baseline/extended modes, progressive/sequential modes
  - DPCM + DCT + Q + RLE + Huffman/arithmetic codes
  - Motion JPEG can be used for moving pictures.
- 1993: JBIG (ISO/IEC 11544, ITU-T T.82): bi-level pictures. Improvement on T.4 and T.6.
- Recently: JPEG-LS, JBIG2, etc.

Moving Picture Compression Standards
- 1982: ITU-R BT.601: studio-quality PCM component video
  - Common to 525/60 and 625/50 systems
  - 13.5 MHz sampling, 8 bits/sample, 4:2:2 format
- 1990: ITU-T H.261: videophone/videoconference applications via ISDN
  - Bitrate = p x 64 kbps, p = 1 ~ 30
  - MC DPCM + DCT + Q + RLE + Huffman codes
  - Reference Model 1 ~ 8
- 1992: MPEG-1 Video: DSM applications (e.g. Video CD)
  - Bitrate = 1.5 Mbps
  - MC DPCM + DCT + Q + RLE + Huffman codes
  - GOP structure for random access and error recovery (I, P, B frames)
  - Simulation Model 1 ~ 3

Moving Picture Compression Standards (continued)
- 1994: MPEG-2 Video (ISO/IEC 13818-2, ITU-T H.262): generic algorithm for various applications (broadcasting, communication, network, DSM, etc.)
  - 5 profiles of functionality (Simple, Main, Spatial Scalable, SNR Scalable, High)
  - 4 levels of resolution (Low, Main, High-1440, High)
  - Deals with interlaced scan as well as progressive scan
  - Field/frame ME & DCT, dual-prime ME, intra VLC, alternate scan, nonuniform Q, etc.
- 1993: ITU-R CMTT.721: 140 Mbps contribution-quality video
  - Adaptive DPCM, componentwise
- 1993: ITU-R CMTT.723: 34 ~ 45 Mbps contribution-quality video
  - MC DPCM + DCT + Q + RLE + Huffman codes

Moving Picture Compression Standards (continued)
- 1995: ITU-T H.263: videophone via PSTN
  - Bitrate < 64 kbps (V.34 modem = 33.6 kbps, recent modem = 56 kbps)
  - Improved version of H.261
- 1998: MPEG-4
  - Bitrates < 2 Mbps
  - Targets: multimedia database access, wireless multimedia communication
  - Components of H.263 are incorporated
  - Content-based compression
  - Synthetic and natural video/audio
  - Multiple tools/algorithms/profiles => flexibility
- 1999: MPEG-4 Version 2, MPEG-7

Bilevel image compression standards
ITU-T Recommendations T.4 (G3 fax) and T.6 (G4 fax)
- Application: facsimile (transmission of bilevel documents)
- Coding scheme
  - G3: 1-D nonadaptive run-length + Huffman, or 2-D nonadaptive run-length + Huffman
  - G4: 2-D nonadaptive run-length + Huffman
- References
  - G3: ITU-T Recommendation T.4, Standardization of Group 3 Facsimile Apparatus for Document Transmission
  - G4: ITU-T Recommendation T.6, Facsimile Coding Scheme and Control Functions for Group 4 Facsimile Apparatus
  - Rafael C. Gonzalez, Richard E. Woods, Digital Image Processing, Addison Wesley, 1992
  - Anil K. Jain, Fundamentals of Digital Image Processing, Prentice-Hall, 1989

One-dimensional coding scheme
1-D run-length + Huffman
- Data
  - Each code word represents a run that is all white or all black.
    00000000111000001111000000000000 -> 8W 3B 5W 4B 12W
  - Column synchronization: every data line begins with a white run-length code.
- Coding algorithm
  - Run length 0 ~ 63: terminating code (modified Huffman code)
  - Run length 64 ~ 2560: the largest makeup code word not exceeding the run length, plus a terminating code
- End-of-line, EOL (000000000001)
  - Sent at the end of each line
  - Sent before the first line of a page
  - Six consecutive EOLs mark the end of a document transmission
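The run extraction and the makeup/terminating split described above can be sketched in a few lines of Python. This is only an illustration: it assumes 0 = white and 1 = black as in the example line, the function names are hypothetical, and the actual T.4 code-word tables are not reproduced.

```python
from itertools import groupby

def run_lengths(line):
    """Return [(color, length), ...] for a string of '0' (white) / '1' (black)."""
    runs = [("W" if bit == "0" else "B", len(list(group)))
            for bit, group in groupby(line)]
    # Every coded line starts with a white run; use length 0 if the line starts black.
    if runs and runs[0][0] == "B":
        runs.insert(0, ("W", 0))
    return runs

def split_run(length):
    """Split a run length (0..2560) into (makeup, terminating) lengths."""
    if not 0 <= length <= 2560:
        raise ValueError("run length outside the range covered by the slide")
    makeup = (length // 64) * 64    # largest multiple of 64 not exceeding the run
    return makeup, length - makeup  # terminating part is always 0..63

if __name__ == "__main__":
    print(run_lengths("00000000111000001111000000000000"))
    print(split_run(200))
```

A run shorter than 64 pixels yields a makeup length of 0, meaning only a terminating code would be sent.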

Terminating codes

Makeup codes

Two-dimensional coding scheme
2-D run-length coding scheme
- Principle
  - The position of each transition is coded with respect to the position of a reference element a0.
  - Similar to RAC (Relative Address Coding)
- Definition of changing picture elements
  - a0: the reference or starting changing element on the coding line. At the start of the line, a0 is set on an imaginary white changing element.
  - a1: the next changing element to the right of a0 on the coding line
  - a2: the next changing element to the right of a1 on the coding line
  - b1: the first changing element on the reference line to the right of a0 and of opposite color to a0
  - b2: the next changing element to the right of b1 on the reference line
[Figure: reference line and coding line]

Coding mode
- Pass mode
  - b2 lies to the left of a1
  - Next a0: the element of the coding line below b2
  - Code word: 0001
- Vertical mode
  - |a1b1| <= 3
  - Next a0: current a1
  - Code word: defined in the 2-D code table

Coding mode (cont.)
- Horizontal mode
  - |a1b1| > 3
  - Next a0: current a2
  - Code word: 001 + M(a0a1) + M(a1a2)
  - M(axay): the distance axay is coded with the terminating and makeup codes of 1-D compression
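The choice among the pass, vertical and horizontal modes can be sketched as a small decision function. The sketch below assumes the usual T.4 conditions (pass when b2 lies to the left of a1, vertical when a1 is within 3 pixels of b1, horizontal otherwise) and takes the positions as pixel indices on the line. The function names are hypothetical and no code words are generated.

```python
def select_mode(a1, b1, b2):
    """Choose the 2-D coding mode from changing-element positions (pixel indices)."""
    if b2 < a1:
        return "pass"        # b2 lies to the left of a1
    if abs(a1 - b1) <= 3:
        return "vertical"    # a1 is within 3 pixels of b1
    return "horizontal"      # otherwise the runs a0a1 and a1a2 are coded

def next_a0(mode, a1, a2, b2):
    """Position of the next reference element a0 after coding in the given mode."""
    return {"pass": b2, "vertical": a1, "horizontal": a2}[mode]

if __name__ == "__main__":
    mode = select_mode(a1=10, b1=8, b2=12)
    print(mode, next_a0(mode, a1=10, a2=14, b2=12))
```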

Two-dimensional code table

Modified READ algorithm

Continuous-tone still image compression standards
JPEG (Joint Photographic Experts Group)
- Applications: color fax, digital still camera, multimedia computer, Internet
- The JPEG standard consists of
  - a lossy baseline coding system
  - an extended coding system for greater compression, higher precision or progressive reconstruction applications
  - a lossless independent coding system for reversible compression
- References
  - ITU-T Recommendation T.81, Information Technology - Digital Compression and Coding of Continuous-Tone Still Images - Requirements and Guidelines, 92. 2
  - K. R. Rao, J. J. Hwang, Techniques & Standards for Image, Video & Audio Coding, Prentice Hall PTR, 1996

Baseline system
The baseline system is the most widely used of the JPEG coding systems.
- Data precision
  - 8 bits for input and output
  - 11 bits for quantized DCT coefficients
- Algorithm
  - DCT + quantization + variable length coding
- Compression guideline
  - 0.25 ~ 0.5 bits/pixel: moderate to good quality, some applications
  - 0.5 ~ 0.75 bits/pixel: good to very good quality, many applications
  - 0.75 ~ 1.5 bits/pixel: excellent quality, most applications
  - 1.5 ~ 2.0 bits/pixel: indistinguishable (visually lossless) quality, most demanding applications

Baseline system block diagram
[Figures: baseline system encoder; baseline system decoder]

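FDCT and IDCT
Two-dimensional FDCT and IDCT
- Zero shift for the input signal
  - $[0,\ 2^{p}-1] \rightarrow [-2^{p-1},\ 2^{p-1}-1]$ ($p = 8$ or $12$)
  - Reduces the internal precision requirement in the DCT calculation
- 8 x 8 DCT
  - Efficient energy compaction (close to KLT)
  - Blocking artifacts at high compression ratios
- Definition
  - $F(u,v) = \frac{1}{4} C(u) C(v) \sum_{x=0}^{7} \sum_{y=0}^{7} f(x,y) \cos\frac{(2x+1)u\pi}{16} \cos\frac{(2y+1)v\pi}{16}$
  - $f(x,y) = \frac{1}{4} \sum_{u=0}^{7} \sum_{v=0}^{7} C(u) C(v) F(u,v) \cos\frac{(2x+1)u\pi}{16} \cos\frac{(2y+1)v\pi}{16}$
  - $C(0) = 1/\sqrt{2}$, $C(k) = 1$ for $k > 0$
- Fast FDCT and IDCT algorithms exist, e.g. the Lee algorithm.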
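A direct, unoptimized evaluation of the 8 x 8 FDCT/IDCT pair, including the level shift, might look like the following. It is a sketch that assumes the standard T.81 definition (scale factor 1/4, C(0) = 1/sqrt(2)) and uses plain Python lists. The function names are illustrative, and a real codec would use a fast algorithm instead of the four nested loops.

```python
import math

N = 8

def c(k):
    """Normalization factor C(k) of the 8-point DCT."""
    return 1 / math.sqrt(2) if k == 0 else 1.0

def basis(x, u):
    return math.cos((2 * x + 1) * u * math.pi / 16)

def fdct(block, precision=8):
    """Level-shift an 8x8 block of samples and return its DCT coefficients."""
    shift = 1 << (precision - 1)              # 128 for 8-bit samples
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = sum((block[x][y] - shift) * basis(x, u) * basis(y, v)
                    for x in range(N) for y in range(N))
            out[u][v] = 0.25 * c(u) * c(v) * s
    return out

def idct(coeffs, precision=8):
    """Inverse transform; undoes the level shift applied by fdct."""
    shift = 1 << (precision - 1)
    out = [[0.0] * N for _ in range(N)]
    for x in range(N):
        for y in range(N):
            s = sum(c(u) * c(v) * coeffs[u][v] * basis(x, u) * basis(y, v)
                    for u in range(N) for v in range(N))
            out[x][y] = 0.25 * s + shift
    return out

if __name__ == "__main__":
    ramp = [[16 * x + 2 * y for y in range(N)] for x in range(N)]
    restored = idct(fdct(ramp))
    print(max(abs(restored[x][y] - ramp[x][y]) for x in range(N) for y in range(N)))
```

For a flat block, all the energy lands in the DC coefficient F(0,0), which is the energy-compaction property the slide refers to.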

Quantization and inverse quantization
- Quantization table
  - No default values for quantization tables
  - The application may specify the tables
  - Q(u, v): quantization table entry, an integer value from 1 to 255
- Quantization: $F^{Q}(u,v) = \mathrm{round}\left(\frac{F(u,v)}{Q(u,v)}\right)$
- Dequantization: $F^{R}(u,v) = F^{Q}(u,v)\, Q(u,v)$
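The two formulas translate almost directly into code. The sketch below works on nested lists of any rectangular size. Rounding half-values upward via `floor(x + 0.5)` is an assumption made for the illustration, as are the function names and the sample table entries.

```python
import math

def quantize(F, Q):
    """F^Q(u,v) = round(F(u,v) / Q(u,v)), applied element by element."""
    return [[math.floor(f / q + 0.5) for f, q in zip(f_row, q_row)]
            for f_row, q_row in zip(F, Q)]

def dequantize(FQ, Q):
    """F^R(u,v) = F^Q(u,v) * Q(u,v), applied element by element."""
    return [[fq * q for fq, q in zip(fq_row, q_row)]
            for fq_row, q_row in zip(FQ, Q)]

if __name__ == "__main__":
    coeffs = [[415.0, -30.0]]   # two sample DCT coefficients (made-up values)
    table = [[16, 11]]          # matching quantization steps (made-up values)
    levels = quantize(coeffs, table)
    print(levels, dequantize(levels, table))
```

The difference between F and the reconstructed value is the quantization error, which is the only lossy step in the baseline chain.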

Example
[Figure: f(x,y) -> FDCT -> F(u,v) -> quantization -> F^Q(u,v) -> inverse Q & IDCT -> r(x,y), with the error e(x,y)]

Entropy Coding
- DC coefficient coding
  - Differential coding: DC coefficients of adjacent blocks are strongly correlated.
  - VLC (Huffman coding)

Entropy Coding (cont.)
- AC coefficient coding
  - Zigzag scanning
  - VLC (variable length coding, Huffman coding)

Example
- Zigzag scanning: [39, -3, 2, 1, -1, 1, 0, 0, 0, 0, 0, -1, EOB]
- (run, value), assuming the DC coefficient of the previous block = 34: [5, (0,-3), (0,2), (0,1), (0,-1), (0,1), (5,-1), EOB]
- dc(cat, value), ac(run/cat, value): [dc(3, 5), ac(0/2,-3), ac(0/2,2), ac(0/1,1), ac(0/1,-1), ac(0/1,1), ac(5/1,-1), EOB]
- Entropy coding: [100 101 / 01 00 / 01 10 / 00 1 / 00 0 / 00 1 / 1111010 0 / 1010]
- 512 bits -> 35 bits
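The intermediate symbols of this example (run/value pairs, size categories and the appended amplitude bits) can be reproduced with a short sketch. It assumes the usual JPEG conventions: the category is the number of bits of the magnitude, and negative values are sent as a one's-complement pattern. The Huffman tables themselves are not included and the function names are hypothetical.

```python
def category(value):
    """Size category: number of bits needed for the magnitude (0 for value 0)."""
    return abs(value).bit_length()

def amplitude_bits(value):
    """Extra bits appended after the Huffman code of the category."""
    cat = category(value)
    if cat == 0:
        return ""
    if value < 0:
        value += (1 << cat) - 1   # negative values use the one's-complement pattern
    return format(value, "0{}b".format(cat))

def ac_symbols(ac):
    """Turn the 63 zigzag-ordered AC coefficients into (run, value) symbols."""
    symbols, run = [], 0
    last = max((i for i, coeff in enumerate(ac) if coeff != 0), default=-1)
    for coeff in ac[:last + 1]:
        if coeff == 0:
            run += 1
            continue
        while run > 15:
            symbols.append((15, 0))   # ZRL symbol: a run of 16 zeros
            run -= 16
        symbols.append((run, coeff))
        run = 0
    if last < len(ac) - 1:
        symbols.append("EOB")         # only trailing zeros remain
    return symbols

if __name__ == "__main__":
    ac = [-3, 2, 1, -1, 1, 0, 0, 0, 0, 0, -1] + [0] * 52
    print(ac_symbols(ac))
    print(category(5), amplitude_bits(5))   # the DC difference 5 of the example
```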

Table for luminance AC coefficients

Table for chrominance AC coefficients

JPEG Compression Examples
[Figures]
- Original image (24 bpp)
- JPEG compressed image (8:1, 3 bpp)
- JPEG compressed image (32:1, 0.75 bpp)
- JPEG compressed image (128:1, 0.1875 bpp)

MPEG Digital Video Technology
MPEG-1 (ISO/IEC 11172) and MPEG-2 (ISO/IEC 13818)
- Applications
  - MPEG-1: digital storage media (CD-ROM)
  - MPEG-2: higher bit rates and broader generic applications (consumer electronics, telecommunications, digital broadcasting, HDTV, DVD, VOD, etc.)
- Coding scheme
  - Spatial redundancy: DCT + quantization
  - Temporal redundancy: motion estimation and compensation
  - Statistical redundancy: VLC
- References
  - ISO/IEC 11172-2 (MPEG-1), ISO/IEC 13818-2 (MPEG-2)
  - K. R. Rao and J. J. Hwang, Techniques & Standards for Image, Video & Audio Coding, Prentice Hall, 1996

MPEG Overview
- MPEG: Moving Picture Experts Group
- Specifies a standard compression, transmission and decompression scheme for video and audio.
- ISO/IEC 11172: MPEG-1
- ISO/IEC 13818: MPEG-2
- Each standard consists of 3 main parts.
  - Part 1: Systems
  - Part 2: Video
  - Part 3: Audio

Functional comparison between MPEG-1 and MPEG-2 video
- Video format. MPEG-1: SIF, progressive. MPEG-2: SIF, 4:2:0, 4:2:2, 4:4:4, progressive/interlaced.
- Picture quality. MPEG-1: VHS. MPEG-2: distribution/contribution.
- Bit rate. MPEG-1: variable (<= 1.856 Mbps). MPEG-2: variable, up to 100 Mbps.
- Low delay mode. MPEG-1: < 150 ms. MPEG-2: < 150 ms (no B pictures).
- Accessibility. MPEG-1: random access. MPEG-2: random access/channel hopping.
- Scalability. MPEG-1: none. MPEG-2: SNR, spatial, temporal, simulcast, data partitioning.
- Compatibility. MPEG-1: none. MPEG-2: forward, backward, upward and downward.
- Transmission error. MPEG-1: error protection. MPEG-2: error resilience.
- Editing bit stream. MPEG-1: yes. MPEG-2: yes.
- DCT. MPEG-1: noninterlaced (progressive). MPEG-2: field or frame (interlaced).
- Motion estimation. MPEG-1: noninterlaced. MPEG-2: field, frame and dual-prime based; top (16 x 8) block and bottom (16 x 8) block.
- Motion vectors. MPEG-1: motion vectors for P and B pictures only. MPEG-2: concealment motion vectors for I pictures besides MVs for P and B pictures.
- Scanning of DCT coefficients. MPEG-1: zigzag scan. MPEG-2: zigzag scan, alternate scan for interlaced video.

MPEG System Structure
- MPEG system stream structure: the MPEG system stream is made up of two layers.
  - System layer: timing and other information needed to demultiplex and synchronize the audio and video streams
  - Compression layer: audio and video streams
- General decoding process [Figure]

Video Stream Data Hierarchy
- Video sequence
  - Begins with a sequence header (may contain additional sequence headers).
  - Includes one or more groups of pictures and ends with an end-of-sequence code.
- Group of pictures (GOP)
  - A header and a series of one or more pictures intended to allow random access into the sequence.

Video Stream Data Hierarchy (cont.)
- Picture
  - The primary coding unit of a video sequence.
  - Consists of three rectangular matrices representing luminance (Y) and two chrominance (Cb and Cr) values.
- Slice
  - One or more contiguous macroblocks.
  - Slices are important in the handling of errors. If the bitstream contains an error, the decoder can skip to the start of the next slice.
- Macroblock
  - A 2 by 2 section of luminance blocks plus the matching chrominance blocks (4 Y blocks + 1 Cb block + 1 Cr block)
  - Basic unit for motion estimation and motion compensation
- Block
  - An 8-pixel by 8-line set of values of a luminance or a chrominance component.
  - Basic unit for the DCT (discrete cosine transform)

MPEG compression of Video
How to remove spectral, spatial, temporal, and statistical redundancy?

Intra-frame Compression
[Figure: Video -> DCT -> Q -> entropy coding (RLE, VLC) -> MUX -> Buffer -> compressed data; rate control sets the quantization step size]
- DCT: no information loss, no data reduction
- Q (quantizing): information loss, data reduction
  - Reduces the number of bits for each coefficient.
  - Gives preference to certain coefficients; the reduction can differ for each coefficient.
  - Illustration: 3-bit quantization (levels 000 ~ 111) versus 2-bit quantization (levels 00 ~ 11) of the same input values
- Coefficient processing order: chosen to encourage runs of 0s
- RLE (run length coding): data reduction; generates (Run, Level) symbols
- VLC (variable length coding): data reduction; uses short words for the most frequent symbols (like Morse code)

Spatial redundancy
Pixel coding using the DCT
- As human eyes are insensitive to high-frequency color changes, the R, G, B signal is converted into a luminance signal and two color difference signals. We can remove more redundancy from U and V than from Y.
- The top left DCT component is taken as the DC datum for the block.
- DCT coefficients to the right are increasingly higher horizontal spatial frequencies.
- DCT coefficients below are higher vertical spatial frequencies.

Spatial redundancy (cont.)
Quantization & entropy coding
- This all has a cost, shown in the pictures below: the upper picture is unquantized, the lower one quantized.
- The higher the DCT frequency, the greater the quantization matrix value. This makes many coefficients go to zero.
- To generate efficient (Run, Level) symbols, zigzag scanning is applied to the quantized 8 x 8 DCT coefficients.

Field & Frame based mode in MPEG-2
For interlaced video, MPEG-2 provides two coding modes: field-based mode and frame-based mode.
[Figures: mapping from 16 x 16 blocks to 8 x 8 blocks for frame-organized data; mapping from 16 x 16 blocks to 8 x 8 blocks for field-organized data]

Two scanning methods of the DCT coefficients in MPEG-2
[Figures: (a) zigzag scan; (b) alternate scan]
- Zigzag scan is typical for progressive (noninterlaced) mode processing.
- Alternate scan is more efficient for interlaced format video.

Chrominance Format
There are three formats:
- 4:4:4: the chrominance and luminance planes are sampled at the same resolution.
- 4:2:2: the chrominance planes are subsampled at half resolution in the horizontal direction.
- 4:2:0: the chrominance planes are subsampled at half resolution in both the horizontal and vertical directions.
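As a quick check of what the three formats mean for plane sizes, the following sketch computes the size of each chrominance plane from the luminance size. The function name and the lookup table are illustrative only.

```python
# Horizontal and vertical subsampling factors of the chrominance planes.
SUBSAMPLING = {"4:4:4": (1, 1), "4:2:2": (2, 1), "4:2:0": (2, 2)}

def chroma_plane_size(width, height, fmt):
    """Return (width, height) of each chrominance plane (Cb or Cr)."""
    fx, fy = SUBSAMPLING[fmt]
    return width // fx, height // fy

if __name__ == "__main__":
    for fmt in SUBSAMPLING:
        print(fmt, chroma_plane_size(352, 288, fmt))
```

For a 352 x 288 luminance plane in 4:2:0 this gives 176 x 144 chrominance planes.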

Inter-frame Compression
[Figure: encoder block diagram. Source input -> frame reordering -> field/frame memory -> subtractor -> DCT -> Q -> VLC -> MUX -> buffer -> coded bitstream, with a local decoding loop (De-Q, IDCT, adder, field/frame memory, adaptive predictor), motion estimators 1 and 2, field/frame DCT selector, activity calculator, rate control (MQ) and side information fed to the MUX]

Temporal redundancy
Inter-frame prediction & motion estimation
This greatly reduces the overall bit rate from frame to frame.

Motion Estimation

Putting it all together
I, P, B frames
- Intra (I) frames contain full picture information.
- Predicted (P) frames are predicted from past I or P frames.
- Bi-directionally predicted (B) frames offer the greatest compression and use past and future I and P frames for motion compensation.

MPEG-2 Levels and Profiles
The expandability of the MPEG-2 format allows it to serve the needs of many different kinds of application. This is aided by defining several levels of decoders and several profiles of video source.

Upper-bound parameters in profiles and levels
Columns: profile @ level: H size (pels) x V size (pels), frame rate (Hz), bit rate (Mbps), VBV size (Mbits), MV range (pels)
- Simple @ Main: 720 x 576, 30, 15, 1.835, -128 ~ 127.5
- Main @ Low: 352 x 288, 30, 4, 0.489, -64 ~ 63.5
- Main @ Main: 720 x 576, 30, 15, 1.835, -128 ~ 127.5
- Main @ High-1440: 1440 x 1152, 60, 60, 7.340, -128 ~ 127.5
- Main @ High: 1920 x 1152, 60, 80, 9.787, -128 ~ 127.5
- SNR scalable @ Low: 352 x 288, 30, 3 (4), 0.367 (0.487), -64 ~ 63.5
- SNR scalable @ Main: 720 x 576, 30, 10 (15), 1.223 (1.835), -128 ~ 127.5
- Spatially scalable @ High-1440: 720 (1440) x 576 (1152), 30 (60), 15 (40) (60), 1.835 (4.893) (7.340), -128 ~ 127.5
- High @ Main: 352 (720) x 288 (576), 30 (30), 4 (15) (20), 0.489 (1.835) (2.447), -128 ~ 127.5
- High @ High-1440: 720 (1440) x 576 (1152), 30 (60), 20 (60) (80), 2.447 (7.340) (9.786), -128 ~ 127.5
- High @ High: 960 (1920) x 576 (1152), 30 (60), 25 (80) (100), 3.036 (9.787) (12.233), -128 ~ 127.5
Note: Numbers in parentheses refer to the enhanced layers.

Building the Elementary Stream
- This slide shows how the actual blocks, slices, frames, etc. are all put together to form the elementary stream (ES).
- Along with the actual picture data, header information is required to reconstruct the I, B, P frames. This header structure is shown.
- The next stage is to take this ES and convert it into something that can be transmitted and decoded at the other end.

The Packetized Elementary Stream (PES)

Ordering frames for decoding
The PTS & DTS
- In order for a decoder to reconstruct a B-frame from the preceding I frame and the following P frame, both of these must arrive first.
- So the order of frame transmission must be different from the order in which the frames appear on the TV screen.

Ordering frames for decoding (cont.)
- The decoder must also know at what time it should show the frames, that is, their order in time.
  - The Decoding Time Stamp (DTS) tells the decoder when to decode the frame.
  - The Presentation Time Stamp (PTS) tells the decoder when to display the frame.
- In addition, a clock must be embedded to allow a time reference to be created.
  - In MPEG-1, the clock is 33 bits with a 90 kHz input; in MPEG-2, the clock is 42 bits with a 27 MHz input.
  - The clock carried in the Transport Stream (TS) is known as the Programme Clock Reference (PCR).
  - The System Clock Reference (SCR) is used in the MPEG-2 Program Stream and in the MPEG-1 system stream.
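The relation between the 42-bit MPEG-2 clock and the 33-bit, 90 kHz time base can be made concrete with a small sketch. It assumes the ISO/IEC 13818-1 layout, in which the PCR is a 33-bit base counting at 90 kHz plus a 9-bit extension counting 27 MHz ticks from 0 to 299. The function names are hypothetical.

```python
SYSTEM_CLOCK_HZ = 27_000_000   # MPEG-2 system clock
BASE_BITS = 33                 # PCR base counts at 27 MHz / 300 = 90 kHz

def pcr_ticks(base, extension):
    """Combine PCR base (33 bits) and extension (0..299) into 27 MHz ticks."""
    if not 0 <= extension < 300:
        raise ValueError("PCR extension must be in 0..299")
    return base * 300 + extension

def pcr_seconds(base, extension):
    """Time represented by a PCR value, in seconds."""
    return pcr_ticks(base, extension) / SYSTEM_CLOCK_HZ

def split_pcr(ticks):
    """Split a 27 MHz tick count into (base, extension), wrapping the base."""
    base, extension = divmod(ticks, 300)
    return base % (1 << BASE_BITS), extension

if __name__ == "__main__":
    print(pcr_seconds(90_000, 0))             # 90 000 base ticks = one second
    print((1 << BASE_BITS) / 90_000 / 3600)   # hours before the base wraps around
```

PTS and DTS values use the same 90 kHz units as the PCR base, so they can be compared with it directly.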

Ordering frames for decoding (cont.)
Frame reordering [Figure]
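A minimal sketch of the reordering idea: B frames are held back until the next I or P frame (their future reference) has been sent. The frame labels and the function name are illustrative, and real encoders handle GOP boundaries in more detail than this.

```python
def transmission_order(display_frames):
    """Reorder frames from display order to transmission (decoding) order.

    Each frame is a label whose first character is its type: 'I', 'P' or 'B'.
    """
    coded, pending_b = [], []
    for frame in display_frames:
        if frame[0] == "B":
            pending_b.append(frame)   # needs the following I/P frame first
        else:
            coded.append(frame)       # send the reference frame...
            coded.extend(pending_b)   # ...then the B frames that depend on it
            pending_b = []
    coded.extend(pending_b)           # trailing B frames, if any
    return coded

if __name__ == "__main__":
    print(transmission_order(["I0", "B1", "B2", "P3", "B4", "B5", "P6"]))
```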

MPEG-2 Transport Stream
Multiplexing many programs

Videoconferencing Compression Standards
ITU-T Recommendations for video coding: H.261 and H.263
- Application: videophone/videoconference via ISDN/PSTN
- Coding scheme
  - ME/MC + DCT + Q + VLC
- References
  - ITU-T Recommendation H.261: Video Codec for Audiovisual Services at p x 64 kbit/s
  - ITU-T Recommendation H.263: Video Coding for Low Bit Rate Communication
  - K. R. Rao, J. J. Hwang, Techniques & Standards for Image, Video & Audio Coding

Overview of Videoconferencing
- Audiovisual communication
- Multimedia documents including text, tables and images
- Videoconferencing

Desktop Videoconferencing
[Figure: hardware equipment used in a desktop videoconferencing system. Video and audio information -> compression (HW or SW) -> communication network -> decompression (HW or SW) -> video and audio information]

CIF and QCIF Format
Two video signal formats permit a single recommendation to cover the different video formats, such as the 625-line (PAL or SECAM) and 525-line (NTSC) formats.
- Common Intermediate Format (CIF): Y: 352 x 288, Cb & Cr: 176 x 144
- Quarter Common Intermediate Format (QCIF): Y: 176 x 144, Cb & Cr: 88 x 72
[Figure: NTSC/PAL/SECAM -> pre-processing to CIF -> encoder/decoder -> post-processing from CIF -> NTSC/PAL/SECAM]

Brief Specification on H.261
For videoconferencing and videophone over the Integrated Services Digital Network (ISDN) at p x 64 kbps
1. The conversion to CIF from video sources such as NTSC, PAL, SECAM, ITU-R 601, etc., and vice versa.
2. The decoding of the BCH (511, 493) error correction code.
3. The use of intra or inter mode.
4. Motion estimation in the encoder (one MV per macroblock may be transmitted).
5. The use of the loop filter in the encoder.
6. The arithmetic process for computing the FDCT.
7. The control of the video data rate.
8. Any pre- or postprocessing.

Brief Specification on H.263
For videoconferencing and videophone over the plain old telephone service (POTS) at 33.6 kbps
1. Includes various video formats such as QCIF, 4CIF, 16CIF.
2. Weighted quantizer matrix and VLC for B-blocks.
3. No loop filter; no macroblock addressing.
4. 1-bit coded or not-coded macroblock information in the MB layer (separate coded block patterns for luminance (CBPY) and chrominance (MCBPC) components and for intra/inter mode).
5. 2-bit differential quantizer information in the MB layer and 5-bit quantizer information in the picture layer and in the GOB layer.
6. Advanced prediction mode: half-pel motion estimation, median-based MV prediction, 4 MVs per macroblock, and overlapped block MC.
7. Unrestricted MV mode: when an MV points outside the picture area, edge pixels are used.
8. A syntax-based arithmetic coding (SAC) mode.
9. PB-frames mode (forward and bidirectional prediction).
10. 3-D VLC (Last-Run-Level) for coding the transform coefficients.

H.261 standards in videoconferencing
Overview of the H.320 family of standards
- Video: H.261 (BCH (511, 493))
- Audio: G.711 (64 kbps PCM) / G.728 (16 kbps LD-CELP)
- Data: T.120
- Mux/Demux: H.221
- Signaling control: H.230, H.242
- MCU control: H.243, H.231

H.263 standards in videoconferencing
Overview of the H.324 family of standards
- Video: H.263
- Audio: G.723 CELP
- Data: T.120 / T.434 / T.84
- Mux/Demux: H.223
- Signaling control: H.245
- MCU control: N/A

Design Considerations for H.263
- Low bitrate for GSTN applications (consider V.34 modem = 33.6 kbps)
- Use of available technology
- Low complexity (low cost)
- Interoperability and/or coexistence with H.320/H.261
- Robust operation in the presence of channel errors
- Flexibility to allow for future extensions (e.g., higher bitrate)
- Quality-of-service parameters such as resolution, delay, frame rate, color performance/rendition
- Subjective quality measurements

H.261 Source Coder

H.261 Source Coding Algorithm
- Intra-frame coding
  - Sent only for the first picture or after a change of scene
  - No motion estimation for the intra frame
  - DCT, quantization, zigzag scan and VLC (Huffman coding) are used for each MB
- Inter-frame coding
  - Motion estimation and motion compensation
  - The prediction error is transformed by DCT, quantized, zigzag scanned and coded using VLC (Huffman coding).
- Forced intra coding
  - To control the accumulation of inverse transform mismatch error, each MB shall be coded in INTRA mode at least once every 132 times it is transmitted.
- Loop filter
  - Removes high-frequency noise and can be used to improve the visual quality.

H.263 Source Codec
[Figures: video encoder; video decoder]
- ME1: pel motion estimation and intra/inter decision
- ME2: half-pel motion estimation
- M: frame store
- MBTYPE: decides block type and block pattern
- CC: coding control
- DCT: discrete cosine transform
- PRED: makes the prediction block
- VLC(C): VLC for transform coefficients
- VLC(M): VLC for motion vectors

Source Format
Parameters (CIF / QCIF)
- Pels per line
  - Y: 360 (352) / 180 (176)
  - Cr: 180 (176) / 90 (88)
  - Cb: 180 (176) / 90 (88)
- Lines per frame
  - Y: 288 / 144
  - Cr: 144 / 72
  - Cb: 144 / 72
- Frames per second: 29.97
- Interlace: N/A
[Figure: positioning of luminance and chrominance pixels (4:2:0)]

Hierarchical Structure
GOBs and macroblocks

Motion Estimation/Compensation
- Motion estimation
  - Half-pixel values (for the best matched MV) are found using bilinear interpolation.
- Motion compensation (OBMC: overlapped block motion compensation)
[Figures: (a) remote motion vector selection for OBMC; (b) weighting matrix for the current luminance block; (c) weighting matrix for the top/bottom luminance block; (d) weighting matrix for the left/right luminance block]
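The bilinear interpolation used for half-pixel positions can be sketched as follows. The sketch assumes the H.263 rounding rule, in which sums are rounded upward before integer division. A, B, C and D are the four integer-position pixels (A top left, B top right, C bottom left, D bottom right), and the function name is hypothetical.

```python
def half_pel_samples(A, B, C, D):
    """Interpolate the integer and three half-pixel positions around pixel A.

    Returns (a, b, c, d): a is the integer position, b the horizontal half
    position, c the vertical half position and d the diagonal half position.
    """
    a = A
    b = (A + B + 1) // 2
    c = (A + C + 1) // 2
    d = (A + B + C + D + 2) // 4
    return a, b, c, d

if __name__ == "__main__":
    print(half_pel_samples(10, 20, 30, 40))
```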

H.263 Syntax and Semantics
A syntax diagram for the H.263 video bit stream

H.263 Syntax and Semantics (cont.)
Picture Layer

H.263 Syntax and Semantics (cont.)
GOB Layer

H.263 Syntax and Semantics (cont.)
Macroblock Layer

H.263 Syntax and Semantics (cont.)
Block Layer
- INTRADC is present for every block of the macroblock if MCBPC indicates MB type 3 or 4.
- TCOEF is present if indicated by MCBPC or CBPY.
- TCOEF1, TCOEF2, TCOEF3 and TCOEFr: Last-Run-Level symbols

Extension to H.263++
- Unrestricted motion vector mode
  - Motion vectors are allowed to point outside the picture.
- Syntax-based arithmetic coding (SAC) mode
  - All the corresponding VLC/VLD operations of H.263 are replaced with arithmetic coding/decoding operations in this mode.
- Advanced prediction mode: the motion vector predictor is the median of the candidate predictors.
  - In the case of one motion vector per macroblock: [Figure]
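Taking the median of the candidate predictors is done separately for the horizontal and vertical components. A minimal sketch with three candidates, each an (x, y) tuple; the function names and sample vectors are illustrative:

```python
def median_mv(mv1, mv2, mv3):
    """Component-wise median of three candidate motion vectors (x, y)."""
    return tuple(sorted(components)[1] for components in zip(mv1, mv2, mv3))

def mv_difference(mv, predictor):
    """Difference between a motion vector and its predictor."""
    return tuple(m - p for m, p in zip(mv, predictor))

if __name__ == "__main__":
    predictor = median_mv((1, 2), (3, -1), (2, 5))
    print(predictor, mv_difference((4, 1), predictor))
```

Only the difference from the predictor needs to be coded, which is small when neighboring blocks move together.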

Extension to H.263++ (cont.)
- In the case of four motion vectors per macroblock: [Figure]
- Optional PB-frames mode

MPEG-4 Visual Compression
MPEG-4 (ISO/IEC 14496)
- Applications
  - Internet multimedia
  - Wireless multimedia communication
  - Multimedia contents for computers and consumer electronics
  - Interactive digital TV
- Coding scheme
  - Spatial redundancy: DCT + quantization, wavelet transform
  - Temporal redundancy: motion estimation and compensation
  - Statistical redundancy: VLC (Huffman coding, arithmetic coding)
  - Shape coding: context-based arithmetic coding
- References
  - ISO/IEC 14496


Applications of MPEG-4
- Multimedia (playback and retrieval of audiovisual programs)
  - Interactive multimedia databases, multimedia videotext
- Multimedia presentations
  - Slide show, production/authoring
- Scalable & interactive applications on the WWW
  - Animated talking head with speech synthesis
- Interactive DVD applications
- Remote sensing (acquisition and monitoring of audiovisual data)
  - Home, building, campus, traffic monitoring, visual input from a human agent
- Video store-and-forward
  - Multimedia e-mail, video answering machines


MPEG-4 Parts and Versions


Content-Based Layering of Video
Each video object in a scene is coded and transmitted separately.
[Figure: input -> VOP definition -> VOP 0/1/2 coding -> MUX -> bitstream -> DEMUX -> VOP 0/1/2 decoding -> composition -> output]

Simplified Block Diagram of Natural Video Encoding
[Figure: a VOP of arbitrary shape (VOP_of_arbitrary_shape) feeds shape coding, motion estimation/motion compensation (using the previous reconstructed VOP) and texture coding; the shape, motion and texture information go to the MUX and buffer]

[Figure: entropy decoding and visual demux feeding shape decoding, motion compensation decoding, texture decoding, face decoding, still texture decoding and mesh decoding; the outputs go to composition]
A high-level view of basic visual decoding. Specialized decoding such as scalable, sprite and error-resilient decoding is not shown.


Simplified Video Decoding Process
[Figure: the demultiplexer splits the coded bit stream into shape, motion and texture streams. Shape: shape decoding (controlled by video_object_layer_shape). Motion: motion decoding -> motion compensation (using the previous reconstructed VOP). Texture decoding: variable length decoding -> inverse scan -> inverse DC & AC prediction -> inverse quantization -> IDCT. All three feed VOP reconstruction]

Three Scanning Patterns of DCT Coefficients in MPEG-4 Video

(a) Alternate-Horizontal scan
 0  1  2  3 10 11 12 13
 4  5  8  9 17 16 15 14
 6  7 19 18 26 27 28 29
20 21 24 25 30 31 32 33
22 23 34 35 42 43 44 45
36 37 40 41 46 47 48 49
38 39 50 51 56 57 58 59
52 53 54 55 60 61 62 63

(b) Alternate-Vertical scan
 0  4  6 20 22 36 38 52
 1  5  7 21 23 37 39 53
 2  8 19 24 34 40 50 54
 3  9 18 25 35 41 51 55
10 17 26 30 42 46 56 60
11 16 27 31 43 47 57 61
12 15 28 32 44 48 58 62
13 14 29 33 45 49 59 63

(c) Zigzag scan
 0  1  5  6 14 15 27 28
 2  4  7 13 16 26 29 42
 3  8 12 17 25 30 41 43
 9 11 18 24 31 40 44 53
10 19 23 32 39 45 52 54
20 22 33 38 46 51 55 60
21 34 37 47 50 56 59 61
35 36 48 49 57 58 62 63
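The zigzag pattern (c) does not have to be stored as a table; it can be generated by walking the anti-diagonals of the block in alternating directions. The following sketch builds the index matrix and uses it to read a block in scan order. The function names are hypothetical.

```python
def zigzag_matrix(n=8):
    """Return an n x n matrix whose entry [r][c] is the zigzag scan index."""
    positions = sorted(
        ((r, c) for r in range(n) for c in range(n)),
        # Walk anti-diagonals r + c; odd diagonals run downward (r increasing),
        # even diagonals run upward (c increasing).
        key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else p[1]),
    )
    matrix = [[0] * n for _ in range(n)]
    for index, (r, c) in enumerate(positions):
        matrix[r][c] = index
    return matrix

def scan(block, order):
    """Read the coefficients of a block in the order given by an index matrix."""
    n = len(block)
    out = [0] * (n * n)
    for r in range(n):
        for c in range(n):
            out[order[r][c]] = block[r][c]
    return out

if __name__ == "__main__":
    for row in zigzag_matrix():
        print(" ".join("{:2d}".format(v) for v in row))
```

The two alternate scans are fixed tables rather than a simple diagonal walk; pattern (b) is the transpose of pattern (a).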

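[Figure: current blocks X and Y of a macroblock with neighboring blocks A, B, C and D]
Previous neighboring blocks used in DC prediction

[Figure: current blocks X and Y of a macroblock with neighboring blocks A, B, C and D]
Previous neighboring blocks and coefficients used in AC prediction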

Templates for Context-based Arithmetic Coding of Binary Shape
[Figures: (a) the INTRA template, with context pixels c0 to c9 in the current BAB; (b) the INTER template, with c0 to c3 in the current BAB and c4 to c8 in the motion-compensated BAB, where c6 is aligned with the pixel to be decoded. The pixel to be decoded is marked with "?"]


List of major natural video tools

Static sprite coding tools (1/3)

Static sprite coding tools (2/3)

Static sprite coding tools (3/3)

Scalable Texture Coding: Encoder
The basic modules
- Decomposition of the texture using the discrete wavelet transform (DWT)
- Quantization of the wavelet coefficients
- Coding of the lowest frequency subband using a predictive scheme
- Zero-tree scanning of the higher order subband wavelet coefficients
[Figure: input -> DWT; low-low band -> QUANT -> prediction -> AC; other bands -> QUANT -> zero-tree scanning -> AC; both outputs form the bitstream]

Scalable Texture Coding

Scalable Texture Coding: Decoder


Tools and Visual Object Types
[Table: visual tools (rows) against visual object types (columns); the X marks of the individual cells are not reproduced]
- Visual object types: Simple, Core, Main, Simple Scalable, N-bit, Animated 2D Mesh, Basic Animated Texture, Still Scalable Texture, Simple Face
- Visual tools
  - Basic: I-VOP, P-VOP, AC/DC prediction, 4-MV, unrestricted MV
  - Error resilience: slice resynchronization, data partitioning, reversible VLC
  - Short header
  - B-VOP
  - P-VOP with OBMC (texture)
  - Method 1/Method 2 quantization
  - P-VOP based temporal scalability: rectangular, arbitrary shape
  - Binary shape
  - Grey shape
  - Interlace
  - Sprite
  - Temporal scalability (rectangular)
  - Spatial scalability (rectangular)
  - N-bit
  - Scalable still texture
  - 2D dynamic mesh with uniform topology
  - 2D dynamic mesh with Delaunay topology
  - Facial animation parameters

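Visual Profiles
Object types supported by each profile:
- Simple: Simple
- Simple Scalable: Simple, Simple Scalable
- Core: Simple, Core
- Main: Simple, Core, Main, Scalable Texture
- N-Bit: Simple, Core, N-Bit
- Hybrid: Simple, Core, Animated 2D Mesh, Basic Animated Texture, Scalable Texture, Simple Face
- Basic Animated Texture: Basic Animated Texture, Scalable Texture, Simple Face
- Scalable Texture: Scalable Texture
- Simple FA: Simple Face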


H.263 vs. MPEG-4
- ITU-T: H.261 -> H.262 (MPEG-2) -> H.263 (1995) -> H.263/L (1999)
- ISO: JPEG -> MPEG-1 -> MPEG-2 -> MPEG-4 (1998)
- MPEG-4 focuses on content-based compression and synthetic/natural hybrid coding for multimedia database access and mobile communication.
- MPEG-4 uses H.263 as a benchmark for subjective tests.
- MPEG-4 adopts many compression components of H.263. H.263 proves to be an excellent compression algorithm.
- H.263/L will be developed in cooperation with MPEG-4.
- MPEG-4 is a multiple-tool, multiple-algorithm and multiple-profile standard.

Potential MPEG-4 Markets
- MPEG-4 will not replace MPEG-2 in digital broadcasting, DVD, VOD, etc.
- MPEG-4 may compete with H.32x in mobile videophone.
- MPEG-4 may compete with MHEG-5 based or HTML based interactive TV.
- MPEG-4 may compete with QuickTime or Video for Windows in the multimedia title industry.
- MPEG-4 may be used in many other audiovisual applications.