International Standards for Image/Video Coding 위원회
Multimedia Everywhere. Towards Multimedia : Computer, Consumer Electronics, Telecommunication, Broadcasting
Still Picture Compression Standards
1980 : ITU-T T.4 : G3 FAX for PSTN. Modified Huffman and Modified READ
1984 : ITU-T T.6 : G4 FAX for ISDN. Modified MR
1992 : JPEG (ISO 10918, ITU-T T.81) : Color Still Pictures, used for Color Fax, Electronic Still Camera, Color Printer, Computer Applications, etc. Lossless/Lossy Modes, Baseline/Extended Modes, Progressive/Sequential Modes. DPCM + DCT + Q + RLE + Huffman/Arithmetic Codes. Motion JPEG can be used for Moving Pictures.
1993 : JBIG (ISO 11544, ITU-T T.82) : Bi-level Pictures. Improvement on T.4 and T.6
Recently : JPEG-LS, JBIG2, etc.
Moving Picture Compression Standards
1982 : ITU-R BT.601 : Studio Quality PCM Component Video. Common to 525/60 and 625/50 Systems. 13.5 MHz Sampling, 8 bit/sample, 4:2:2 Format
1990 : ITU-T H.261 : Video Phone/Conference Application via ISDN. Bitrate = p x 64 kbps, p = 1-30. MC DPCM + DCT + Q + RLE + Huffman Codes. Reference Model 1-8
1992 : MPEG-1 Video : DSM Applications (e.g. Video CD). Bitrate = 1.5 Mbps. MC DPCM + DCT + Q + RLE + Huffman Codes. GOP Structure for Random Access and Error Recovery (I, P, B Frames). Simulation Model 1-3
Moving Picture Compression Standards (Continued)
1994 : MPEG-2 Video (ISO 13818-2, ITU-T H.262) : Generic Algorithm for Various Applications (Broadcasting, Communication, Network, DSM, etc.). 5 Profiles of Functionality (Simple, Main, Spatial Scalable, SNR Scalable, High). 4 Levels of Resolution (Low, Main, High-1440, High). Deals with Interlaced Scan as well as Progressive Scan. Field/Frame ME & DCT, Dual Prime ME, Intra VLC, Alternate Scan, Nonuniform Q, etc.
1993 : ITU-R CMTT.721 : 140 Mbps Contribution Quality Video. Adaptive DPCM, Componentwise
1993 : ITU-R CMTT.723 : 34-45 Mbps Contribution Quality Video. MC DPCM + DCT + Q + RLE + Huffman Codes
Moving Picture Compression Standards (Continued)
1995 : ITU-T H.263 : Videophone via PSTN. Bitrate < 64 kbps (V.34 modem = 33.6 kbps, Recent modem = 56 kbps). Improved version of H.261
1998 : MPEG-4 : Bitrates < 2 Mbps. Targets: Multimedia data base access, Wireless multimedia communication. Components of H.263 are incorporated. Content-based compression. Synthetic and natural video/audio. Multiple tools/algorithms/profiles => Flexibility
1999 : MPEG-4 Version 2, MPEG-7
Bilevel image compression standards: ITU-T recommendation T.4 (G3 Fax) and T.6 (G4 Fax)
Application : facsimile (transmission of bilevel documents)
Coding scheme
- G3 : 1-D nonadaptive run-length + Huffman, or 2-D nonadaptive run-length + Huffman
- G4 : 2-D nonadaptive run-length + Huffman
References
- G3: ITU-T Recommendation T.4, Standardization of Group 3 Facsimile Apparatus for Document Transmission
- G4: ITU-T Recommendation T.6, Facsimile Coding Scheme and Control Functions for Group 4 Facsimile Apparatus
- Rafael C. Gonzalez, Richard E. Woods, Digital Image Processing, Addison Wesley, 1992
- Anil K. Jain, Fundamentals of Digital Image Processing, Prentice-Hall, 1989
One-dimensional coding scheme: 1-D run-length + Huffman
Data
- Each code word represents a run that is all white or all black, e.g. 00000000111000001111000000000000 -> 8W 3B 5W 4B 12W
- Column synchronization : all data lines begin with a white run-length code
Coding algorithm
- Run length 0 ~ 63 : terminating code (modified Huffman code)
- Run length 64 ~ 2560 : the largest makeup code word not exceeding the run length, plus a terminating code
End-of-line (000000000001)
- End of each line
- First line of a page
- Six consecutive EOLs : the end of a document transmission
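The run extraction and the makeup/terminating split described above can be sketched as follows. This is only an illustration: the function names are hypothetical, the actual code words from the terminating and makeup tables are not reproduced, and the zero-length white run for a line that starts with black is an assumption drawn from the rule that every line begins with a white run-length code.

```python
from itertools import groupby


def run_lengths(line):
    """Return (color, length) runs for a string of '0' (white) and '1' (black) pels."""
    runs = [("W" if bit == "0" else "B", len(list(group)))
            for bit, group in groupby(line)]
    # Every coded line starts with a white run, so a line that begins
    # with black gets a white run of length zero in front (assumption).
    if runs and runs[0][0] == "B":
        runs.insert(0, ("W", 0))
    return runs


def split_run(length):
    """Split a run of 0..2560 pels into (makeup, terminating) lengths."""
    if not 0 <= length <= 2560:
        raise ValueError("run length outside the 0..2560 range covered here")
    makeup = (length // 64) * 64    # largest multiple of 64 not exceeding the run
    terminating = length - makeup   # always in 0..63
    return makeup, terminating


runs = run_lengths("00000000111000001111000000000000")
print(runs)            # the 8W 3B 5W 4B 12W example from the slide
print(split_run(200))  # makeup part and terminating part of a 200-pel run
```

A makeup value of 0 means that no makeup code word is sent and the run is coded by its terminating code alone.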
Terminating codes
Makeup codes
Two-dimensional coding scheme: 2-D run-length coding scheme
Principle
- The position of each transition is coded with respect to the position of a reference element a0
- Similar to RAC (Relative Address Coding)
Definition of changing picture elements
- a0 : The reference or starting changing element on the coding line. At the start of the line, a0 is set on an imaginary white changing element
- a1 : The next changing element to the right of a0 on the coding line
- a2 : The next changing element to the right of a1 on the coding line
- b1 : The first changing element on the reference line to the right of a0 and of opposite color to a0
- b2 : The next changing element to the right of b1 on the reference line
(Figure: reference line and coding line)
Coding mode
Pass mode
- b2 lies to the left of a1
- next a0 : the element of the coding line below b2
- code word : 0001
Vertical mode
- |a1b1| ≤ 3
- next a0 : current a1
- code word : defined in 2-D code table
Coding mode (cont.)
Horizontal mode
- |a1b1| > 3
- next a0 : current a2
- code word : 001 + M(a0a1) + M(a1a2)
- M(axay) : the distance axay, coded by the terminating and makeup codes of 1-D compression
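The choice among the three modes can be sketched as a small decision function. It assumes that a1, a2, b1 and b2 are already known as pel positions on their lines, and that vertical mode applies when the distance between a1 and b1 is at most 3 (the complement of the horizontal-mode condition). The function names are illustrative only.

```python
def select_mode(a1, b1, b2):
    """Pick the 2-D coding mode from the positions of the changing elements."""
    if b2 < a1:
        return "pass"          # b2 lies to the left of a1
    if abs(a1 - b1) <= 3:
        return "vertical"      # a1 is coded relative to b1
    return "horizontal"        # the runs a0a1 and a1a2 are coded with 1-D codes


def next_a0(mode, a1, a2, b2):
    """Return the position of the new reference element a0 after coding."""
    if mode == "pass":
        return b2              # element of the coding line below b2
    if mode == "vertical":
        return a1
    return a2                  # horizontal mode


mode = select_mode(a1=10, b1=12, b2=20)
print(mode, next_a0(mode, a1=10, a2=15, b2=20))
```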
Two-dimensional code table
Modified READ algorithm
Continuous-tone still image compression standards: JPEG (Joint Photographic Experts Group)
Applications : color FAX, digital still camera, multimedia computer, internet
JPEG Standard consists of
- a lossy baseline coding system
- an extended coding system for greater compression, higher precision or progressive reconstruction applications
- a lossless independent coding system for reversible compression
References
- ITU-T Recommendation T.81, Information Technology - Digital Compression and Coding of Continuous-Tone Still Images - Requirements and Guidelines, 1992
- K. R. Rao, J. J. Hwang, Techniques & Standards for Image, Video & Audio Coding, Prentice Hall PTR, 1996
Baseline system: the most widely used among the JPEG coding systems
Data precision
- 8 bits for input and output
- 11 bits for quantized DCT coefficients
Algorithm
- DCT + quantization + variable length coding
Compression Guideline
- 0.25 ~ 0.5 bits/pixel : moderate to good quality, some applications
- 0.5 ~ 0.75 bits/pixel : good to very good quality, many applications
- 0.75 ~ 1.5 bits/pixel : excellent quality, most applications
- 1.5 ~ 2.0 bits/pixel : indistinguishable (visually lossless) quality, most demanding applications
Baseline system block diagram (Figures: baseline system encoder, baseline system decoder)
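FDCT and IDCT: Two-dimensional FDCT and IDCT
Zero shift for input signal
- [0, 2^p - 1] -> [-2^(p-1), 2^(p-1) - 1] (p = 8 or 12)
- Reduces the internal precision requirement in the DCT calculation
8×8 DCT
- Efficient energy compaction (close to KLT)
- Blocking artifacts at high compression ratios
Definition
$$F(u,v) = \frac{1}{4} C(u) C(v) \sum_{x=0}^{7} \sum_{y=0}^{7} f(x,y) \cos\frac{(2x+1)u\pi}{16} \cos\frac{(2y+1)v\pi}{16}$$
$$f(x,y) = \frac{1}{4} \sum_{u=0}^{7} \sum_{v=0}^{7} C(u) C(v) F(u,v) \cos\frac{(2x+1)u\pi}{16} \cos\frac{(2y+1)v\pi}{16}$$
where $C(k) = 1/\sqrt{2}$ for $k = 0$ and $C(k) = 1$ otherwise.
- Fast FDCT and IDCT algorithms exist, e.g. the Lee algorithm.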
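A direct, unoptimized sketch of the zero shift and the 8×8 transform pair is given below. It assumes the usual JPEG form of the DCT-II with scale factor 1/4 and C(0) = 1/√2, and it is written for readability rather than speed (a fast algorithm would be used in practice). All function names are illustrative.

```python
import math

N = 8


def level_shift(block, p=8):
    """Shift samples from [0, 2^p - 1] to [-2^(p-1), 2^(p-1) - 1]."""
    offset = 1 << (p - 1)
    return [[sample - offset for sample in row] for row in block]


def _c(k):
    return 1.0 / math.sqrt(2.0) if k == 0 else 1.0


def _cos(i, k):
    return math.cos((2 * i + 1) * k * math.pi / 16.0)


def fdct(block):
    """Forward 8x8 DCT, straight from the double-sum definition."""
    return [[0.25 * _c(u) * _c(v) * sum(block[x][y] * _cos(x, u) * _cos(y, v)
                                        for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]


def idct(coef):
    """Inverse 8x8 DCT, straight from the double-sum definition."""
    return [[0.25 * sum(_c(u) * _c(v) * coef[u][v] * _cos(x, u) * _cos(y, v)
                        for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]


flat = [[255] * N for _ in range(N)]
coefficients = fdct(level_shift(flat))
print(round(coefficients[0][0], 3))  # a flat block puts all its energy in the DC term
```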
Quantization and inverse quantization
Quantization table
- No default values for quantization tables
- Application may specify the tables
- Q(u, v) : quantization table entry, an integer value from 1 to 255
Quantization : $F_Q(u,v) = \mathrm{round}\left( \frac{F(u,v)}{Q(u,v)} \right)$
Dequantization : $F_R(u,v) = F_Q(u,v) \, Q(u,v)$
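The two formulas can be illustrated with a short sketch. The handling of exact halves (rounded away from zero here) is an assumption of this example, and the small 2×2 arrays are made-up values used only to keep the demonstration short.

```python
def round_nearest(x):
    """Round to the nearest integer, halves away from zero (assumption)."""
    sign = -1 if x < 0 else 1
    return sign * int(abs(x) + 0.5)


def quantize(F, Q):
    """F_Q(u, v) = round(F(u, v) / Q(u, v)), with table entries in 1..255."""
    if any(not 1 <= q <= 255 for row in Q for q in row):
        raise ValueError("quantization table entries must be in 1..255")
    return [[round_nearest(f / q) for f, q in zip(f_row, q_row)]
            for f_row, q_row in zip(F, Q)]


def dequantize(FQ, Q):
    """F_R(u, v) = F_Q(u, v) * Q(u, v)."""
    return [[fq * q for fq, q in zip(fq_row, q_row)]
            for fq_row, q_row in zip(FQ, Q)]


F = [[100.0, -37.0], [12.0, 4.0]]   # illustrative coefficients
Q = [[16, 11], [12, 14]]            # illustrative table entries
FQ = quantize(F, Q)
print(FQ, dequantize(FQ, Q))
```

The difference between F and the dequantized values is the information lost by quantization; larger table entries lose more and push more coefficients to zero.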
Example (Figure: f(x,y) -> FDCT -> F(u,v) -> Quant. -> F_Q(u,v) -> Inverse Q & IDCT -> r(x,y), together with e(x,y))
Entropy Coding: DC Coefficient Coding
- Differential coding : DC coefficients of adjacent blocks are strongly correlated.
- VLC (Huffman Coding)
Entropy Coding (cont.): AC Coefficient Coding
- Zigzag Scanning
- VLC (Variable Length Coding, Huffman Coding)
Example
Zigzag scanning : [39, -3, 2, 1, -1, 1, 0, 0, 0, 0, 0, -1, EOB]
(run, value), assuming DC coefficient of previous block = 34 : [5, (0,-3), (0,2), (0,1), (0,-1), (0,1), (5,-1), EOB]
dc(cat, value), ac(run/cat, value) : [dc(3, 5), ac(0/2,-3), ac(0/2,2), ac(0/1,1), ac(0/1,-1), ac(0/1,1), ac(5/1,-1), EOB]
Entropy coding : [100 101 / 01 00 / 01 10 / 00 1 / 00 0 / 00 1 / 1111010 0 / 1010]
512 bits -> 35 bits
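The symbol formation in this example can be reproduced with a short sketch. It assumes a previous DC coefficient of 34, which is the value consistent with the DC difference of 5 and the bits 100 101 listed in the example. The Huffman code words themselves come from the standard tables and are not generated here; only the categories and the appended amplitude bits are computed. Runs longer than 15 zeros (which need a special symbol) are not handled, and the function names are illustrative.

```python
def category(value):
    """Size category: number of bits needed for the magnitude of value."""
    return abs(value).bit_length()


def amplitude_bits(value):
    """Bits appended after the Huffman code; negative values use one's complement."""
    size = category(value)
    if size == 0:
        return ""
    if value < 0:
        value += (1 << size) - 1
    return format(value, "0{}b".format(size))


def to_symbols(zigzag, prev_dc):
    """Return the DC symbol (cat, diff) and the AC symbols (run, cat, value)."""
    diff = zigzag[0] - prev_dc
    dc = (category(diff), diff)
    ac, run = [], 0
    for coef in zigzag[1:]:
        if coef == 0:
            run += 1
        else:
            ac.append((run, category(coef), coef))
            run = 0
    return dc, ac  # trailing zeros are represented by a single EOB symbol


zigzag = [39, -3, 2, 1, -1, 1, 0, 0, 0, 0, 0, -1] + [0] * 52
dc, ac = to_symbols(zigzag, prev_dc=34)
print(dc, ac)
print([amplitude_bits(v) for v in (5, -3, 2, 1, -1)])
```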
Table for luminance AC coefficients
Table for luminance AC coefficients (cont.)
Table for chrominance AC coefficients
Table for chrominance AC coefficients (cont.)
JPEG Compression Examples (Figures: original image, 24 bpp; JPEG compressed image, 8:1, 3 bpp; JPEG compressed image, 32:1, 0.75 bpp; JPEG compressed image, 128:1, 0.1875 bpp)
MPEG Digital Video Technology: MPEG-1 (ISO/IEC 11172) and MPEG-2 (ISO/IEC 13818)
Applications :
- MPEG-1 : Digital Storage Media (CD-ROM)
- MPEG-2 : Higher bit rates and broader generic applications (Consumer electronics, Telecommunications, Digital Broadcasting, HDTV, DVD, VOD, etc.)
Coding scheme :
- Spatial redundancy : DCT + Quantization
- Temporal redundancy : Motion estimation and compensation
- Statistical redundancy : VLC
References :
- ISO/IEC 11172-2 (MPEG-1), ISO/IEC 13818-2 (MPEG-2)
- K. R. Rao, J. J. Hwang, Techniques & Standards for Image, Video & Audio Coding, Prentice Hall, 1996
MPEG Overview
MPEG :
- Moving Picture Experts Group
- Specifies a standard compression, transmission, and decompression scheme for video and audio.
- ISO/IEC 11172 : MPEG-1
- ISO/IEC 13818 : MPEG-2
- Consists of 3 parts. Part 1 : System, Part 2 : Video, Part 3 : Audio
Functional comparison between MPEG-1 and MPEG-2 video
Feature | MPEG-1 | MPEG-2
Video format | SIF progressive | SIF, 4:2:0, 4:2:2, 4:4:4 progressive/interlaced
Picture quality | VHS | Distribution/contribution
Bit rate | Variable (≤ 1.856 Mbps) | Variable up to 100 Mbps
Low delay mode | < 150 ms | < 150 ms (no B pictures)
Accessibility | Random access | Random access/channel hopping
Scalability | - | SNR, spatial, temporal, simulcast, data partitioning
Compatibility | - | Forward, backward, upward, and downward
Transmission error | Error protection | Error resilience
Editing bit stream | Yes | Yes
DCT | Noninterlaced | Field (progressive) or frame (interlaced)
Motion estimation | Noninterlaced | Field, frame, and dual-prime based. Top (16×8) block and bottom (16×8) block
Motion vectors | Motion vectors for P, B pictures only | Concealment motion vectors for I pictures besides MV for P & B
Scanning of DCT coefficients | Zigzag scan | Zigzag scan, alternate scan for interlaced video
MPEG System Structure: MPEG System Stream Structure
MPEG system stream is made up of two layers
- System layer : timing and other information needed to demultiplex and synchronize the audio and video streams
- Compression layer : audio and video streams
(Figure: General Decoding Process)
Video Stream Data Hierarchy
Video Sequence
- Begins with a sequence header (may contain additional sequence headers).
- Includes one or more groups of pictures, and ends with an end-of-sequence code.
Group of Pictures (GOP)
- A header and a series of one or more pictures intended to allow random access into the sequence.
Video Stream Data Hierarchy (Cont.)
Picture
- The primary coding unit of a video sequence.
- Consists of three rectangular matrices representing luminance (Y) and two chrominance (Cb and Cr) values.
Slice
- One or more "contiguous" macroblocks.
- Slices are important in the handling of errors. If the bitstream contains an error, the decoder can skip to the start of the next slice.
Macroblock
- A 2 by 2 section of blocks (4 Y blocks + 1 Cb block + 1 Cr block)
- Basic unit for motion estimation and motion compensation
Block
- A block is an 8-pixel by 8-line set of values of a luminance or a chrominance component.
- Basic unit for DCT (discrete cosine transform)
MPEG compression of Video: How to remove spectral, spatial, temporal, and statistical redundancy?
Intra-frame Compression
(Block diagram: Video -> DCT -> Q -> Entropy Coding (RLE, VLC) -> MUX -> Buffer -> Compressed Data, with Rate Control setting the quantization step size)
- DCT : no information loss, no data reduction
- Quantizing : information loss, data reduction. Reduces the number of bits for each coefficient. Gives preference to certain coefficients. Reduction can differ for each coefficient. (Figure: 3-bit quantization, levels 000 to 111, versus 2-bit quantization, levels 00 to 11, of the input value)
- Coefficient processing order : chosen to encourage runs of 0s
- Run Length Coding (RLE) : data reduction. Generates (Run, Level) symbols
- Variable Length Coding (VLC) : data reduction. Uses short words for the most frequent symbols (like Morse code)
Spatial redundancy: Pixel Coding using the DCT
- As human eyes are insensitive to HF color changes, the R, G, B signal is converted into a luminance and two color difference signals. We can remove more redundancy on U, V than on Y.
- The top left DCT component is taken as the dc datum for the block.
- DCT coefficients to the right are increasingly higher horizontal spatial frequencies.
- DCT coefficients below are higher vertical spatial frequencies.
Spatial redundancy (Cont.): Quantization & Entropy coding
- Quantization has a cost in picture quality, shown in the pictures below: the upper picture is unquantized, the lower one quantized.
- The higher the DCT frequency is, the greater the Quant Matrix value becomes. This makes many coefficients go to zero.
- To generate efficient (Run, Level) symbols, zig-zag scanning is applied to the quantized 8×8 DCT coefficients.
Field & Frame based mode in MPEG-2
For interlaced video format, MPEG-2 provides two coding modes : Field-based mode, Frame-based mode
(Figures: mapping from 16×16 blocks to 8×8 blocks for frame-organized data; mapping from 16×16 blocks to 8×8 blocks for field-organized data)
Two scanning methods of the DCT coefficients in MPEG-2: (a) Zigzag scan (b) Alternate scan
- Zigzag scan is typical for progressive (noninterlaced) mode processing.
- Alternate scan is more efficient for interlaced format video.
Chrominance Format
There are three formats :
- 4:4:4 : the chrominance and luminance planes are sampled at the same resolution.
- 4:2:2 : the chrominance planes are subsampled at half resolution in the horizontal direction.
- 4:2:0 : the chrominance planes are subsampled at half resolution in both horizontal and vertical directions.
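The effect of the three formats on plane sizes can be sketched as follows. The function names are illustrative, and the block counts per macroblock for 4:2:2 and 4:4:4 are derived here from the subsampling factors on the assumption that a macroblock always covers 16×16 luminance samples.

```python
# Horizontal and vertical subsampling factors of each chrominance plane.
SUBSAMPLING = {"4:4:4": (1, 1), "4:2:2": (2, 1), "4:2:0": (2, 2)}


def chroma_plane_size(width, height, chroma_format):
    """Return (width, height) of each chrominance plane (Cb or Cr)."""
    h_factor, v_factor = SUBSAMPLING[chroma_format]
    return width // h_factor, height // v_factor


def blocks_per_macroblock(chroma_format):
    """Count the 8x8 blocks in a macroblock: 4 Y blocks plus Cb and Cr blocks."""
    h_factor, v_factor = SUBSAMPLING[chroma_format]
    chroma_blocks_per_plane = 4 // (h_factor * v_factor)
    return 4 + 2 * chroma_blocks_per_plane


for fmt in ("4:4:4", "4:2:2", "4:2:0"):
    print(fmt, chroma_plane_size(720, 576, fmt), blocks_per_macroblock(fmt))
```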
Inter-frame Compression (Block diagram of the encoder: source input, frame reordering, field/frame memory, motion estimators 1 and 2, adaptive predictor, field/frame DCT selector, DCT, Q (MQ), VLC, MUX, buffer, and rate control with activity calculator; De Q and IDCT in the prediction loop; side information is multiplexed into the coded bitstream)
Temporal redundancy: Inter-frame prediction & motion estimation
Predicting each frame from its neighbors removes the redundancy between successive frames, which substantially reduces the overall bit rate.
Motion Estimation
Putting it all together: I, P, B Frames
- Intra (I) frames contain full picture information.
- Predicted (P) frames are predicted from past I or P frames.
- Bi-directionally predicted (B) frames offer the greatest compression and use past and future I and P frames for motion compensation.
MPEG-2 Levels and Profiles
The expandability of the MPEG-2 format allows it to serve the needs of many different kinds of applications. This is aided by defining several levels of decoders and several profiles of video source.
Upper-bound parameters in profiles and levels
Profile | Level | H. size (pels) | V. size (pels) | Frame rate (Hz) | Bit rate (Mbps) | VBV size (Mbits) | MV range (pels)
Simple | Main | 720 | 576 | 30 | 15 | 1.835 | -128 ~ 127.5
Main | Low | 352 | 288 | 30 | 4 | 0.489 | -64 ~ 63.5
Main | Main | 720 | 576 | 30 | 15 | 1.835 | -128 ~ 127.5
Main | High-1440 | 1440 | 1152 | 60 | 60 | 7.340 | -128 ~ 127.5
Main | High | 1920 | 1152 | 60 | 80 | 9.787 | -128 ~ 127.5
SNR scalable | Low | 352 | 288 | 30 | 3 (4) | 0.367 (0.487) | -64 ~ 63.5
SNR scalable | Main | 720 | 576 | 30 | 10 (15) | 1.223 (1.835) | -128 ~ 127.5
Spatially scalable | High-1440 | 720 (1440) | 576 (1152) | 30 (60) | 15 (40) (60) | 1.835 (4.893) (7.340) | -128 ~ 127.5
High | Main | 352 (720) | 288 (576) | 30 (30) | 4 (15) (20) | 0.489 (1.835) (2.447) | -128 ~ 127.5
High | High-1440 | 720 (1440) | 576 (1152) | 30 (60) | 20 (60) (80) | 2.447 (7.340) (9.786) | -128 ~ 127.5
High | High | 960 (1920) | 576 (1152) | 30 (60) | 25 (80) (100) | 3.036 (9.787) (12.233) | -128 ~ 127.5
Note: Numbers in parentheses refer to the enhanced layers.
Building the Elementary Stream
This slide shows how the actual blocks, slices, frames, etc. are all put together to form the elementary stream (ES). Along with the actual picture data, header information is required to reconstruct the I, B, P frames. This header structure is shown. The next stage is to take this ES and convert it into something that can be transmitted and decoded at the other end.
The Packetized Elementary Stream (PES)
Ordering frames for decoding: The PTS & DTS
In order for a decoder to reconstruct a B-frame from the preceding I frame and the following P frame, both of these must arrive first. So the order of frame transmission must be different from the order in which the frames appear on the TV screen.
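The reordering can be sketched as follows, under the simplifying assumption that every B-frame depends on the nearest I or P frame before it and after it in display order. Frame labels such as "I1" and the function name are illustrative.

```python
def display_to_coded_order(frames):
    """Reorder frames so each B-frame follows both of its reference frames.

    frames: labels in display order; the first character gives the
    frame type ('I', 'P' or 'B').
    """
    coded = []
    pending_b = []
    for frame in frames:
        if frame[0] == "B":
            pending_b.append(frame)   # hold back until the next reference frame is sent
        else:
            coded.append(frame)       # send the I or P frame first
            coded.extend(pending_b)   # then the B-frames that needed it
            pending_b = []
    coded.extend(pending_b)           # trailing B-frames with no later reference (assumption)
    return coded


display = ["I1", "B2", "B3", "P4", "B5", "B6", "P7"]
print(display_to_coded_order(display))
```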
Ordering frames for decoding (Cont.)
The decoder must also know at what time it should show the frames, that is, their order in time.
- The Decoding Time Stamp (DTS) tells the decoder when to decode the frame.
- The Presentation Time Stamp (PTS) tells the decoder when to display the frame.
In addition, a clock must be embedded to allow a time reference to be created. In MPEG-1 the clock is 33 bits with a 90 kHz input, while in MPEG-2 the clock is 42 bits with a 27 MHz input.
The clock, known as the Programme Clock Reference (PCR), is contained in the Transport Stream (TS). The System Clock Reference (SCR) is used in the MPEG-2 Program Stream and in the MPEG-1 system stream.
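The clock widths can be related to each other with a small sketch. It assumes the layout used by the MPEG-2 Systems specification, in which the 42-bit clock is a 33-bit base counting at 90 kHz plus a 9-bit extension counting from 0 to 299 at 27 MHz. The function names are illustrative.

```python
BASE_BITS = 33
TICKS_PER_BASE = 300          # 27 MHz / 90 kHz


def pcr_fields(ticks_27mhz):
    """Split a 27 MHz tick count into (33-bit base, 9-bit extension)."""
    base = (ticks_27mhz // TICKS_PER_BASE) % (1 << BASE_BITS)
    extension = ticks_27mhz % TICKS_PER_BASE
    return base, extension


def timestamp_90khz(seconds):
    """Convert a time in seconds to a 33-bit PTS/DTS style value at 90 kHz."""
    return int(round(seconds * 90_000)) % (1 << BASE_BITS)


def wrap_period_hours():
    """Time after which a 33-bit counter at 90 kHz wraps around."""
    return (1 << BASE_BITS) / 90_000 / 3600


print(pcr_fields(27_000_000))        # one second of the 27 MHz clock
print(timestamp_90khz(1.0))
print(round(wrap_period_hours(), 2))
```

With this layout a 33-bit time stamp at 90 kHz wraps after roughly 26.5 hours, so a decoder has to tolerate the wrap on long-running streams.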
Ordering frames for decoding (Cont.): Frame Reordering
MPEG-2 Transport Stream: Multiplexing many programs
Videoconferencing Compression Standards: ITU-T recommendations for Video Coding, H.261 and H.263
Application : video phone/video conference via ISDN/PSTN
Coding scheme
- ME/MC + DCT + Q + VLC
References
- ITU-T Recommendation H.261: Video Codec for Audiovisual Services at p x 64 kbit/s
- ITU-T Recommendation H.263: Video Coding for Low Bit Rate Communication
- K. R. Rao, J. J. Hwang, Techniques & Standards for Image, Video & Audio Coding
Overview of Videoconferencing: Audiovisual Communication; Multimedia documents including text, tables and images; Videoconferencing
Desktop Videoconferencing (Figure: hardware equipment used in a desktop videoconferencing system. Video and Audio Information -> Compression HW or SW -> Communication Network -> Decompression HW or SW -> Video and Audio Information)
CIF and QCIF Format
Two video signal formats permit a single recommendation to cover the different video formats, such as the 625-line (PAL or SECAM) and 525-line (NTSC) formats.
- Common Intermediate Format (CIF) : Y : 352 x 288, Cb & Cr : 176 x 144
- Quarter Common Intermediate Format (QCIF) : Y : 176 x 144, Cb & Cr : 88 x 72
(Figure: NTSC, PAL, SECAM -> Pre-processing to CIF -> Encoder/Decoder -> Post-processing from CIF -> NTSC, PAL, SECAM)
Brief Specification on H.261
For videoconferencing and videophone over the integrated services digital network (ISDN) at p x 64 kbps
1. The conversion to CIF from the video source such as NTSC, PAL, SECAM, ITU-R 601, etc., and vice versa.
2. The decoding of the BCH (511, 493) error correction code.
3. The use of intra or inter mode.
4. Motion estimation in the encoder (one MV per macroblock may be transmitted).
5. The use of the loop filter in the encoder.
6. The arithmetic process for computing the FDCT.
7. The control of the video data rate.
8. Any pre- or postprocessing.
Brief Specification on H.263
For videoconferencing and videophone over the plain old telephone service (POTS) at 33.6 kbps
1. Include various video formats such as QCIF, 4CIF, 16CIF.
2. Weighted quantizer matrix and VLC for B-blocks.
3. No loop filter; no macroblock addressing.
4. 1-bit coded or not-coded macroblock information in MB layer (separate coded block patterns for luminance (CBPY) and chrominance (MCBPC) components and for intra / inter mode).
5. 2-bit differential quantizer information in MB layer and 5-bit quantizer information in picture layer and in GOB layer.
6. Advanced prediction mode: half-pel motion estimation, median-based MV prediction, 4 MVs per macroblock, and overlapped block MC.
7. Unrestricted MV mode: when MV points outside the picture area, use edge pixels.
8. A syntax-based arithmetic coding (SAC) mode.
9. PB-frames mode (forward and bidirectional prediction).
10. 3D VLC (Last-Run-Level) for coding the transform coefficients.
H.261 standards in videoconferencing: Overview of the H.320 family of standards
- Video : H.261 (BCH(511,493))
- Audio : G.711 (64 kbps PCM) / G.728 (16 kbps LD-CELP)
- Data : T.120
- Mux/Demux : H.221
- Signaling control : H.230, H.242
- MCU control : H.243, H.231
H.263 standards in videoconferencing: Overview of the H.324 family of standards
- Video : H.263
- Audio : G.723 CELP
- Data : T.120 / T.434 / T.84
- Mux/Demux : H.223
- Signaling control : H.245
- MCU control : N/A
Design Considerations for H.263
- Low bitrate for GSTN application (consider V.34 modem = 33.6 kbps)
- Use of available technology
- Low complexity (low cost)
- Interoperability and/or coexistence with H.320/H.261
- Robust operation in the presence of channel errors
- Flexibility to allow for future extensions (e.g., higher bitrate)
- Quality-of-Service parameters such as resolution, delay, frame rate, color performance/rendition
- Subjective quality measurements
H.261 Source Coder
H.261 Source Coding Algorithm
Intra frame coding
- Sent only for the first picture or after a change of scene
- No motion estimation for the intra frame
- DCT, quantization, zig-zag scan and VLC (Huffman coding) are used for each MB
Inter frame coding
- Motion estimation and motion compensation
- The prediction error is transformed by DCT, quantized, zig-zag scanned and coded using VLC (Huffman coding).
Forced intra coding
- To control the accumulation of inverse transform mismatch error, each MB shall be coded in INTRA mode at least once every 132 times.
Loop filter
- Removes high-frequency noise and can be used to improve the visual effect.
H.263 Source Codec (Figures: video encoder, video decoder)
- ME1 : pel motion estimation and intra/inter decision
- ME2 : half-pel motion estimation
- M : Frame Store
- MBTYPE : decide block type and block pattern
- CC : Coding Control
- DCT : Discrete Cosine Transform
- PRED : make prediction block
- VLC(C) : VLC for transform coefficients
- VLC(M) : VLC for motion vectors
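Source Format
Parameter | CIF | QCIF
Pels per line, Y | 360 (352) | 180 (176)
Pels per line, Cr | 180 (176) | 90 (88)
Pels per line, Cb | 180 (176) | 90 (88)
Lines per frame, Y | 288 | 144
Lines per frame, Cr | 144 | 72
Lines per frame, Cb | 144 | 72
Frames per second | 29.97 (both formats)
Interlace | N/A (both formats)
(Figure: positioning of luminance and chrominance pixels, 4:2:0)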
Hierarchical Structure: GOBs and macroblocks
Motion Estimation/Compensation
Motion estimation
- Half-pixel values (for the best matched MV) are found using bilinear interpolation
Motion compensation (OBMC: Overlapped Block Motion Compensation)
- (a) remote motion vector selection for OBMC
- (b) weighting matrix for current luminance block
- (c) weighting matrix for top/bottom luminance block
- (d) weighting matrix for left/right luminance block
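Bilinear interpolation at half-pixel positions can be sketched as below. The sketch assumes the rounding used by H.263 for its half-pel samples, (A + B + 1) / 2 between two pixels and (A + B + C + D + 2) / 4 at the center of four, with integer division. Coordinates are in half-pel units and must stay inside the reference array; the function name is illustrative.

```python
def half_pel_sample(ref, x2, y2):
    """Return the reference sample at half-pel position (x2 / 2, y2 / 2).

    ref is indexed as ref[row][column]; x2 and y2 are non-negative
    coordinates in half-pel units.
    """
    x, y = x2 // 2, y2 // 2
    frac_x, frac_y = x2 % 2, y2 % 2
    a = ref[y][x]
    if not frac_x and not frac_y:
        return a                                   # integer position
    if frac_x and not frac_y:
        return (a + ref[y][x + 1] + 1) // 2        # between horizontal neighbors
    if frac_y and not frac_x:
        return (a + ref[y + 1][x] + 1) // 2        # between vertical neighbors
    return (a + ref[y][x + 1] + ref[y + 1][x] + ref[y + 1][x + 1] + 2) // 4


reference = [[10, 20], [30, 40]]
print([half_pel_sample(reference, x2, y2) for y2 in (0, 1) for x2 in (0, 1)])
```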
H.263 Syntax and Semantics: A syntax diagram for the H.263 video bit stream
H.263 Syntax and Semantics (cont.): Picture Layer
H.263 Syntax and Semantics (cont.): GOB Layer
H.263 Syntax and Semantics (cont.): Macroblock Layer
H.263 Syntax and Semantics (cont.): Block Layer
- INTRADC is present for every block of the macroblock if MCBPC indicates MB type 3 or 4.
- TCOEF is present if indicated by MCBPC or CBPY.
- TCOEF1, TCOEF2, TCOEF3 and TCOEFr : Last-Run-Level symbols
Extension to H.263++
Unrestricted motion vector mode
- Motion vectors are allowed to point outside the picture.
Syntax-based arithmetic coding (SAC) mode
- All the corresponding VLC/VLD operations of H.263 are replaced with arithmetic coding/decoding operations in this mode.
Advanced prediction mode
- The motion vector predictor is obtained by taking the median of the candidate predictors.
- (Figure: candidate predictors in the case of one motion vector per macroblock)
Extension to H.263++ (cont.)
- (Figure: candidate predictors in the case of four motion vectors per macroblock)
Optional PB-frames mode
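The median-based motion vector prediction mentioned for the advanced prediction mode can be sketched as follows, assuming three candidate predictors and a median taken separately for the horizontal and vertical components. Which neighboring blocks supply the candidates depends on the figures and is not modeled; the function names are illustrative.

```python
def median3(a, b, c):
    """Median of three numbers."""
    return sorted((a, b, c))[1]


def predict_mv(mv1, mv2, mv3):
    """Component-wise median of three candidate motion vectors (x, y)."""
    return (median3(mv1[0], mv2[0], mv3[0]),
            median3(mv1[1], mv2[1], mv3[1]))


def mv_difference(mv, predictor):
    """Difference that is actually coded: the vector minus its predictor."""
    return (mv[0] - predictor[0], mv[1] - predictor[1])


predictor = predict_mv((1, -2), (4, 0), (-3, 5))
print(predictor, mv_difference((2, 1), predictor))
```

Because the median is taken per component, the predictor need not equal any single candidate vector.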
MPEG-4 Visual Compression: MPEG-4 (ISO/IEC 14496)
Applications :
- Internet Multimedia
- Wireless Multimedia Communication
- Multimedia Contents for Computers and Consumer Electronics
- Interactive Digital TV
Coding scheme :
- Spatial redundancy : DCT + Quantization, Wavelet Transform
- Temporal redundancy : Motion estimation and compensation
- Statistical redundancy : VLC (Huffman Coding, Arithmetic Coding)
- Shape Coding : Context-based Arithmetic Coding
References :
- ISO/IEC 14496
Applications of MPEG-4
Multimedia (playback and retrieval of audiovisual programs)
- Interactive multimedia databases, Multimedia videotext
- Multimedia presentations: slide show, production/authoring
- Scalable & interactive applications on the WWW
- Animated talking head with speech synthesis
- Interactive DVD applications
Remote sensing (acquisition and monitoring of audiovisual data)
- Home, building, campus, traffic monitoring, visual input from human agent
Video store-and-forward
- Multimedia E-Mail, Video answering machines
MPEG-4 Parts and Versions
Content-Based Layering of Video: Each Video Object in a Scene is Coded and Transmitted Separately (Figure: Input -> VOP Definition -> VOP 0, VOP 1, VOP 2 Coding -> MUX -> Bitstream -> DEMUX -> VOP 0, VOP 1, VOP 2 Decoding -> Composition -> Output)
Simplified Block Diagram of Natural Video Encoding (Figure: a VOP of arbitrary shape passes through Shape Coding, Motion Estimation, Motion Compensation against the Previous Reconstructed VOP, and Texture Coding; the shape, motion, and texture information are combined in the MUX and sent to the Buffer)
A high level view of basic visual decoding; specialized decoding such as scalable, sprite and error resilient decoding is not shown. (Figure: Entropy Decoding and Visual Demux feeding Shape Decoding, Motion Compensation Decoding, Texture Decoding, Still Texture Decoding, Mesh Decoding, and Face Decoding, with outputs to Composition)
Simplified Video Decoding Process (Figure: the Demultiplexer splits the input into three coded bit streams. Shape: Shape Decoding, controlled by video_object_layer_shape. Motion: Motion Decoding, then Motion Compensation using the Previous Reconstructed VOP. Texture: Variable Length Decoding, Inverse Scan, Inverse DC & AC Prediction, Inverse Quantization, IDCT. The results are combined in VOP Reconstruction.)
Three Scanning Patterns of DCT Coefficients in MPEG-4 Video
(a) Alternate-Horizontal scan
 0  1  2  3 10 11 12 13
 4  5  8  9 17 16 15 14
 6  7 19 18 26 27 28 29
20 21 24 25 30 31 32 33
22 23 34 35 42 43 44 45
36 37 40 41 46 47 48 49
38 39 50 51 56 57 58 59
52 53 54 55 60 61 62 63
(b) Alternate-Vertical scan
 0  4  6 20 22 36 38 52
 1  5  7 21 23 37 39 53
 2  8 19 24 34 40 50 54
 3  9 18 25 35 41 51 55
10 17 26 30 42 46 56 60
11 16 27 31 43 47 57 61
12 15 28 32 44 48 58 62
13 14 29 33 45 49 59 63
(c) Zigzag scan
 0  1  5  6 14 15 27 28
 2  4  7 13 16 26 29 42
 3  8 12 17 25 30 41 43
 9 11 18 24 31 40 44 53
10 19 23 32 39 45 52 54
20 22 33 38 46 51 55 60
21 34 37 47 50 56 59 61
35 36 48 49 57 58 62 63
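The zigzag pattern (c) does not have to be stored as a table; it can be generated by walking the anti-diagonals of the block in alternating directions. The sketch below is one way to do that (the function names are illustrative), and it produces the scan position of each coefficient in the same layout as table (c).

```python
def zigzag_table(n=8):
    """Return table[row][col] = position of that coefficient in zigzag order."""
    positions = [(row, col) for row in range(n) for col in range(n)]
    # Coefficients are visited diagonal by diagonal (row + col constant).
    # Odd diagonals run from top right to bottom left (row ascending),
    # even diagonals run the other way (column ascending).
    positions.sort(key=lambda rc: (rc[0] + rc[1],
                                   rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))
    table = [[0] * n for _ in range(n)]
    for index, (row, col) in enumerate(positions):
        table[row][col] = index
    return table


def zigzag_scan(block):
    """Return the coefficients of a square block in zigzag order."""
    n = len(block)
    table = zigzag_table(n)
    out = [0] * (n * n)
    for row in range(n):
        for col in range(n):
            out[table[row][col]] = block[row][col]
    return out


for row in zigzag_table():
    print(" ".join("{:2d}".format(v) for v in row))
```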
Previous neighboring blocks used in DC prediction (Figure: blocks B, C, D in the upper row and blocks A, X, Y in the lower row, with the macroblock marked)
Previous neighboring blocks and coefficients used in AC prediction (Figure: blocks B, C, D in the upper row and blocks A, X, Y in the lower row, with the macroblock marked)
Templates for Context-based Arithmetic Coding of Binary Shape
(a) The INTRA template : context pixels c0 to c9 in the current BAB.
(b) The INTER template : context pixels c0 to c3 in the current BAB and c4 to c8 in the motion compensated BAB, where c6 is aligned with the pixel to be decoded.
The pixel to be decoded is marked with "?".
List of major natural video tools
Static sprite coding tools (1/3)
Static sprite coding tools (2/3)
Static sprite coding tools (3/3)
Scalable Texture Coding : Encoder
The basic modules
- Decomposition of the texture using the discrete wavelet transform (DWT)
- Quantization of the wavelet coefficients
- Coding of the lowest frequency subband using a predictive scheme
- Zero-tree scanning of the higher order subband wavelet coefficients
(Figure: input -> DWT; Low-Low band -> QUANT -> Prediction -> AC -> Bitstream; other bands -> QUANT -> ZeroTree Scanning -> AC -> Bitstream)
Scalable Texture Coding
Scalable Texture Coding : Decoder
Tools and Visual Object Types Visual Tools Simple Core Main Simple Scalable Visual Object Types N-bit Basic X X X X X X I-VOP, P-VOP AC/DC Prediction 4-MV, Unrestricted MV Animated 2D Mesh Basic Animated Texture Still Scalable Texture Simple Face Error resilience X X X X X X Slice Resynchronization Data Partitioning Reversible VLC Short Header X X X X X B-VOP X X X X X P-VOP with OBMC (Texture) Method 1/Method 2 X X X X Quantization P-VOP based temporal X X X X scalability Rectangular Arbitrary Shape Binary Shape X X X X X Grey Shape X Interlace X Sprite X Temporal Scalability X (Rectangular) Spatial Scalability X (Rectangular) N-Bit X Scalable Still Texture X X X 2D Dynamic Mesh with X X uniform topology 2D Dynamic Mesh with X Delaunay topology Facial Animation Parameters X 위원회
Visual Profiles
Profile | Object types supported
Simple | Simple
Simple Scalable | Simple, Simple Scalable
Core | Simple, Core
Main | Simple, Core, Main, Scalable Texture
N-Bit | Simple, Core, N-Bit
Hybrid | Simple, Core, Animated 2D Mesh, Basic Animated Texture, Scalable Texture, Simple Face
Basic Animated Texture | Basic Animated Texture, Scalable Texture, Simple Face
Scalable Texture | Scalable Texture
Simple FA | Simple Face
H.263 vs. MPEG-4
ITU-T : H.261 -> H.262 (MPEG-2) -> H.263 (1995) -> H.263/L (1999)
ISO : JPEG -> MPEG-1 -> MPEG-2 -> MPEG-4 (1998)
- MPEG-4 focuses on content-based compression and synthetic/natural hybrid coding for multimedia database access and mobile communication.
- MPEG-4 uses H.263 as a benchmark for subjective tests.
- MPEG-4 adopts many compression components of H.263.
- H.263 proves to be an excellent compression algorithm.
- H.263/L will be developed in cooperation with MPEG-4.
- MPEG-4 is a multiple-tool, multiple-algorithm, and multiple-profile standard.
Potential MPEG-4 Markets
- MPEG-4 will not replace MPEG-2 in digital broadcasting, DVD, VOD, etc.
- MPEG-4 may compete with H.32x in mobile videophone.
- MPEG-4 may compete with MHEG-5 based interactive TV or HTML based interactive TV.
- MPEG-4 may compete with QuickTime or Video for Windows in the multimedia title industry.
- MPEG-4 may be used in many other audiovisual applications.