개요 9 장영상압축 압축의필요성 데이터의대용량화 저장, 전송매체의한계압축의 2가지분류 손실 (lossy) 압축 시각적으로구분할수없는데이터허용 무손실 (lossless) 압축 처리상에서자료의손실이없으며입출력데이터와완전동일 의료영상, 도면, 설계도등 압축시고려사항 압축률 ( 압축률 = 원시자료 / 압축자료 ) encoding/decoding 시간 알고리즘복잡도계산되는자원의가용성과비용 표준화여부 1 2 런길이 (Run length) 인코딩 압축알고리즘분류도 3 Run 은반복되는문자, Length 는반복되는횟수를의미 BBBBBCCCCCCCCDEEEE (22 byte) => 45B8C1D4E (10 byte) 압축률 : 22 byte / 10 byte = 2.2 MyDogHasFleas (13 byte) => 1M1y1D1o1g1H1a1s1F1l1e1a1s (26 byte) 압축률 : 13 byte / 26 byte = 0.5 ( 원시자료보다 2배증가 ) 유일한자료로이루어진스트링의경우원자료를표현하고, Runlength encoding 은반복된자료에적용 전치문자 + 자료의개수 + 원자료의해당문자 BCDDDDDDDDEEEEEEEEE ( 19 byte) => BC*8D*9E ( 9 byte) 압축률 : 19 byte / 9 byte = 2.11 3 byte 보다긴 run 일경우압축률이높아짐 4 * MacPaint 영상파일의형식 개수바이트와전치문자를결합한런길이인코딩사용 count byte 반복될 Pattern 1 data byte count byte 유일한데이터들 ( 1 to 72 ) 0 data byte data byte MacPaint 부호화형태 변화가없는배경을가진영상에서효율적 만화.. 실제영상에서는많이사용하지않음 RLE를사용하는확장자 *. RLE, *.GEM (TIF, TG는옵션 ) PCX 영상을평면 (bit plane) 으로분리하여일부의평면들에대해 RLE를한다. 5 6 1
허프만 (Huffman) 코딩 배경 RLE 는 3 번이상반복되는문자들에이용되기때문에, 영어문장에서는잘수행이안됨 효율을높이기위해서빈번하게발생되는자료의표현에적은수의비트를사용하고드물게발생되는자료는많은비트들을사용하는가변길이코딩사용 1952 년 David Huffman 의논문 모르스부호와유사 : Dot 와 Dash 를사용 빈도수높은글자 : E(.), T(-) 빈도수낮은글자 : Q(- -, -) toupee t o u p e e (48 bit) 111 0100 10111 10110 100 100 (23 bit) 압축률 : 2.08 7 8 전치특성 (Prefix property) 이진트리코드를사용하는이유 한코드가다른코드앞에위치할수없다 e 가 01 로표현되어졌다면, 010 은다른영문자로인코딩될수없어야함 Decoder가 010 이라는비트스트림을스캔할때, 01 이들어오면 e 를출력하고다음의 0 에서스캔시작한문자에대한코드의길이가다른가변길이부호화에서는모든기법들은전치특성을가져야한다. 9 허프만코드의생성 문자의빈도수에따른문자들의배열을생성 이진트리를사용하여허프만테이블구성 1. 처음에는모든문자들을각기독립된노드인자유노드들로간주 2. 가장낮은빈도수를가진두개의자유노드들을자식노드로하는하나의부모노드를생성, 생성되는부모노드는자식노드들의합과동일한가중치를가짐 3. 연결한두개의자식노드들을자유노드리스트에서제거, 새롭게생성된부모노드를리스트에추가 4. 2단계와 3단계를하나의자유노드만남을때까지반복수행, 이자유노드는구성되는트리의루트가됨 10 빈도수표 인코딩에두단계필요 문자의빈도수측정 트리구조를사용하여압축 일반적으로표준허프만테이블사용 ( 최적화되지못함 ) 허프만코드생성 11 12 단점 2
Huffman coding 수정된허프만 (MH) 코딩 팩시밀리에주로사용 : ITU-TS의권고 T.0 G(group)1, G2 : 아날로그팩스 G3, G4 : 디지털방식, MH, MR, MMR TIFF파일의저장에서옵션으로사용가변길이에대한허프만코딩과 RLE의결합방법 63이하의 run들은종료코드로코드화 64이상의 run들은구성코드와종료코드의조합으로코드화 종료코드 0 ~ 63의 run 을 black bit와 white bit에대해허프만코드생성 구성코드 64 ~ 2560의수중64 step에대하여허프만코드를생성배경 팩시밀리의데이터는 85% 가흰색배경허프만코드기법으로짧은검정과긴흰색의 run들을최적화 13 14 15 16 수정된 RED(MR) 팩시밀리에서사용 Modified Relative Element ddress Designate 수정된허프만코딩을포함하는모집단배경 전송자료의 75% 가한화소의왼쪽이나오른쪽또는바로아래라인의자료가전송 K개의스캔라인에서첫번째라인은수정된허프만기법으로부호화된후에참조라인이된다. 나머지라인들은이전에인코딩된라인을참조로하여현코딩라인을인코딩한다. 17 18 3
패스모드코딩 b2가 a1의왼쪽에위치할경우발생수직모드코딩 a1의수평적인위치가 b1의오른쪽또는왼쪽의 3개의화소내에있을경우사용수평모드코딩 수직모드코딩이사용이될수없을경우사용 수정된 RED 흐름도 19 20 LZW 압축알고리듬 Lempel, Ziv Huffman Coding 은전형적으로한문자만을코딩하는제약사항을가지고있었고, 문자열을인코딩하는방법제안 ( 1977년 braham Lempel 과 Jacob Ziv 의논문 ) Ł LZW (Lempel Ziv Welch), (1984 년 Terry Welch 가 IEEE 에발표 ) 텍스트파일들에서의마침표다음의스페이스또는 the 라는단어같이빈번하게발생하는문자순열활용 LZW 압축알고리듬 encode strings of data by creating a code table no need to include code table with compressed data use 12 bit codewords, so code table has 4096 locations initialize first 256 locations to single characters parse input characters to form strings which are to be inserted into string table in locations 256-4096 decoder creates the same string table during decompression 21 22 LZW initialize first 256 entries to single characters update string table for each character in coded stream, except the first one after codeword is expanded to its corresponding string via string table, the first char of the string is appended to the previous string and added to the table can compress input stream in single pass require no prior information about input stream Encoding algorithm initialize table with single character strings STRING = first input character WHILE not end of input stream CHRCTER = next input character IF STRING+CHRCTER is in the string table STRING =STRING+CHRCTER ELSE output the code for STRING add STRING+CHRCTER to the string table STRING=CHRCTER END WHILE output code for string 23 24 4
Encoding Decoding algorithm BBB Encoder output String table output code representing codeword string 66 B 256 B 65 257 B 256 B 258 B 257 B 259 B 65 260 260 압축율 : 8x9/12x6=1 25 initialize table with single character strings OLD_CODE=first input character output translation of OLD_CODE WHILE not end of input stream NEW_CODE=next input character IF NEW_CODE is not in the string table STRING=translation of OLD_CODE STRING=STRING+CHRCTER ELSE STRING=translation of NEW_CODE output STRING CHRCTER=first character of STRING add OLD_CODE+CHRCTER to the string table OLD_CODE=NEW_CODE END WHILE 26 Decoding Decoder output String table string codeword string B 256 B B 257 B B 258 B 259 B 260 gif file rithmetic Coding, MIT take in complete data stream and output one specific codeword which is floating point number between 0 and 1 two pass algorithm first pass computes characters frequency and constructs probability table with ranges assigned to each entry of the table second pass does actual compression first character of input stream constrains output number to its corresponding range, and the range of next character of input stream further constrains the number, and so on. decoding is reverse of encoding achieve higher compression ratio than Huffman, but computationally expensive 27 28 산술코딩 0의발생확률 P0=2/3, 1의발생확률 P1=1/3 입력심볼 : 001 001의구간 : (2/3)x(2/3)x(2/3)=8/27=0. 01110...b (2/3)x(2/3)=4/9=0.0100... b 011 : 구간내에서비트수가가장작은데이터로코딩함 구간폭이큰, 즉, 발생확률이큰심볼일수록짧은길이의부호데이터가할당됨 LOW=0.0 HIGH=1.0 Encoding lgorithm WHILE not end of input stream get next CHRCTER RNGE=HIGH-LOW HIGH=(LOW+RNGE)*high range of CHRTER LOW=(LOW+RNGE)*low range of character END WHILE output LOW 29 30 5
Encoding Example RTHMETIC symbol probability range 0.0- C - E -0.3 H 0.3-0.4 I 0.4-0.6 M 0.6-0.7 R 0.7-0.8 T 0.8-1.0 31 R I T H M E T I C Encoding example range low high 1.0 0.0 0.07 0.08 0.01 0.074 0.076 0.002 0.0756 0.076 0.0004 0.07572 0.07576 0.0 0.075744 0.075748 0.000004 0.0757448 0.0757452 0.0000004 0.07574512 0.075745168 0.00000008 0.075745152 0.075745168 0.000000016 0.0757451536 0.0757451552 32 Decoding lgorithm Decoding get NUMBER DO find CHRCTER that has HIGH>NUMBER and LOW<NUMBER set HIGH and LOW corresponding to CHRCTER output CHRCTER RNGE=HIGH-LOW NUMBER=NUMBER-LOW NUMBER=NUMBER/RNGE UNTIL no more CHRCTERs 33 num 0.0757451536 0.0757451536 0.57451536 0.8725768 0.362884 0.62884 88400000002 0.884000000024 0.420000000120 00000000598 R I T H M E T I C range low 0.0 0.7 0.4 0.8 0.3 0.6 0.8 0.4 high 1.0 0.8 0.6 1.0 0.4 0.7 0.3 1.0 0.6 34 JPEG JPEG Joint(ISO+CCITT) Photographic Experts Group JPEG 은디지털화된자연이미지 ( continuous tone digital still image data ) 의압축에관한국제표준 80 년대중반에서후반에걸쳐 JPEG 정지영상압축방식을결정하는과정에서, 유럽은 DCT 방식을일본은 VQ 방식을그리고미국특히 IBM 사는 rithmetic Coding 을주장해서첨예한논쟁을벌였다. JPEG 은컬러및그레이스케일영상에적용되며이치영상 ( bi-level ) 에는적용될수없고 JBIG 에서다룬다. 비가역부호화 (DCT) 기본시스템 순차적 (sequential) build-up 방식, Huffman 부호화 확장시스템 점진적 (progressive) build-up 방식, 허프만 / 산술부호, multiple scan encoding, successive image 가역부호화 (DPCT) 가역시스템 순차적 build-up 방식, 허프만 / 산술부호 35 36 6
JPEG mode Baseline processes ( required for all DCT-based decoders ) 인코더 DCT-based process Source image : 8-bit samples within each component Sequential Huffman coding : 2 C and 2 DC tables Decoders shall process scans with 1, 2, 3, and 4 components Interleaved and non-interleaved scans Extended DCT-based processes 디코더 베이스라인 JPEG DCT-based process Source image : 8-bit or 12-bit samples Sequential or progressive Huffman coding : 4 C and 4 DC tables Decoders shall process scans with 1, 2, 3, and 4 components Interleaved and non-interleaved scans 37 38 JPEG Lossless processes Predictive process ( not DCT-based ) Source image : P-bit samples ( 2 P 16 ) Sequential Huffman coding : 4 DC tables Decoders shall process scans with 1, 2, 3, and 4 components Interleaved and non-interleaved scans Hierarchical processes Multiple frames ( non-differential and differential ) Uses extended DCT-based or lossless processes Decoders shall process scans with 1, 2, 3, and 4 components Interleaved and non-interleaved scans still-image compression standard has 3 lossless modes and 1 lossy mode sequential baseline encoding encode in one scan input & output data precision is limited to 8 bits, while quantized DCT values are restricted to 11 bits progressive encoding hierarchical encoding lossless encoding can achieve compression ratios of upto 20 to 1 without noticeable reduction in image quality 39 40 DCT-based mode Encoder work well for continuous tone images, but not good for cartoons or computer generated images tend to filter out high frequency data can specify a quality level (Q) with too low Q, resulting images may contain blocky, contouring and ringing structures. 5 steps of sequential baseline JPEG encoding transform image to luminance/chrominance space (YCbCr) reduce the color components (optional) partition image into 8x8 pixel blocks and perform DCT on each block quantize resulting DCT coefficients variable length code the quantized coefficients 41 42 7
subimage size selection Baseline JPEG compression step1 separate luminance and chrominance because more information can be removed from chrominance. human eye tends to perceive small changes in brightness better than in color step2 subsample color components by 2 horizontally and vertically (option) step3 convert elements in a tile to signed integers by subtracting one half of gray scale (0,0) element of DCTed block is DC and other elements are C 43 44 Step 1, 2 : RGB to YCbCr Y=0.3R+0.59G+1B Cb=-7R-0.33G+0.5B Cr=0.5R-0.42G-0.08B Y DCT Cb, Cr 가로방향으로 1 pixel 씩생략 24bits/pixel 1/20 step4 quantize and threshold by T * (u,v) = round [ T(u,v)/Z(u,v)] T(u,v) = transformed coefficient Z(u,v) = transform normalization array T * (u,v) = thresholded & quantized approximation of T(u,v) when Z(u,v) = c, T*(u,v) Y Cb Cr c 2c T(u,v) 45 46 at a receiver site, T * (u,v) must be multiplied by Z(u,v) and then inverse transformed to get approximate of subimage T ** (u,v) = T * (u,v)xz(u,v) normalization matrix is determined by quality level Q if Z(u,v) > 2T(u,v), then T*(u,v)=0 transform coefficient is completely truncated or discarded when T * (u,v) is represented with variable length code whose length increases as T(u,v) increases, the number of bits is controlled by Z(u,v) reorder quantized coefficients using zigzag pattern to form 1-dim sequence of quantized coefficients zigzag pattern may result in long run of zeros step5 DC coefficient is difference coded relative to DC coefficient of previous subimage C coefficients are broken into runs of zeros ending in a nonzero number. Each broken block is to be variable length coded classify each coefficient into some categories based on its range run length and category specifies basecode and the number of bits of a code determine tailing bits of a code considering least significant bits of coefficient apply one s complement to specify sign of the coefficient 47 48 8
8x8 block DCT 양자화 영상을 8x8 Block 으로나누어서 DCT 를수행 DCT 는공간도메인에서주파수도메인으로변환이며, 1 개의 DC 값과 63 개의 C 값이생성. 그림 ( 가 ) 는원영상이고 ( 나 ) 는 8x8 Block DCT 한영상 ( 나 ) 에서흰점들은 DC 성분이며 DC 성분주위로저주파성분값들이집중되어있고, block 우하단에고주파성분들이집중하게된다. 자연영상은대부분저주파영상이기때문에 block 우하단고주파성분영역에는 0 값이채워지게된다. DCT 계수들을해당양자화값으로나누어서정수값으로반올림양자화값은양자화테이블에제공 JPEG 에서지정한양자화테이블은없으며아래테이블은휘도예임. 가 나 49 50 엔트로피코딩 양자화된 DCT 계수는 1 개의 DC 와 63 개의 C 가생성된다. C 는 ZigZag Scan 순으로일련의스트링을형성하는데, 이때그림과같이 0 이아닌값으로끝나는소단위스트링으로나눌수있다. 즉, {-31},{0,-2},{-1},{-1},{-1},{0,0,-1},{-1},{0 0} 으로나눌수있다. -31 0-1 0 0 0 0 0-2 -1 0 0 0 0 0 0-1 -1 0 0 0 0 0 0-1 0 0 0 0 0 0 0-31 0-1 0 0 0 0 0-2 -1 0 0 0 0 0 0-1 -1 0 0 0 0 0 0-1 0 0 0 0 0 0 0-31 0-1 0 0 0 0 0-2 -1 0 0 0 0 0 0-1 -1 0 0 0 0 0 0-1 0 0 0 0 0 0 0 lossless encoding DC DC=-31, 이전의 DC=-34 DC=DCi-DCi-1=-31-(-34)=+3 C DC, CŁ(symbol-1)+(symbol-2) symbol-1 : 0 런길이 (RUNLENGTH), 비트수 (SIZE to encode non-zero MPLITUDE) VLC(variable-length code) Huffman Codes, prefix property symbol-2 : VLI(variable-length integer) 51 52 Example of 8x8 grayscale image DC 또는소단위스트링은 VLC + VLI 로인코딩된다. VLC Run Length 0 의개수 SIZE 0 이아닌값을코딩하는데필요한비트수 VLI 0 이아닌값자체 -31 0-1 0 0 0 0 0-2 -1 0 0 0 0 0 0-1 -1 0 0 0 0 0 0-1 0 0 0 0 0 0 0-31 0-1 0 0 0 0 0-2 -1 0 0 0 0 0 0-1 -1 0 0 0 0 0 0-1 0 0 0 0 0 0 0-31 0-1 0 0 0 0 0-2 -1 0 0 0 0 0 0-1 -1 0 0 0 0 0 0-1 0 0 0 0 0 0 0 DC 는이전블록의 DC 와의차분값 Diff = DCcurrent DCprevious 를 VLI 로코딩하고 VLC 에는 SIZE 만온다. VLI 가음수일경우는그값에서 1 을빼준다. 즉 2 은 10, -2 은 10-1 = 01 이된다. 는비트스트링을의미한다. C 의경우한소단위스트링에서 Run 은 16 개까지처리할수있으며, 내부적으로 (F,0) 으로처리하고, (F,0) 은연속 3 개까지올수있다. 더이상 0 이아닌계수값이없을때는 EOB 으로끝내며, 내부적으로 (0,0) 코드를사용한다. 53 Use Luminance Quantization Table 9.6 {-31},{0,-2},{-1},{-1},{-1},{0,0,-1},{-1},{0...0} Lossless Encoding DC=+3 SIZE=2 발생확률구함 VLC=011 VLI=11 011 11 11011 01 00 0 00 0 00 0 11100 0 00 0 1010 (Table 9.13) comp ratio=8*64/34 =15 54 9
Decoding 그림 9.16 decode VLCs and VLIs dequantize IDCT 그림 9.17 image degrade Huffman rithmetic coding ( 5-10% better but complex) Lossless 압축 그림 9.18 does not use DCT 압축율 : 1.6-2.5 JPEG file format : no standard JFIF(JPEG file interchange format) JPEG extension (TIFF 6.0) 55 56 Lossless 코딩 (1/2) Prediction of X 57 scheme 0 1 2 3 4 5 6 7 Prediction no prediction (differential encoding) B C +B-C +(B-C)/2 B+(-C)/2 (+B)/2 58 Vector Quantization 개요 VQ 인코더 / 디코더블록도 벡터양자화는원이미지를 n- 차원이미지벡터로분해 예를들어 n 은 l x m 픽셀값이 n 차원벡터를구성할수있다. 각이미지벡터, X, 는코드북에있는코드벡터 Xi, I = 1,., Nc, 와비교한다. 이때코드벡터도 n 차원을갖는다. 선택된 minimum distortion rule 을이용해가장잘매치되는코드벡터를선택해서이코드벡터의인덱스를코딩한다. 이때가장많이사용하는 distortion measure 로는 MSE 를사용한다. 원영상 Form X Minimize distortion index k image vectors d(x, Xi) Codebook Xi, i = 1,...,Nc index k Table Xk channel Look-up channel Codebook Xi, i = 1,...,Nc 59 60 10
기존영상압축기법과의차이 비트맵그래픽과벡터그래픽간의차이와매우유사 개별적인화소들에대한자료를저장하기보다는영상생성을위한명령어나방식을저장 압축시시간은많이걸리고복원시적게걸리기때문에 CD-ROM 과같은응용분야에적합 모든영상이마치그들과유사한보다작은영상들로구성되는것으로가정 프랙탈압축 예를들어푸른하늘은보다작은푸른것의부분들로구성되어있다 이산웨이블렛변환 웨이블렛이론은응용수학에서의새로운물결로, 영상압축은물론양자역학, 결정학, 음향학을포함하는많은과학들의응용분야에서발견되어졌다. 기본적인압축방법은영상을 DWT 변환수행후결과계수들을하나의임계값과비교해임계값보다적으면 0으로설정하고 0이아니면무손실코딩을수행한다. 61 62 동영상정보압축의원리 MPEG Moving Picture (coding) Experts Group 1988 년저장미디어를대상으로한영상압축표준위원회가 ISO 산하에설립 CD-ROM이적용대상화질 VHS-VTR 품질최초의상품 일본빅스터사의 디지털비젼가라오케 (92) 63 64 MPEG 1 MPEG 2 CD 저장미디어, 1.5Mbps I 화면 : 프레임내부호화화면 P 화면 : 프레임간부호화화면 B 화면 : 쌍방향예측부호화화면 SIF(source input format) 4:2:0 순차주사 방송, 통신용 TV, HDTV 65 66 11
H.261 MPEG 오디오의압축원리 ISDN(integrated services digital network) 64kbps-2Mbps 화상전화, 회의용화상 중간화상 (CIF:common intermediate format) 352x288, 30frame/sec, 4:2:0순차주사 압축 MC+DCT+Huffman 67 68 MPEG1 의 video codec 69 12