2. Outline
Image Compression Fundamentals
◦ Why Compress
◦ How is compression possible?
◦ Redundancy
◦ The Flow of Image Compression
◦ General Image Storage System
Lossy compression:
Reduce correlation between pixels
◦ Karhunen-Loeve Transform
◦ Discrete Cosine Transform
◦ Discrete Wavelet Transform
◦ Differential Pulse Code Modulation
◦ Differential Coding
Lossless compression:
Quantization and Source Coding
◦ Huffman Coding
◦ Arithmetic Coding
◦ Run Length Coding
◦ Lempel Ziv 77 Algorithm
◦ Lempel Ziv 78 Algorithm
Overview of Image Compression Algorithms
◦ JPEG
◦ JPEG 2000
◦ Shape-Adaptive Image Compression
4. The Flow of Image Compression
What is image compression coding?
◦ To store the image as a bit-stream that is as compact as possible, and to
display the decoded image on the monitor as faithfully as possible
Flow of compression
◦ The encoder converts the image file into a series of binary data, which is called
the bit-stream
◦ The decoder receives the encoded bit-stream and decodes it to reconstruct the
image
◦ The total data quantity of the bit-stream is less than the total data quantity of the
original image
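The encoder/decoder flow above can be sketched in a few lines. This is only an illustration under stated assumptions: the 64x64 ramp image is made-up data, and Python's `zlib` (a general-purpose lossless DEFLATE coder, not an image codec) stands in for the encoder and decoder.

```python
import zlib

# Hypothetical 64x64 8-bit grayscale image: a smooth horizontal ramp,
# stored row by row as raw bytes (illustrative data, not from the slides).
WIDTH, HEIGHT = 64, 64
original = bytes(x * 4 for _ in range(HEIGHT) for x in range(WIDTH))

# Encoder: image data -> bit-stream.
bitstream = zlib.compress(original, 9)

# Decoder: bit-stream -> reconstructed image.
reconstructed = zlib.decompress(bitstream)

# Compression ratio = original size / bit-stream size.
compression_ratio = len(original) / len(bitstream)
print(len(original), len(bitstream), round(compression_ratio, 1))
```

Because this stand-in coder is lossless, the reconstruction is bit-exact; a lossy coder would trade some fidelity for a smaller bit-stream.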
6. Spatial Redundancy
Elements that are duplicated within a structure, such as
pixels in a still image and bit patterns in a file. Exploiting
spatial redundancy is how compression is performed.
Intraframe coding
Compressing redundant areas within a single video frame.
Interframe coding
Interframe coding often provides substantial compression because in many motion sequences,
only a small percentage of the pixels are actually different from one frame to another.
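A minimal sketch of the interframe idea, assuming two tiny hypothetical frames: only the per-pixel difference from the previous frame is kept, and most of it is zero. The frame values and function names are illustrative, not taken from any video standard.

```python
# Two hypothetical 4x4 grayscale frames (illustrative values only).
frame_a = [
    [10, 10, 10, 10],
    [10, 50, 50, 10],
    [10, 50, 50, 10],
    [10, 10, 10, 10],
]
# The next frame is a copy in which only two pixels change.
frame_b = [row[:] for row in frame_a]
frame_b[1][1] = 60
frame_b[3][2] = 12


def frame_difference(prev, curr):
    """Per-pixel difference curr - prev."""
    return [[c - p for p, c in zip(prow, crow)] for prow, crow in zip(prev, curr)]


def reconstruct(prev, diff):
    """Rebuild the current frame from the previous frame and the difference."""
    return [[p + d for p, d in zip(prow, drow)] for prow, drow in zip(prev, diff)]


diff = frame_difference(frame_a, frame_b)
changed = sum(1 for row in diff for d in row if d != 0)
total = sum(len(row) for row in diff)
print(f"{changed} of {total} pixels changed")
```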
10. Reduce the Correlation between Pixels
Lossy compression:
Orthogonal Transform Coding
◦ KLT (Karhunen-Loeve Transform)
Maximal Decorrelation Process
◦ DCT (Discrete Cosine Transform)
JPEG is a DCT-based image compression standard, which is a lossy
coding method and may result in some loss of details and
unrecoverable distortion.
Subband Coding
◦ DWT (Discrete Wavelet Transform)
Divides the spectrum of an image into lowpass and highpass
components; the DWT is a well-known example.
JPEG 2000 is a 2-D DWT-based image compression standard.
Predictive Coding
◦ DPCM
To remove mutual redundancy between successive pixels and encode
only the new information
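The predictive-coding idea can be sketched with the simplest possible predictor, the previous pixel. This is an assumption made for illustration: practical DPCM may use other predictors, and in a lossy coder the residuals would also be quantized. The sample row is made up.

```python
def dpcm_encode(pixels):
    """Previous-pixel predictor: emit each pixel minus its prediction."""
    residuals = []
    prediction = 0  # assumed initial prediction for the first pixel
    for p in pixels:
        residuals.append(p - prediction)
        prediction = p
    return residuals


def dpcm_decode(residuals):
    """Invert dpcm_encode by accumulating the residuals."""
    pixels = []
    prediction = 0
    for r in residuals:
        p = prediction + r
        pixels.append(p)
        prediction = p
    return pixels


# Hypothetical row of neighbouring pixel values.
row = [100, 102, 103, 103, 105, 110, 110, 109]
residuals = dpcm_encode(row)
print(residuals)
```

After the first sample the residuals are small numbers clustered around zero, which is what makes them cheaper to entropy-code than the raw pixels.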
14. Outline
Image Compression Fundamentals
Reduce correlation between pixels (lossy)
Quantization and Source Coding (lossless)
Overview of Image Compression Algorithms
15. Quantization and Source Coding
Quantization (lossy) followed by source coding (lossless):
Quantization
◦ The objective of quantization is to reduce the precision and to achieve higher
compression ratio
◦ Lossy operation, which will result in loss of precision and unrecoverable
distortion
Source Coding
◦ To reduce the average number of bits per pixel of the image.
◦ Assigns short descriptions to the more frequent outcomes and long
descriptions to the less frequent outcomes
◦ Entropy Coding Methods
Huffman Coding
Arithmetic Coding
Run Length Coding
Dictionary Codes
Lempel-Ziv 77
Lempel-Ziv 78
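To make the effect of quantization on the achievable bits per symbol concrete, here is a small sketch assuming a uniform quantizer with a hypothetical step size of 16 and made-up sample values. The empirical entropy is used as the lower bound on the average code length an entropy coder could reach.

```python
import math
from collections import Counter


def quantize(values, step):
    """Uniform quantizer: map each value to the index of the nearest multiple of step."""
    return [round(v / step) for v in values]


def dequantize(indices, step):
    """Reconstruct approximate values from quantizer indices."""
    return [i * step for i in indices]


def entropy_bits(symbols):
    """Empirical entropy in bits per symbol."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


# Hypothetical sample values and step size (illustrative only).
values = [3, 12, 18, 25, 31, 44, 47, 52]
step = 16
indices = quantize(values, step)
restored = dequantize(indices, step)

print(indices, restored)
print(entropy_bits(values), entropy_bits(indices))
```

The indices need fewer bits per symbol than the original values, but `restored` differs from `values`: the precision lost in quantization cannot be recovered.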
17. Arithmetic Coding (1/3)
Arithmetic coding: a direct extension of Shannon-Fano-Elias coding
◦ Calculate the probability mass function p(x^n) and the cumulative distribution
function F(x^n) for the source sequence x^n
◦ Lossless compression technique
◦ Treats multiple symbols as a single data unit
Arithmetic Coding Algorithm
Input symbol is l
PreviousLow is the lower bound of the old interval
PreviousHigh is the upper bound of the old interval
Range is PreviousHigh - PreviousLow
Let PreviousLow = 0, PreviousHigh = 1, Range = PreviousHigh - PreviousLow = 1
WHILE (input symbol != EOF)
    get input symbol l
    Range = PreviousHigh - PreviousLow
    NewLow = PreviousLow + Range * IntervalLow of l
    NewHigh = PreviousLow + Range * IntervalHigh of l
    PreviousLow = NewLow, PreviousHigh = NewHigh
END
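The interval-narrowing loop can be sketched in Python as follows. The three-symbol alphabet and its sub-intervals are hypothetical, chosen only to keep the arithmetic easy to follow, and exact fractions are used so that rounding does not obscure the algorithm.

```python
from fractions import Fraction


def arithmetic_encode(message, intervals):
    """Narrow [low, high) once per symbol; return the final interval."""
    low, high = Fraction(0), Fraction(1)
    for symbol in message:
        sym_low, sym_high = intervals[symbol]
        rng = high - low
        # Both new bounds are computed from the OLD lower bound.
        low, high = low + rng * sym_low, low + rng * sym_high
    return low, high


# Hypothetical alphabet: symbol -> sub-interval [low, high) of [0, 1).
intervals = {
    "a": (Fraction(0), Fraction(1, 2)),
    "b": (Fraction(1, 2), Fraction(3, 4)),
    "c": (Fraction(3, 4), Fraction(1)),
}

low, high = arithmetic_encode("abc", intervals)
print(low, high)  # 11/32 3/8
```

Any number inside the final interval identifies the whole message, so the encoder can pick the one with the shortest binary expansion.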
18. Arithmetic Coding (2/3)
Input string: l l u u r e ?

Symbol  Probability  Sub-interval
k       0.05         [0.00, 0.05)
l       0.20         [0.05, 0.25)
u       0.10         [0.25, 0.35)
w       0.05         [0.35, 0.40)
e       0.30         [0.40, 0.70)
r       0.20         [0.70, 0.90)
?       0.10         [0.90, 1.00)

Interval after each input symbol:

Input    Low        High
(start)  0          1
l        0.05       0.25
l        0.06       0.10
u        0.070      0.074
u        0.0710     0.0714
r        0.07128    0.07136
e        0.071312   0.071336
?        0.0713336  0.0713360

0.0713348389 = 2^-4 + 2^-7 + 2^-10 + 2^-15 + 2^-16, which lies inside the final interval
Codeword: 0001001001000011
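As a check on the worked example, the sketch below turns the 16-bit codeword back into a binary fraction and decodes it. Two assumptions are made: u occupies [0.25, 0.35) so that the sub-intervals tile [0, 1) without overlap, and "?" acts as the end-of-message symbol because the example string ends with it.

```python
from fractions import Fraction

# Symbol -> sub-interval [low, high), as decimal strings for exactness.
TABLE = {
    "k": ("0.00", "0.05"),
    "l": ("0.05", "0.25"),
    "u": ("0.25", "0.35"),  # assumed start of 0.25 so the intervals tile [0, 1)
    "w": ("0.35", "0.40"),
    "e": ("0.40", "0.70"),
    "r": ("0.70", "0.90"),
    "?": ("0.90", "1.00"),
}
INTERVALS = {s: (Fraction(lo), Fraction(hi)) for s, (lo, hi) in TABLE.items()}


def codeword_to_value(bits):
    """Interpret a bit string as the binary fraction 0.b1 b2 b3 ..."""
    return sum(Fraction(int(b), 2 ** (i + 1)) for i, b in enumerate(bits))


def arithmetic_decode(value, intervals, end_symbol="?"):
    """Repeatedly find the sub-interval holding value, then rescale to [0, 1)."""
    message = []
    while True:
        for symbol, (lo, hi) in intervals.items():
            if lo <= value < hi:
                message.append(symbol)
                value = (value - lo) / (hi - lo)
                break
        if message[-1] == end_symbol:
            return "".join(message)


value = codeword_to_value("0001001001000011")
decoded = arithmetic_decode(value, INTERVALS)
print(float(value), decoded)
```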
19. Arithmetic Coding (3/3)
Input string: l l u u r e ?

Symbol  Probability  Huffman codeword
k       0.05         10101
l       0.20         01
u       0.10         100
w       0.05         10100
e       0.30         11
r       0.20         00
?       0.10         1011

Huffman coding: 18 bits
Codeword: 01, 01, 100, 100, 00, 11, 1011
Arithmetic coding: 16 bits
Codeword: 0001001001000011
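The bit counts can be reproduced by encoding the input string with the Huffman table from this slide. The sketch takes the codeword for "?" to be 1011, as listed in the table; the helper names are illustrative.

```python
# Huffman table from the slide: symbol -> codeword.
HUFFMAN = {
    "k": "10101",
    "l": "01",
    "u": "100",
    "w": "10100",
    "e": "11",
    "r": "00",
    "?": "1011",
}


def huffman_encode(message, table):
    """Concatenate the codeword of each symbol."""
    return "".join(table[s] for s in message)


def huffman_decode(bits, table):
    """Decode a prefix-free code by matching codewords bit by bit."""
    inverse = {code: sym for sym, code in table.items()}
    out, current = [], ""
    for b in bits:
        current += b
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)


message = "lluure?"
huffman_bits = huffman_encode(message, HUFFMAN)
arithmetic_bits = "0001001001000011"  # codeword from the arithmetic coding example
print(len(huffman_bits), len(arithmetic_bits))  # 18 16
```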
Arithmetic coding yields better compression because it encodes the whole message
as a single codeword instead of concatenating separate per-symbol codewords
Most of the computations in arithmetic coding use floating-point arithmetic.
However, hardware supports only finite precision
As each new symbol is coded, the precision required to represent the range grows
There is a potential for overflow and underflow
If the fractional values are not scaled appropriately, encoding errors occur
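The precision problem is easy to provoke. The sketch below, an illustration rather than a real coder, keeps coding the same symbol (sub-interval [0.05, 0.25)) with ordinary 64-bit floats and reports how many symbols fit before the interval collapses to zero width. The function name and the limit of 1000 symbols are assumptions.

```python
def symbols_until_collapse(sym_low, sym_high, max_symbols=1000):
    """Count symbols coded before the float interval [low, high) becomes empty."""
    low, high = 0.0, 1.0
    for count in range(1, max_symbols + 1):
        rng = high - low
        low, high = low + rng * sym_low, low + rng * sym_high
        if low >= high:
            # The interval can no longer distinguish further symbols.
            return count
    return None


collapsed_after = symbols_until_collapse(0.05, 0.25)
print(collapsed_after)
```

Practical arithmetic coders avoid this by working with fixed-width integers and rescaling the interval whenever its leading bits are settled.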
20. Zero-Run-Length Coding-JPEG (1/2)
The notation (L,F)
◦ L zeros in front of the nonzero value F
EOB (End of Block)
◦ A special coded value meaning that all remaining elements are zero
◦ If the last element of the vector is nonzero, the EOB marker is not added
An Example:
1. 57, 45, 0, 0, 0, 0, 23, 0, -30, -16, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, ..., 0
2. (0,57) ; (0,45) ; (4,23) ; (1,-30) ; (0,-16) ; (2,1) ; EOB
3. (0,57) ; (0,45) ; (4,23) ; (1,-30) ; (0,-16) ; (2,1) ; (0,0)
4. (0,6,111001) ; (0,6,101101) ; (4,5,10111) ; (1,5,00001) ; (0,5,01111) ; (2,1,1) ; (0,0)
5. 1111000 111001 , 1111000 101101 , 1111111110011000 10111 ,
11111110110 00001 , 11010 01111 , 11100 1 , 1010
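Steps 1 to 4 of the example can be sketched as below. Assumptions: the vector holds the 63 coefficients being run-length coded, the ZRL rule that real JPEG applies to runs longer than 15 zeros is ignored, and the final Huffman lookup of step 5 is left out. For -16 this yields size 5 and bits 01111, since magnitudes 16 to 31 need five bits.

```python
def zero_run_length(coeffs):
    """Convert a coefficient sequence into (L, F) pairs, ending with 'EOB'
    when the sequence finishes with zeros."""
    pairs = []
    run = 0
    for value in coeffs:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    if run > 0:
        pairs.append("EOB")
    return pairs


def size_and_bits(value):
    """JPEG-style amplitude code for a nonzero value: the bit length, then
    plain binary for positives or one's complement for negatives."""
    size = abs(value).bit_length()
    if value < 0:
        value += (1 << size) - 1
    return size, format(value, f"0{size}b")


# The example vector, padded with zeros to 63 coefficients (assumed length).
coeffs = [57, 45, 0, 0, 0, 0, 23, 0, -30, -16, 0, 0, 1] + [0] * 50
pairs = zero_run_length(coeffs)
triples = [(run, *size_and_bits(f)) for run, f in pairs if pairs and (run, f) != tuple("EOB")] \
    if False else [(p[0], *size_and_bits(p[1])) for p in pairs if p != "EOB"]
print(pairs)
print(triples)
```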
21. Dictionary Codes
Dictionary-based data compression algorithms substitute a
shorter token for each occurrence of a repeated pattern
Dictionary codes are compression codes that dynamically
construct their own coding and decoding tables “on the fly”
by looking at the data stream itself
It is not necessary for us to know the symbol probabilities
beforehand. These codes take advantage of the fact that,
quite often, certain strings of symbols are “frequently
repeated” and these strings can be assigned code words that
represent the “entire string of symbols”
Two series
◦ Lempel-Ziv 77: LZ77, LZSS, LZBW
◦ Lempel-Ziv 78: LZ78, LZW, LZMW
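A minimal LZ78 sketch shows how the dictionary is built on the fly by both encoder and decoder, without any symbol probabilities. The token format (dictionary index, next character) and the handling of a leftover phrase at the end of the input are choices made for this illustration.

```python
def lz78_encode(text):
    """Emit (index, char) tokens; index refers to the longest known prefix."""
    dictionary = {"": 0}
    tokens = []
    phrase = ""
    for ch in text:
        if phrase + ch in dictionary:
            phrase += ch  # keep extending the current match
        else:
            tokens.append((dictionary[phrase], ch))
            dictionary[phrase + ch] = len(dictionary)
            phrase = ""
    if phrase:
        # Input ended inside a known phrase: emit it with an empty character.
        tokens.append((dictionary[phrase], ""))
    return tokens


def lz78_decode(tokens):
    """Rebuild the same dictionary while decoding."""
    phrases = [""]
    out = []
    for index, ch in tokens:
        phrase = phrases[index] + ch
        phrases.append(phrase)
        out.append(phrase)
    return "".join(out)


tokens = lz78_encode("abababab")
print(tokens)
print(lz78_decode(tokens))
```

Each token can stand for a progressively longer string, which is where the compression comes from on repetitive input.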
22. Outline
Image Compression Fundamentals
Reduce correlation between pixels
Quantization and Source Coding
Overview of Image Compression Algorithms