(1)

Image and Image Processing

Unit-II

EL-447 (Multimedia Systems and Networks)

(2)

Image

What is An Image?

Grayscale image

A grayscale image is a function I(x,y) of the two spatial coordinates of the image plane.

I(x,y) is the intensity of the image at the point (x,y) on the image plane.

We can regard I(x,y) as taking values in R⁺ = [0, ∞).

We can restrict the image to be bounded by some rectangle [0,a] ×[0,b]

I: [0, a] × [0, b] → [0, ∞)

Color image

Can be represented by three functions, R(x,y) for red, G(x,y) for green, and B(x,y) for blue.

Details on color vision and representation will be discussed later.

(3)

Classification of Images

Reflection Images

Information primarily about object surfaces

Examples: Optical imaging, radar, sonar, laser

Emission Images:

Information primarily internal to the object

Example: Thermal, infrared, MRI

Absorption images

Information primarily about the internal structure of the object

Examples: X-ray, transmission microscopy, sonic images

(4)

Digital image

A sampled and quantized image is called a digital image.

A digital image is an image f(x,y) that has been digitized both in spatial coordinates and in brightness. The value of f at any point (x,y) is proportional to the brightness (or gray level) of the image at that point.

(5)

Digital image

(6)

Digital image

A digital image can be considered a matrix whose row and column indices identify a point in the image and whose corresponding element value identifies the gray level at that point.

[Figure: pixel values in a highlighted region of the image]
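As a quick illustration of this matrix view, here is a minimal NumPy sketch (the pixel values are made up for the example):

import numpy as np

# A toy 4x4 grayscale image: each entry is the gray level at (row, column).
img = np.array([[ 12,  50,  50,  10],
                [ 40, 200, 210,  35],
                [ 38, 205, 215,  30],
                [ 15,  45,  48,  11]], dtype=np.uint8)

print(img[1, 2])      # gray level at row 1, column 2 -> 210
print(img[1:3, 1:3])  # pixel values in a highlighted 2x2 region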

(7)

Examples of Digital image

(8)

A Simple Image Model

Light intensity function:

image refers to a 2D light-intensity function, f(x,y)

the amplitude of f at spatial coordinates (x,y) gives the intensity (brightness) of the image at that point.

Light is a form of energy, thus f(x,y) must be nonzero and finite:

0 < f(x,y) < ∞

Illumination and reflectance:

the basic nature of f(x,y) may be characterized by 2 components:

Illumination, i(x,y): the amount of source light incident on the scene being viewed.

Reflectance, r(x,y): the amount of light reflected by the objects in the scene.

(9)

A Simple Image Model

f (x,y) = i(x,y) r(x,y)

0 < i(x,y) < ∞

determined by the nature of the light source

0 < r(x,y) < 1

determined by the characteristics of the objects in a scene, bounded between total absorption (0) and total reflectance (1).
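A minimal sketch of this model, with illustrative (made-up) illumination and reflectance fields:

import numpy as np

rng = np.random.default_rng(0)
i = rng.uniform(10.0, 1000.0, size=(64, 64))  # illumination: 0 < i(x,y) < infinity
r = rng.uniform(0.0, 1.0, size=(64, 64))      # reflectance: 0 (absorption) to 1 (total reflectance)
f = i * r                                     # f(x,y) = i(x,y) r(x,y)
assert np.isfinite(f).all() and (f >= 0).all()  # f is nonnegative and finite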

(10)

Digital Image Processing?

One picture is worth more than ten thousand words.

Interest in DIP stems from two principal application areas:

Improvement of pictorial information for human interpretation.

Processing of image data for storage, transmission and representation for autonomous machine perception.

(11)

What is Image Processing?

Image processing is the subclass of signal processing concerned specifically with pictures.

Its goal is to improve image quality for human perception and/or computer interpretation.

Image → Image Processing → Better Image

(12)

Fields that deal with images

Computer Graphics: the creation of images.

Image Processing: the enhancement or other manipulation of an image, the result of which is usually another image.

Computer Vision: the analysis of image content.

(13)

Fields that deal with images

Input \ Output      Image               Description
Image               Image Processing    Computer Vision
Description         Computer Graphics   AI

Computer Vision, Image Processing, and Computer Graphics often work together to produce amazing results.

(14)

Applications of Image Processing

Improvement of pictorial information for human interpretation.

Processing of image data for storage,

transmission, and representation for autonomous machine perception

There are limitless applications of image processing. Some examples are:

Radiation from the Electromagnetic spectrum

Acoustic, geological imaging, Radar Imaging

Medical Imaging : X-ray, Ultrasonic, MRI

Industrial inspection, law enforcement

Computer (synthetic images used for modeling and visualization)

(15)

Classification of Image Processing

Low-level: inputs and outputs are images.

Primitive operations such as image preprocessing to reduce noise, contrast enhancement, and image sharpening.

Mid-level: inputs may be images; outputs are attributes extracted from those images.

Segmentation

Description of objects

Classification of individual objects

High-level: image analysis.

(16)

Digital Image Processing (DIP)

Image: Any 2-D signal/data.

DIP: the processing of a two-dimensional picture by a digital computer.

An image is captured by a sensor (such as a monochrome or color TV camera) and digitized. If the output of the camera or sensor is not already in digital form, an analog-to-digital converter digitizes it.

(17)

Image Acquisition

Camera

A camera consists of two parts:

• A lens that collects the appropriate type of radiation emitted from the object of interest and forms an image of the real object.

• A semiconductor device, the so-called charge-coupled device (CCD), which converts the irradiance at the image plane into an electrical signal.

(18)

Image Acquisition

Frame Grabber

A frame grabber only needs circuits to digitize the electrical signal from the imaging sensor and store the image in the memory (RAM) of the computer.

(19)

Image Enhancement

To bring out detail that is obscured, or simply to highlight certain features of interest in an image.

(20)

Image Restoration

Improving the appearance of an image.

Restoration techniques tend to be based on mathematical or probabilistic models of image degradation.

[Figure: examples of a distorted image and the restored image]

(21)

Image Compression

Reducing the storage required to save an image or the bandwidth required to transmit it.

Ex. JPEG (Joint Photographic Experts Group) image compression standard.

Wavelets: a foundation for representing images at various degrees of resolution; used in image data compression and in pyramidal representation (images are subdivided successively into smaller regions).

(22)

Image Segmentation

The computer tries to separate objects from the image background. This is one of the most difficult tasks in DIP.

A rugged segmentation procedure brings the process a long way toward successful solution of an imaging problem.

The output of the segmentation stage is raw pixel data, constituting either the boundary of a region or all the points in the region itself.

(23)

Representation & Description

Representation: deciding whether the data should be represented as a boundary or as a complete region.

Boundary representation focuses on external shape characteristics, such as corners and inflections.

Region representation focuses on internal properties, such as texture or skeletal shape.

(24)

Representation & Description

(25)

Recognition and Interpretation

Recognition: the process that assigns a label to an object based on the information provided by its descriptors.

Interpretation: assigning meaning to an ensemble of recognized objects.

(26)

Colour models

(27)

Color Perception

Motivation for Color Image Processing

Color is a powerful descriptor that often simplifies object identification and segmentation.

Humans can distinguish thousands of color shades and intensities, compared with only about two dozen shades of gray (important in manual image analysis).

Color Representation for images and video

How the physical spectrum of a scene is transformed into RGB components, and how these components are transformed back into a physical spectrum at the display.

Cones vs. Rods

3 types of cones (for color)

1 type of rod (night vision, no color)

(28)

Light is a part of EM wave

What is color?

Color is the perceptual result of light having wavelength 400 nm to 700 nm that is incident upon the retina.

“Power distribution exists in the physical world, but color exists only in the eye and the brain”

(29)

Light is a part of EM wave

•Perceived color depends on spectral content (wavelength composition) e.g., 700nm ~ red.

•“Spectral color”: a light with a very narrow bandwidth.

•A light with equal energy in all visible bands appears white.

(30)

Illuminating and Reflecting Light

Illuminating sources (primary light):

emit light (e.g. the sun, light bulb, TV monitors)

perceived color depends on the emitted freq.

follows additive rule

» R+G+B=White

Reflecting sources (Secondary light):

reflect an incoming light (e.g. the color dye, matte surface, cloth)

perceived color depends on reflected freq (=emitted freq -absorbed freq.)

follows subtractive rule

» R+G+B=Black

(31)

Color Perception

The color that humans perceive in an object is the light reflected from the object.

[Figure: illumination source → scene → reflection → eye]

(32)

Eye vs. Camera

Camera Components          Eye Components
Lens                       Lens, cornea
Shutter                    Iris, pupil
Film                       Retina
Cable to transfer images   Optic nerve to send the information to the brain

Source: http://www.macula/anatomy/retina frame.html

(33)

Human perception of color

The retina contains photoreceptors:

Cones: day vision, can perceive color tone. There are red, green, and blue cones (the tri-receptor theory of color vision). About 65% of cones are sensitive to red light, 33% to green, and 2% to blue.

Rods: night vision, perceive brightness only.

Color sensation is characterized by:

Luminance (brightness)

Chrominance (hue and saturation together)

Hue (color tone): specifies the color tone (redness, greenness, etc.); depends on the peak wavelength.

Saturation (color purity): describes how pure the color is; depends on the spread (bandwidth) of the light spectrum and reflects how much white light is added.

(34)

Color Mixing

Primary colors for illuminating sources:

Red, Green, Blue (RGB)

A color monitor works by exciting red, green, and blue phosphors using separate electron guns.

Primary colors for reflecting sources (also known as

secondary colors):

Cyan, Magenta, Yellow (CMY)

Color printer works by using cyan, magenta, yellow and black (CMYK) dyes.

(35)

Color Mixing (RGB vs. CMY)

(36)

Color complements

[Figures: complements on the color circle; color hue specification]

(37)

Color Representation Models

A color model is a specification of a 3-D coordinate system and a subspace within that system where each color is represented by a single point.

Many color models are in use depending upon the application.

Models based on primary colors (hardware Oriented)

RGB (used in color monitors and video cameras).

CMY, CMYK (used in color printers)

Models based on luminance and chrominance (Application Oriented)

HSI (hue, saturation and intensity): used in developing image processing algorithms based on HVS.

HSV (hue, saturation and value): similar to HSI.

YIQ (used in NTSC color TV) (I, Q are two chrominance)

YUV (used in digital color TV, video coding)

(38)

RGB Color Model

Based on Cartesian coordinate system with color subspace as a cube.

R, G, and B are at three corners along the three axes; cyan, magenta, and yellow are at three other corners; black is at the origin; and white is at the corner farthest from the origin.

Used in color monitors and digital cameras.

All color values are normalized inside a unit cube.

With 8 bits for each color component (24 bits/pixel), a 1K×1K display needs a display buffer of 3 MB, and a total of about 16.7 million colors can be represented: (2^8)^3 = 16,777,216.

(39)

Example: 02

(40)

Examples

(41)

CMY Color Model

C (cyan), M (magenta), and Y (yellow) are the secondary colors of light, or the primary colors of pigments.

Each color is represented by these three. A cyan-coated surface does not reflect red light.

Equal amounts of cyan, magenta, and yellow produce black. In practice, this produces a muddy-looking black. To produce true black, a fourth color, black, is added, giving the CMYK color model.

Used in color printers and copiers.

RGB to CMY conversion: C = 1 - R, M = 1 - G, Y = 1 - B (for values normalized to [0, 1]).
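A one-line sketch of that conversion (assuming RGB values normalized to [0, 1]):

import numpy as np

def rgb_to_cmy(rgb):
    # C = 1 - R, M = 1 - G, Y = 1 - B
    return 1.0 - np.asarray(rgb, dtype=float)

print(rgb_to_cmy([1.0, 0.0, 0.0]))  # pure red -> C=0, M=1, Y=1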

(42)

CMY Color Model (Example)

[Figure: original image and its cyan, magenta, and yellow components]

(43)

HSI or HSV color models

Each color is specified in terms of its Hue (H), Saturation (S) and intensity (I) or value (V).

This model is sometimes referred to as HSV instead of HSI.

The main advantages of this model are that:

Chrominance (H, S) and luminance (I) components are decoupled.

Hue and saturation are intimately related to the way the human visual system perceives color.

In short, the RGB model is suited for image color generation, whereas the HSI model is suited for image color description.

(44)

HSI or HSV color models

It is related to the RGB model as follows:
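The conversion equations were an image on the original slide; the standard RGB-to-HSI equations (as given in common image processing texts, with RGB normalized to [0, 1]) are:

I = (R + G + B) / 3

S = 1 - 3 · min(R, G, B) / (R + G + B)

H = θ if B ≤ G, otherwise 360° - θ, where θ = cos⁻¹{ ½[(R - G) + (R - B)] / √[(R - G)² + (R - B)(G - B)] }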

(45)

HSI or HSV color models

Converting Color from HSI to RGB

(46)

HSI or HSV color models

(47)

HSI or HSV color models

(48)

HSI or HSV color models

(49)

HSI or HSV color models (Examples)

[Figure: original image and its hue, saturation, and intensity components]

(50)

YIQ or YUV color models

Each color is represented in terms of a luminance

component (Y) and two chrominance or color components:

inphase (I) and quadrature (Q) components or Y and V components.

YIQ is used in United States commercial TV broadcasting National Television System Committee (NTSC ) system.

YUV is used in most of the European and Asian TV broadcasting (PAL system).

Used for maintaining transmission efficiency and to provide compatibility with monochrome TV.

The Y component provides all the video information required by a monochrome TV receiver/monitor.

The main advantage of these models is that the luminance and chrominance components are decoupled and can be

processed separately.

(51)

YIQ or YUV color models

The I signal lies 33° counterclockwise from +(R-Y), where the eye has maximum color resolution:

I = 0.74(R-Y) - 0.27(B-Y)

The Q signal lies 33° counterclockwise from +(B-Y), where the eye has minimal color resolution:

Q = 0.48(R-Y) + 0.41(B-Y)

(52)

YIQ or YUV color models

In the PAL system, the weighted (B-Y) and (R-Y) signals are modulated without being given the 33° phase shift.

Chrominance information is represented in terms of U and V, where

U = 0.493(B-Y) and V = 0.877(R-Y)

YIQ is related to the RGB model by:
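The conversion matrix was an image on the original slide; the standard NTSC RGB-to-YIQ matrix (whose coefficients also appear on the pre-processing slide later in this deck) is:

Y = 0.299 R + 0.587 G + 0.114 B
I = 0.596 R - 0.275 G - 0.321 B
Q = 0.212 R - 0.523 G + 0.311 B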

(53)

YIQ or YUV color models (Examples)

[Figure: original image and its Y, U, and V components]

(54)

Comparison Example-1

(55)

Comparison Example-2

(56)

Image Compression & JPEG

(57)

Image Compression

What is Image Compression?

Reduction of the amount of data required to represent a digital image → removal of redundant data.

Transforming a 2-D pixel array into a statistically uncorrelated data set.

Why Compression?

Important in data storage and data transmission

Examples:

Progressive transmission of images (Internet)

Video coding (HDTV, teleconferencing)

Digital libraries and image databases

Remote sensing

Medical imaging

(58)

Why Need Compression?

Savings in storage and transmission: multimedia data (especially image and video) have large volumes, and it is difficult to send real-time uncompressed video over current networks.

Accommodate relatively slow storage devices, which do not allow playing back uncompressed multimedia data in real time:

1x CD-ROM transfer rate ~ 150 kB/s

320 × 240, 24-bit color video at 24 fps ~ 5.5 MB/s

=> 36 seconds needed to transfer 1 second of uncompressed video from CD

(59)

Example: Storing An Encyclopedia

500,000 pages of text (2 kB/page) ~ 1 GB => 2:1 compression

3,000 color pictures (640 × 480 × 24 bits) ~ 3 GB => 15:1

500 maps (640 × 480 × 16 bits = 0.6 MB/map) ~ 0.3 GB => 10:1

60 minutes of stereo sound (176 kB/s) ~ 0.6 GB => 6:1

30 animations, average 2 minutes long (640 × 320 × 16 bits × 16 frames/s = 6.5 MB/s) ~ 23.4 GB => 50:1

50 digitized movies, average 1 minute long (640 × 480 × 24 bits × 30 frames/s = 27.6 MB/s) ~ 82.8 GB => 50:1

A total of 111.1 GB of storage capacity would be required without compression; with compression this reduces to 2.96 GB.

(60)

Lossless vs. Lossy Compression

Lossless (or information-preserving) compression:

Images can be compressed and restored without any loss of information (e.g., medical imaging, satellite imaging)

Lossless compression tools

Entropy coding : Huffman, Arithmetic, Lempel-Ziv, run-length

Predictive coding: reduce the dynamic range to code

Transform: enhance energy compaction

Lossy compression:

Perfect recovery is not possible but provides a large data compression (e.g., TV signals, teleconferencing)

Lossy compression tools

Discarding and thresholding

Quantization: Scalar quantization and vector quantization

(61)

Data Redundancy

Data are the means by which information is conveyed

Various amounts of data may be used to

represent the same amount of information

Data redundancy: if n₁ and n₂ denote the number of information-carrying units in two data sets that represent the same information, the relative data redundancy of the first data set is:
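The formula itself was an image on the original slide; the standard definition is:

R_D = 1 - 1/C_R, where C_R = n₁/n₂ is the compression ratio.

When n₂ = n₁, C_R = 1 and R_D = 0 (no redundancy); when n₂ is much smaller than n₁, C_R is large and R_D approaches 1 (highly redundant data).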

(62)

Data Redundancy

(63)

Data Redundancy

In digital image compression there exist three basic data redundancies:

Inter-pixel or spatial redundancy

Psychovisual redundancy

Statistical redundancy

(64)

Inter-pixel Redundancy

(65)

Inter-pixel Redundancy

Second image shows high correlation between pixels 45 and 90 samples apart

Adjacent pixels of both images are highly correlated

Interpixel redundancy: the value of any given pixel can be reasonably predicted from the values of its neighbors; as a consequence, any pixel carries a small amount of information

Interpixel redundancy can be reduced through mappings (e.g., differences between adjacent pixels).
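A minimal sketch of such a mapping: replacing each pixel of a row by its difference from the left neighbor yields small, low-entropy values, and the original is recovered exactly by a running sum.

import numpy as np

row = np.array([100, 102, 103, 103, 104, 106], dtype=np.int16)  # made-up correlated pixels
diffs = np.diff(row, prepend=0)   # [100, 2, 1, 0, 1, 2] -- mostly small values
restored = np.cumsum(diffs)       # lossless reconstruction
assert (restored == row).all()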

(66)

Psychovisual Redundancy

The eye does not respond with equal sensitivity to all visual information

Certain information has less relative importance than other information in normal visual processing

(psychovisually redundant)

It can be eliminated without significantly impairing the quality of image perception

(67)

Transform Coding

(68)

Transform-based Image Coder

(69)

JPEG: Still Image Compression Standard

JPEG: “Joint Photographic Experts Group”

Formally: ISO/IEC JTC1/SC29/WG1

Work commenced in the mid-1980s.

Draft international standard 1991.

Widely used for image exchange, WWW, and digital photography.

ISO: International Organization for Standardization; IEC: International Electrotechnical Commission; JTC1: Joint ISO/IEC Technical Committee (Information Technology); SC29: Sub-committee 29 (Coding of Audio, Picture, Multimedia and Hypermedia Information); WG1: Working Group 1 (JBIG, JPEG).

(70)

What is JPEG?

The Joint Photographic Expert Group (JPEG), under

both the International Standards Organization (ISO) and the International Telecommunications Union-Telecommunication Sector (ITU-T)

– www.jpeg.org

Has published several standards

JPEG: lossy coding of continuous tone still images

Based on DCT

JPEG-LS: lossless and near lossless coding of continuous tone still images

Based on predictive coding and entropy coding

JPEG2000: scalable coding of continuous tone still images (from lossy to lossless)

Based on wavelet transform

(71)


Image Compression: JPEG

Summary:

JPEG Compression

DCT

Quantization

Zig-Zag Scan

RLE and DPCM

Entropy Coding

JPEG Modes

Sequential

Lossless

Progressive

Hierarchical

Sources:

The JPEG website:

http://www.jpeg.org

(72)

JPEG Belongs to Hybrid Coding Schemes

[Diagram: JPEG combines a lossy coding stage with a lossless coding stage (RLE, Huffman, or arithmetic coding)]

(73)


Why JPEG?

The compression ratio of lossless methods (e.g., Huffman, Arithmetic, LZW) is not high enough for image and video compression.

JPEG uses transform coding; it is largely based on the following observations:

Observation 1: A large majority of useful image content changes relatively slowly across an image, i.e., it is unusual for intensity values to alter up and down several times in a small area, for example within an 8 × 8 image block. Translated into the spatial frequency domain, this implies that, generally, lower spatial frequency components contain more information than the high-frequency components, which often correspond to less useful details and noise.

Observation 2: Experiments suggest that humans are more immune to the loss of higher spatial frequency components than to the loss of lower frequency components.

(74)

The JPEG Standard

Contains several modes:

Baseline system (what is commonly known as JPEG!): lossy

Can handle grayscale or color images (8-bit)

Extended system: lossy

Can handle higher precision (12 bit) images, providing progressive streams, etc.

Lossless version

Baseline system

Each color component is divided into 8x8 blocks

For each 8x8 block, three steps are involved:

Block DCT

Perceptual-based quantization

Variable length coding: Runlength and Huffman coding

(75)

JPEG: Encoder and Decoder

(76)


JPEG Coding

[Diagram: YCbCr input f(i,j) → 8×8 DCT → F(u,v) → quantization (quantization tables) → Fq(u,v) → zig-zag scan → DPCM (DC) / run-length coding (AC) → entropy coding (coding tables) → header, tables, data]

Steps Involved:

Discrete Cosine Transform of each 8×8 pixel array: f(x,y) → F(u,v)

Quantization using a table or a constant

Zig-zag scan to exploit redundancy

Differential Pulse Code Modulation (DPCM) on the DC component and run-length coding of the AC components

Entropy coding (Huffman) of the final output

(77)

JPEG: Basic Algorithm

(78)

JPEG: Image Partitioning

(79)

Pre-Processing

Color Space Conversion: the human visual system has less resolution in color than in intensity. As a result, the original color image is converted into intensity and color channels. One such transformation is RGB → YIQ.

Divide the image into 8×8 blocks of pixels.

Shift values from [0, 2^P - 1] to [-2^(P-1), 2^(P-1) - 1]; e.g., if P = 8, shift [0, 255] to [-128, 127]. The DCT requires the range to be centered around 0.

Values in the 8×8 pixel blocks are spatial values, and there are 64 samples in each block.

[Y]   [0.299  0.587  0.114] [R]
[I] = [0.596 -0.275 -0.321] [G]
[Q]   [0.212 -0.523  0.311] [B]

(80)

Forward DCT

Convert from the spatial to the frequency domain: convert the intensity function into a weighted sum of periodic (cosine) basis functions, and identify bands of spectral information that can be discarded without significant loss of quality.

Intensity values in each color plane often change slowly.

(81)

1D Forward DCT

Given a list of n intensity values I(x), x = 0, …, n-1, compute the n DCT coefficients:

F(u) = √(2/n) · C(u) · Σ_{x=0}^{n-1} I(x) · cos[(2x+1)uπ / 2n],  u = 0, …, n-1

where C(u) = 1/√2 for u = 0, and C(u) = 1 otherwise.

(82)

Visualization of 1D DCT Basic Functions (or Basis Images)

[Figure: the eight 1-D DCT basis functions, corresponding to F(0) through F(7)]

(83)

1D Inverse DCT

Given a list of n DCT coefficients F(u), u = 0, …, n-1, compute the n intensity values:

I(x) = √(2/n) · Σ_{u=0}^{n-1} C(u) · F(u) · cos[(2x+1)uπ / 2n],  x = 0, …, n-1

where C(u) = 1/√2 for u = 0, and C(u) = 1 otherwise.
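A direct (unoptimized) sketch of this transform pair, following the formulas above:

import numpy as np

def dct1d(I):
    n = len(I)
    x = np.arange(n)
    C = lambda u: 1/np.sqrt(2) if u == 0 else 1.0
    return np.array([np.sqrt(2/n) * C(u) * np.sum(I * np.cos((2*x + 1) * u * np.pi / (2*n)))
                     for u in range(n)])

def idct1d(F):
    n = len(F)
    u = np.arange(n)
    C = np.where(u == 0, 1/np.sqrt(2), 1.0)
    return np.array([np.sqrt(2/n) * np.sum(C * F * np.cos((2*x + 1) * u * np.pi / (2*n)))
                     for x in range(n)])

I = np.array([52., 55., 61., 66., 70., 61., 64., 73.])  # made-up sample row
assert np.allclose(idct1d(dct1d(I)), I)                 # the pair is exactly invertible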

(84)

Extend DCT from 1D to 2D

Perform the 1-D DCT on each row of the block, then again on each column of the 1-D coefficients (alternatively, transpose the matrix and perform the DCT on the rows).

(85)


2-D DCT

Images are two-dimensional; how do you perform a 2-D DCT?

Two passes of 1-D transforms produce the 2-D transform, block by block: f(i,j) → 1-D row-wise DCT → 1-D column-wise DCT → F(u,v), on 8×8 blocks.

F(0,0) is called the DC component, and the rest of the F(u,v) are called AC components.

(86)

Equations for 2D DCT

Forward DCT:

F(u,v) = (2/√(nm)) · C(u) · C(v) · Σ_{x=0}^{n-1} Σ_{y=0}^{m-1} I(x,y) · cos[(2x+1)uπ / 2n] · cos[(2y+1)vπ / 2m]

Inverse DCT:

I(x,y) = (2/√(nm)) · Σ_{u=0}^{n-1} Σ_{v=0}^{m-1} C(u) · C(v) · F(u,v) · cos[(2x+1)uπ / 2n] · cos[(2y+1)vπ / 2m]

with C(u) = 1/√2 for u = 0 and 1 otherwise (and similarly for C(v)).
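A sketch of the separable (row-wise then column-wise) evaluation, checked against SciPy's direct 2-D DCT (this assumes SciPy is available; norm='ortho' matches the normalization above):

import numpy as np
from scipy.fft import dct, dctn

block = np.random.default_rng(1).integers(-128, 128, size=(8, 8)).astype(float)
rowwise = dct(block, axis=1, norm='ortho')         # 1-D DCT of each row
F = dct(rowwise, axis=0, norm='ortho')             # then of each column
assert np.allclose(F, dctn(block, norm='ortho'))   # same as the direct 2-D transform
# F[0, 0] is the DC coefficient; all other entries are AC coefficients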

(87)

Visualization of 2D DCT Basis Functions

[Figure: the 64 2-D DCT basis functions, with frequency increasing left to right and top to bottom]

(88)

Coefficient Differentiation

F(0,0) is called the DC coefficient: it includes the lowest frequency in both directions and determines the fundamental color of the block.

F(0,1) … F(7,7) are called AC coefficients: their frequency is non-zero in one or both directions.

After the DCT, only a few coefficients have large values in each 8×8 block. These coefficients are floating-point numbers.

(89)


Quantization of DCT Coefficients

Each coefficient is converted into an integer using quantization.

The choice of quantization step:

• Large quantization step: large rounding errors, smaller quantized values.

• Small quantization step: small rounding errors, larger quantized coefficients.

• The same quantization step for all coefficients? The human visual system is more sensitive to relatively low-frequency changes in images, which implies that larger quantization steps can be used for high-frequency coefficients without causing noticeable image distortion.

(90)


Quantization of DCT Coefficients

Why? To reduce the number of bits per sample:

F'(u,v) = round(F(u,v) / q(u,v))

Example: 101101 = 45 (6 bits).
Truncate to 4 bits: 1011 = 11 (compare 11 × 4 = 44 against 45).
Truncate to 3 bits: 101 = 5 (compare 8 × 5 = 40 against 45).
Note that the more bits we truncate, the more precision we lose.

Quantization error is the main source of lossy compression.

Uniform quantization: q(u,v) is a constant.

Non-uniform quantization uses quantization tables. The eye is most sensitive to low frequencies (upper-left corner of the frequency matrix) and less sensitive to high frequencies (lower-right corner). Custom quantization tables can be put in the image/scan header. The JPEG standard defines two default quantization tables, one for luminance and one for chrominance.
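A minimal sketch of the quantize/dequantize round trip, using the graded q(u,v) table shown on the next slide; the rounding error is what makes JPEG lossy:

import numpy as np

Q = 3 + 2 * np.add.outer(np.arange(8), np.arange(8))  # the 3, 5, 7, ... table from the next slide

def quantize(F):
    return np.round(F / Q).astype(int)    # F'(u,v) = round(F(u,v) / q(u,v))

def dequantize(Fq):
    return Fq * Q                         # reconstruction; the rounding error cannot be recovered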

(91)

JPEG Quantization Matrix

The low-frequency coefficients (upper-left corner) have smaller quantization steps and are therefore encoded more accurately:

 3  5  7  9 11 13 15 17
 5  7  9 11 13 15 17 19
 7  9 11 13 15 17 19 21
 9 11 13 15 17 19 21 23
11 13 15 17 19 21 23 25
13 15 17 19 21 23 25 27
15 17 19 21 23 25 27 29
17 19 21 23 25 27 29 31

(92)

JPEG Quantization Using the Quantization Matrix

Before quantization:

 92   3  -9  -7   3  -1   0   2
-39 -58  12  17  -2   2   4   2
-84  62   1 -18   3   4  -5   5
-52 -36 -10  14 -10   4  -2   0
-86 -40  49  -7  17  -6  -2   5
-62  65 -12  -2   3  -8  -2   0
-17  14 -36  17 -11   3   3  -1
-54  32  -9  -9  22   0   1   3

After quantization:

 30   0  -1   0   0   0   0   0
 -7  -8   1   1   0   0   0   0
-12   6   0  -1   0   0   0   0
 -5  -3   0   0   0   0   0   0
 -7  -3   3   0   0   0   0   0
 -4   4   0   0   0   0   0   0
 -1   0  -1   0   0   0   0   0
 -3   1   0   0   0   0   0   0

After reconstruction:

 90   0  -7   0   0   0   0   0
-35 -56   9  11   0   0   0   0
-84  54   0 -13   0   0   0   0
-45 -33   0   0   0   0   0   0
-77 -39  45   0   0   0   0   0
-52  60   0   0   0   0   0   0
-15   0 -19   0   0   0   0   0
-51  19   0   0   0   0   0   0

(93)

Default Normalized Quantization Table in JPEG

Actual step size for C(i,j): Q(i,j) = QP*q(i,j)

(94)

Zig-Zag Ordering of DCT Coefficients

Zig-Zag ordering: converting a 2D matrix into a 1D array, so that the frequency (horizontal+vertical) increases in this order, and the coefficient variance decreases in this order.

(95)


Zig-Zag Scan

Why? To group the low-frequency coefficients at the top of the vector and the high-frequency coefficients at the bottom.

Maps the 8 × 8 matrix to a 1 × 64 vector.
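A short sketch that generates the zig-zag order by walking the anti-diagonals of the block:

import numpy as np

def zigzag_indices(n=8):
    # (row, col) pairs of an n x n block in zig-zag order
    order = []
    for s in range(2 * n - 1):            # s = row + col indexes the anti-diagonal
        diag = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        order.extend(diag if s % 2 else diag[::-1])
    return order

block = np.arange(64).reshape(8, 8)
vector = np.array([block[i, j] for i, j in zigzag_indices()])  # 8x8 -> 1x64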

(96)

Coding of Quantized DCT Coefficients

DC coefficient: predictive coding. The DC value of the current block is predicted from that of the previous block, and the error is coded using Huffman coding.

AC coefficients: run-length coding. Many high-frequency AC coefficients are zero after the first few low-frequency coefficients.

Run-length representation:

• Order the coefficients in the zig-zag order.

• Specify how many zeros precede a non-zero value.

• Each symbol = (length of zero run, non-zero value); code all possible symbols using Huffman coding.

(97)

JPEG: Differential Coding of DC

(98)


DPCM on DC Components

The DC component value in each 8×8 block is large and varies across blocks, but is often close to that of the previous block.

Differential Pulse Code Modulation (DPCM): encode the difference between the current and previous 8×8 blocks. Remember: smaller numbers → fewer bits.

[Figure: the DC values of successive 1×64 block vectors (45, 54, 48, …) are replaced by their differences (45, 9, -6, …)]
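A tiny sketch of the DC DPCM step, using the values from the figure:

dc = [45, 54, 48]                                        # DC values of successive blocks
diffs = [dc[0]] + [b - a for a, b in zip(dc, dc[1:])]    # -> [45, 9, -6]
restored = [sum(diffs[:k + 1]) for k in range(len(diffs))]
assert restored == dc                                    # DPCM itself is lossless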

(99)

Coding of DC Symbols

Example:

Current quantized DC index: 2; previous block DC index: 4; prediction error: -2.

The prediction error is coded in two parts:

• Which category it belongs to (Table of JPEG Coefficient Coding Categories), coded using a Huffman code (JPEG default DC code): DC = -2 is in category “2”, with codeword “100”.

• Which position it occupies in that category, using a fixed-length code of length = category number: “-2” is number 1 (counting from 0) in category 2, with fixed-length code “01”.

The overall codeword is “10001”.
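A small sketch of this category/position rule; the offset for negative values reproduces the “-2 → '01'” mapping above:

def category(v):
    # number of bits needed for |v|; category 0 is reserved for v == 0
    return 0 if v == 0 else abs(v).bit_length()

def amplitude_bits(v):
    # fixed-length part of the codeword; negative values are offset by 2^size - 1
    size = category(v)
    if v < 0:
        v += (1 << size) - 1
    return format(v, '0{}b'.format(size))

print(category(-2), amplitude_bits(-2))  # 2 '01' -> with Huffman prefix '100': '10001'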

(100)

JPEG Tables for coding DC

(101)

Example: Coding of AC coefficients

For symbol (0,5):

The value 5 is represented in two parts:

• Which category it belongs to (Table of JPEG Coefficient Coding Categories); the “(runlength, category)” pair is coded using a Huffman code (JPEG default AC code): AC = 5 is in category “3”, and symbol (0,3) has codeword “100”.

• Which position it occupies in that category, using a fixed-length code of length = category number: “5” is number 5 (counting from 0) in category 3, with fixed-length code “101”.

The overall codeword for (0,5) is “100101”.

Second symbol (0,9): 9 is in category 4; (0,4) has codeword “1011”; 9 is number 9 in category 4, with codeword “1001” → the overall codeword for (0,9) is “10111001”.

(102)


RLE on AC Components

The 1×64 vectors have a lot of zeros in them, more so toward the end of the vector: higher entries in the vector capture higher-frequency (DCT) components, which tend to carry less of the content, often as a result of quantization.

Encode a series of 0s as a (skip, value) pair, where skip is the number of zeros and value is the next non-zero component. Send (0,0) as the end-of-block sentinel value.

Example: … 0 0 0 0 0 1 0 0 0 0 0 0 0 2 … → (5,1), (7,2), …
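A minimal run-length encoder for one zig-zagged AC vector, reproducing the pairs in the example above (real JPEG additionally splits runs longer than 15, which this sketch omits):

def rle_ac(ac):
    # (skip, value) pairs; (0, 0) marks end of block
    pairs, run = [], 0
    for v in ac:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    pairs.append((0, 0))
    return pairs

print(rle_ac([0]*5 + [1] + [0]*7 + [2] + [0]*49))  # [(5, 1), (7, 2), (0, 0)]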

(103)


Entropy Coding: AC Components

AC components (range -1023..1023) are coded as (S1, S2) pairs:

S1: (RunLength/SIZE)

RunLength: the length of the run of consecutive zero values [0..15].

SIZE: the number of bits needed to code the next non-zero AC component's value [0..A].

(0,0) is the End_Of_Block symbol for the 8×8 block.

S1 is Huffman coded (see the AC code table below).

S2: (Value)

Value: the value of the AC component (refer to the size-and-value table).

Run/SIZE   Code Length   Code
0/0        4             1010
0/1        2             00
0/2        2             01
0/3        3             100
0/4        4             1011
0/5        5             11010
0/6        7             1111000
0/7        8             11111000
0/8        10            1111110110
0/9        16            1111111110000010
0/A        16            1111111110000011

Run/SIZE   Code Length   Code
1/1        4             1100
1/2        5             11011
1/3        7             1111001
1/4        9             111110110
1/5        11            11111110110
1/6        16            1111111110000100
1/7        16            1111111110000101
1/8        16            1111111110000110
1/9        16            1111111110000111
1/A        16            1111111110001000

… more such rows, up to 15/A

(104)

JPEG Tables for Coding AC (Run, Category) Symbols

(105)

JPEG: Example

(106)

JPEG: Example

(107)

JPEG: Example

(108)

JPEG: Example

The DC coefficient is DPCM coded (the difference between the DC coefficient of the previous block and that of the current block).

The AC coefficients are mapped to run-length pairs (run, value):

(0,5), (0,-3), (0,-1), (0,-2), (0,-3), (0,1), (0,1), (0,-1), (0,-1), (2,1), (0,2), (0,3), (0,-2), (0,1), (0,1), (6,1), (0,1), (1,1), EOB

These are then Huffman coded (the codes are specified in the JPEG scheme).

(109)

Decoding

Decoding is the inverse of the encoding process:

Recover the quantized coefficient matrix.

Recover the coefficient matrix.

IDCT.

JPEG is lossy because of the quantization process.

(110)

JPEG: Decoding

(111)

JPEG: Decoding

(112)

Evaluation

To evaluate an image compression algorithm, there are two criteria:

Subjective: the visual quality vs. bit rate.

Objective: the rate-distortion curve (peak SNR vs. bit rate).

The mean squared error between the original image I and the reconstruction I' is

σe² = (1/N²) · Σ_{i=1}^{N} Σ_{j=1}^{N} [I(i,j) - I'(i,j)]²

and the peak signal-to-noise ratio is

PSNR = 10 log₁₀(I_peak² / σe²) dB

where I_peak is the peak intensity value of the image.

[Figure: the difference image when the quality factor is 3]
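A sketch of the objective measure, using the common peak-based definition (I_peak = 255 for 8-bit images):

import numpy as np

def psnr(original, reconstructed, peak=255.0):
    # mean squared error between the original and the reconstruction
    mse = np.mean((original.astype(float) - reconstructed.astype(float)) ** 2)
    return 10 * np.log10(peak**2 / mse)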

(113)

JPEG: Original vs. Reconstructed Image

(114)

JPEG Example: Lena Image

(115)

JPEG: Example

(116)

JPEG: Pros and Cons

(117)

JPEG bit stream

(118)

JPEG Bitstream

Terminology:

Frame = image; Block = 8×8 image block; Segment = a group of blocks.

Frame header:

Sample precision

(width, height) of image

Number of components

Unique ID (for each component)

Horizontal/vertical sampling factors (for each component)

Quantization table to use (for each component)

(119)

JPEG Bitstream

Scan header:

Number of components in scan

Component ID (for each component)

Huffman table for each component

Misc. (can occur between headers):

Quantization tables

Huffman tables

Arithmetic coding tables

Comments

Application data
