
Enhancing effective depth-of-field by multi-focus image fusion using morphological techniques


Academic year: 2023


Multi-focus Image Fusion using

Morphological Techniques

A Dissertation Presented by

Ishita De Ghosh

to

Indian Statistical Institute

in Partial Fulfillment of the Requirements

for the degree of

Doctor of Philosophy


A scene to be photographed usually includes objects at varying distances from the camera. The depth-of-field of a digital camera is the range of distances within which all objects appear sharp in the image. Because of the low depth-of-field of such cameras, the images they acquire often suffer from a degradation called out-of-focus blur. One way to enhance the effective depth-of-field is to acquire several images of a scene with focus on different parts of it and then combine these images into a single image in such a way that all regions of the scene are in focus. The acquired images are called multi-focus images and the process of combination is known as multi-focus image fusion. The techniques for multi-focus image fusion belong to three broad categories: pixel-based, block-based and region-based. They concentrate respectively on single pixels, small blocks of size $m \times n$ and arbitrarily shaped regions. Image registration is a necessary prerequisite for multi-focus image fusion. The thesis presents a new technique for multi-focus image registration and three new techniques for multi-focus image fusion. Among the fusion techniques, the first is pixel-based, the second is block-based and the third is region-based. All of them use mathematical morphological tools. The pixel-based method is a multi-resolution technique that employs a morphological wavelet as a tool for signal decomposition and reconstruction. The block-based method uses the energy of morphological gradients as a focus measure. Finally, the region-based method uses multi-scale morphological tools to obtain the focused regions from the input images. In this context, existing fusion techniques are studied and categorized. The thesis includes experimental results obtained by applying the proposed methods and other well-known methods to a variety of input data-sets. It also includes a performance analysis of the various methods using standard quantitative evaluation techniques. At the end it presents concluding [...]

I would like to express my deep and sincere gratitude to my academic advisor, Professor Bhabatosh Chanda of ISI, Kolkata. Without his constant understanding, encouragement and personal guidance, this work would not have been possible.

I would like to thank all the members of the ECSU of ISI, Kolkata for their help during my stay in the ECSU laboratory.

I acknowledge the technical support and help provided by Sri Satrajit Ghosh and Dr. Bibhas C. Dhara during different phases of the work.

Special thanks to Dr. Ajoy K. Mukherjee, Principal, Barrackpore R. S. College, for his cooperation, and to all faculty members and supporting staff of the Department of Computer Science, Barrackpore Rastraguru Surendranath College, for their generosity and logistic support.

Finally, I express my gratitude to my family members, including little Iman, for their continuous support and encouragement.

(Ishita De Ghosh)

1 Introduction
1.1 Motivation
1.2 Review of previous work
1.3 Objective of the thesis
1.4 Morphologic Operators
1.4.1 Multi-scale morphologic operators
1.5 Contribution of the thesis
1.6 Experimental set-up
1.6.1 Data used in experimentation
1.7 Quantitative Performance Evaluation
1.7.1 Gradient similarity index
1.7.2 Fusion quality index
1.8 Organization of the thesis

2.1 Introduction
2.2 An iterative hybrid registration algorithm
2.2.1 Global translation
2.2.2 Local scaling
2.2.3 Iteration
2.3 Experimental results and discussion
2.3.1 Quantitative performance evaluation
2.4 Summary

3 Pixel-based fusion
3.1 Introduction
3.2 Basic theory and a new morphologic wavelet
3.2.1 Multi-resolution Analysis
3.2.2 A new morphologic Wavelet
3.3 Multi-focus image fusion
3.3.1 Algorithm
3.3.2 Illustration
3.4 Experimental results and discussion
3.5 Summary

4 Block-based fusion
4.1 Introduction
4.2 A new block-based fusion algorithm
4.2.1 Detection of focused blocks in a quad-tree structure
4.2.2 Reconstruction
4.2.3 Energy of Morphologic Gradients: a new measure of focus
4.3 Experimental results and discussion
4.3.1 Discussion
4.4 Summary

5 Region based fusion
5.1 Introduction
5.2 Fusion by multi-scale morphology
5.2.1 Multi-scale top-hat transformation
5.2.2 Detection of focused regions
5.2.3 Reconstruction
5.3 Experimental results and Discussion
5.4 Summary

6 Conclusion and future work
6.1 Future work
6.1.1 Fusion by area morphology
6.1.2 Extension to multi-modal images
6.1.3 Hardware embedding

A Depth of field
B Affine transformation

1.1 An example of multi-focus image fusion
1.2 Multi-focus image data-sets used for experimentation
2.1 Geometric Optics Model of Lens System
2.2 Schematic diagram for a hybrid and iterative registration method
2.3 Effects of iterative registration
2.4 Registration-results of 'Doll' images
2.5 Registration-results of 'Disk' images
2.6 Registration-results of 'Garden' images
2.7 Registration-results of 'Rose' images
2.8 Registration-results of 'News' images
2.9 Magnification of selected areas
3.1 Wavelet transform on a 2 × 2 block
3.2 Illustration of proposed wavelet transform on a 2 × 2 block
3.4 Results of multi-focus image fusion by wavelet transform
4.1 A generic schematic diagram for block-based fusion
4.2 Problem of multi-focus image fusion with equal-sized blocks
4.3 Subdivision of images in blocks according to a quad-tree structure
4.4 Example of recursive subdivision of a block in an image
4.5 Detection of focused blocks up to various levels in a quad-tree
4.6 Results of multi-focus image fusion by block-based methods
5.1 Detection of focused regions by multi-scale morphology
5.2 Focused regions and corresponding largest connected regions
5.3 Results of multi-focus image fusion by region-based methods
A.1 Par-axial geometric optics model of image formation

1.1 Multi-focus images and their sizes
2.1 Performance evaluation of registration by RMSE
2.2 Performance evaluation of registration by MI
2.3 Performance evaluation of registration by NCC
3.1 Performance evaluation of pixel-based methods by GSI
3.2 Performance evaluation of pixel-based methods by FQI
3.3 Time requirement in pixel-based methods
4.1 Performance evaluation of block-based methods by GSI
4.2 Performance evaluation of block-based methods by FQI
4.3 Time requirement in block-based methods
5.1 Performance evaluation of region-based methods by GSI
5.2 Performance evaluation of region-based methods by FQI
6.1 Performance summary of proposed pixel-based, block-based and region-based methods

Word(s)                                   Abbreviation
Depth-of-field                            DOF
Energy of gradients                       EOG
Energy of Laplacian                       EOL
Energy of morphologic gradients           EOMG
Focus-measure                             FM
Fusion quality index                      FQI
Gradient similarity index                 GSI
Multi-focus image fusion                  MFIF
Mutual information                        MI
Modified Laplacian                        ML
Multi-resolution                          MR
Multi-resolution decomposition            MRD
Multi-scale                               MS
Multi-scale decomposition                 MSD
Mean-square-error                         MSE
Normalized-cross-correlation              NCC
Normalized difference in focus-measure    NDFM
Quad-tree                                 QT
Root-mean-square-error                    RMSE
Structuring element                       SE
Spatial frequency                         SF
Sum modified Laplacian                    SML
Structural similarity index               SSI

Introduction

Advancements in digital imaging technology have increased the popularity of consumer imaging products such as digital cameras and camcorders. However, due to the physical limitations of the imaging systems, images produced by them often suffer from degradations. A scene to be photographed usually includes objects at varying distances from the camera. The sharpness distribution of an image of such a scene is affected by various factors. The object focused by the camera, and the objects at the same distance from the camera as the focused object, appear to be the sharpest in the image. Sharpness of the objects in front of and behind the focused distance decreases gradually in the image. This sharpness-loss is not significant within a certain range of object distances. This range is called the depth-of-field (DOF) of the camera [76]. DOF as calculated by the par-axial geometric optics model of image formation using a thin convex lens is given in Appendix A of the thesis. DOF depends on various factors such as the amount of sharpness-loss regarded as acceptable, the focal length of the lens (the longer the focal length, the shorter the DOF), the distance of the focused object (the nearer the object, the shorter the DOF) and the aperture used (decreasing the aperture will increase the DOF). The extreme case of decreasing the aperture to maximize the DOF happens in a pin-hole camera. It has an infinite DOF. Unfortunately, the optical power in the image plane is reduced considerably due to the very small aperture. So cameras with finite DOF are preferred. A finite but large DOF means that objects within a large range (and hence possibly a large number of objects) will appear to be sharp in the photograph. On the other hand, a small DOF means that objects within a small range (and hence possibly a small number of objects) will appear to be sharp in the photograph and all other objects will appear to be out-of-focus. Out-of-focus blur is one of the typical degradations which occur in images acquired by digital cameras due to their low DOF [60, 76]. The problem of low DOF is also encountered in microscopy due to increment in magnification and aperture [61, 6, 36].
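
The qualitative dependence of DOF on focal length, object distance and aperture can be checked numerically. The following is a minimal sketch that assumes the common textbook thin-lens approximations (hyperfocal distance, near and far limits of acceptable sharpness); these formulas, the function name `depth_of_field` and the default circle-of-confusion value are illustrative assumptions and are not taken from Appendix A of the thesis.

```python
def depth_of_field(focal_length_mm, f_number, subject_distance_mm, coc_mm=0.03):
    """Return (near_limit, far_limit, dof) in millimetres for a thin lens.

    Uses standard textbook approximations. coc_mm is the acceptable circle
    of confusion, i.e. the amount of sharpness-loss regarded as acceptable;
    its default value is an assumption chosen only for illustration.
    """
    f, n, s, c = focal_length_mm, f_number, subject_distance_mm, coc_mm
    hyperfocal = f * f / (n * c) + f
    near = s * (hyperfocal - f) / (hyperfocal + s - 2 * f)
    if s >= hyperfocal:
        # Everything beyond the near limit is acceptably sharp.
        return near, float("inf"), float("inf")
    far = s * (hyperfocal - f) / (hyperfocal - s)
    return near, far, far - near


if __name__ == "__main__":
    # A larger f-number means a smaller aperture opening.
    for f_len, f_num, dist in [(50, 2.8, 3000), (100, 2.8, 3000),
                               (50, 2.8, 1500), (50, 8, 3000)]:
        near, far, dof = depth_of_field(f_len, f_num, dist)
        print(f_len, f_num, dist, round(near), round(far), round(dof))
```

Under these assumptions, doubling the focal length or halving the object distance shrinks the DOF, while stopping the aperture down enlarges it, in line with the trends stated above.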

1.1 Motivation

One way to enhance the effective DOF is to acquire several images of a scene focused on objects at different distances and then integrate these images into a single image in such a way that all regions of the scene are in focus. The acquired images are called multi-focus images and the process of combination is known as multi-focus image fusion (MFIF). The process produces an image whose total area-in-focus is more than that of any of the constituent images. Multi-focus images of a scene are acquired one by one, either by hand-held cameras or by cameras placed on tripods, in identical environmental conditions with respect to sensor, light, view-direction, orientation and object-contents in the scene. They can be either grey-level or color images. Since each image in a set of multi-focus images has focus on objects at different distances in the scene, an object which is in-focus in the near-focused image may be out-of-focus in other images. Similarly an object which is out-of-focus in the near-focused image may be in-focus in the far-focused image. Hence partial defocusing/blurring is inevitable in this type of images. MFIF produces an image in which blurred regions are deblurred and every area is in focus. The fused image should be better for human viewing as well as for subsequent processing and analysis like segmentation, feature extraction, object recognition etc. Figure 1.1 shows an example of multi-focus images and the corresponding fused image.

Figure 1.1: An example of multi-focus image fusion. (a) Near focused image, (b) far focused image, (c) fused image.

The techniques for MFIF belong to three broad categories: pixel-based, block-based and region-based. They concentrate respectively on single pixels, small blocks of size $m \times n$ and arbitrarily shaped regions. It is interesting to study and compare MFIF techniques within a particular category and the ones belonging to different categories. The ultimate goal of MFIF is to obtain all objects in the final image in focused and identifiable form. Mathematical morphologic operators have the capability of handling objects of different shapes and sizes. In this thesis, we explore mathematical morphology as a tool for MFIF and propose new techniques for the same employing this tool. We provide a comparison of results obtained by various techniques and outline some related future work. Image registration is a necessary prerequisite for MFIF because before fusion the constituent images must be positioned properly with respect to a common coordinate system so that corresponding objects are overlaid properly [41]. We also propose a new technique for multi-focus image registration. In this chapter, a brief review of previous work on MFIF is given in Section 1.2, the objective of the thesis in Section 1.3, a brief account of mathematical morphologic operators in Section 1.4, the contribution of the thesis in Section 1.5, the experimental set-up along with the data used for experimentation in Section 1.6, the evaluation techniques used in Section 1.7, and finally the organization of the thesis in Section 1.8.

1.2 Review of previous work

The fundamental concept behind MFIF is to select the sharply focused regions from the input images to form an image in which all objects are in focus. The basic steps are: divide each input image into overlapping or non-overlapping regions, measure the sharpness of focus for all regions, and finally select the best-focused region among all corresponding regions to form the fused image. When the regions of interest shrink to single pixels, the approach is called pixel-based; when they are small blocks of size $m \times n$, the approach is called block-based; otherwise it is called region-based. Another categorization is based on whether the technique works in the spatial domain or the frequency domain. In spatial domain techniques, input images are fused in the spatial domain using physically relevant spatial features. In frequency domain techniques, multi-scale decomposition (MSD) or multi-resolution decomposition (MRD) by pyramid or wavelet transform is required. An early categorization of frequency domain MRD fusion schemes was given by Zhang and Blum [89]. Piella [65] provided a general framework for these schemes and also proposed a new method for the same. Pajares and Cruz [59] presented a comprehensive tutorial on wavelet-based fusion methods. Goshtasby and Nikolov [30] presented an overview of various fusion techniques.

The basic idea of MRD-based fusion schemes is the following. At first each source image is transformed/decomposed up to a level by an MRD scheme. The decomposition gives the scaled image as low frequency coefficients and the detail images as high frequency coefficients. Saliencies of the coefficients are measured by their activity-levels. A selection or decision map is created from the activity-levels of the coefficients from all transformed images. The map is used as a guide to construct the composite representation of the transformed images. Finally the fused image is obtained by applying the inverse transform to the composite representation. An MRD fusion scheme is categorized depending on how the activity-levels of MRD-coefficients are measured. If the activity-level is measured for the coefficients related to individual pixels, the method is called pixel-based; if it is measured depending on all coefficients in a block containing the concerned coefficient, the method is called block-based; and finally, if it is measured depending on all coefficients in a region containing the concerned coefficient, the method is called region-based. Images fused by frequency-domain MRD schemes may lose some information of the source images because of implementation of the inverse multi-resolution transform.
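
The generic MRD pipeline (decompose, measure activity, build a decision map, invert) can be illustrated with a deliberately small example. The sketch below is an assumption-laden toy, not any of the cited schemes: it uses a one-level 2 × 2 Haar-style transform, takes the absolute value of each detail coefficient as its pixel-based activity-level, applies a choose-max rule to the details and averages the approximations. All function names are hypothetical, and the images are assumed to be lists of rows with even dimensions.

```python
def haar_decompose(img):
    """One-level 2x2 Haar-style decomposition: approximation + 3 details."""
    approx, details = [], []
    for r in range(0, len(img), 2):
        a_row, d_row = [], []
        for c in range(0, len(img[0]), 2):
            a, b = img[r][c], img[r][c + 1]
            p, q = img[r + 1][c], img[r + 1][c + 1]
            a_row.append((a + b + p + q) / 4.0)          # scaled image
            d_row.append(((a - b + p - q) / 4.0,          # horizontal detail
                          (a + b - p - q) / 4.0,          # vertical detail
                          (a - b - p + q) / 4.0))         # diagonal detail
        approx.append(a_row)
        details.append(d_row)
    return approx, details


def haar_reconstruct(approx, details):
    """Exact inverse of haar_decompose."""
    img = [[0.0] * (2 * len(approx[0])) for _ in range(2 * len(approx))]
    for i, a_row in enumerate(approx):
        for j, s in enumerate(a_row):
            h, v, d = details[i][j]
            img[2 * i][2 * j] = s + h + v + d
            img[2 * i][2 * j + 1] = s - h + v - d
            img[2 * i + 1][2 * j] = s + h - v - d
            img[2 * i + 1][2 * j + 1] = s - h - v + d
    return img


def fuse_mrd(img_a, img_b):
    """Fuse two registered images: average approximations, choose-max details."""
    approx_a, det_a = haar_decompose(img_a)
    approx_b, det_b = haar_decompose(img_b)
    fused_approx, fused_det = [], []
    for i in range(len(approx_a)):
        fa, fd = [], []
        for j in range(len(approx_a[0])):
            fa.append((approx_a[i][j] + approx_b[i][j]) / 2.0)
            # Decision rule: keep the coefficient with the larger activity.
            fd.append(tuple(x if abs(x) >= abs(y) else y
                            for x, y in zip(det_a[i][j], det_b[i][j])))
        fused_approx.append(fa)
        fused_det.append(fd)
    return haar_reconstruct(fused_approx, fused_det)
```

For instance, fusing a block containing a sharp vertical edge, `[[0, 8], [0, 8]]`, with its flat (blurred) counterpart `[[4, 4], [4, 4]]` returns the sharp block, because only the sharp image contributes non-zero detail coefficients.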

The idea of using MRD schemes for image fusion was first proposed by Burt [8] as a model for binocular fusion for human stereo vision. He used the Laplacian pyramid for MRD and chose the max rule for coefficient selection. Burt and Adelson [10] later introduced a new approach to image fusion based on hierarchical image decomposition. Adelson [2] then used the Laplacian pyramid technique for MFIF. Toet [78] proposed the use of the ratio of low-pass pyramids at successive levels of Gaussian pyramids for fusion of visible and IR images. Burt [9] proposed that fusion within a gradient pyramid provides improved stability and noise immunity. Akerman [3] optimized the Laplacian pyramid fusion with respect to multi-sensor fusion. Burt and Kolczynski [11] presented gradient pyramid fusion with a local match measure and a window-based saliency measure. Li et al. [45] used a similar method except that a wavelet transform is used instead of a pyramid transform and consistency verification is done along with a window-based activity measure. Wavelet-based fusion techniques were proposed later by many others including Chipman et al. [16], Petrovic and Xydeas [63], Scheunders [72], Hill et al. [37], Hamza et al. [34], De and Chanda [19], Qu and Yan [68] and Lewis et al. [44]. Frequency domain techniques in various categories will be discussed in detail in related chapters.

Since multi-focus images of a scene are acquired with focus on complementary regions, focused regions in an image have more contrast than their defocused counterparts in other images. A focus-measure (FM) is a quantity for evaluating the contrast or sharpness of a pixel, block or region [39, 50]. Image variance, image gradients, image Laplacians, energy of image gradients (EOG) and energy of image Laplacian (EOL) are traditional FMs employed and validated for applications like autofocusing [76]. [...] Laplacian [58]. Spatial frequency (SF) [25] and Tenengrad [36] were later introduced as focus measures. In spatial domain MFIF techniques, input images are fused in the spatial domain using a focus-measure as a physically relevant spatial feature in a localized area. Since these techniques emphasize a specific or desired image area, very little or no change occurs in other areas. Pixel-level weighted averaging is a spatial domain technique in which fusion is done by taking the weighted average of the pixel intensities of the input images. Weights are determined by tools like principal component analysis [71] or adaptive methods [42]. Other spatial domain pixel-level image fusion approaches include fusion using a controllable camera [73], probabilistic methods [5], and the image gradient method with majority filtering [23].
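
To make the traditional focus measures concrete, the sketch below computes variance, EOG, EOL and SF for a block given as a list of rows. The definitions follow commonly used forms (forward differences for EOG, a 4-neighbour Laplacian for EOL, row and column frequencies for SF); these particular discretizations are assumptions for illustration and may differ in detail from the ones used in the cited works.

```python
import math


def variance(block):
    """Grey-level variance of the block."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels) / len(pixels)


def energy_of_gradient(block):
    """EOG: sum of squared forward differences in both directions."""
    total = 0.0
    for r in range(len(block) - 1):
        for c in range(len(block[0]) - 1):
            gx = block[r][c + 1] - block[r][c]
            gy = block[r + 1][c] - block[r][c]
            total += gx * gx + gy * gy
    return total


def energy_of_laplacian(block):
    """EOL: sum of squared 4-neighbour Laplacian responses (interior pixels)."""
    total = 0.0
    for r in range(1, len(block) - 1):
        for c in range(1, len(block[0]) - 1):
            lap = (block[r - 1][c] + block[r + 1][c] + block[r][c - 1]
                   + block[r][c + 1] - 4 * block[r][c])
            total += lap * lap
    return total


def spatial_frequency(block):
    """SF: square root of (row frequency^2 + column frequency^2)."""
    rows, cols = len(block), len(block[0])
    rf = sum((block[r][c] - block[r][c - 1]) ** 2
             for r in range(rows) for c in range(1, cols))
    cf = sum((block[r][c] - block[r - 1][c]) ** 2
             for r in range(1, rows) for c in range(cols))
    return math.sqrt((rf + cf) / (rows * cols))


if __name__ == "__main__":
    sharp = [[0, 0, 10, 10] for _ in range(4)]    # abrupt edge
    blurred = [[2, 4, 6, 8] for _ in range(4)]    # same edge, smeared out
    for fm in (variance, energy_of_gradient, energy_of_laplacian,
               spatial_frequency):
        print(fm.__name__, fm(sharp), fm(blurred))
```

On the step edge and its smeared version used in the usage example, every measure is larger for the sharp block, which is the property that block selection relies on.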

The basic idea in spatial domain block-based fusion methods is to divide the input images into a number of blocks, then measure focus on corresponding blocks and finally select and combine the focused blocks to create the fused image [39]. Often consistency verification is done before creating the final fused image. Spatial domain block-based fusion methods are proposed in [47, 48, 55, 29, 27, 87, 21]. Li et al. [47] used spatial frequency (SF) as the focus measure. In a subsequent work [48] they used a neural network (NN) to select better focused blocks using three features: SF, visibility and an edge feature. Miao and Wang [55] used energy of image gradients (EOG) to measure focus in image blocks in an MFIF algorithm based on Pulse Coupled Neural Networks. In the method of Goshtasby [29], focus is measured by the sum of the gradient values of all pixels in the block. In the method of Fedorov et al. [27] each image is tiled with overlapping neighborhoods. For each region the tile that corresponds to the best focus (which is measured by ML) is chosen. Zhang and Ge [87] proposed a technique in which focused blocks are detected by measuring their blurriness. De and Chanda [21] introduced a new focus measure called energy of morphologic gradients (EOMG) and used it for image fusion in a block-based MFIF algorithm.
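
The divide, measure and select steps of a block-based scheme can be sketched in a few lines. The example below is a generic illustration, not a reimplementation of any cited method: it assumes two registered grey-level images of equal size whose dimensions are multiples of the block size, uses energy of gradients as the focus measure, and omits consistency verification. The names `block_focus` and `fuse_blocks` are hypothetical.

```python
def block_focus(img, top, left, size):
    """Energy of gradients inside one size x size block."""
    total = 0
    for r in range(top, top + size):
        for c in range(left, left + size):
            if c + 1 < left + size:
                total += (img[r][c + 1] - img[r][c]) ** 2
            if r + 1 < top + size:
                total += (img[r + 1][c] - img[r][c]) ** 2
    return total


def fuse_blocks(img_a, img_b, size):
    """Copy, block by block, whichever input has the higher focus measure."""
    fused = [row[:] for row in img_a]
    for top in range(0, len(img_a), size):
        for left in range(0, len(img_a[0]), size):
            if (block_focus(img_b, top, left, size)
                    > block_focus(img_a, top, left, size)):
                for r in range(top, top + size):
                    fused[r][left:left + size] = img_b[r][left:left + size]
    return fused


if __name__ == "__main__":
    # img_a is sharp on the left half, img_b on the right half.
    img_a = [[0, 9, 4, 4], [9, 0, 4, 4], [0, 9, 4, 4], [9, 0, 4, 4]]
    img_b = [[4, 4, 0, 9], [4, 4, 9, 0], [4, 4, 0, 9], [4, 4, 9, 0]]
    for row in fuse_blocks(img_a, img_b, 2):
        print(row)
```

With fixed, equal-sized blocks a block may straddle the boundary between focused and defocused areas; the fixed block size here is only the simplest possible choice.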

In region-based fusion techniques, among corresponding regions better focused ones [...] concern in these methods. Spatial-domain region-based fusion methods are proposed in [53, 57, 22, 49]. Methods described in [53, 57, 22] use multi-scale morphology. Matsopoulos et al. [53] used multi-scale morphologic pyramids. Mukhopadhyay and Chanda [57] used morphologic towers instead of morphologic pyramids. De et al. [22] proposed multi-focus image fusion techniques using multi-scale top-hat transformation. Li and Yang's technique [49] is a spatial-domain, region-based technique which does not depend on MRD. In this technique, input images are segmented according to the segmentation results of a temporary fused image and better focused regions are selected and stitched to their desired positions to get the final fused image. Spatial domain techniques in various categories will be discussed in detail in related chapters.

The objective of the thesis is given now.

1.3 Objective of the thesis

A number of researchers have suggested methods for MFIF as a solution to the problem of low depth-of-field. As discussed before, the techniques belong to the broad categories pixel-based, block-based and region-based. It is interesting to study and compare MFIF techniques within a particular category and the techniques belonging to different categories. A good algorithm for MFIF should possess some important properties. It should be independent of image content and robust against probable misalignments of input images. It should not produce any unwanted visual effect or artifact. The quality of the fused image should satisfy the requirements of the intended application, and finally the computational complexity should also be affordable. In general, pixel-based techniques are intuitively straightforward, easy to implement and computationally efficient. But they are sensitive to mis-registration of input images. Block-based and region-based techniques are more robust with respect to registration problems though they are more complex in general. Despite the increase in complexity, region-based methods have a number of advantages over pixel-based methods. [...] to attenuate or accentuate certain properties to the regions [30].

Multi-focus images may contain objects of different shapes and sizes. The ultimate goal of MFIF is to obtain all objects in the final image in focused/deblurred form. Mathematical morphology is a subject which treats an image as an ensemble of sets. Morphologic operators have the capability of handling objects of different shapes and sizes. They have some interesting computational advantages as well. In this thesis we explore morphologic techniques as a tool for MFIF. Algorithms for MFIF proposed in the thesis employ various combinations of morphologic operations.

Given this, the objective of the thesis is to propose and analyze grey-level MFIF schemes employing morphologic operators and having the following desirable properties:

• ability to work on a variety of input images,

• robustness against probable mis-registration of input images,

• extensibility to fuse multi-focus color images,

• low computational cost,

• adaptability to hardware implementation.

Since all algorithms for MFIF proposed in this thesis use morphologic operators, a brief introduction to them is given now.

1.4 Morphologic Operators

Mathematical morphology treats an image as a set of pixels [74, 75]. Morphologic operators work with two sets, the original image to be analyzed and a structuring element (SE) [...] to the operation. The fundamental morphologic operations are morphologic dilation and morphologic erosion. At first we present these two operators for binary images. A two-dimensional binary image signal is a function/mapping from a domain $D$ (which is a subset of the discrete two-dimensional Euclidean space $\mathbb{Z}^2$) to the binary set $\{0, 1\}$. Suppose $A$ is the set of points representing the binary-1 pixels of the original binary image and $B$ is the set of points representing the binary-1 pixels of the SE. Then the dilation and erosion of $A$ by $B$ are denoted by $A \oplus B$ and $A \ominus B$ respectively and are defined as

\[ A \oplus B = \{\, b + a \mid b \in B \text{ and } a \in A \,\} \tag{1.1} \]

\[ A \ominus B = \{\, p \mid b + p \in A \text{ for every } b \in B \,\} \tag{1.2} \]

where '+' denotes vector addition of point coordinates. Practically, the dilation of $A$ by $B$ is the locus of the origin of $B$ such that $B$ hits $A$. Similarly, the erosion of $A$ by $B$ is the locus of the origin of $B$ such that $B$ fits in $A$.
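
The set definitions (1.1) and (1.2) translate almost literally into code. The sketch below is an illustration under the assumption that '+' is read as coordinate-wise addition of points; binary images and SEs are represented as Python sets of `(row, column)` tuples, and the function names are hypothetical.

```python
def dilate(A, B):
    """Binary dilation, Eq. (1.1): all sums b + a with b in B and a in A."""
    return {(a[0] + b[0], a[1] + b[1]) for a in A for b in B}


def erode(A, B):
    """Binary erosion, Eq. (1.2): points p with b + p in A for every b in B.

    B is assumed to be non-empty. Any valid p must satisfy p + b0 in A for
    a fixed b0 in B, so only the points a - b0 need to be examined.
    """
    b0 = next(iter(B))
    candidates = {(a[0] - b0[0], a[1] - b0[1]) for a in A}
    return {p for p in candidates
            if all((p[0] + b[0], p[1] + b[1]) in A for b in B)}


if __name__ == "__main__":
    square = {(r, c) for r in range(3) for c in range(3)}   # 3 x 3 object
    pair = {(0, 0), (0, 1)}                                 # horizontal SE
    print(sorted(dilate(square, pair)))   # object grows to 3 x 4
    print(sorted(erode(square, pair)))    # object shrinks to 3 x 2
```

Representing images as point sets keeps the code close to the notation; a practical implementation would normally work on pixel arrays instead.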

We now consider the case of grey-scale images. A two-dimensional grey-scale image signal $X$ is a function/mapping from a domain $D$ (which is a subset of the discrete two-dimensional space $\mathbb{Z}^2$) to the set of grey intensity values $\{g_1, g_2, \ldots, g_n\}$ where each $g_i$ is a nonnegative integer. A grey-scale SE $h$ is a mapping from its domain to the above set of grey values. In this thesis, we use flat SEs, that is, SEs for which the value of $h$ is always zero. Let $(r, c)$ be a point in domain $D$, where $r$ and $c$ denote the row and column coordinates respectively. The dilation and erosion of $X(r, c)$ by $h(r, c)$ are denoted by $(X \oplus h)(r, c)$ and $(X \ominus h)(r, c)$ respectively and are defined as

\[ (X \oplus h)(r, c) = \max_{(i, j) \in \text{Domain of } h} \bigl( X(r - i, c - j) + h(i, j) \bigr) \tag{1.3} \]

\[ (X \ominus h)(r, c) = \min_{(i, j) \in \text{Domain of } h} \bigl( X(r + i, c + j) - h(i, j) \bigr) \tag{1.4} \]

where the maximum and minimum are taken over all $(i, j)$ in the domain of $h$ such that $(r - i, c - j)$ and $(r + i, c + j)$ are in the domain of $X$. So dilation simply replaces the value at each point of $X$ by the maximum value in the neighborhood defined by the SE when the origin of the SE is placed at the point. Similarly erosion replaces the value at each point of $X$ by the minimum value in the neighborhood defined by the SE when the origin of the SE is placed at the point. Other morphologic operators are constructed by combining dilation and erosion. For example, the opening and closing of $X(r, c)$ by $h(r, c)$ are denoted by $(X \circ h)(r, c)$ and $(X \bullet h)(r, c)$ respectively and are defined as

\[ (X \circ h)(r, c) = ((X \ominus h) \oplus h)(r, c) \tag{1.5} \]

\[ (X \bullet h)(r, c) = ((X \oplus h) \ominus h)(r, c) \tag{1.6} \]

Both opening and closing are increasing operations, implying that the opening (closing) of an image contains the openings (closings) of all its sub-images. Both opening and closing are idempotent operations, implying that successive applications of openings (closings) do not further modify the image. Finally, opening is an anti-extensive operation and closing is an extensive operation. In a grey-scale image $X$, an opening removes all foreground structures in the image that are not large enough to contain the SE. Similarly, a closing removes all background structures in the image that are not large enough to contain the SE. Here a foreground structure means an image region of intensity value higher than the surrounding region.
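
For a flat SE the grey-scale definitions reduce to neighbourhood maxima and minima, which the following sketch implements directly. It assumes images stored as lists of rows and an SE given as a list of `(i, j)` offsets that contains the origin; out-of-domain neighbours are simply skipped, as in the definitions above. The function names and the `SQUARE_3X3` constant are illustrative choices.

```python
SQUARE_3X3 = [(i, j) for i in (-1, 0, 1) for j in (-1, 0, 1)]


def dilate(img, se):
    """Flat grey-scale dilation, Eq. (1.3) with h = 0: neighbourhood maximum."""
    rows, cols = len(img), len(img[0])
    return [[max(img[r - i][c - j] for i, j in se
                 if 0 <= r - i < rows and 0 <= c - j < cols)
             for c in range(cols)] for r in range(rows)]


def erode(img, se):
    """Flat grey-scale erosion, Eq. (1.4) with h = 0: neighbourhood minimum."""
    rows, cols = len(img), len(img[0])
    return [[min(img[r + i][c + j] for i, j in se
                 if 0 <= r + i < rows and 0 <= c + j < cols)
             for c in range(cols)] for r in range(rows)]


def opening(img, se):
    """Eq. (1.5): erosion followed by dilation."""
    return dilate(erode(img, se), se)


def closing(img, se):
    """Eq. (1.6): dilation followed by erosion."""
    return erode(dilate(img, se), se)


if __name__ == "__main__":
    spike = [[1] * 5 for _ in range(5)]
    spike[2][2] = 9                      # bright structure smaller than the SE
    for row in opening(spike, SQUARE_3X3):
        print(row)                       # the spike is removed
    for row in closing(spike, SQUARE_3X3):
        print(row)                       # the spike survives
```

The usage example shows the behaviour described above: the opening removes a bright structure that cannot contain the SE, whereas the closing leaves it untouched.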

1.4.1 Multi-scale morphologic operators

Extraction of features by mathematical morphology depends on effective use of SEs. Sizes and shapes of SEs play crucial roles here. A morphologic operator with a scalable SE can extract features of various shapes and sizes. A scheme of morphologic operations with a scalable SE is termed multi-scale morphology [15, 52]. For a scalable SE $h$, the size of its domain changes. Let $B$ be a set representing the domain of $h$. Assume that $B$ has a definite shape. Let $n$ be an integer representing the scale-factor of $B$ and let $nB$ denote the scaled version of $B$ at scale $n$. If $B$ is convex, then $nB$ is obtained by $n - 1$ dilations of $B$ by itself:

\[ nB = \underbrace{B \oplus B \oplus \cdots \oplus B}_{n - 1 \text{ dilations}} \tag{1.7} \]

When $n = 0$, conventionally $B$ is taken to be a disk of unit size so that $nB = \{(0, 0)\}$. [...] a morphologic operation by SE $h$ reduces to an operation by its domain $nB$. Then the multi-scale opening and closing of $X$ by the scalable domain $nB$ are defined respectively as

\[ (X \circ nB)(r, c) = ((X \ominus nB) \oplus nB)(r, c) \tag{1.8} \]

\[ (X \bullet nB)(r, c) = ((X \oplus nB) \ominus nB)(r, c) \tag{1.9} \]

The opening removes all bright/foreground structures in the image $X$ that are not large enough to contain $nB$. Here a foreground structure means an image region of intensity value higher than the surrounding region. Similarly, the closing removes all dark/background structures in the image $X$ that are not large enough to contain $nB$. These operators are used effectively to detect focused regions, which in general have more contrast than corresponding defocused regions.
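
The construction of $nB$ and the resulting scale-dependent filtering can be illustrated as follows. This is a minimal sketch under stated assumptions: $B$ is a convex set of `(i, j)` offsets containing the origin (a 3 × 3 square in the usage example), images are lists of rows, and out-of-domain neighbours are skipped. The function names are hypothetical.

```python
def scale_se(B, n):
    """Return nB: {(0, 0)} for n = 0, else B dilated by itself n - 1 times."""
    if n == 0:
        return {(0, 0)}
    nB = set(B)
    for _ in range(n - 1):
        nB = {(p[0] + b[0], p[1] + b[1]) for p in nB for b in B}
    return nB


def dilate(img, se):
    """Flat grey-scale dilation: maximum over the reflected neighbourhood."""
    rows, cols = len(img), len(img[0])
    return [[max(img[r - i][c - j] for i, j in se
                 if 0 <= r - i < rows and 0 <= c - j < cols)
             for c in range(cols)] for r in range(rows)]


def erode(img, se):
    """Flat grey-scale erosion: minimum over the neighbourhood."""
    rows, cols = len(img), len(img[0])
    return [[min(img[r + i][c + j] for i, j in se
                 if 0 <= r + i < rows and 0 <= c + j < cols)
             for c in range(cols)] for r in range(rows)]


def multiscale_open(img, B, n):
    """Eq. (1.8): opening of img by the scaled domain nB."""
    nB = scale_se(B, n)
    return dilate(erode(img, nB), nB)


def multiscale_close(img, B, n):
    """Eq. (1.9): closing of img by the scaled domain nB."""
    nB = scale_se(B, n)
    return erode(dilate(img, nB), nB)


if __name__ == "__main__":
    B = {(i, j) for i in (-1, 0, 1) for j in (-1, 0, 1)}
    img = [[0] * 9 for _ in range(9)]
    for r in range(1, 4):
        for c in range(1, 4):
            img[r][c] = 5                # 3 x 3 bright structure
    img[6][6] = 5                        # 1 x 1 bright structure
    for n in (0, 1, 2):
        opened = multiscale_open(img, B, n)
        print(n, sum(map(sum, opened)))  # remaining bright mass at scale n
```

In the usage example the single bright pixel disappears at scale 1, and the 3 × 3 structure disappears at scale 2, so the scale at which a structure vanishes reflects its size.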

Given the background and the objective of the thesis and a short introduction to morphologic operators, the contribution of the thesis is presented now.

1.5 Contribution of the thesis

It has already been discussed that the objective of the thesis is to propose and analyze grey-level MFIF schemes having certain desirable properties. Mathematical morphology is explored as a tool for MFIF and new techniques are presented employing this tool. In addition to a brief review of previous work, the objective of the thesis and a short introduction to morphologic operators, the current chapter, viz. Chapter 1, includes the data-set used for experimentation and the quantitative measures used for performance evaluation.

Since registration is a necessary prerequisite for MFIF, a new algorithm for multi-focus image registration is presented in Chapter 2. It is an iterative algorithm for registration of multi-focus images by combining global and local transformation models. [...] the mutual information of the source and the reference images and then it is applied on the source image. In the second step, a block-wise local scaling is applied on the translated source image. The scale-factors are determined by maximizing a similarity measure of two corresponding blocks of the translated source image and the reference image. The global and local transformations constitute a hybrid technique which is iterated to obtain the optimal result. The proposed method is automatic, easy to implement and gives good results. Results obtained by applying the method on different sets of multi-focus images are provided. The performance of the system is evaluated and compared with a widely used method.
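
The idea of choosing a translation by maximizing mutual information can be sketched as below. This is only an illustration of the principle, not the algorithm of Chapter 2: it assumes integer grey levels, an exhaustive search over integer shifts within a hypothetical `max_shift`, and mutual information estimated from the joint histogram of the overlapping pixels. The local scaling step and the iteration are not shown.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in nats) of two equally long grey-level sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    # Sum of p(x, y) * log(p(x, y) / (p(x) * p(y))) over observed pairs.
    return sum((count / n) * math.log(count * n / (px[x] * py[y]))
               for (x, y), count in joint.items())


def best_translation(source, reference, max_shift):
    """Return the integer shift (dr, dc) of source that maximizes MI."""
    rows, cols = len(reference), len(reference[0])
    best_score, best_shift = float("-inf"), (0, 0)
    for dr in range(-max_shift, max_shift + 1):
        for dc in range(-max_shift, max_shift + 1):
            xs, ys = [], []
            for r in range(rows):
                for c in range(cols):
                    sr, sc = r - dr, c - dc   # source pixel landing on (r, c)
                    if 0 <= sr < rows and 0 <= sc < cols:
                        xs.append(source[sr][sc])
                        ys.append(reference[r][c])
            score = mutual_information(xs, ys)
            if score > best_score:
                best_score, best_shift = score, (dr, dc)
    return best_shift
```

Because the estimate is computed on the overlapping area only, images with very few pixels or many distinct grey levels can produce misleading scores; the sketch is intended for images with a modest number of grey levels relative to their size.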

Chapter 3 presents a pixel-based algorithm for multi-focus image fusion using morphologic wavelets. A nonlinear morphologic wavelet transform which preserves the range in the scaled images and involves integer arithmetic only is introduced at first. This transform is employed in a fusion algorithm to fuse a set of grey-scale multi-focus images. The method is computationally efficient and produces good results. Integrated-chip implementations of image processing algorithms are going to become more common in the near future. Our method will be useful in this respect. The problem with this algorithm is that, being a pixel-based method, it is not robust to mis-registration.

Chapter 4 presents a block-based algorithm for multi-focus image fusion using a morphology-based focus measure in a quad-tree structure. A focus-measure is a quantity for evaluating the contrast or sharpness of a pixel, block or region. A new focus-measure called energy of morphologic gradients (EOMG) is introduced. It is used in a novel algorithm for MFIF which employs a quad-tree structure for optimal subdivision of input images while selecting the sharply focused blocks. Though the algorithm starts with blocks, it ultimately identifies sharply focused regions in input images. The focus measure EOMG is comparable with other focus measures, viz. energy of gradients (EOG) and variance. The algorithm is robust in the sense that it works with any focus measure. It is also robust against pixel mis-registration. But as [...] may appear in the boundaries of arbitrary-shaped regions.

Chapter 5 presents a region-based algorithm for multi-focus image fusion using multi-scale morphology. Since multi-focus images of a scene are acquired with focus on complementary regions, focused regions in an image have more contrast than their defocused counterparts in other images. This implies that the focused regions contain a larger number of physically relevant features than the corresponding defocused regions. Focused regions are detected by extracting the bright and dark features at various scales by multi-scale top-hat transformation. Since the best-focused regions are detected and copied from one image only, a slight error in registration will have no effect on fusion except at the borders of the focused regions. Hence this region-based method is robust to mis-registration. This method resembles the manual cut-and-paste method of image fusion which is often used for comparison purposes. Thus the fused image obtained by the method is very similar to the ideal fused image. Performance analysis reveals that our method is superior to fusion by a state-of-the-art method.
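
The extraction of bright and dark features at a given scale can be sketched with the standard top-hat transformations: the image minus its opening (bright features) and its closing minus the image (dark features). The example below is an illustration of these standard definitions only and does not reproduce the region detection of Chapter 5; it assumes a square flat SE of half-width `n`, which corresponds to $nB$ for a 3 × 3 square $B$, and the function names are hypothetical.

```python
def _window(img, r, c, n):
    """Pixels of the (2n+1) x (2n+1) neighbourhood of (r, c) inside the image."""
    rows, cols = len(img), len(img[0])
    return [img[i][j]
            for i in range(max(0, r - n), min(rows, r + n + 1))
            for j in range(max(0, c - n), min(cols, c + n + 1))]


def erode(img, n):
    return [[min(_window(img, r, c, n)) for c in range(len(img[0]))]
            for r in range(len(img))]


def dilate(img, n):
    return [[max(_window(img, r, c, n)) for c in range(len(img[0]))]
            for r in range(len(img))]


def bright_top_hat(img, n):
    """Image minus its opening: bright features too small to contain nB."""
    opened = dilate(erode(img, n), n)
    return [[x - o for x, o in zip(x_row, o_row)]
            for x_row, o_row in zip(img, opened)]


def dark_top_hat(img, n):
    """Closing minus image: dark features too small to contain nB."""
    closed = erode(dilate(img, n), n)
    return [[k - x for x, k in zip(x_row, k_row)]
            for x_row, k_row in zip(img, closed)]


if __name__ == "__main__":
    sharp = [[0] * 9 for _ in range(9)]
    sharp[4][4] = 9                              # concentrated bright detail
    blurred = [[0] * 9 for _ in range(9)]
    for r in range(3, 6):
        for c in range(3, 6):
            blurred[r][c] = 1                    # same energy, spread out
    for n in (1, 2):
        print(n, sum(map(sum, bright_top_hat(sharp, n))),
              sum(map(sum, bright_top_hat(blurred, n))))
```

In the usage example the concentrated detail already responds at scale 1, whereas its spread-out counterpart responds only at scale 2, which illustrates why responses across scales carry information about sharpness.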

Chapter 6 presents the conclusion of the thesis including a comparative study of techniques presented in previous chapters. It also presents a discussion on related future work.

In brief, in this thesis

Chapter 1 presents a brief review of existing literature, the objective and contribution of the thesis, the data-set used for experimentation and the quantitative measures used for performance evaluation,

Chapter 2 presents an iterative algorithm for registration of multi-focus images by combining global and local transformation models,

Chapter 3 presents a pixel-based algorithm for multi-focus image fusion using morphological wavelets,

Chapter 4 presents a block-based algorithm for multi-focus image fusion using a morphology-based focus measure in a quad-tree structure,

Chapter 5 presents a region-based algorithm for multi-focus image fusion using multi-scale morphology,

Chapter 6 presents the conclusion of the thesis and gives an outline of related future work.

The experimental set-up and the data-set used for experiments are presented now.

1.6 Experimental set-up

Proposed algorithms are implemented in the C language in a Unix environment. All programs are executed on a machine with an Intel Pentium T4400 processor and 1 GB RAM. Standard algorithms proposed by others have also been implemented in the same environment for comparison purposes.

1.6.1 Data used in experimentation

The algorithms are applied on a large number of multi-focus image-sets which vary in their object-contents and imaging set-up. Object-contents of image-sets vary in number, shape and distance of objects from the camera. Texture of image-sets varies in regularity, density and in combination of micro and macro texture. Some of the image-sets depict indoor scenes whereas others depict outdoor scenes. Images of indoor scenes generally contain human beings, animals and man-made objects. Man-made objects with straight-line edges (for example, book, book-shelf, table, window, door etc.) are helpful to detect artifacts like step-effects generated after processing. Images of outdoor scenes generally contain natural objects like flowers, plants, trees

of such images is difficult because, in addition to other differences, temporal changes between shots may occur due to wind. Hence slight mis-registration may be present in this type of images. This may in turn reveal the robustness of the fusion procedure against mis-registration.

Since it is not possible to include all experimental results in the thesis, we have chosen test image-sets in such a way that experiments are validated by different types of images. Twelve representative image-sets are used in the thesis and they are obtained from web-sites [32, 24, 26, 1, 28]. The image-sets named `Doll', `Toy', `Disk', `Lab', `Pepsi', `Clock', `Campus', `Hydrant', `Garden', `Rose', `News' and `OpenGL' are shown in Fig. 1.2. Among these, the multi-focus `Doll' images (Fig. 1.2A) are synthetic images generated from the famous painting `Las Meninas' by Diego Velazquez, kept at the `Museo del Prado' in Madrid. These images have been generated artificially by a modern painter cum art-teacher, John Hagan [32]. He has visually estimated the distances of various objects present in the painting. Accordingly, different portions of the original image of the painting have been artificially defocused by him to illustrate the concept of `depth-of-field'. Though the blurring model and the parameters are not known to us, we have used this multi-focus image-set as an ideal synthetic data-set for evaluating the performance of fusion algorithms. Moreover, this image-set contains three multi-focus images, hence it offers better illustration facility than the sets of two images. Image-sets `Toy', `Disk', `Lab', `Pepsi' and `Clock' are obtained from web-site [24]; `Campus' and `Hydrant' are obtained from web-site [26]; `Garden' is obtained from web-site [1]; `Rose', `News' and `OpenGL' are obtained from web-site [28]. The characteristics of the test image-sets are given now.

Image-set `Doll' depicts an indoor scene with many objects of arbitrary shapes and sizes, placed at different distances.

Image-set `Toy' depicts an indoor scene with many objects of regular shapes placed before a large and mostly dark background.

Table 1.1: Multi-focus images and their sizes

Image-set    Size (pixels)
Doll         384 × 576
Toy          512 × 512
Disk         448 × 576
Lab          448 × 576
Pepsi        512 × 512
Clock        256 × 256
Campus       480 × 640
Hydrant      480 × 640
Garden       320 × 448
Rose         512 × 704
News         224 × 320
OpenGL       512 × 704

of regular geometric shapes.

Image-sets `Pepsi' and `Clock' contain large objects, all of which have regular geometric shapes.

Image-sets `Campus' and `Hydrant' depict outdoor scenes with objects of mostly irregular texture and at large distances among themselves.

Image-set `Garden' depicts an outdoor scene with dense irregular texture.

Image-set `Rose' has a large area of regular grid-like structure as background.

Image-set `News' contains dense but mostly regular texture.

Image-set `OpenGL' contains both micro and macro textures.

As mentioned in Section 1.5, image registration is a necessary pre-requisite before

[Figure 1.2: Multi-focus image data-sets used for experimentation. Panel A, Doll: (i) near focused image, (ii) middle focused image, (iii) far focused image.]

`News') were not registered and we have registered them. Details of registration are given in Chapter 2. The rest of the images were already registered. Sizes of various image-sets after registration are given in Table 1.1.

1.7 Quantitative Performance Evaluation

A good fusion algorithm should be able to work on a variety of input images, should be robust enough to tolerate probable mis-registration of input images and should not produce any unwanted visual effect or artifact. Moreover, the quality of the fused image should satisfy the requirement of the intended application, and the computational complexity should also be affordable. Quality and time are inter-dependent and they are often related directly, that is, better quality needs more time. So depending on the specific application, one has to make a trade-off between these two.

There are two types of assessment: subjective or qualitative, and objective or quantitative [62]. In qualitative fusion quality assessment, subjects or observers are requested to examine the input image-sets and the output images obtained by various fusion techniques and then rank the output images according to their visual quality [64]. The average of the ranks given by different observers indicates the subjective quality of the techniques under examination. The process is time consuming, laborious and expensive. Moreover, the assessment in this process is non-repeatable, that is, for the same set of images the ranking given by an observer may change from time to time. Quantitative fusion quality evaluation overcomes these drawbacks by employing a metric that quantifies the quality of the fused images. The metric should estimate how much information is obtained from the input images, because the goal of image fusion is to integrate information from multiple sources. In conventional methods, the ideal fused image is used as the reference image and metrics like mean-square-error (MSE) and peak-signal-to-noise-ratio (PSNR) are used to estimate the error between the reference image and the processed image. Since reference images are not available here, we need to use metrics which do not require them.

In this thesis, quantitative evaluation of fusion algorithms is done by using two different metrics. They are based respectively on image gradients and structural similarity index. Each of the metrics yields a numerical value from the input image-set and the

[Figure 1.2 (continued): B. Toy: (i) near focused, (ii) middle focused, (iii) far focused image. C. Disk, D. Lab, E. Pepsi, F. Clock, G. Campus, H. Hydrant, I. Garden, J. Rose, K. News, L. OpenGL: (i) near focused image, (ii) far focused image.]

value means better fusion. The metrics are described below for two input images; however, they can be extended easily to three or more input images.

1.7.1 Gradient similarity index

Gradient operators are useful tools to measure variations in intensity of a pixel with respect to its immediate neighboring pixels [13]. It is observed that a pixel possesses a high gradient value when it is sharply focused. So in a set of multi-focus images, pixels of a sharply-focused region possess higher gradient values than pixels of the corresponding out-of-focus region. This observation led to an image fusion performance measure employing image gradients [57, 22]. For two multi-focus input images $X_1$ and $X_2$, gradient images $G_1$ and $G_2$ are obtained first. Then $G_1$ and $G_2$ are combined into $G$ by taking the maximum gradient value at each pixel position $(r,c)$. Therefore

\[ G(r,c) = \max\bigl(G_1(r,c),\, G_2(r,c)\bigr) \quad \text{for all } (r,c) \tag{1.10} \]

Thus only the sharply focused pixels from the constituent images contribute to the maximum gradient image $G$. Let $\tilde{G}$ denote the gradient of the fused or reconstructed image $F$. It is referred to as the gradient of the fused image. Then, the more similar $G$ and $\tilde{G}$ are, the better is the fusion algorithm. Now, following the usual definition of signal-to-noise ratio, a simple objective measure of similarity between two gradient images is calculated as

\[ S(G,\tilde{G}) = 1 - \frac{\sqrt{\sum \bigl(G(r,c) - \tilde{G}(r,c)\bigr)^2}}{\sqrt{\sum G^2(r,c)} + \sqrt{\sum \tilde{G}^2(r,c)}} \tag{1.11} \]

We call $S$ the gradient similarity index (GSI). Here, $\sqrt{\sum (G(r,c) - \tilde{G}(r,c))^2}$ determines the error or dissimilarity between the images, and it is normalized by the quantity $\sqrt{\sum G^2(r,c)} + \sqrt{\sum \tilde{G}^2(r,c)}$ to make the measure unbiased to the overall brightness of the images. So for an ideal fused image $S$ approaches the value 1. For our experi-

more than two input images, $G(r,c)$ is calculated as the maximum of the gradients at $(r,c)$ taken over all input images.
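The computation in Equations (1.10) and (1.11) can be sketched in a few lines of Python. This is only an illustrative sketch and not the implementation used in the thesis (which is written in C): the forward-difference gradient operator and the function names `gradient_magnitude`, `max_gradient` and `gsi` are assumptions introduced here, since the text does not fix a particular gradient operator.

```python
import math

def gradient_magnitude(img):
    """Gradient magnitude by forward differences (an assumed operator)."""
    rows, cols = len(img), len(img[0])
    grad = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Differences to the next row/column; the last row/column reuses itself.
            dr = img[min(r + 1, rows - 1)][c] - img[r][c]
            dc = img[r][min(c + 1, cols - 1)] - img[r][c]
            grad[r][c] = math.hypot(dr, dc)
    return grad

def max_gradient(gradients):
    """Equation (1.10): pixel-wise maximum over any number of gradient images."""
    rows, cols = len(gradients[0]), len(gradients[0][0])
    return [[max(g[r][c] for g in gradients) for c in range(cols)]
            for r in range(rows)]

def gsi(g, g_fused):
    """Equation (1.11): similarity of G and the gradient of the fused image."""
    err = math.sqrt(sum((a - b) ** 2
                        for row_a, row_b in zip(g, g_fused)
                        for a, b in zip(row_a, row_b)))
    norm = (math.sqrt(sum(a * a for row in g for a in row)) +
            math.sqrt(sum(b * b for row in g_fused for b in row)))
    # Two all-zero gradient images are treated as identical (assumed convention).
    return 1.0 if norm == 0 else 1.0 - err / norm
```

Because `max_gradient` accepts a list of gradient images, the same sketch covers the case of more than two input images described above.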

1.7.2 Fusion quality index

Structural similarity index (SSI) proposed by Wang and Bovik [83] is an effective metric to measure the quality of an image. For two real-valued sequences $X = (x_1, x_2, \ldots, x_n)$ and $Y = (y_1, y_2, \ldots, y_n)$, the metric $Q_0(X,Y)$ defined as

\[ Q_0(X,Y) = \frac{4\,\sigma_{XY}\,\mu_X\,\mu_Y}{(\mu_X^2 + \mu_Y^2)(\sigma_X^2 + \sigma_Y^2)} \tag{1.12} \]

measures the structural similarity of $X$ and $Y$. Here $\mu_X$ and $\mu_Y$ are the mean values of $X$ and $Y$; $\sigma_X^2$ and $\sigma_Y^2$ are the variances of $X$ and $Y$; and $\sigma_{XY}$ is the covariance of $X$ and $Y$. Structural similarity of two images is defined in a similar way. Since image signals are generally non-stationary, it is more appropriate to measure $Q_0$ over local regions and then combine the different results into a single measure. The authors [83] proposed to use a sliding window approach. Starting from the top-left corner of the two images $X_1, X_2$, a sliding window of fixed size (with $n$ pixels) moves pixel by pixel over the entire image until the bottom-right corner is reached. For each window $w$, the local quality index $Q_0(X_1, X_2 \mid w)$ is computed. Finally, the structural similarity index (SSI) $Q_0$ is computed by averaging all local quality indices.
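A minimal Python sketch of Equation (1.12) and of the sliding-window averaging is given here for illustration. The function names `q0` and `ssi`, the default window side of 8 pixels, and the value returned for flat windows (where the denominator of Equation (1.12) vanishes) are assumptions made for this sketch; the text does not specify them for the SSI.

```python
def q0(x, y):
    """Equation (1.12) for two equal-length sequences of pixel values."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / (n - 1)
    var_y = sum((b - mean_y) ** 2 for b in y) / (n - 1)
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    denom = (mean_x ** 2 + mean_y ** 2) * (var_x + var_y)
    if denom == 0:
        # Flat or all-zero windows leave Q0 undefined; assumed convention.
        return 1.0 if list(x) == list(y) else 0.0
    return 4.0 * cov_xy * mean_x * mean_y / denom

def ssi(img1, img2, size=8):
    """Average of the local Q0 values over all size-by-size windows."""
    rows, cols = len(img1), len(img1[0])
    values = []
    for r in range(rows - size + 1):          # window moves pixel by pixel
        for c in range(cols - size + 1):
            w1 = [img1[i][j] for i in range(r, r + size)
                  for j in range(c, c + size)]
            w2 = [img2[i][j] for i in range(r, r + size)
                  for j in range(c, c + size)]
            values.append(q0(w1, w2))
    return sum(values) / len(values)
```

For identical inputs every local index equals 1, so `ssi(img, img)` is 1 up to rounding; a sequence compared with its reversal gives a negative index because the covariance changes sign.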

Piella and Heijmans [66] proposed variants of SSI to measure the quality of image fusion. The fusion quality index (FQI) $Q(X_1, X_2, F)$ for input images $X_1, X_2$ and output image $F$ is defined by them as

\[ Q(X_1, X_2, F) = \frac{1}{|W|} \sum_{w \in W} \bigl( \lambda_1(w)\, Q_0(X_1, F \mid w) + \lambda_2(w)\, Q_0(X_2, F \mid w) \bigr) \tag{1.13} \]

where $Q_0(X_1, F \mid w)$ is the structural similarity index of $X_1$ and $F$ over the local window $w$, $W$ is the family of all local windows, $|W|$ is the cardinality of $W$, and $\lambda_1$ and $\lambda_2$ are weights obtained from local saliency measures. The local saliency measure $s(X \mid w)$ of input image $X$ should reflect the local relevance of $X$ within the window $w$, and it may depend on contrast, sharpness or entropy. Given the local saliencies $s(X_1 \mid w)$ and $s(X_2 \mid w)$, the local weights $\lambda_1(w)$ and $\lambda_2(w)$ are computed. They indicate the relative importance of image $X_1$ compared to image $X_2$. A typical choice for $\lambda_1(w)$ is $\frac{s(X_1 \mid w)}{s(X_1 \mid w) + s(X_2 \mid w)}$. In our evaluation, we have taken the window-size to be 8 × 8 pixels and the sum of gradient values in the local window to be the local saliency measure. For more than two input images, $Q$ is calculated as the average weighted sum of the $Q_0$'s calculated for all images. Here the weight for a local window in an image is calculated as the saliency of the window in that image divided by the sum of local saliencies of the corresponding windows in all images.
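Equation (1.13) for two input images can be sketched as follows. This is an illustrative sketch only: the names `fqi`, `window`, `sal1` and `sal2` are hypothetical, the saliency maps are assumed to be supplied per pixel (for example as gradient magnitudes, so that their sum over a window is the local saliency), and $\lambda_2(w) = 1 - \lambda_1(w)$ is assumed.

```python
def q0(x, y):
    """Local quality index of Equation (1.12)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / (n - 1)
    vy = sum((b - my) ** 2 for b in y) / (n - 1)
    cxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    denom = (mx * mx + my * my) * (vx + vy)
    if denom == 0:
        return 1.0 if x == y else 0.0   # assumed convention for flat windows
    return 4.0 * cxy * mx * my / denom

def window(img, r, c, size):
    """Pixels of the size-by-size window whose top-left corner is (r, c)."""
    return [img[i][j] for i in range(r, r + size) for j in range(c, c + size)]

def fqi(x1, x2, fused, sal1, sal2, size=8):
    """Equation (1.13) with saliency-based weights lambda_1 and lambda_2."""
    rows, cols = len(fused), len(fused[0])
    total, count = 0.0, 0
    for r in range(rows - size + 1):
        for c in range(cols - size + 1):
            s1 = sum(window(sal1, r, c, size))   # local saliency of X1
            s2 = sum(window(sal2, r, c, size))   # local saliency of X2
            lam1 = 0.5 if s1 + s2 == 0 else s1 / (s1 + s2)
            lam2 = 1.0 - lam1                    # assumed complement weight
            f_w = window(fused, r, c, size)
            total += (lam1 * q0(window(x1, r, c, size), f_w) +
                      lam2 * q0(window(x2, r, c, size), f_w))
            count += 1
    return total / count
```

When one input has zero saliency in a window, its local index does not contribute there, which is the behaviour the weighting is meant to produce for defocused regions.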

1.8 Organization of the thesis

The organization of the thesis is as follows. A survey on multi-focus image registration and an iterative hybrid method for the same are presented in Chapter 2. Chapter 3 presents a computationally efficient pixel-based algorithm for multi-focus image fusion (MFIF) using wavelets. Before describing the algorithm, the basic theory and a new wavelet called the morphologic wavelet are presented. Chapter 4 presents a block-based method for MFIF. It employs a new focus-measure called energy of morphologic gradients. Chapter 5 presents a region-based method for MFIF using multi-scale morphologic operators. In each chapter, after describing the new algorithm, experimental results on the data-sets given in Figure 1.2 are presented. Finally, Chapter 6 presents concluding remarks and outlines future work.

Multi-focus image registration

2.1 Introduction

Image registration is a necessary pre-requisite for multi-focus image fusion because, before fusion, the constituent images must be positioned properly with respect to a common coordinate system so that corresponding objects are overlaid properly [41]. In general, registration techniques may be classified according to two major aspects: methodology and application-area. The methods can be categorized into two types: (i) area-based and (ii) feature-based [92]. A third category has emerged which is a hybrid of area-based and feature-based techniques. Registration techniques may also be classified by their mapping models, that is, by examining whether they apply global and/or local mapping models. Global models use information from the entire image to estimate the mapping function parameters. On the other hand, local models treat the image as a composition of blocks/regions and the function parameters are estimated separately for each block/region.

Registration techniques for multi-focus images have been proposed in [41, 90, 91, 18, 22, 29, 27]. Among these, the methods proposed in [41, 90, 91, 18, 22] use global affine transformation models and the ones proposed in [29, 27] use global perspective transformation models. The technique proposed by Kubota et al. [41] is an area-based multi-scale technique. In this technique, Gaussian pyramids are first obtained from the source and the reference images. At the coarsest level of the pyramids, translation, rotation and magnification parameters are estimated by minimizing the MSE between the two images. The parameters are propagated to the next finer level and are further refined. The refinement process continues up to the original resolution level and the parameters obtained there are used to register the source image. Zhang and Blum [90, 91] proposed a hybrid multi-scale scheme using both area-based and feature-based techniques. In this technique also, Gaussian pyramids are first obtained from the source and the reference images. At the coarsest level of the pyramids, an initial estimation of transformation parameters (mainly rotation and translation) is done by using the edge features. The parameters are updated by iterative refinement of the optical flow estimation. They are propagated to the next finer level and are further refined. The process continues up to the finest level, in which the final parameters are obtained and are used to register the source image. De and Chanda [18, 22] described an area-based technique in which at first the source and reference images are divided into an equal number of blocks. A source block is swiped over the corresponding reference block to find out the best matching position in the block. Corresponding point-pairs are taken from best-matching blocks. Finally, affine transformation parameters are estimated from the best-matching pairs of points by using the least-square method. These parameters are then used to register the source image. Goshtasby [29] proposed a hybrid registration scheme in which the edge-intersection points are used as unique landmarks. At first, the landmarks in the source image are found. Then corresponding landmarks in the reference image are found by correlation template matching. From the corresponding landmark pairs, the best four satisfying the projective constraints are identified. They are used to calculate the projective transformation parameters. The source image is then registered by using these parameters. Fedorov et al. [27] used a hybrid registration scheme in which a number of well-located control-points are extracted globally at first. Preliminary matches of the tie-points are established by identifying the pairs

are pruned off using a RANSAC-like algorithm. Finally, perspective transformation parameters are estimated from the matched tie-points using the Normalized Direct Linear Transformation (DLT) algorithm. The source image is then registered by using these parameters.

The methods described above use global transformation models and do not apply any local model appropriate for registration of multi-focus images. In general, these images are acquired one by one in such a way that each image in the set has focus on objects at a particular distance from the camera. This results in global as well as local variations in the images. In this chapter we explore these variations and present an iterative algorithm for registration of multi-focus images which combines both global and local mapping models [20]. In the first step of the algorithm, a global translation is determined by maximizing the mutual information of the source and the reference images, and it is then applied to the source image. In the second step, a block-wise local scaling is applied to the translated source image. The scale-factors are determined by maximizing a similarity measure of two corresponding blocks of the translated source image and the reference image. The global and local transformations constitute a hybrid technique which is iterated to obtain the optimal result. The proposed method is automatic, easy to implement and gives good results. Results obtained by applying the method on different sets of multi-focus images are provided. The performance of the system is also evaluated and compared with a widely used method. The chapter is organized as follows. Section 2.2 describes the proposed algorithm. Experimental results and discussion, including performance analysis, are given in Section 2.3. Finally, concluding remarks are placed in Section 2.4.

2.2 An iterative hybrid registration algorithm

Multi-focus images of a scene are acquired one by one either by hand-held cameras or by cameras placed on tripods, in identical environmental conditions in respect to

image in the set has focus on objects at different distances in the scene. Previous research indicates that when the distance between the scene and the camera is large, it is usually possible to approximate the motion of the scene using an affine transformation [90]. Note that an affine transformation is usually a combination of translation, rotation and scaling (see Appendix B). In reality, for such applications, rotation of the camera relative to the scene is insignificant and hence is not considered here. Global scale-change between images may occur due to changes in focal settings. However, in most practical applications it is less than three percent [76] and hence is not considered here. We consider global (horizontal and/or vertical) translation(s) between images due to accidental camera-pan between shots taken by hand-held cameras, and the changes due to variations in focal settings during acquisition.

Focal variations are done intentionally to focus on objects at a particular distance. For example, objects at the background of a scene are farther than those at the foreground and, during acquisition, focus at the background generates a far-focused image in which the background objects are in focus but the foreground objects are out-of-focus. Similarly, focus at the foreground generates a near-focused image in which the foreground objects are in focus but the background objects are out-of-focus. Hence partial defocusing/blurring is inevitable in multi-focus images. Partial defocusing affects the images in two ways. Firstly, due to point-spreading, a blurred object appears to be larger in an image when compared to its focused counterpart in some other image [40]. In addition to that, the radii-of-blur may vary in near-focused and far-focused images. This results in local scale-change between images. Secondly, the position of an out-of-focus object may be changed when compared to the position of its focused counterpart in some other image. This is shown in Figure 2.1 by a paraxial geometric optics model of image formation using a thin convex lens. The focused image of a point-object $P$ is created as a point-image $P'$ on Plane-2, which is the in-focus image-plane for $P$. All other image-planes nearer to or farther from the lens than Plane-2 are out-of-focus image-planes for $P$. Plane-1 and Plane-3 are two such out-of-focus planes. The blurred images of point-object $P$ appear as blur-circles with

[Figure 2.1: Geometric Optics Model of Lens System. Labels in the diagram: point-object P, point-image P', Object Plane, Plane 1, Plane 2, Plane 3, and points A, B, C, D.]

focused and blurred images of the point-object do vary. In addition to that, the blur-circle is shifted vertically upwards in Plane-1 and vertically downwards in Plane-3. So the focused and blurred images of an object do have position differences as well. Intensity or radiometric differences caused by partial defocusing are not dealt with in this work because they are intrinsic to multi-focus images and we do not intend to change them. We rather concentrate on spatial transformations due to camera pan and partial defocusing. A single global transformation is not adequate to capture all these effects. Considering this fact, a registration technique is presented which works in two steps. To nullify the effects of global translation(s), the source image is translated globally in the direction(s) reverse to that of the camera pan. Once the translation is done, local variations in size and position are corrected by block-wise local scaling. The above two steps are iterated until a certain error criterion is fulfilled. A schematic diagram depicting the iterative steps is shown in Figure 2.2.

In registration of a set of multi-focus images, every image is equally authentic with its coordinate system. One of them is chosen to represent the common coordinate system and is called the reference/target/destination image. Other images are called source images. Source images are then registered to the reference image. Registration is a mapping between two images both spatially and with respect to intensity [7]. If source image $X_s$ and reference/destination image $X_d$ are defined as two-dimensional

[Figure 2.2: Schematic diagram for a hybrid and iterative registration method. Labels in the diagram: Multifocus Images (Source Image, Reference Image); Global Translation; Images of equal size (Translated Source Image, Reference Image); Local Scaling; Images after an Iteration (Scaled Source Image, Reference Image); Error < Limit ? (Yes, No); Stop; Registered Images.]

can be expressed generally as

\[ X'_s(u,v) = F_i\bigl(X_s(F_g(r,c))\bigr) \tag{2.1} \]

such that

\[ e = \bigl\| X_d(u,v) - X'_s(u,v) \bigr\|^2 \tag{2.2} \]

be minimum, where $F_i$ is the mapping for intensity transformation and $F_g$ is the mapping for geometric transformation so that $(u,v) = F_g(r,c)$. The equation may vary depending on the application. In this work, $F_g$ is a space-variant transformation which is a combination of global and local geometric transformations, instead of the single global transformation generally used for multi-focus image registration. We describe below the method for registration of source image $X_s$ with reference image $X_d$.

2.2.1 Global translation

Since multi-focus images are acquired one by one, accidental camera pan during acquisition may happen, and this results in global translation(s) of the image in small amounts. This effect can be nullified by translating the image in the reverse direction(s).

$X_s$ and $X_d$. Mutual information (MI), originating from information theory, is a measure of statistical dependency between two data-sets [33]. MI between two overlapping images $X_s$ and $X_d$ is given by

\[ MI(X_s, X_d) = H(X_s) + H(X_d) - H(X_s, X_d) \tag{2.3} \]

where $H(X_s)$ is the Shannon entropy defined as

\[ H(X_s) = -\sum_{k} p_s(k) \log p_s(k) \tag{2.4} \]

where $p_s(k)$ is the probability of occurrence of grey value $k$ in the image $X_s$. The Shannon entropy $H(X_d)$ of image $X_d$ is defined similarly. The joint entropy $H(X_s, X_d)$ of the two images $X_s$ and $X_d$ is given by

\[ H(X_s, X_d) = -\sum_{(k,l)} p(k,l) \log p(k,l) \tag{2.5} \]

where $p(k,l)$ is the joint probability of occurrences of grey values $k$ and $l$ in images $X_s$ and $X_d$ respectively. Entropy of a probability distribution is low when the distribution has a few sharply defined, dominant peaks, and it is maximum when all outcomes have an equal chance of occurring, that is, when the distribution is uniform. The same is true for joint entropy. It can be seen from Equation (2.3) that a small value of joint entropy leads to a large value of MI. The idea that MI can be used for image registration was pioneered by Collignon et al. [17] and Viola and Wells [80]. Both groups used the idea for registration of multi-modal images. It is based on the assumption that if two multi-modal images are properly aligned, then corresponding objects (and hence their respective ranges of grey values) from the two images overlay one another. This results in a few sharply defined, dominant peaks or ridges in the joint probability distribution of the images. Hence, their joint entropy is minimized and consequently MI is maximized.
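Equations (2.3) to (2.5) translate almost directly into code. The sketch below is illustrative only: the function names `entropy` and `mutual_information` are hypothetical, the probabilities are estimated from grey-value histograms of the two equal-sized images, and a base-2 logarithm is assumed (the text does not state the base).

```python
import math
from collections import Counter

def entropy(counts, total):
    """Shannon entropy in bits of a histogram given as a Counter."""
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def mutual_information(xs, xd):
    """Equation (2.3) for two equal-sized images given as lists of rows."""
    a = [v for row in xs for v in row]
    b = [v for row in xd for v in row]
    total = len(a)
    h_s = entropy(Counter(a), total)              # Equation (2.4) for X_s
    h_d = entropy(Counter(b), total)              # Equation (2.4) for X_d
    h_joint = entropy(Counter(zip(a, b)), total)  # Equation (2.5)
    return h_s + h_d - h_joint
```

For example, a two-level image compared with itself gives an MI equal to its own entropy, whereas comparing it with an image whose grey values are statistically independent of it gives an MI of zero.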

The idea is extended to multi-focus image registration. Source image $X_s$ is swiped over reference image $X_d$ in such a way that the grids of both images match properly. This is done by applying integral amount(s) of translation(s) along the axes. So no

the translated source image and the reference image are found. MI of the overlapping sub-images is calculated. By varying the translation-amounts within a range and calculating the MI of the overlapping sub-images, the amount of translation which maximizes MI is found. Suppose the source image $X_s$, after optimum translation(s), is mapped to $X_s(r + T_r, c + T_c)$, where $T_r$ and $T_c$ are the respective translations along the row and column axes. After mapping, only the overlapping portions of the translated source image and the reference image are retained and the rest are truncated. So essentially, the translated source image and the reference image become of the same size after truncation. This is important because in the next step we need the source and reference images to be of the same size. Henceforth, we shall refer to the new source image as $X_s$ and the new reference image as $X_d$.

The choice of the ranges within which $T_r$ and $T_c$ are varied is an experimental issue. A greater range means better accuracy, but it also means a greater time-requirement. We have experimented with various ranges of $T_r$ and $T_c$ and have seen in general that the shifts are within 5 pixels. So we have taken the range of $T_r$ and $T_c$ to be -5 to +5. Mis-alignments greater than 5 pixels are corrected during successive iterations.
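The exhaustive search over integral translations described above can be sketched as follows. This is a simplified illustration, not the thesis implementation: the names `overlap` and `best_translation` are hypothetical, both images are assumed to have the same size, and the sign convention (a positive shift moves the source content down and to the right on the reference grid) is an assumption of this sketch.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy in bits of a list of grey values or value pairs."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(values).values())

def overlap(src, ref, tr, tc):
    """Pixel lists of the overlap when src is shifted by (tr, tc) over ref."""
    rows, cols = len(ref), len(ref[0])
    a, b = [], []
    for r in range(max(0, tr), min(rows, rows + tr)):
        for c in range(max(0, tc), min(cols, cols + tc)):
            a.append(src[r - tr][c - tc])   # shifted source pixel
            b.append(ref[r][c])             # reference pixel on the same grid point
    return a, b

def best_translation(src, ref, max_shift=5):
    """Integral shift in [-max_shift, +max_shift] that maximizes MI."""
    best, best_mi = (0, 0), -math.inf
    for tr in range(-max_shift, max_shift + 1):
        for tc in range(-max_shift, max_shift + 1):
            a, b = overlap(src, ref, tr, tc)
            if not a:
                continue                     # no overlapping pixels at this shift
            mi = entropy(a) + entropy(b) - entropy(list(zip(a, b)))
            if mi > best_mi:
                best, best_mi = (tr, tc), mi
    return best
```

Since only integral shifts are tried, the grids of the two images always coincide and no interpolation is needed. With the default `max_shift=5` the search evaluates 11 × 11 = 121 candidate translations.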

2.2.2 Local scaling

Variations in focal settings during acquisition of multi-focus images result in local scale and position differences in focused and defocused images of an object, as explained in the beginning of this section by using Figure 2.1. The problem is addressed by block-wise registration of $X_s$ with $X_d$. At first, $X_s$ and $X_d$ are divided into $n$ equal-sized non-overlapping blocks. Since $X_s$ and $X_d$ are of the same size, their block-sizes are taken to be equal. We have experimented with different values of $n$ and found that $n = 16$ is a reasonably good choice for practical purposes. Each block of $X_s$ is scaled independently by appropriate factors along the axes. Resultant image is then

Best scale-factors for a block

The best scale-factors (along the row and column axes) for the k-th source block are determined by varying the scale-factors and then finding out the ones which give the best matching with the k-th reference block. The range and precision for varying the scale-factors are important. We have experimented with three different ranges, viz. 0.96-1.04, 0.97-1.03 and 0.98-1.02, with three different increment-values in each range, viz. 0.02, 0.01 and 0.005. It is observed from our experiments that in general increasing the range does not improve the results, but finer precision gives better results. The range 0.98-1.02 with precision 0.005 is found to be suitable for our purpose and we have used those values for varying the scale-factors along the axes.

Suppose that the k-th source block is scaled by the scale-factors $s_r$ and $s_c$ respectively along the r-axis and the c-axis. Depending on the scale-factors, horizontal and vertical dimensions of the scaled block are changed independently. The scaled source block is swiped over the k-th reference block. Suppose $X_s^k$ and $X_d^k$ respectively are the overlapping sub-images of the k-th source block (after scaling) and the k-th reference block. To find out the best matching scale-factors of the k-th source block we need either a similarity or a dissimilarity measure. Small block-size reduces the statistical power of the probability distribution estimation [67]. Hence, instead of mutual information, the area-based dissimilarity measure sum of squared differences (SSD) is used. For each swiping position of the source block, SSD between the overlapping sub-images $X_s^k$ and $X_d^k$ is computed by

\[ SSD(X_s^k, X_d^k) = \sum_{r} \sum_{c} \bigl\{ X_s^k(r,c) - X_d^k(r,c) \bigr\}^2 \tag{2.6} \]

The best match occurs when the SSD is minimum. The SSDs for the best matching positions for 9 different values (in the range 0.98-1.02 with precision 0.005) of each of the scale-factors $s_r$ and $s_c$ are noted. This results in a total of 81 readings of SSD for the block. The minimum of them gives the best scale-factors for the block. The range of scale-factors as stated above is obtained from experiments with a large number of
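The search over the 9 × 9 grid of scale-factors with the SSD of Equation (2.6) can be sketched as below. This is a simplified sketch under stated assumptions: the names `bilinear`, `rescale`, `ssd`, `SCALES` and `best_scale_factors` are hypothetical, each block is scaled about its centre and resampled onto its original grid with bi-linear interpolation, and the swiping of the scaled block over the reference block (the search for the best matching position) is omitted for brevity.

```python
import math

def bilinear(img, r, c):
    """Bi-linear sample of img at real-valued (r, c), clamped to the border."""
    rows, cols = len(img), len(img[0])
    r = min(max(r, 0.0), rows - 1.0)
    c = min(max(c, 0.0), cols - 1.0)
    r0, c0 = int(math.floor(r)), int(math.floor(c))
    r1, c1 = min(r0 + 1, rows - 1), min(c0 + 1, cols - 1)
    fr, fc = r - r0, c - c0
    top = img[r0][c0] * (1 - fc) + img[r0][c1] * fc
    bottom = img[r1][c0] * (1 - fc) + img[r1][c1] * fc
    return top * (1 - fr) + bottom * fr

def rescale(block, s_r, s_c):
    """Scale a block about its centre, resampled onto the original grid."""
    rows, cols = len(block), len(block[0])
    mid_r, mid_c = (rows - 1) / 2.0, (cols - 1) / 2.0
    return [[bilinear(block, mid_r + (r - mid_r) / s_r,
                      mid_c + (c - mid_c) / s_c)
             for c in range(cols)] for r in range(rows)]

def ssd(a, b):
    """Equation (2.6): sum of squared differences of two equal-sized blocks."""
    return sum((p - q) ** 2 for ra, rb in zip(a, b) for p, q in zip(ra, rb))

# 9 candidate factors 0.980, 0.985, ..., 1.020 (integer arithmetic avoids drift).
SCALES = [(980 + 5 * i) / 1000 for i in range(9)]

def best_scale_factors(src_block, ref_block):
    """Pair (s_r, s_c) with minimum SSD among the 81 candidates."""
    best, best_ssd = (1.0, 1.0), math.inf
    for s_r in SCALES:
        for s_c in SCALES:            # 9 x 9 = 81 SSD readings per block
            d = ssd(rescale(src_block, s_r, s_c), ref_block)
            if d < best_ssd:
                best, best_ssd = (s_r, s_c), d
    return best
```

Generating the candidates from integers keeps the value 1.0 exact, so a source block that already matches its reference block is left unscaled.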


Stitching a scaled block

Stitching a scaled block into the resultant image requires additional care. Before scaling, all source and reference blocks are of equal size. After scaling, if the scaled source block is smaller in size than the original source block and is stitched to the resultant image, then some blank area will be created. In that case an appropriately larger block surrounding the original source block is scaled and positioned there. If the resized block is larger than the reference block, it is clipped after positioning properly. Essentially, the registered source block and the reference block should be of the same size. For clarity, consider the following example.

Let us illustrate the situations which may occur due to local scaling, with a source block of size, say, 100 × 100 pixels. Suppose the best scale-factors for the block are 0.98 along both axes. So after scaling, the size of the block is 98 × 98 pixels, which is smaller than its target area. Hence a 102 × 102 block containing the original source block is scaled to obtain a 100 × 100 block which fits the target area. Now consider another case in which the best scale-factors for the block are 1.02 along both axes. After scaling, its size will be 102 × 102 pixels, which is bigger than its target area. Its best matching position is found by swiping it over the corresponding reference block. After that it is clipped to 100 × 100 pixels, and then stitched to its proper position. Finally, consider the case where the scale-factors are 1.02 along the r-axis and 0.98 along the c-axis. So the block becomes 102 × 98 pixels after scaling. In this case, a bigger block of size 100 × 102 containing the original block is taken, so that it becomes 102 × 100 after scaling. After finding out the best matching position, the block is clipped to 100 × 100 pixels and is stitched to the target block. Each source block is registered

Figure 2.3: Average error between source and reference images as the iteration step number increases

2.2.3 Iteration

As stated above, the proposed registration technique has two distinct steps: (i) global translation and (ii) local scaling. Optimum transformation parameters are determined in these two transformations independently. But when they are combined, the independently determined parameters may not remain optimum any more. Hence we iterate these two steps in the given order to achieve a more acceptable result. We expect, and have experimentally verified, that the transformations $F_g$ and $F_i$ of Equation (2.1) are updated and the error defined in Equation (2.2) is reduced. The iteration is stopped when there is no significant change in error. Root-mean-square-error (RMSE) between the source and reference images is taken as the measure of error for our implementation purpose. Average RMSE between source and reference images (used in Section 2.3) is shown against iteration step number in Figure 2.3. Column-0 indicates RMSE before registration, and for i = 1 to 4, Column-i indicates RMSE after the i-th step of iteration. It is seen from Figure 2.3 that RMSE decreases considerably in the first step of iteration; then, as the iteration step number increases, RMSE decreases,

Interpolation

In the global translation step, the source image is swiped over the reference image
in such a way that grids of both images match properly. Hence no interpolation
is required in this step. In the local scaling step, however, grids of the source and
the reference blocks do not match in general. Hence, interpolation is required.
Bilinear interpolation is a reasonable choice in terms of ease of implementation and
time complexity. But during successive iterations it may reduce the contrast of the
images. A higher-order interpolation like bicubic interpolation is a better choice in
that respect although it takes more time [13]. To reduce the time requirement,
bilinear interpolation is used while estimating the scale-factors for a block, and once the
best scale-factors are obtained, the block is finally reconstructed, using bicubic
interpolation, to be a part of the resultant registered image.
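The bilinear resampling used while estimating scale-factors can be sketched in C as below. This is an illustrative sketch under stated assumptions, not the thesis implementation: the names `bilinear_sample` and `rescale_block` are hypothetical, images are assumed to be 8-bit grey-scale in row-major order, out-of-range coordinates are clamped to the border, and output pixel (i, j) is assumed to read the source at (i / sr, j / sc).

```c
#include <assert.h>

/* Bilinearly interpolated grey value at fractional position (r, c) of a
   row-major 8-bit image. Coordinates are clamped to the image border
   (a boundary policy assumed here, not taken from the thesis). */
double bilinear_sample(const unsigned char *img, int rows, int cols,
                       double r, double c)
{
    assert(img != 0 && rows > 0 && cols > 0);
    if (r < 0.0) r = 0.0;
    if (c < 0.0) c = 0.0;
    if (r > rows - 1) r = rows - 1;
    if (c > cols - 1) c = cols - 1;

    int r0 = (int)r, c0 = (int)c;            /* top-left grid neighbour */
    int r1 = (r0 + 1 < rows) ? r0 + 1 : r0;  /* stay inside the image   */
    int c1 = (c0 + 1 < cols) ? c0 + 1 : c0;
    double fr = r - r0, fc = c - c0;         /* fractional offsets      */

    double top = (1.0 - fc) * img[r0 * cols + c0] + fc * img[r0 * cols + c1];
    double bot = (1.0 - fc) * img[r1 * cols + c0] + fc * img[r1 * cols + c1];
    return (1.0 - fr) * top + fr * bot;
}

/* Resample a block with scale-factors sr (r-axis) and sc (c-axis).
   out must hold out_rows * out_cols bytes. */
void rescale_block(const unsigned char *src, int rows, int cols,
                   double sr, double sc,
                   unsigned char *out, int out_rows, int out_cols)
{
    assert(sr > 0.0 && sc > 0.0);
    for (int i = 0; i < out_rows; i++)
        for (int j = 0; j < out_cols; j++) {
            double v = bilinear_sample(src, rows, cols, i / sr, j / sc);
            out[i * out_cols + j] = (unsigned char)(v + 0.5);  /* round */
        }
}
```

Each output value is a weighted average of up to four neighbours, which is why repeated bilinear resampling tends to smooth the image and lower its contrast. The final bicubic reconstruction described above is not shown here.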

2.3 Experimental results and discussion

The proposed algorithm for image registration has been implemented in the C language
in a Unix environment. The global translation step has been implemented by varying
the translation amount from -5 to +5 with unit increment along each axis. In the
local scaling step, source and reference images are divided into 16 blocks and for
each block the scale-factors along the axes are varied from 0.98 to 1.02 with an
increment of 0.005. At most three iterations were seen to be enough in each case.
Experimental results for five sets of multi-focus images (`Doll', `Disk', `Garden', `Rose'
and `News') are shown in Figures 2.4-2.8. In each result, the original multi-focus images
are followed by the images registered by the proposed method. The first image is taken as
the reference image in each case. To show the effectiveness of the method, difference
images (between the source and the reference image) before and after registration are
also provided.
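The exhaustive search over integer translations from -5 to +5 described above can be sketched in C as follows. This is a hedged illustration, not the author's code: the names `shifted_mse` and `best_translation` are hypothetical, the sign convention (source pixel (r, c) is compared with reference pixel (r + dr, c + dc)) and the averaging over the overlapping region only are assumptions. Mean squared error is minimised instead of RMSE; since the square root is monotone, both select the same shift.

```c
#include <assert.h>
#include <float.h>

/* Mean squared difference between src(r, c) and ref(r + dr, c + dc),
   averaged over the pixels where both images are defined. */
double shifted_mse(const unsigned char *ref, const unsigned char *src,
                   int rows, int cols, int dr, int dc)
{
    double sum = 0.0;
    long count = 0;
    for (int r = 0; r < rows; r++) {
        int rr = r + dr;
        if (rr < 0 || rr >= rows) continue;
        for (int c = 0; c < cols; c++) {
            int cc = c + dc;
            if (cc < 0 || cc >= cols) continue;
            double d = (double)src[r * cols + c] - (double)ref[rr * cols + cc];
            sum += d * d;
            count++;
        }
    }
    return count > 0 ? sum / count : DBL_MAX;
}

/* Try every integer shift in [-max_shift, +max_shift] along each axis
   and report the one with the smallest error. */
void best_translation(const unsigned char *ref, const unsigned char *src,
                      int rows, int cols, int max_shift,
                      int *best_dr, int *best_dc)
{
    assert(max_shift >= 0 && max_shift < rows && max_shift < cols);
    double best = DBL_MAX;
    *best_dr = 0;
    *best_dc = 0;
    for (int dr = -max_shift; dr <= max_shift; dr++)
        for (int dc = -max_shift; dc <= max_shift; dc++) {
            double e = shifted_mse(ref, src, rows, cols, dr, dc);
            if (e < best) {
                best = e;
                *best_dr = dr;
                *best_dc = dc;
            }
        }
}
```

With `max_shift = 5` the search evaluates 11 × 11 = 121 candidate shifts per image pair, and because only integer shifts are tried no interpolation is needed in this step.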
