Multi-focus Image Fusion using
Morphological Techniques
A Dissertation Presented by
Ishita De Ghosh
to
Indian Statistical Institute
in Partial Fulfillment of the Requirements
for the degree of
Doctor of Philosophy
A scene to be photographed usually includes objects at varying distances from the camera. The depth-of-field of a digital camera is the range of distances within which all objects appear sharp in the image. Because of the low depth-of-field of such cameras, the images they acquire often suffer from a degradation called out-of-focus blur. One way to enhance the effective depth-of-field is to acquire several images of a scene with focus on different parts of it and then combine these images into a single image in such a way that all regions of the scene are in focus. The acquired images are called multi-focus images and the process of combination is known as multi-focus image fusion. The techniques for multi-focus image fusion belong to three broad categories: pixel-based, block-based and region-based. They concentrate respectively on single pixels, small blocks of size $m \times n$ and arbitrarily shaped regions. Image registration is a necessary prerequisite for multi-focus image fusion. The thesis presents a new technique for multi-focus image registration and three new techniques for multi-focus image fusion. Among these techniques, the first is pixel-based, the second is block-based and the third is region-based. All of them use tools from mathematical morphology. The pixel-based method is a multi-resolution technique that employs a morphological wavelet as a tool for signal decomposition and reconstruction. The block-based method uses energy of morphological gradients as a focus measure. Finally, the region-based method uses multi-scale morphological tools for obtaining the focused regions from the input images. In this context, existing fusion techniques are studied and categorized. The thesis includes experimental results obtained by applying the proposed methods and other well-known methods to a variety of input data-sets. It also includes performance analysis of the various methods using standard quantitative evaluation techniques. At the end it presents concluding remarks.
I would like to express my deep and sincere gratitude to my academic advisor, Professor Bhabatosh Chanda of ISI, Kolkata. Without his constant understanding, encouragement and personal guidance, this work would not have been possible.
I would like to thank all the members of the ECSU of ISI, Kolkata for their help during my stay in the ECSU laboratory.
I acknowledge the technical support and help provided by Sri Satrajit Ghosh and Dr. Bibhas C. Dhara during different phases of the work.
Special thanks to Dr. Ajoy K. Mukherjee, Principal, Barrackpore R. S. College, for his cooperation, and to all faculty members and supporting staff of the Department of Computer Science, Barrackpore Rastraguru Surendranath College, for their generosity and logistic support.
Finally, I express my gratitude to my family members, including little Iman, for their continuous support and encouragement.
(Ishita De Ghosh)
1 Introduction
1.1 Motivation
1.2 Review of previous work
1.3 Objective of the thesis
1.4 Morphologic Operators
1.4.1 Multi-scale morphologic operators
1.5 Contribution of the thesis
1.6 Experimental set-up
1.6.1 Data used in experimentation
1.7 Quantitative Performance Evaluation
1.7.1 Gradient similarity index
1.7.2 Fusion quality index
1.8 Organization of the thesis
2.1 Introduction
2.2 An iterative hybrid registration algorithm
2.2.1 Global translation
2.2.2 Local scaling
2.2.3 Iteration
2.3 Experimental results and discussion
2.3.1 Quantitative performance evaluation
2.4 Summary
3 Pixel-based fusion
3.1 Introduction
3.2 Basic theory and a new morphologic wavelet
3.2.1 Multi-resolution Analysis
3.2.2 A new morphologic Wavelet
3.3 Multi-focus image fusion
3.3.1 Algorithm
3.3.2 Illustration
3.4 Experimental results and discussion
3.5 Summary
4 Block-based fusion
4.1 Introduction
4.2 A new block-based fusion algorithm
4.2.1 Detection of focused blocks in a quad-tree structure
4.2.2 Reconstruction
4.2.3 Energy of Morphologic Gradients: a new measure of focus
4.3 Experimental results and discussion
4.3.1 Discussion
4.4 Summary
5 Region based fusion
5.1 Introduction
5.2 Fusion by multi-scale morphology
5.2.1 Multi-scale top-hat transformation
5.2.2 Detection of focused regions
5.2.3 Reconstruction
5.3 Experimental results and Discussion
5.4 Summary
6 Conclusion and future work
6.1 Future work
6.1.1 Fusion by area morphology
6.1.2 Extension to multi-modal images
6.1.3 Hardware embedding
A Depth of field
B Affine transformation
1.1 An example of multi-focus image fusion
1.2 Multi-focus image data-sets used for experimentation
2.1 Geometric Optics Model of Lens System
2.2 Schematic diagram for a hybrid and iterative registration method
2.3 Effects of iterative registration
2.4 Registration-results of `Doll' images
2.5 Registration-results of `Disk' images
2.6 Registration-results of `Garden' images
2.7 Registration-results of `Rose' images
2.8 Registration-results of `News' images
2.9 Magnification of selected areas
3.1 Wavelet transform on a $2 \times 2$ block
3.2 Illustration of proposed wavelet transform on a $2 \times 2$ block
3.4 Results of multi-focus image fusion by wavelet transform
4.1 A generic schematic diagram for block-based fusion
4.2 Problem of multi-focus image fusion with equal-sized blocks
4.3 Subdivision of images in blocks according to a quad-tree structure
4.4 Example of recursive subdivision of a block in an image
4.5 Detection of focused blocks up to various levels in a quad-tree
4.6 Results of multi-focus image fusion by block-based methods
5.1 Detection of focused regions by multi-scale morphology
5.2 Focused regions and corresponding largest connected regions
5.3 Results of multi-focus image fusion by region-based methods
A.1 Par-axial geometric optics model of image formation
1.1 Multi-focus images and their sizes
2.1 Performance evaluation of registration by RMSE
2.2 Performance evaluation of registration by MI
2.3 Performance evaluation of registration by NCC
3.1 Performance evaluation of pixel-based methods by GSI
3.2 Performance evaluation of pixel-based methods by FQI
3.3 Time requirement in pixel-based methods
4.1 Performance evaluation of block-based methods by GSI
4.2 Performance evaluation of block-based methods by FQI
4.3 Time requirement in block-based methods
5.1 Performance evaluation of region-based methods by GSI
5.2 Performance evaluation of region-based methods by FQI
6.1 Performance summary of proposed pixel-based, block-based and region-based methods
Word(s)                                    Abbreviation
Depth-of-field                             DOF
Energy of gradients                        EOG
Energy of Laplacian                        EOL
Energy of morphologic gradients            EOMG
Focus-measure                              FM
Fusion quality index                       FQI
Gradient similarity index                  GSI
Multi-focus image fusion                   MFIF
Mutual information                         MI
Modified Laplacian                         ML
Multi-resolution                           MR
Multi-resolution decomposition             MRD
Multi-scale                                MS
Multi-scale decomposition                  MSD
Mean-square-error                          MSE
Normalized-cross-correlation               NCC
Normalized difference in focus-measure     NDFM
Quad-tree                                  QT
Root-mean-square-error                     RMSE
Structuring element                        SE
Spatial frequency                          SF
Sum modified Laplacian                     SML
Structural similarity index                SSI
Introduction
Advancements in digital imaging technology have increased the popularity of consumer imaging products such as digital cameras and camcorders. However, due to the physical limitations of the imaging systems, images produced by them often suffer from degradations. A scene to be photographed usually includes objects at varying distances from the camera. The sharpness distribution of an image of such a scene is affected by various factors. The object focused by the camera and the objects at the same distance from the camera as the focused object appear to be the sharpest in the image. Sharpness of the objects in front of and behind the focused distance decreases gradually in the image. This sharpness-loss is not significant within a certain range of object distances. This range is called the depth-of-field (DOF) of the camera [76]. DOF as calculated by the par-axial geometric optics model of image formation using a thin convex lens is given in Appendix A of the thesis. DOF depends on various factors such as the amount of sharpness-loss regarded as acceptable, the focal-length of the lens (the longer the focal-length, the shorter the DOF), the distance of the focused object (the nearer the object, the shorter the DOF) and the aperture used (decreasing the aperture will increase the DOF). The extreme case of decreasing the aperture for maximizing the DOF happens in a pin-hole camera. It has an infinite DOF. Unfortunately, the optical power in the image plane is reduced considerably due to the very small aperture. So cameras with finite DOF are preferred. A finite but large DOF means that objects within a large range (and hence possibly a large number of objects) will appear to be sharp in the photograph. On the other hand, a small DOF means that objects within a small range (and hence possibly a small number of objects) will appear to be sharp in the photograph and all other objects will appear to be out-of-focus. Out-of-focus blur is one of the typical degradations which occur in images acquired by digital cameras due to their low DOF [60, 76]. The problem of low DOF is also encountered in microscopy due to increment in magnification and aperture [61, 6, 36].
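
The dependence of DOF on focal-length, object distance and aperture can be checked numerically. The Python sketch below uses the commonly quoted thin-lens approximations for the near and far limits of acceptable sharpness. It is not the derivation of Appendix A; the function name, the parameter names and the sample values (including a 0.03 mm blur circle standing in for the acceptable sharpness-loss) are assumptions made for illustration.

```python
import math


def depth_of_field(focal_mm, f_number, coc_mm, subject_mm):
    """Return (near, far, dof) in millimetres for a thin lens.

    focal_mm   : focal length of the lens
    f_number   : focal length divided by aperture diameter
    coc_mm     : acceptable blur-circle diameter (assumed tolerance)
    subject_mm : distance of the focused object, must exceed focal_mm
    """
    if subject_mm <= focal_mm:
        raise ValueError("the focused object must lie beyond the focal length")
    f2 = focal_mm ** 2
    k = f_number * coc_mm * (subject_mm - focal_mm)
    near = subject_mm * f2 / (f2 + k)
    # When k reaches f^2 the far limit of sharpness recedes to infinity.
    far = subject_mm * f2 / (f2 - k) if f2 > k else math.inf
    return near, far, far - near


if __name__ == "__main__":
    for label, args in [
        ("baseline (50 mm, f/8, 3 m)", (50, 8, 0.03, 3000)),
        ("longer focal length", (100, 8, 0.03, 3000)),
        ("nearer object", (50, 8, 0.03, 1000)),
        ("smaller aperture (f/16)", (50, 16, 0.03, 3000)),
    ]:
        print(label, depth_of_field(*args))
```

Under these assumptions the printed DOF shrinks for the longer focal length and the nearer object and grows for the smaller aperture, which matches the qualitative statements in the paragraph above.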
1.1 Motivation
One way to enhance the effective DOF is to acquire several images of a scene focused on objects at different distances and then integrate these images into a single image in such a way that all regions of the scene are in focus. The acquired images are called multi-focus images and the process of combination is known as multi-focus image fusion (MFIF). The process produces an image whose total area-in-focus is more than that of any of the constituent images. Multi-focus images of a scene are acquired one by one either by hand-held cameras or by cameras placed on tripods, in identical environmental conditions in respect to sensor, light, view-direction, orientation and object-contents in the scene. They can be either grey-level or color images. Since each image in a set of multi-focus images has focus on objects at different distances in the scene, an object which is in-focus in the near-focused image may be out-of-focus in other images. Similarly an object which is out-of-focus in the near-focused image may be in-focus in the far-focused image. Hence partial defocusing/blurring is inevitable in this type of images. MFIF produces an image in which blurred regions are deblurred and every area is in focus. The fused image should be better for human viewing as well as for subsequent processing and analysis like segmentation, feature extraction, object recognition etc. Figure 1.1 shows an example of multi-focus images and the image obtained by fusing them.
Figure 1.1: An example of multi-focus image fusion. (a) Near focused image, (b) far focused image, (c) fused image.
The techniques for MFIF belong to three broad categories: pixel-based, block-based and region-based. They concentrate respectively on single pixels, small blocks of size $m \times n$ and arbitrarily shaped regions. It is interesting to study and compare MFIF techniques within a particular category and the ones belonging to different categories. The ultimate goal of MFIF is to obtain all objects in the final image in focused and identifiable form. Mathematical morphologic operators have the capability of handling objects in different shapes and sizes. In this thesis, we explore mathematical morphology as a tool for MFIF and propose new techniques for the same employing this tool. We provide a comparison of results obtained by various techniques and outline some related future work. Image registration is a necessary prerequisite for MFIF because before fusion the constituent images must be positioned properly with respect to a common coordinate system so that corresponding objects are overlaid properly [41]. We also propose a new technique for multi-focus image registration. In this chapter, a brief review of previous work on MFIF is given in Section 1.2, the objective of the thesis is given in Section 1.3, a brief account of mathematical morphologic operators is given in Section 1.4, the contribution of the thesis is given in Section 1.5, the experimental set-up along with the data used for experimentation is given in Section 1.6, the evaluation techniques used are given in Section 1.7, and finally the organization of the thesis is given in Section 1.8.
1.2 Review of previous work
The fundamental concept behind MFIF is to select the sharply focused regions from the input images to form an image in which all objects are in focus. The basic steps for this are: divide each input image into overlapping or non-overlapping regions, then measure sharpness of focus for all regions, and finally select the best-focused region among all corresponding regions to form the fused image. When the regions of interest shrink to single pixels, the approach is called a pixel-based approach; when they are small blocks of size $m \times n$, the approach is called a block-based approach; otherwise it is called a region-based approach. Another categorization depends on whether the technique works in the spatial domain or the frequency domain. In spatial domain techniques, input images are fused in the spatial domain using physically relevant spatial features. In frequency domain techniques, multi-scale decomposition (MSD) or multi-resolution decomposition (MRD) by pyramid or wavelet transform is required. An early categorization of frequency domain MRD fusion schemes was given by Zhang and Blum [89]. Piella [65] provided a general framework for these schemes and also proposed a new method for the same. Pajares and Cruz [59] presented a comprehensive tutorial on wavelet-based fusion methods. Goshtasby and Nikolov [30] presented an overview of various fusion techniques. The basic idea of MRD-based fusion schemes is the following. At first each source image is transformed/decomposed up to a level by an MRD scheme. The decomposition gives the scaled image as low frequency coefficients and the detail images as high frequency coefficients. Saliencies of the coefficients are measured by their activity-levels. A selection or decision map is created from the activity-levels of the coefficients from all transformed images. The map is used as a guide to construct the composite representation of the transformed images. Finally the fused image is obtained by applying the inverse transform to the composite representation. An MRD fusion scheme is categorized depending on how the activity-levels of MRD-coefficients are measured. If the activity-level is measured for the coefficients related to individual pixels, the method is called pixel-based; if it is measured depending on all coefficients in a block containing the concerned coefficient, the method is called block-based; and finally, if it is measured depending on all coefficients in a region containing the concerned coefficient, then the method is called region-based. Images fused by frequency-domain MRD schemes may lose some information of the source images because of implementation of the inverse multi-resolution transform.
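
The decompose, measure, select and invert steps described above can be sketched on one-dimensional signals. The Python example below is only an illustration: it assumes a one-level Haar-style average/difference transform, takes the absolute value of a detail coefficient as its activity-level, and averages the low-frequency coefficients. None of these choices is taken from a specific scheme reviewed here, and all function names are hypothetical.

```python
def haar_decompose(signal):
    """One-level split of an even-length signal into (approx, detail)."""
    approx = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return approx, detail


def haar_reconstruct(approx, detail):
    """Inverse of haar_decompose."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([a + d, a - d])
    return out


def fuse_mrd(sig_a, sig_b):
    """Fuse two equally long signals; return (fused signal, decision map)."""
    approx_a, detail_a = haar_decompose(sig_a)
    approx_b, detail_b = haar_decompose(sig_b)
    # Decision map: True where the coefficient of A has the higher activity.
    decision = [abs(da) >= abs(db) for da, db in zip(detail_a, detail_b)]
    detail_f = [da if pick_a else db
                for pick_a, da, db in zip(decision, detail_a, detail_b)]
    # Low-frequency coefficients are simply averaged in this sketch.
    approx_f = [(a + b) / 2 for a, b in zip(approx_a, approx_b)]
    return haar_reconstruct(approx_f, detail_f), decision


if __name__ == "__main__":
    sharp_left = [10, 0, 5, 5]    # detail present only in the first pair
    sharp_right = [5, 5, 10, 0]   # detail present only in the second pair
    print(fuse_mrd(sharp_left, sharp_right))
```

Because the activity here is measured on single coefficients, the sketch corresponds to the pixel-based category; a block-based or region-based variant would aggregate activity over a neighbourhood before building the decision map.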
The idea of using MRD schemes for image fusion was first proposed by Burt [8] as a model for binocular fusion for human stereo vision. He used the Laplacian pyramid for MRD and chose the max rule for coefficient selection. Burt and Adelson [10] later introduced a new approach to image fusion based on hierarchical image decomposition. Adelson [2] then used the Laplacian pyramid technique for MFIF. Toet [78] proposed the use of ratio of low-pass pyramids at successive levels of Gaussian pyramids for fusion of visible and IR images. Burt [9] proposed that fusion within a gradient pyramid provides improved stability and noise immunity. Akerman [3] optimized the Laplacian pyramid fusion in respect of multi-sensor fusion. Burt and Kolczynski [11] presented gradient pyramid fusion with a local match measure and a window-based saliency measure. Li et al. [45] used a similar method except that wavelet transform is used instead of pyramid transform and consistency verification is done along with a window-based activity measure. Wavelet based fusion techniques were proposed later by many others including Chipman et al. [16], Petrovic and Xydeas [63], Scheunders [72], Hill et al. [37], Hamza et al. [34], De and Chanda [19], Qu and Yan [68] and Lewis et al. [44]. Frequency domain techniques in various categories will be discussed in detail in related chapters.
Since multi-focus images of a scene are acquired with focus on complementary regions, focused regions in an image have more contrast than their defocused counterparts in other images. Focus-measure (FM) is a quantity for evaluating the contrast or sharpness of a pixel, block or region [39, 50]. Image variance, image gradients, image Laplacians, energy of image gradients (EOG) and energy of image Laplacian (EOL) are traditional FMs employed and validated for applications like autofocusing [76]. [...] Laplacian [58]. Spatial frequency (SF) [25] and Tenengrad [36] were later introduced as focus measures. In spatial domain MFIF techniques, input images are fused in the spatial domain using focus-measure as a physically relevant spatial feature in a localized area. Since these techniques emphasize a specific or desired image area, very little or no change occurs in other areas. Pixel-level weighted averaging is a spatial domain technique in which fusion is done by taking the weighted average of the pixel intensities of the input images. Weights are determined by tools like principal component analysis [71] or adaptive methods [42]. Other spatial domain pixel-level image fusion approaches include fusion using a controllable camera [73], probabilistic methods [5], and an image gradient method with majority filtering [23].
The basic idea in spatial domain block based fusion methods is to divide the input images into a number of blocks, then measure focus on corresponding blocks and finally select and combine the focused blocks to create the fused image [39]. Often consistency verification is done before creating the final fused image. Spatial domain block based fusion methods are proposed in [47, 48, 55, 29, 27, 87, 21]. Li et al. [47] used spatial frequency (SF) as the focus measure. In a subsequent work they [48] used a neural network (NN) to select better focused blocks using three features: SF, visibility and edge feature. Miao and Wang [55] used energy of image gradients (EOG) to measure focus in image blocks in an MFIF algorithm based on Pulse Coupled Neural Networks. In the method of Goshtasby [29], focus is measured by the sum of the gradient values of all pixels in the block. In the method of Fedorov et al. [27] each image is tiled with overlapping neighborhoods. For each region the tile that corresponds to the best focus (which is measured by ML) is chosen. Zhang and Ge [87] proposed a technique in which focused blocks are detected by measuring their blurriness. De and Chanda [21] introduced a new focus measure called energy of morphologic gradients (EOMG) and used it for image fusion in a block-based MFIF algorithm.
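
The divide, measure and select idea behind block-based methods can be sketched as follows. This Python example assumes fixed, non-overlapping square blocks and uses one common discrete form of EOG (sum of squared forward differences inside the block) as the focus measure. It omits consistency verification and is not the implementation of any cited method; the function names are hypothetical.

```python
def energy_of_gradients(block):
    """Sum of squared horizontal and vertical forward differences."""
    total = 0
    for r in range(len(block) - 1):
        for c in range(len(block[0]) - 1):
            dx = block[r][c + 1] - block[r][c]
            dy = block[r + 1][c] - block[r][c]
            total += dx * dx + dy * dy
    return total


def fuse_blocks(img_a, img_b, size):
    """Copy, block by block, whichever input has the larger EOG."""
    rows, cols = len(img_a), len(img_a[0])
    fused = [row[:] for row in img_a]  # start from A, overwrite where B wins
    for r0 in range(0, rows, size):
        for c0 in range(0, cols, size):
            block_a = [row[c0:c0 + size] for row in img_a[r0:r0 + size]]
            block_b = [row[c0:c0 + size] for row in img_b[r0:r0 + size]]
            if energy_of_gradients(block_b) > energy_of_gradients(block_a):
                for dr, row in enumerate(block_b):
                    fused[r0 + dr][c0:c0 + size] = row
    return fused


if __name__ == "__main__":
    left_sharp = [[0, 9, 5, 5],
                  [9, 0, 5, 5]]
    right_sharp = [[4, 4, 0, 9],
                   [4, 4, 9, 0]]
    for row in fuse_blocks(left_sharp, right_sharp, 2):
        print(row)
```

With equal-sized blocks a block may straddle a focused and a defocused area, which is why the choice of block size matters in practice.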
In region-based fusion techniques, among corresponding regions better focused ones [...] concern in these methods. Spatial-domain region-based fusion methods are proposed in [53, 57, 22, 49]. Methods described in [53, 57, 22] use multi-scale morphology. Matsopoulos et al. [53] used multi-scale morphologic pyramids. Mukhopadhyay and Chanda [57] used morphologic towers instead of morphologic pyramids. De et al. [22] proposed multi-focus image fusion techniques using multi-scale top-hat transformation. Li and Yang's technique [49] is a spatial-domain, region-based technique which does not depend on MRD. In this technique, input images are segmented according to the segmentation results of a temporary fused image and better focused regions are selected and stitched to their desired positions to get the final fused image. Spatial domain techniques in various categories will be discussed in detail in related chapters.
The objective of the thesis is given now.
1.3 Objective of the thesis
A number of researchers have suggested methods for MFIF as a solution to the problem of low depth-of-field. As discussed before, the techniques belong to three broad categories: pixel-based, block-based and region-based. It is interesting to study and compare MFIF techniques within a particular category and the techniques belonging to different categories. A good algorithm for MFIF should possess some important properties. It should be independent of image content and robust against probable misalignments of input images. It should not produce any unwanted visual effect or artifact. Quality of the fused image should satisfy the requirement of the intended application and finally the computational complexity should also be affordable. In general, pixel-based techniques are intuitively straightforward, easy to implement and computationally efficient. But they are sensitive to mis-registration of input images. Block-based and region-based techniques are more robust in respect of registration problems though they are more complex in general. Despite the increase in complexity, region-based methods have a number of advantages over pixel-based methods. [...] to attenuate or accentuate certain properties to the regions [30].
Multi-focus images may contain objects of different shapes and sizes. The ultimate goal of MFIF is to obtain all objects in the final image in focused/deblurred form. Mathematical morphology is a subject which treats an image as an ensemble of sets. Morphologic operators have the capability of handling objects in different shapes and sizes. They have some interesting computational advantages as well. In this thesis we explore morphologic techniques as a tool for MFIF. Algorithms for MFIF proposed in the thesis employ various combinations of morphologic operations.
Given this, the objective of the thesis is to propose and analyze grey-level MFIF schemes employing morphologic operators and having the following desirable properties:
• ability to work on a variety of input images,
• robustness against probable mis-registration of input images,
• extensibility to fuse multi-focus color images,
• low computational cost,
• adaptability to hardware implementation.
Since all algorithms for MFIF proposed in this thesis use morphologic operators, a brief introduction to them is given now.
1.4 Morphologic Operators
Mathematical morphology treats an image as a set of pixels [74, 75]. Morphologic operators work with two sets, the original image to be analyzed and a structuring element (SE) [...] to the operation. The fundamental morphologic operations are morphologic dilation and morphologic erosion. At first we present these two operators for binary images. A two-dimensional binary image signal is a function/mapping from domain D (which is a subset of discrete two-dimensional Euclidean space $Z^2$) to the binary set $\{0, 1\}$. Suppose A is the set of points representing the binary-1 pixels of the original binary image and B is the set of points representing binary-1 pixels of the SE. Then dilation and erosion of A by B are denoted by $A \oplus B$ and $A \ominus B$ respectively and are defined as
$$A \oplus B = \{\, b + a \mid b \in B \text{ and } a \in A \,\} \tag{1.1}$$
$$A \ominus B = \{\, p \mid b + p \in A \text{ for every } b \in B \,\} \tag{1.2}$$
where `+' denotes the binary-or operation. Practically, $A \oplus B$ is the locus of origin of B such that B hits A. Similarly, $A \ominus B$ is the locus of origin of B such that B fits in A.
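
Equations (1.1) and (1.2) translate almost directly into set operations. The Python sketch below represents A and B as sets of (row, column) pairs and, as an assumption of this sketch, reads b + a as the translation of point a by the offset b, that is, coordinate-wise addition. The function names are hypothetical.

```python
def dilate(a, b):
    """All points b + a with b in B and a in A, as in (1.1)."""
    return {(ar + br, ac + bc) for (ar, ac) in a for (br, bc) in b}


def erode(a, b):
    """All points p such that b + p lies in A for every b in B, as in (1.2)."""
    if not b:
        raise ValueError("the structuring element must not be empty")
    # Any valid p satisfies p + b0 in A for one fixed b0, so it is enough
    # to test the candidates a - b0 for a in A.
    br0, bc0 = next(iter(b))
    candidates = {(ar - br0, ac - bc0) for (ar, ac) in a}
    return {(pr, pc) for (pr, pc) in candidates
            if all((pr + br, pc + bc) in a for (br, bc) in b)}


if __name__ == "__main__":
    square = {(r, c) for r in range(3) for c in range(3)}   # 3 x 3 block
    cross = {(0, 0), (0, 1), (0, -1), (1, 0), (-1, 0)}      # plus-shaped SE
    print(sorted(erode(square, cross)))
    print(len(dilate(square, cross)))
```

In the example the cross fits inside the square only when its origin is at the centre, so the erosion keeps a single point, while the dilation grows the square by one pixel on each side except at the four corners.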
We now consider the case of grey-scale images. A two-dimensional grey-scale image signal X is a function/mapping from domain D (which is a subset of discrete two-dimensional space $Z^2$) to the set of grey intensity values $\{g_1, g_2, \ldots, g_n\}$ where each $g_i$ is a nonnegative integer. A grey-scale SE h is a mapping from its domain to the above set of grey values. In this thesis, we use flat SEs, that is, SEs for which the value of h is always zero. Let $(r, c)$ be a point in domain D, where r and c denote the row and column coordinates respectively. Dilation and erosion of $X(r, c)$ by $h(r, c)$ are denoted by $(X \oplus h)(r, c)$ and $(X \ominus h)(r, c)$ respectively and are defined as
$$(X \oplus h)(r, c) = \max_{(i, j) \in \text{Domain of } h} \bigl( X(r - i, c - j) + h(i, j) \bigr) \tag{1.3}$$
$$(X \ominus h)(r, c) = \min_{(i, j) \in \text{Domain of } h} \bigl( X(r + i, c + j) - h(i, j) \bigr) \tag{1.4}$$
where the maximum and minimum are taken over all $(i, j)$ in the domain of h such that $(r - i, c - j)$ and $(r + i, c + j)$ are in the domain of X. So dilation simply replaces the value at each point of X by the maximum value in the neighborhood defined by the SE when the origin of SE is placed at the point. Similarly erosion replaces the value at each point of X by the minimum value in the neighborhood defined by the SE when the origin of SE is placed at the point. Other morphologic operators are constructed by combining dilation and erosion. For example, opening and closing of $X(r, c)$ by $h(r, c)$ are denoted by $(X \circ h)(r, c)$ and $(X \bullet h)(r, c)$ respectively and are defined as
$$(X \circ h)(r, c) = ((X \ominus h) \oplus h)(r, c) \tag{1.5}$$
$$(X \bullet h)(r, c) = ((X \oplus h) \ominus h)(r, c) \tag{1.6}$$
Both opening and closing are increasing operations implying that opening (closing) of an image contains openings (closings) of all its sub-images. Both opening and closing are idempotent operations implying that successive applications of openings (closings) do not further modify the image. Finally, opening is an anti-extensive operation and closing is an extensive operation. In a grey-scale image X, an opening removes all foreground structures in the image that are not large enough to contain the SE. Similarly, a closing removes all background structures in the image that are not large enough to contain the SE. Here foreground structure means an image region of intensity value higher than the surrounding region.
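
For flat SEs, (1.3) to (1.6) reduce to neighbourhood maxima and minima. The Python sketch below represents a flat SE as a set of (i, j) offsets and restricts each maximum or minimum to offsets that stay inside the image, as stated after (1.4). It assumes the SE contains the origin so that every neighbourhood is non-empty; the function names and the test image are hypothetical.

```python
def grey_dilate(img, se):
    """Flat dilation (1.3): maximum of X(r - i, c - j) over offsets in se."""
    rows, cols = len(img), len(img[0])
    return [[max(img[r - i][c - j] for (i, j) in se
                 if 0 <= r - i < rows and 0 <= c - j < cols)
             for c in range(cols)] for r in range(rows)]


def grey_erode(img, se):
    """Flat erosion (1.4): minimum of X(r + i, c + j) over offsets in se."""
    rows, cols = len(img), len(img[0])
    return [[min(img[r + i][c + j] for (i, j) in se
                 if 0 <= r + i < rows and 0 <= c + j < cols)
             for c in range(cols)] for r in range(rows)]


def grey_open(img, se):
    """Opening (1.5): erosion followed by dilation."""
    return grey_dilate(grey_erode(img, se), se)


def grey_close(img, se):
    """Closing (1.6): dilation followed by erosion."""
    return grey_erode(grey_dilate(img, se), se)


if __name__ == "__main__":
    square = {(i, j) for i in (-1, 0, 1) for j in (-1, 0, 1)}
    image = [[0, 0, 0, 0, 0, 0],
             [0, 5, 5, 5, 0, 0],
             [0, 5, 5, 5, 0, 9],
             [0, 5, 5, 5, 0, 0],
             [0, 0, 0, 0, 0, 0]]
    for row in grey_open(image, square):
        print(row)
```

In the example the isolated bright pixel is too small to contain the 3 x 3 SE and is removed by the opening, while the 3 x 3 bright block survives unchanged.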
1.4.1 Multi-scale morphologic operators
Extraction of features by mathematical morphology depends on effective use of SEs. Sizes and shapes of SEs play crucial roles here. A morphologic operator with a scalable SE can extract features of various shapes and sizes. A scheme of morphologic operations with a scalable SE is termed multi-scale morphology [15, 52]. For a scalable SE h, the size of its domain gets changed. Let B be a set representing the domain of h. Assume that B has a definite shape. Let n be an integer representing the scale-factor of B and let nB denote the scaled version of B at scale n. If B is convex, then nB is obtained by $n - 1$ dilations of B by itself:
$$nB = \underbrace{B \oplus B \oplus \cdots \oplus B}_{n - 1 \text{ times}} \tag{1.7}$$
When $n = 0$, conventionally B is taken to be a disk of unit size so that $nB = \{(0, 0)\}$. Since the SEs used here are flat, a morphologic operation by SE h reduces to an operation by its domain nB. Then multi-scale opening and closing of X by scalable domain nB are defined respectively as
$$(X \circ nB)(r, c) = ((X \ominus nB) \oplus nB)(r, c) \tag{1.8}$$
$$(X \bullet nB)(r, c) = ((X \oplus nB) \ominus nB)(r, c) \tag{1.9}$$
The opening removes all bright/foreground structures in the image X that are not large enough to contain nB. Here foreground structure means an image region of intensity value higher than the surrounding region. Similarly, the closing removes all dark/background structures in the image X that are not large enough to contain nB. These operators are used effectively to detect focused regions which in general have more contrast than corresponding defocused regions.
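
The construction of nB in (1.7) can be sketched with sets of offsets. The Python example below assumes that B is given as a set of (row, column) offsets and that set dilation is coordinate-wise addition of offsets; the helper names and the choice of a 3 x 3 square as B are assumptions for illustration.

```python
def dilate_set(a, b):
    """Set dilation: every offset of a translated by every offset of b."""
    return {(ar + br, ac + bc) for (ar, ac) in a for (br, bc) in b}


def scaled_se(base, n):
    """Return nB, obtained by n - 1 dilations of B by itself as in (1.7).

    For n = 0 the result is the single point {(0, 0)}.
    """
    if n < 0:
        raise ValueError("the scale factor must be non-negative")
    if n == 0:
        return {(0, 0)}
    result = set(base)
    for _ in range(n - 1):
        result = dilate_set(result, base)
    return result


if __name__ == "__main__":
    square = {(i, j) for i in (-1, 0, 1) for j in (-1, 0, 1)}
    for n in range(4):
        print(n, len(scaled_se(square, n)))
```

For the square B used here, nB is a square of side 2n + 1, so each increase of the scale widens the class of structures that opening or closing by nB removes.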
Given the background and the objective of the thesis and a short introduction to morphologic operators, the contribution of the thesis is presented now.
1.5 Contribution of the thesis
It has already been discussed that the objective of the thesis is to propose and analyze grey-level MFIF schemes having certain desirable properties. Mathematical morphology is explored as a tool for MFIF and new techniques are presented employing this tool. In addition to a brief review of previous work, the objective of the thesis and a short introduction to morphologic operators, the current chapter, viz. Chapter 1, includes the data-set used for experimentation and the quantitative measures used for performance evaluation.
Since registration is a necessary prerequisite for MFIF, a new algorithm for multi-focus image registration is presented in Chapter 2. It is an iterative algorithm for registration of multi-focus images by combining global and local transformation models. In the first step, a global translation is determined by maximizing the mutual information of the source and the reference images and then it is applied on the source image. In the second step, a block-wise local scaling is applied on the translated source image. The scale-factors are determined by maximizing a similarity measure of two corresponding blocks of the translated source image and the reference image. The global and local transformations constitute a hybrid technique which is iterated to obtain the optimal result. The proposed method is automatic, easy to implement and gives good results. Results obtained by applying the method on different sets of multi-focus images are provided. Performance of the system is evaluated and compared with a widely used method.
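
Choosing a translation by maximizing mutual information can be illustrated in one dimension. The Python sketch below estimates MI from the joint histogram of paired samples and performs an exhaustive search over integer shifts. It is a simplified stand-in, not the algorithm of Chapter 2: the restriction to 1-D signals, the exhaustive search and the function names are all assumptions of this sketch.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """MI (in nats) of two equally long sequences of discrete values."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    count_x = Counter(xs)
    count_y = Counter(ys)
    # p(x, y) * log(p(x, y) / (p(x) p(y))) written with raw counts.
    return sum((k / n) * math.log(k * n / (count_x[x] * count_y[y]))
               for (x, y), k in joint.items())


def best_shift(source, reference, max_shift):
    """Integer shift d that maximizes MI between source[i - d] and reference[i]."""
    best_score, best_d = -1.0, 0
    for d in range(-max_shift, max_shift + 1):
        pairs = [(source[i - d], reference[i])
                 for i in range(len(reference))
                 if 0 <= i - d < len(source)]
        if not pairs:
            continue  # no overlap at this shift
        score = mutual_information([p[0] for p in pairs],
                                   [p[1] for p in pairs])
        if score > best_score:
            best_score, best_d = score, d
    return best_d


if __name__ == "__main__":
    reference = [0, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0]
    source = reference[2:] + [0, 0]   # content moved two samples to the left
    print(best_shift(source, reference, 2))
```

MI is computed only over the overlapping samples, so very large shifts leave few samples and make the estimate unreliable; a practical search would bound the shift range accordingly.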
Chapter 3 presents a pixel-based algorithm for multi-focus image fusion using morphologic wavelets. A nonlinear morphologic wavelet transform which preserves the range in the scaled images and involves integer arithmetic only is introduced at first. This transform is employed in a fusion algorithm to fuse a set of grey-scale multi-focus images. The method is computationally efficient and produces good results. Integrated-chip implementations of image processing algorithms are going to become more common in the near future. Our method will be useful in this respect. The problem with this algorithm is that, being a pixel-based method, it is not robust to mis-registration.
Chapter 4 presents a block-based algorithm for multi-focus image fusion using a morphology-based focus measure in a quad-tree structure. Focus-measure is a quantity for evaluating the contrast or sharpness of a pixel, block or region. A new focus-measure called energy of morphologic gradients (EOMG) is introduced. It is used in a novel algorithm for MFIF which employs a quad-tree structure for optimal subdivision of input images while selecting the sharply focused blocks. Though the algorithm starts with blocks, it ultimately identifies sharply focused regions in input images. The focus measure EOMG is comparable with other focus measures, viz. energy of gradients (EOG) and variance. The algorithm is robust in the sense that it works with any focus measure. It is also robust against pixel mis-registration. But as [...] may appear in the boundaries of arbitrary-shaped regions.
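
The idea of subdividing blocks in a quad-tree until the focus decision becomes clear can be sketched as follows. This Python example is an illustration under stated assumptions, not the algorithm of Chapter 4: it uses variance as the focus measure, decides a block when the normalized difference of the two focus values reaches a threshold chosen arbitrarily here, and otherwise splits the block into four. All names and the threshold value are hypothetical.

```python
def variance(img, r0, c0, h, w):
    """Variance of the h x w block of img whose top-left corner is (r0, c0)."""
    vals = [img[r][c] for r in range(r0, r0 + h) for c in range(c0, c0 + w)]
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals) / len(vals)


def _fuse_block(img_a, img_b, fused, r0, c0, h, w, threshold):
    fm_a = variance(img_a, r0, c0, h, w)
    fm_b = variance(img_b, r0, c0, h, w)
    total = fm_a + fm_b
    ndfm = abs(fm_a - fm_b) / total if total > 0 else 0.0
    cannot_split = h < 2 or w < 2
    if ndfm >= threshold or total == 0 or cannot_split:
        # Clear decision, flat block, or smallest block: copy the better one.
        src = img_a if fm_a >= fm_b else img_b
        for r in range(r0, r0 + h):
            fused[r][c0:c0 + w] = src[r][c0:c0 + w]
        return
    h2, w2 = h // 2, w // 2
    for rr, cc, hh, ww in [(r0, c0, h2, w2), (r0, c0 + w2, h2, w - w2),
                           (r0 + h2, c0, h - h2, w2),
                           (r0 + h2, c0 + w2, h - h2, w - w2)]:
        _fuse_block(img_a, img_b, fused, rr, cc, hh, ww, threshold)


def fuse_quadtree(img_a, img_b, threshold=0.5):
    """Fuse two equally sized grey images by recursive quad-tree subdivision."""
    fused = [row[:] for row in img_a]
    _fuse_block(img_a, img_b, fused, 0, 0, len(img_a), len(img_a[0]), threshold)
    return fused


if __name__ == "__main__":
    a = [[0, 9, 5, 5], [9, 0, 5, 5], [5, 5, 5, 5], [5, 5, 5, 5]]
    b = [[5, 5, 5, 5], [5, 5, 5, 5], [5, 5, 0, 9], [5, 5, 9, 0]]
    for row in fuse_quadtree(a, b):
        print(row)
```

Any other focus measure with the same call signature could replace `variance` in this sketch.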
Chapter 5 presents a region-based algorithm for multi-focus image fusion using multi-scale morphology. Since multi-focus images of a scene are acquired with focus on complementary regions, focused regions in an image have more contrast than their defocused counterparts in other images. This implies that the focused regions contain a larger number of physically relevant features than the corresponding defocused regions. Focused regions are detected by extracting the bright and dark features at various scales by multi-scale top-hat transformation. Since the best-focused regions are detected and copied from one image only, a slight error in registration will have no effect on fusion except at the borders of the focused regions. Hence this region-based method is robust to mis-registration. This method resembles the manual cut-and-paste method of image fusion which is often used for comparison purposes. Thus the fused image obtained by the method is very similar to the ideal fused image. Performance analysis reveals that our method is superior to fusion by a state-of-the-art method.
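
How a top-hat transformation extracts bright and dark features of a given scale can be seen on one-dimensional signals. The Python sketch below uses the standard definitions (signal minus its opening for bright features, closing minus the signal for dark features) with a flat segment of half-width n as the SE. It does not reproduce the detection procedure of Chapter 5; the function names and test signals are assumptions.

```python
def erode_1d(sig, n):
    """Flat erosion by a segment of half-width n, clipped at the borders."""
    return [min(sig[max(0, i - n):i + n + 1]) for i in range(len(sig))]


def dilate_1d(sig, n):
    """Flat dilation by a segment of half-width n, clipped at the borders."""
    return [max(sig[max(0, i - n):i + n + 1]) for i in range(len(sig))]


def bright_top_hat(sig, n):
    """Signal minus its opening: bright features narrower than the SE."""
    opened = dilate_1d(erode_1d(sig, n), n)
    return [x - o for x, o in zip(sig, opened)]


def dark_top_hat(sig, n):
    """Closing minus the signal: dark features narrower than the SE."""
    closed = erode_1d(dilate_1d(sig, n), n)
    return [c - x for x, c in zip(sig, closed)]


if __name__ == "__main__":
    spike = [0, 0, 0, 9, 0, 0, 0]      # bright feature one sample wide
    plateau = [0, 0, 3, 3, 3, 0, 0]    # bright feature three samples wide
    for n in (1, 2):
        print(n, bright_top_hat(spike, n), bright_top_hat(plateau, n))
```

At scale 1 only the narrow spike is extracted; the wider plateau appears once the SE outgrows it at scale 2, which is the sense in which features are collected "at various scales".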
Chapter 6 presents the conclusion of the thesis including a comparative study of techniques presented in previous chapters. It also presents a discussion on related future work.
In brief, in this thesis
• Chapter 1 presents a brief review of existing literature, the objective and contribution of the thesis, the data-set used for experimentation and the quantitative measures used for performance evaluation,
• Chapter 2 presents an iterative algorithm for registration of multi-focus images by combining global and local transformation models,
• Chapter 3 presents a pixel-based algorithm for multi-focus image fusion using morphologic wavelets,
• Chapter 4 presents a block-based algorithm for multi-focus image fusion using a morphology-based focus measure in a quad-tree structure,
• Chapter 5 presents a region-based algorithm for multi-focus image fusion using multi-scale morphology,
• Chapter 6 presents the conclusion of the thesis and gives an outline of related future work.
The experimental set-up and the data-set used for experiments are presented now.
1.6 Experimental set-up
The proposed algorithms are implemented in the C language in a Unix environment. All programs are executed on a machine with an Intel Pentium T4400 processor and 1 GB RAM. Standard algorithms proposed by others have also been implemented in the same environment for comparison.
1.6.1 Data used in experimentation
The algorithms are applied on a large number of multi-focus image-sets which vary in their object-contents and imaging set-up. Object-contents of image-sets vary in number, shape and distance of objects from the camera. Texture of image-sets varies in regularity, density and in combination of micro and macro texture. Some of the image-sets depict indoor scenes whereas others depict outdoor scenes. Images of indoor scenes generally contain human beings, animals and man-made objects. Man-made objects with straight-line edges (for example, book, book-shelf, table, window, door etc.) are helpful to detect artifacts like step-effects generated after processing. Images of outdoor scenes generally contain natural objects like flowers, plants and trees. Registration of such images is difficult because, in addition to other differences, temporal changes between shots may occur due to wind. Hence slight mis-registration may be present in this type of images. This may in turn reveal the robustness of the fusion procedure against mis-registration.
Since it is not possible to include all experimental results in the thesis, we have chosen
test image-sets in such a way that experiments are validated by different types
of images. Twelve representative image-sets are used in the thesis and they are obtained
from web-sites [32, 24, 26, 1, 28]. The image-sets named `Doll', `Toy', `Disk',
`Lab', `Pepsi', `Clock', `Campus', `Hydrant', `Garden', `Rose', `News' and `OpenGL'
are shown in Fig. 1.2. Among these, the multi-focus `Doll' images (Fig. 1.2A) are
synthetic images generated from the famous painting `Las Meninas' by Diego
Velazquez, kept at the `Museo del Prado' in Madrid. These images have been generated
artificially by a modern painter cum art-teacher, John Hagan [32]. He has visually
estimated the distances of various objects present in the painting. Accordingly, different
portions of the original image of the painting have been artificially defocused by
him to illustrate the concept of `depth-of-field'. Though the blurring model and the
parameters are not known to us, we have used this multi-focus image-set as an ideal
synthetic data-set for evaluating the performance of fusion algorithms. Moreover, this
image-set contains three multi-focus images, hence it offers better illustration facility
than the sets of two images. Image-sets `Toy', `Disk', `Lab', `Pepsi' and `Clock' are
obtained from web-site [24]; `Campus' and `Hydrant' are obtained from web-site [26];
`Garden' is obtained from web-site [1]; `Rose', `News' and `OpenGL' are obtained
from web-site [28]. The characteristics of the test image-sets are given now.
Image-set `Doll' depicts an indoor scene with many objects of arbitrary shapes
and sizes, placed at different distances.
Image-set `Toy' depicts an indoor scene with many objects of regular shapes
placed before a large and mostly dark background.
Table 1.1: Multi-focus images and their sizes
Image-set   Size (pixels)
Doll        384 × 576
Toy         512 × 512
Disk        448 × 576
Lab         448 × 576
Pepsi       512 × 512
Clock       256 × 256
Campus      480 × 640
Hydrant     480 × 640
Garden      320 × 448
Rose        512 × 704
News        224 × 320
OpenGL      512 × 704
of regular geometric shapes.
Image-sets `Pepsi' and `Clock' contain large objects, all of which have regular
geometric shapes.
Image-sets `Campus' and `Hydrant' depict outdoor scenes with objects of mostly
irregular texture and at large distances among themselves.
Image-set `Garden' depicts an outdoor scene with dense irregular texture.
Image-set `Rose' has a large area of regular grid-like structure as background.
Image-set `News' contains dense but mostly regular texture.
Image-set `OpenGL' contains both micro and macro textures.
As mentioned in Section 1.5, image registration is a necessary pre-requisite before
Figure 1.2: Multi-focus image data-sets used for experimentation. A. Doll: (i) near-focused, (ii) middle-focused and (iii) far-focused image.
`News') were not registered and we have registered them. Details of registration are
given in Chapter 2. The rest of the images were already registered. Sizes of the various
image-sets after registration are given in Table 1.1.
1.7 Quantitative Performance Evaluation
A good fusion algorithm should be able to work on a variety of input images, be robust
enough to tolerate probable mis-registration of the input images and should not
produce any unwanted visual effect or artifact. Moreover, the quality of the fused image
should satisfy the requirement of the intended application, and the computational complexity
should also be affordable. Quality and time are inter-dependent and they are often
related directly, that is, better quality needs more time. So depending on the specific
application, one has to make a trade-off between these two.
There are two types of assessment: subjective or qualitative, and objective or quantitative
[62]. In qualitative fusion quality assessment, subjects or observers are requested
to examine the input image-sets and the output images obtained by various fusion
techniques and then rank the output images according to their visual quality [64]. The
average of the ranks given by different observers indicates the subjective quality of the
techniques under examination. The process is time consuming, laborious and expensive.
Moreover, the assessment in this process is non-repeatable, that is, for the same
set of images the ranking given by an observer may change from time to time. Quantitative
fusion quality evaluation overcomes these drawbacks by employing a metric
that quantifies the quality of the fused images. The metric should estimate how much
information is obtained from the input images, because the goal of image fusion is to
integrate information from multiple sources. In conventional methods, the ideal fused
image is used as the reference image and metrics like mean-square-error (MSE) and
peak-signal-to-noise-ratio (PSNR) are used to estimate the error between the reference
image and the processed image. Since reference images are not available here,
we need to use metrics which do not require them.
In this thesis, quantitative evaluation of fusion algorithms is done by using two different
metrics. They are based respectively on image gradients and the structural similarity
index. Each of the metrics yields a numerical value from the input image-set and the
Figure 1.2: Continued. B. Toy: (i) near-focused, (ii) middle-focused and (iii) far-focused image. C. Disk, D. Lab, E. Pepsi, F. Clock, G. Campus, H. Hydrant, I. Garden, J. Rose, K. News and L. OpenGL: (i) near-focused and (ii) far-focused image each.
value means better fusion. The metrics are described below for two input images;
however, they can be extended easily to three or more input images.
1.7.1 Gradient similarity index
Gradient operators are useful tools to measure variations in the intensity of a pixel
with respect to its immediate neighboring pixels [13]. It is observed that a pixel
possesses a high gradient value when it is sharply focused. So in a set of multi-focus
images, pixels of a sharply-focused region possess higher gradient values than pixels
of the corresponding out-of-focus region. This observation led to an image fusion
performance measure employing image gradients [57, 22]. For two multi-focus input
images $X_1$ and $X_2$, gradient images $G_1$ and $G_2$ are obtained first. Then $G_1$ and $G_2$
are combined into $G$ by taking the maximum gradient value at each pixel position
$(r, c)$. Therefore

$$G(r,c) = \max\big(G_1(r,c),\, G_2(r,c)\big) \quad \text{for all } (r,c) \qquad (1.10)$$

Thus only the sharply focused pixels from the constituent images contribute
to the maximum gradient image $G$. Let $\tilde{G}$ denote the gradient of the fused
or reconstructed image $F$. It is referred to as the gradient of the fused image. Then, the
more similar $G$ and $\tilde{G}$ are, the better is the fusion algorithm. Now, following the usual
definition of signal-to-noise ratio, a simple objective measure of similarity between
two gradient images is calculated as

$$S(G,\tilde{G}) = 1 - \frac{\sqrt{\sum \big(G(r,c) - \tilde{G}(r,c)\big)^2}}{\sqrt{\sum G^2(r,c)} + \sqrt{\sum \tilde{G}^2(r,c)}} \qquad (1.11)$$

We call $S$ the gradient similarity index (GSI). Here, $\sqrt{\sum (G(r,c) - \tilde{G}(r,c))^2}$ determines
the error or dissimilarity between the images and it is normalized by the quantity
$\sqrt{\sum G^2(r,c)} + \sqrt{\sum \tilde{G}^2(r,c)}$ to make the measure unbiased to the overall brightness
of the images. So for an ideal fused image $S$ approaches the value 1. For our experi-
more than two input images, $G(r,c)$ is calculated as the maximum of the gradients
at $(r,c)$ taken over all input images.
1.7.2 Fusion quality index
Structural similarity index (SSI) proposed by Wang and Bovik [83] is an effective
metric to measure the quality of an image. For two real-valued sequences
$X = (x_1, x_2, \ldots, x_n)$ and $Y = (y_1, y_2, \ldots, y_n)$, the metric $Q_0(X,Y)$ defined as

$$Q_0(X,Y) = \frac{4\,\sigma_{XY}\,\mu_X\,\mu_Y}{(\mu_X^2 + \mu_Y^2)(\sigma_X^2 + \sigma_Y^2)} \qquad (1.12)$$

measures the structural similarity of $X$ and $Y$. Here $\mu_X$ and $\mu_Y$ are the mean values
of $X$ and $Y$; $\sigma_X^2$ and $\sigma_Y^2$ are the variances of $X$ and $Y$; and $\sigma_{XY}$ is the covariance of
$X$ and $Y$. Structural similarity of two images is defined in a similar way. Since image
signals are generally non-stationary, it is more appropriate to measure $Q_0$ over local
regions and then combine the different results into a single measure. The authors [83]
proposed to use a sliding window approach. Starting from the top-left corner of the
two images $X_1, X_2$, a sliding window of fixed size (with $n$ pixels) moves pixel by pixel
over the entire image until the bottom-right corner is reached. For each window $w$,
the local quality index $Q_0(X_1, X_2 \mid w)$ is computed. Finally, the structural similarity
index (SSI) $Q_0$ is computed by averaging all local quality indices.
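The computation of Equation (1.12) and the sliding-window averaging can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the implementation used in the thesis: images are row-major arrays of doubles, variances and covariance use the unbiased (n - 1) normalisation, a window with a vanishing denominator contributes 0, and the names `q0_index` and `sliding_q0` are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

/* Q0 of Equation (1.12) for two sequences of n samples.
   Assumed convention: returns 0.0 when n < 2 or the denominator is zero. */
double q0_index(const double *x, const double *y, size_t n)
{
    if (n < 2)
        return 0.0;
    double mx = 0.0, my = 0.0;
    for (size_t i = 0; i < n; ++i) {
        mx += x[i];
        my += y[i];
    }
    mx /= (double)n;
    my /= (double)n;
    double vx = 0.0, vy = 0.0, cxy = 0.0;
    for (size_t i = 0; i < n; ++i) {
        double dx = x[i] - mx, dy = y[i] - my;
        vx += dx * dx;
        vy += dy * dy;
        cxy += dx * dy;
    }
    vx /= (double)(n - 1);
    vy /= (double)(n - 1);
    cxy /= (double)(n - 1);
    double denom = (mx * mx + my * my) * (vx + vy);
    if (denom == 0.0)
        return 0.0;
    return 4.0 * cxy * mx * my / denom;
}

/* Average of the local Q0 values over all win x win windows, moved pixel by
   pixel from the top-left to the bottom-right corner of two images of size
   rows x cols. Returns 0.0 on invalid window size or allocation failure. */
double sliding_q0(const double *a, const double *b,
                  size_t rows, size_t cols, size_t win)
{
    if (win == 0 || win > rows || win > cols)
        return 0.0;
    double *wa = malloc(win * win * sizeof *wa);
    double *wb = malloc(win * win * sizeof *wb);
    if (!wa || !wb) {
        free(wa);
        free(wb);
        return 0.0;
    }
    double sum = 0.0;
    size_t count = 0;
    for (size_t r = 0; r + win <= rows; ++r) {
        for (size_t c = 0; c + win <= cols; ++c) {
            size_t k = 0;
            /* Copy the current window of both images into flat buffers. */
            for (size_t i = 0; i < win; ++i)
                for (size_t j = 0; j < win; ++j, ++k) {
                    wa[k] = a[(r + i) * cols + (c + j)];
                    wb[k] = b[(r + i) * cols + (c + j)];
                }
            sum += q0_index(wa, wb, win * win);
            ++count;
        }
    }
    free(wa);
    free(wb);
    return sum / (double)count;
}
```

Recomputing the statistics from scratch for every window keeps the sketch short; running sums would be faster for large images.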
Piella and Heijmans [66] proposed variants of SSI to measure the quality of image fusion.
The fusion quality index (FQI) $Q(X_1, X_2, F)$ for input images $X_1, X_2$ and output image
$F$ is defined by them as

$$Q(X_1, X_2, F) = \frac{1}{|W|} \sum_{w \in W} \big( \lambda_1(w)\, Q_0(X_1, F \mid w) + \lambda_2(w)\, Q_0(X_2, F \mid w) \big) \qquad (1.13)$$

where $Q_0(X_1, F \mid w)$ is the structural similarity index of $X_1$ and $F$ over the local
window $w$, $W$ is the family of all local windows, $|W|$ is the cardinality of $W$, and
$\lambda_1(w)$ and $\lambda_2(w)$ are weights obtained from local saliency measures. The local saliency measure
$s(X_i \mid w)$ of input image $X_i$ should reflect the local relevance of $X_i$ within the window
$w$, and it may depend on contrast, sharpness or entropy. Given the local saliencies
$s(X_1 \mid w)$ and $s(X_2 \mid w)$, the local weights $\lambda_1(w)$ and $\lambda_2(w)$ are computed. The weight $\lambda_1(w)$
indicates the relative importance of image $X_1$ compared to image $X_2$. A typical choice for $\lambda_1(w)$ is
$\lambda_1(w) = s(X_1 \mid w) / \big(s(X_1 \mid w) + s(X_2 \mid w)\big)$. In our evaluation, we have taken the window-size to be 8 × 8 pixels
and the sum of gradient values in the local window to be the local saliency measure.
For more than two input images, $Q$ is calculated as the average weighted sum of the $Q_0$'s
calculated for all images. Here the weight for a local window in an image is calculated
as the saliency of the window in that image divided by the sum of the local saliencies of the
corresponding windows in all input images.
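The weighted averaging of Equation (1.13) can be sketched in C for the two-image case. The sketch assumes that the local quality values and the local saliencies have already been computed for every window and stored in parallel arrays; it also assumes that the second weight is the complement of the first, which follows the typical choice quoted above by symmetry but is not stated explicitly in the text. The function name and the equal-weight fallback for windows with zero total saliency are illustrative choices.

```c
#include <stddef.h>

/* Fusion quality index of Equation (1.13) for two input images.
   q1[w], q2[w]: local Q0 of input 1 (resp. input 2) against the fused image.
   s1[w], s2[w]: local saliencies of the two inputs in window w.
   Assumptions: lambda2 = 1 - lambda1, and equal weights are used when both
   saliencies are zero. Returns 0.0 for an empty window family. */
double fusion_quality_index(const double *q1, const double *q2,
                            const double *s1, const double *s2,
                            size_t n_windows)
{
    if (n_windows == 0)
        return 0.0;
    double sum = 0.0;
    for (size_t w = 0; w < n_windows; ++w) {
        double total = s1[w] + s2[w];
        double lambda1 = (total > 0.0) ? s1[w] / total : 0.5;
        double lambda2 = 1.0 - lambda1;
        sum += lambda1 * q1[w] + lambda2 * q2[w];
    }
    return sum / (double)n_windows;
}
```

With the settings reported in the text, each window would be 8 × 8 pixels and each saliency the sum of the gradient values inside that window.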
1.8 Organization of the thesis
The organization of the thesis is as follows. A survey of multi-focus image registration and
an iterative, hybrid method for the same are presented in Chapter 2. Chapter 3
presents a computationally efficient pixel-based algorithm for multi-focus image fusion (MFIF) using wavelets.
Before describing the algorithm, the basic theory and a new wavelet called the morphologic
wavelet are presented. Chapter 4 presents a block-based method for MFIF. It employs
a new focus-measure called energy of morphologic gradients. Chapter 5 presents
a region-based method for MFIF using multi-scale morphologic operators. In each
chapter, after describing the new algorithm, experimental results on the data-sets given
in Figure 1.2 are presented. Finally, Chapter 6 presents concluding remarks and
outlines future work.
Multi-focus image registration
2.1 Introduction
Image registration is a necessary pre-requisite for multi-focus image fusion because,
before fusion, the constituent images must be positioned properly with respect to
a common coordinate system so that corresponding objects are overlaid properly
[41]. In general, registration techniques may be classified according to two major
aspects: methodology and application-area. The methods can be categorized into two
types: (i) area-based and (ii) feature-based [92]. A third category has emerged which
is a hybrid of area-based and feature-based techniques. Registration techniques may
also be classified by their mapping models, that is, by examining whether they apply
global and/or local mapping models. Global models use information from the entire
image to estimate the mapping function parameters. On the other hand, local models
treat the image as a composition of blocks/regions and the function parameters are
estimated separately for each block/region.
Registration techniques for multi-focus images have been proposed in [41, 90, 91, 18,
22, 29, 27]. Among these, the methods proposed in [41, 90, 91, 18, 22] use global
affine transformation models and the ones proposed in [29, 27] use global perspective
transformation models. The technique proposed by Kubota et al. [41] is an area-based
multi-scale technique. In this technique, Gaussian pyramids are first obtained from the
source and the reference images. At the coarsest level of the pyramids,
translation, rotation and magnification parameters are estimated by minimizing the
MSE between the two images. The parameters are propagated to the next finer
level and are further refined. The refinement process continues up to the original
resolution level and the parameters obtained there are used to register the source
image. Zhang and Blum [90, 91] proposed a hybrid multi-scale scheme using both
area-based and feature-based techniques. In this technique also, Gaussian pyramids
are first obtained from the source and the reference images. At the coarsest
level of the pyramids, an initial estimation of transformation parameters (mainly
rotation and translation) is done by using the edge features. The parameters are
updated by iterative refinement of the optical flow estimation. They are propagated
to the next finer level and are further refined. The process continues up to the finest
level, in which the final parameters are obtained and are used to register the source
image. De and Chanda [18, 22] described an area-based technique in which the
source and reference images are first divided into an equal number of blocks. A source
block is swept over the corresponding reference block to find the best matching
position in the block. Corresponding point-pairs are taken from best-matching blocks.
Finally, affine transformation parameters are estimated from the best-matching pairs
of points by using the least-square method. These parameters are then used to
register the source image. Goshtasby [29] proposed a hybrid registration scheme
in which the edge-intersection points are used as unique landmarks. At first, the
landmarks in the source image are found. Then corresponding landmarks in the
reference image are found by correlation template matching. From the corresponding
landmark pairs, the best four satisfying the projective constraints are identified. They
are used to calculate the projective transformation parameters. The source image is then
registered by using these parameters. Fedorov et al. [27] used a hybrid registration
scheme in which a number of well-located control-points are first extracted globally.
Preliminary matches of the tie-points are established by identifying the pairs
are pruned off using a RANSAC-like algorithm. Finally, perspective transformation
parameters are estimated from the matched tie-points using the Normalized Direct
Linear Transformation (DLT) algorithm. The source image is then registered by using
these parameters.
The methods described above use global transformation models and do not apply
any local model appropriate for registration of multi-focus images. In general, these
images are acquired one by one in such a way that each image in the set has focus
on objects at a particular distance from the camera. This results in global as well
as local variations in the images. In this chapter we explore these variations and
present an iterative algorithm for registration of multi-focus images which combines
both global and local mapping models [20]. In the first step of the algorithm, a global
translation is determined by maximizing the mutual information of the source and
the reference images and then it is applied to the source image. In the second step,
a block-wise local scaling is applied to the translated source image. The scale-factors
are determined by maximizing a similarity measure of two corresponding blocks of the
translated source image and the reference image. The global and local transformations
constitute a hybrid technique which is iterated to obtain the optimal result. The
proposed method is automatic, easy to implement and gives good results. Results
obtained by applying the method to different sets of multi-focus images are provided.
The performance of the system is also evaluated and compared with a widely
used method. The chapter is organized as follows. Section 2.2 describes the proposed
algorithm. Experimental results and discussion including performance analysis are
given in Section 2.3. Finally, concluding remarks are placed in Section 2.4.
2.2 An iterative hybrid registration algorithm
Multi-focus images of a scene are acquired one by one, either by hand-held cameras
or by cameras placed on tripods, in identical environmental conditions in respect to
image in the set has focus on objects at different distances in the scene. Previous research
indicates that when the distance between the scene and the camera is large, it
is usually possible to approximate the motion of the scene using an affine transformation
[90]. Note that an affine transformation is usually a combination of translation,
rotation and scaling (see Appendix B). In reality, for such applications, rotation of the
camera relative to the scene is insignificant and hence is not considered here. Global
scale-change between images may occur due to changes in focal settings. However, in
most practical applications it is less than three percent [76] and hence is not considered
here. We consider global (horizontal and/or vertical) translation(s) between
images due to accidental camera-pan between shots taken by hand-held cameras, and
the changes due to variations in focal settings during acquisition.
Focal variations are made intentionally to focus on objects at a particular distance.
For example, objects in the background of a scene are farther than those in the
foreground and, during acquisition, focus at the background generates a far-focused image
in which the background objects are in focus but the foreground objects are out-of-focus.
Similarly, focus at the foreground generates a near-focused image in which the
foreground objects are in focus but the background objects are out-of-focus. Hence
partial defocusing/blurring is inevitable in multi-focus images. Partial defocusing
affects the images in two ways. Firstly, due to point-spreading, a blurred object
appears to be larger in an image when compared to its focused counterpart in some
other image [40]. In addition to that, the radii-of-blur may vary in near-focused and
far-focused images. This results in local scale-change between images. Secondly, the
position of an out-of-focus object may be changed when compared to the position of
its focused counterpart in some other image. This is shown in Figure 2.1 by a paraxial
geometric optics model of image formation using a thin convex lens. The focused
image of a point-object $P$ is created as a point-image $P'$ on Plane-2, which is the
in-focus image-plane for $P$. All other image-planes nearer to or farther from the lens
than Plane-2 are out-of-focus image-planes for $P$. Plane-1 and Plane-3 are two such
out-of-focus planes. The blurred images of point-object $P$ appear as blur-circles with
Figure 2.1: Geometric Optics Model of Lens System (labels: point-object P on the object plane; point-image P'; image Plane 1, Plane 2 and Plane 3; points A, B, C and D)
focused and blurred images of the point-object do vary. In addition to that, the blur-circle
is shifted vertically upwards in Plane-1 and it is shifted vertically downwards in
Plane-3. So the focused and blurred images of an object do have position differences
as well. Intensity or radiometric differences caused by partial defocusing are not
dealt with in this work because they are intrinsic to multi-focus images and we do
not intend to change them. We rather concentrate on spatial transformations due to
camera pan and partial defocusing. A single global transformation is not adequate to
capture all these effects. Considering this fact, a registration technique is presented
which works in two steps. To nullify the effects of global translation(s), the source
image is translated globally in the direction(s) reverse to that of the camera pan.
Once the translation is done, local variations in size and position are corrected by
block-wise local scaling. The above two steps are iterated until a certain error criterion is
fulfilled. A schematic diagram depicting the iterative steps is shown in Figure 2.2.
In registration of a set of multi-focus images, every image is equally authentic with
its coordinate system. One of them is chosen to represent the common coordinate
system and is called the reference/target/destination image. The other images are called
source images. Source images are then registered to the reference image. Registration
is a mapping between two images both spatially and with respect to intensity [7]. If
source image $X_s$ and reference/destination image $X_d$ are defined as two-dimensional
Figure 2.2: Schematic diagram for a hybrid and iterative registration method (block labels: multi-focus images, i.e. source image and reference image; global translation; translated source image and reference image of equal size; local scaling; scaled source image and reference image; decision `Error < Limit?' with `Yes' leading to stop and the registered images, and `No' leading to another iteration)
can be expressed generally as

$$X'_s(u,v) = F_i\big(X_s(F_g(r,c))\big) \qquad (2.1)$$

such that

$$e = \big\| X_d(u,v) - X'_s(u,v) \big\|^2 \qquad (2.2)$$

is minimum, where $F_i$ is the mapping for intensity transformation and $F_g$ is the
mapping for geometric transformation so that $(u,v) = F_g(r,c)$. The equation may
vary depending on the application. In this work, $F_g$ is a space-variant transformation
which is a combination of global and local geometric transformations, instead of the
single global transformation generally used for multi-focus image registration. We
describe below the method for registration of source image $X_s$ with reference image
$X_d$.
2.2.1 Global translation
Since multi-focus images are acquired one by one, accidental camera pan during
acquisition may happen and this results in global translation(s) of the image in small
amounts. This effect can be nullified by translating the image in the reverse direction(s).
$X_s$ and $X_d$. Mutual information (MI), originating from information theory, is
a measure of statistical dependency between two data-sets [33]. MI between two
overlapping images $X_s$ and $X_d$ is given by

$$MI(X_s, X_d) = H(X_s) + H(X_d) - H(X_s, X_d) \qquad (2.3)$$

where $H(X_s)$ is the Shannon entropy defined as

$$H(X_s) = -\sum_{k} p_s(k) \log p_s(k) \qquad (2.4)$$

where $p_s(k)$ is the probability of occurrence of grey value $k$ in the image $X_s$. The
Shannon entropy $H(X_d)$ of image $X_d$ is defined similarly. The joint entropy $H(X_s, X_d)$
of two images $X_s$ and $X_d$ is given by

$$H(X_s, X_d) = -\sum_{(k,l)} p(k,l) \log p(k,l) \qquad (2.5)$$

where $p(k,l)$ is the joint probability of occurrence of grey values $k$ and $l$ in images $X_s$
and $X_d$ respectively. The entropy of a probability distribution is low when the distribution
has a few sharply defined, dominant peaks and it is maximum when all outcomes have
an equal chance of occurring, that is, the distribution is uniform. The same is true for
joint entropy. It can be seen from Equation (2.3) that a small value of joint entropy
leads to a large value of MI. The idea that MI can be used for image registration
was pioneered by Collignon et al. [17] and Viola and Wells [80]. Both groups used
the idea for registration of multi-modal images. It is based on the assumption that if
two multi-modal images are properly aligned, then corresponding objects (and hence
their respective ranges of grey values) from the two images overlay on one another. This
results in a few sharply defined, dominant peaks or ridges in the joint probability
distribution of the images. Hence, their joint entropy is minimized and consequently
MI is maximized.
The idea is extended to multi-focus image registration. Source image $X_s$ is swept
over reference image $X_d$ in such a way that grids of both images match properly.
This is done by applying integral amount(s) of translation(s) along the axes. So no
the translated source image and the reference image are found. MI of the overlapping
sub-images is calculated. By varying the translation-amounts within a range and calculating
the MI of the overlapping sub-images, the amount of translation which maximizes
MI is found. Suppose the source image $X_s$, after optimum translation(s), is mapped
to $X_s(r + T_r, c + T_c)$, where $T_r$ and $T_c$ are the respective translations along the row and
column axes. After mapping, only the overlapping portions of the translated source image
and the reference image are retained and the rest are truncated. So essentially, the translated
source image and the reference image become of the same size after truncation. This is
important because in the next step we need the source and reference images to be of the
same size. Henceforth, we shall refer to the new source image as $X_s$ and the new reference
image as $X_d$.
The choice of the ranges within which $T_r$ and $T_c$ are varied is an experimental issue.
A greater range means better accuracy, but it also means a greater time-requirement.
We have experimented with various ranges of $T_r$ and $T_c$ and have seen that in general
the shifts are within 5 pixels. So we have taken the range of $T_r$ and $T_c$ to be -5 to
+5. Mis-alignments greater than 5 pixels are corrected during successive iterations.
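The exhaustive search over integral translations can be sketched in C as follows. For each candidate pair of shifts in the range, the overlapping sub-images are copied out and scored, and the best-scoring pair is kept. To keep the example short and self-contained, the similarity measure is passed in as a function pointer and a simple stand-in (negative mean absolute difference) is provided; in the method described above the score would be the mutual information of the overlapping sub-images. All names and the 8-bit row-major image layout are assumptions of this sketch.

```c
#include <stddef.h>
#include <stdlib.h>

/* Similarity between two pixel buffers of length n; larger means better. */
typedef double (*similarity_fn)(const unsigned char *a,
                                const unsigned char *b, size_t n);

/* Stand-in similarity for this sketch only (NOT mutual information):
   negative mean absolute difference of the two buffers. */
double neg_mean_abs_diff(const unsigned char *a, const unsigned char *b, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; ++i)
        s += abs((int)a[i] - (int)b[i]);
    return n ? -s / (double)n : 0.0;
}

/* Tries every integral shift (tr, tc) with |tr|, |tc| <= range, compares the
   translated source X_s(r + tr, c + tc) with the reference X_d(r, c) on their
   overlap, and stores the shift with the highest score.
   Returns 0 on success, -1 on invalid input or allocation failure. */
int best_translation(const unsigned char *src, const unsigned char *ref,
                     int rows, int cols, int range, similarity_fn score,
                     int *best_tr, int *best_tc)
{
    if (rows <= 0 || cols <= 0 || range < 0)
        return -1;
    unsigned char *a = malloc((size_t)rows * (size_t)cols);
    unsigned char *b = malloc((size_t)rows * (size_t)cols);
    if (!a || !b) {
        free(a);
        free(b);
        return -1;
    }
    int found = 0;
    double best = 0.0;
    for (int tr = -range; tr <= range; ++tr) {
        for (int tc = -range; tc <= range; ++tc) {
            /* Reference rows/columns for which the shifted source exists. */
            int r0 = tr < 0 ? -tr : 0, r1 = tr > 0 ? rows - tr : rows;
            int c0 = tc < 0 ? -tc : 0, c1 = tc > 0 ? cols - tc : cols;
            if (r1 <= r0 || c1 <= c0)
                continue;
            size_t n = 0;
            for (int r = r0; r < r1; ++r)
                for (int c = c0; c < c1; ++c, ++n) {
                    a[n] = src[(r + tr) * cols + (c + tc)];
                    b[n] = ref[r * cols + c];
                }
            double s = score(a, b, n);
            if (!found || s > best) {
                best = s;
                *best_tr = tr;
                *best_tc = tc;
                found = 1;
            }
        }
    }
    free(a);
    free(b);
    return found ? 0 : -1;
}
```

With `range` set to 5 the search evaluates 11 × 11 = 121 candidate shifts, which corresponds to the -5 to +5 range chosen in the text.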
2.2.2 Local scaling
Variations in focal settings during acquisition of multi-focus images result in local scale
and position differences in focused and defocused images of an object, as explained
in the beginning of this section by using Figure 2.1. The problem is addressed by
block-wise registration of $X_s$ with $X_d$. At first, $X_s$ and $X_d$ are divided into $n$ equal-sized
non-overlapping blocks. Since $X_s$ and $X_d$ are of the same size, their block-sizes
are taken to be equal. We have experimented with different values of $n$ and found
that $n = 16$ is a reasonably good choice for practical purposes. Each block of $X_s$ is
scaled independently by appropriate factors along the axes. The resultant image is then
Best scale-factors for a block
The best scale-factors (along row and column axes) for the k-th source block are determined
by varying the scale-factors and then finding the ones which give the best
matching with the k-th reference block. The range and precision for varying the scale
factors are important. We have experimented with three different ranges, viz. 0.96-1.04,
0.97-1.03 and 0.98-1.02, with three different increment-values in each range, viz.
0.02, 0.01 and 0.005. It is observed from our experiments that in general increasing
the range does not improve the results but finer precision gives better results. The
range 0.98-1.02 with precision 0.005 is found to be suitable for our purpose and we
have used those values for varying the scale-factors along the axes.
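As a small worked illustration, the candidate scale-factors for one axis can be enumerated as in the C sketch below. The function name and signature are hypothetical. Each value is computed as `lo + i * step` rather than by repeated addition, so that rounding errors do not accumulate and the upper end of the range is not missed.

```c
#include <stddef.h>

/* Fills out[] with lo, lo + step, ..., up to hi (inclusive) and returns the
   number of values written, never more than max_out. A half-step tolerance
   keeps the upper bound in the list despite floating-point rounding. */
size_t scale_candidates(double lo, double hi, double step,
                        double *out, size_t max_out)
{
    size_t n = 0;
    if (step <= 0.0 || hi < lo)
        return 0;
    for (size_t i = 0; n < max_out; ++i) {
        double s = lo + (double)i * step;
        if (s > hi + 0.5 * step)
            break;
        out[n++] = s;
    }
    return n;
}
```

For the range 0.98-1.02 with increment 0.005 this yields nine values per axis (0.980, 0.985, ..., 1.020), hence 9 × 9 = 81 combinations of row and column scale-factors per block.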
Suppose that the k-th source block is scaled by the scale-factors $s_r$ and $s_c$ respectively
along the r-axis and c-axis. Depending on the scale-factors, the horizontal and vertical
dimensions of the scaled block are changed independently. The scaled source block is
swept over the k-th reference block. Suppose $X_s^k$ and $X_d^k$ respectively are the overlapping
sub-images of the k-th source block (after scaling) and the k-th reference block. To find
the best matching scale-factors of the k-th source block we need either a similarity or a
dissimilarity measure. Small block-size reduces the statistical power of the probability
distribution estimation [67]. Hence instead of mutual information, the area-based
dissimilarity measure sum of squared differences (SSD) is used. For each sweeping
position of the source block, the SSD between the overlapping sub-images $X_s^k$ and $X_d^k$ is
computed by

$$SSD(X_s^k, X_d^k) = \sum_{r} \sum_{c} \big\{ X_s^k(r,c) - X_d^k(r,c) \big\}^2 \qquad (2.6)$$
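Equation (2.6) translates directly into a short C function. In the sketch below the two overlapping sub-images are addressed through separate row strides, so that they can be windows inside larger images; this interface, the 8-bit pixel type and the function name are assumptions of the example.

```c
#include <stddef.h>

/* Sum of squared differences of Equation (2.6) over a rows x cols overlap.
   src and ref point to the top-left pixels of the two sub-images;
   src_stride and ref_stride are the row lengths of the enclosing images. */
double ssd(const unsigned char *src, size_t src_stride,
           const unsigned char *ref, size_t ref_stride,
           size_t rows, size_t cols)
{
    double sum = 0.0;
    for (size_t r = 0; r < rows; ++r)
        for (size_t c = 0; c < cols; ++c) {
            double d = (double)src[r * src_stride + c]
                     - (double)ref[r * ref_stride + c];
            sum += d * d;
        }
    return sum;
}
```

A smaller value indicates a better match, so a search over positions and scale-factors keeps the candidate with the minimum SSD.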
The best match occurs when the SSD is minimum. The SSDs for the best matching
positions for 9 different values (in the range 0.98-1.02 with precision 0.005) of each
of the scale-factors $s_r$ and $s_c$ are noted. This results in a total of 81 readings of SSD for
the block. The minimum of them gives the best scale-factors for the block. The range
of scale-factors stated above is obtained from experiments with a large number of
Stitching a scaled block
Stitching a scaled block into the resultant image requires additional care. Before scaling,
all source and reference blocks are of equal size. After scaling, if the scaled source
block is smaller in size than the original source block and is stitched to the resultant
image, then some blank area will be created. In that case an appropriately larger
block surrounding the original source block is scaled and positioned there. If the
resized block is larger than the reference block, it is clipped after positioning properly.
Essentially, the registered source block and the reference block should be of the same size.
For clarity, consider the following example.
Let us illustrate the situations which may occur due to local scaling with a source
block of size, say, 100 × 100 pixels. Suppose the best scale-factors for the block are 0.98 along
both axes. So after scaling, the size of the block is 98 × 98 pixels, which is smaller than
its target area. Hence a 102 × 102 block containing the original source block is scaled
to obtain a 100 × 100 block which fits the target area. Now consider another case in
which the best scale-factors for the block are 1.02 along both axes. After scaling, its
size will be 102 × 102 pixels, which is bigger than its target area. Its best matching
position is found by sweeping it over the corresponding reference block. After
that it is clipped to 100 × 100 pixels, and then stitched to its proper position. Finally
consider the case where the scale-factors are 1.02 along the r-axis and 0.98 along the c-axis.
So the block becomes 102 × 98 pixels after scaling. In this case, a bigger block of
size 100 × 102 containing the original block is taken, so that it becomes 102 × 100
after scaling. After finding the best matching position, the block is clipped to
100 × 100 pixels and is stitched to the target block. Each source block is registered
Figure 2.3: Average error between source and reference images as the iteration step
number increases
2.2.3 Iteration
As stated above, the proposed registration technique has two distinct steps: (i) global
translation and (ii) local scaling. Optimum transformation parameters are determined
in these two transformations independently. But when they are combined, the independent
parameters may not remain optimum any more. Hence we iterate these two
steps in the given order to achieve a more acceptable result. We expect, and have
experimentally verified, that the transformations $F_g$ and $F_i$ of Equation (2.1) are updated
and the error defined in Equation (2.2) is reduced. The iteration is stopped when
there is no significant change in error. Root-mean-square-error (RMSE) between the
source and reference images is taken as the measure of error for our implementation.
Average RMSE values between source and reference images (used in Section 2.3)
are shown against iteration step number in Figure 2.3. Column-0 indicates RMSE
before registration, and for i = 1 to 4, Column-i indicates RMSE after the i-th step of
iteration. It is seen from Figure 2.3 that RMSE decreases considerably in the
first step of iteration, then as the iteration step number increases RMSE decreases,
Interpolation
In the global translation step, the source image is swept over the reference image
in such a way that grids of both images match properly. Hence no interpolation
is required in this step. In the local scaling step, however, grids of the source and
the reference blocks do not match in general. Hence, interpolation is required. Bilinear
interpolation is a reasonable choice in terms of ease-of-implementation and
time-complexity. But during successive iterations it may reduce the contrast of the
images. A higher-order interpolation like bi-cubic interpolation is a better choice in
that respect although it takes more time [13]. To reduce the time-requirement, bilinear
interpolation is used while estimating the scale-factors for a block, and once the
best scale-factors are obtained, the block is finally reconstructed using bi-cubic
interpolation to be a part of the resultant registered image.
2.3 Experimental results and discussion
The proposed algorithm for image registration has been implemented in the C language
in a Unix environment. The global translation step has been implemented by varying
the translation-amount from -5 to +5 with unit increment along each axis. In the
local scaling step, source and reference images are divided into 16 blocks and for
each block the scale-factors along the axes are varied from 0.98 to 1.02 with an
increment-value of 0.005. At most three iterations were seen to be enough in each case.
Experimental results for five sets of multi-focus images (`Doll', `Disk', `Garden', `Rose'
and `News') are shown in Figures 2.4-2.8. In each result, the original multi-focus images
are followed by the images registered by the proposed method. The first image is taken as
the reference image in each case. To show the effectiveness of the method, difference
images (between the source and the reference image) before and after registration are
also provided.