In the second contributing chapter, transformed-domain properties of the rain streaks in the image are exploited to reduce the rain-induced noise.
Image Restoration
Given the rainy image xn, the single-image de-raining task aims to estimate the clean image yc.
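For concreteness, the additive observation model commonly assumed in the single-image de-raining literature can be written as follows (a sketch; the rain-layer symbol r is an assumption rather than the thesis's own notation):

```latex
\begin{equation}
  x_n = y_c + r
\end{equation}
% r denotes the rain-streak layer; de-raining amounts to estimating
% \hat{y}_c (equivalently \hat{r}) from the single observation x_n.
```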
Video Restoration
Applications
Outdoor surveillance and tracking: Improved visibility can also be useful for outdoor surveillance and tracking [29].
Literature Survey
Image De-Raining
- Layer Separation Methods
- Deep Learning-based Methods
The Canny edge detection algorithm [41] was used to remove the rain streaks from the high-frequency part. The authors of [42] observed that the high-frequency component of the rain image contains most of the rain streak information.
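A minimal sketch of this decomposition-plus-edge-detection idea, assuming OpenCV (the blur kernel, file names, and Canny thresholds are illustrative, not the values used in [41, 42]):

```python
import cv2

# Load the rain image in grayscale (path is a placeholder).
rain = cv2.imread("rain.png", cv2.IMREAD_GRAYSCALE)

# Split into low- and high-frequency parts with a simple low-pass filter;
# [42] uses a more elaborate decomposition, this is only illustrative.
low = cv2.GaussianBlur(rain, (15, 15), 0)
high = cv2.subtract(rain, low)   # high-frequency residual holds most streaks

# Canny edge detection on the high-frequency part to locate streak edges.
# The two thresholds (50, 150) are assumed values, not from the thesis.
edges = cv2.Canny(high, 50, 150)
cv2.imwrite("streak_edges.png", edges)
```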
Image De-Hazing
- Handcrafted Features-based Arts
- Deep Learning-based Methods
Li et al. [61] proposed AOD-Net, a method that does not predict the transmission and atmospheric-light maps separately. Instead, it generates the dehazed image directly using a lightweight CNN, unifying the transmission-map and atmospheric-light estimation steps within a single unit known as the K-Estimation module.
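For reference, the reformulation at the core of AOD-Net [61] can be summarized as follows, using the notation of the original paper (I: hazy input, J: scene radiance, t: transmission, A: atmospheric light, b: a constant bias):

```latex
\begin{align}
  I(x) &= J(x)\,t(x) + A\,\bigl(1 - t(x)\bigr) \\
  J(x) &= K(x)\,I(x) - K(x) + b, \qquad
  K(x) = \frac{\tfrac{1}{t(x)}\bigl(I(x) - A\bigr) + (A - b)}{I(x) - 1}
\end{align}
% K(x) jointly encodes t(x) and A, so a single module can estimate it.
```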
Video De-Raining
- Layer Separation Methods
- Deep Learning-based Methods
Also, the proposed work could not generalize to varying extents and directions of rain streaks. Jiang et al. [75] took into account the intrinsic characteristics of rain streaks and used the alternating direction method of multipliers (ADMM) for an efficient solution.
Motivation and Objectives
A graphical demonstration of the excess-color and white-spot artifacts in the raw outputs of existing works. Propose an end-to-end deep learning model that uses the scale-space information of the rainy image for de-raining a single image.
Contribution of the Thesis
Exploiting Efficient Spatial Upscaling for Single Image De-Raining
Exploiting Transformed Domain Features for Single Image De-Raining
In addition to the spatial-domain features of the rain image, the proposed method uses these subbands to generate artifact-free clean images.
A Probe Towards Scale-Space Invariant Conditional GAN
Organization of the Thesis
Summary
Based on the shortcomings of the existing literature, the objectives of the research are defined. In particular, it first gives a brief introduction to some image transforms, such as the Discrete Fourier Transform (DFT) and the Discrete Wavelet Transform (DWT), and a filter called the Laplacian of Gaussian (LoG).
Discrete Haar-Wavelet Transformation
HL/LH: The lower-left and upper-right subbands are estimated along the height and width using the low-pass and high-pass filters alternately. HH: The lower-right subband is estimated analogously to the upper-left quadrant, but using the high-pass filter that belongs to the given wavelet.
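A minimal sketch of the single-level 2-D Haar decomposition using PyWavelets (the package choice and variable names are assumptions; note that the LH/HL naming convention varies across texts):

```python
import numpy as np
import pywt  # PyWavelets

# Single-level 2-D Haar DWT of a grayscale image (placeholder data).
img = np.random.rand(256, 256)

# dwt2 returns the approximation (LL) and the three detail subbands:
# horizontal, vertical, and diagonal detail coefficients.
LL, (LH, HL, HH) = pywt.dwt2(img, "haar")

# Each subband is half the spatial size of the input.
print(LL.shape, LH.shape, HL.shape, HH.shape)  # (128, 128) each
```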
Laplacian of Gaussian
Convolutional Neural Networks
- VGG-16
- ResNet
- U-Net
- Generative Adversarial Networks
- Perceptual Loss
The pooling layer reduces the spatial dimension of the feature representation and helps avoid overfitting during training. The architecture of VGG-16, the runner-up of the ILSVRC 2014 challenge, is shown in Figure 2.5.
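As a small illustration of the spatial reduction performed by pooling, a PyTorch sketch with tensor sizes matching VGG-16's first block (the sizes are chosen only for illustration):

```python
import torch
import torch.nn as nn

# A 2x2 max-pooling layer with stride 2, as used between VGG-16 blocks,
# halves each spatial dimension while keeping the channel count.
pool = nn.MaxPool2d(kernel_size=2, stride=2)

x = torch.randn(1, 64, 224, 224)   # (batch, channels, height, width)
y = pool(x)
print(y.shape)                      # torch.Size([1, 64, 112, 112])
```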
Image Quality Metrics
Full-reference Metrics
These metrics are computed between the original clean ground-truth images and the corresponding generated noise-free images. The latter considers the mutual information between the input of the distortion block and the output of the visual block.
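A minimal sketch of computing the two most common full-reference metrics, PSNR and SSIM, assuming scikit-image >= 0.19 (the image pair below is synthetic and only illustrative):

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

# Ground-truth clean image and a generated (restored) image, float in [0, 1].
gt = np.random.rand(256, 256, 3)   # placeholder for a real image pair
restored = np.clip(gt + 0.05 * np.random.randn(*gt.shape), 0, 1)

psnr = peak_signal_noise_ratio(gt, restored, data_range=1.0)
ssim = structural_similarity(gt, restored, channel_axis=2, data_range=1.0)
print(f"PSNR: {psnr:.2f} dB, SSIM: {ssim:.4f}")
```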
NIQE: Naturalness Image Quality Evaluator (NIQE) [123] compares the restored image with a statistical model built from natural-scene images. PIQE: Perception-based Image Quality Evaluator (PIQE) [124] estimates the perception-based quality of the image using block-wise distortion analysis.
Datasets
The proposed model was tested on the synthetic test set (SOTS) provided by Li et al. [129], consisting of 500 outdoor and indoor images, in addition to the benchmark test set provided by Fattal et al. [17] and real hazy images.
Summary
Perhaps this is due to the improper use of deconvolution layers during the reconstruction of derained images in the network architecture. A deep residual network [6] was used to further improve the quality of the derained image produced by the encoder-decoder network.
Proposed Approach
- Baseline Generator Model
- Generator with Efficient Sub-Pixel Convolution
- Discriminator
- Cost Function
An overview of the architecture of the proposed discriminator model, which consists of 5 convolutional layers, is shown in Figure 3.4. The goal of the proposed method is to generate a rain-free image given a rainy image as input.
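A hypothetical PyTorch sketch of such a 5-convolutional-layer discriminator; only the layer count comes from the thesis (Figure 3.4), while the channel widths, kernel sizes, normalization, and activations are assumptions:

```python
import torch.nn as nn

# Sketch of a 5-conv-layer discriminator that maps an RGB image to a
# real/fake score map; all hyperparameters below are illustrative.
discriminator = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=4, stride=2, padding=1),    # conv 1
    nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(64, 128, kernel_size=4, stride=2, padding=1),  # conv 2
    nn.BatchNorm2d(128),
    nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(128, 256, kernel_size=4, stride=2, padding=1), # conv 3
    nn.BatchNorm2d(256),
    nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(256, 512, kernel_size=4, stride=1, padding=1), # conv 4
    nn.BatchNorm2d(512),
    nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(512, 1, kernel_size=4, stride=1, padding=1),   # conv 5
    nn.Sigmoid(),                                            # real/fake map
)
```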
Experiments and Results
- Quality Measures
- Model Parameters
- Comparison Configurations
- Quantitative Results
- Qualitative Results
However, the derained images generated by the proposed scheme do not suffer from these artifacts and degradations. Unlike [45] and [16], the visual quality of the rain-removed images is improved by including adversarial training in the proposed model.
Discussion
To avoid these issues, unlike existing methods, this work incorporates sub-pixel convolution [130] to improve the spatial resolution of the image. This may be because the efficient sub-pixel convolution improved the spatial resolution of the rain-removed images produced by the proposed architecture, which in turn resulted in better PSNR values and slight improvements in SSIM and the other assessment metrics.
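A minimal PyTorch sketch of efficient sub-pixel convolution in the sense of [130] (the channel counts and the upscaling factor are illustrative):

```python
import torch
import torch.nn as nn

# Efficient sub-pixel convolution: a convolution producing r^2 * C output
# channels followed by a pixel-shuffle that rearranges them into an image
# upscaled by factor r, instead of using deconvolution layers.
r = 2
upsample = nn.Sequential(
    nn.Conv2d(64, 3 * r * r, kernel_size=3, padding=1),  # learn sub-pixel detail
    nn.PixelShuffle(r),        # (B, 3*r^2, H, W) -> (B, 3, r*H, r*W)
)

x = torch.randn(1, 64, 128, 128)
print(upsample(x).shape)       # torch.Size([1, 3, 256, 256])
```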
Summary
Rain Streaks in DFT
When we shift the DC component to the middle, as shown in Figures 4.1j and 4.1c, the difference is slightly more visible. However, there may be some high-intensity points in the magnitude spectrum that contain some information about the rain but are barely perceptible to the human eye.
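A minimal NumPy/OpenCV sketch of computing the centered magnitude spectrum discussed here (the file names and log-scaling constant are assumptions):

```python
import numpy as np
import cv2

# Centered magnitude spectrum of a grayscale rain image, as in Figure 4.1.
img = cv2.imread("rain.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)

F = np.fft.fft2(img)
F_centered = np.fft.fftshift(F)   # move the DC component to the middle
magnitude = 20 * np.log(np.abs(F_centered) + 1e-8)  # log scale for visibility

out = cv2.normalize(magnitude, None, 0, 255, cv2.NORM_MINMAX)
cv2.imwrite("spectrum.png", out.astype(np.uint8))
```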
Fourier Domain Input to Deep CNNs
As shown in Figure 4.2(Q), let θRc denote the rain streak direction in the spatial domain and θRf its counterpart in the S∗ space. When we convert the RGB rain image to the YCbCr color space, it is observed that most of the rain streak information exists only in the Y channel.
Noise residual in Fourier domain
The real and imaginary coefficients of the predicted residual are subtracted pixel-wise from the real and imaginary coefficients of the rain image to obtain the real and imaginary coefficients of the rain-free image. The rain-streak-free image in the spatial domain can then be reconstructed by applying the inverse DFT to the calculated real and imaginary parts.
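A minimal NumPy sketch of this reconstruction step (the function name and residual inputs are hypothetical; in the proposed method the residuals would come from the network):

```python
import numpy as np

def remove_rain_fourier(rain_img, real_residual, imag_residual):
    """Reconstruct the rain-free image from predicted Fourier residuals.

    rain_img:  H x W grayscale rain image (float array).
    real_residual, imag_residual: residuals predicted by the network
    (assumed inputs, same shape as the image's DFT coefficients).
    """
    F = np.fft.fft2(rain_img)
    # Subtract the predicted residuals pixel-wise from the rain image's
    # real and imaginary Fourier coefficients.
    clean_real = F.real - real_residual
    clean_imag = F.imag - imag_residual
    # Inverse DFT of the calculated coefficients gives the spatial image.
    return np.fft.ifft2(clean_real + 1j * clean_imag).real
```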
Proposed Networks
D-Net
N-Net
Loss functions
Results
We randomly selected our training data from the dataset made available by Fu et al. It can be observed that the proposed DFT-based rain-streak removal approach achieves results comparable to state-of-the-art approaches on the test datasets.
Discussion
Normalization of input data
However, a visual improvement, along with the reduction of rain streaks, can be observed in the rain-removed image compared to the original rain image. The loss in the reconstructed image is due to the normalization techniques and also to the non-linearity described in subsection 4.4.3.
Single Layer vs. Multilayer
Non-linearity
Therefore, it can be concluded that the use of activation functions in convolutional neural networks can remove some frequencies that are useful for reconstructing the image back to the spatial domain. To summarize, although a small improvement is achieved using the DFT domain, one may need a different transform domain that can retain relatively more information than the DFT domain.
Image De-Raining in Correlated Transformed Domain
Proposed Scheme
Generator Network (G)
Each of the P2, P3, and P4 units consists of a proposed sub-network called F-Net, which comprises four convolution layers, each with filters of size 3 × 3 and a spatial stride of 1 × 1, with 4, 8, 6, and 1 filters in the respective layers. These intermediate rain maps are then merged and fed into a 10-layer ResNet to further refine them and output a merged rain map referred to as Rmerged.
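A hypothetical PyTorch sketch of one F-Net unit as described above; the filter counts (4, 8, 6, 1), kernel size, and stride come from the text, while the input channel count and activations are assumptions:

```python
import torch.nn as nn

# F-Net sketch: four 3x3 convolutions with stride 1 and 4, 8, 6, 1 filters.
# Input is assumed to be a single wavelet subband (1 channel); the output
# is an intermediate rain map.
f_net = nn.Sequential(
    nn.Conv2d(1, 4, kernel_size=3, stride=1, padding=1), nn.ReLU(inplace=True),
    nn.Conv2d(4, 8, kernel_size=3, stride=1, padding=1), nn.ReLU(inplace=True),
    nn.Conv2d(8, 6, kernel_size=3, stride=1, padding=1), nn.ReLU(inplace=True),
    nn.Conv2d(6, 1, kernel_size=3, stride=1, padding=1),  # rain map output
)
```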
Discriminator Network (D)
The purpose of units P2, P3, and P4 is to use the cues in the wavelet subbands LH, HL, and HH that are more suitable for generating the rain maps, outputting the intermediate rain maps denoted Rh, Rv, and Rd, respectively. The clean image candidate C2 is obtained by pixel-wise subtracting Rmerged from INY, and the intermediate rain maps Rh, Rv, and Rd are pixel-wise subtracted from INY to get the clean image candidates C3, C4, and C5, respectively.
Cost Function
The MSE, used in the majority of denoising algorithms, is defined as the average of the squared pixel-wise differences between the restored image and the ground truth. However, optimizing MSE alone tends to produce overly smooth results with artifacts; therefore, the perceptual loss function [8] is used to avoid these artifacts by preserving the contextual and high-level features of the image.
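A sketch of a perceptual loss in the spirit of [8], comparing fixed VGG-16 feature maps instead of raw pixels (the relu2_2 cut-off and other implementation details are assumptions):

```python
import torch.nn as nn
from torchvision.models import vgg16

class PerceptualLoss(nn.Module):
    """MSE between VGG-16 features of the restored and ground-truth images."""

    def __init__(self):
        super().__init__()
        # First 9 layers of VGG-16 (up to relu2_2); an assumed cut-off.
        self.features = vgg16(pretrained=True).features[:9].eval()
        for p in self.features.parameters():
            p.requires_grad = False   # VGG stays a fixed feature extractor
        self.mse = nn.MSELoss()

    def forward(self, restored, target):
        # Inputs are assumed to be ImageNet-normalized RGB tensors.
        return self.mse(self.features(restored), self.features(target))
```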
Experiments and Results
Performance Evaluation
It is trained on LG with cost weights similar to those used in the proposed method. The proposed method has shown an impressive improvement over recent methods and baseline configurations in the quantitative evaluation.
Summary
Loss function
The conventional per-pixel loss (LE) between the dehazed and ground-truth (C) images is the mean squared error between the two. Furthermore, the final objective function of the proposed model for the single-image de-hazing task is defined as a weighted combination of the individual loss terms.
Experiments and results
- Datasets and training details
- Evaluation metrics
- Ablation study
- Comparison with State-of-the-Art Methods
The runtime comparison of the proposed scheme with existing methods is shown in Table 5.6. Moreover, the perceptual quality of the dehazed images restored using the proposed scheme is better than that of the existing methods.
Summary
Extending GAN for Video De-Raining
The proposed method is built on the conditional GAN framework [165], which consists of two main sub-modules, namely (a) the Generator (ϕG) and (b) the Discriminator (ϕD).
Network Architecture
The proposed ϕD takes either three consecutive estimated derained frames (f̂c,i−2 ⊙ f̂c,i−1 ⊙ f̂c,i) or the three corresponding ground-truth frames. Then, the output (Y) of the first MCB block of the proposed ϕD, with ⊕ denoting summation, can be written as follows.
Cost Function
The goal of the 3D convolution in the proposed ϕD is to learn the temporal consistency between the three successive frames i−2, i−1, and i. Using only the LMSE loss to optimize the proposed model may result in the loss of high-frequency detail from the frames while removing the rain streaks.
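A minimal PyTorch sketch of a 3-D convolution over a three-frame clip, illustrating how temporal context across frames i−2, i−1, i can be aggregated (the channel counts and kernel size are assumptions):

```python
import torch
import torch.nn as nn

# 3-D convolution over a clip of three consecutive frames; with a temporal
# kernel depth of 3 and no depth padding, the three frames are fused into one.
conv3d = nn.Conv3d(in_channels=3, out_channels=16,
                   kernel_size=(3, 3, 3), padding=(0, 1, 1))

# Input layout: (batch, channels, depth = frames i-2, i-1, i, height, width).
clip = torch.randn(1, 3, 3, 128, 128)
out = conv3d(clip)
print(out.shape)   # torch.Size([1, 16, 1, 128, 128])
```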
Experiments & Training Details
Dataset
Training Parameters
Evaluation Metrics
[117], Universal Image Quality Index (UQI) [115], Multiscale Structural Similarity Index (MS-SSIM) [114], Naturalness Image Quality Evaluator (NIQE) [123], Perception-based Image Quality Evaluator (PIQE) [124], Feature Similarity Index (FSIM) [119], Haar Wavelet-Based Perceptual Similarity Index (HaarPSI), Gradient Magnitude Similarity Deviation (GMSD) [122], Blind/Referenceless Image Spatial Quality Evaluator (BRISQUE) [126], and Total Variation Error (TV). For a fair comparison, we have defined a figure of merit (fom) based on the performance on all adopted test sets as, fom = {0.6 * No.
Results
Baseline Configurations
Quantitative Results
The quantitative comparison of the proposed scheme with existing methods on test set a1 is shown in Table. The quantitative comparison of the proposed model with existing schemes on the test sets b1, b2, b3, and b4 is shown in the Tables.
Qualitative Results
Figure 6.9: Qualitative comparison of the proposed model with existing schemes (TCL [4], SPAC-CNN [14], SE [12], JORDER [51], FastDerain [11]) on synthetic rain video frames. Figure 6.7 shows a visual comparison of the proposed model with existing schemes on a synthetic rain video.
Ablation Study
Quantitative Results
Qualitative comparison
- Improvement from the perspective of input color-space
- Improvement from the perspective of model architecture
- Exponential Perceptual Loss + MSE vs MSE
- Exponential Adversarial Loss vs Fixed Constant
- MSE-based Adversarial Loss vs Entropy-based Adversarial Loss
It can be observed from Figure 6.13 that the results obtained using only the MSE loss still contain visible rain streaks compared to the proposed baseline. It can be observed from Figure 6.17 that the results obtained using G-M-EP-EA-N contain visible rain streaks and reconstruction errors compared to the model trained with the proposed loss function.
Justification
Kim, "Deep pyramidal residual networks," in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. Algorithms and benchmark," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021, pp.