
    Structural similarity loss for learning to fuse multi-focus images

    Convolutional neural networks have recently been used for multi-focus image fusion. However, some existing methods resort to adding Gaussian blur to focused images to simulate defocus, thereby generating data (with ground truth) for supervised learning. Moreover, they classify pixels as ‘focused’ or ‘defocused’ and use the classification results to construct the fusion weight maps, which then necessitates a series of post-processing steps. In this paper, we present an end-to-end learning approach for directly predicting the fully focused output image from multi-focus input image pairs. The suggested approach uses a CNN architecture trained to perform fusion without the need for ground-truth fused images. The CNN computes its loss from the structural similarity index (SSIM), a metric widely accepted for evaluating fused image quality. In addition, the loss function uses the standard deviation of a local image window to automatically estimate the importance of each source image in the final fused image. Because the model is a feed-forward, fully convolutional network that accepts images of variable size at both training and test time, we are able to train on real benchmark datasets instead of simulated ones. Extensive evaluation on benchmark datasets shows that our method outperforms, or is comparable with, existing state-of-the-art techniques on both objective and subjective benchmarks.
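
    A minimal sketch of the loss described above: per-pixel SSIM against each source, weighted by the local standard deviation of the sources so that the sharper input dominates at each location. The window size, the SSIM constants, and all function names here are illustrative assumptions, not the authors' exact implementation.

```python
import torch
import torch.nn.functional as F

def _local_stats(x, win=7):
    """Mean and variance over a win x win window (uniform kernel)."""
    mu = F.avg_pool2d(x, win, stride=1, padding=win // 2)
    var = F.avg_pool2d(x * x, win, stride=1, padding=win // 2) - mu * mu
    return mu, var.clamp(min=0.0)

def _ssim_map(x, y, win=7, c1=0.01 ** 2, c2=0.03 ** 2):
    """Per-pixel SSIM between x and y, both (N, 1, H, W) in [0, 1]."""
    mx, vx = _local_stats(x, win)
    my, vy = _local_stats(y, win)
    cov = F.avg_pool2d(x * y, win, stride=1, padding=win // 2) - mx * my
    num = (2 * mx * my + c1) * (2 * cov + c2)
    den = (mx * mx + my * my + c1) * (vx + vy + c2)
    return num / den

def fusion_loss(fused, src1, src2, win=7, eps=1e-6):
    """Weighted SSIM loss: the higher-local-std source gets more weight."""
    _, v1 = _local_stats(src1, win)
    _, v2 = _local_stats(src2, win)
    s1, s2 = v1.sqrt(), v2.sqrt()
    w1 = s1 / (s1 + s2 + eps)            # per-pixel importance of source 1
    ssim1 = _ssim_map(fused, src1, win)
    ssim2 = _ssim_map(fused, src2, win)
    # Maximise weighted SSIM, i.e. minimise (1 - weighted SSIM).
    return (1.0 - (w1 * ssim1 + (1.0 - w1) * ssim2)).mean()
```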

    Design and Analysis of A New Illumination Invariant Human Face Recognition System

    In this dissertation we propose the design and analysis of a new illumination-invariant face recognition system. We show that multiscale analysis of the structure and features of face images leads to superior recognition rates for images under varying illumination. We assume that an image I(x,y) is a black box consisting of a combination of illumination and reflectance. A new approximation is proposed to enhance the illumination removal phase. As illumination resides in the low-frequency part of images, a high-performance multiresolution transformation is employed to accurately separate the frequency contents of input images. The procedure is followed by a fine-tuning process. After extracting a mask, a feature vector is formed; principal component analysis (PCA) is used for dimensionality reduction, followed by an extreme learning machine (ELM) as the classifier. We then analyze the effect of the frequency selectivity of the transformation's subbands on the performance of the proposed face recognition system. In fact, we first propose a method to tune the characteristics of a multiresolution transformation, and then analyze how these specifications affect the recognition rate. In addition, we show that the proposed system can be further improved in terms of computational time and accuracy. The motivation for this improvement is that although illumination mostly lies in the low-frequency part of images, these low-frequency components may be of either low- or high-resonance nature. Therefore, for the first time, we introduce a resonance-based analysis of face images rather than the traditional frequency-domain approaches. We found that energy selectivity of the subbands of the resonance-based decomposition can lead to superior results with less computational complexity. The method is free of any prior information about the face shape; it is systematic and can be applied separately to each image. Several experiments are performed on well-known databases such as Yale B, Extended Yale B, CMU-PIE, FERET, AT&T, and LFW. Illustrative examples are given, and the results confirm the effectiveness of the method compared to current results in the literature.
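
    A minimal sketch of the classification stage described above: PCA for dimensionality reduction followed by an extreme learning machine, i.e. a single hidden layer with fixed random weights and a closed-form ridge-regression readout. The hidden size, ridge term, and sigmoid activation are illustrative assumptions, not the dissertation's exact settings.

```python
import numpy as np
from sklearn.decomposition import PCA

class ELM:
    def __init__(self, n_hidden=500, ridge=1e-3, seed=0):
        self.n_hidden = n_hidden
        self.ridge = ridge
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Random projection followed by a sigmoid nonlinearity.
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y):
        # Input weights are random and fixed; only the readout is learned.
        self.W = self.rng.standard_normal((X.shape[1], self.n_hidden))
        self.b = self.rng.standard_normal(self.n_hidden)
        H = self._hidden(X)
        T = np.eye(int(y.max()) + 1)[y]            # one-hot targets
        # Closed-form ridge regression for the output weights.
        self.beta = np.linalg.solve(
            H.T @ H + self.ridge * np.eye(self.n_hidden), H.T @ T)
        return self

    def predict(self, X):
        return np.argmax(self._hidden(X) @ self.beta, axis=1)

# Usage on illumination-normalised feature vectors X with integer labels y:
#   pca = PCA(n_components=100).fit(X_train)
#   clf = ELM().fit(pca.transform(X_train), y_train)
#   predicted_ids = clf.predict(pca.transform(X_test))
```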

    Signal processing algorithms for enhanced image fusion performance and assessment

    The dissertation presents several signal processing algorithms for image fusion in noisy multimodal conditions. It introduces a novel image fusion method which performs well for image sets heavily corrupted by noise. As opposed to current image fusion schemes, the method requires no a priori knowledge of the noise component. The image is decomposed with Chebyshev polynomials (CP) used as basis functions to perform fusion at the feature level. The properties of CP, namely fast convergence and smooth approximation, render it ideal for heuristic and indiscriminate denoising fusion tasks. Quantitative evaluation using objective fusion assessment methods shows favourable performance of the proposed scheme compared to previous image fusion efforts, notably on heavily corrupted images. The approach is further improved by combining the advantages of CP with a state-of-the-art fusion technique based on independent component analysis (ICA), for joint fusion processing driven by region saliency. Whilst CP fusion is robust under severe noise conditions, it is prone to eliminating high-frequency information from the images involved, thereby limiting image sharpness. Fusion using ICA, on the other hand, performs well in transferring edges and other salient features of the input images into the composite output. The combination of both methods, coupled with several mathematical morphological operations in an algorithm fusion framework, is considered a viable solution. Again, according to the quantitative metrics, the results of our proposed approach are very encouraging as far as joint fusion and denoising are concerned. Another focus of this dissertation is a novel metric for image fusion evaluation that is based on texture. The conservation of background textural details is considered important in many fusion applications, as such details help define image depth and structure, which may prove crucial in surveillance and remote sensing applications. Our work aims to evaluate the performance of image fusion algorithms based on their ability to retain textural details through the fusion process. This is done by utilising the gray-level co-occurrence matrix (GLCM) model to extract second-order statistical features for the derivation of an image textural measure, which then replaces the edge-based calculations in an objective fusion metric. Performance evaluation on established fusion methods verifies that the proposed metric is viable, especially in multimodal scenarios.
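
    The dissertation's metric itself is not reproduced here; the sketch below shows the kind of second-order GLCM statistics it builds on, plus a crude texture-preservation score added purely for illustration. The chosen properties, distances, and the score are assumptions, not the actual metric. Requires scikit-image >= 0.19 (earlier versions spell the functions `greycomatrix`/`greycoprops`).

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def glcm_features(img, distances=(1,), angles=(0.0, np.pi / 2), levels=256):
    """Second-order statistics from the GLCM of a 2-D uint8 image."""
    glcm = graycomatrix(img, distances=distances, angles=angles,
                        levels=levels, symmetric=True, normed=True)
    # Average each property over all distances and angles.
    return np.array([graycoprops(glcm, p).mean()
                     for p in ("contrast", "homogeneity", "energy")])

def texture_preservation(src_a, src_b, fused, eps=1e-12):
    """Crude score in [0, 1]: closeness of fused texture to the best source."""
    fa, fb, ff = (glcm_features(x) for x in (src_a, src_b, fused))
    ref = np.maximum(fa, fb)                 # best available texture level
    return float(np.mean(np.minimum(ff, ref) / (np.maximum(ff, ref) + eps)))
```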

    Infrared and visible image fusion based on residual dense network and gradient loss

    Deep learning has made great progress in the field of image fusion. Compared with traditional methods, image fusion approaches based on deep learning require no cumbersome matrix operations. In this paper, an end-to-end model for infrared and visible image fusion is proposed. This unsupervised network architecture does not employ an explicit fusion strategy. In the feature extraction stage, residual dense blocks are used to generate the fused image, preserving the information of the source images to the greatest extent. In the feature reconstruction stage, shallow feature maps, residual dense information, and deep feature maps are merged to build the fused result. The gradient loss we propose for the network cooperates well with weight blocks extracted from the input images to express texture details in the fused images more clearly. In the training phase, we select 20 source image pairs with obvious characteristics from the TNO dataset and expand them by random cropping to serve as the training set for the network. Subjective qualitative and objective quantitative results show that the proposed model has advantages over state-of-the-art methods on the task of infrared and visible image fusion. We also conduct ablation experiments on the RoadScene dataset to verify the effectiveness of the proposed network for infrared and visible image fusion.
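
    A minimal PyTorch sketch of a gradient loss of the kind described above: the fused image is encouraged to reproduce, at every pixel, the stronger of the two source gradients. The Sobel operator and the element-wise max weighting are illustrative assumptions; the paper's weight blocks may be constructed differently.

```python
import torch
import torch.nn.functional as F

# Sobel kernels for horizontal and vertical gradients.
_KX = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]).view(1, 1, 3, 3)
_KY = _KX.transpose(2, 3)

def grad_mag(x):
    """Sobel gradient magnitude of a single-channel batch (N, 1, H, W)."""
    gx = F.conv2d(x, _KX.to(x.device), padding=1)
    gy = F.conv2d(x, _KY.to(x.device), padding=1)
    return torch.sqrt(gx * gx + gy * gy + 1e-12)

def gradient_loss(fused, ir, vis):
    """L1 distance to the element-wise max of the source gradient maps."""
    target = torch.maximum(grad_mag(ir), grad_mag(vis))
    return F.l1_loss(grad_mag(fused), target)
```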

    Algorithms for the enhancement of dynamic range and colour constancy of digital images & video

    One of the main objectives in digital imaging is to mimic the capabilities of the human eye, and perhaps go beyond them in certain aspects. However, the human visual system is so versatile, complex, and only partially understood that no imaging technology to date has been able to accurately reproduce its capabilities. This gap has become a crucial shortcoming in digital imaging, since digital photography, video recording, and computer vision applications continue to demand more realistic and accurate image reproduction and analysis capabilities. For decades, researchers have tried to solve the colour constancy problem, as well as to extend the dynamic range of digital imaging devices, by proposing a number of algorithms and instrumentation approaches. Nevertheless, no unique solution has been identified; this is partially due to the wide range of computer vision applications that require colour constancy and high dynamic range imaging, and to the complexity of replicating the human visual system's colour constancy and dynamic range capabilities. The aim of the research presented in this thesis is to enhance overall image quality within the image signal processor of digital cameras by achieving colour constancy and extending dynamic range capabilities. This is achieved by developing a set of advanced image processing algorithms that are robust to a number of practical challenges and feasible to implement within an image signal processor used in consumer electronics imaging devices. The experiments conducted in this research show that the proposed algorithms outperform state-of-the-art methods in the fields of dynamic range and colour constancy. Moreover, when used within an image signal processor, this set of algorithms enables digital camera devices to mimic the human visual system's dynamic range and colour constancy capabilities; the ultimate goal of any state-of-the-art technique or commercial imaging device.
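
    The thesis's own algorithms are not reproduced here; as a point of reference, the sketch below implements the classic gray-world colour constancy baseline that such methods are commonly compared against. It assumes the average scene reflectance is achromatic and rescales each channel accordingly.

```python
import numpy as np

def gray_world(img, eps=1e-8):
    """img: float RGB array in [0, 1], shape (H, W, 3). Returns a corrected copy."""
    means = img.reshape(-1, 3).mean(axis=0)    # per-channel average
    gains = means.mean() / (means + eps)       # pull each channel toward gray
    return np.clip(img * gains, 0.0, 1.0)
```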