136 research outputs found

    Deep visible and thermal image fusion for enhanced pedestrian visibility

    Reliable vision in challenging illumination conditions is one of the crucial requirements of future autonomous automotive systems. In the last decade, thermal cameras have become more easily accessible to a larger number of researchers. This has resulted in numerous studies which confirmed the benefits of thermal cameras in limited visibility conditions. In this paper, we propose a learning-based method for visible and thermal image fusion that focuses on generating fused images with high visual similarity to regular truecolor (red-green-blue or RGB) images, while introducing new informative details in pedestrian regions. The goal is to create natural, intuitive images that would be more informative to a human driver than those of a regular RGB camera in challenging visibility conditions. The main novelty of this paper is the idea to rely on two types of objective functions for optimization: a similarity metric between the RGB input and the fused output to achieve natural image appearance; and an auxiliary pedestrian detection error to help define relevant features of the human appearance and blend them into the output. We train a convolutional neural network using image samples from variable conditions (day and night) so that the network learns the appearance of humans in the different modalities and creates more robust results applicable in realistic situations. Our experiments show that the visibility of pedestrians is noticeably improved, especially in dark regions and at night. Compared to existing methods, we can better learn context and define fusion rules that focus on the pedestrian appearance, which is not guaranteed with methods that focus on low-level image quality metrics.

    On Using and Improving Gradient Domain Processing for Image Enhancement

    Ph.D. (Doctor of Philosophy)

    An Attention-Guided and Wavelet-Constrained Generative Adversarial Network for Infrared and Visible Image Fusion

    GAN-based infrared and visible image fusion methods have gained ever-increasing attention due to their effectiveness and superiority. However, the existing methods adopt the global pixel distribution of source images as the basis for discrimination, which fails to focus on the key modality information. Moreover, the dual-discriminator based methods suffer from the confrontation between the discriminators. To this end, we propose an attention-guided and wavelet-constrained GAN for infrared and visible image fusion (AWFGAN). In this method, two unique discrimination strategies are designed to improve the fusion performance. Specifically, we introduce spatial attention modules (SAM) into the generator to obtain spatial attention maps, which are then used to force the discrimination of infrared images to focus on the target regions. In addition, we extend the discrimination range of visible information to the wavelet subspace, which forces the generator to restore the high-frequency details of visible images. Ablation experiments demonstrate the effectiveness of our method in eliminating the confrontation between discriminators, and comparison experiments on public datasets demonstrate the effectiveness and superiority of the proposed method.

    A Low-cost Depth Imaging Mobile Platform for Canola Phenotyping

    To meet the high demand for supporting and accelerating progress in the breeding of novel traits, plant scientists and breeders have to measure a large number of plants and their characteristics accurately. A variety of imaging methodologies are being deployed to acquire data for quantitative studies of complex traits. When applied to a large number of plants such as canola plants, however, building a complete three-dimensional (3D) model is time-consuming and expensive for high-throughput phenotyping, which involves an enormous amount of data. In some contexts, a full reconstruction of entire plants may not be necessary. In recent years, many 3D plant phenotyping techniques relying on high-cost, large-scale facilities have been introduced to extract plant phenotypic traits, but these applications may be constrained by limited research budgets and by use across different environments. This thesis proposes a low-cost, high-throughput depth-imaging mobile phenotyping platform to measure canola plant traits across different environments. Methods included detecting and counting canola branches and seedpods, monitoring canola growth stages, and fusing color images to improve image resolution and achieve higher accuracy. Canola plant traits were examined in both controlled-environment and field scenarios. These methodologies were enhanced by different imaging techniques. Results revealed that this mobile phenotyping platform can be used to investigate canola plant traits across different environments with high accuracy. The results also show that algorithms for counting canola branches and seedpods enable crop researchers to analyze the relationship between canola genotypes and phenotypes and to estimate crop yields. In addition to the counting algorithms, the fusing techniques can give plant breeders easier access to plant characteristics by improving the definition and resolution of color images. These findings add value to automated, low-cost, high-throughput depth-based phenotyping for canola plants.
These findings also contribute a novel multi-focus image fusion method that performs competitively with, and outperforms some, other state-of-the-art methods based on visual saliency maps and the gradient domain fast guided filter. The proposed platform and counting algorithms can be applied not only to canola plants but also to other closely related species. The proposed fusing technique can be extended to other fields, such as remote sensing and medical image fusion.

    Signal processing algorithms for enhanced image fusion performance and assessment

    The dissertation presents several signal processing algorithms for image fusion in noisy multimodal conditions. It introduces a novel image fusion method which performs well for image sets heavily corrupted by noise. As opposed to current image fusion schemes, the method has no requirements for a priori knowledge of the noise component. The image is decomposed with Chebyshev polynomials (CP) being used as basis functions to perform fusion at feature level. The properties of CP, namely fast convergence and smooth approximation, render it ideal for heuristic and indiscriminate denoising fusion tasks. Quantitative evaluation using objective fusion assessment methods shows favourable performance of the proposed scheme compared to previous efforts on image fusion, notably in heavily corrupted images. The approach is further improved by combining the advantages of CP with a state-of-the-art fusion technique, independent component analysis (ICA), for joint-fusion processing based on region saliency. Whilst CP fusion is robust under severe noise conditions, it is prone to eliminating high frequency information of the images involved, thereby limiting image sharpness. Fusion using ICA, on the other hand, performs well in transferring edges and other salient features of the input images into the composite output. The combination of both methods, coupled with several mathematical morphological operations in an algorithm fusion framework, is considered a viable solution. Again, according to the quantitative metrics, the results of our proposed approach are very encouraging as far as joint fusion and denoising are concerned. Another focus of this dissertation is on a novel metric for image fusion evaluation that is based on texture. The conservation of background textural details is considered important in many fusion applications as they help define the image depth and structure, which may prove crucial in many surveillance and remote sensing applications.
Our work aims to evaluate the performance of image fusion algorithms based on their ability to retain textural details through the fusion process. This is done by utilising the gray-level co-occurrence matrix (GLCM) model to extract second-order statistical features for the derivation of an image textural measure, which is then used to replace the edge-based calculations in an objective-based fusion metric. Performance evaluation on established fusion methods verifies that the proposed metric is viable, especially for multimodal scenarios.
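As a rough illustration of the GLCM idea mentioned in this abstract, the sketch below builds a normalised co-occurrence matrix for a single pixel offset and derives two common second-order statistics (contrast and energy). It is not the dissertation's metric: the function names `glcm`, `contrast`, and `energy`, the single horizontal offset, and the tiny integer-valued test images are all assumptions made for this example.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalised gray-level co-occurrence matrix for one pixel offset.

    `image` is a list of rows of integer gray levels in [0, levels).
    Entry [i][j] is the fraction of pixel pairs (p, p + offset) whose
    gray levels are (i, j). The offset (dx, dy) is an assumption here.
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("offset leaves no valid pixel pairs")
    return [[v / total for v in row] for row in counts]


def contrast(p):
    """Second-order contrast: large when neighbouring levels differ."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


def energy(p):
    """Angular second moment: large for uniform, repetitive texture."""
    return sum(v * v for row in p for v in row)


flat = [[0, 0], [0, 0]]      # no texture at all
checker = [[0, 1], [1, 0]]   # maximal horizontal variation for 2 levels

flat_contrast = contrast(glcm(flat, levels=2))
checker_contrast = contrast(glcm(checker, levels=2))
```

In this toy case the flat image gives zero contrast while the checkerboard gives a higher value, which is the kind of textural difference a GLCM-based measure is meant to capture.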

    NEW TECHNIQUES IN DERIVATIVE DOMAIN IMAGE FUSION AND THEIR APPLICATIONS

    There are many applications where multiple images are fused to form a single summary greyscale or colour output, including computational photography (e.g. RGB-NIR), diffusion tensor imaging (medical), and remote sensing. Often, and intuitively, image fusion is carried out in the derivative domain (based on image gradients). In this thesis, we propose new derivative domain image fusion methods and metrics, and carry out experiments on a range of image fusion applications. After reviewing previous relevant methods in derivative domain image fusion, we make several new contributions. We present new applications for the Spectral Edge image fusion method, in thermal image fusion (using a FLIR smartphone accessory) and near-infrared image fusion (using an integrated visible and near-infrared sensor). We propose extensions of standard objective image fusion quality metrics for M to N channel image fusion, as measuring image fusion performance is an unsolved problem. Finally, and most importantly, we propose new methods in image fusion, which give improved results compared to previous methods (based on metric and subjective comparisons): we propose an iterative extension to the Spectral Edge image fusion method, producing improved detail transfer and colour vividness, and we propose a new derivative domain image fusion method, based on finding a local linear combination of input images to produce an output image with optimum gradient detail, without artefacts. This mapping can be calculated by finding the principal characteristic vector of the outer product of the Jacobian matrix of image derivatives, or by solving a least-squares regression (with regularization) to the target gradients calculated by the Spectral Edge theorem. We then use our new image fusion method on a range of image fusion applications, producing state-of-the-art image fusion results with the potential for real-time performance.

    Robust density modelling using the student's t-distribution for human action recognition

    The extraction of human features from videos is often inaccurate and prone to outliers. Such outliers can severely affect density modelling when the Gaussian distribution is used as the model, since it is highly sensitive to outliers. The Gaussian distribution is also often used as the base component of graphical models for recognising human actions in videos (hidden Markov models and others), and the presence of outliers can significantly affect the recognition accuracy. In contrast, the Student's t-distribution is more robust to outliers and can be exploited to improve the recognition rate in the presence of abnormal data. In this paper, we present an HMM which uses mixtures of t-distributions as observation probabilities and show that experiments on two well-known datasets (Weizmann, MuHAVi) yield a remarkable improvement in classification accuracy. © 2011 IEEE
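The robustness claim in this abstract can be seen in a one-dimensional toy case. The sketch below compares the Gaussian maximum-likelihood mean with a Student's t location estimate obtained by the standard EM-style reweighting, where each sample gets weight (nu + 1) / (nu + d^2) and d is its scaled distance from the current estimate. It does not reproduce the paper's HMM or mixture model: the function names, the fixed scale `sigma`, the degrees of freedom `nu`, and the sample values are all assumptions chosen for illustration.

```python
def gaussian_mean(xs):
    """Maximum-likelihood location under a Gaussian: the plain average."""
    return sum(xs) / len(xs)


def t_location(xs, nu=3.0, sigma=1.0, iters=50):
    """Location estimate under a Student's t model via iterative reweighting.

    Scale `sigma` and degrees of freedom `nu` are held fixed (assumed
    values); only the location is updated. Samples far from the current
    estimate receive small weights, which limits the pull of outliers.
    """
    mu = gaussian_mean(xs)  # start from the non-robust estimate
    for _ in range(iters):
        weights = [(nu + 1.0) / (nu + ((x - mu) / sigma) ** 2) for x in xs]
        mu = sum(w * x for w, x in zip(weights, xs)) / sum(weights)
    return mu


# Hypothetical 1-D feature values clustered near 0, plus one gross outlier.
features = [0.0, 0.1, -0.1, 0.2, -0.2, 50.0]

mean_estimate = gaussian_mean(features)   # dragged towards the outlier
robust_estimate = t_location(features)    # stays close to the cluster
```

The single outlier pulls the Gaussian estimate well away from the cluster, while the t-based estimate remains near it, which is the behaviour that motivates t-distributed observation models when extracted features are noisy.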

    Fusing Multimedia Data Into Dynamic Virtual Environments

    In spite of the dramatic growth of virtual and augmented reality (VR and AR) technology, content creation for immersive and dynamic virtual environments remains a significant challenge. In this dissertation, we present our research in fusing multimedia data, including text, photos, panoramas, and multi-view videos, to create rich and compelling virtual environments. First, we present Social Street View, which renders geo-tagged social media in its natural geo-spatial context provided by 360° panoramas. Our system takes into account visual saliency and uses maximal Poisson-disc placement with spatiotemporal filters to render social multimedia in an immersive setting. We also present a novel GPU-driven pipeline for saliency computation in 360° panoramas using spherical harmonics (SH). Our spherical residual model can be applied to virtual cinematography in 360° videos. We further present Geollery, a mixed-reality platform to render an interactive mirrored world in real time with three-dimensional (3D) buildings, user-generated content, and geo-tagged social media. Our user study has identified several use cases for these systems, including immersive social storytelling, experiencing the culture, and crowd-sourced tourism. We next present Video Fields, a web-based interactive system to create, calibrate, and render dynamic videos overlaid on 3D scenes. Our system renders dynamic entities from multiple videos, using early and deferred texture sampling. Video Fields can be used for immersive surveillance in virtual environments. Furthermore, we present the VRSurus and ARCrypt projects to explore the applications of gesture recognition, haptic feedback, and visual cryptography for virtual and augmented reality. Finally, we present our work on Montage4D, a real-time system for seamlessly fusing multi-view video textures with dynamic meshes. We use geodesics on meshes with view-dependent rendering to mitigate spatial occlusion seams while maintaining temporal consistency.
Our experiments show significant enhancement in rendering quality, especially for salient regions such as faces. We believe that Social Street View, Geollery, Video Fields, and Montage4D will greatly facilitate several applications such as virtual tourism, immersive telepresence, and remote education.

    Machine Learning in Sensors and Imaging

    Machine learning is extending its applications in various fields, such as image processing, the Internet of Things, user interfaces, big data, manufacturing, and management. As data are required to build machine learning networks, sensors are one of the most important technologies. In addition, machine learning networks can contribute to the improvement in sensor performance and the creation of new sensor applications. This Special Issue addresses all types of machine learning applications related to sensors and imaging. It covers computer vision-based control, activity recognition, fuzzy label classification, failure classification, motor temperature estimation, the camera calibration of intelligent vehicles, error detection, color prior models, compressive sensing, wildfire risk assessment, shelf auditing, forest-growing stem volume estimation, road management, image denoising, and touchscreens.