389 research outputs found

    Deep visible and thermal image fusion for enhanced pedestrian visibility

    Reliable vision in challenging illumination conditions is one of the crucial requirements of future autonomous automotive systems. In the last decade, thermal cameras have become more easily accessible to a larger number of researchers. This has resulted in numerous studies which confirmed the benefits of thermal cameras in limited visibility conditions. In this paper, we propose a learning-based method for visible and thermal image fusion that focuses on generating fused images with high visual similarity to regular truecolor (red-green-blue or RGB) images, while introducing new informative details in pedestrian regions. The goal is to create natural, intuitive images that would be more informative than a regular RGB camera to a human driver in challenging visibility conditions. The main novelty of this paper is the idea to rely on two types of objective functions for optimization: a similarity metric between the RGB input and the fused output to achieve natural image appearance; and an auxiliary pedestrian detection error to help define relevant features of the human appearance and blend them into the output. We train a convolutional neural network using image samples from variable conditions (day and night) so that the network learns the appearance of humans in the different modalities and creates more robust results applicable in realistic situations. Our experiments show that the visibility of pedestrians is noticeably improved, especially in dark regions and at night. Compared to existing methods, we can better learn context and define fusion rules that focus on the pedestrian appearance, which is not guaranteed with methods that focus on low-level image quality metrics.

    Low-light Pedestrian Detection in Visible and Infrared Image Feeds: Issues and Challenges

    Pedestrian detection has become a cornerstone for several high-level tasks, including autonomous driving, intelligent transportation, and traffic surveillance. Several works focus on pedestrian detection using visible images, mainly in the daytime. However, this task is very intriguing when the environmental conditions change to poor lighting or nighttime. Recently, new ideas have emerged for using alternative sources, such as Far InfraRed (FIR) temperature sensor feeds, to detect pedestrians in low-light conditions. This study comprehensively reviews recent developments in low-light pedestrian detection approaches. It systematically categorizes and analyses various algorithms, from region-based to non-region-based and graph-based learning methodologies, highlighting their methods, implementation issues, and challenges. It also outlines the key benchmark datasets that can be used for research and development of advanced pedestrian detection algorithms, particularly in low-light situations.

    Enhancing target detection accuracy through cross-modal spatial perception and dual-modality fusion

    The disparity between human and machine perception of spatial information presents a challenge for machines to accurately sense their surroundings and improve target detection performance. Cross-modal data fusion emerges as a potential solution to enhance the perceptual capabilities of systems. This article introduces a novel spatial perception method that integrates dual-modality feature fusion and coupled attention mechanisms to validate the improvement in detection performance through cross-modal information fusion. The proposed approach performs cross-modal feature extraction with a multi-scale structure built on a dual-flow architecture. Additionally, a transformer is integrated for feature fusion, while the information perception of the detection system is optimized using a linear combination of loss functions. Experimental results demonstrate the superiority of our algorithm over single-modality target detection using visible images, exhibiting an average accuracy improvement of 30.4%. Furthermore, our algorithm outperforms single-modality infrared image detection by 3.0% and comparative multimodal target detection algorithms by 3.5%. These results validate the effectiveness of our proposed algorithm in fusing dual-band features, significantly enhancing target detection accuracy. The adaptability and robustness of our approach are showcased through these results.

    There and Back Again: Self-supervised Multispectral Correspondence Estimation

    Across a wide range of applications, from autonomous vehicles to medical imaging, multi-spectral images provide an opportunity to extract additional information not present in color images. One of the most important steps in making this information readily available is the accurate estimation of dense correspondences between different spectra. Due to the nature of cross-spectral images, most correspondence solving techniques for the visual domain are simply not applicable. Furthermore, most cross-spectral techniques utilize spectra-specific characteristics to perform the alignment. In this work, we aim to address the dense correspondence estimation problem in a way that generalizes to more than one spectrum. We do this by introducing a novel cycle-consistency metric that allows us to self-supervise. This, combined with our spectra-agnostic loss functions, allows us to train the same network across multiple spectra. We demonstrate our approach on the challenging task of dense RGB-FIR correspondence estimation. We also show the performance of our unmodified network on the cases of RGB-NIR and RGB-RGB, where we achieve higher accuracy than similar self-supervised approaches. Our work shows that cross-spectral correspondence estimation can be solved in a common framework that learns to generalize alignment across spectra.
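
The abstract above does not define its cycle-consistency metric, so the following is only a generic sketch of the underlying idea: if a dense flow maps image A to image B and a second flow maps B back to A, a round trip should return every pixel to where it started. The function name `cycle_consistency_error`, the `(H, W, 2)` flow layout, and the nearest-neighbour sampling are all assumptions made for illustration, not the authors' formulation.

```python
import numpy as np

def cycle_consistency_error(flow_ab, flow_ba):
    """Mean round-trip error (in pixels) for two dense flow fields.

    flow_ab, flow_ba: arrays of shape (H, W, 2) holding (dx, dy)
    displacements from A to B and from B to A (assumed layout).
    """
    h, w, _ = flow_ab.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Where each pixel of A lands in B, rounded to the nearest pixel
    # and clipped to the image bounds (a simplification).
    xb = np.clip(np.rint(xs + flow_ab[..., 0]).astype(int), 0, w - 1)
    yb = np.clip(np.rint(ys + flow_ab[..., 1]).astype(int), 0, h - 1)
    # Backward flow sampled at those landing positions.
    back = flow_ba[yb, xb]
    # On a perfect round trip the forward and backward flows cancel.
    residual = flow_ab + back
    return float(np.linalg.norm(residual, axis=-1).mean())

# Usage: a uniform one-pixel shift right, undone by a one-pixel shift left.
forward = np.zeros((4, 5, 2)); forward[..., 0] = 1.0
backward = np.zeros((4, 5, 2)); backward[..., 0] = -1.0
print(cycle_consistency_error(forward, backward))
```

Because the error needs no ground-truth correspondences, a quantity of this kind can serve as a self-supervised training signal; a differentiable version would replace the rounding with bilinear sampling.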

    Pedestrian and cyclist detection and intent estimation for autonomous vehicles: A survey

    © 2019 by the authors. As autonomous vehicles become more common on the roads, their advancement raises safety concerns for vulnerable road users, such as pedestrians and cyclists. This paper presents a review of recent developments in pedestrian and cyclist detection and intent estimation to increase the safety of autonomous vehicles, for both the driver and other road users. Understanding the intentions of the pedestrian/cyclist enables the self-driving vehicle to take actions to avoid incidents. To make this possible, the development of methods and techniques for the autonomous vehicle, such as deep learning (DL), will be explored. For example, pedestrian detection has been significantly advanced using DL approaches such as Fast Region-Convolutional Neural Network (R-CNN), Faster R-CNN and Single Shot Detector (SSD). Although DL has been around for several decades, the hardware to realise the techniques has only recently become viable. Using these DL methods for pedestrian and cyclist detection and applying them to tracking, motion modelling and pose estimation can allow for a successful and accurate method of intent estimation for vulnerable road users. Although there has been a growth in research on pedestrian detection using vision-based approaches, further attention should be given to cyclist detection. To further improve safety for these vulnerable road users (VRUs), approaches such as sensor fusion and intent estimation should be investigated.

    Pedestrian detection in far infrared images

    Detection of people in images is a relatively new field of research, but one that has been widely embraced. The applications are many, such as automatic labeling of large databases, security systems, and pedestrian detection in intelligent transportation systems. Within the latter, the purpose of a pedestrian detector on a moving vehicle is to detect the presence of people in the path of the vehicle. The ultimate goal is to avoid a collision between the two. This thesis is framed within advanced driver assistance systems, passive safety systems that warn the driver of conditions that may be adverse. It presents an advanced driver assistance system module that warns the driver about the presence of pedestrians, using computer vision in thermal images. Such sensors are particularly useful under conditions of low illumination. The document is divided following the usual parts of a pedestrian detection system: development of descriptors that define the appearance of people in this kind of image, the application of these descriptors to full-sized images, and temporal tracking of the pedestrians found. As part of the work developed in this thesis, a database of pedestrians in the far infrared spectrum is presented. This database has been used to develop an evaluation of pedestrian detection systems as well as new descriptors. These descriptors use techniques for the systematic description of the shape of the pedestrian as well as methods to achieve invariance to contrast, illumination or ambient temperature. The descriptors are analyzed and modified to improve their performance in a detection problem, where potential candidates are searched for in full-size images. Finally, a method for tracking the detected pedestrians is proposed to reduce the number of missed detections that occurred at earlier stages of the algorithm.

    Principal Component Analysis based Image Fusion Routine with Application to Stamping Split Detection

    This dissertation presents a novel thermal and visible image fusion system with application in online automotive stamping split detection. The thermal vision system scans temperature maps of highly reflective steel panels to locate abnormal temperature readings indicative of high local wrinkling pressure that causes metal splitting. The visible vision system offsets the blurring effect of the thermal vision system caused by heat diffusion across the surface through conduction and heat losses to the surroundings through convection. The fusion of thermal and visible images combines two separate physical channels and provides a more informative result image than the original ones. Principal Component Analysis (PCA) is employed for image fusion to transform the original image to its eigenspace. By retaining the principal components with influential eigenvalues, PCA keeps the key features of the original image and reduces the noise level. Then a pixel-level image fusion algorithm is developed to fuse images from the thermal and visible channels, enhance the result image from the low level, and increase the signal-to-noise ratio. Finally, an automatic split detection algorithm is designed and implemented to perform online objective automotive stamping split detection. The integrated PCA-based image fusion system for stamping split detection is developed and tested on an automotive press line. It is also assessed with online thermal and visible acquisitions, which illustrate its performance and success. Splits of varying shape, size and number are detected under actual operating conditions.
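
As a rough illustration of how PCA can drive pixel-level fusion, the sketch below uses one common textbook scheme: treat the two registered images as two variables, take the leading eigenvector of their 2x2 covariance matrix, and use its normalized components as blending weights. This is not necessarily the dissertation's algorithm; the function names `pca_fusion_weights` and `pca_fuse` are hypothetical, and the images are assumed to be co-registered arrays of equal shape.

```python
import numpy as np

def pca_fusion_weights(thermal, visible):
    """Blending weights from the leading principal component.

    Assumes both inputs are registered images of the same shape.
    """
    data = np.stack([thermal.ravel(), visible.ravel()]).astype(np.float64)
    cov = np.cov(data)                       # 2x2 covariance of the two channels
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    # Leading eigenvector; abs() removes the arbitrary sign (a simplification
    # that is only meaningful when the channels are positively correlated).
    principal = np.abs(eigvecs[:, -1])
    return principal / principal.sum()       # weights sum to 1

def pca_fuse(thermal, visible):
    """Pixel-level weighted sum of the two channels."""
    w = pca_fusion_weights(thermal, visible)
    return w[0] * thermal + w[1] * visible

# Usage with synthetic data: the second channel has twice the spread,
# so it receives twice the weight.
t = np.arange(16, dtype=np.float64).reshape(4, 4)
v = 2.0 * t
print(pca_fusion_weights(t, v))
```

The channel with more variance dominates the fused image under this scheme, which is why it is usually applied after the images have been normalized to comparable ranges.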

    Calibration-free Pedestrian Partial Pose Estimation Using a High-mounted Kinect

    The application of human behavior analysis has developed rapidly over recent decades, from entertainment systems to professional ones such as Human-Robot Interaction (HRI), Advanced Driver Assistance Systems (ADAS), and Pedestrian Protection Systems (PPS). This thesis addresses the problem of recognizing pedestrians and estimating their body orientation in 3D, on the premise that knowing a person's orientation helps in analyzing and predicting their behavior. A new method is proposed in which a pedestrian detection module and an orientation estimation module are integrated sequentially. For pedestrian detection, a cascade classifier is designed to draw a bounding box around each detected pedestrian. Regions are then extracted from a 3D point cloud and passed to a discrete orientation classifier that estimates the orientation of the pedestrian's torso. This classification is based on a coarse, rasterized depth image simulating a virtual camera placed directly above the detected pedestrian, and uses a support vector machine trained to distinguish 10 discrete orientations (30-degree increments). To test the approach, a new benchmark database containing 764 point clouds for body-orientation classification was captured. For this benchmark, a Microsoft Kinect recorded the point clouds of 30 participants, and a marker-based motion capture system (Vicon) provided the ground truth on their orientation. Finally, we demonstrate the improvements brought by our system: it detects pedestrians with an accuracy of 95.29% and estimates body orientation (within a 30-degree interval) with an accuracy of 88.88%. We hope it can provide a foundation for future research.