22 research outputs found
Pedestrian Detection at Day/Night Time with Visible and FIR Cameras: A Comparison
Other grants: DGT (SPIP2014-01352). Despite all the significant advances in pedestrian detection brought by computer vision for driving assistance, it is still a challenging problem. One reason is the extremely varying lighting conditions under which such a detector should operate, namely day and nighttime. Recent research has shown that combining visible and non-visible imaging modalities may increase detection accuracy, where the infrared spectrum plays a critical role. The goal of this paper is to assess the accuracy gain of different pedestrian models (holistic, part-based, patch-based) when training with images in the far infrared spectrum. Specifically, we want to compare detection accuracy on test images recorded at day and nighttime when trained (and tested) using (a) plain color images; (b) just infrared images; and (c) both of them. In order to obtain results for the last item, we propose an early fusion approach to combine features from both modalities. We base the evaluation on a new dataset that we have built for this purpose as well as on the publicly available KAIST multispectral dataset.
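The early fusion approach mentioned above can be sketched as channel-level concatenation of registered visible and FIR images before a single feature extractor; the array shapes and the helper below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def early_fuse(rgb, fir):
    """Concatenate a color image (H, W, 3) with a registered FIR image (H, W)
    into a single (H, W, 4) tensor, so that one downstream feature extractor
    sees both modalities at once (early fusion)."""
    fir = fir[..., np.newaxis].astype(rgb.dtype)   # add a channel axis
    return np.concatenate([rgb, fir], axis=-1)

# Toy example: a 2x2 color patch fused with its thermal counterpart.
rgb = np.zeros((2, 2, 3))
fir = np.ones((2, 2))
fused = early_fuse(rgb, fir)
print(fused.shape)  # (2, 2, 4)
```

The key property is that fusion happens before any feature computation, so pedestrian models (holistic, part-based, or patch-based) need no architectural change beyond accepting a fourth input channel.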
Low-light Pedestrian Detection in Visible and Infrared Image Feeds: Issues and Challenges
Pedestrian detection has become a cornerstone for several high-level tasks,
including autonomous driving, intelligent transportation, and traffic
surveillance. Several works have focused on pedestrian detection using
visible images, mainly in the daytime. However, the task becomes considerably
more challenging when environmental conditions change to poor lighting or nighttime.
Recently, new ideas have been spurred to use alternative sources, such as Far
InfraRed (FIR) temperature sensor feeds for detecting pedestrians in low-light
conditions. This study comprehensively reviews recent developments in low-light
pedestrian detection approaches. It systematically categorizes and analyses
algorithms ranging from region-based to non-region-based and graph-based
learning methodologies, highlighting their design choices, implementation
issues, and challenges. It also outlines the key benchmark datasets that can be
used for research and development of advanced pedestrian detection algorithms,
particularly in low-light situations.
FIR-based Future Trajectory Prediction in Nighttime Autonomous Driving
The performance of the current collision avoidance systems in Autonomous
Vehicles (AV) and Advanced Driver Assistance Systems (ADAS) can be drastically
affected by low light and adverse weather conditions. Collisions with large
animals such as deer in low light cause significant cost and damage every year.
In this paper, we propose the first AI-based method for future trajectory
prediction of large animals and mitigating the risk of collision with them in
low light. To minimize false collision warnings, our multi-step framework
first detects the large animal, predicts a preliminary risk level for it,
and discards low-risk animals. In the next
stage, a multi-stream CONV-LSTM-based encoder-decoder framework is designed to
predict the future trajectory of the potentially high-risk animals. The
proposed model uses camera motion prediction as well as the local and global
context of the scene to generate accurate predictions. Furthermore, this paper
introduces a new dataset of FIR videos for large animal detection and risk
estimation in real nighttime driving scenarios. Our experiments show promising
results of the proposed framework in adverse conditions. Our code is available
online. Comment: Conference: IEEE Intelligent Vehicles 2023 (IEEE IV 2023).
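The multi-step filtering described above (detect, score a preliminary risk, discard low-risk animals before running the trajectory predictor) can be sketched roughly as follows; the risk heuristic, threshold, and field names are hypothetical, not the paper's method.

```python
# Hypothetical sketch of the two-stage filtering idea: score every detection
# first and only pass high-risk animals to the (expensive) CONV-LSTM
# trajectory predictor. The heuristic and threshold are assumptions.
RISK_THRESHOLD = 0.5

def preliminary_risk(det):
    # Toy heuristic: closer (larger box) and more central objects are riskier.
    # Boxes use normalized (x, y, w, h) image coordinates.
    x, y, w, h = det["box"]
    size = w * h
    centrality = 1.0 - abs(x + w / 2 - 0.5) * 2
    return min(1.0, size * 10) * max(0.0, centrality)

def filter_high_risk(detections):
    return [d for d in detections if preliminary_risk(d) >= RISK_THRESHOLD]

dets = [
    {"id": "deer_near", "box": (0.4, 0.5, 0.3, 0.3)},    # large, central
    {"id": "deer_far",  "box": (0.05, 0.1, 0.04, 0.04)}, # small, off-center
]
print([d["id"] for d in filter_high_risk(dets)])  # ['deer_near']
```

Only the surviving detections would then be fed to the encoder-decoder stage, which is what keeps false collision warnings down.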
Deep visible and thermal image fusion for enhanced pedestrian visibility
Reliable vision in challenging illumination conditions is one of the crucial requirements of future autonomous automotive systems. In the last decade, thermal cameras have become more easily accessible to a larger number of researchers. This has resulted in numerous studies which confirmed the benefits of thermal cameras in limited visibility conditions. In this paper, we propose a learning-based method for visible and thermal image fusion that focuses on generating fused images with high visual similarity to regular truecolor (red-green-blue or RGB) images, while introducing new informative details in pedestrian regions. The goal is to create natural, intuitive images that would be more informative than a regular RGB camera to a human driver in challenging visibility conditions. The main novelty of this paper is the idea to rely on two types of objective functions for optimization: a similarity metric between the RGB input and the fused output to achieve natural image appearance; and an auxiliary pedestrian detection error to help define relevant features of the human appearance and blend them into the output. We train a convolutional neural network using image samples from variable conditions (day and night) so that the network learns the appearance of humans in the different modalities and creates more robust results applicable in realistic situations. Our experiments show that the visibility of pedestrians is noticeably improved, especially in dark regions and at night. Compared to existing methods, we can better learn context and define fusion rules that focus on the pedestrian appearance, which is not guaranteed with methods that focus on low-level image quality metrics.
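The two objective functions described above can be sketched as a weighted sum of a similarity term and a pedestrian-region penalty; the mean-squared-error stand-ins, the binary pedestrian mask, and the weight below are illustrative assumptions, not the paper's actual losses.

```python
import numpy as np

def fusion_loss(fused, rgb, ped_mask, det_weight=2.0):
    """Toy version of a two-part fusion objective: keep the fused image close
    to the RGB input everywhere (natural appearance), while penalizing low
    intensity inside pedestrian regions, standing in for the auxiliary
    pedestrian detection error."""
    similarity = np.mean((fused - rgb) ** 2)
    detection = np.mean((fused[ped_mask] - 1.0) ** 2) if ped_mask.any() else 0.0
    return similarity + det_weight * detection

rgb = np.full((4, 4), 0.2)            # a dark grayscale scene
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True                 # a pedestrian occupies the center
dark = rgb.copy()                     # output that ignores the pedestrian
bright = rgb.copy()
bright[mask] = 1.0                    # output that highlights the pedestrian
print(fusion_loss(dark, rgb, mask) > fusion_loss(bright, rgb, mask))  # True
```

The tension between the two terms is the point: the similarity term alone would reproduce the dark RGB input, while the detection term pushes pedestrian pixels to stand out.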
Pedestrian detection at daytime and nighttime conditions based on YOLO-v5
This paper presents a new deep-learning-based algorithm for daytime and nighttime pedestrian detection, named multispectral, focused on vehicular safety applications. The proposal is based on YOLO-v5 and consists of two subnetworks that work on color (RGB) and thermal (IR) images, respectively. The information is then merged through a fusion subnetwork that integrates the RGB and IR networks to obtain a pedestrian detector. Experiments aimed at verifying the quality of the proposal were conducted on several public pedestrian databases for daytime and nighttime detection. The main results according to the mAP metric, at an IoU of 0.5, were: 96.6 % on the INRIA database, 89.2 % on CVC09, 90.5 % on LSIFIR, 56 % on FLIR-ADAS, 79.8 % on CVC14, 72.3 % on Nightowls and 53.3 % on KAIST.
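All the mAP figures above are reported at an IoU threshold of 0.5; a minimal sketch of that matching criterion, with hypothetical boxes:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes in (x1, y1, x2, y2) form."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# A predicted box counts as a true positive only when IoU >= 0.5.
print(iou((0, 0, 10, 10), (5, 0, 15, 10)) >= 0.5)  # False (IoU = 1/3)
print(iou((0, 0, 10, 10), (0, 0, 10, 5)) >= 0.5)   # True  (IoU = 0.5)
```

mAP then averages precision over recall levels (and over classes) using this 0.5 criterion to decide which detections match ground truth.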
ICAFusion: Iterative Cross-Attention Guided Feature Fusion for Multispectral Object Detection
Effective feature fusion of multispectral images plays a crucial role in
multi-spectral object detection. Previous studies have demonstrated the
effectiveness of feature fusion using convolutional neural networks, but these
methods are sensitive to image misalignment due to the inherent deficiency in
local-range feature interaction, resulting in performance degradation. To
address this issue, a novel feature fusion framework of dual cross-attention
transformers is proposed to model global feature interaction and capture
complementary information across modalities simultaneously. This framework
enhances the discriminability of object features through the query-guided
cross-attention mechanism, leading to improved performance. However, stacking
multiple transformer blocks for feature enhancement incurs a large number of
parameters and high spatial complexity. To handle this, inspired by the human
process of reviewing knowledge, an iterative interaction mechanism is proposed
to share parameters among block-wise multimodal transformers, reducing model
complexity and computation cost. The proposed method is general and can be
effectively integrated into different detection frameworks and used with different
backbones. Experimental results on KAIST, FLIR, and VEDAI datasets show that
the proposed method achieves superior performance and faster inference, making
it suitable for various practical scenarios. Code will be available at
https://github.com/chanchanchan97/ICAFusion. Comment: submitted to Pattern Recognition Journal, minor revision.
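The iterative parameter-sharing idea (reusing one cross-attention block across steps instead of stacking separately parameterized blocks) can be sketched as follows; the single-head formulation, shapes, and toy features are assumptions, not the ICAFusion code.

```python
import numpy as np

def cross_attention(query, context, w_q, w_k, w_v):
    """Scaled dot-product cross-attention: queries from one modality attend
    to keys/values computed from the other modality."""
    q, k, v = query @ w_q, context @ w_k, context @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    attn = np.exp(scores - scores.max(axis=-1, keepdims=True))  # stable softmax
    attn /= attn.sum(axis=-1, keepdims=True)
    return attn @ v

rng = np.random.default_rng(0)
d = 8
rgb_feat = rng.normal(size=(4, d))   # 4 tokens from the visible branch
ir_feat = rng.normal(size=(4, d))    # 4 tokens from the thermal branch
w_q, w_k, w_v = (rng.normal(size=(d, d)) for _ in range(3))

# Iterative interaction: the SAME weights are reused at every step,
# rather than allocating new parameters per stacked transformer block.
for _ in range(3):
    rgb_feat = rgb_feat + cross_attention(rgb_feat, ir_feat, w_q, w_k, w_v)
    ir_feat = ir_feat + cross_attention(ir_feat, rgb_feat, w_q, w_k, w_v)
print(rgb_feat.shape)  # (4, 8)
```

Because the loop reuses one `(w_q, w_k, w_v)` triple, parameter count stays constant as the number of interaction steps grows, which is the complexity reduction the abstract describes.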