Multispectral Deep Neural Networks for Pedestrian Detection
Multispectral pedestrian detection is essential for around-the-clock
applications, e.g., surveillance and autonomous driving. We deeply analyze
Faster R-CNN for the multispectral pedestrian detection task and then model it
as a convolutional network (ConvNet) fusion problem. Further, we discover that
ConvNet-based pedestrian detectors trained separately on color or thermal
images provide complementary information in discriminating human instances.
Thus there is large potential to improve pedestrian detection by using color
and thermal images in DNNs simultaneously. We carefully design four ConvNet
fusion architectures that integrate two-branch ConvNets at different DNN
stages, all of which yield better performance compared with the baseline
detector. Our experimental results on the KAIST pedestrian benchmark show that
the Halfway Fusion model, which performs fusion on middle-level convolutional
features, outperforms the baseline method by 11% and yields a miss rate 3.5%
lower than the other proposed architectures.
Comment: 13 pages, 8 figures, BMVC 2016 oral
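The Halfway Fusion idea described above can be sketched in a few lines: concatenate the two branches' middle-level feature maps along the channel axis, then reduce the dimensionality with a learned 1x1 convolution so the downstream detection head is unchanged. This is a minimal NumPy illustration with made-up channel counts and spatial sizes, not the paper's actual Faster R-CNN implementation:

```python
import numpy as np

def conv1x1(x, w):
    """Apply a 1x1 convolution: x is (C_in, H, W), w is (C_out, C_in)."""
    return np.tensordot(w, x, axes=([1], [0]))

# Hypothetical middle-level feature maps from the color and thermal branches
# (shapes are illustrative only).
rgb_feat = np.random.rand(256, 38, 50)
thermal_feat = np.random.rand(256, 38, 50)

# Halfway Fusion: concatenate the two branches along the channel axis ...
fused = np.concatenate([rgb_feat, thermal_feat], axis=0)  # (512, 38, 50)

# ... then reduce channels with a learned 1x1 convolution so the rest of
# the detection head sees the same feature dimensionality as a single branch.
w = np.random.randn(256, 512) * 0.01
reduced = conv1x1(fused, w)
print(reduced.shape)  # (256, 38, 50)
```

In the paper the reduction layer is trained jointly with the rest of the network; here the weights are random purely to show the tensor shapes involved.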
Borrow from Anywhere: Pseudo Multi-modal Object Detection in Thermal Imagery
Can we improve detection in the thermal domain by borrowing features from
rich domains like visual RGB? In this paper, we propose a pseudo-multimodal
object detector trained on natural image domain data to help improve the
performance of object detection in thermal images. We assume access to a
large-scale dataset in the visual RGB domain and a relatively smaller dataset
(in terms of instances) in the thermal domain, as is common today. We propose the
use of well-known image-to-image translation frameworks to generate pseudo-RGB
equivalents of a given thermal image and then use a multi-modal architecture
for object detection in the thermal image. We show that our framework
outperforms existing benchmarks without the explicit need for paired training
examples from the two domains. We also show that our framework can learn with
less data from the thermal domain. Our code and pre-trained models are made
available at https://github.com/tdchaitanya/MMTOD
Comment: Accepted at Perception Beyond Visible Spectrum Workshop, CVPR 2019
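The translation step of the pipeline above can be sketched as follows. The paper uses learned image-to-image translation frameworks to produce the pseudo-RGB input; this stand-in merely normalises the thermal channel and replicates it into three channels, which shows the interface (a thermal frame in, an RGB-shaped array out) without the learned model:

```python
import numpy as np

def thermal_to_pseudo_rgb(thermal):
    """Stand-in for the image-to-image translation step.

    This is NOT the paper's method: we only normalise the single thermal
    channel to [0, 255] and replicate it into three channels so a standard
    RGB detector could consume the result.
    """
    t = (thermal - thermal.min()) / max(float(np.ptp(thermal)), 1e-8)
    return np.repeat((t * 255.0)[..., None], 3, axis=-1).astype(np.uint8)

thermal = np.random.rand(480, 640).astype(np.float32)  # hypothetical frame
pseudo_rgb = thermal_to_pseudo_rgb(thermal)
print(pseudo_rgb.shape, pseudo_rgb.dtype)  # (480, 640, 3) uint8
```

In the actual framework, the pseudo-RGB image and the original thermal image would then feed the two branches of a multi-modal detector.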
Box-level Segmentation Supervised Deep Neural Networks for Accurate and Real-time Multispectral Pedestrian Detection
Effective fusion of complementary information captured by multi-modal sensors
(visible and infrared cameras) enables robust pedestrian detection under
various surveillance situations (e.g. daytime and nighttime). In this paper, we
present a novel box-level segmentation supervised learning framework for
accurate and real-time multispectral pedestrian detection by incorporating
features extracted in visible and infrared channels. Specifically, our method
takes pairs of aligned visible and infrared images with easily obtained
bounding box annotations as input and estimates accurate prediction maps to
highlight the existence of pedestrians. It offers two major advantages over the
existing anchor-box-based multispectral detection methods. Firstly, it
overcomes the hyperparameter-setting problem that occurs during the training
phase of anchor-box-based detectors and can obtain more accurate detection results,
especially for small and occluded pedestrian instances. Secondly, it is capable
of generating accurate detection results using small-size input images, leading
to improvement of computational efficiency for real-time autonomous driving
applications. Experimental results on KAIST multispectral dataset show that our
proposed method outperforms state-of-the-art approaches in terms of both
accuracy and speed.
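The core of the supervision signal above is turning cheap bounding-box annotations into a dense box-level segmentation mask that the network's prediction map is trained against. A minimal sketch, with a hypothetical helper name and illustrative box coordinates:

```python
import numpy as np

def boxes_to_mask(boxes, height, width):
    """Rasterise bounding-box annotations into a box-level segmentation mask.

    Illustrative helper (not from the paper's code): pixels inside any
    pedestrian box become foreground, everything else stays background.
    """
    mask = np.zeros((height, width), dtype=np.float32)
    for x1, y1, x2, y2 in boxes:
        mask[y1:y2, x1:x2] = 1.0
    return mask

# Two illustrative pedestrian boxes (x1, y1, x2, y2) on an 80x100 map
mask = boxes_to_mask([(10, 20, 30, 60), (50, 5, 70, 40)], 80, 100)
print(mask.shape, mask.sum())  # (80, 100) 1500.0
```

A prediction map trained against such masks needs no anchor boxes, which is where the avoided hyperparameter tuning comes from.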
POL-LWIR Vehicle Detection: Convolutional Neural Networks Meet Polarised Infrared Sensors
For vehicle autonomy, driver assistance and situational awareness, it is
necessary to operate at day and night, and in all weather conditions. In
particular, long wave infrared (LWIR) sensors that receive predominantly
emitted radiation have the capability to operate at night as well as during the
day. In this work, we employ a polarised LWIR (POL-LWIR) camera to acquire data
from a mobile vehicle, to compare and contrast four different convolutional
neural network (CNN) configurations to detect other vehicles in video
sequences. We evaluate two distinct and promising approaches, two-stage
detection (Faster-RCNN) and one-stage detection (SSD), in four different
configurations. We also employ two different image decompositions: the first
based on the polarisation ellipse and the second on the Stokes parameters
themselves. To evaluate our approach, we quantified the experimental trials by
mean average precision (mAP) and processing time, which reveals a clear
trade-off between the two factors. For example, the best mAP result of 80.94% was
achieved using Faster-RCNN, but at a frame rate of 6.4 fps. In contrast,
MobileNet SSD achieved only 64.51% mAP, but at 53.4 fps.
Comment: Computer Vision and Pattern Recognition Workshop 2018
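The two image decompositions mentioned above both derive from the same polarimetric measurements. A short sketch of the standard relations, using toy per-pixel intensities at four polariser orientations (the variable names and values are illustrative):

```python
import numpy as np

def stokes_parameters(i0, i45, i90, i135):
    """Linear Stokes parameters from four polariser orientations (degrees)."""
    s0 = 0.5 * (i0 + i45 + i90 + i135)  # total intensity
    s1 = i0 - i90                       # horizontal vs vertical component
    s2 = i45 - i135                     # diagonal components
    return s0, s1, s2

def polarisation_ellipse(s0, s1, s2):
    """Degree (DoLP) and angle (AoP) of linear polarisation."""
    dolp = np.sqrt(s1 ** 2 + s2 ** 2) / np.maximum(s0, 1e-8)
    aop = 0.5 * np.arctan2(s2, s1)
    return dolp, aop

# Toy intensities for a partially polarised pixel
i0, i45, i90, i135 = 2.0, 1.5, 1.0, 1.5
s0, s1, s2 = stokes_parameters(i0, i45, i90, i135)
dolp, aop = polarisation_ellipse(s0, s1, s2)
print(s0, s1, s2, round(float(dolp), 3), float(aop))  # 3.0 1.0 0.0 0.333 0.0
```

Feeding the CNN either (S0, S1, S2) directly or the derived (intensity, DoLP, AoP) triple gives the two input decompositions compared in the paper.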
Low-light Pedestrian Detection in Visible and Infrared Image Feeds: Issues and Challenges
Pedestrian detection has become a cornerstone for several high-level tasks,
including autonomous driving, intelligent transportation, and traffic
surveillance. There are several works focussed on pedestrian detection using
visible images, mainly in the daytime. However, this task becomes very
challenging when the environmental conditions change to poor lighting or nighttime.
Recently, new ideas have been spurred to use alternative sources, such as Far
InfraRed (FIR) temperature sensor feeds for detecting pedestrians in low-light
conditions. This study comprehensively reviews recent developments in low-light
pedestrian detection approaches. It systematically categorizes and analyzes
various algorithms, from region-based to non-region-based and graph-based
learning approaches, highlighting their methodologies, implementation
issues, and challenges. It also outlines the key benchmark datasets that can be
used for research and development of advanced pedestrian detection algorithms,
particularly in low-light situations.
Multi-spectral Pedestrian Detection via Image Fusion and Deep Neural Networks
The use of multi-spectral imaging has been found to improve the accuracy of deep neural network-based pedestrian detection systems, particularly in challenging nighttime conditions, in which pedestrians are more clearly visible in thermal long-wave infrared bands than in plain RGB. In this article, the authors use the Spectral Edge image fusion method to fuse visible RGB and IR imagery prior to processing with a neural network-based pedestrian detection system. The use of image fusion permits the use of a standard RGB object detection network without the architectural modifications required to handle multi-spectral input. We contrast the performance of networks trained on fused images with networks trained on plain RGB images and networks that take a multi-spectral input. © 2018 Society for Imaging Science and Technology
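The pipeline shape described above, fuse first, then run an unmodified RGB detector, can be sketched as follows. Note that Spectral Edge is a gradient-domain fusion method; the crude per-pixel blend below is only a stand-in to show that the fused output is a drop-in replacement for an RGB input:

```python
import numpy as np

def naive_fuse(rgb, ir, alpha=0.4):
    """Crude stand-in for Spectral Edge fusion.

    Blends the IR channel into every RGB channel with a fixed weight.
    The real method works in the gradient domain and is far more
    sophisticated; this only illustrates the pipeline interface.
    """
    return ((1.0 - alpha) * rgb + alpha * ir[..., None]).astype(rgb.dtype)

rgb = np.random.rand(512, 640, 3).astype(np.float32)  # hypothetical frame pair
ir = np.random.rand(512, 640).astype(np.float32)
fused = naive_fuse(rgb, ir)
print(fused.shape)  # (512, 640, 3): same shape an RGB detector already expects
```

Because the fused image has ordinary three-channel shape, any off-the-shelf RGB detector can consume it unchanged, which is the article's main architectural point.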
Unsupervised Domain Adaptation for Multispectral Pedestrian Detection
Multimodal information (e.g., visible and thermal) can generate robust
pedestrian detections to facilitate around-the-clock computer vision
applications, such as autonomous driving and video surveillance. However, it
remains a crucial challenge to train a reliable detector that works well across
different multispectral pedestrian datasets without manual annotations. In this
paper, we propose a novel unsupervised domain adaptation framework for
multispectral pedestrian detection, by iteratively generating pseudo
annotations and updating the parameters of our designed multispectral
pedestrian detector on the target domain. Pseudo annotations are generated
using the detector trained on the source domain, and then updated by fixing
the parameters of the detector and minimizing the cross-entropy loss without
back-propagation. Training labels are generated using the pseudo annotations by
considering the characteristics of similarity and complementarity between
well-aligned visible and infrared image pairs. The parameters of the detector
are updated using the generated labels by minimizing our defined
multi-detection loss function with back-propagation. The optimal parameters of
the detector can be
obtained after iteratively updating the pseudo annotations and parameters.
Experimental results show that our proposed unsupervised multimodal domain
adaptation method achieves significantly higher detection performance than the
approach without domain adaptation, and is competitive with the supervised
multispectral pedestrian detectors.
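The alternating scheme above, generate pseudo annotations with the current detector, then update the detector on those annotations, is a form of self-training. A deliberately tiny sketch, where a single 1-D threshold stands in for the detector's parameters and all data are synthetic:

```python
import numpy as np

# Toy self-training loop: the "detector" is one threshold on a score axis.
# Everything here is illustrative, not the paper's model or data.
rng = np.random.default_rng(0)
target = np.concatenate([rng.normal(0.3, 0.05, 200),   # background scores
                         rng.normal(0.9, 0.05, 200)])  # pedestrian scores

threshold = 0.45  # "parameters" learned on the source domain
for _ in range(5):
    pseudo = target > threshold                         # step 1: pseudo annotations
    # step 2: update the parameters using the pseudo labels
    # (here: midpoint between the two pseudo-class means)
    threshold = 0.5 * (target[pseudo].mean() + target[~pseudo].mean())
print(round(float(threshold), 2))  # settles near the true boundary, ~0.6
```

The paper's version operates on detector weights with a multi-detection loss and exploits the visible/infrared pair structure when building labels, but the fixed-point flavor of the iteration is the same: better pseudo labels yield better parameters, which yield better pseudo labels.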