Unsupervised Domain Adaptation for Multispectral Pedestrian Detection
Multimodal information (e.g., visible and thermal) can generate robust pedestrian detections to facilitate around-the-clock computer vision applications, such as autonomous driving and video surveillance. However, it remains a crucial challenge to train a reliable detector that works well across different multispectral pedestrian datasets without manual annotations. In this paper, we propose a novel unsupervised domain adaptation framework for multispectral pedestrian detection that iteratively generates pseudo annotations and updates the parameters of our designed multispectral pedestrian detector on the target domain. Pseudo annotations are first generated using the detector trained on the source domain, and then updated by fixing the detector's parameters and minimizing the cross entropy loss without back-propagation. Training labels are generated from the pseudo annotations by considering the similarity and complementarity between well-aligned visible and infrared image pairs. The detector's parameters are updated using the generated labels by minimizing our defined multi-detection loss function with back-propagation. The optimal detector parameters are obtained after iteratively updating the pseudo annotations and parameters. Experimental results show that our proposed unsupervised multimodal domain adaptation method achieves significantly higher detection performance than the approach without domain adaptation, and is competitive with supervised multispectral pedestrian detectors.
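The iterative scheme described above — generate pseudo labels with the current detector, then refit the detector on them — can be sketched in miniature. The snippet below is a hypothetical simplification, not the paper's code: a nearest-centroid classifier stands in for the multispectral detector, plain feature vectors stand in for image pairs, and `pseudo_label_adapt` and its arguments are illustrative names.

```python
import numpy as np

def pseudo_label_adapt(source_X, source_y, target_X, n_iters=5):
    """Toy sketch of iterative pseudo-labeling: a nearest-centroid
    'detector' is fit on the labeled source domain, then alternately
    (1) pseudo-labels the unlabeled target domain and (2) refits its
    parameters (the centroids) on those pseudo labels."""
    # Step 0: train the 'detector' on the source domain.
    centroids = np.array([source_X[source_y == c].mean(axis=0) for c in (0, 1)])
    for _ in range(n_iters):
        # (1) Generate pseudo annotations with the current detector.
        d = np.linalg.norm(target_X[:, None, :] - centroids[None, :, :], axis=2)
        pseudo_y = d.argmin(axis=1)
        # (2) Update the detector parameters using the pseudo labels.
        centroids = np.array([target_X[pseudo_y == c].mean(axis=0) for c in (0, 1)])
    return centroids, pseudo_y
```

On a domain-shifted target set, the centroids drift from the source statistics toward the target statistics over the iterations, mirroring the alternating updates in the abstract.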
Enhancing Visibility in Nighttime Haze Images Using Guided APSF and Gradient Adaptive Convolution
Visibility in hazy nighttime scenes is frequently reduced by multiple factors, including low light, intense glow, light scattering, and the presence of multicolored light sources. Existing nighttime dehazing methods often struggle with glow or low-light conditions, producing either excessively dark visuals or outputs with unsuppressed glow. In this paper, we enhance the visibility of a single nighttime haze image by suppressing glow and enhancing low-light regions. To handle glow effects, our framework learns from rendered glow pairs. Specifically, a light-source-aware network is proposed to detect the light sources of night images, followed by APSF (Angular Point Spread Function)-guided glow rendering. Our framework is then trained on the rendered images, resulting in glow suppression. Moreover, we utilize gradient-adaptive convolution to capture edges and textures in hazy scenes. By leveraging the extracted edges and textures, we enhance the contrast of the scene without losing important structural details. To boost low-light intensity, our network learns an attention map, which is then adjusted by gamma correction. This attention has high values in low-light regions and low values in haze and glow regions. Extensive evaluation on real nighttime haze images demonstrates the effectiveness of our method: it achieves a PSNR of 30.38 dB on the GTA5 nighttime haze dataset, outperforming state-of-the-art methods by 13. Our data and code are available at: \url{https://github.com/jinyeying/nighttime_dehaze}. (Accepted to ACM MM 2023.)
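The gamma-corrected attention step described above can be illustrated with a small sketch. This is an assumption-laden simplification rather than the paper's implementation: `attention` is taken as given (high in dark regions, low in glow and haze regions), intensities are assumed normalized to [0, 1], and the blending rule is illustrative.

```python
import numpy as np

def boost_low_light(img, attention, gamma=0.5):
    """Brighten only where the attention map is strong: gamma < 1 lifts
    dark values, and the attention gates the correction so that glow and
    haze regions (low attention) are left essentially untouched."""
    brightened = np.power(img, gamma)  # gamma correction of intensities
    return attention * brightened + (1.0 - attention) * img
```

A dark pixel under full attention is lifted (e.g. 0.04 → 0.2 with gamma 0.5), while a bright glow pixel under zero attention passes through unchanged.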
Low-light Pedestrian Detection in Visible and Infrared Image Feeds: Issues and Challenges
Pedestrian detection has become a cornerstone for several high-level tasks, including autonomous driving, intelligent transportation, and traffic surveillance. Several works have focused on pedestrian detection using visible images, mainly in the daytime. However, the task becomes much more challenging when environmental conditions change to poor lighting or nighttime. Recently, new ideas have been spurred to use alternative sources, such as Far InfraRed (FIR) temperature sensor feeds, for detecting pedestrians in low-light conditions. This study comprehensively reviews recent developments in low-light pedestrian detection approaches. It systematically categorizes and analyses algorithms spanning region-based, non-region-based, and graph-based learning approaches, highlighting their methodologies, implementation issues, and challenges. It also outlines the key benchmark datasets that can be used for research and development of advanced pedestrian detection algorithms, particularly in low-light situations.
Vision Sensors and Edge Detection
The Vision Sensors and Edge Detection book reflects a selection of recent developments in the area of vision sensors and edge detection. The book has two sections. The first presents vision sensors, with applications to panoramic vision sensors, wireless vision sensors, and automated vision sensor inspection; the second covers image processing techniques such as image measurements, image transformations, filtering, and parallel computing.
Deep visible and thermal image fusion for enhanced pedestrian visibility
Reliable vision in challenging illumination conditions is one of the crucial requirements of future autonomous automotive systems. In the last decade, thermal cameras have become more easily accessible to a larger number of researchers. This has resulted in numerous studies which confirmed the benefits of thermal cameras in limited visibility conditions. In this paper, we propose a learning-based method for visible and thermal image fusion that focuses on generating fused images with high visual similarity to regular truecolor (red-green-blue or RGB) images, while introducing new informative details in pedestrian regions. The goal is to create natural, intuitive images that would be more informative than a regular RGB camera to a human driver in challenging visibility conditions. The main novelty of this paper is the idea of relying on two types of objective functions for optimization: a similarity metric between the RGB input and the fused output, to achieve a natural image appearance; and an auxiliary pedestrian detection error, to help define relevant features of the human appearance and blend them into the output. We train a convolutional neural network using image samples from variable conditions (day and night) so that the network learns the appearance of humans in the different modalities and creates more robust results applicable in realistic situations. Our experiments show that the visibility of pedestrians is noticeably improved, especially in dark regions and at night. Compared to existing methods, we can better learn context and define fusion rules that focus on pedestrian appearance, which is not guaranteed with methods that focus on low-level image quality metrics.
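The two-objective idea above — a similarity term toward the RGB input plus an auxiliary pedestrian detection error — can be written schematically. This is a minimal sketch, assuming MSE as the similarity metric and a precomputed scalar detection error; the function name, the weighting `lam`, and the exact form of each term are illustrative assumptions, not the paper's formulation.

```python
import numpy as np

def fusion_loss(fused, rgb, det_error, lam=0.1):
    """Combine the two objectives: keep the fused output close to the
    RGB input (natural appearance), plus a weighted auxiliary pedestrian
    detection error that emphasizes human-relevant features."""
    similarity_term = np.mean((fused - rgb) ** 2)  # MSE stands in for the similarity metric
    return similarity_term + lam * det_error
```

When the fused image equals the RGB input, only the weighted detection term remains, so the detector alone drives the pedestrian-region details into the output.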
Nighttime Driver Behavior Prediction Using Taillight Signal Recognition via CNN-SVM Classifier
This paper aims to enhance the ability to predict nighttime driving behavior by identifying the taillights of both human-driven and autonomous vehicles. The proposed model incorporates a customized detector designed to accurately detect front-vehicle taillights on the road. At the beginning of the detector, a learnable pre-processing block is implemented, which extracts deep features from input images and calculates the data rarity for each feature. In the next step, drawing inspiration from soft attention, a weighted binary mask is designed that guides the model to focus more on predetermined regions. This research utilizes Convolutional Neural Networks (CNNs) to extract distinguishing characteristics from these areas, then reduces dimensionality using Principal Component Analysis (PCA). Finally, a Support Vector Machine (SVM) is used to predict the behavior of the vehicles. To train and evaluate the model, a large-scale dataset is collected from two types of dash-cams and Insta360 cameras capturing the rear view of Ford Motor Company vehicles. This dataset includes over 12k frames captured during both daytime and nighttime hours. To address the limited nighttime data, a unique pixel-wise image processing technique is implemented to convert daytime images into realistic night images. The findings from the experiments demonstrate that the proposed methodology can accurately categorize vehicle behavior with 92.14% accuracy, 97.38% specificity, 92.09% sensitivity, 92.10% F1-measure, and a Cohen's Kappa statistic of 0.895. Further details are available at https://github.com/DeepCar/Taillight_Recognition. (12 pages, 10 figures.)
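The classification tail of the pipeline above (deep features → PCA → SVM) can be sketched with scikit-learn. A minimal sketch under stated assumptions: random vectors stand in for the CNN features, the rarity/attention stages are omitted, and the function name and parameter values are illustrative, not the authors' configuration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

def build_behavior_classifier(n_components=8):
    """PCA reduces the dimensionality of the (stand-in) deep features,
    and an RBF-kernel SVM classifies the resulting vectors into
    behavior categories, mirroring the last two stages described above."""
    return make_pipeline(PCA(n_components=n_components), SVC(kernel="rbf"))
```

The pipeline object exposes the usual `fit`/`predict` interface, so swapping in real CNN feature vectors only changes the input arrays.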
Comprehensive Survey and Analysis of Techniques, Advancements, and Challenges in Video-Based Traffic Surveillance Systems
The challenges inherent in video surveillance are compounded by several factors, such as dynamic lighting conditions, the coordination of object matching, diverse environmental scenarios, the tracking of heterogeneous objects, and coping with fluctuations in object poses, occlusions, and motion blur. This research endeavor undertakes a rigorous and in-depth analysis of deep-learning-oriented models utilized for object identification and tracking. Emphasizing the development of effective model design methodologies, this study furnishes an exhaustive and in-depth analysis of object tracking and identification models within the specific domain of video surveillance.
Gabor-enhanced histogram of oriented gradients for human presence detection applied in aerial monitoring
In UAV-based human detection, the extraction and selection of the feature vector is one of the critical tasks in ensuring the optimal performance of the detection system. Although UAV cameras capture high-resolution images, the relative size of human figures renders persons at very low resolution and contrast. Feature descriptors that can adequately discriminate between local symmetrical patterns in a low-contrast image may improve the detection of human figures in vegetative environments. Such a descriptor is proposed and presented in this paper. Initially, the acquired images are fed to a digital processor in a ground station, where the human detection algorithm is performed. Part of the human detection algorithm is the GeHOG feature extraction, where a bank of Gabor filters is used to generate textured images from the original. The local energy for each cell of the Gabor images is calculated to identify the dominant orientations. The bins of conventional HOG are enhanced based on the dominant orientation index and the accumulated local energy in the Gabor images. To measure the performance of the proposed features, the Gabor-enhanced HOG (GeHOG) and two other recent improvements to HOG, the Histogram of Edge Oriented Gradients (HEOG) and Improved HOG (ImHOG), are used for human detection on the INRIA dataset and a custom dataset of farmers working in fields, captured via an unmanned aerial vehicle. The proposed feature descriptor significantly improved human detection and performed better than the recent improvements to conventional HOG. Using GeHOG improved the precision of human detection to 98.23% on the INRIA dataset. The proposed feature can significantly improve human detection in surveillance systems, especially in vegetative environments.
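The core GeHOG step — filtering each cell with a Gabor bank and picking the orientation with the highest local energy — can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the kernel parameters (`ksize`, `sigma`, `lam`) and the four-orientation bank are assumptions, and the local energy is taken as the sum of squared filter responses.

```python
import numpy as np
from scipy.signal import convolve2d

def gabor_kernel(theta, ksize=9, sigma=2.0, lam=4.0):
    """Real-valued Gabor kernel at orientation `theta` (standard
    formulation: Gaussian envelope times an oriented cosine carrier)."""
    half = ksize // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    return np.exp(-(x**2 + y**2) / (2 * sigma**2)) * np.cos(2 * np.pi * xr / lam)

def dominant_orientation(cell, n_orient=4):
    """Filter one cell with a small Gabor bank and return the index of
    the orientation with the highest local energy, which GeHOG uses
    (together with that energy) to reweight the HOG bins."""
    thetas = [i * np.pi / n_orient for i in range(n_orient)]
    energies = [np.sum(convolve2d(cell, gabor_kernel(t), mode="same") ** 2)
                for t in thetas]
    return int(np.argmax(energies))
```

On a cell whose intensity varies only along x (vertical stripes), the filter oriented at theta = 0 acts as a matched filter and yields the largest energy.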