Fusion of Multispectral Data Through Illumination-aware Deep Neural Networks for Pedestrian Detection
Multispectral pedestrian detection has received extensive attention in recent
years as a promising solution to facilitate robust human target detection for
around-the-clock applications (e.g. security surveillance and autonomous
driving). In this paper, we demonstrate that illumination information encoded in
multispectral images can be utilized to significantly boost the performance of
pedestrian detection. A novel illumination-aware weighting mechanism is presented
to accurately characterize the illumination condition of a scene. Such illumination
information is incorporated into two-stream deep convolutional neural networks
to learn multispectral human-related features under different illumination
conditions (daytime and nighttime). Moreover, we utilize illumination
information together with multispectral data to generate more accurate semantic
segmentation maps, which are used to boost pedestrian detection accuracy. Putting all
of the pieces together, we present a powerful framework for multispectral
pedestrian detection based on multi-task learning of illumination-aware
pedestrian detection and semantic segmentation. Our proposed method is trained
end-to-end using a well-designed multi-task loss function and outperforms
state-of-the-art approaches on the KAIST multispectral pedestrian dataset.
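As a concrete illustration of the weighting idea, the following is a minimal PyTorch sketch of an illumination-aware gate, assuming a small sub-network that predicts a day/night score from the visible image and uses it to blend two illumination-specific feature streams; all names and layer sizes are illustrative, not the authors' implementation.

import torch
import torch.nn as nn

class IlluminationAwareFusion(nn.Module):
    """Hypothetical gate: blend day/night feature streams by a predicted
    illumination score (not the paper's exact architecture)."""
    def __init__(self):
        super().__init__()
        # Tiny illumination classifier: visible image -> day score in [0, 1]
        self.illum_net = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(16, 1),
            nn.Sigmoid(),
        )

    def forward(self, visible_img, feat_day, feat_night):
        # day_score ~ 1 for daytime scenes, ~ 0 for nighttime scenes
        day_score = self.illum_net(visible_img).view(-1, 1, 1, 1)
        # Gate the two streams by the predicted illumination condition
        return day_score * feat_day + (1.0 - day_score) * feat_night

fusion = IlluminationAwareFusion()
fused = fusion(torch.randn(2, 3, 512, 640),
               torch.randn(2, 256, 64, 80), torch.randn(2, 256, 64, 80))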
Box-level Segmentation Supervised Deep Neural Networks for Accurate and Real-time Multispectral Pedestrian Detection
Effective fusion of complementary information captured by multi-modal sensors
(visible and infrared cameras) enables robust pedestrian detection under
various surveillance situations (e.g. daytime and nighttime). In this paper, we
present a novel box-level segmentation supervised learning framework for
accurate and real-time multispectral pedestrian detection by incorporating
features extracted in visible and infrared channels. Specifically, our method
takes pairs of aligned visible and infrared images with easily obtained
bounding box annotations as input and estimates accurate prediction maps to
highlight the existence of pedestrians. It offers two major advantages over the
existing anchor-box-based multispectral detection methods. Firstly, it
overcomes the hyperparameter setting problem that occurs during the training phase
of anchor-box-based detectors and can obtain more accurate detection results,
especially for small and occluded pedestrian instances. Secondly, it is capable
of generating accurate detection results using small-sized input images, leading
to improved computational efficiency for real-time autonomous driving
applications. Experimental results on the KAIST multispectral dataset show that our
proposed method outperforms state-of-the-art approaches in terms of both
accuracy and speed.
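To make the box-level supervision idea concrete, here is a small NumPy sketch, assuming the supervision maps are produced by rasterizing the bounding-box annotations into binary masks that a fully convolutional detector regresses per pixel; the function name and loss choice are assumptions for illustration.

import numpy as np

def boxes_to_mask(boxes, height, width):
    """Rasterize (x1, y1, x2, y2) pedestrian boxes into a binary target map."""
    mask = np.zeros((height, width), dtype=np.float32)
    for x1, y1, x2, y2 in boxes:
        mask[int(y1):int(y2), int(x1):int(x2)] = 1.0
    return mask

# A detector could minimize a per-pixel loss (e.g. binary cross-entropy)
# between its predicted confidence map and this target; peaks in the map
# indicate pedestrians at inference time, with no anchor boxes to tune.
target = boxes_to_mask([(30, 40, 55, 110), (200, 60, 230, 140)], 256, 320)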
Unsupervised Domain Adaptation for Multispectral Pedestrian Detection
Multimodal information (e.g., visible and thermal) can generate robust
pedestrian detections to facilitate around-the-clock computer vision
applications, such as autonomous driving and video surveillance. However, it
remains a crucial challenge to train a reliable detector that works well across
different multispectral pedestrian datasets without manual annotations. In this
paper, we propose a novel unsupervised domain adaptation framework for
multispectral pedestrian detection, by iteratively generating pseudo
annotations and updating the parameters of our designed multispectral
pedestrian detector on the target domain. Pseudo annotations are generated using
the detector trained on the source domain, and then updated by fixing the
parameters of the detector and minimizing the cross-entropy loss without
back-propagation. Training labels are generated using the pseudo annotations by
considering the characteristics of similarity and complementarity between
well-aligned visible and infrared image pairs. The parameters of the detector are
updated using the generated labels by minimizing our defined multi-detection
loss function with back-propagation. The optimal parameters of the detector can be
obtained after iteratively updating the pseudo annotations and parameters.
Experimental results show that our proposed unsupervised multimodal domain
adaptation method achieves significantly higher detection performance than the
approach without domain adaptation, and is competitive with the supervised
multispectral pedestrian detectors.
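The alternating scheme reads naturally as a loop; below is a high-level Python sketch under the assumption of a detector exposing simple predict/train interfaces, with a toy fusion rule standing in for the paper's similarity/complementarity criteria.

def fuse_modalities(det_visible, det_infrared, score_thresh=0.5):
    """Toy stand-in for the similarity/complementarity rule: keep confident
    detections from either aligned modality."""
    return [d for d in det_visible + det_infrared if d["score"] >= score_thresh]

def adapt_to_target_domain(detector, target_pairs, num_rounds=5):
    """target_pairs: list of aligned (visible, infrared) image pairs."""
    for _ in range(num_rounds):
        # 1) Generate pseudo annotations with the current detector (no backprop).
        pseudo = [fuse_modalities(detector.predict(vis), detector.predict(ir))
                  for vis, ir in target_pairs]
        # 2) Update the detector parameters on the pseudo labels (with backprop).
        detector.train_on(target_pairs, pseudo)
    return detector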
Multimodal fusion architectures for pedestrian detection
Pedestrian detection provides a crucial functionality in many human-centric applications, such as video surveillance, urban scene analysis, and autonomous driving. Recently, multimodal pedestrian detection has received extensive attention since the fusion of complementary information captured by visible and infrared sensors enables robust human target detection under daytime and nighttime scenes. In this chapter, we systematically evaluate the performance of different multimodal fusion architectures in order to identify the optimal solutions for pedestrian detection. We make two important observations: (1) it is useful to combine the most commonly used concatenation fusion scheme with a global scene-aware mechanism to learn both human-related features and the correlation between visible and thermal feature maps; (2) two-stream segmentation supervision without multimodal fusion provides the most effective scheme to infuse segmentation information as supervision for learning human-related features. Based on these studies, we present a unified multimodal fusion framework for joint training of target detection and segmentation supervision, which achieves state-of-the-art multimodal pedestrian detection performance on the public KAIST benchmark dataset.
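Observation (1) can be sketched as concatenation fusion followed by a global, squeeze-and-excitation-style gate; the PyTorch module below is one plausible reading of such a scene-aware mechanism, with all layer choices assumed for illustration.

import torch
import torch.nn as nn

class SceneAwareConcatFusion(nn.Module):
    def __init__(self, channels=256):
        super().__init__()
        # Standard concatenation fusion, reduced back to `channels` maps
        self.reduce = nn.Conv2d(2 * channels, channels, kernel_size=1)
        # Global scene descriptor -> per-channel gates
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(channels, channels),
            nn.Sigmoid(),
        )

    def forward(self, feat_visible, feat_thermal):
        fused = self.reduce(torch.cat([feat_visible, feat_thermal], dim=1))
        w = self.gate(fused).unsqueeze(-1).unsqueeze(-1)
        return fused * w  # scene-aware channel re-weighting

fusion = SceneAwareConcatFusion()
out = fusion(torch.randn(1, 256, 64, 80), torch.randn(1, 256, 64, 80))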
Two-stream convolutional neural network for non-destructive subsurface defect detection via similarity comparison of lock-in thermography signals
Active infrared thermography is a safe, fast, and low-cost solution for subsurface defect inspection, providing quality control in many industrial production tasks. In this paper, we explore deep learning-based approaches to analyze lock-in thermography image sequences for non-destructive testing and evaluation (NDT&E) of subsurface defects. Different from most existing Convolutional Neural Network (CNN) models that directly classify individual regions/pixels as defective or non-defective, we present a novel two-stream CNN architecture to extract and compare features in a pair of 1D thermal signal sequences for accurate classification/differentiation of defective and non-defective regions. In this manner, we can significantly increase the size of the training data by pairing two individually captured 1D thermal signals, thereby greatly easing the requirement of collecting a large number of thermal sequences of specimens with defects to train deep CNN models. Moreover, we experimentally investigate a number of network alternatives, identifying the optimal fusion scheme/stage for differentiating the thermal behaviors of defective and non-defective regions. Experimental results demonstrate that our proposed method, which directly learns feature representations from a large number of real-captured thermal signal pairs, outperforms well-established lock-in thermography data processing techniques on specimens made of different materials and at various excitation frequencies.
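A minimal PyTorch sketch of the two-stream comparison idea follows: a shared 1D encoder embeds each thermal signal, and a small head classifies whether the pair behaves alike. The architecture details are assumptions, but the pairing trick is the point, since N signals yield on the order of N² training pairs.

import torch
import torch.nn as nn

class SignalPairNet(nn.Module):
    def __init__(self):
        super().__init__()
        # Shared 1D feature extractor applied to each thermal signal
        self.encoder = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=7, padding=3), nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
        )
        # Late fusion of the two embeddings -> same / different behavior
        self.classifier = nn.Sequential(
            nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 2),
        )

    def forward(self, sig_a, sig_b):  # each: (batch, 1, length)
        feats = torch.cat([self.encoder(sig_a), self.encoder(sig_b)], dim=1)
        return self.classifier(feats)

net = SignalPairNet()
logits = net(torch.randn(4, 1, 200), torch.randn(4, 1, 200))  # (4, 2)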
An Alarm Method for a Loose Parts Monitoring System
In order to reduce the false alarm rate and missed detection rate of a Loose Parts Monitoring System (LPMS) for Nuclear Power Plants, a new hybrid method combining Linear Predictive Coding (LPC) and a Support Vector Machine (SVM) to discriminate loose part signals is proposed. The alarm process is divided into two stages. The first stage detects the weak burst signal to reduce the missed detection rate: the signal is whitened to improve the SNR, and the weak burst signal can then be detected by checking the short-term Root Mean Square (RMS) of the whitened signal. The second stage identifies the detected burst signal to reduce the false alarm rate: taking the signal's LPC coefficients as its characteristics, an SVM is used to determine whether the signal was generated by the impact of a loose part. Experiments show that whitening the signal in the first stage can detect a loose part burst signal even at very low SNR and thus significantly reduces the missed detection rate. In the second stage, loose part burst signals can be distinguished from pulse disturbances using the SVM. Even when the SNR is −15 dB, the system can still achieve a 100% recognition rate.
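The two-stage logic can be sketched in a few lines of Python; the whitening filter (first-order differencing here), window size, threshold, and LPC order are simplifying assumptions rather than the paper's exact settings.

import numpy as np
from scipy.linalg import solve_toeplitz
from sklearn.svm import SVC

def lpc_coeffs(signal, order=10):
    """Autocorrelation-method LPC via the Yule-Walker normal equations."""
    r = np.correlate(signal, signal, mode="full")[len(signal) - 1:]
    return solve_toeplitz((r[:order], r[:order]), r[1:order + 1])

def stage1_detect(signal, win=128, k=4.0):
    """Stage 1: whiten (first-order difference as a crude whitener), then
    flag windows whose short-term RMS exceeds k times the median RMS."""
    white = np.diff(signal)
    n = len(white) // win
    rms = np.sqrt((white[:n * win].reshape(n, win) ** 2).mean(axis=1))
    return np.nonzero(rms > k * np.median(rms))[0]

# Stage 2: an SVM over LPC features separates loose-part impacts from
# pulse-like disturbances (training data omitted in this sketch).
clf = SVC(kernel="rbf")
# clf.fit(np.array([lpc_coeffs(s) for s in burst_examples]), labels)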
Cascaded Deep Networks with Multiple Receptive Fields for Infrared Image Super-Resolution
Infrared images have a wide range of military and civilian applications, including night vision, surveillance, and robotics. However, high-resolution infrared detectors are difficult to fabricate and expensive to manufacture. In this work, we present a cascaded architecture of deep neural networks with multiple receptive fields to increase the spatial resolution of infrared images by a large scale factor (×8). Instead of reconstructing a high-resolution image from its low-resolution version using a single complex deep network, the key idea of our approach is to set up a mid-point (scale ×2) between scale ×1 and ×8 such that the lost information can be divided into two components. Lost information within each component contains similar patterns and thus can be more accurately recovered even using a simpler deep network. In our proposed cascaded architecture, two consecutive deep networks with different receptive fields are jointly trained through a multi-scale loss function. The first network, with a large receptive field, is applied to recover large-scale structure information, while the second one uses a relatively smaller receptive field to reconstruct small-scale image details. Our proposed method is systematically evaluated using realistic infrared images. Compared with state-of-the-art Super-Resolution methods, our proposed cascaded approach achieves improved reconstruction accuracy using significantly fewer parameters.
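The cascade can be sketched as two pixel-shuffle upsamplers trained with a multi-scale loss, where a wide first convolution stands in for the large receptive field and a narrow one for the small; all layer choices are illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

def upsampler(scale, ksize, channels=32):
    """Single-channel super-resolution block with a PixelShuffle output."""
    return nn.Sequential(
        nn.Conv2d(1, channels, ksize, padding=ksize // 2), nn.ReLU(),
        nn.Conv2d(channels, scale * scale, 3, padding=1),
        nn.PixelShuffle(scale),
    )

net_x2 = upsampler(2, ksize=9)  # larger receptive field: coarse structure
net_x4 = upsampler(4, ksize=3)  # smaller receptive field: fine details

def multi_scale_loss(lr, hr):
    mid = net_x2(lr)   # x2 mid-point prediction
    sr = net_x4(mid)   # x8 total output
    hr_mid = F.interpolate(hr, size=mid.shape[-2:], mode="bicubic")
    return F.l1_loss(mid, hr_mid) + F.l1_loss(sr, hr)

loss = multi_scale_loss(torch.randn(1, 1, 16, 16), torch.randn(1, 1, 128, 128))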
Study on Resource Configuration on Cloud Manufacturing
The purpose of manufacturing is to realize the requirements of customers. In the manufacturing process of a cloud system, there exist many resource services with similar functional characteristics that can realize a given requirement, which makes the manufacturing process more diverse. To improve quality and reduce cost, a resource configuration model on a cloud manufacturing platform is put forward in this paper. According to the generalized six-point location principle, a growth design from customer requirements to entities with geometric constraints is proposed. As the requirement grows into a product, a configuration process is used to match the entities with the instances that the resources in the database can supply. Different from most existing studies, this paper studies tolerance design with multiple candidate resource suppliers on a cloud manufacturing platform, formulating the market as a two-level game that considers both the benefit of customers and the profit of resource suppliers to produce an optimal result. A numerical case study is used to illustrate the proposed model and configuration process. The performance and advantages of the proposed method are discussed at the end.
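As a loose illustration of the configuration step, the toy Python below matches each required part to one of several functionally similar suppliers by trading off tolerance quality against price; the data, the scoring rule, and the single-objective shortcut for the paper's two-level game are all invented for illustration.

candidates = {
    "shaft": [{"supplier": "A", "tolerance_um": 5, "price": 120},
              {"supplier": "B", "tolerance_um": 12, "price": 80}],
    "housing": [{"supplier": "C", "tolerance_um": 8, "price": 150},
                {"supplier": "D", "tolerance_um": 20, "price": 90}],
}

def configure(candidates, required_tol_um=15, w_quality=0.6):
    """Pick, per part, the feasible offer with the best quality/cost score.
    (The paper instead solves a two-level game between customer benefit
    and supplier profit; this weighted score is a crude stand-in.)"""
    config = {}
    for part, offers in candidates.items():
        feasible = [o for o in offers if o["tolerance_um"] <= required_tol_um]
        config[part] = max(
            feasible,
            key=lambda o: w_quality / o["tolerance_um"]
                          - (1 - w_quality) * o["price"] / 100.0,
        )
    return config

print(configure(candidates))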