225 research outputs found

    Box-level Segmentation Supervised Deep Neural Networks for Accurate and Real-time Multispectral Pedestrian Detection

    Effective fusion of complementary information captured by multi-modal sensors (visible and infrared cameras) enables robust pedestrian detection under various surveillance situations (e.g. daytime and nighttime). In this paper, we present a novel box-level segmentation supervised learning framework for accurate and real-time multispectral pedestrian detection by incorporating features extracted in visible and infrared channels. Specifically, our method takes pairs of aligned visible and infrared images with easily obtained bounding box annotations as input and estimates accurate prediction maps to highlight the existence of pedestrians. It offers two major advantages over the existing anchor box based multispectral detection methods. Firstly, it overcomes the hyperparameter setting problem that occurs during the training phase of anchor box based detectors and can obtain more accurate detection results, especially for small and occluded pedestrian instances. Secondly, it is capable of generating accurate detection results using small-size input images, leading to improved computational efficiency for real-time autonomous driving applications. Experimental results on the KAIST multispectral dataset show that our proposed method outperforms state-of-the-art approaches in terms of both accuracy and speed.

    Unsupervised Domain Adaptation for Multispectral Pedestrian Detection

    Multimodal information (e.g., visible and thermal) can generate robust pedestrian detections to facilitate around-the-clock computer vision applications, such as autonomous driving and video surveillance. However, it remains a crucial challenge to train a reliable detector that works well across different multispectral pedestrian datasets without manual annotations. In this paper, we propose a novel unsupervised domain adaptation framework for multispectral pedestrian detection, by iteratively generating pseudo annotations and updating the parameters of our designed multispectral pedestrian detector on the target domain. Pseudo annotations are generated using the detector trained on the source domain, and then updated by fixing the parameters of the detector and minimizing the cross entropy loss without back-propagation. Training labels are generated using the pseudo annotations by considering the characteristics of similarity and complementarity between well-aligned visible and infrared image pairs. The parameters of the detector are updated using the generated labels by minimizing our defined multi-detection loss function with back-propagation. The optimal parameters of the detector can be obtained after iteratively updating the pseudo annotations and parameters. Experimental results show that our proposed unsupervised multimodal domain adaptation method achieves significantly higher detection performance than the approach without domain adaptation, and is competitive with the supervised multispectral pedestrian detectors.

    NCP activates chloroplast transcription by controlling phytochrome-dependent dual nuclear and plastidial switches.

    Phytochromes initiate chloroplast biogenesis by activating genes encoding the photosynthetic apparatus, including photosynthesis-associated plastid-encoded genes (PhAPGs). PhAPGs are transcribed by a bacterial-type RNA polymerase (PEP), but how phytochromes in the nucleus activate chloroplast gene expression remains enigmatic. We report here a forward genetic screen in Arabidopsis that identified NUCLEAR CONTROL OF PEP ACTIVITY (NCP) as a necessary component of phytochrome signaling for PhAPG activation. NCP is dual-targeted to plastids and the nucleus. While nuclear NCP mediates the degradation of two repressors of chloroplast biogenesis, PIF1 and PIF3, NCP in plastids promotes the assembly of the PEP complex for PhAPG transcription. NCP and its paralog RCB are non-catalytic thioredoxin-like proteins that diverged in seed plants to adopt nonredundant functions in phytochrome signaling. These results support a model in which phytochromes control PhAPG expression through light-dependent double nuclear and plastidial switches that are linked by evolutionarily conserved and dual-localized regulatory proteins.

    An Alarm Method for a Loose Parts Monitoring System

    To reduce the false alarm rate and missed detection rate of a Loose Parts Monitoring System (LPMS) for nuclear power plants, a new hybrid method that combines Linear Predictive Coding (LPC) and a Support Vector Machine (SVM) to discriminate the loose part signal is proposed. The alarm process is divided into two stages. The first stage detects the weak burst signal to reduce the missed detection rate. The signal is whitened to improve the SNR, and the weak burst signal is then detected by checking the short-term Root Mean Square (RMS) of the whitened signal. The second stage identifies the detected burst signal to reduce the false alarm rate. Taking the signal's LPC coefficients as its characteristics, the SVM is then used to determine whether the signal was generated by the impact of a loose part. The experiment shows that whitening the signal in the first stage makes it possible to detect a loose part burst signal even at very low SNR and thus can significantly reduce the rate of missed detection. In the second alarm stage, the loose parts' burst signal can be distinguished from pulse disturbance by using the SVM. Even when the SNR is −15 dB, the system can still achieve a 100% recognition rate.
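The first alarm stage described above (whiten, then check short-term RMS) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not say how the signal is whitened, so the LPC prediction-error filter used here is an assumption, as are the function names, the LPC order, the window length, and the median-based threshold rule. The SVM stage is not shown.

```python
import numpy as np

def lpc_coefficients(x, order):
    """Autocorrelation-method LPC: solve the Yule-Walker equations R a = r."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Biased autocorrelation estimates at lags 0..order
    r = np.array([np.dot(x[: n - k], x[k:]) for k in range(order + 1)]) / n
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R += 1e-9 * r[0] * np.eye(order)  # light diagonal loading for stability
    return np.linalg.solve(R, r[1:])

def whiten(x, a):
    """Prediction residual e[n] = x[n] - sum_k a[k] * x[n-k]."""
    h = np.concatenate(([1.0], -np.asarray(a)))
    return np.convolve(np.asarray(x, dtype=float), h)[: len(x)]

def short_term_rms(x, window):
    """Moving RMS over a centered window of the given length."""
    kernel = np.ones(window) / window
    return np.sqrt(np.convolve(np.square(x), kernel, mode="same"))

def detect_burst(x, background, order=8, window=32, factor=4.0):
    """Flag samples whose whitened short-term RMS exceeds factor * median RMS.

    `background` is a burst-free recording used to fit the whitening filter
    (an assumption of this sketch).
    """
    a = lpc_coefficients(background, order)
    rms = short_term_rms(whiten(x, a), window)
    return rms > factor * np.median(rms)
```

Whitening flattens the strongly correlated background noise, so a burst that is small relative to the raw signal can stand out clearly in the residual; the threshold factor trades missed detections against false alarms, which is why the abstract adds a second classification stage.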

    Multimodal fusion architectures for pedestrian detection

    Pedestrian detection provides a crucial functionality in many human-centric applications, such as video surveillance, urban scene analysis, and autonomous driving. Recently, multimodal pedestrian detection has received extensive attention since the fusion of complementary information captured by visible and infrared sensors enables robust human target detection under daytime and nighttime scenes. In this chapter, we systematically evaluate the performance of different multimodal fusion architectures in order to identify the optimal solutions for pedestrian detection. We made two important observations: (1) it is useful to combine the most commonly used concatenation fusion scheme with a global scene-aware mechanism to learn both human-related features and correlation between visible and thermal feature maps; (2) the two-stream segmentation supervision without multimodal fusion provides the most effective scheme to infuse segmentation information as supervision for learning human-related features. Based on these studies, we present a unified multimodal fusion framework for joint training of target detection and segmentation supervision which achieves the state-of-the-art multimodal pedestrian detection performance on the public KAIST benchmark dataset.

    Two-stream convolutional neural network for non-destructive subsurface defect detection via similarity comparison of lock-in thermography signals

    Active infrared thermography is a safe, fast, and low-cost solution for subsurface defects inspection, providing quality control in many industrial production tasks. In this paper, we explore deep learning-based approaches to analyze lock-in thermography image sequences for non-destructive testing and evaluation (NDT&E) of subsurface defects. Different from most existing Convolutional Neural Network (CNN) models that directly classify individual regions/pixels as defective and non-defective ones, we present a novel two-stream CNN architecture to extract/compare features in a pair of 1D thermal signal sequences for accurate classification/differentiation of defective and non-defective regions. In this manner, we can significantly increase the size of the training data by pairing two individually captured 1D thermal signals, thereby greatly easing the requirement for collecting a large number of thermal sequences of specimens with defects to train deep CNN models. Moreover, we experimentally investigate a number of network alternatives, identifying the optimal fusion scheme/stage for differentiating the thermal behaviors of defective and non-defective regions. Experimental results demonstrate that our proposed method, directly learning how to construct feature representations from a large number of real-captured thermal signal pairs, outperforms the well-established lock-in thermography data processing techniques on specimens made of different materials and at various excitation frequencies.

    Efficient Visual State Space Model for Image Deblurring

    Convolutional neural networks (CNNs) and Vision Transformers (ViTs) have achieved excellent performance in image restoration. ViTs typically yield superior results in image restoration compared to CNNs due to their ability to capture long-range dependencies and input-dependent characteristics. However, the computational complexity of Transformer-based models grows quadratically with the image resolution, limiting their practical appeal in high-resolution image restoration tasks. In this paper, we propose a simple yet effective visual state space model (EVSSM) for image deblurring, bringing the benefits of state space models (SSMs) to visual data. In contrast to existing methods that employ several fixed-direction scans for feature extraction, which significantly increases the computational cost, we develop an efficient visual scan block that applies various geometric transformations before each SSM-based module, capturing useful non-local information and maintaining high efficiency. Extensive experimental results show that the proposed EVSSM performs favorably against state-of-the-art image deblurring methods on benchmark datasets and real-captured images.

    Cascaded Deep Networks with Multiple Receptive Fields for Infrared Image Super-Resolution

    Infrared images have a wide range of military and civilian applications including night vision, surveillance and robotics. However, high-resolution infrared detectors are difficult to fabricate and expensive to manufacture. In this work, we present a cascaded architecture of deep neural networks with multiple receptive fields to increase the spatial resolution of infrared images by a large scale factor (×8). Instead of reconstructing a high-resolution image from its low-resolution version using a single complex deep network, the key idea of our approach is to set up a mid-point (scale ×2) between scale ×1 and ×8 such that lost information can be divided into two components. Lost information within each component contains similar patterns and thus can be more accurately recovered even using a simpler deep network. In our proposed cascaded architecture, two consecutive deep networks with different receptive fields are jointly trained through a multi-scale loss function. The first network with a large receptive field is applied to recover large-scale structure information, while the second one uses a relatively smaller receptive field to reconstruct small-scale image details. Our proposed method is systematically evaluated using realistic infrared images. Compared with state-of-the-art super-resolution methods, our proposed cascaded approach achieves improved reconstruction accuracy using significantly fewer parameters.