
    Box-level Segmentation Supervised Deep Neural Networks for Accurate and Real-time Multispectral Pedestrian Detection

    Effective fusion of complementary information captured by multi-modal sensors (visible and infrared cameras) enables robust pedestrian detection under various surveillance situations (e.g. daytime and nighttime). In this paper, we present a novel box-level segmentation supervised learning framework for accurate and real-time multispectral pedestrian detection that incorporates features extracted from the visible and infrared channels. Specifically, our method takes pairs of aligned visible and infrared images with easily obtained bounding box annotations as input and estimates accurate prediction maps to highlight the existence of pedestrians. It offers two major advantages over existing anchor-box-based multispectral detection methods. First, it overcomes the hyperparameter-setting problem that arises during the training phase of anchor-box-based detectors and obtains more accurate detection results, especially for small and occluded pedestrian instances. Second, it can generate accurate detection results from small input images, improving computational efficiency for real-time autonomous driving applications. Experimental results on the KAIST multispectral dataset show that our proposed method outperforms state-of-the-art approaches in terms of both accuracy and speed.
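    To make the box-level supervision idea concrete, here is a minimal PyTorch sketch: a hypothetical two-stream network fuses visible and infrared features and predicts a per-pixel pedestrian map, trained against coarse masks rasterized from the bounding boxes. The layer sizes, fusion scheme, and mask construction are illustrative assumptions, not the paper's architecture.

```python
# Minimal sketch of box-level segmentation supervision for multispectral
# pedestrian detection. Illustrative only; not the paper's exact network.
import torch
import torch.nn as nn

class TwoStreamFusionNet(nn.Module):
    """Encode visible (RGB) and infrared (IR) images separately, concatenate
    the features, and predict a one-channel pedestrian confidence map."""
    def __init__(self):
        super().__init__()
        def encoder(in_ch):
            return nn.Sequential(
                nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(inplace=True))
        self.rgb_enc = encoder(3)          # visible stream
        self.ir_enc = encoder(1)           # infrared stream
        self.head = nn.Conv2d(128, 1, 1)   # fused features -> logits map

    def forward(self, rgb, ir):
        fused = torch.cat([self.rgb_enc(rgb), self.ir_enc(ir)], dim=1)
        return self.head(fused)

def boxes_to_mask(boxes, h, w):
    """Rasterize bounding boxes into a coarse segmentation target: every
    pixel inside any box is positive. This is the 'box-level' supervision."""
    mask = torch.zeros(1, h, w)
    for x1, y1, x2, y2 in boxes:
        mask[0, y1:y2, x1:x2] = 1.0
    return mask

# One training step on dummy aligned image pairs (hypothetical shapes).
net = TwoStreamFusionNet()
rgb = torch.rand(1, 3, 128, 160)
ir = torch.rand(1, 1, 128, 160)
target = boxes_to_mask([(40, 30, 70, 110)], 128, 160).unsqueeze(0)
loss = nn.BCEWithLogitsLoss()(net(rgb, ir), target)
loss.backward()
```

    At test time, pedestrian locations can be read off the prediction map directly, which is how this formulation sidesteps anchor-box hyperparameters.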

    Multitemporal Very High Resolution from Space: Outcome of the 2016 IEEE GRSS Data Fusion Contest

    In this paper, the scientific outcomes of the 2016 Data Fusion Contest organized by the Image Analysis and Data Fusion Technical Committee of the IEEE Geoscience and Remote Sensing Society are discussed. The 2016 Contest was an open-topic competition based on a multitemporal and multimodal dataset, which included a temporal pair of very high resolution panchromatic and multispectral Deimos-2 images and a video captured by the Iris camera on board the International Space Station. The problems addressed and the techniques proposed by the participants in the Contest spanned a rather broad range of topics, mixing ideas and methodologies from remote sensing, video processing, and computer vision. In particular, the winning team developed a deep learning method to jointly address spatial scene labeling and temporal activity modeling using the available image and video data. The second-place team proposed a random field model to simultaneously perform coregistration of multitemporal data, semantic segmentation, and change detection. The methodological key ideas of both approaches and the main results of the corresponding experimental validation are discussed in this paper.

    Multidimensional Optical Sensing and Imaging Systems (MOSIS): From Macro to Micro Scales

    Multidimensional optical imaging systems for information processing and visualization technologies have numerous applications in fields such as manufacturing, medical sciences, entertainment, robotics, surveillance, and defense. Among different three-dimensional (3-D) imaging methods, integral imaging is a promising multiperspective sensing and display technique. Compared with other 3-D imaging techniques, integral imaging can capture a scene using an incoherent light source and generate real 3-D images for observation without any special viewing devices. This review paper describes passive multidimensional imaging systems combined with different integral imaging configurations. One example is the integral-imaging-based multidimensional optical sensing and imaging systems (MOSIS), which can be used for 3-D visualization, seeing through obscurations, material inspection, and object recognition from micro scales to long-range imaging. This system exploits many degrees of freedom, such as time and space multiplexing, depth information, and polarimetric, temporal, photon-flux, and multispectral information, based on integral imaging to record and reconstruct the multidimensionally integrated scene. Image fusion may be used to integrate the multidimensional images obtained by polarimetric sensors, multispectral cameras, and various multiplexing techniques. The multidimensional images contain substantially more information than two-dimensional (2-D) images or conventional 3-D images. In addition, we present recent progress and applications of 3-D integral imaging, including human gesture recognition in the time domain, depth estimation, mid-wave-infrared photon counting, 3-D polarimetric imaging for object shape and material identification, dynamic integral imaging implemented with liquid-crystal devices, and 3-D endoscopy for healthcare applications.

    B. Javidi wishes to acknowledge support by the National Science Foundation (NSF) under Grant NSF/IIS-1422179, and by DARPA and the US Army under contract number W911NF-13-1-0485. The work of P. Latorre Carmona, A. Martínez-Uso, J. M. Sotoca and F. Pla was supported by the Spanish Ministry of Economy under the project ESP2013-48458-C4-3-P, by MICINN under the project MTM2013-48371-C2-2-PDGI, by the Generalitat Valenciana under the project PROMETEO-II/2014/062, and by Universitat Jaume I through project P11B2014-09. The work of M. Martínez-Corral and G. Saavedra was supported by the Spanish Ministry of Economy and Competitiveness under grant DPI2015-66458-C2-1R, and by the Generalitat Valenciana, Spain under the project PROMETEOII/2014/072.
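    A standard computational reconstruction for integral imaging is synthetic-aperture shift-and-sum: elemental images are shifted in proportion to their camera offset and averaged, so objects at the chosen depth add up coherently while everything else blurs out. The sketch below assumes a regular camera array and folds the depth-dependent geometry into a single per-camera pixel shift; it is a simplification for illustration, not the MOSIS pipeline.

```python
# Shift-and-sum reconstruction of one depth plane from an integral-imaging
# elemental image array (grayscale). The per-camera pixel shift stands in
# for the usual depth-dependent geometry and is an assumption here.
import numpy as np

def reconstruct_depth_plane(elemental, pitch_px):
    """elemental: array of shape (rows, cols, H, W) holding the elemental
    images; pitch_px: pixel shift per camera index for the target depth."""
    rows, cols, H, W = elemental.shape
    out = np.zeros((H, W))
    for r in range(rows):
        for c in range(cols):
            # Shift each elemental image opposite to its camera offset.
            out += np.roll(elemental[r, c],
                           shift=(-r * pitch_px, -c * pitch_px),
                           axis=(0, 1))
    return out / (rows * cols)   # in-focus depth adds up; the rest blurs

# Synthetic example: a 3x3 array of 64x64 elemental images.
ei = np.random.rand(3, 3, 64, 64)
refocused = reconstruct_depth_plane(ei, pitch_px=2)
```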

    Viewpoint-free Video Synthesis with an Integrated 4D System

    In this paper, we introduce a comprehensive approach to the 4D reconstruction of dynamic scenarios containing multiple walking pedestrians. The input of the process is a point cloud sequence recorded by a rotating multi-beam Lidar sensor, which monitors the scene from a fixed position. The output is a geometrically reconstructed and textured scene containing moving 4D people models, which can follow in real time the trajectories of the walking pedestrians observed in the Lidar data flow. Our implemented system consists of four main steps. First, we separate foreground and background regions in each point cloud frame of the sequence with a robust probabilistic approach. Second, we perform moving pedestrian detection and tracking: among the point cloud regions classified as foreground, we separate the different objects and associate the corresponding people positions with each other over the consecutive frames of the Lidar measurement sequence. Third, we geometrically reconstruct the ground, walls, and further objects of the background scene, and texture the obtained models with photos taken of the scene. Fourth, we insert into the scene textured 4D models of moving pedestrians, which were created beforehand in a special 4D reconstruction studio. Finally, we integrate the system elements into a joint dynamic scene model and visualize the 4D scenario.
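    As a rough illustration of the first step, a static rotating Lidar can be modeled per (beam, azimuth-bin) cell of its range image: a running estimate of the background range flags returns that come significantly closer as foreground. This is a simplified stand-in for the paper's robust probabilistic model; the sensor resolution and thresholds below are assumptions.

```python
# Simplified background subtraction for a fixed rotating multi-beam Lidar:
# maintain a per-cell background range estimate over the range image and
# mark returns that are much closer than the background as foreground.
import numpy as np

N_BEAMS, N_AZIMUTH = 64, 1800   # hypothetical sensor resolution

def update_background(bg_range, frame_range, alpha=0.02):
    """Exponential running estimate of the static background range."""
    valid = np.isfinite(frame_range)
    bg_range[valid] = (1 - alpha) * bg_range[valid] + alpha * frame_range[valid]
    return bg_range

def foreground_mask(bg_range, frame_range, margin=0.5):
    """A return is foreground if it is at least `margin` meters closer
    than the learned background surface in its cell."""
    return np.isfinite(frame_range) & (frame_range < bg_range - margin)

# Usage with synthetic range images (meters).
bg = np.full((N_BEAMS, N_AZIMUTH), 30.0)
frame = 30.0 + np.random.randn(N_BEAMS, N_AZIMUTH) * 0.05
frame[20:30, 400:420] = 8.0     # a pedestrian-like close object
bg = update_background(bg, frame)
fg = foreground_mask(bg, frame)
```

    Foreground cells would then be clustered into candidate pedestrians and tracked across frames, as in the second step.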

    Palmprint Gender Classification Using Deep Learning Methods

    Gender identification is an important technique that can improve the performance of authentication systems by reducing the search space and speeding up the matching process. Several biometric traits have been used to ascertain human gender. Among them, the human palmprint possesses several discriminating features, such as principal lines, wrinkles, ridges, and minutiae, that offer cues for gender identification. The goal of this work is to develop novel deep-learning techniques to determine gender from palmprint images. The PolyU and CASIA palmprint databases, with 90,000 and 5,502 images respectively, were used for training and testing purposes in this research. After ROI extraction and data augmentation were performed, various convolutional and deep learning-based classification approaches were empirically designed, optimized, and tested. Gender classification accuracy as high as 94.87% was achieved on the PolyU palmprint database and 90.70% on the CASIA palmprint database. Optimal performance was achieved by combining two different pre-trained and fine-tuned deep CNNs (VGGNet and DenseNet) through score-level average fusion. In addition, Gradient-weighted Class Activation Mapping (Grad-CAM) was implemented to ascertain which specific regions of the palmprint are most discriminative for gender classification.
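    Score-level average fusion of the two backbones can be sketched with torchvision models as follows: each fine-tuned network produces softmax scores over the two gender classes, and the scores are averaged before the final decision. The two-way heads and the 224x224 input size are assumptions; the paper's exact fine-tuning recipe is not reproduced here.

```python
# Score-level average fusion of two fine-tuned CNNs for two-class (gender)
# palmprint classification. Head sizes and input shape are assumptions.
import torch
import torch.nn as nn
from torchvision import models

# ImageNet-pretrained backbones with their classifiers replaced by a
# two-way output layer (to be fine-tuned on palmprint ROIs).
vgg = models.vgg16(weights=models.VGG16_Weights.DEFAULT)
vgg.classifier[6] = nn.Linear(4096, 2)
dense = models.densenet121(weights=models.DenseNet121_Weights.DEFAULT)
dense.classifier = nn.Linear(1024, 2)

def fused_prediction(roi_batch):
    """Average the two networks' softmax scores, then take the argmax."""
    with torch.no_grad():
        p1 = torch.softmax(vgg(roi_batch), dim=1)
        p2 = torch.softmax(dense(roi_batch), dim=1)
    return ((p1 + p2) / 2).argmax(dim=1)

# Dummy batch of four 224x224 palmprint ROIs.
pred = fused_prediction(torch.rand(4, 3, 224, 224))
```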

    Integration, Testing, And Analysis Of Multispectral Imager On Small Unmanned Aerial System For Skin Detection

    Small Unmanned Aerial Systems (SUAS) have been utilized by the military, geological researchers, and first responders to provide information about the environment in real time. Hyperspectral Imagery (HSI) provides high-resolution data in the spatial and spectral dimensions; all objects, including skin, have unique spectral signatures. However, little research has been done to integrate HSI into SUAS due to its cost and form factor. Multispectral Imagery (MSI) has proven capable of dismount detection with several distinct wavelengths. This research proposes a spectral imaging system for SUAS that can detect dismounts, and explores factors that pertain to accurate dismount detection with an SUAS. Dismount skin detection from an aerial platform also has inherent difficulties compared to ground-based platforms. Computer vision registration, stereo camera calibration, and geolocation from autopilot telemetry are utilized to design a dismount detection platform following the Systems Engineering methodology. An average 5.112% difference in ROC AUC values was recorded when comparing a line-scan spectral imager to the prototype area-scan imager. Results indicated that SUAS-based spectral imagers are capable tools in dismount detection protocols. Deficiencies associated with the test-expedient prototype are discussed and recommendations for further improvements are provided.
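    Multispectral skin detection is often implemented as a normalized-difference index over two near-infrared bands where skin reflectance drops sharply (commonly cited around 1080 nm and 1580 nm), scored per pixel and evaluated with ROC AUC as in the comparison above. The band choice and threshold-free scoring below come from the general skin-detection literature and are assumptions, not this thesis's configuration.

```python
# Normalized-difference skin index over two NIR bands, evaluated with the
# ROC AUC metric used to compare imagers. Band choice is an assumption.
import numpy as np
from sklearn.metrics import roc_auc_score

def skin_index(band_a, band_b, eps=1e-6):
    """Normalized difference of two bands; higher values suggest skin when
    band_a is a high-reflectance skin band and band_b a low one."""
    return (band_a - band_b) / (band_a + band_b + eps)

# Synthetic example: score every pixel, then evaluate against ground truth.
h, w = 120, 160
b1080 = np.random.rand(h, w)          # stand-in for the ~1080 nm band
b1580 = np.random.rand(h, w)          # stand-in for the ~1580 nm band
truth = np.zeros((h, w), dtype=int)
truth[40:60, 70:90] = 1               # labeled skin pixels
scores = skin_index(b1080, b1580)
auc = roc_auc_score(truth.ravel(), scores.ravel())
```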

    Deep visible and thermal image fusion for enhanced pedestrian visibility

    Reliable vision in challenging illumination conditions is one of the crucial requirements of future autonomous automotive systems. In the last decade, thermal cameras have become more easily accessible to a larger number of researchers. This has resulted in numerous studies which confirmed the benefits of thermal cameras in limited-visibility conditions. In this paper, we propose a learning-based method for visible and thermal image fusion that focuses on generating fused images with high visual similarity to regular true-color (red-green-blue or RGB) images, while introducing new informative details in pedestrian regions. The goal is to create natural, intuitive images that would be more informative to a human driver than a regular RGB camera in challenging visibility conditions. The main novelty of this paper is the idea of relying on two types of objective functions for optimization: a similarity metric between the RGB input and the fused output to achieve a natural image appearance, and an auxiliary pedestrian detection error to help define relevant features of the human appearance and blend them into the output. We train a convolutional neural network using image samples from variable conditions (day and night) so that the network learns the appearance of humans in the different modalities and creates more robust results applicable in realistic situations. Our experiments show that the visibility of pedestrians is noticeably improved, especially in dark regions and at night. Compared to existing methods, we can better learn context and define fusion rules that focus on the pedestrian appearance, which is not guaranteed with methods that focus on low-level image quality metrics.
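    The dual objective can be sketched as a weighted sum of an appearance term and a detection term. The L1 similarity metric and the per-pixel detection target below are stand-ins chosen for illustration; the paper's exact similarity metric and detector are not reproduced here.

```python
# Sketch of the paper's dual objective: keep the fused image visually close
# to the RGB input while penalizing pedestrian detection error on the fused
# output. The L1 similarity term and per-pixel detection maps are stand-ins.
import torch
import torch.nn.functional as F

def fusion_loss(fused, rgb, det_logits, det_target, lam=0.1):
    """fused, rgb: (B, 3, H, W) images. det_logits/det_target: (B, 1, H, W)
    pedestrian maps produced by a detector run on the fused image."""
    similarity = F.l1_loss(fused, rgb)                       # natural look
    detection = F.binary_cross_entropy_with_logits(
        det_logits, det_target)                              # pedestrian cue
    return similarity + lam * detection

# Usage with dummy tensors.
B, H, W = 2, 64, 64
loss = fusion_loss(torch.rand(B, 3, H, W), torch.rand(B, 3, H, W),
                   torch.randn(B, 1, H, W),
                   torch.randint(0, 2, (B, 1, H, W)).float())
```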