
    Robust Fusion of LiDAR and Wide-Angle Camera Data for Autonomous Mobile Robots

    Autonomous robots that assist humans in day-to-day living tasks are becoming increasingly popular. Autonomous mobile robots operate by sensing and perceiving their surrounding environment to make accurate driving decisions. A combination of several different sensors, such as LiDAR, radar, ultrasound sensors, and cameras, is used to sense the surrounding environment of autonomous vehicles. These heterogeneous sensors simultaneously capture various physical attributes of the environment. Such multimodality and redundancy of sensing need to be exploited for reliable and consistent perception of the environment through sensor data fusion. However, these multimodal sensor data streams differ from each other in many ways, such as temporal and spatial resolution, data format, and geometric alignment. For the subsequent perception algorithms to exploit the diversity offered by multimodal sensing, the data streams need to be spatially, geometrically, and temporally aligned with each other. In this paper, we address the problem of fusing the outputs of a Light Detection and Ranging (LiDAR) scanner and a wide-angle monocular image sensor for free-space detection. The outputs of the LiDAR scanner and the image sensor have different spatial resolutions and need to be aligned with each other. A geometric model is used to spatially align the two sensor outputs, followed by a Gaussian Process (GP) regression-based resolution matching algorithm that interpolates the missing data with quantifiable uncertainty. The results indicate that the proposed sensor data fusion framework significantly aids the subsequent perception steps, as illustrated by the performance improvement of an uncertainty-aware free-space detection algorithm.
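
    The abstract outlines a two-stage pipeline: geometric alignment of the two sensor outputs, then GP regression to upsample the sparse LiDAR data with quantified uncertainty. As a rough sketch of the second stage only (not the authors' implementation), the snippet below interpolates projected LiDAR depths over image pixels with scikit-learn; the kernel choice, hyperparameters, and function name are assumptions.

        # Hypothetical sketch of GP-based resolution matching: sparse LiDAR
        # depths projected into the image plane are interpolated to arbitrary
        # pixels, with a predictive standard deviation quantifying uncertainty.
        import numpy as np
        from sklearn.gaussian_process import GaussianProcessRegressor
        from sklearn.gaussian_process.kernels import RBF, WhiteKernel

        def densify_lidar(pixels_uv, depths, query_uv):
            """Interpolate sparse projected LiDAR depths onto query pixels.

            pixels_uv : (N, 2) image coordinates of projected LiDAR returns
            depths    : (N,)   measured ranges at those coordinates
            query_uv  : (M, 2) pixels where depth is missing
            Returns interpolated depths and a per-pixel standard deviation.
            """
            kernel = RBF(length_scale=15.0) + WhiteKernel(noise_level=0.05)
            gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
            gp.fit(pixels_uv, depths)
            return gp.predict(query_uv, return_std=True)

    The returned standard deviation is what makes the interpolation useful downstream: pixels far from any LiDAR return carry high uncertainty, which an uncertainty-aware free-space detector can take into account.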

    DPDnet: A Robust People Detector using Deep Learning with an Overhead Depth Camera

    In this paper we propose a deep learning-based method that detects multiple people from a single overhead depth image with high reliability. Our neural network, called DPDnet, consists of two fully convolutional encoder-decoder blocks built from residual layers. The Main Block takes a depth image as input and generates a pixel-wise confidence map, where each detected person in the image is represented by a Gaussian-like distribution. The Refinement Block combines the depth image with the output of the Main Block to refine the confidence map. Both blocks are trained simultaneously, end-to-end, using depth images and head position labels. The experimental work shows that DPDnet outperforms state-of-the-art methods, with accuracies greater than 99% on three different publicly available datasets, without retraining or fine-tuning. In addition, the computational cost of our method is independent of the number of people in the scene, and the system runs in real time on conventional GPUs.
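
    For illustration, the confidence-map representation described above can be rendered directly: one Gaussian-like blob per person, centered on the head position. This is a hypothetical sketch of the kind of training target the network regresses, not the network itself; the image size, head positions, and blob width (sigma) below are made-up values.

        import numpy as np

        def render_confidence_map(shape, head_positions, sigma=10.0):
            """Place a 2D Gaussian-like blob at each (row, col) head position."""
            rows, cols = np.mgrid[0:shape[0], 0:shape[1]]
            conf = np.zeros(shape, dtype=np.float32)
            for r, c in head_positions:
                blob = np.exp(-((rows - r) ** 2 + (cols - c) ** 2)
                              / (2.0 * sigma ** 2))
                conf = np.maximum(conf, blob)  # overlapping blobs keep peak at 1
            return conf

        # Two hypothetical people in a 480x640 overhead depth image.
        target = render_confidence_map((480, 640), [(120, 200), (300, 450)])

    Detections can then be read off such a map by locating its peaks; since the map has a fixed size, this is one way the per-frame cost can stay independent of the number of people in the scene.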

    A robust system for counting people using an infrared sensor and a camera

    In this paper, a multi-modal solution to the people-counting problem in a given area is described. The multi-modal system consists of a differential pyro-electric infrared (PIR) sensor and a camera. Faces in the surveillance area are detected by the camera using cascaded AdaBoost classifiers with the aim of counting people. Because the camera-only system produces imprecise results, an additional differential PIR sensor is integrated with the camera. The PIR sensor distinguishes two types of human motion, (i) entry to and exit from the surveillance area and (ii) ordinary activities within that area, using a Markovian decision algorithm. The wavelet transform of the continuous-time, real-valued signal received from the PIR sensor circuit is used for feature extraction. The wavelet parameters are then fed to a set of Markov models representing the two motion classes, and a test signal is assigned to the class of the model yielding the higher probability. People-counting results produced by the camera are then corrected using the additional information obtained from the PIR sensor signal analysis. With the proof-of-concept system built, it is shown that the multi-modal system reduces the false alarms of the camera-only system and determines the number of people watching a TV set in a more robust manner.
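
    A minimal sketch of the stated decision rule follows, assuming the PyWavelets package, a Haar wavelet, a simple two-state quantization of the coefficients, and given transition matrices; the paper derives these details (and the Markov models themselves) from training data, so every name and value below is illustrative.

        import numpy as np
        import pywt  # PyWavelets

        def wavelet_features(pir_signal, level=3):
            # Detail coefficients of a discrete wavelet transform as features.
            coeffs = pywt.wavedec(pir_signal, 'haar', level=level)
            return np.concatenate(coeffs[1:])

        def log_likelihood(states, trans):
            # Log-probability of a discrete state sequence under a Markov chain
            # (transition probabilities assumed strictly positive).
            return sum(np.log(trans[a, b])
                       for a, b in zip(states[:-1], states[1:]))

        def classify(pir_signal, trans_entry_exit, trans_ordinary, thresh=0.1):
            feats = wavelet_features(pir_signal)
            states = (np.abs(feats) > thresh).astype(int)  # 2-state quantization
            ll_entry = log_likelihood(states, trans_entry_exit)
            ll_ordinary = log_likelihood(states, trans_ordinary)
            return 'entry/exit' if ll_entry > ll_ordinary else 'ordinary activity'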

    Behavioral pedestrian tracking using a camera and LiDAR sensors on a moving vehicle

    In this paper, we present a novel 2D–3D pedestrian tracker designed for applications in autonomous vehicles. The system operates on a tracking-by-detection principle and can track multiple pedestrians in complex urban traffic situations. By using a behavioral motion model and a non-parametric distribution as the state model, we are able to accurately track unpredictable pedestrian motion in the presence of heavy occlusion. Tracking is performed independently on the image plane and the ground plane, in global, motion-compensated coordinates. We employ camera and LiDAR data fusion to solve the association problem, where the optimal solution is found by matching 2D and 3D detections to tracks using a joint log-likelihood observation model. Each 2D–3D particle filter then updates its state from the associated observations and the behavioral motion model. Each particle moves independently, following pedestrian motion parameters that we learned offline from an annotated training dataset. Temporal stability of the state variables is achieved by modeling each track as a Markov Decision Process with probabilistic state transition properties. A novel track management system then handles high-level actions such as track creation, deletion, and interaction. Using a probabilistic track score, the track manager culls false and ambiguous detections while updating tracks with detections from actual pedestrians. Our system is implemented on a GPU and exploits the massively parallelizable nature of particle filters. Due to the Markovian nature of our track representation, the system achieves real-time performance with a minimal memory footprint. Exhaustive and independent evaluation of our tracker was performed by the KITTI benchmark server, where it was tested against a wide variety of unknown pedestrian tracking situations. On this realistic benchmark, we outperform all published pedestrian trackers on a multitude of tracking metrics.
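
    The association step can be sketched as follows, assuming Gaussian observation models and the Hungarian algorithm from SciPy; the abstract states only that 2D and 3D detections are matched to tracks through a joint log-likelihood, so the cost construction here is an illustrative guess rather than the published method.

        import numpy as np
        from scipy.optimize import linear_sum_assignment
        from scipy.stats import multivariate_normal

        def associate(pred_2d, pred_3d, det_2d, det_3d, cov_2d, cov_3d):
            """Match detections to tracks by maximal joint 2D+3D log-likelihood.

            pred_2d: (T, 2) predicted image-plane track positions
            pred_3d: (T, 3) predicted 3D track positions
            det_2d, det_3d: paired 2D/3D detections, one row per detection
            """
            n_tracks, n_dets = len(pred_2d), len(det_2d)
            cost = np.zeros((n_tracks, n_dets))
            for t in range(n_tracks):
                for d in range(n_dets):
                    ll = (multivariate_normal.logpdf(det_2d[d], pred_2d[t], cov_2d)
                          + multivariate_normal.logpdf(det_3d[d], pred_3d[t], cov_3d))
                    cost[t, d] = -ll  # the Hungarian solver minimizes cost
            return linear_sum_assignment(cost)  # (track_idx, det_idx) pairs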