Robust Fusion of LiDAR and Wide-Angle Camera Data for Autonomous Mobile Robots
Autonomous robots that assist humans in day-to-day living tasks are becoming
increasingly popular. Autonomous mobile robots operate by sensing and
perceiving their surrounding environment to make accurate driving decisions. A
combination of several different sensors such as LiDAR, radar, ultrasound
sensors and cameras are utilized to sense the surrounding environment of
autonomous vehicles. These heterogeneous sensors simultaneously capture various
physical attributes of the environment. Such multimodality and redundancy of
sensing need to be positively utilized for reliable and consistent perception
of the environment through sensor data fusion. However, these multimodal sensor
data streams are different from each other in many ways, such as temporal and
spatial resolution, data format, and geometric alignment. For the subsequent
perception algorithms to utilize the diversity offered by multimodal sensing,
the data streams need to be spatially, geometrically and temporally aligned
with each other. In this paper, we address the problem of fusing the outputs of
a Light Detection and Ranging (LiDAR) scanner and a wide-angle monocular image
sensor for free space detection. The outputs of the LiDAR scanner and the image
sensor have different spatial resolutions and need to be aligned with each
other. A geometrical model is used to spatially align the two sensor outputs,
followed by a Gaussian Process (GP) regression-based resolution matching
algorithm to interpolate the missing data with quantifiable uncertainty. The
results indicate that the proposed sensor data fusion framework significantly
aids the subsequent perception steps, as illustrated by the performance
improvement of an uncertainty-aware free space detection algorithm.
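The GP regression-based resolution matching step can be illustrated with a minimal one-dimensional sketch: sparse LiDAR range samples along a scan line are interpolated onto a dense pixel grid, and the GP posterior variance quantifies the uncertainty of each interpolated value. All variable names, kernel parameters and sample values below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=2.0, variance=1.0):
    """Squared-exponential covariance between 1-D sample locations a and b."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_interpolate(x_train, y_train, x_query, noise=1e-2):
    """GP posterior mean and variance at x_query given sparse noisy samples."""
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = rbf_kernel(x_query, x_train)
    K_ss = rbf_kernel(x_query, x_query)
    alpha = np.linalg.solve(K, y_train)          # weights for the posterior mean
    mean = K_s @ alpha
    v = np.linalg.solve(K, K_s.T)
    var = np.diag(K_ss - K_s @ v)                # per-point posterior variance
    return mean, var

# Sparse LiDAR returns along one scan line (hypothetical ranges in metres)
x_sparse = np.array([0.0, 2.0, 4.0, 6.0, 8.0])
r_sparse = np.array([5.1, 5.0, 4.2, 3.9, 4.0])
# Dense query grid standing in for the camera's higher horizontal resolution
x_dense = np.linspace(0.0, 8.0, 33)
mean, var = gp_interpolate(x_sparse, r_sparse, x_dense)
```

The variance is small near the measured LiDAR returns and grows between them, which is what lets a downstream free-space detector treat interpolated ranges with appropriate caution.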
DPDnet: A Robust People Detector using Deep Learning with an Overhead Depth Camera
In this paper we propose a method based on deep learning that detects
multiple people from a single overhead depth image with high reliability. Our
neural network, called DPDnet, is based on two fully-convolutional
encoder-decoder neural blocks based on residual layers. The Main Block takes a
depth image as input and generates a pixel-wise confidence map, where each
detected person in the image is represented by a Gaussian-like distribution.
The refinement block combines the depth image and the output from the main
block, to refine the confidence map. Both blocks are simultaneously trained
end-to-end using depth images and head position labels. The experimental work
shows that DPDnet outperforms state-of-the-art methods, with accuracies greater
than 99% on three different publicly available datasets, without retraining or
fine-tuning. In addition, the computational complexity of our proposal is
independent of the number of people in the scene and runs in real time using
conventional GPUs.
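The confidence-map representation itself (independent of the network that predicts it) can be sketched as follows: each annotated head position is rendered as a Gaussian-like peak, and detections are recovered as local maxima above a threshold. The map size, sigma and threshold are arbitrary assumptions for illustration.

```python
import numpy as np

def confidence_map(shape, heads, sigma=3.0):
    """Render a Gaussian-like peak at each (row, col) head position."""
    h, w = shape
    yy, xx = np.mgrid[0:h, 0:w]
    cmap = np.zeros(shape)
    for (cy, cx) in heads:
        peak = np.exp(-((yy - cy) ** 2 + (xx - cx) ** 2) / (2 * sigma ** 2))
        cmap = np.maximum(cmap, peak)   # overlapping people keep their own peak
    return cmap

def detect_peaks(cmap, thresh=0.5):
    """Recover head positions as strict local maxima above a threshold."""
    peaks = []
    h, w = cmap.shape
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            patch = cmap[y - 1:y + 2, x - 1:x + 2]
            if (cmap[y, x] >= thresh and cmap[y, x] == patch.max()
                    and np.count_nonzero(patch == patch.max()) == 1):
                peaks.append((y, x))
    return peaks

heads = [(10, 12), (25, 30)]
cmap = confidence_map((40, 40), heads)
print(detect_peaks(cmap))  # → [(10, 12), (25, 30)]
```

Because detections are read off as map maxima, the post-processing cost depends on the map size rather than on the number of people, consistent with the complexity claim above.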
A robust system for counting people using an infrared sensor and a camera
In this paper, a multi-modal solution to the people counting problem in a given area is described. The multi-modal system consists of a differential pyro-electric infrared (PIR) sensor and a camera. Faces in the surveillance area are detected by the camera using cascaded AdaBoost classifiers with the aim of counting people. Because the camera-only system produces imprecise results, an additional differential PIR sensor is integrated with the camera. Two types of human motion are distinguished by the PIR sensor using a Markovian decision algorithm: (i) entry to and exit from the surveillance area, and (ii) ordinary activities within that area. The wavelet transform of the continuous-time real-valued signal received from the PIR sensor circuit is used for feature extraction. The wavelet parameters are then fed to a set of Markov models representing the two motion classes, and a test signal is assigned to the class of the model yielding the higher probability. People counting results produced by the camera are then corrected using the additional information obtained from the PIR sensor signal analysis. With the proof of concept built, it is shown that the multi-modal system reduces the false alarms of the camera-only system and determines the number of people watching a TV set more robustly.
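The wavelet-plus-Markov classification pipeline can be sketched in a few lines: Haar wavelet detail coefficients of the PIR signal are quantized into a symbol sequence, the sequence is scored under one Markov chain per motion class, and the higher log-likelihood wins. The transition matrices, threshold and signals below are made-up stand-ins for the parameters the paper learns from training data.

```python
import numpy as np

def haar_detail(signal):
    """Single-level Haar wavelet detail coefficients of an even-length signal."""
    s = np.asarray(signal, dtype=float)
    return (s[0::2] - s[1::2]) / np.sqrt(2.0)

def quantize(coeffs, thresh=0.5):
    """Symbol 1 where a coefficient is large in magnitude, 0 elsewhere."""
    return (np.abs(coeffs) > thresh).astype(int)

def markov_loglik(symbols, trans):
    """Log-likelihood of a symbol sequence under a 2-state Markov chain."""
    ll = 0.0
    for a, b in zip(symbols[:-1], symbols[1:]):
        ll += np.log(trans[a, b])
    return ll

# Hypothetical transition matrices for the two motion classes: entries/exits
# produce bursts of large wavelet coefficients, ordinary activity does not.
T_entry = np.array([[0.4, 0.6], [0.3, 0.7]])
T_activity = np.array([[0.9, 0.1], [0.8, 0.2]])

def classify(signal):
    symbols = quantize(haar_detail(signal))
    lle = markov_loglik(symbols, T_entry)
    lla = markov_loglik(symbols, T_activity)
    return "entry/exit" if lle > lla else "activity"

print(classify([0, 3, 0, -3, 0, 3, 0, -3]))              # → entry/exit
print(classify([0.1, 0.0, 0.2, 0.1, 0.0, 0.1, 0.2, 0.1]))  # → activity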
Behavioral pedestrian tracking using a camera and lidar sensors on a moving vehicle
In this paper, we present a novel 2D–3D pedestrian tracker designed for applications in autonomous vehicles. The system operates on a tracking-by-detection principle and can track multiple pedestrians in complex urban traffic situations. By using a behavioral motion model and a non-parametric distribution as the state model, we are able to accurately track unpredictable pedestrian motion in the presence of heavy occlusion. Tracking is performed independently on the image plane and the ground plane, in global, motion-compensated coordinates. We employ camera and LiDAR data fusion to solve the association problem, where the optimal solution is found by matching 2D and 3D detections to tracks using a joint log-likelihood observation model. Each 2D–3D particle filter then updates its state from the associated observations and a behavioral motion model. Each particle moves independently, following pedestrian motion parameters learned offline from an annotated training dataset. Temporal stability of the state variables is achieved by modeling each track as a Markov Decision Process with probabilistic state transition properties. A novel track management system then handles high-level actions such as track creation, deletion and interaction. Using a probabilistic track score, the track manager can cull false and ambiguous detections while updating tracks with detections from actual pedestrians. Our system is implemented on a GPU and exploits the massively parallelizable nature of particle filters. Due to the Markovian nature of our track representation, the system achieves real-time performance with a minimal memory footprint. Exhaustive and independent evaluation of our tracker was performed by the KITTI benchmark server, where it was tested against a wide variety of unknown pedestrian tracking situations. On this realistic benchmark, we outperform all published pedestrian trackers in a multitude of tracking metrics.
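The joint log-likelihood association idea can be sketched as follows: each track carries a predicted image-plane position and a predicted ground-plane position, each paired 2D/3D detection is scored against every track under independent Gaussian observation models, and assignments are made from the best-scoring pairs. For brevity this sketch uses a greedy assignment with a gating threshold rather than the optimal matching the paper computes; all names, noise scales and coordinates are illustrative assumptions.

```python
import numpy as np

def joint_loglik(track, det2d, det3d, s2d=5.0, s3d=0.5):
    """Joint Gaussian log-likelihood of a paired 2D (pixels) and 3D (metres)
    detection for one track, up to constant terms."""
    ll2d = -0.5 * np.sum((track["img"] - det2d) ** 2) / s2d ** 2
    ll3d = -0.5 * np.sum((track["ground"] - det3d) ** 2) / s3d ** 2
    return ll2d + ll3d

def associate(tracks, dets2d, dets3d, gate=-20.0):
    """Greedily assign paired detections to tracks, best log-likelihood first;
    pairs below the gate are left unassigned."""
    pairs = []
    for ti, tr in enumerate(tracks):
        for di, (d2, d3) in enumerate(zip(dets2d, dets3d)):
            pairs.append((joint_loglik(tr, d2, d3), ti, di))
    pairs.sort(reverse=True)
    used_t, used_d, assign = set(), set(), {}
    for ll, ti, di in pairs:
        if ll < gate or ti in used_t or di in used_d:
            continue
        assign[ti] = di
        used_t.add(ti)
        used_d.add(di)
    return assign

tracks = [{"img": np.array([100.0, 50.0]), "ground": np.array([4.0, 1.0])},
          {"img": np.array([300.0, 60.0]), "ground": np.array([8.0, -2.0])}]
dets2d = [np.array([302.0, 61.0]), np.array([101.0, 49.0])]
dets3d = [np.array([8.1, -2.1]), np.array([4.0, 1.1])]
print(associate(tracks, dets2d, dets3d))  # → {0: 1, 1: 0}
```

Summing the 2D and 3D terms means a detection must be plausible in both modalities at once, which is what lets the fusion reject matches that look fine in the image but are inconsistent in range.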