DPDnet: A Robust People Detector using Deep Learning with an Overhead Depth Camera
In this paper we propose a method based on deep learning that detects
multiple people from a single overhead depth image with high reliability. Our
neural network, called DPDnet, consists of two fully-convolutional
encoder-decoder neural blocks built from residual layers. The main block takes a
depth image as input and generates a pixel-wise confidence map, where each
detected person in the image is represented by a Gaussian-like distribution.
The refinement block combines the depth image and the output from the main
block to refine the confidence map. Both blocks are simultaneously trained
end-to-end using depth images and head position labels. The experimental work
shows that DPDnet outperforms state-of-the-art methods, with accuracies greater
than 99% in three different publicly available datasets, without retraining or
fine-tuning. In addition, the computational complexity of our proposal is
independent of the number of people in the scene, and it runs in real time using
conventional GPUs.
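To make the confidence-map representation concrete, the following is a minimal sketch, not the DPDnet network itself. It builds a map with one Gaussian-like blob per labelled head position and then recovers the positions as local maxima. The function names, the `sigma` value, the `threshold` value and the use of a per-pixel maximum for overlapping blobs are all assumptions made for illustration; the abstract does not specify them.

```python
import math

def confidence_map(height, width, heads, sigma=2.0):
    """Build a pixel-wise map with one Gaussian-like blob per head.

    heads is a list of (row, col) head centres. sigma is a hypothetical
    spread in pixels, not a value taken from the paper.
    """
    cmap = [[0.0] * width for _ in range(height)]
    for r0, c0 in heads:
        for r in range(height):
            for c in range(width):
                d2 = (r - r0) ** 2 + (c - c0) ** 2
                value = math.exp(-d2 / (2.0 * sigma ** 2))
                # Assumption: keep the strongest response where blobs overlap.
                if value > cmap[r][c]:
                    cmap[r][c] = value
    return cmap

def find_peaks(cmap, threshold=0.5):
    """Return (row, col) of strict local maxima at or above threshold."""
    height, width = len(cmap), len(cmap[0])
    peaks = []
    for r in range(height):
        for c in range(width):
            v = cmap[r][c]
            if v < threshold:
                continue
            # Compare against the 8-neighbourhood, clipped at the borders.
            neighbours = [cmap[rr][cc]
                          for rr in range(max(0, r - 1), min(height, r + 2))
                          for cc in range(max(0, c - 1), min(width, c + 2))
                          if (rr, cc) != (r, c)]
            if all(v > n for n in neighbours):
                peaks.append((r, c))
    return peaks

cmap = confidence_map(16, 16, [(3, 4), (10, 12)])
print(find_peaks(cmap))
```

Extracting peaks from a dense map is one way to see why the cost of such an approach need not grow with the number of people: the map has a fixed size regardless of how many blobs it contains.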
Fast heuristic method to detect people in frontal depth images
This paper presents a new method for detecting people using only depth images captured by a camera in a frontal position. The approach first detects all the objects present in the scene and determines their average depth (distance to the camera). Next, for each object, a 3D Region of Interest (ROI) around it is processed to determine whether the characteristics of the object correspond to the biometric characteristics of a human head. Results are presented for three public datasets captured by three depth sensors with different spatial resolutions and different operating principles (structured light, active stereo vision and Time of Flight). These results demonstrate that our method can run in real time on a low-cost CPU platform with high accuracy: processing times are below 1 ms per frame at a 512 × 424 image resolution, with a precision of 99.26%, and below 4 ms per frame at a 1280 × 720 image resolution, with a precision of 99.77%.
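The first step described above (finding objects and their average depth) could be sketched as follows. This is an illustrative assumption, not the authors' algorithm: it groups 4-connected pixels whose depths differ by at most a hypothetical `max_step`, treats a depth of 0 as invalid, and reports the mean depth of each group. The head test on the 3D ROI is not covered.

```python
from collections import deque

def segment_objects(depth, max_step=100):
    """Group 4-connected valid pixels with similar depth.

    depth is a 2D list of depths (for example in mm); 0 marks an invalid
    pixel. max_step is a hypothetical similarity threshold.
    Returns a list of (pixels, mean_depth) tuples.
    """
    rows, cols = len(depth), len(depth[0])
    seen = [[False] * cols for _ in range(rows)]
    objects = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or depth[r][c] == 0:
                continue
            # Breadth-first flood fill starting from this unvisited pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and depth[ny][nx] != 0
                            and abs(depth[ny][nx] - depth[y][x]) <= max_step):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            mean_depth = sum(depth[y][x] for y, x in pixels) / len(pixels)
            objects.append((pixels, mean_depth))
    return objects

frame = [[1000, 1010, 0, 2500],
         [1005, 1000, 0, 2510]]
for pixels, mean_depth in segment_objects(frame):
    print(len(pixels), mean_depth)
```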
Probabilistic RGB-D Odometry based on Points, Lines and Planes Under Depth Uncertainty
This work proposes a robust visual odometry method for structured
environments that combines point features with line and plane segments,
extracted through an RGB-D camera. Noisy depth maps are processed by a
probabilistic depth fusion framework based on Mixtures of Gaussians to denoise
and derive the depth uncertainty, which is then propagated throughout the
visual odometry pipeline. Probabilistic 3D plane and line fitting solutions are
used to model the uncertainties of the feature parameters and pose is estimated
by combining the three types of primitives based on their uncertainties.
Performance evaluation on RGB-D sequences collected in this work and on two
public RGB-D datasets (TUM and ICL-NUIM) shows the benefit of using the proposed
depth fusion framework and of combining the three feature types, particularly in
scenes with low-textured surfaces, dynamic objects and missing depth measurements.
Comment: Major update: more results, depth filter released as open source, 34 pages
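As a simplified illustration of fusing noisy depth readings while tracking their uncertainty, the sketch below combines independent measurements of the same pixel by inverse-variance weighting. It assumes a single Gaussian per measurement, which is a deliberate simplification of the Mixture-of-Gaussians framework named in the abstract; the function name and input layout are hypothetical.

```python
def fuse_depth(measurements):
    """Fuse independent Gaussian depth measurements of one pixel.

    measurements is a list of (depth, variance) pairs with variance > 0.
    Returns (fused_depth, fused_variance) using inverse-variance weights.
    """
    weights = [1.0 / var for _, var in measurements]
    total = sum(weights)
    fused = sum(w * d for w, (d, _) in zip(weights, measurements)) / total
    # The fused variance is never larger than the smallest input variance.
    return fused, 1.0 / total

depth, variance = fuse_depth([(2.0, 0.04), (2.2, 0.04), (2.1, 0.01)])
print(depth, variance)
```

The fused variance is the kind of quantity that can then be carried into later stages, for example to weight point, line and plane residuals differently during pose estimation.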
People re-identification using depth and intensity information from an overhead sensor
This work presents a new people re-identification method that uses depth and intensity images, both captured with a single static camera located in an overhead position. The proposed solution arises from the need, in many application areas, to carry out identification and re-identification in order to determine, for example, how long people remain in a certain space, while preserving their privacy. The work is novel compared to previous solutions because top-view depth and intensity images provide enough information to identify and re-identify people while maintaining their privacy and reducing occlusions. Only three frames of intensity and depth are used for identification and re-identification: the first is obtained when the person enters the scene (frontal view), the second when they are in the central area of the scene (overhead view) and the third when they leave the scene (back view). The implemented method uses only information from the head and shoulders of people in these three perspectives. From these views three feature vectors are obtained in a simple way, two of them related to depth information and the other related to intensity data. This increases the robustness of the method against lighting changes. The proposal has been evaluated on two different datasets and compared to other state-of-the-art proposals. The results show a 96.7% success rate in re-identification, with sensors that use different operating principles, all of them providing depth and intensity information. Furthermore, the implemented method can work in real time on a PC without using a GPU. Ministerio de Economía y Competitividad, Agencia Estatal de Investigación, Universidad de Alcal
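The abstract does not say how the three feature vectors are compared, so the following is only a hypothetical sketch of one simple possibility: a nearest-neighbour match that sums Euclidean distances over the two depth vectors and the intensity vector. The key names, the distance measure and the equal weighting are all assumptions.

```python
import math

def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def reidentify(query, gallery):
    """Return the gallery id whose features are closest to the query.

    query maps hypothetical feature names ('depth_a', 'depth_b',
    'intensity') to vectors; gallery maps person ids to the same layout.
    """
    def total(person):
        return sum(distance(query[key], person[key]) for key in query)
    return min(gallery, key=lambda pid: total(gallery[pid]))

gallery = {
    "p1": {"depth_a": [1.70, 0.45], "depth_b": [0.20], "intensity": [0.3, 0.6]},
    "p2": {"depth_a": [1.85, 0.50], "depth_b": [0.25], "intensity": [0.8, 0.1]},
}
query = {"depth_a": [1.84, 0.49], "depth_b": [0.24], "intensity": [0.7, 0.2]}
print(reidentify(query, gallery))
```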