
    Probabilistic Perception System for Object Classification Based on Camera-LiDAR Sensor Fusion

    Get PDF
    One of the most basic needs guiding the definition of urban, agro-industrial and territorial management policies is a digital topographic representation, or map, of cities, crops and forests. These maps should ideally be created from multiple sensors whose responses are complementary (color information, for example, complements the returns of a LiDAR sensor in the presence of rain or low-reflectivity objects). Once a topographic representation has been constructed, it can be used to produce and geo-localize higher-level estimates (e.g., location and classification of different trees and plants, crop density, and the location and types of pests). Data can be collected using terrestrial unmanned vehicles equipped with hyperspectral cameras, stereo cameras and LiDAR (Light Detection and Ranging) sensors. Processing the acquired data can yield a digital forest model (DFM), which supports forest planners in multi-criteria decision analysis (MCDA) when planning harvesting operations. However, creating a DFM or a city map requires a highly accurate and dense point cloud of the environment at hand. Motivated by the goal of building 3D reconstructions from which representations of different vegetation features of an environment can be obtained with high quality and precision, a robust perception system is proposed for densely predicting depth, an essential component of understanding the 3D geometry of a scene. Cameras provide near-instantaneous capture of the workspace’s appearance, such as texture and color, but little geometrical information from a single view. Laser readings, on the other hand, may be so sparse that significant information about the surface is missing. These considerations motivate this work’s research question: how to develop a perception system that fuses a laser scan with an RGB image in order to produce a higher-resolution range image?
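
    The fusion problem posed above starts from a sparse laser scan and an RGB image of the same scene. A common first step is to register the two modalities by projecting the LiDAR points into the image plane. The sketch below shows that alignment step only, with hypothetical names (lidar_to_sparse_depth, T_cam_lidar) and an assumed pinhole camera model; it is an illustration, not the system proposed in the paper.

```python
import numpy as np

def lidar_to_sparse_depth(points_lidar, T_cam_lidar, K, image_shape):
    """Project LiDAR points into the camera frame to build a sparse depth
    image aligned with the RGB image (hypothetical helper, not from the paper).

    points_lidar : (N, 3) xyz points in the LiDAR frame
    T_cam_lidar  : (4, 4) rigid transform from the LiDAR to the camera frame
    K            : (3, 3) camera intrinsic matrix
    image_shape  : (H, W) of the RGB image
    """
    H, W = image_shape
    # Homogeneous coordinates, then transform into the camera frame.
    pts_h = np.hstack([points_lidar, np.ones((points_lidar.shape[0], 1))])
    pts_cam = (T_cam_lidar @ pts_h.T).T[:, :3]
    # Keep only points in front of the camera.
    pts_cam = pts_cam[pts_cam[:, 2] > 0.1]
    # Pinhole projection to pixel coordinates.
    uv = (K @ pts_cam.T).T
    uv = uv[:, :2] / uv[:, 2:3]
    u, v = uv[:, 0].astype(int), uv[:, 1].astype(int)
    valid = (u >= 0) & (u < W) & (v >= 0) & (v < H)
    depth = np.zeros((H, W), dtype=np.float32)   # 0 marks "no LiDAR return"
    depth[v[valid], u[valid]] = pts_cam[valid, 2]
    return depth
```

    A depth-completion model could then take this sparse depth image together with the RGB image to predict the dense, higher-resolution range image the research question asks for.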

    VConv-DAE: Deep Volumetric Shape Learning Without Object Labels

    Full text link
    With the advent of affordable depth sensors, 3D capture is becoming ubiquitous and has already made its way into commercial products. Yet capturing the geometry or complete shapes of everyday objects using scanning devices (e.g., Kinect) still comes with several challenges that result in noise or even incomplete shapes. Recent success in deep learning has shown how to learn complex shape distributions in a data-driven way from large-scale 3D CAD model collections and to utilize them for 3D processing on volumetric representations, thereby circumventing problems of topology and tessellation. Prior work has shown encouraging results on problems ranging from shape completion to recognition. We provide an analysis of such approaches and discover that training, as well as the resulting representation, is strongly and unnecessarily tied to the notion of object labels. Thus, we propose a fully convolutional volumetric auto-encoder that learns a volumetric representation from noisy data by estimating voxel occupancy grids. The proposed method outperforms prior work on challenging tasks like denoising and shape completion. We also show that the obtained deep embedding gives competitive performance when used for classification, and promising results for shape interpolation.
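
    As a rough illustration of the kind of model described above, the sketch below is a minimal fully convolutional 3D auto-encoder over voxel occupancy grids, trained with a per-voxel binary cross-entropy loss. The layer sizes and the 32^3 grid resolution are assumptions for the example; this is not the VConv-DAE architecture itself.

```python
import torch
import torch.nn as nn

class VoxelAutoEncoder(nn.Module):
    """Fully convolutional 3D auto-encoder over 32^3 occupancy grids
    (a simplified sketch, not the exact VConv-DAE architecture)."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv3d(1, 32, kernel_size=4, stride=2, padding=1),   # 32^3 -> 16^3
            nn.ReLU(inplace=True),
            nn.Conv3d(32, 64, kernel_size=4, stride=2, padding=1),  # 16^3 -> 8^3
            nn.ReLU(inplace=True),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose3d(64, 32, kernel_size=4, stride=2, padding=1),  # 8^3 -> 16^3
            nn.ReLU(inplace=True),
            nn.ConvTranspose3d(32, 1, kernel_size=4, stride=2, padding=1),   # 16^3 -> 32^3
        )

    def forward(self, x):
        # x: (B, 1, 32, 32, 32) noisy or partial occupancy grid in [0, 1]
        return self.decoder(self.encoder(x))  # per-voxel occupancy logits

# Training objective: per-voxel binary cross-entropy against the clean grid.
model = VoxelAutoEncoder()
noisy = torch.rand(2, 1, 32, 32, 32)                       # dummy corrupted input
clean = (torch.rand(2, 1, 32, 32, 32) > 0.5).float()       # dummy target occupancy
loss = nn.functional.binary_cross_entropy_with_logits(model(noisy), clean)
```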

    Recognizing Objects In-the-wild: Where Do We Stand?

    Full text link
    The ability to recognize objects is an essential skill for a robotic system acting in human-populated environments. Despite decades of effort from the robotics and vision research communities, robots still lack good visual perceptual systems, preventing the use of autonomous agents for real-world applications. Progress is slowed by the lack of a testbed able to accurately represent the world perceived by the robot in-the-wild. In order to fill this gap, we introduce a large-scale, multi-view object dataset collected with an RGB-D camera mounted on a mobile robot. The dataset embeds the challenges faced by a robot in a real-life application and provides a useful tool for validating object recognition algorithms. Besides describing the characteristics of the dataset, the paper evaluates the performance of a collection of well-established deep convolutional networks on the new dataset and analyzes the transferability of deep representations from Web images to robotic data. Despite the promising results obtained with such representations, the experiments demonstrate that object classification with real-life robotic data is far from being solved. Finally, we provide a comparative study to analyze and highlight the open challenges in robot vision, explaining the discrepancies in performance.
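
    A standard way to probe how well representations learned from Web images transfer, in the spirit of the analysis above, is a linear probe: freeze a pretrained backbone and train only a linear classifier on the target data. The snippet below sketches that setup with an assumed ResNet-18 backbone and a placeholder class count; it is illustrative and not the paper's evaluation protocol.

```python
import torch
import torch.nn as nn
from torchvision import models

num_classes = 10  # placeholder: number of object categories in the target robot dataset

# Frozen ImageNet-pretrained backbone used as a fixed feature extractor.
backbone = models.resnet18(pretrained=True)
backbone.fc = nn.Identity()                 # drop the ImageNet classification head
for p in backbone.parameters():
    p.requires_grad = False
backbone.eval()

# Only this linear head is trained on the robot images ("linear probe").
head = nn.Linear(512, num_classes)          # ResNet-18 features are 512-dimensional

with torch.no_grad():
    features = backbone(torch.randn(1, 3, 224, 224))  # dummy image batch
logits = head(features)
```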

    DROW: Real-Time Deep Learning based Wheelchair Detection in 2D Range Data

    Full text link
    We introduce the DROW detector, a deep-learning-based detector for 2D range data. Laser scanners are lighting invariant, provide accurate range data, and typically cover a large field of view, making them interesting sensors for robotics applications. So far, research on detection in laser range data has been dominated by hand-crafted features and boosted classifiers, potentially losing performance due to suboptimal design choices. We propose a Convolutional Neural Network (CNN) based detector for this task. We show how to effectively apply CNNs for detection in 2D range data, and propose a depth preprocessing step and voting scheme that significantly improve CNN performance. We demonstrate our approach on wheelchairs and walkers, obtaining state-of-the-art detection results. Apart from the training data, none of our design choices limits the detector to these two classes, though. We provide a ROS node for our detector and release our dataset containing 464k laser scans, of which 24k were annotated. Comment: Lucas Beyer and Alexander Hermans contributed equally.
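
    To give a concrete flavour of CNN-based detection in 2D range data, the sketch below cuts a window of ranges around every scan point and normalizes it by that point's own depth, producing one CNN input per point. The window size and clipping span are illustrative assumptions, and the paper's exact preprocessing and voting scheme are not reproduced here.

```python
import numpy as np

def cut_depth_windows(scan, window_points=48, depth_span=1.0):
    """Depth-preprocessing sketch in the spirit of detection in 2D range data
    (parameters are illustrative, not the paper's values).

    For every point in a 2D laser scan, cut a window of neighbouring ranges,
    center it on the point's own depth, and clip it, so a CNN sees a
    depth-normalized local shape around each point.
    """
    scan = np.asarray(scan, dtype=np.float32)
    n = len(scan)
    half = window_points // 2
    padded = np.pad(scan, half, mode="edge")
    windows = np.empty((n, window_points), dtype=np.float32)
    for i in range(n):
        w = padded[i:i + window_points].copy()
        w = (w - scan[i]) / depth_span          # center on the point's depth
        windows[i] = np.clip(w, -1.0, 1.0)      # clip far fore-/background
    return windows  # (n, window_points), one CNN input per scan point
```

    Each window would then be classified (e.g., wheelchair / walker / background) and the per-point predictions combined, for instance by a voting scheme, into detections.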

    Monocular SLAM Supported Object Recognition

    Get PDF
    In this work, we develop a monocular SLAM-aware object recognition system that achieves considerably stronger recognition performance than classical object recognition systems that operate on a frame-by-frame basis. By incorporating several key ideas, including multi-view object proposals and efficient feature encoding methods, our proposed system is able to detect and robustly recognize objects in its environment using a single RGB camera in near-constant time. Through experiments, we illustrate the utility of such a system for effectively detecting and recognizing objects, incorporating multiple object viewpoint detections into a unified prediction hypothesis. The performance of the proposed recognition system is evaluated on the UW RGB-D Dataset, showing strong recognition performance and scalable run-time performance compared to current state-of-the-art recognition systems. Comment: Accepted to appear at Robotics: Science and Systems 2015, Rome, Italy.
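
    One simple way to fuse multiple viewpoint detections of the same object into a unified prediction hypothesis, as described above, is to accumulate per-view class probabilities in log space. The helper below (fuse_multiview_scores) is a hypothetical stand-in for illustration, not the paper's aggregation method.

```python
import numpy as np

def fuse_multiview_scores(per_view_probs):
    """Fuse per-frame class probabilities for one tracked object into a
    single prediction (naive log-likelihood sum; a stand-in for the
    paper's actual evidence aggregation).

    per_view_probs : (V, C) array, one softmax distribution per view
    returns        : (C,) fused distribution over classes
    """
    log_p = np.log(np.clip(per_view_probs, 1e-9, 1.0)).sum(axis=0)
    log_p -= log_p.max()                 # subtract max for numerical stability
    fused = np.exp(log_p)
    return fused / fused.sum()

# Example: three views of the same object, hypothetical classes [mug, bowl, cap]
views = np.array([[0.5, 0.3, 0.2],
                  [0.6, 0.2, 0.2],
                  [0.4, 0.4, 0.2]])
print(fuse_multiview_scores(views))      # consensus sharpens toward the first class
```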