5,395 research outputs found

    Automatic Model Based Dataset Generation for Fast and Accurate Crop and Weeds Detection

    Full text link
    Selective weeding is one of the key challenges in the field of agriculture robotics. To accomplish this task, a farm robot should be able to accurately detect plants and to distinguish them between crop and weeds. Most of the promising state-of-the-art approaches make use of appearance-based models trained on large annotated datasets. Unfortunately, creating large agricultural datasets with pixel-level annotations is an extremely time consuming task, actually penalizing the usage of data-driven techniques. In this paper, we face this problem by proposing a novel and effective approach that aims to dramatically minimize the human intervention needed to train the detection and classification algorithms. The idea is to procedurally generate large synthetic training datasets randomizing the key features of the target environment (i.e., crop and weed species, type of soil, light conditions). More specifically, by tuning these model parameters, and exploiting a few real-world textures, it is possible to render a large amount of realistic views of an artificial agricultural scenario with no effort. The generated data can be directly used to train the model or to supplement real-world images. We validate the proposed methodology by using as testbed a modern deep learning based image segmentation architecture. We compare the classification results obtained using both real and synthetic images as training data. The reported results confirm the effectiveness and the potentiality of our approach.Comment: To appear in IEEE/RSJ IROS 201

    Semantically Guided Depth Upsampling

    Full text link
    We present a novel method for accurate and efficient up- sampling of sparse depth data, guided by high-resolution imagery. Our approach goes beyond the use of intensity cues only and additionally exploits object boundary cues through structured edge detection and semantic scene labeling for guidance. Both cues are combined within a geodesic distance measure that allows for boundary-preserving depth in- terpolation while utilizing local context. We model the observed scene structure by locally planar elements and formulate the upsampling task as a global energy minimization problem. Our method determines glob- ally consistent solutions and preserves fine details and sharp depth bound- aries. In our experiments on several public datasets at different levels of application, we demonstrate superior performance of our approach over the state-of-the-art, even for very sparse measurements.Comment: German Conference on Pattern Recognition 2016 (Oral

    3D Segmentation Method for Natural Environments based on a Geometric-Featured Voxel Map

    Get PDF
    This work proposes a new segmentation algorithm for three-dimensional dense point clouds and has been specially designed for natural environments where the ground is unstructured and may include big slopes, non-flat areas and isolated areas. This technique is based on a Geometric-Featured Voxel map (GFV) where the scene is discretized in constant size cubes or voxels which are classified in flat surface, linear or tubular structures and scattered or undefined shapes, usually corresponding to vegetation. Since this is not a point-based technique the computational cost is significantly reduced, hence it may be compatible with Real-Time applications. The ground is extracted in order to obtain more accurate results in the posterior segmentation process. The scene is split into objects and a second segmentation in regions inside each object is performed based on the voxel’s geometric class. The work here evaluates the proposed algorithm in various versions and several voxel sizes and compares the results with other methods from the literature. For the segmentation evaluation the algorithms are tested on several differently challenging hand-labeled data sets using two metrics, one of which is novel.Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech

    Motion Cooperation: Smooth Piece-Wise Rigid Scene Flow from RGB-D Images

    Get PDF
    We propose a novel joint registration and segmentation approach to estimate scene flow from RGB-D images. Instead of assuming the scene to be composed of a number of independent rigidly-moving parts, we use non-binary labels to capture non-rigid deformations at transitions between the rigid parts of the scene. Thus, the velocity of any point can be computed as a linear combination (interpolation) of the estimated rigid motions, which provides better results than traditional sharp piecewise segmentations. Within a variational framework, the smooth segments of the scene and their corresponding rigid velocities are alternately refined until convergence. A K-means-based segmentation is employed as an initialization, and the number of regions is subsequently adapted during the optimization process to capture any arbitrary number of independently moving objects. We evaluate our approach with both synthetic and real RGB-D images that contain varied and large motions. The experiments show that our method estimates the scene flow more accurately than the most recent works in the field, and at the same time provides a meaningful segmentation of the scene based on 3D motion.Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech. Spanish Government under the grant programs FPI-MICINN 2012 and DPI2014- 55826-R (co-founded by the European Regional Development Fund), as well as by the EU ERC grant Convex Vision (grant agreement no. 240168)

    Im2Pano3D: Extrapolating 360 Structure and Semantics Beyond the Field of View

    Full text link
    We present Im2Pano3D, a convolutional neural network that generates a dense prediction of 3D structure and a probability distribution of semantic labels for a full 360 panoramic view of an indoor scene when given only a partial observation (<= 50%) in the form of an RGB-D image. To make this possible, Im2Pano3D leverages strong contextual priors learned from large-scale synthetic and real-world indoor scenes. To ease the prediction of 3D structure, we propose to parameterize 3D surfaces with their plane equations and train the model to predict these parameters directly. To provide meaningful training supervision, we use multiple loss functions that consider both pixel level accuracy and global context consistency. Experiments demon- strate that Im2Pano3D is able to predict the semantics and 3D structure of the unobserved scene with more than 56% pixel accuracy and less than 0.52m average distance error, which is significantly better than alternative approaches.Comment: Video summary: https://youtu.be/Au3GmktK-S

    Volume-based Semantic Labeling with Signed Distance Functions

    Full text link
    Research works on the two topics of Semantic Segmentation and SLAM (Simultaneous Localization and Mapping) have been following separate tracks. Here, we link them quite tightly by delineating a category label fusion technique that allows for embedding semantic information into the dense map created by a volume-based SLAM algorithm such as KinectFusion. Accordingly, our approach is the first to provide a semantically labeled dense reconstruction of the environment from a stream of RGB-D images. We validate our proposal using a publicly available semantically annotated RGB-D dataset and a) employing ground truth labels, b) corrupting such annotations with synthetic noise, c) deploying a state of the art semantic segmentation algorithm based on Convolutional Neural Networks.Comment: Submitted to PSIVT201
    • …
    corecore