
5,395 research outputs found

Automatic Model Based Dataset Generation for Fast and Accurate Crop and Weeds Detection

Authors: Di Cicco Maurilio, Grisetti Giorgio, Potena Ciro, Pretto Alberto
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2017

Selective weeding is one of the key challenges in the field of agriculture robotics. To accomplish this task, a farm robot should be able to accurately detect plants and to distinguish between crops and weeds. Most of the promising state-of-the-art approaches make use of appearance-based models trained on large annotated datasets. Unfortunately, creating large agricultural datasets with pixel-level annotations is an extremely time-consuming task, which penalizes the use of data-driven techniques. In this paper, we address this problem by proposing a novel and effective approach that aims to dramatically minimize the human intervention needed to train the detection and classification algorithms. The idea is to procedurally generate large synthetic training datasets by randomizing the key features of the target environment (i.e., crop and weed species, type of soil, light conditions). More specifically, by tuning these model parameters and exploiting a few real-world textures, it is possible to render a large number of realistic views of an artificial agricultural scenario with no effort. The generated data can be used directly to train the model or to supplement real-world images. We validate the proposed methodology using a modern deep-learning-based image segmentation architecture as a testbed. We compare the classification results obtained using both real and synthetic images as training data. The reported results confirm the effectiveness and the potential of our approach.

Comment: To appear in IEEE/RSJ IROS 201

arXiv.org e-Print Archive

Crossref

Archivio della ricerca- Università di Roma La Sapienza

Archivio istituzionale della ricerca - Università di Padova

Semantically Guided Depth Upsampling

Authors: A Geiger, A Kundu, D Scharstein, J Kopf, J Liu, K He, K Yamaguchi, L Ladický, M Everingham, M Kiechle, P Dollar
Publication date: 02/08/2016

We present a novel method for accurate and efficient upsampling of sparse depth data, guided by high-resolution imagery. Our approach goes beyond the use of intensity cues only and additionally exploits object boundary cues through structured edge detection and semantic scene labeling for guidance. Both cues are combined within a geodesic distance measure that allows for boundary-preserving depth interpolation while utilizing local context. We model the observed scene structure by locally planar elements and formulate the upsampling task as a global energy minimization problem. Our method determines globally consistent solutions and preserves fine details and sharp depth boundaries. In our experiments on several public datasets at different levels of application, we demonstrate superior performance of our approach over the state-of-the-art, even for very sparse measurements.

Comment: German Conference on Pattern Recognition 2016 (Oral

arXiv.org e-Print Archive

Crossref

3D Segmentation Method for Natural Environments based on a Geometric-Featured Voxel Map

Authors: Ababsa Fakhr-Eddine, Garcia-Cerezo Alfonso J., Gomez-Ruiz Jose Antonio, Plaza Victoria
Publication date: 26/03/2015

This work proposes a new segmentation algorithm for three-dimensional dense point clouds, specially designed for natural environments where the ground is unstructured and may include steep slopes, non-flat areas and isolated areas. The technique is based on a Geometric-Featured Voxel map (GFV), in which the scene is discretized into constant-size cubes, or voxels, that are classified as flat surfaces, linear or tubular structures, or scattered or undefined shapes, the last usually corresponding to vegetation. Since this is not a point-based technique, the computational cost is significantly reduced, so it may be compatible with real-time applications. The ground is extracted first in order to obtain more accurate results in the subsequent segmentation process. The scene is then split into objects, and a second segmentation into regions inside each object is performed based on each voxel's geometric class. The work evaluates the proposed algorithm in various versions and with several voxel sizes, and compares the results with other methods from the literature. For the segmentation evaluation, the algorithms are tested on several hand-labeled data sets of varying difficulty using two metrics, one of which is novel.

Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech
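The voxel discretization mentioned in the abstract above can be illustrated with a minimal sketch. It only shows how points might be bucketed into constant-size cubes; the function name `voxelize`, the dictionary layout, and the use of floor division for indexing are assumptions made for illustration and are not taken from the paper, which does not describe its data structure here.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Point = Tuple[float, float, float]
VoxelKey = Tuple[int, int, int]


def voxelize(points: Iterable[Point], voxel_size: float) -> Dict[VoxelKey, List[Point]]:
    """Group 3D points by the constant-size cube that contains them.

    Hypothetical helper: each voxel is identified by the integer index
    floor(coordinate / voxel_size) along every axis.
    """
    if voxel_size <= 0:
        raise ValueError("voxel_size must be positive")
    voxels: Dict[VoxelKey, List[Point]] = defaultdict(list)
    for x, y, z in points:
        key = (
            math.floor(x / voxel_size),
            math.floor(y / voxel_size),
            math.floor(z / voxel_size),
        )
        voxels[key].append((x, y, z))
    return dict(voxels)


if __name__ == "__main__":
    cloud = [(0.1, 0.2, 0.3), (0.4, 0.4, 0.4), (1.2, 0.0, -0.1)]
    for key, members in voxelize(cloud, voxel_size=0.5).items():
        print(key, len(members))
```

Working per voxel rather than per point is what lets later stages, such as a per-voxel geometric classification, scale with the number of occupied cubes instead of the number of points.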

Crossref

Repositorio Institucional Universidad de Málaga

Motion Cooperation: Smooth Piece-Wise Rigid Scene Flow from RGB-D Images

Authors: Cremers Daniel, Gonzalez-Jimenez Antonio Javier, Jaimez Mariano, Souiai Mohamed, Stuckler Jorg
Publication date: 01/01/2015

We propose a novel joint registration and segmentation approach to estimate scene flow from RGB-D images. Instead of assuming the scene to be composed of a number of independent rigidly-moving parts, we use non-binary labels to capture non-rigid deformations at transitions between the rigid parts of the scene. Thus, the velocity of any point can be computed as a linear combination (interpolation) of the estimated rigid motions, which provides better results than traditional sharp piecewise segmentations. Within a variational framework, the smooth segments of the scene and their corresponding rigid velocities are alternately refined until convergence. A K-means-based segmentation is employed as an initialization, and the number of regions is subsequently adapted during the optimization process to capture any arbitrary number of independently moving objects. We evaluate our approach with both synthetic and real RGB-D images that contain varied and large motions. The experiments show that our method estimates the scene flow more accurately than the most recent works in the field, and at the same time provides a meaningful segmentation of the scene based on 3D motion.

Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech. Spanish Government under the grant programs FPI-MICINN 2012 and DPI2014-55826-R (co-funded by the European Regional Development Fund), as well as by the EU ERC grant Convex Vision (grant agreement no. 240168)
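The abstract above states that a point's velocity is a linear combination of the estimated rigid motions. A minimal sketch of that interpolation follows, assuming each rigid motion is given as a linear velocity plus an angular velocity about the origin and that the non-binary labels are non-negative weights. The function names and this particular representation are assumptions for illustration, not the authors' implementation.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]
RigidMotion = Tuple[Vec3, Vec3]  # (linear velocity, angular velocity), assumed form


def cross(a: Vec3, b: Vec3) -> Vec3:
    """Cross product of two 3-vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def rigid_velocity(point: Vec3, motion: RigidMotion) -> Vec3:
    """Velocity of `point` under one rigid motion: v + w x p."""
    linear, angular = motion
    rot = cross(angular, point)
    return (linear[0] + rot[0], linear[1] + rot[1], linear[2] + rot[2])


def blended_velocity(
    point: Vec3, motions: Sequence[RigidMotion], weights: Sequence[float]
) -> Vec3:
    """Interpolate rigid velocities with soft (non-binary) label weights.

    Weights are normalized to sum to one, so a point fully assigned to a
    single segment simply follows that segment's rigid motion.
    """
    if len(motions) != len(weights):
        raise ValueError("one weight per rigid motion is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    out = [0.0, 0.0, 0.0]
    for motion, w in zip(motions, weights):
        v = rigid_velocity(point, motion)
        for k in range(3):
            out[k] += (w / total) * v[k]
    return (out[0], out[1], out[2])


if __name__ == "__main__":
    translating = ((1.0, 0.0, 0.0), (0.0, 0.0, 0.0))
    rotating = ((0.0, 0.0, 0.0), (0.0, 0.0, 1.0))
    print(blended_velocity((1.0, 0.0, 0.0), [translating, rotating], [0.5, 0.5]))
```

With a one-hot weight vector the result reduces to a sharp piecewise-rigid segmentation, which is the case the abstract contrasts against.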

OPUS Augsburg

Crossref

Repositorio Institucional Universidad de Málaga

Im2Pano3D: Extrapolating 360 Structure and Semantics Beyond the Field of View

Authors: Chang Angel X., Funkhouser Thomas, Savarese Silvio, Savva Manolis, Song Shuran, Zeng Andy
Publication date: 12/12/2017

We present Im2Pano3D, a convolutional neural network that generates a dense prediction of 3D structure and a probability distribution of semantic labels for a full 360 panoramic view of an indoor scene when given only a partial observation (<= 50%) in the form of an RGB-D image. To make this possible, Im2Pano3D leverages strong contextual priors learned from large-scale synthetic and real-world indoor scenes. To ease the prediction of 3D structure, we propose to parameterize 3D surfaces with their plane equations and train the model to predict these parameters directly. To provide meaningful training supervision, we use multiple loss functions that consider both pixel level accuracy and global context consistency. Experiments demonstrate that Im2Pano3D is able to predict the semantics and 3D structure of the unobserved scene with more than 56% pixel accuracy and less than 0.52m average distance error, which is significantly better than alternative approaches.

Comment: Video summary: https://youtu.be/Au3GmktK-S
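The abstract above mentions parameterizing 3D surfaces by their plane equations. As a rough illustration of why such parameters determine geometry, the sketch below recovers the distance along a viewing ray from a plane written as n · X = d. The plane form, the function names, and the tolerance value are assumptions for illustration; the abstract does not specify the exact parameterization the network predicts.

```python
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


def dot(a: Vec3, b: Vec3) -> float:
    """Dot product of two 3-vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def ray_plane_distance(normal: Vec3, offset: float, ray: Vec3) -> Optional[float]:
    """Distance t along `ray` (from the camera origin) to the plane n . X = d.

    Substituting X = t * ray gives t = d / (n . ray). Returns None when the
    ray is (nearly) parallel to the plane or the intersection lies behind
    the camera.
    """
    denom = dot(normal, ray)
    if abs(denom) < 1e-9:  # assumed tolerance for a parallel ray
        return None
    t = offset / denom
    return t if t > 0 else None


if __name__ == "__main__":
    wall_normal = (0.0, 0.0, 1.0)  # plane facing the camera
    wall_offset = 2.0              # two metres in front of the origin
    print(ray_plane_distance(wall_normal, wall_offset, (0.0, 0.0, 1.0)))
    print(ray_plane_distance(wall_normal, wall_offset, (1.0, 0.0, 0.0)))
```

Because every pixel on the same planar surface shares one set of plane parameters, a predictor only needs to output a piecewise-constant field rather than a smoothly varying depth value per pixel.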

arXiv.org e-Print Archive

Crossref

Volume-based Semantic Labeling with Signed Distance Functions

Authors: Cavallari Tommaso, Di Stefano Luigi
Publication date: 13/11/2015

Research on the two topics of Semantic Segmentation and SLAM (Simultaneous Localization and Mapping) has followed separate tracks. Here, we link them tightly by delineating a category label fusion technique that allows for embedding semantic information into the dense map created by a volume-based SLAM algorithm such as KinectFusion. Accordingly, our approach is the first to provide a semantically labeled dense reconstruction of the environment from a stream of RGB-D images. We validate our proposal using a publicly available semantically annotated RGB-D dataset and a) employing ground truth labels, b) corrupting such annotations with synthetic noise, c) deploying a state-of-the-art semantic segmentation algorithm based on Convolutional Neural Networks.

Comment: Submitted to PSIVT201

arXiv.org e-Print Archive

Crossref

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna