Furniture models learned from the WWW: using web catalogs to locate and categorize unknown furniture pieces in 3D laser scans
In this article, we investigate how autonomous robots can exploit the high-quality information already available on the WWW concerning 3-D models of office furniture. Apart from hobbyist efforts such as the Google 3-D Warehouse, many companies providing office furnishings already offer models for a considerable portion of the objects found in our workplaces and homes. In particular, we present an approach that allows a robot to learn generic models of typical office furniture from examples found on the Web. These generic models are then used by the robot to locate and categorize unknown furniture in real indoor environments.
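The abstract gives no implementation detail, but the matching idea can be sketched: describe a segmented point-cloud cluster with a simple shape descriptor and assign the furniture class whose web-learned descriptor is closest. Everything below (the descriptor choice, the function names, the nearest-mean classifier) is an illustrative assumption, not the authors' method:

```python
import numpy as np

def shape_descriptor(points: np.ndarray, bins: int = 8) -> np.ndarray:
    """Toy descriptor for a point-cloud segment (N x 3 array):
    normalized bounding-box extents plus a height histogram."""
    extents = points.max(axis=0) - points.min(axis=0)
    extents = extents / (np.linalg.norm(extents) + 1e-9)
    z = points[:, 2]
    hist, _ = np.histogram(z, bins=bins, range=(z.min(), z.max() + 1e-9))
    return np.concatenate([extents, hist / hist.sum()])

def categorize(segment: np.ndarray, class_models: dict) -> str:
    """Assign the furniture class whose mean descriptor, learned
    offline from web catalog models, is closest to the segment's."""
    d = shape_descriptor(segment)
    return min(class_models, key=lambda c: np.linalg.norm(d - class_models[c]))

# class_models would be built offline by sampling points from web
# catalog meshes, e.g. {"chair": mean_chair_descriptor, "desk": ...}
```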
High-resolution optical and SAR image fusion for building database updating
This paper addresses the issue of cartographic database (DB) creation and updating using high-resolution synthetic aperture radar (SAR) and optical images. In cartographic applications, the objects of interest are mainly buildings and roads. This paper proposes a processing chain to create or update building DBs. The approach is composed of two steps. First, if a DB is available, the presence of each DB object is checked in the images. Second, we verify whether objects coming from an image segmentation should be included in the DB. For both steps, relevant features are extracted from the images in the neighborhood of the considered object. The removal or inclusion of an object in the DB is based on a score obtained by fusing the features within the framework of Dempster–Shafer evidence theory.
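The fusion step rests on Dempster's rule of combination, which is compact enough to sketch. A minimal, self-contained illustration follows; the frame of discernment and the mass values are invented for the example, and this is not the paper's code:

```python
from itertools import product

def combine(m1: dict, m2: dict) -> dict:
    """Dempster's rule of combination for mass functions given as
    {frozenset_of_hypotheses: mass}. Mass assigned to an empty
    intersection is conflict, which is renormalized away."""
    combined, conflict = {}, 0.0
    for (a, ma), (b, mb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + ma * mb
        else:
            conflict += ma * mb
    if conflict >= 1.0:
        raise ValueError("total conflict: sources fully contradict")
    return {h: m / (1.0 - conflict) for h, m in combined.items()}

# Example: two feature-derived opinions on whether a DB object exists.
B, N = frozenset({"building"}), frozenset({"no_building"})
theta = B | N  # mass on the full frame expresses ignorance
m_optical = {B: 0.6, N: 0.1, theta: 0.3}
m_sar     = {B: 0.5, N: 0.2, theta: 0.3}
print(combine(m_optical, m_sar))  # fused belief, e.g. B ~ 0.76 after normalization
```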
iPose: Instance-Aware 6D Pose Estimation of Partly Occluded Objects
We address the task of 6D pose estimation of known rigid objects from single input images in scenarios where the objects are partly occluded. Recent RGB-D-based methods are robust to moderate degrees of occlusion. For RGB inputs, no previous method works well for partly occluded objects. Our main contribution is to present the first deep learning-based system that estimates accurate poses for partly occluded objects from RGB-D and RGB input. We achieve this with a new instance-aware pipeline that decomposes 6D object pose estimation into a sequence of simpler steps, where each step removes specific aspects of the problem. The first step localizes all known objects in the image using an instance segmentation network, thereby eliminating surrounding clutter and occluders. The second step densely maps pixels to 3D object surface positions, so-called object coordinates, using an encoder-decoder network, thereby eliminating object appearance. The third and final step predicts the 6D pose using geometric optimization. We demonstrate that we significantly outperform the state of the art for pose estimation of partly occluded objects for both RGB and RGB-D input.
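A standard way to realize the final geometric-optimization step is RANSAC-based PnP over the predicted pixel-to-object-coordinate correspondences. The sketch below uses OpenCV for this; it is an assumption about what such a step looks like, not the authors' implementation, and the correspondence arrays are assumed to come from the first two network stages:

```python
import numpy as np
import cv2

def pose_from_object_coordinates(pixels_2d, coords_3d, K):
    """Recover a 6D pose from dense 2D pixel -> 3D object-coordinate
    correspondences via RANSAC-based PnP.
    pixels_2d: (N, 2), coords_3d: (N, 3), K: 3x3 camera intrinsics."""
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        coords_3d.astype(np.float32),
        pixels_2d.astype(np.float32),
        K.astype(np.float64),
        distCoeffs=None,
        reprojectionError=3.0,   # inlier threshold in pixels, a typical choice
        iterationsCount=200,
    )
    if not ok:
        raise RuntimeError("PnP failed: too few consistent correspondences")
    R, _ = cv2.Rodrigues(rvec)   # axis-angle -> 3x3 rotation matrix
    return R, tvec, inliers
```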
Learning Aerial Image Segmentation from Online Maps
This study deals with semantic segmentation of high-resolution (aerial) images, where a semantic class label is assigned to each pixel via supervised classification as a basis for automatic map generation. Recently, deep convolutional neural networks (CNNs) have shown impressive performance and have quickly become the de facto standard for semantic segmentation, with the added benefit that task-specific feature design is no longer necessary. However, a major downside of deep learning methods is that they are extremely data-hungry, aggravating the perennial bottleneck of supervised classification: obtaining enough annotated training data. On the other hand, it has been observed that they are rather robust against noise in the training labels. This opens up the intriguing possibility of avoiding the annotation of huge amounts of training data and instead training the classifier from existing legacy data or crowd-sourced maps, which can exhibit high levels of noise. The question addressed in this paper is: can training with large-scale, publicly available labels replace a substantial part of the manual labeling effort and still achieve sufficient performance? Such data will inevitably contain a significant portion of errors, but in return virtually unlimited quantities of it are available in large parts of the world. We adapt a state-of-the-art CNN architecture for semantic segmentation of buildings and roads in aerial images and compare its performance when using different training data sets, ranging from manually labeled, pixel-accurate ground truth of the same city to automatic training data derived from OpenStreetMap data from distant locations. Our results indicate that satisfying performance can be obtained with significantly less manual annotation effort by exploiting noisy large-scale training data.
Comment: Published in IEEE Transactions on Geoscience and Remote Sensing
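The setup can be sketched as ordinary supervised segmentation in which the only special ingredient is the label source. The code below assumes a generic encoder-decoder model and a data loader yielding aerial tiles paired with binary building masks rasterized from OpenStreetMap; it is a minimal illustration under those assumptions, not the paper's released code:

```python
import torch
import torch.nn as nn

def train_on_osm(model: nn.Module, loader, epochs: int = 10,
                 lr: float = 1e-4, device: str = "cuda"):
    """Standard supervised training loop; `loader` yields
    (aerial_tile, osm_mask) pairs whose masks are noisy because
    they were rasterized from crowd-sourced OpenStreetMap data."""
    model = model.to(device)
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.BCEWithLogitsLoss()  # per-pixel building vs. background
    for _ in range(epochs):
        for tiles, masks in loader:
            tiles, masks = tiles.to(device), masks.to(device).float()
            logits = model(tiles)          # (B, 1, H, W)
            loss = loss_fn(logits, masks)  # label noise tends to average
            opt.zero_grad()                # out over many training tiles
            loss.backward()
            opt.step()
```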
Vision systems with the human in the loop
The emerging cognitive vision paradigm deals with vision systems that apply machine learning and automatic reasoning in order to learn from what they perceive. Cognitive vision systems can rate the relevance and consistency of newly acquired knowledge, and they can adapt to their environment and thus exhibit high robustness. This contribution presents vision systems that aim at flexibility and robustness. One is tailored for content-based image retrieval; the others are cognitive vision systems that constitute prototypes of visual active memories, which evaluate, gather, and integrate contextual knowledge for visual analysis. All three systems are designed to interact with human users. After discussing adaptive content-based image retrieval and object and action recognition in an office environment, we raise the issue of assessing cognitive systems. We report experiences from psychologically evaluated human-machine interactions and stress the promising potential of psychologically grounded usability experiments.
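One concrete way to put the human in the loop of image retrieval is Rocchio-style relevance feedback, which shifts the query feature vector toward results the user marks relevant. This is an illustration of the general adaptation idea only, not the systems described in the abstract; feature vectors, weights, and function names are invented for the sketch:

```python
import numpy as np

def rocchio_update(query, relevant, nonrelevant,
                   alpha=1.0, beta=0.75, gamma=0.15):
    """Move the query feature vector toward user-approved results
    and away from rejected ones (classic Rocchio feedback)."""
    q = alpha * np.asarray(query, dtype=float)
    if len(relevant):
        q += beta * np.mean(relevant, axis=0)
    if len(nonrelevant):
        q -= gamma * np.mean(nonrelevant, axis=0)
    return q

def retrieve(query, database, k=5):
    """Rank database feature vectors by cosine similarity to the query."""
    db = database / np.linalg.norm(database, axis=1, keepdims=True)
    q = query / np.linalg.norm(query)
    return np.argsort(db @ q)[::-1][:k]
```

Iterating retrieve -> user feedback -> rocchio_update is the simplest closed loop in which the system adapts to its user rather than relying on a fixed query.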