
    Synthesizing Training Data for Object Detection in Indoor Scenes

    Detection of objects in cluttered indoor environments is one of the key enabling functionalities for service robots. The best performing object detection approaches in computer vision exploit deep Convolutional Neural Networks (CNNs) to simultaneously detect and categorize the objects of interest in cluttered scenes. Training such models typically requires large amounts of annotated training data, which is time consuming and costly to obtain. In this work we explore the use of synthetically generated composite images for training state-of-the-art object detectors, especially for object instance detection. We superimpose 2D images of textured object models onto images of real environments at a variety of locations and scales. Our experiments evaluate different superimposition strategies, ranging from purely image-based blending all the way to depth- and semantics-informed positioning of the object models into real scenes. We demonstrate the effectiveness of these object detector training strategies on two publicly available datasets, GMU-Kitchens and Washington RGB-D Scenes v2. As one observation, augmenting some hand-labeled training data with synthetic examples carefully composed onto scenes yields object detectors with performance comparable to using much more hand-labeled data. Broadly, this work charts new opportunities for training detectors for new objects by exploiting existing object model repositories, either in a purely automatic fashion or with only a very small number of human-annotated examples. Comment: Added more experiments and a link to the project webpage.
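
    The compositing step described above boils down to pasting a masked object crop onto a scene image at a random location and scale and recording the resulting bounding box. The following is a minimal sketch of the purely image-based blending strategy, not the authors' pipeline; the alpha-matted crop, the scale range, and the assumption that the rescaled crop fits inside the scene are all illustrative.

```python
# Minimal sketch of image-based compositing for detector training data.
# Assumes an RGBA object crop (alpha channel = object mask) and an RGB
# scene image large enough to contain the rescaled crop.
import random
from PIL import Image

def composite(object_rgba_path, scene_rgb_path, scale_range=(0.3, 1.0)):
    obj = Image.open(object_rgba_path).convert("RGBA")
    scene = Image.open(scene_rgb_path).convert("RGB")

    # Randomly rescale the object crop.
    s = random.uniform(*scale_range)
    obj = obj.resize((max(int(obj.width * s), 1), max(int(obj.height * s), 1)))

    # Pick a random placement that keeps the object inside the scene.
    x = random.randint(0, scene.width - obj.width)
    y = random.randint(0, scene.height - obj.height)

    # Alpha-blend the crop onto the scene and record the box annotation.
    scene.paste(obj, (x, y), mask=obj)
    bbox = (x, y, x + obj.width, y + obj.height)
    return scene, bbox
```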

    DepthCut: Improved Depth Edge Estimation Using Multiple Unreliable Channels

    In the context of scene understanding, a variety of methods exist to estimate different information channels from mono or stereo images, including disparity, depth, and normals. Although several advances have been reported in recent years for these tasks, the estimated information is often imprecise, particularly near depth discontinuities or creases. However, studies have shown that precisely such depth edges carry critical cues for the perception of shape and play important roles in tasks like depth-based segmentation or foreground selection. Unfortunately, the currently extracted channels often carry conflicting signals, making it difficult for subsequent applications to use them effectively. In this paper, we focus on the problem of obtaining high-precision depth edges (i.e., depth contours and creases) by jointly analyzing such unreliable information channels. We propose DepthCut, a data-driven fusion of the channels using a convolutional neural network trained on a large dataset with known depth. The resulting depth edges can be used for segmentation, decomposing a scene into depth layers with relatively flat depth, or improving the accuracy of the depth estimate near depth edges by constraining its gradients to agree with these edges. Quantitatively, we compare against 15 variants of baselines and demonstrate that our depth edges yield improved segmentation performance and an improved depth estimate near depth edges compared to data-agnostic channel fusion. Qualitatively, we demonstrate that the depth edges result in superior segmentation and depth orderings. Comment: 12 pages.
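
    As a rough illustration of the fusion idea, one can stack the unreliable channels (e.g., color, disparity, and normals) and train a small convolutional network to predict a per-pixel depth-edge probability. The sketch below is only a stand-in, not the DepthCut architecture; the channel count, layer sizes, and loss are assumptions.

```python
# Minimal sketch of data-driven channel fusion for depth-edge estimation
# (illustrative only; not the DepthCut network).
import torch
import torch.nn as nn

class EdgeFusionNet(nn.Module):
    def __init__(self, in_channels=7):  # e.g. RGB (3) + disparity (1) + normals (3)
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 1),  # per-pixel edge logit
        )

    def forward(self, x):
        return torch.sigmoid(self.net(x))

# Usage: channels stacked as (batch, 7, H, W); target is a binary depth-edge map.
model = EdgeFusionNet()
pred = model(torch.randn(1, 7, 128, 128))
target = torch.zeros_like(pred)
loss = nn.functional.binary_cross_entropy(pred, target)
```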

    A Brief Survey of Image-Based Depth Upsampling

    Recently, there has been remarkable growth of interest in the development and applications of Time-of-Flight (ToF) depth cameras. However, despite continual improvements in their characteristics, the practical applicability of ToF cameras is still limited by the low resolution and quality of depth measurements. This has motivated many researchers to combine ToF cameras with other sensors in order to enhance and upsample depth images. In this paper, we compare ToF cameras to three image-based techniques for depth recovery, discuss the upsampling problem, and survey the approaches that couple ToF depth images with high-resolution optical images. Other classes of upsampling methods are also mentioned.
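
    A common representative of the surveyed class of methods is joint bilateral upsampling, which spreads low-resolution depth samples onto the high-resolution grid while respecting edges in the optical image. The sketch below is a slow, didactic reference implementation under assumed parameter values, not any specific method from the survey.

```python
# Minimal (unoptimized) sketch of joint bilateral upsampling: low-res ToF
# depth is upsampled under the guidance of a high-res grayscale image.
import numpy as np

def joint_bilateral_upsample(depth_lo, guide_hi, radius=2,
                             sigma_spatial=1.0, sigma_range=0.1):
    """depth_lo: (h, w) depth map; guide_hi: (H, W) grayscale guide in [0, 1]."""
    H, W = guide_hi.shape
    h, w = depth_lo.shape
    sy, sx = h / H, w / W  # high-res -> low-res coordinate scale
    out = np.zeros((H, W))
    for Y in range(H):
        for X in range(W):
            cy, cx = Y * sy, X * sx  # position in the low-res depth map
            num = den = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    y, x = int(round(cy)) + dy, int(round(cx)) + dx
                    if not (0 <= y < h and 0 <= x < w):
                        continue
                    # Guide value at the neighbor's position in the high-res image.
                    gy = min(int(round(y / sy)), H - 1)
                    gx = min(int(round(x / sx)), W - 1)
                    ws = np.exp(-(dy * dy + dx * dx) / (2 * sigma_spatial ** 2))
                    wr = np.exp(-(guide_hi[Y, X] - guide_hi[gy, gx]) ** 2
                                / (2 * sigma_range ** 2))
                    num += ws * wr * depth_lo[y, x]
                    den += ws * wr
            out[Y, X] = num / den if den > 0 else depth_lo[int(cy), int(cx)]
    return out
```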

    Image-based Material Editing

    Photo editing software allows digital images to be blurred, warped or re-colored at the touch of a button. However, it is not currently possible to change the material appearance of an object except by painstakingly painting over the appropriate pixels. Here we present a set of methods for automatically replacing one material with another, completely different material, starting with only a single high dynamic range image and an alpha matte specifying the object. Our approach exploits the fact that human vision is surprisingly tolerant of certain (sometimes enormous) physical inaccuracies. Thus, it may be possible to produce a visually compelling illusion of material transformations without fully reconstructing the lighting or geometry. We employ a range of algorithms depending on the target material. First, an approximate depth map is derived from the image intensities using bilateral filters. The resulting surface normals are then used to map data onto the surface of the object to specify its material appearance. To create transparent or translucent materials, the mapped data are derived from the object's background. To create textured materials, the mapped data are a texture map. The surface normals can also be used to apply arbitrary bidirectional reflectance distribution functions to the surface, allowing us to simulate a wide range of materials. To facilitate the process of material editing, we generate the HDR image with a novel algorithm that is robust against noise in individual exposures. This ensures that any noise which might have adversely affected the shape recovery of the objects is removed. We also present an algorithm to automatically generate alpha mattes. This algorithm requires two input images, one where the object is in focus and one where the background is in focus, and then automatically produces an approximate matte indicating which pixels belong to the object. The result is then improved by a second algorithm to generate an accurate alpha matte, which can be given as input to our material editing techniques.
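
    To make the shape-recovery step concrete, the sketch below derives an approximate depth proxy by bilaterally filtering the image luminance and then converts its gradients into surface normals. It is only an illustration of the general idea under stated assumptions (a "darker is deeper" depth proxy and OpenCV's bilateral filter), not the authors' algorithm.

```python
# Minimal sketch: approximate depth from smoothed luminance, then surface
# normals from its gradients. Parameter values are illustrative.
import cv2
import numpy as np

def approximate_normals(image_bgr, depth_scale=1.0):
    """image_bgr: uint8 BGR image. Returns (depth proxy, per-pixel normals)."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY).astype(np.float32) / 255.0
    # Bilateral filtering smooths the luminance while preserving edges;
    # the result is treated as a rough depth proxy (an assumption here).
    depth = cv2.bilateralFilter(gray, 9, 0.1, 5.0)
    dzdx = cv2.Sobel(depth, cv2.CV_32F, 1, 0, ksize=3) * depth_scale
    dzdy = cv2.Sobel(depth, cv2.CV_32F, 0, 1, ksize=3) * depth_scale
    # Normal = normalize((-dz/dx, -dz/dy, 1)) at every pixel.
    normals = np.dstack((-dzdx, -dzdy, np.ones_like(depth)))
    normals /= np.linalg.norm(normals, axis=2, keepdims=True)
    return depth, normals
```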

    Data-Driven Shape Analysis and Processing

    Data-driven methods play an increasingly important role in discovering geometric, structural, and semantic relationships between 3D shapes in collections, and in applying this analysis to support intelligent modeling, editing, and visualization of geometric data. In contrast to traditional approaches, a key feature of data-driven approaches is that they aggregate information from a collection of shapes to improve the analysis and processing of individual shapes. In addition, they are able to learn models that reason about properties and relationships of shapes without relying on hard-coded rules or explicitly programmed instructions. We provide an overview of the main concepts and components of these techniques and discuss their application to shape classification, segmentation, matching, reconstruction, modeling and exploration, as well as scene analysis and synthesis, reviewing the literature and relating existing works through both qualitative and numerical comparisons. We conclude our report with ideas that can inspire future research in data-driven shape analysis and processing. Comment: 10 pages, 19 figures.

    Sensing of complex buildings and reconstruction into photo-realistic 3D models

    The 3D reconstruction of indoor and outdoor environments has received increased interest only recently, as companies began to recognize that reconstructed models are a way to generate revenue through location-based services and advertisements. A great amount of research has been done in the field of 3D reconstruction, and one of the latest and most promising applications is Kinect Fusion, developed by Microsoft Research. Its strong points are real-time, intuitive 3D reconstruction, an interactive frame rate, the level of detail in the models, and the availability of the hardware and software to researchers and enthusiasts. A representative effort towards 3D reconstruction is the Point Cloud Library (PCL), a large-scale, open project for 2D/3D image and point cloud processing. In December 2011, PCL made available an implementation of Kinect Fusion, namely KinFu, which emulates the functionality provided in Kinect Fusion. However, both implementations have two major limitations:
    1. The real-time reconstruction takes place only within a cube with a size of 3 meters per axis. The cube's position is fixed at the start of execution, and any object outside of this cube is not integrated into the reconstructed model. Therefore, the volume that can be scanned is always limited by the size of the cube. It is possible to manually align many small cubes into a single large model, but this is a time-consuming and difficult task, especially when the meshes have complex topologies and high polygon counts, as is the case with the meshes obtained from KinFu.
    2. The output mesh does not have any color textures. There are some attempts to add color to the output point cloud; however, the resulting effect is not photo-realistic. Applying photo-realistic textures to a model can enhance the user experience, even when the model has a simple topology.
    The main goal of this project is to design and implement a system that captures large indoor environments and generates photo-realistic large indoor 3D models in real time. This report describes an extended version of the KinFu system. The extensions overcome the scalability and texture reconstruction limitations using commodity hardware and open-source software. The complete hardware setup used in this project is worth €2,000, which is comparable to the cost of a single professional laser scanner. The software is released under the BSD license, which makes it completely free to use and commercialize. The system has been integrated into the open-source PCL project. The immediate benefits are three-fold: the system becomes a potential industry standard, it is maintained and extended by many developers around the world at no additional cost to the VCA group, and it can reduce application development time by reusing numerous state-of-the-art algorithms.
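
    One common way to lift the fixed-cube limitation is to shift the reconstruction volume whenever the camera approaches one of its faces and to stream the voxels that fall outside into a persistent world model. The bookkeeping below is only an illustrative sketch of that idea under assumed names and margins, not the actual PCL/KinFu extension.

```python
# Illustrative sketch of a shifting TSDF volume. The 3 m cube size comes
# from the description above; the shift margin and class are assumptions.
import numpy as np

CUBE_SIZE = 3.0      # meters per axis, as in the original KinFu
SHIFT_MARGIN = 0.75  # start shifting when the camera is this close to a face

class ShiftingVolume:
    def __init__(self, origin=np.zeros(3)):
        self.origin = np.asarray(origin, dtype=float)  # min corner in world coords

    def maybe_shift(self, camera_pos):
        """Re-center the cube on the camera if it gets too close to a face."""
        center = self.origin + CUBE_SIZE / 2.0
        offset = np.asarray(camera_pos, dtype=float) - center
        if np.any(np.abs(offset) > CUBE_SIZE / 2.0 - SHIFT_MARGIN):
            self.origin += offset  # caller streams out the voxels that exit the cube
            return True
        return False
```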

    Automatic Reconstruction of Textured 3D Models

    Three-dimensional modeling and visualization of environments is an increasingly important problem. This work addresses the problem of automatic 3D reconstruction, and we present a system for unsupervised reconstruction of textured 3D models in the context of modeling indoor environments. We present solutions to all aspects of the modeling process and an integrated system for the automatic creation of large-scale 3D models.