Contextual Analysis of Textured Scene Images

Publication venue: 'British Machine Vision Association and Society for Pattern Recognition'
Publication date: 01/01/2006

Where and Who? Automatic Semantic-Aware Person Composition

Author: Barnes Connelly
Bernier Crispin
Cohen Benjamin
Ordonez Vicente
Tan Fuwen
Publication date: 02/12/2017

Image compositing is a method used to generate realistic yet fake imagery by inserting contents from one image into another. Previous work in compositing has focused on improving the appearance compatibility of a user-selected foreground segment and a background image (i.e., color and illumination consistency). In this work, we instead develop a fully automated compositing model that additionally learns to select and transform compatible foreground segments from a large collection given only an input background image. To simplify the task, we restrict our problem by focusing on human instance composition, because human segments exhibit strong correlations with their background and because of the availability of large annotated data. We develop a novel branching Convolutional Neural Network (CNN) that jointly predicts candidate person locations given a background image. We then use pre-trained deep feature representations to retrieve person instances from a large segment database. Experimental results show that our model can generate composite images that look visually convincing. We also develop a user interface to demonstrate the potential application of our method.

Comment: 10 pages, 9 figures

arXiv.org e-Print Archive

Crossref

Matterport3D: Learning from RGB-D Data in Indoor Environments

Author: Chang Angel
Dai Angela
Funkhouser Thomas
Halber Maciej
Nießner Matthias
Savva Manolis
Song Shuran
Zeng Andy
Zhang Yinda
Publication date: 01/01/2017

Access to large, diverse RGB-D datasets is critical for training RGB-D scene understanding algorithms. However, existing datasets still cover only a limited number of views or a restricted scale of spaces. In this paper, we introduce Matterport3D, a large-scale RGB-D dataset containing 10,800 panoramic views from 194,400 RGB-D images of 90 building-scale scenes. Annotations are provided with surface reconstructions, camera poses, and 2D and 3D semantic segmentations. The precise global alignment and the comprehensive, diverse set of panoramic views over entire buildings enable a variety of supervised and self-supervised computer vision tasks, including keypoint matching, view overlap prediction, normal prediction from color, semantic segmentation, and region classification.

arXiv.org e-Print Archive

Princeton University Open Access Repository

Crossref

Recovering 6D Object Pose: A Review and Multi-modal Analysis

Author: A Tejani
C Sahin
D Hoiem
E Brachmann
H Azizpour
M Everingham
MY Liu
N Correll
O Russakovsky
S Hinterstoisser
T Hodaň
U Bonde
W Kehl
Publication date: 15/08/2018

A large number of studies analyse object detection and pose estimation at the visual level in 2D, discussing the effects of challenges such as occlusion, clutter, and texture on the performance of methods that work in the RGB modality. Interpreting the depth data, the study in this paper presents thorough multi-modal analyses. It discusses the above-mentioned challenges for full 6D object pose estimation in RGB-D images, comparing the performance of several 6D detectors in order to answer the following questions: What is the current position of the computer vision community for maintaining "automation" in robotic manipulation? What next steps should the community take for improving "autonomy" in robotics while handling objects? Our findings include: (i) Reasonably accurate results are obtained on textured objects at varying viewpoints with cluttered backgrounds. (ii) Heavy occlusion and clutter severely affect the detectors, and similar-looking distractors are the biggest challenge in recovering instances' 6D pose. (iii) Template-based methods and random forest-based learning algorithms underlie object detection and 6D pose estimation. The recent paradigm is to learn deep discriminative feature representations and to adopt CNNs taking RGB images as input. (iv) Depending on the availability of large-scale 6D annotated depth datasets, feature representations can be learnt on these datasets, and the learnt representations can then be customized for the 6D problem.

arXiv.org e-Print Archive

Crossref

Matching Local Invariant Features with Contextual Information : an Experimental Evaluation

Author: Janaqi Stefan
Montesinos Philippe
Sidibe Desire
Publication venue: 'Universitat Autonoma de Barcelona'
Publication date: 01/01/2008

The main advantage of using local invariant features is their local character, which yields robustness to occlusion and varying background. Therefore, local features have proved to be a powerful tool for finding correspondences between images, and have been employed in many applications. However, the local character limits the descriptive capability of feature descriptors, and local features fail to resolve ambiguities that can occur when an image shows multiple similar regions. Considering some global information will clearly help to achieve better performance. The question is which information to use and how to use it. Context can be used to enrich the description of the features, or used in the matching step to filter out mismatches. In this paper, we compare different recent methods which use context for matching and show that better results are obtained if contextual information is used during the matching process. We evaluate the methods in two applications, wide baseline matching and object recognition, and it appears that a relaxation-based approach gives the best results.

HAL-UJM

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Directory of Open Access Journals

Revistes Catalanes amb Accés Obert

Electronic Letters on Computer Vision and Image Analysis (ELCVIA - Universitat Autònoma de Barcelona)

Diposit Digital de Documents de la UAB