12,446 research outputs found
Proposal Flow: Semantic Correspondences from Object Proposals
Finding image correspondences remains a challenging problem in the presence
of intra-class variations and large changes in scene layout. Semantic flow
methods are designed to handle images depicting different instances of the same
object or scene category. We introduce a novel approach to semantic flow,
dubbed proposal flow, that establishes reliable correspondences using object
proposals. Unlike prevailing semantic flow approaches that operate on pixels or
regularly sampled local regions, proposal flow benefits from the
characteristics of modern object proposals, that exhibit high repeatability at
multiple scales, and can take advantage of both local and geometric consistency
constraints among proposals. We also show that the corresponding sparse
proposal flow can effectively be transformed into a conventional dense flow
field. We introduce two new challenging datasets that can be used to evaluate
both general semantic flow techniques and region-based approaches such as
proposal flow. We use these benchmarks to compare different matching
algorithms, object proposals, and region features within proposal flow, to the
state of the art in semantic flow. This comparison, along with experiments on
standard datasets, demonstrates that proposal flow significantly outperforms
existing semantic flow methods in various settings.Comment: arXiv admin note: text overlap with arXiv:1511.0506
Loop Closure Detection Based on Object-level Spatial Layout and Semantic Consistency
Visual simultaneous localization and mapping (SLAM) systems face challenges
in detecting loop closure under the circumstance of large viewpoint changes. In
this paper, we present an object-based loop closure detection method based on
the spatial layout and semanic consistency of the 3D scene graph. Firstly, we
propose an object-level data association approach based on the semantic
information from semantic labels, intersection over union (IoU), object color,
and object embedding. Subsequently, multi-view bundle adjustment with the
associated objects is utilized to jointly optimize the poses of objects and
cameras. We represent the refined objects as a 3D spatial graph with semantics
and topology. Then, we propose a graph matching approach to select
correspondence objects based on the structure layout and semantic property
similarity of vertices' neighbors. Finally, we jointly optimize camera
trajectories and object poses in an object-level pose graph optimization, which
results in a globally consistent map. Experimental results demonstrate that our
proposed data association approach can construct more accurate 3D semantic
maps, and our loop closure method is more robust than point-based and
object-based methods in circumstances with large viewpoint changes
Data-Driven Shape Analysis and Processing
Data-driven methods play an increasingly important role in discovering
geometric, structural, and semantic relationships between 3D shapes in
collections, and applying this analysis to support intelligent modeling,
editing, and visualization of geometric data. In contrast to traditional
approaches, a key feature of data-driven approaches is that they aggregate
information from a collection of shapes to improve the analysis and processing
of individual shapes. In addition, they are able to learn models that reason
about properties and relationships of shapes without relying on hard-coded
rules or explicitly programmed instructions. We provide an overview of the main
concepts and components of these techniques, and discuss their application to
shape classification, segmentation, matching, reconstruction, modeling and
exploration, as well as scene analysis and synthesis, through reviewing the
literature and relating the existing works with both qualitative and numerical
comparisons. We conclude our report with ideas that can inspire future research
in data-driven shape analysis and processing.Comment: 10 pages, 19 figure
InLoc: Indoor Visual Localization with Dense Matching and View Synthesis
We seek to predict the 6 degree-of-freedom (6DoF) pose of a query photograph
with respect to a large indoor 3D map. The contributions of this work are
three-fold. First, we develop a new large-scale visual localization method
targeted for indoor environments. The method proceeds along three steps: (i)
efficient retrieval of candidate poses that ensures scalability to large-scale
environments, (ii) pose estimation using dense matching rather than local
features to deal with textureless indoor scenes, and (iii) pose verification by
virtual view synthesis to cope with significant changes in viewpoint, scene
layout, and occluders. Second, we collect a new dataset with reference 6DoF
poses for large-scale indoor localization. Query photographs are captured by
mobile phones at a different time than the reference 3D map, thus presenting a
realistic indoor localization scenario. Third, we demonstrate that our method
significantly outperforms current state-of-the-art indoor localization
approaches on this new challenging data
Blending Learning and Inference in Structured Prediction
In this paper we derive an efficient algorithm to learn the parameters of
structured predictors in general graphical models. This algorithm blends the
learning and inference tasks, which results in a significant speedup over
traditional approaches, such as conditional random fields and structured
support vector machines. For this purpose we utilize the structures of the
predictors to describe a low dimensional structured prediction task which
encourages local consistencies within the different structures while learning
the parameters of the model. Convexity of the learning task provides the means
to enforce the consistencies between the different parts. The
inference-learning blending algorithm that we propose is guaranteed to converge
to the optimum of the low dimensional primal and dual programs. Unlike many of
the existing approaches, the inference-learning blending allows us to learn
efficiently high-order graphical models, over regions of any size, and very
large number of parameters. We demonstrate the effectiveness of our approach,
while presenting state-of-the-art results in stereo estimation, semantic
segmentation, shape reconstruction, and indoor scene understanding
- …