Semantic 3D Reconstruction with Finite Element Bases
We propose a novel framework for the discretisation of multi-label problems
on arbitrary, continuous domains. Our work bridges the gap between general FEM
discretisations, and labeling problems that arise in a variety of computer
vision tasks, including for instance those derived from the generalised Potts
model. Starting from the popular formulation of labeling as a convex relaxation
by functional lifting, we show that FEM discretisation is valid for the most
general case, where the regulariser is anisotropic and non-metric. While our
findings are generic and applicable to different vision problems, we
demonstrate their practical implementation in the context of semantic 3D
reconstruction, where such regularisers have proved particularly beneficial.
The proposed FEM approach leads to a smaller memory footprint as well as faster
computation, and it constitutes a very simple way to enable variable, adaptive
resolution within the same model.
Advances in Graph-Cut Optimization: Multi-Surface Models, Label Costs, and Hierarchical Costs
Computer vision is full of problems that are elegantly expressed in terms of mathematical optimization, or energy minimization. This is particularly true of low-level inference problems such as cleaning up noisy signals, clustering and classifying data, or estimating 3D points from images. Energies let us state each problem as a clear, precise objective function. Minimizing the correct energy would, hypothetically, yield a good solution to the corresponding problem. Unfortunately, even for low-level problems we are confronted by energies that are computationally hard—often NP-hard—to minimize. As a consequence, a rather large portion of computer vision research is dedicated to proposing better energies and better algorithms for energies. This dissertation presents work along the same line, specifically new energies and algorithms based on graph cuts.
We present three distinct contributions. First, we consider biomedical segmentation where the object of interest comprises multiple distinct regions of uncertain shape (e.g. blood vessels, airways, bone tissue). We show that this common yet difficult scenario can be modeled as an energy over multiple interacting surfaces, and can be globally optimized by a single graph cut. Second, we introduce multi-label energies with label costs and provide algorithms to minimize them. We show how label costs are useful for clustering and robust estimation problems in vision. Third, we characterize a class of energies with hierarchical costs and propose a novel hierarchical fusion algorithm with improved approximation guarantees. Hierarchical costs are natural for modeling an array of difficult problems, e.g. segmentation with hierarchical context, simultaneous estimation of motions and homographies, or detecting hierarchies of patterns.
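To make the idea of a label cost concrete, the following is a minimal sketch that only evaluates such an energy for a given labeling; it does not implement the graph-cut algorithms of the dissertation. It assumes a Potts smoothness term and a fixed cost per used label, and the names `labeling_energy`, `unary`, `edges`, `smooth_weight` and `label_cost` are hypothetical, chosen for illustration.

```python
import numpy as np

def labeling_energy(labels, unary, edges, smooth_weight, label_cost):
    """Evaluate data + Potts smoothness + per-label cost for one labeling.

    All argument names are illustrative assumptions, not taken from the source.
    """
    labels = np.asarray(labels)
    edges = np.asarray(edges)
    # Data term: cost of assigning each site its chosen label.
    data = unary[np.arange(len(labels)), labels].sum()
    # Potts smoothness: a fixed penalty for every edge whose endpoints disagree.
    cut = np.count_nonzero(labels[edges[:, 0]] != labels[edges[:, 1]])
    # Label cost: paid once for each label that appears anywhere in the labeling.
    used = np.unique(labels)
    return float(data + smooth_weight * cut + label_cost[used].sum())

# Four sites on a chain, three candidate labels.
unary = np.array([[0, 1, 5], [0, 1, 5], [1, 0, 5], [1, 0, 5]])
edges = [(0, 1), (1, 2), (2, 3)]
label_cost = np.array([2.0, 2.0, 2.0])

e_two = labeling_energy([0, 0, 1, 1], unary, edges, 0.5, label_cost)
e_one = labeling_energy([0, 0, 0, 0], unary, edges, 0.5, label_cost)
print(e_two, e_one)
```

In this toy setting the two-label solution fits the data perfectly, yet the single-label solution has lower total energy because it pays only one label cost. This is the kind of preference for compact explanations that makes label costs useful for clustering and robust estimation.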
A Minimalist Approach to Type-Agnostic Detection of Quadrics in Point Clouds
This paper proposes a segmentation-free, automatic and efficient procedure to
detect general geometric quadric forms in point clouds, where clutter and
occlusions are inevitable. Our everyday world is dominated by man-made objects
which are designed using 3D primitives (such as planes, cones, spheres,
cylinders, etc.). These objects are also omnipresent in industrial
environments. This gives rise to the possibility of abstracting 3D scenes
through primitives, thereby positioning these geometric forms as an integral part
of perception and high-level 3D scene understanding.
As opposed to the state of the art, where a tailored algorithm treats each
primitive type separately, we propose to encapsulate all types in a single
robust detection procedure. At the center of our approach lies a closed-form 3D
quadric fit, operating in both primal & dual spaces and requiring as few as 4
oriented-points. Around this fit, we design a novel, local null-space voting
strategy to reduce the 4-point case to 3. Voting is coupled with the famous
RANSAC and makes our algorithm orders of magnitude faster than its conventional
counterparts. This is the first method capable of performing a generic
cross-type multi-object primitive detection in difficult scenes. Results on
synthetic and real datasets support the validity of our method.
Comment: Accepted for publication at CVPR 201
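For orientation, a general quadric is the zero set of a 10-coefficient polynomial in x, y, z. The sketch below is a plain algebraic least-squares fit on unoriented points, which is an assumption made here for illustration. It is not the paper's primal-dual fit from 4 oriented points, nor its voting scheme. The function name `fit_quadric` and the coefficient ordering are hypothetical.

```python
import numpy as np

def fit_quadric(points):
    """Algebraic least-squares quadric fit (illustrative baseline only).

    Returns q = (A, B, C, D, E, F, G, H, I, J), defined up to scale, for
    A x^2 + B y^2 + C z^2 + D xy + E xz + F yz + G x + H y + I z + J = 0.
    """
    pts = np.asarray(points, dtype=float)
    x, y, z = pts[:, 0], pts[:, 1], pts[:, 2]
    # One row of monomials per point; the quadric coefficients span the
    # (approximate) null space of this design matrix.
    design = np.column_stack(
        [x * x, y * y, z * z, x * y, x * z, y * z, x, y, z, np.ones_like(x)]
    )
    # The right singular vector of the smallest singular value minimises
    # ||design @ q|| subject to ||q|| = 1.
    _, _, vt = np.linalg.svd(design)
    return vt[-1]

# Noise-free points on the unit sphere x^2 + y^2 + z^2 - 1 = 0.
rng = np.random.default_rng(0)
dirs = rng.normal(size=(50, 3))
sphere = dirs / np.linalg.norm(dirs, axis=1, keepdims=True)

q = fit_quadric(sphere)
q_scaled = q / q[0]  # remove the arbitrary scale and sign
print(np.round(q_scaled, 6))
```

A fit of this kind needs at least 9 points in general position and is sensitive to outliers, which is part of why the abstract stresses a minimal fit combined with voting and RANSAC.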
Semantic 3D Occupancy Mapping through Efficient High Order CRFs
Semantic 3D mapping can be used for many applications such as robot
navigation and virtual interaction. In recent years, there has been great
progress in semantic segmentation and geometric 3D mapping. However, it is
still challenging to combine these two tasks for accurate and large-scale
semantic mapping from images. In this paper, we propose an incremental and
(near) real-time semantic mapping system. A 3D scrolling occupancy grid map is
built to represent the world, which is efficient in memory and computation and
bounded for large-scale environments. We utilize the CNN segmentation as prior
prediction and further optimize 3D grid labels through a novel CRF model.
Superpixels are utilized to enforce smoothness and form a robust P^N high-order
potential. An efficient mean field inference is developed for the graph
optimization. We evaluate our system on the KITTI dataset and improve the
segmentation accuracy by 10% over existing systems.
Comment: IROS 201
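As a rough illustration of mean field inference on label probabilities, the sketch below runs parallel mean-field updates for a pairwise Potts CRF. This is a simplifying assumption: the paper's model uses higher-order P^N potentials over superpixels on a 3D grid, which are not reproduced here. The names `mean_field_potts`, `unary`, `edges` and `weight` are hypothetical.

```python
import numpy as np

def mean_field_potts(unary, edges, weight, n_iters=10):
    """Parallel mean-field updates for a pairwise Potts CRF (illustrative only).

    unary[i, l] is the energy of giving site i label l, for example a negative
    log-probability from a segmentation network.
    """
    unary = np.asarray(unary, dtype=float)

    def softmax_neg(energy):
        # Turn energies into normalised probabilities, shifted for stability.
        shifted = energy - energy.min(axis=1, keepdims=True)
        p = np.exp(-shifted)
        return p / p.sum(axis=1, keepdims=True)

    q = softmax_neg(unary)  # initialise from the unaries alone
    for _ in range(n_iters):
        # Expected Potts penalty for label l at site i: the probability mass
        # that each neighbour j places on labels other than l, i.e. 1 - q[j, l].
        disagree = np.zeros_like(q)
        for i, j in edges:
            disagree[i] += 1.0 - q[j]
            disagree[j] += 1.0 - q[i]
        q = softmax_neg(unary + weight * disagree)
    return q

# Three sites on a chain; the middle one weakly prefers label 1.
unary = [[0.0, 2.0], [0.6, 0.4], [0.0, 2.0]]
q = mean_field_potts(unary, edges=[(0, 1), (1, 2)], weight=1.0)
print(q.argmax(axis=1))
```

In this toy chain the middle site's weak preference is overridden by its confident neighbours, which is the smoothing behaviour a CRF refinement step is meant to provide on top of per-cell predictions.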