Geometric Multi-Model Fitting with a Convex Relaxation Algorithm
We propose a novel method to fit and segment multi-structural data via convex
relaxation. Unlike greedy methods --which maximise the number of inliers-- this
approach efficiently searches for a soft assignment of points to models by
minimising the energy of the overall classification. Our approach is similar to
state-of-the-art energy minimisation techniques which use a global energy.
However, we address the poor scaling of the original combinatorial problem (as
the number of models increases) by relaxing the solution. This relaxation
brings two advantages. First, by operating in the continuous domain we can
parallelize the calculations. Second, it allows for the use of different
metrics, which results in a more general formulation.
We demonstrate the versatility of our technique on two different problems of
estimating structure from images: plane extraction from RGB-D data and
homography estimation from pairs of images. In both cases, we report accurate
results on publicly available datasets, in most cases outperforming the
state of the art.
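The idea of a soft assignment of points to models can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a hypothetical entropy-regularised energy, for which the minimiser over each point's assignment weights has the closed form of a softmax over negative residuals. The line models, the residual function and the `temperature` parameter are all illustrative assumptions.

```python
import math

def line_residual(point, line):
    """Squared vertical distance from (x, y) to the line y = a*x + b."""
    x, y = point
    a, b = line
    return (y - (a * x + b)) ** 2

def soft_assign(points, lines, temperature=0.1):
    """Return one row of assignment weights per point; each row sums to 1.

    Assumed energy (illustrative only): sum_k u_k * cost_k + T * sum_k u_k * log(u_k),
    minimised over the probability simplex, which gives u_k proportional to
    exp(-cost_k / T).
    """
    assignments = []
    for p in points:
        costs = [line_residual(p, l) for l in lines]
        lowest = min(costs)  # subtract the minimum for numerical stability
        weights = [math.exp(-(c - lowest) / temperature) for c in costs]
        total = sum(weights)
        assignments.append([w / total for w in weights])
    return assignments

# Two hypothetical line models: y = x and y = 2.
points = [(0.0, 0.0), (1.0, 1.0), (0.0, 2.0), (1.0, 2.0)]
lines = [(1.0, 0.0), (0.0, 2.0)]
u = soft_assign(points, lines)
```

Lowering `temperature` pushes each row towards a hard (one-hot) assignment, while raising it spreads the weight across models.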
Optical techniques for 3D surface reconstruction in computer-assisted laparoscopic surgery
One of the main challenges for computer-assisted surgery (CAS) is to determine the intra-operative morphology and motion of soft tissues. This information is a prerequisite to the registration of multi-modal patient-specific data for enhancing the surgeon’s navigation capabilities by observing beyond exposed tissue surfaces and for providing intelligent control of robotic-assisted instruments. In minimally invasive surgery (MIS), optical techniques are an increasingly attractive approach for in vivo 3D reconstruction of the soft-tissue surface geometry. This paper reviews the state-of-the-art methods for optical intra-operative 3D reconstruction in laparoscopic surgery and discusses the technical challenges and future perspectives towards clinical translation. With the recent paradigm shift of surgical practice towards MIS and new developments in 3D optical imaging, this is a timely discussion about technologies that could facilitate complex CAS procedures in dynamic and deformable anatomical regions.
DDL-MVS: Depth Discontinuity Learning for MVS Networks
Traditional multi-view stereo (MVS) methods have good accuracy but struggle
with completeness, while recently developed learning-based MVS techniques have
improved completeness at the expense of accuracy. We propose depth
discontinuity learning for MVS methods, which further improves accuracy while
retaining the completeness of the reconstruction. Our idea is to jointly
estimate the depth and boundary maps where the boundary maps are explicitly
used for further refinement of the depth maps. We validate our idea and
demonstrate that our strategies can be easily integrated into the existing
learning-based MVS pipeline where the reconstruction depends on high-quality
depth map estimation. Extensive experiments on various datasets show that our
method improves reconstruction quality compared to the baseline. Experiments
also demonstrate that the presented model and strategies have good
generalization capabilities. The source code will be available soon.
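To make the role of a boundary map in depth refinement more concrete, here is a toy 1D sketch. It is not the network described in the abstract: it assumes a hypothetical hand-written smoothing filter in which a boundary probability between neighbouring samples suppresses averaging across that gap. The function name, the input data and the iteration count are illustrative.

```python
def refine_depth(depth, boundary, iters=10):
    """Smooth a 1D depth profile without blending across discontinuities.

    boundary[i] is an assumed probability in [0, 1] that a depth
    discontinuity lies between samples i and i + 1.
    """
    d = list(depth)
    n = len(d)
    for _ in range(iters):
        new = []
        for i in range(n):
            total, weight = d[i], 1.0
            if i > 0:
                w = 1.0 - boundary[i - 1]  # zero weight across a certain boundary
                total += w * d[i - 1]
                weight += w
            if i < n - 1:
                w = 1.0 - boundary[i]
                total += w * d[i + 1]
                weight += w
            new.append(total / weight)
        d = new
    return d

# Noisy profile with a step between samples 3 and 4.
noisy = [1.0, 1.2, 0.8, 1.0, 5.0, 5.2, 4.8, 5.0]
boundary = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
refined = refine_depth(noisy, boundary)
```

Because the weight across the marked boundary is zero, noise is averaged out within each surface while the step between the two surfaces stays sharp.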
Temporally coherent 4D reconstruction of complex dynamic scenes
This paper presents an approach for reconstruction of 4D temporally coherent
models of complex dynamic scenes. No prior knowledge of scene structure or
camera calibration is required, allowing reconstruction from multiple moving
cameras. Sparse-to-dense temporal correspondence is integrated with joint
multi-view segmentation and reconstruction to obtain a complete 4D
representation of static and dynamic objects. Temporal coherence is exploited
to overcome visual ambiguities, resulting in improved reconstruction of complex
scenes. Robust joint segmentation and reconstruction of dynamic objects is
achieved by introducing a geodesic star convexity constraint. Comparative
evaluation is performed on a variety of unstructured indoor and outdoor dynamic
scenes with hand-held cameras and multiple people. This demonstrates
reconstruction of complete temporally coherent 4D scene models with improved
nonrigid object segmentation and shape reconstruction.
Comment: To appear in The IEEE Conference on Computer Vision and Pattern
Recognition (CVPR) 2016. Video available at:
https://www.youtube.com/watch?v=bm_P13_-Ds
Second-order Shape Optimization for Geometric Inverse Problems in Vision
We develop a method for optimization in shape spaces, i.e., sets of surfaces
modulo re-parametrization. Unlike previously proposed gradient flows, we
achieve superlinear convergence rates through a subtle approximation of the
shape Hessian, which is generally hard to compute and suffers from a series of
degeneracies. Our analysis highlights the role of mean curvature motion in
comparison with first-order schemes: instead of surface area, our approach
penalizes deformation, either by its Dirichlet energy or total variation.
The latter regularizer motivates the development of an alternating direction method of
multipliers on triangular meshes. Therein, a conjugate-gradients solver enables
us to bypass formation of the Gaussian normal equations appearing in the course
of the overall optimization. We combine all of the aforementioned ideas in a
versatile geometric variation-regularized Levenberg-Marquardt-type method
applicable to a variety of shape functionals, depending on intrinsic properties
of the surface such as normal field and curvature as well as its embedding into
space. Promising experimental results are reported.
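The remark about bypassing the normal equations can be illustrated with a generic matrix-free sketch. It is not the paper's solver: it assumes a small dense Jacobian `J`, a residual vector `r` and a scalar `damping`, all hypothetical, and solves the damped system (J^T J + damping I) d = J^T r by conjugate gradients using only products with J and J^T, so that J^T J is never formed. Sign conventions for the step vary between formulations.

```python
def matvec(J, v):
    """Compute J @ v for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, v)) for row in J]

def rmatvec(J, w):
    """Compute J^T @ w without building the transpose."""
    return [sum(J[i][j] * w[i] for i in range(len(J))) for j in range(len(J[0]))]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def lm_step_cg(J, r, damping, iters=50, tol=1e-12):
    """Solve (J^T J + damping*I) d = J^T r by conjugate gradients."""
    def apply_A(v):
        # Two matrix-vector products replace the explicit product J^T J.
        JtJv = rmatvec(J, matvec(J, v))
        return [a + damping * b for a, b in zip(JtJv, v)]

    b = rmatvec(J, r)
    d = [0.0] * len(b)
    res = b[:]  # residual b - A d for the initial guess d = 0
    p = res[:]
    rs = dot(res, res)
    for _ in range(iters):
        if rs < tol:
            break
        Ap = apply_A(p)
        alpha = rs / dot(p, Ap)
        d = [di + alpha * pi for di, pi in zip(d, p)]
        res = [ri - alpha * ai for ri, ai in zip(res, Ap)]
        rs_new = dot(res, res)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(res, p)]
        rs = rs_new
    return d

# Hypothetical 3x2 Jacobian and residual vector.
J = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
r = [1.0, 2.0, 3.0]
d = lm_step_cg(J, r, damping=0.5)
```

For large sparse problems the same structure applies with sparse products, and the positive damping term keeps the system well conditioned enough for conjugate gradients to converge.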
Depth-Assisted Semantic Segmentation, Image Enhancement and Parametric Modeling
This dissertation addresses the problem of employing 3D depth information to solve a number of traditionally challenging computer vision/graphics problems. Humans are able to perceive depth in the 3D world, which enables them to reconstruct layouts, recognize objects and understand the geometric space and semantic meaning of the visual world. It is therefore important to explore how 3D depth information can be utilized by computer vision systems to mimic these abilities. This dissertation employs 3D depth information to solve vision/graphics problems in the following areas: scene understanding, image enhancement, and 3D reconstruction and modeling.
In addressing the scene understanding problem, we present a framework for semantic segmentation and object recognition on urban video sequences using only dense depth maps recovered from the video. Five view-independent 3D features that vary with object class are extracted from dense depth maps and used for segmenting and recognizing different object classes in street scene images. We demonstrate a scene parsing algorithm that uses only dense 3D depth information and outperforms approaches using sparse 3D or 2D appearance features.
In addressing the image enhancement problem, we present a framework to overcome the imperfections of personal photographs of tourist sites using the rich information provided by large-scale internet photo collections (IPCs). By augmenting personal 2D images with 3D information reconstructed from IPCs, we address a number of traditionally challenging image enhancement tasks and achieve high-quality results using simple and robust algorithms.
In addressing the 3D reconstruction and modeling problem, we focus on parametric modeling of flower petals, the most distinctive part of a plant. The complex structure, severe occlusions and wide variations make the reconstruction of their 3D models a challenging task. We overcome these challenges by combining data-driven modeling techniques with domain knowledge from botany. Taking a 3D point cloud of an input flower scanned from a single view, each segmented petal is fitted with a scale-invariant morphable petal shape model, which is constructed from individually scanned 3D exemplar petals. Novel constraints based on botany studies are incorporated into the fitting process for realistically reconstructing occluded regions and maintaining correct 3D spatial relations.
The main contribution of the dissertation is the intelligent use of 3D depth information to solve traditionally challenging vision/graphics problems. By developing advanced algorithms that run either automatically or with minimal user interaction, this dissertation aims to demonstrate that the 3D depth computed from multiple images contains rich information about the visual world and can therefore be intelligently utilized to recognize and understand the semantic meaning of scenes, efficiently enhance and augment single 2D images, and reconstruct high-quality 3D models.