Linear Global Translation Estimation with Feature Tracks
This paper derives a novel linear position constraint for cameras seeing a
common scene point, which leads to a direct linear method for global camera
translation estimation. Unlike previous solutions, this method deals with
collinear camera motion and weak image association at the same time. The final
linear formulation does not involve the coordinates of scene points, which
makes it efficient even for large scale data. We solve the linear equation
based on the L1 norm, which makes our system more robust to outliers in
essential matrices and feature correspondences. We evaluate this method on
both sequentially captured images and unordered Internet images. The
experiments demonstrate its strength in robustness, accuracy, and efficiency.
Comment: Changes: 1. Adopt BMVC2015 style; 2. Combine sections 3 and 5; 3.
Move "Evaluation on synthetic data" out to supplementary file; 4. Divide
subsection "Evaluation on general data" into subsections "Experiment on
sequential data" and "Experiment on unordered Internet data"; 5. Change Fig.
1 and Fig. 8; 6. Move Fig. 6 and Fig. 7 to supplementary file; 7. Change some
symbols; 8. Correct some typos
Point triangulation through polyhedron collapse using the l∞ norm
Multi-camera triangulation of feature points based on a minimisation of the overall ℓ2 reprojection error can get stuck in suboptimal local minima or require slow global optimisation. For this reason, researchers have proposed optimising the ℓ∞ norm of the ℓ2 single-view reprojection errors, which avoids the problem of local minima entirely. In this paper we present a novel method for ℓ∞ triangulation that minimises the ℓ∞ norm of the ℓ∞ reprojection errors: this apparently small difference leads to a much faster but equally accurate solution which is related to the MLE under the assumption of uniform noise. The proposed method adopts a new optimisation strategy based on solving simple quadratic equations. This stands in contrast with the fastest existing methods, which solve a sequence of more complex auxiliary Linear Programming or Second Order Cone Problems. The proposed algorithm performs well: for triangulation, it achieves the same accuracy as existing techniques while executing faster and being straightforward to implement.
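To make the distinction between the two cost functions concrete, here is a minimal sketch that evaluates both on the same set of per-view residuals. It only compares the costs and does not implement the paper's polyhedron-collapse solver; the function names and the residual values are illustrative assumptions.

```python
import math

def linf_of_l2(residuals):
    """Largest Euclidean (l2) length among per-view 2D residuals."""
    return max(math.hypot(dx, dy) for dx, dy in residuals)

def linf_of_linf(residuals):
    """Largest absolute coordinate (l-infinity) among per-view 2D residuals."""
    return max(max(abs(dx), abs(dy)) for dx, dy in residuals)

# Hypothetical per-view residuals (observed minus projected), in pixels.
residuals = [(0.3, 0.4), (-0.5, 0.1)]
cost_l2 = linf_of_l2(residuals)      # worst view measured with the l2 norm
cost_linf = linf_of_linf(residuals)  # worst view measured with the l-infinity norm
```

Because the ℓ∞ length of a vector never exceeds its ℓ2 length, the second cost is always a lower bound on the first.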
Robust and large-scale quasiconvex programming in structure-from-motion
Structure-from-Motion (SfM) is a cornerstone of computer vision. Briefly speaking,
SfM is the task of simultaneously estimating the poses of the cameras behind a set of images of a
scene, and the 3D coordinates of the points in the scene.
Often, the optimisation problems that underpin SfM do not have closed-form solutions, and finding
solutions via numerical schemes is necessary. An objective function, which measures the discrepancy
of a geometric object (e.g., camera poses, rotations, 3D coordinates) with a set of image
measurements, is to be minimised. Each image measurement gives rise to an error function. For
example, the reprojection error, which measures the distance between an observed image point and
the projection of a 3D point onto the image, is a commonly used error function.
An influential optimisation paradigm in SfM is the ℓ∞ paradigm, where the objective function takes
the form of the maximum of all individual error functions (e.g. individual reprojection errors of
scene points). The benefit of the ℓ∞ paradigm is that the objective function of many SfM
optimisation problems becomes quasiconvex, hence there is a unique minimum in the objective
function. The task of formulating and minimising quasiconvex objective functions is called
quasiconvex programming.
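As a rough illustration of the max-of-errors objective described above, the sketch below evaluates it for a triangulation problem with two cameras. The camera matrices, observations, and function names are hypothetical, and the code only evaluates the objective at a candidate point; it is not a quasiconvex solver.

```python
import math

def project(P, X):
    """Project 3D point X with a 3x4 camera matrix P given as nested lists."""
    x, y, w = (sum(P[r][c] * v for c, v in enumerate((*X, 1.0))) for r in range(3))
    return (x / w, y / w)

def reprojection_errors(cameras, observations, X):
    """Distance between each observed image point and the projection of X."""
    return [math.dist(project(P, X), obs) for P, obs in zip(cameras, observations)]

def max_error_objective(cameras, observations, X):
    """Objective of the max-of-errors paradigm: the largest individual error."""
    return max(reprojection_errors(cameras, observations, X))

# Two hypothetical cameras: one at the origin, one shifted by 1 along x.
cameras = [
    [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]],
    [[1, 0, 0, -1], [0, 1, 0, 0], [0, 0, 1, 0]],
]
observations = [(0.0, 0.0), (-0.5, 0.1)]  # second observation is slightly noisy
cost = max_error_objective(cameras, observations, (0.0, 0.0, 2.0))
```

A solver for this objective would search over the 3D point to make `cost` as small as possible.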
Although tremendous progress in SfM techniques under the ℓ∞ paradigm has been made, there are still
unsatisfactorily solved problems, specifically, problems associated with large-scale input data and
outliers in the data. This thesis describes novel techniques to
tackle these problems.
A major weakness of the ℓ∞ paradigm is its susceptibility to outliers. This thesis improves the
robustness of ℓ∞ solutions against outliers by employing the least median of squares (LMS)
criterion, which amounts to minimising the median error. In the context of triangulation, this
thesis proposes a locally convergent robust algorithm underpinned by a novel quasiconvex plane
sweep technique. Imposing the LMS criterion achieves significant outlier tolerance, and, at the
same time, some properties of quasiconvexity greatly simplify the process of solving the LMS
problem.
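The effect of replacing the maximum with the median can be seen on a small set of residuals. This is a minimal sketch of the two criteria only, with made-up residual values; it does not implement the thesis's plane sweep algorithm.

```python
import statistics

def max_cost(residuals):
    """Max-of-errors cost: dominated by the single worst residual."""
    return max(abs(r) for r in residuals)

def lms_cost(residuals):
    """Least median of squares cost: median of the squared residuals."""
    return statistics.median(r * r for r in residuals)

# Hypothetical residuals: four small ones and one gross outlier.
residuals = [0.1, -0.2, 0.15, 0.05, 8.0]
worst = max_cost(residuals)   # driven entirely by the outlier
median = lms_cost(residuals)  # unaffected as long as outliers are a minority
```

Here the single outlier sets the maximum, while the median of the squared residuals still reflects the four well-behaved measurements.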
Approximation is a commonly used technique to tackle large-scale input data. This thesis introduces
the coreset technique to quasiconvex programming problems. The coreset technique aims to find a
representative subset of the input data, such that solving the same problem on the subset yields a
solution that is within a known bound of the optimal solution on the complete input set. In
particular, this thesis develops a coreset approximation algorithm to handle large-scale
triangulation tasks.
Another technique to handle large-scale input data is to break the optimisation into multiple
smaller sub-problems. Such a decomposition usually speeds up the overall optimisation process,
and alleviates the limitation on memory. This thesis develops a large-scale optimisation algorithm
for the known rotation problem (KRot). The proposed method decomposes the original quasiconvex
programming problem with potentially hundreds of thousands of parameters into multiple sub-problems
with only three parameters each. An efficient solver based on a novel minimum enclosing ball
technique is proposed to solve the sub-problems.
Thesis (Ph.D.) (Research by Publication) -- University of Adelaide, School of Computer Science, 201
Sparse learning approach to the problem of robust estimation of camera locations
In this paper, we propose a new approach, inspired by recent advances in the theory of sparse learning, to the problem of estimating camera locations when the internal parameters and the orientations of the cameras are known. Our estimator is defined as a Bayesian maximum a posteriori estimator with a multivariate Laplace prior on the vector describing the outliers. This leads to an estimator in which the fidelity to the data is measured by the L∞-norm while the regularization is done by the L1-norm. Building on the papers [11, 15, 16, 14, 21, 22, 24, 18, 23] for L∞-norm minimization in multiview geometry and, on the other hand, on the papers [8, 4, 7, 2, 1, 3] for sparse recovery in a statistical framework, we propose a two-step procedure which, at the first step, identifies and removes the outliers and, at the second step, estimates the unknown parameters by minimizing the L∞ cost function. Both steps are fairly fast: the outlier removal is done by solving one linear program (LP), while the final estimation is performed by a sequence of LPs. An important difference compared to many existing algorithms is that for our estimator it is not necessary to specify either the number or the proportion of the outliers.
An adversarial optimization approach to efficient outlier removal
This paper proposes a novel adversarial optimization approach to efficient outlier removal in computer vision. We characterize the outlier removal problem as a game that involves two players of conflicting interests, namely, optimizer and outlier. Such an adversarial view not only brings new insights into various existing methods, but also gives rise to a general optimization framework that provably unifies them. Under the proposed framework, we develop a new outlier removal approach that offers much-needed control over the trade-off between reliability and speed, which is not available in previous methods. The proposed approach is driven by a mixed-integer minmax (convex-concave) optimization process. Although a minmax problem is generally not amenable to efficient optimization, we show that for some commonly used vision objective functions, an equivalent Linear Program reformulation exists. We demonstrate our method on two representative multiview geometry problems. Experiments on real image data illustrate superior practical performance of our method over recent techniques.
Jin Yu, Anders Eriksson, Tat-Jun Chin, David Suter
http://www.iccv2011.org
New algorithmic developments in maximum consensus robust fitting
In many computer vision applications, the task of robustly estimating the set of parameters of
a geometric model is a fundamental problem. Despite the longstanding research efforts on robust
model fitting, there remains significant scope for investigation. For a large number of geometric
estimation tasks in computer vision, maximum consensus is the most popular robust fitting
criterion. This thesis makes several contributions in the algorithms for consensus maximization.
Randomized hypothesize-and-verify algorithms are arguably the most widely used class of
techniques for robust estimation thanks to their simplicity. Though efficient, these randomized
heuristic methods do not guarantee finding good maximum consensus estimates. To improve the
randomized algorithms, guided sampling approaches have been developed. These methods take
advantage of additional domain information, such as descriptor matching scores, to guide the
sampling process. Subsets of the data that are more likely to result in good estimates are prioritized
for consideration. However, these guided sampling approaches are ineffective when good
domain information is not available. This thesis tackles this shortcoming by proposing a new
guided sampling algorithm, which is based on the class of LP-type problems and Monte Carlo
Tree Search (MCTS). The proposed algorithm relies on a fundamental geometric arrangement
of the data to guide the sampling process. Specifically, we take advantage of the underlying tree
structure of the maximum consensus problem and apply MCTS to efficiently search the tree.
Empirical results show that the new guided sampling strategy outperforms traditional randomized
methods.
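For readers unfamiliar with the baseline, the following is a minimal sketch of a plain randomized hypothesize-and-verify loop for maximum consensus line fitting. It uses uniform random sampling, not the thesis's MCTS-guided sampling, and the data, threshold, and function names are illustrative assumptions.

```python
import random

def consensus(points, a, b, eps):
    """Points whose vertical residual to the line y = a*x + b is at most eps."""
    return [(x, y) for x, y in points if abs(y - (a * x + b)) <= eps]

def hypothesize_and_verify(points, eps, iterations=200, seed=0):
    """Return the sampled line with the largest consensus set found."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iterations):
        # Hypothesize: fit a line to a random minimal subset (two points).
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue  # a vertical line cannot be written as y = a*x + b
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        # Verify: count the points that agree with the hypothesis.
        inliers = consensus(points, a, b, eps)
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = (a, b), inliers
    return best_model, best_inliers

# Hypothetical data: ten points on y = 2x + 1 plus three gross outliers.
points = [(x, 2 * x + 1) for x in range(10)] + [(1, 10), (4, -3), (7, 30)]
model, inliers = hypothesize_and_verify(points, eps=0.1)
```

The loop gives no guarantee that the best sample is ever drawn, which is the weakness that guided sampling strategies try to reduce.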
Consensus maximization also plays a key role in robust point set registration. A special case
is the registration of deformable shapes. If the surfaces have the same intrinsic shapes, their
deformations can be described accurately by a conformal model. The uniformization theorem
allows the shapes to be conformally mapped onto a canonical domain, wherein the shapes can be
aligned using a Möbius transformation. The problem of correspondence-free Möbius alignment
of two noisy and partially overlapping point sets can be tackled as a maximum consensus
problem. Solving for the Möbius transformation can be approached by randomized voting-type
methods, which offer no guarantee of optimality. Local methods such as Iterative Closest Point
can be applied, but only under the assumption that a good initialization is given; otherwise these
techniques may converge to a bad local minimum. When a globally optimal solution is required, the literature
has so far considered only brute-force search. This thesis contributes a new branch-and-bound
algorithm that solves for the globally optimal Möbius transformation much more efficiently.
So far, the consensus maximization problems are approached mainly by randomized algorithms,
which are efficient but offer no analytical convergence guarantee. On the other hand,
there exist exact algorithms that can solve the problem up to global optimality. The global methods,
however, are intractable in general due to the NP-hardness of consensus maximization.
To fill the gap between the two extremes, this thesis contributes two novel deterministic algorithms
to approximately optimize the maximum consensus criterion. The first method is based
on non-smooth penalization supported by a Frank-Wolfe-style optimization scheme, and the second
is based on the Alternating Direction Method of Multipliers (ADMM). Both of the
proposed methods are capable of handling the non-linear geometric residuals commonly used in
computer vision. As will be demonstrated, our proposed methods consistently outperform other
heuristics and approximate methods.
Thesis (Ph.D.) (Research by Publication) -- University of Adelaide, School of Computer Science, 201
Keyframe-based monocular SLAM: design, survey, and future directions
Extensive research in the field of monocular SLAM for the past fifteen years
has yielded workable systems that found their way into various applications in
robotics and augmented reality. Although filter-based monocular SLAM systems
were once common, the more efficient keyframe-based solutions are
becoming the de facto methodology for building a monocular SLAM system. The
objective of this paper is threefold: first, the paper serves as a guideline
for people seeking to design their own monocular SLAM according to specific
environmental constraints. Second, it presents a survey that covers the various
keyframe-based monocular SLAM systems in the literature, detailing the
components of their implementation, and critically assessing the specific
strategies adopted in each proposed solution. Third, the paper provides insight
into the direction of future research in this field, to address the major
limitations still facing monocular SLAM; namely, in the issues of illumination
changes, initialization, highly dynamic motion, poorly textured scenes,
repetitive textures, map maintenance, and failure recovery.