Robust and large-scale quasiconvex programming in structure-from-motion
Structure-from-Motion (SfM) is a cornerstone of computer vision. Briefly speaking,
SfM is the task of simultaneously estimating the poses of the cameras behind a set of images of a
scene, and the 3D coordinates of the points in the scene.
Often, the optimisation problems that underpin SfM do not have closed-form solutions, and finding
solutions via numerical schemes is necessary. An objective function, which measures the discrepancy
of a geometric object (e.g., camera poses, rotations, 3D coordinates) with a set of image
measurements, is to be minimised. Each image measurement gives rise to an error function. For
example, the reprojection error, which measures the distance between an observed image point and
the projection of a 3D point onto the image, is a commonly used error function.
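As a toy illustration (my own sketch, not code from the thesis), the reprojection error under a pinhole camera model with a hypothetical 3×4 projection matrix P can be computed as follows:

```python
import numpy as np

def reproject(P, X):
    """Project a 3D point X through a 3x4 camera matrix P (pinhole model)."""
    x = P @ np.append(X, 1.0)      # homogeneous projection
    return x[:2] / x[2]            # dehomogenise to pixel coordinates

def reprojection_error(P, X, observed):
    """Euclidean distance between the observed image point and the projection of X."""
    return np.linalg.norm(reproject(P, X) - observed)

# Toy instance: canonical camera at the origin.
P = np.hstack([np.eye(3), np.zeros((3, 1))])
X = np.array([1.0, 2.0, 4.0])          # 3D scene point
obs = np.array([0.25, 0.5])            # observed pixel (exact projection here)
print(reprojection_error(P, X, obs))   # 0.0 for a noise-free observation
```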
An influential optimisation paradigm in SfM is the ℓ∞ paradigm, where the objective function takes
the form of the maximum of all individual error functions (e.g., the individual reprojection errors
of the scene points). The benefit of the ℓ∞ paradigm is that the objective functions of many SfM
optimisation problems become quasiconvex, hence the objective has a single minimum. The task of
formulating and minimising quasiconvex objective functions is called quasiconvex programming.
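Quasiconvex objectives are typically minimised by bisection on the error level, since every sublevel set is convex. The following one-dimensional sketch is illustrative only (the function `linf_bisection` and the interval feasibility test are my own assumptions, not the thesis's solver): it minimises max_i |x − a_i|, whose γ-sublevel sets are intersections of intervals [a_i − γ, a_i + γ].

```python
import numpy as np

def linf_bisection(a, tol=1e-9):
    """Minimise max_i |x - a_i| by bisection on the error level gamma.
    Each error's gamma-sublevel set is the interval [a_i - g, a_i + g];
    convexity of sublevel sets is what makes the bisection scheme exact."""
    lo, hi = 0.0, np.ptp(a)  # gamma = 0 and gamma = spread bracket the optimum
    while hi - lo > tol:
        g = 0.5 * (lo + hi)
        # feasible at level g iff all intervals [a_i - g, a_i + g] intersect
        if np.max(a) - g <= np.min(a) + g:
            hi = g
        else:
            lo = g
    return hi

a = np.array([1.0, 4.0, 2.5])
print(round(linf_bisection(a), 6))  # 1.5, half the spread of the data
```

In the multi-view problems of the thesis the feasibility test at each level is a convex (e.g. second-order cone) feasibility problem rather than an interval intersection, but the bisection skeleton is the same.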
Although tremendous progress has been made in SfM techniques under the ℓ∞ paradigm, some problems
remain unsatisfactorily solved, in particular those associated with large-scale input data and with
outliers in the data. This thesis describes novel techniques to tackle these problems.
A major weakness of the ℓ∞ paradigm is its susceptibility to outliers. This thesis improves the
robustness of ℓ∞ solutions against outliers by employing the least median of squares (LMS)
criterion, which amounts to minimising the median error. In the context of triangulation, this
thesis proposes a locally convergent robust algorithm underpinned by a novel quasiconvex plane
sweep technique. Imposing the LMS criterion achieves significant outlier tolerance, and, at the
same time, some properties of quasiconvexity greatly simplify the process of solving the LMS
problem.
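The LMS criterion itself is easy to illustrate outside of triangulation. The sketch below (my own toy example, not the thesis's algorithm) fits a line by scoring random two-point hypotheses with the median squared residual; because the median ignores the worst half of the residuals, gross outliers do not pull the fit:

```python
import random
import numpy as np

def lms_line(points, trials=500, seed=0):
    """Least-median-of-squares fit of a line y = m*x + c.
    Hypotheses come from random point pairs; the winner minimises the
    median squared residual, tolerating up to ~50% outliers."""
    rng = random.Random(seed)
    pts = list(points)
    best = (np.inf, None)
    for _ in range(trials):
        (x1, y1), (x2, y2) = rng.sample(pts, 2)
        if x1 == x2:
            continue  # vertical pair cannot define y = m*x + c
        m = (y2 - y1) / (x2 - x1)
        c = y1 - m * x1
        med = np.median([(y - (m * x + c)) ** 2 for x, y in pts])
        if med < best[0]:
            best = (med, (m, c))
    return best[1]

# Ten points on y = 2x + 1 plus three gross outliers.
pts = [(x, 2 * x + 1) for x in range(10)] + [(3, 40), (5, -30), (7, 99)]
m, c = lms_line(pts)
print(m, c)  # recovers (2.0, 1.0) despite the outliers
```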
Approximation is a commonly used technique to tackle large-scale input data. This thesis introduces
the coreset technique to quasiconvex programming problems. The coreset technique aims to find a
representative subset of the input data, such that solving the same problem on the subset yields a
solution that is within a known bound of the optimal solution on the complete input set. In
particular, this thesis develops a coreset-based approximation algorithm to handle large-scale
triangulation tasks.
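The coreset idea can be shown in a deliberately trivial one-dimensional setting (my own illustration, where the coreset is exact, i.e. the error bound is zero; the thesis's coresets for triangulation are of course more involved): for min_x max_i |x − a_i|, the two extreme samples already determine the optimum.

```python
import numpy as np

def linf_coreset_1d(a):
    """For the toy problem min_x max_i |x - a_i|, the two extreme samples
    form an exact coreset: solving on {min(a), max(a)} reproduces the
    optimum on all of a."""
    return np.array([np.min(a), np.max(a)])

def solve(a):
    # minimiser is the midpoint of the range; optimal value is half the spread
    return 0.5 * (np.min(a) + np.max(a)), 0.5 * np.ptp(a)

a = np.random.default_rng(1).normal(size=10000)
core = linf_coreset_1d(a)
x_full, v_full = solve(a)
x_core, v_core = solve(core)
print(v_full == v_core)  # True: 2 points stand in for 10000
```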
Another technique to handle large-scale input data is to break the optimisation into multiple
smaller sub-problems. Such a decomposition usually speeds up the overall optimisation process,
and alleviates the limitation on memory. This thesis develops a large-scale optimisation algorithm
for the known rotation problem (KRot). The proposed method decomposes the original quasiconvex
programming problem with potentially hundreds of thousands of parameters into multiple sub-problems
with only three parameters each. An efficient solver based on a novel minimum enclosing ball
technique is proposed to solve the sub-problems.
Thesis (Ph.D.) (Research by Publication) -- University of Adelaide, School of Computer Science, 201
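For intuition, a minimum enclosing ball can be approximated with the standard Badoiu–Clarkson iteration (a generic textbook method sketched here for illustration, not the thesis's novel solver): repeatedly step the centre toward the farthest point with a decaying step size.

```python
import numpy as np

def min_enclosing_ball(points, iters=1000):
    """Badoiu-Clarkson iteration: move the centre toward the current
    farthest point with step 1/(i+1); the centre converges to the
    minimum enclosing ball centre."""
    pts = np.asarray(points, float)
    c = pts.mean(axis=0)
    for i in range(1, iters + 1):
        far = pts[np.argmax(np.linalg.norm(pts - c, axis=1))]
        c += (far - c) / (i + 1)
    return c, np.linalg.norm(pts - c, axis=1).max()

# Four corners of the unit square: the optimal ball is centred at (0.5, 0.5)
# with radius sqrt(0.5) ~ 0.707.
square = np.array([[0, 0], [1, 0], [0, 1], [1, 1]], float)
c, r = min_enclosing_ball(square)
print(c, r)
```

Each three-parameter sub-problem of the decomposition is small enough that such a solver runs in a fraction of a millisecond, which is what makes solving hundreds of thousands of sub-problems tractable.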
Large bichromatic point sets admit empty monochromatic 4-gons
We consider a variation of a problem stated by Erdős
and Szekeres in 1935 about the existence of a number
fES(k) such that any set S of at least fES(k) points in
general position in the plane has a subset of k points
that are the vertices of a convex k-gon. In our setting
the points of S are colored, and we say that a (not necessarily
convex) spanned polygon is monochromatic if
all its vertices have the same color. Moreover, a polygon
is called empty if it does not contain any points of
S in its interior. We show that any bichromatic set of
n ≥ 5044 points in ℝ² in general position determines
at least one empty, monochromatic quadrilateral (and
thus linearly many).
Postprint (published version)
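The statement can be checked computationally on small instances. The brute-force sketch below is my own illustration and only searches for *convex* empty monochromatic quadrilaterals, a special case of the (not necessarily convex) spanned polygons considered in the paper:

```python
from itertools import combinations
import math

def cross(o, a, b):
    """Signed area of triangle (o, a, b); sign gives orientation."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def ordered(quad):
    """Sort 4 points angularly around their centroid."""
    cx = sum(p[0] for p in quad) / 4.0
    cy = sum(p[1] for p in quad) / 4.0
    return sorted(quad, key=lambda p: math.atan2(p[1] - cy, p[0] - cx))

def is_convex(quad):
    q = ordered(quad)
    s = [cross(q[i], q[(i + 1) % 4], q[(i + 2) % 4]) for i in range(4)]
    return all(x > 0 for x in s) or all(x < 0 for x in s)

def strictly_inside(quad, p):
    q = ordered(quad)
    s = [cross(q[i], q[(i + 1) % 4], p) for i in range(4)]
    return all(x > 0 for x in s) or all(x < 0 for x in s)

def empty_monochromatic_quad(points):
    """points: list of ((x, y), color). Returns 4 same-colored points in
    convex position whose quadrilateral contains no other point of S."""
    for quad in combinations(points, 4):
        verts = [p for p, _ in quad]
        if len({col for _, col in quad}) == 1 and is_convex(verts):
            rest = [p for p, _ in points if p not in verts]
            if not any(strictly_inside(verts, p) for p in rest):
                return verts
    return None

pts = [((0, 0), 'red'), ((1, 0), 'red'), ((1, 1), 'red'),
       ((0, 1), 'red'), ((5, 5), 'blue')]
print(empty_monochromatic_quad(pts))  # the four red corners
```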
Combinatorial Solutions for Shape Optimization in Computer Vision
This thesis aims at solving so-called shape optimization problems, i.e. problems where the shape of some real-world entity is sought, by applying combinatorial algorithms. I present several advances in this field, all of them based on energy minimization. The addressed problems will become more intricate in the course of the thesis, starting from problems that are solved globally, then turning to problems where so far no global solutions are known.

The first two chapters treat segmentation problems where the considered grouping criterion is directly derived from the image data. That is, the respective data terms do not involve any parameters to estimate. These problems will be solved globally. The first of these chapters treats the problem of unsupervised image segmentation, where apart from the image there is no other user input. Here I will focus on a contour-based method and show how to integrate curvature regularity into a ratio-based optimization framework. The arising optimization problem is reduced to optimizing over the cycles in a product graph. This problem can be solved globally in polynomial, effectively linear, time. As a consequence, the method does not depend on initialization, and translational invariance is achieved. This is joint work with Daniel Cremers and Simon Masnou.

I will then proceed to the integration of shape knowledge into the framework, while keeping translational invariance. This problem is again reduced to cycle-finding in a product graph. Being based on the alignment of shape points, the method actually uses a more sophisticated shape measure than most local approaches and still provides global optima. It readily extends to tracking problems and makes it possible to solve some of them in real time. I will present an extension to highly deformable shape models which can be included in the global optimization framework. This method simultaneously allows a shape to be decomposed into a set of deformable parts, based only on the input images.
This is joint work with Daniel Cremers.

In the second part, segmentation is combined with so-called correspondence problems, i.e. the underlying grouping criterion is now based on correspondences that have to be inferred simultaneously. That is, in addition to inferring the shapes of objects, one now also tries to put the points in several images into correspondence. The arising problems become more intricate and are no longer optimized globally. This part is divided into two chapters.

The first chapter treats the topic of real-time motion segmentation, where objects are identified based on the observation that the respective points in the video will move coherently. Rather than pre-estimating motion, a single energy functional is minimized via alternating optimization. The main novelty lies in the real-time capability, which is achieved by exploiting a fast combinatorial segmentation algorithm. The results are furthermore improved by employing a probabilistic data term. This is joint work with Daniel Cremers.

The final chapter presents a method for high resolution motion layer decomposition and was developed in collaboration with Daniel Cremers and Thomas Pock. Layer decomposition methods support the notion of a scene model, which makes it possible to model occlusion and enforce temporal consistency. The contributions are twofold: from a practical point of view, the proposed method makes it possible to recover fine-detailed layer images by minimizing a single energy. This is achieved by integrating a super-resolution method into the layer decomposition framework. From a theoretical viewpoint, the proposed method introduces layer-based regularity terms as well as a graph cut-based scheme to solve for the layer domains. The latter is combined with powerful continuous convex optimization techniques into an alternating minimization scheme.
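The graph cut machinery invoked above reduces a binary labelling (e.g. a pixel belongs to a layer or not) to a minimum s–t cut, which is computed via max-flow. A minimal generic sketch (Edmonds–Karp on an adjacency matrix; the tiny two-pixel "segmentation" instance is my own illustration, not taken from the thesis):

```python
from collections import deque

def max_flow(cap, s, t):
    """Edmonds-Karp max-flow on an adjacency-matrix of capacities.
    By max-flow/min-cut duality, the returned value is the minimum
    s-t cut cost, i.e. the optimal binary labelling energy.
    Note: cap is modified in place (it becomes the residual graph)."""
    n = len(cap)
    flow = 0
    while True:
        # BFS for a shortest augmenting path in the residual graph
        parent = [-1] * n
        parent[s] = s
        q = deque([s])
        while q and parent[t] == -1:
            u = q.popleft()
            for v in range(n):
                if parent[v] == -1 and cap[u][v] > 0:
                    parent[v] = u
                    q.append(v)
        if parent[t] == -1:
            return flow  # no augmenting path left: flow is maximal
        # find the bottleneck capacity along the path, then augment
        v, aug = t, float('inf')
        while v != s:
            aug = min(aug, cap[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:
            cap[parent[v]][v] -= aug
            cap[v][parent[v]] += aug
            v = parent[v]
        flow += aug

# Nodes: 0 = foreground terminal (source), 3 = background terminal (sink);
# 1 and 2 are two "pixels" with unary capacities to the terminals and a
# pairwise smoothness capacity of 2 between them.
cap = [[0, 5, 1, 0],
       [0, 0, 2, 1],
       [0, 2, 0, 5],
       [0, 0, 0, 0]]
print(max_flow(cap, 0, 3))  # 4: pixel 1 labelled foreground, pixel 2 background
```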
Lastly, I want to mention that a significant part of this thesis is devoted to the recent trend of exploiting parallel architectures, in particular graphics cards: many combinatorial algorithms are easily parallelized. In Chapter 3 we will see a case where the standard algorithm is hard to parallelize, but the respective problem instances make parallelization easy.