
    Robust and large-scale quasiconvex programming in structure-from-motion

    Structure-from-Motion (SfM) is a cornerstone of computer vision. Briefly speaking, SfM is the task of simultaneously estimating the poses of the cameras behind a set of images of a scene, and the 3D coordinates of the points in the scene. Often, the optimisation problems that underpin SfM do not have closed-form solutions, and finding solutions via numerical schemes is necessary. An objective function, which measures the discrepancy of a geometric object (e.g., camera poses, rotations, 3D coordinates) with a set of image measurements, is to be minimised. Each image measurement gives rise to an error function. For example, the reprojection error, which measures the distance between an observed image point and the projection of a 3D point onto the image, is a commonly used error function. An influential optimisation paradigm in SfM is the ℓ∞ paradigm, where the objective function takes the form of the maximum of all individual error functions (e.g. individual reprojection errors of scene points). The benefit of the ℓ∞ paradigm is that the objective functions of many SfM optimisation problems become quasiconvex, hence the objective function has a unique minimum. The task of formulating and minimising quasiconvex objective functions is called quasiconvex programming. Although tremendous progress in SfM techniques under the ℓ∞ paradigm has been made, there remain unsatisfactorily solved problems, specifically, problems associated with large-scale input data and outliers in the data. This thesis describes novel techniques to tackle these problems. A major weakness of the ℓ∞ paradigm is its susceptibility to outliers. This thesis improves the robustness of ℓ∞ solutions against outliers by employing the least median of squares (LMS) criterion, which amounts to minimising the median error. In the context of triangulation, this thesis proposes a locally convergent robust algorithm underpinned by a novel quasiconvex plane sweep technique. Imposing the LMS criterion achieves significant outlier tolerance, and, at the same time, some properties of quasiconvexity greatly simplify the process of solving the LMS problem. Approximation is a commonly used technique to tackle large-scale input data. This thesis introduces the coreset technique to quasiconvex programming problems. The coreset technique aims to find a representative subset of the input data, such that solving the same problem on the subset yields a solution that is within a known bound of the optimal solution on the complete input set. In particular, this thesis develops a coreset approximation algorithm to handle large-scale triangulation tasks. Another technique to handle large-scale input data is to break the optimisation into multiple smaller sub-problems. Such a decomposition usually speeds up the overall optimisation process and alleviates the limitation on memory. This thesis develops a large-scale optimisation algorithm for the known rotation problem (KRot). The proposed method decomposes the original quasiconvex programming problem with potentially hundreds of thousands of parameters into multiple sub-problems with only three parameters each. An efficient solver based on a novel minimum enclosing ball technique is proposed to solve the sub-problems. Thesis (Ph.D.) (Research by Publication) -- University of Adelaide, School of Computer Science, 201
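    To make the ℓ∞ objective above concrete, here is a minimal Python sketch (an illustration only, not the thesis's algorithms): the cost of a candidate 3D point is the maximum reprojection error over all views. The toy cameras, observations, and the function name are assumptions made for this example.

```python
import numpy as np
from scipy.optimize import minimize

def max_reproj_error(X, cameras, observations):
    """L-infinity cost of 3D point X: the maximum reprojection error over views."""
    errors = []
    for P, x_obs in zip(cameras, observations):
        x_hom = P @ np.append(X, 1.0)      # project X with a 3x4 camera matrix
        x_proj = x_hom[:2] / x_hom[2]      # dehomogenise to image coordinates
        errors.append(np.linalg.norm(x_proj - x_obs))
    return max(errors)

# Toy setup: two cameras observing the point (0, 0, 5) without noise.
cameras = [np.hstack([np.eye(3), np.zeros((3, 1))]),
           np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])]
observations = [np.array([0.0, 0.0]), np.array([-0.2, 0.0])]
X_hat = minimize(max_reproj_error, x0=np.array([0.1, 0.1, 4.0]),
                 args=(cameras, observations), method="Nelder-Mead").x
```

    Because the max-of-quasiconvex objective is itself quasiconvex, a descent scheme started in the region where the point is in front of the cameras cannot be trapped in a spurious local minimum, which is the property the abstract appeals to.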

    Consensus Maximization: Theoretical Analysis and New Algorithms

    The core of many computer vision systems is model fitting, which estimates a particular mathematical model given a set of input data. Due to the imperfection of the sensors, pre-processing steps and/or model assumptions, computer vision data usually contains outliers: abnormally distributed data points that can heavily reduce the accuracy of conventional model fitting methods. Robust fitting aims to make model fitting insensitive to outliers. Consensus maximization is one of the most popular paradigms for robust fitting, and it is the main research subject of this thesis. Mathematically, consensus maximization is an optimization problem. To understand the theoretical hardness of this problem, a thorough analysis of its computational complexity is first conducted. Motivated by the theoretical analysis, novel techniques that improve different types of algorithms are then introduced. On one hand, an efficient and deterministic optimization approach is proposed. Unlike previous deterministic approaches, the proposed one does not rely on a relaxation of the original optimization problem. This property makes it much more effective at refining an initial solution. On the other hand, several techniques are proposed to significantly accelerate consensus maximization tree search. Tree search is one of the most efficient global optimization approaches for consensus maximization, hence the proposed techniques greatly improve the practicality of globally optimal consensus maximization algorithms. Finally, a consensus-maximization-based method is proposed to register terrestrial LiDAR point clouds. It demonstrates how to surpass the general theoretical hardness by using special problem structure (the rotation axis returned by the sensors), which simplifies the problem and leads to application-oriented algorithms that are both efficient and globally optimal. Thesis (Ph.D.) -- University of Adelaide, School of Computer Science, 202
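    As an illustration of the consensus maximization objective described above, the sketch below scores a 2D line by the number of points it explains within a threshold, alongside a RANSAC-style randomized baseline. This is illustrative only; the deterministic refinement and globally optimal tree-search algorithms the thesis proposes are not shown, and all names here are hypothetical.

```python
import numpy as np

def consensus(theta, xs, ys, eps):
    """Consensus score: number of points with residual within threshold eps."""
    a, b = theta
    return int(np.sum(np.abs(ys - (a * xs + b)) <= eps))

def max_consensus_random(xs, ys, eps, iters=1000, seed=0):
    """Fit lines to random minimal samples; keep the highest-consensus one."""
    rng = np.random.default_rng(seed)
    best_theta, best_score = None, -1
    for _ in range(iters):
        i, j = rng.choice(len(xs), size=2, replace=False)
        if xs[i] == xs[j]:
            continue                        # degenerate minimal sample
        a = (ys[j] - ys[i]) / (xs[j] - xs[i])
        theta = (a, ys[i] - a * xs[i])
        score = consensus(theta, xs, ys, eps)
        if score > best_score:
            best_theta, best_score = theta, score
    return best_theta, best_score
```

    Randomized sampling gives no optimality guarantee, which is precisely the gap between such baselines and the globally optimal methods the thesis studies.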

    Robust and Optimal Methods for Geometric Sensor Data Alignment

    Geometric sensor data alignment - the problem of finding the rigid transformation that correctly aligns two sets of sensor data without prior knowledge of how the data correspond - is a fundamental task in computer vision and robotics. It is inconvenient, then, that outliers and non-convexity are inherent to the problem and present significant challenges for alignment algorithms. Outliers are highly prevalent in sets of sensor data, particularly when the sets overlap incompletely. Despite this, many alignment objective functions are not robust to outliers, leading to erroneous alignments. In addition, alignment problems are highly non-convex, a property arising from the objective function and the transformation. While finding a local optimum may not be difficult, finding the global optimum is a hard optimisation problem. These key challenges have not been fully and jointly resolved in the existing literature, and so there is a need for robust and optimal solutions to alignment problems. Hence, the objective of this thesis is to develop tractable algorithms for geometric sensor data alignment that are robust to outliers and not susceptible to spurious local optima. This thesis makes several significant contributions to the geometric alignment literature, founded on new insights into robust alignment and the geometry of transformations. Firstly, a novel discriminative sensor data representation is proposed that has better viewpoint invariance than generative models and is time- and memory-efficient without sacrificing model fidelity. Secondly, a novel local optimisation algorithm is developed for nD-nD geometric alignment under a robust distance measure. It manifests a wider region of convergence and a greater robustness to outliers and sampling artefacts than other local optimisation algorithms. Thirdly, the first optimal solution for 3D-3D geometric alignment with an inherently robust objective function is proposed. It outperforms other geometric alignment algorithms on challenging datasets due to its guaranteed optimality and outlier robustness, and has an efficient parallel implementation. Fourthly, the first optimal solution for 2D-3D geometric alignment with an inherently robust objective function is proposed. It outperforms existing approaches on challenging datasets, reliably finding the global optimum, and has an efficient parallel implementation. Finally, another optimal solution is developed for 2D-3D geometric alignment, using a robust surface alignment measure. Ultimately, robust and optimal methods, such as those in this thesis, are necessary to reliably find accurate solutions to geometric sensor data alignment problems.
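    A minimal sketch of the kind of outlier-robust, correspondence-free alignment cost discussed above (an illustration under stated assumptions, not any of the thesis's algorithms): each transformed source point pays its distance to the nearest target point, truncated at a threshold tau so that points with no true counterpart contribute only a bounded penalty.

```python
import numpy as np

def robust_alignment_cost(R, t, source, target, tau):
    """Truncated nearest-neighbour cost of a candidate rigid transform.

    R: 3x3 rotation, t: length-3 translation; source (N,3), target (M,3).
    tau is the truncation threshold bounding each outlier's contribution.
    """
    moved = source @ R.T + t                          # apply candidate transform
    d = np.linalg.norm(moved[:, None, :] - target[None, :, :], axis=2)
    nearest = d.min(axis=1)                           # nearest-neighbour distance
    return np.minimum(nearest, tau).sum()             # truncation caps outliers
```

    Truncation is what makes the objective robust but also non-convex in (R, t), which is why the thesis pursues globally optimal search rather than local descent alone.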

    Pareto optimality solution of the multi-objective photogrammetric resection-intersection problem

    Reconstruction of architectural structures from photographs has recently attracted intensive effort in computer vision research. This is achieved through the solution of nonlinear least squares (NLS) problems to obtain accurate structure and motion estimates. In photogrammetry, NLS contributes to the determination of 3-dimensional (3D) terrain models from photographic images. The traditional NLS approach to solving the resection-intersection problem, based on an implicit formulation, suffers on the one hand from the lack of any provision by which the involved variables can be weighted. On the other hand, an explicit formulation expresses the objectives to be minimised in different forms, thus resulting in different parametric values for the estimated parameters at non-zero residuals. Sometimes these objectives may conflict in a Pareto sense, namely, a small change in the parameters results in the increase of one objective and a decrease of the other, as is often the case in multi-objective problems. Such is often the case with error-in-all-variables (EIV) models, e.g., in the resection-intersection problem, where such a change in the parameters could be caused by errors in both image and reference coordinates. This study proposes the Pareto optimal approach as a possible improvement to the solution of the resection-intersection problem, providing simultaneous estimation of the coordinates and orientation parameters of the cameras in a two- or multi-station camera system on the basis of a properly weighted multi-objective function. This objective represents the weighted sum of the squares of the direct explicit differences between the measured and computed ground coordinates as well as the image coordinates. The effectiveness of the proposed method is demonstrated on two camera calibration problems, where the internal and external orientation parameters are estimated on the basis of the collinearity equations, employing the data of a Manhattan-type test field as well as the data of an outdoor, real-case experiment. In addition, an architectural reconstruction of the Merton College court in Oxford (UK) via estimation of camera matrices is also presented. Although these two problems are different, in that the first case considers the error reduction of the image and spatial coordinates while the second considers the precision of the space coordinates, Pareto optimality can handle both problems in a general and flexible way.
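    The weighted-sum scalarization at the heart of this Pareto approach can be sketched as follows. Here f_image and f_ground are hypothetical stand-ins for the squared image- and ground-coordinate residual objectives (in the paper these derive from the collinearity equations); this is an illustration, not the authors' code.

```python
import numpy as np
from scipy.optimize import minimize

def pareto_point(f_image, f_ground, x0, w):
    """One Pareto-optimal solution for weight w in [0, 1]."""
    scalarized = lambda x: w * f_image(x) + (1.0 - w) * f_ground(x)
    return minimize(scalarized, x0).x

# Toy conflicting objectives with minima at x=0 and x=1: a small move that
# lowers one raises the other, so sweeping w traces out the Pareto front.
f1 = lambda x: float((x[0] - 0.0) ** 2)
f2 = lambda x: float((x[0] - 1.0) ** 2)
front = [pareto_point(f1, f2, np.array([0.5]), w) for w in (0.1, 0.5, 0.9)]
```

    Choosing the weights is exactly the "properly weighted multi-objective function" the abstract refers to: each weight vector selects one point on the trade-off curve between image-space and object-space accuracy.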

    Very fast solution to the PnP problem with algebraic outlier rejection

    Presented at CVPR 2014, held in Columbus, Ohio (US), 23-28 June. We propose a real-time, outlier-robust and accurate solution to the Perspective-n-Point (PnP) problem. The main advantages of our solution are twofold: first, it integrates the outlier rejection within the pose estimation pipeline with negligible computational overhead; and second, it scales to arbitrarily large numbers of correspondences. Given a set of 3D-to-2D matches, we formulate the pose estimation problem as a low-rank homogeneous system whose solution lies in its 1D null space. Outlier correspondences are those rows of the linear system which perturb the null space, and they are progressively detected by projecting them on an iteratively estimated solution of the null space. Since our outlier removal process is based on an algebraic criterion which does not require computing the full pose and reprojecting all 3D points back onto the image plane at each step, we achieve speed gains of more than 100× compared to RANSAC strategies. An extensive experimental evaluation shows that our solution yields accurate results in situations with up to 50% outliers, and can process more than 1000 correspondences in less than 5 ms. This work has been partially funded by the Spanish government under projects DPI2011-27510, IPT-2012-0630-020000, IPT-2011-1015-430000 and CICYT grant TIN2012-39203; by the EU project ARCAS FP7-ICT-2011-28761; and by the ERA-Net Chistera project ViSen PCIN-2013-047. Peer Reviewed
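    The null-space-based outlier rejection described above can be sketched as follows (a simplified illustration, not the authors' implementation): the solution vector is estimated as the right singular vector of the smallest singular value, and rows with large algebraic residuals against the current estimate are progressively discarded. The function name and the keep fraction are assumptions for the example.

```python
import numpy as np

def null_space_with_rejection(M, iters=5, keep=0.8):
    """Estimate the 1D null space of M while pruning perturbing rows.

    M: (m, n) matrix built from 3D-to-2D matches, one or more rows per match.
    Returns the null-space estimate and the indices of the retained rows.
    """
    rows = np.arange(M.shape[0])
    for _ in range(iters):
        _, _, Vt = np.linalg.svd(M[rows], full_matrices=False)
        v = Vt[-1]                            # current null-space estimate
        res = np.abs(M[rows] @ v)             # algebraic residual per row
        k = max(int(keep * len(rows)), M.shape[1])
        rows = rows[np.argsort(res)[:k]]      # keep the best-fitting rows
    return v, rows
```

    The key efficiency point from the abstract is visible here: each pruning step only needs matrix-vector residuals, never a full pose computation or a reprojection of all 3D points.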

    Methods for Optimal Model Fitting and Sensor Calibration

    The problem of fitting models to measured data has been studied extensively, not least in the field of computer vision. A central problem in this field is the difficulty of reliably finding corresponding structures and points in different images, resulting in outlier data. This thesis presents theoretical results improving the understanding of the connection between model parameter estimation and the possible outlier-inlier partitions of data point sets. Using these results, a multitude of applications can be analyzed with respect to optimal outlier-inlier partitions and optimal norm fitting, not least in the truncated norm sense. Practical polynomial-time optimal solvers are derived for several applications, including but not limited to multi-view triangulation and image registration. This thesis also investigates the problem of sensor network self-calibration. Sensor networks play an increasingly important role with the increased availability of mobile, antenna-equipped devices, and their application areas can be extended with knowledge of the sensors' relative or absolute positions. We study this problem in the context of bipartite sensor networks, identify requirements for solvability of several configurations, and present a framework for how such problems can be approached. Further, we utilize this framework to derive several solvers, which we show on both synthetic and real examples to function as desired. In both these types of model estimation, as well as in classical random-sampling-based approaches, minimal cases of polynomial systems play a central role. A majority of the problems tackled in this thesis have solvers based on recent techniques pertaining to action matrix solvers. New application-specific polynomial equation sets are constructed and elimination templates designed for them. In addition, a general improvement to the method is suggested for a large class of polynomial systems. The improvement is shown to increase computational speed through significant reductions in the size of the elimination templates as well as of the action matrices, and on average it also improves the numerical stability of the solvers.
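    The action-matrix idea underlying these solvers is easiest to see in the univariate case: multiplication by x in the quotient ring C[x]/(p) is a linear map whose matrix is the companion matrix of p, and its eigenvalues are exactly the roots of p. The sketch below shows that baseline only; the multivariate elimination-template machinery the thesis improves is far more involved and is not shown.

```python
import numpy as np

def roots_via_action_matrix(coeffs):
    """Roots of the monic polynomial p(x) = x^n + c[0] x^(n-1) + ... + c[n-1].

    Builds the action (companion) matrix of multiplication by x in the basis
    {1, x, ..., x^(n-1)} of C[x]/(p) and returns its eigenvalues.
    """
    n = len(coeffs)
    A = np.zeros((n, n))
    A[1:, :-1] = np.eye(n - 1)               # shift: x * x^k -> x^(k+1)
    A[:, -1] = -np.asarray(coeffs)[::-1]     # reduce x^n modulo p
    return np.linalg.eigvals(A)

# e.g. p(x) = x^2 - 3x + 2 has roots {1, 2}:
# roots_via_action_matrix([-3.0, 2.0])
```

    In the multivariate setting, the elimination template plays the role of the modular reduction step, and shrinking that template is exactly the speed and stability improvement the abstract reports.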

    Robust convex optimisation techniques for autonomous vehicle vision-based navigation

    This thesis investigates new convex optimisation techniques for motion and pose estimation. Numerous computer vision problems can be formulated as optimisation problems. These optimisation problems are generally solved via linear techniques using the singular value decomposition or via iterative methods under an L2-norm minimisation. Linear techniques have the advantage of offering a closed-form solution that is simple to implement; the quantity being minimised is, however, not geometrically or statistically meaningful. Conversely, L2 algorithms rely on iterative estimation, where a cost function is minimised using algorithms such as Levenberg-Marquardt, Gauss-Newton, gradient descent or conjugate gradient. The cost functions involved are geometrically interpretable and can be statistically optimal under an assumption of Gaussian noise. However, in addition to their sensitivity to initial conditions, these algorithms are often slow and bear a high probability of getting trapped in a local minimum or producing infeasible solutions, even for small noise levels. In light of the above, this thesis focuses on developing new techniques for finding globally optimal solutions via a convex optimisation framework. Convex optimisation techniques in motion estimation have recently revealed enormous advantages: convex optimisation ensures a global minimum, and the cost function is geometrically meaningful. Moreover, robust optimisation is a recent approach for optimisation under uncertain data. In recent years the need to cope with uncertain data has become especially acute, particularly in real-world applications. In such circumstances, robust optimisation aims to recover an optimal solution whose feasibility is guaranteed for any realisation of the uncertain data. Although many researchers avoid uncertainty, owing to the added complexity of constructing a robust optimisation model and to a lack of knowledge about the nature of these uncertainties and especially their propagation, this thesis investigates robust convex optimisation, with the uncertainties estimated at every step, for the motion estimation problem. First, a solution using convex optimisation coupled with the recursive least squares (RLS) algorithm and the robust H∞ filter is developed for motion estimation. In another solution, uncertainties and their propagation are incorporated in a robust L∞ convex optimisation framework for monocular visual motion estimation, in which robust least squares is combined with a second-order cone program (SOCP). A technique to improve the accuracy and robustness of the fundamental matrix is also investigated; it uses the covariance intersection approach to fuse feature location uncertainties, which leads to more consistent motion estimates. Loop-closure detection is crucial for improving the robustness of navigation algorithms: in practice, after long navigation in an unknown environment, detecting that a vehicle is in a location it has previously visited provides the opportunity to increase the accuracy and consistency of the estimate. In this context, we have developed an efficient appearance-based method for visual loop-closure detection based on the combination of a Gaussian mixture model with the KD-tree data structure. Deploying this technique for loop-closure detection, a robust L∞ convex pose-graph optimisation solution for unmanned aerial vehicle (UAV) monocular motion estimation is introduced as well.
In the literature, most proposed solutions formulate pose-graph optimisation as a least-squares problem, minimising a cost function using iterative methods. In this work, robust convex optimisation under the L∞ norm is adopted, which efficiently corrects the UAV's pose after loop-closure detection. To round out the work in this thesis, a system for cooperative monocular visual motion estimation with multiple aerial vehicles is proposed. The cooperative motion estimation employs state-of-the-art approaches for optimisation, individual motion estimation and registration. Three-view geometry algorithms in a convex optimisation framework are deployed on board the monocular vision system of each vehicle. In addition, vehicle-to-vehicle relative pose estimation is performed with a novel robust registration solution in a global optimisation framework. In parallel, and as a complementary solution for the relative pose, a robust non-linear H∞ solution is designed to fuse measurements from the UAVs' on-board inertial sensors with the visual estimates. The suggested contributions have been exhaustively evaluated over a number of real-image data experiments in the laboratory using monocular vision systems and range imaging devices. In this thesis, we propose several solutions towards the goal of robust visual motion estimation using convex optimisation. We show that the convex optimisation framework may be extended to include uncertainty information to achieve robust and optimal solutions, and we observe that convex optimisation is a practical and very appealing alternative to linear techniques and iterative methods.
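    Among the components above, covariance intersection has a compact closed form worth spelling out. The sketch below implements the standard CI fusion rule for two estimates with unknown cross-correlation (an illustration of the textbook rule, not the thesis's code); the weight w would typically be chosen to minimise, e.g., the trace of the fused covariance.

```python
import numpy as np

def covariance_intersection(x1, P1, x2, P2, w):
    """Fuse estimates (x1, P1) and (x2, P2) with weight w in [0, 1].

    CI remains consistent even when the cross-correlation between the two
    estimates is unknown, which is why it suits fusing feature uncertainties.
    """
    I1, I2 = np.linalg.inv(P1), np.linalg.inv(P2)
    P = np.linalg.inv(w * I1 + (1.0 - w) * I2)        # fused covariance
    x = P @ (w * I1 @ x1 + (1.0 - w) * I2 @ x2)       # fused estimate
    return x, P
```

    Unlike a Kalman update, which assumes the cross-covariance is known (often zero), CI never understates the fused uncertainty, at the cost of some conservatism.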

    Why do we optimize what we optimize in multiple view geometry?

    Get PDF
    For a computer to be able to understand the 3D geometry of its environment, we need to derive the geometric relationships between 2D images and the 3D world. Multiple view geometry is the research area that studies this problem. Most existing methods solve small parts of this large problem by minimising a particular objective function. These functions are usually composed of algebraic or geometric errors that represent deviations from the observation model. In short, we generally try to recover the 3D structure of the world and the camera motion by finding the model that minimises the discrepancy with respect to the observations. The focus of this thesis is mainly on two aspects of multi-view reconstruction problems: error criteria and robustness. First, we study the error criteria used in various geometric problems and ask 'Why do we optimize what we optimize?' Specifically, we analyse their pros and cons and propose novel methods that combine existing criteria or adopt a better alternative. Second, we aim for state-of-the-art robustness against outliers and challenging scenarios, which are often encountered in practice. To this end, we propose multiple novel ideas that can be incorporated into optimisation-based methods. Specifically, we study the following problems: monocular SLAM, two-view and multi-view triangulation, single and multiple rotation averaging, rotation-only bundle adjustment, robust averaging of numbers, and quantitative evaluation of trajectory estimation. For monocular SLAM, we propose a novel hybrid approach that combines the strengths of direct and feature-based methods. Direct methods minimise photometric errors between corresponding pixels in several images, while feature-based methods minimise reprojection errors. Our method loosely couples direct odometry and feature-based SLAM, and we show that it improves robustness in challenging scenarios, as well as accuracy when the camera motion involves frequent revisits. For two-view triangulation, we propose optimal methods that minimise angular reprojection errors in closed form. Since the angular error is rotationally invariant, these methods can be used for perspective, fisheye or omnidirectional cameras. Moreover, they are much faster than the optimal methods existing in the literature. Another two-view triangulation method we propose takes a completely different approach: we slightly modify the classical midpoint method and show that it provides a superior balance of 2D and 3D accuracy, although it is not optimal. For multi-view triangulation, we propose a robust and efficient method using two-view RANSAC. We present several early-termination criteria for two-view RANSAC using the midpoint method and show that they improve efficiency when the proportion of outlier measurements is high. Furthermore, we show that the uncertainty of a triangulated point can be modelled as a function of three factors: the number of cameras, the mean reprojection error and the maximum parallax angle. By learning this model, the uncertainty can be interpolated for each case. For single rotation averaging, we propose a robust method based on the Weiszfeld algorithm. The main idea is to start with a robust initialisation and perform an implicit outlier rejection scheme within the Weiszfeld algorithm to further increase robustness. In addition, we use an approximation of the chordal median in SO(3) that provides a significant speed-up of the method. For multiple rotation averaging, we propose HARA, a novel approach that incrementally initialises the rotation graph based on a hierarchy of triplet compatibility. Essentially, we build a spanning tree by prioritising the edges with many strong triplet supports and gradually adding those with fewer and weaker supports. As a result, we reduce the risk of adding outliers to the initial solution, which allows us to filter out outliers before the non-linear optimisation. Moreover, we show that the results can be improved using the smoothed L0+ function in the local refinement step. Next, we propose rotation-only bundle adjustment, a novel method for estimating the absolute rotations of multiple views independently of the translations and the scene structure. The key is to minimise a specially designed cost function based on the normalised epipolar error, which is closely related to the optimal L1 angular reprojection error, among other geometric quantities. Our approach provides multiple benefits, such as complete immunity to inaccurate translations and triangulations, robustness to pure rotations and planar scenes, and improved accuracy when used after the rotation averaging described above. We also propose RODIAN, a robust method for averaging a set of numbers contaminated by a large proportion of outliers. In our method, we assume that the outliers are uniformly distributed within the range of the data and look for the region that is least likely to contain only outliers. We then take the median of the data within this region. Our method is fast, robust and deterministic, and does not rely on a known inlier error bound. Finally, for quantitative trajectory evaluation, we point out the weakness of the commonly used Absolute Trajectory Error (ATE) and propose a novel alternative called the Discernible Trajectory Error (DTE). In the presence of only a few outliers, the ATE loses its sensitivity to the trajectory error of the inliers and to the number of outliers. The DTE overcomes this weakness by aligning the estimated trajectory with the ground truth using a robust method based on several different types of medians. Using similar ideas, we also propose a rotation-only metric called the Discernible Rotation Error (DRE). In addition, we propose a simple method to calibrate the camera-to-marker rotation, which is a prerequisite for computing the DTE and DRE.
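    Of the methods summarised above, the classical midpoint method that the two-view triangulation work modifies is simple enough to sketch directly: given the two camera centres and unit ray directions, find the closest points on the two rays and return their midpoint. This is the classical baseline only (the thesis's modified variant differs), and the function name is an assumption.

```python
import numpy as np

def midpoint_triangulation(c1, d1, c2, d2):
    """Classical midpoint of the closest points on two viewing rays.

    c1, c2: camera centres; d1, d2: unit ray directions (assumed not parallel).
    Solves the least-squares problem min_{s,t} ||(c1 + s d1) - (c2 + t d2)||^2.
    """
    b = c2 - c1
    dd = d1 @ d2
    denom = 1.0 - dd * dd                   # zero iff the rays are parallel
    s = (b @ d1 - (b @ d2) * dd) / denom    # parameter along ray 1
    t = ((b @ d1) * dd - b @ d2) / denom    # parameter along ray 2
    return 0.5 * ((c1 + s * d1) + (c2 + t * d2))
```

    Because it works directly on rays rather than image coordinates, the midpoint construction applies unchanged to perspective, fisheye and omnidirectional cameras, which matches the rotation-invariance theme of the thesis.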

    Unreliable and resource-constrained decoding

    Thesis (Ph.D.) -- Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2010. This electronic version was submitted by the student author; the certified thesis is available in the Institute Archives and Special Collections. Cataloged from the student-submitted PDF version of the thesis. Includes bibliographical references (p. 185-213). Traditional information theory and communication theory assume that decoders are noiseless and operate without transient or permanent faults. Decoders are also traditionally assumed to be unconstrained in physical resources like material, memory, and energy. This thesis studies how constraining reliability and resources in the decoder limits the performance of communication systems. Five communication problems are investigated. Broadly speaking, these are communication using decoders that are wiring-cost-limited, that are memory-limited, that are noisy, that fail catastrophically, and that simultaneously harvest information and energy. For each of these problems, fundamental trade-offs between communication system performance and reliability or resource consumption are established. For decoding repetition codes using consensus decoding circuits, the optimal trade-off between decoding speed and quadratic wiring cost is defined and established. Designing optimal circuits is shown to be NP-complete, but is carried out for small circuit sizes. The natural relaxation of the integer circuit design problem is shown to be a reverse convex program. Random circuit topologies are also investigated. Uncoded transmission is investigated when a population of heterogeneous sources must be categorized due to decoder memory constraints. Quantizers that are optimal for mean Bayes risk error, a novel fidelity criterion, are designed. Human decision making in segregated populations is also studied with this framework. The ratio between the costs of false alarms and missed detections is also shown to fundamentally affect the essential nature of discrimination. The effect of noise on iterative message-passing decoders for low-density parity-check (LDPC) codes is studied. Concentration of decoding performance around its average is shown to hold. Density evolution equations for noisy decoders are derived. Decoding thresholds degrade smoothly as decoder noise increases, and in certain cases arbitrarily small final error probability is achievable despite decoder noisiness. Precise information storage capacity results for reliable memory systems constructed from unreliable components are also provided. Limits to communicating over systems that fail at random times are established. Communication with arbitrarily small probability of error is not possible, but schemes that optimize the transmission volume communicated at fixed maximum message error probabilities are determined. System state feedback is shown not to improve performance. For optimal communication with decoders that simultaneously harvest information and energy, a coding theorem is proven that establishes the fundamental trade-off between the rates at which energy and reliable information can be transmitted over a single line. The capacity-power function is computed for several channels; it is non-increasing and concave. by Lav R. Varshney. Ph.D.
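    For the repetition-code setting mentioned above, the underlying decoding rule is a majority vote over the received copies. The toy sketch below shows that baseline over a binary symmetric channel; the thesis analyses wiring-cost-constrained consensus circuit realisations of this rule, which are not shown, and the simulation parameters are assumptions.

```python
import numpy as np

def majority_decode(received):
    """Majority-vote decoding of an n-fold repetition of a single bit."""
    return int(np.sum(received) * 2 > len(received))

# Toy simulation: one bit repeated n times over a BSC with crossover prob p.
rng = np.random.default_rng(0)
bit, n, p = 1, 9, 0.2
received = (np.full(n, bit) + (rng.random(n) < p)) % 2   # flip with prob p
decoded = majority_decode(received)                       # recovers bit w.h.p.
```

    The decoding error probability falls exponentially in n for p < 1/2; the circuits the thesis studies trade some of that reliability against the quadratic wiring cost of realising the vote in hardware.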