298 research outputs found
Vanishing Point Estimation in Uncalibrated Images with Prior Gravity Direction
We tackle the problem of estimating a Manhattan frame, i.e. three orthogonal
vanishing points, and the unknown focal length of the camera, leveraging a
prior vertical direction. The direction can come from an Inertial Measurement
Unit that is a standard component of recent consumer devices, e.g.,
smartphones. We provide an exhaustive analysis of minimal line configurations
and derive two new 2-line solvers, one of which does not suffer from
singularities affecting existing solvers. Additionally, we design a new
non-minimal method, running on an arbitrary number of lines, to boost the
performance in local optimization. Combining all solvers in a hybrid robust
estimator, our method achieves increased accuracy even with a rough prior.
Experiments on synthetic and real-world datasets demonstrate the superior
accuracy of our method compared to the state of the art, while having
comparable runtimes. We further demonstrate the applicability of our solvers
for relative rotation estimation. The code is available at
https://github.com/cvg/VP-Estimation-with-Prior-Gravity

Comment: Accepted at ICCV 2023
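As a rough sketch of the geometry involved: with calibration matrix K and a gravity direction g from the IMU, the vanishing point of the world vertical is simply the image of g, and any pair of Manhattan vanishing points must be conjugate with respect to the image of the absolute conic omega = K^-T K^-1. The snippet below illustrates these two textbook constraints only; it is not the paper's solvers, and all names are hypothetical.

```python
import numpy as np

def vertical_vp(K, g):
    """Vanishing point of the world vertical: the image of the direction g."""
    v = K @ g
    return v / v[2]

def orthogonal(K, v1, v2, tol=1e-6):
    """Manhattan orthogonality constraint v1^T omega v2 = 0,
    where omega = K^-T K^-1 is the image of the absolute conic."""
    Kinv = np.linalg.inv(K)
    omega = Kinv.T @ Kinv
    return abs(v1 @ omega @ v2) < tol

# Toy setup: pinhole camera and a rough gravity prior from an IMU.
f = 800.0
K = np.diag([f, f, 1.0])
g = np.array([0.05, 0.99, 0.1])
g = g / np.linalg.norm(g)
v_vert = vertical_vp(K, g)
```

Any horizontal direction (one orthogonal to g) then yields a vanishing point conjugate to v_vert, which is what reduces the minimal problem to two lines.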
Prior-based facade rectification for AR in urban environment
We present a method for automatic facade rectification and detection in the Manhattan world scenario. A Bayesian inference approach is proposed to recover the Manhattan directions in the camera coordinate system, based on a prior derived from the analysis of urban datasets. In addition, an SVM-based procedure is used to identify right-angle corners in the rectified images. These corners are clustered into facade regions using a greedy rectangular min-cut technique. Experiments on a standard dataset show that our algorithm performs as well as or better than state-of-the-art techniques while being much faster.
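One common way to realize such a rectification, sketched below under the assumption of a calibrated camera: back-project the vanishing points of two orthogonal facade directions into 3D directions, assemble the facade-aligned rotation R, and warp with H = K R^T K^-1. This is a generic textbook construction, not necessarily the paper's exact procedure; the function name is illustrative.

```python
import numpy as np

def rectifying_homography(K, v1, v2):
    """Given vanishing points v1, v2 (homogeneous image coordinates) of two
    orthogonal facade directions, build the rotation aligning the camera
    with the facade and return the rectifying homography H = K R^T K^-1."""
    Kinv = np.linalg.inv(K)
    d1 = Kinv @ v1
    d1 /= np.linalg.norm(d1)
    d2 = Kinv @ v2
    d2 /= np.linalg.norm(d2)
    d2 = d2 - (d1 @ d2) * d1        # re-orthogonalise a noisy second direction
    d2 /= np.linalg.norm(d2)
    d3 = np.cross(d1, d2)           # facade normal completes the frame
    R = np.column_stack([d1, d2, d3])
    return K @ R.T @ Kinv
```

Applying H maps both vanishing points to points at infinity along the image axes, i.e. facade edges become axis-aligned.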
Line Primitives and Their Applications in Geometric Computer Vision
Line primitives are widely found in structured scenes and provide higher-level structural information about the scene than point primitives. Furthermore, line primitives in space are closely related to Euclidean transformations, because the dual vector (also known as Pluecker coordinates) representation of a 3D line is the counterpart of the dual quaternion, which depicts a Euclidean transformation. These geometric properties of line primitives motivate the work in this thesis, with the following contributions.
Firstly, by combining the local appearance of lines and geometric constraints between line pairs in images, a line segment matching algorithm is developed which constructs a novel line band descriptor to depict the local appearance of a line and builds a relational graph to measure the pairwise consistency between line correspondences. Experiments show that the matching algorithm is robust to various image transformations and more efficient than conventional graph-based line matching algorithms.
Secondly, by investigating the symmetry of line directions in space, this thesis presents a complete analysis of the solutions of the Perspective-3-Line (P3L) problem, which estimates the camera pose from three reference lines in space and their 2D projections. For three spatial lines in general configuration, a P3L polynomial is derived and employed to develop a solution to the Perspective-n-Line (PnL) problem. The proposed robust PnL algorithm can efficiently and accurately estimate the camera pose for both small and large numbers of line correspondences. For three spatial lines in special configurations (e.g., in a Manhattan world, which consists of three mutually orthogonal dominant directions), the solution of the P3L problem is employed to solve the vanishing point estimation and line classification problem. The proposed vanishing point estimation algorithm achieves high accuracy and efficiency by thoroughly exploiting the Manhattan world characteristic. Another advantage of the proposed framework is that it generalizes easily to images taken by central catadioptric or uncalibrated cameras.
The third major contribution of this thesis concerns structure-from-motion using line primitives. To circumvent the Pluecker constraint on the Pluecker coordinates of lines, a Cayley representation of lines is developed, inspired by the geometric properties of Pluecker coordinates. To build the line observation model, two derivations of the line projection function are presented: one based on the dual relationship between points and lines, and the other based on the relationship between Pluecker coordinates and the Pluecker matrix. The motion and structure parameters are then initialized by an incremental approach and optimized by sparse bundle adjustment. Quantitative validations show an increase in performance compared to conventional line reconstruction algorithms.
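The Pluecker representation mentioned above encodes a 3D line as a direction/moment pair (d, m) subject to the bilinear constraint d . m = 0, which is exactly the constraint the thesis's Cayley representation is designed to circumvent. A minimal illustrative sketch (function names are my own):

```python
import numpy as np

def plucker_from_points(p, q):
    """Pluecker coordinates (d, m) of the line through 3D points p and q:
    d is the direction, m = p x q is the moment; d . m = 0 by construction."""
    d = q - p
    m = np.cross(p, q)
    return d, m

def satisfies_plucker_constraint(d, m, tol=1e-9):
    """The bilinear Pluecker constraint: a valid line has d . m = 0."""
    return abs(np.dot(d, m)) < tol

def point_line_distance(x, d, m):
    """Distance from point x to the line (d, m): ||x x d - m|| / ||d||."""
    return np.linalg.norm(np.cross(x, d) - m) / np.linalg.norm(d)
```

The constraint is what makes a naive 6-parameter optimization over (d, m) over-parameterized; minimal parameterizations such as the Cayley representation avoid it.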
Stereo Reconstruction using Induced Symmetry and 3D scene priors
Doctoral thesis in Electrical and Computer Engineering presented to the Faculty of Sciences and Technology of the University of Coimbra.

Recovering the 3D geometry from two or more views, known as stereo reconstruction,
is one of the earliest and most investigated topics in computer vision. The computation of 3D models of an environment is useful for a very large number of applications, ranging from robotics and consumer applications to medical procedures. The principle for recovering the 3D scene structure is quite simple; however, some issues considerably complicate the reconstruction process. Objects with little or repetitive texture, as well as highly slanted surfaces, still pose difficulties to state-of-the-art algorithms.
This PhD thesis tackles these issues and introduces a new stereo framework that
is completely different from conventional approaches. We propose to use symmetry
instead of photo-similarity for assessing the likelihood of two image locations
being a match. The framework is called SymStereo, and is based on the mirroring
effect that arises whenever one view is mapped into the other using the homography induced by a virtual cut plane that intersects the baseline. Extensive experiments in dense stereo show that our symmetry-based cost functions compare favorably against the best performing photo-similarity matching costs. In addition, we investigate the possibility of accomplishing Stereo-Rangefinding, which uses passive stereo to exclusively recover depth along a scan plane. Thorough experiments
provide evidence that Stereo from Induced Symmetry is especially well suited
for this purpose.
As a second research line, we propose to overcome the previous issues using
priors about the 3D scene to increase the robustness of the reconstruction process.
For this purpose, we present a new global approach for detecting vanishing
points and groups of mutually orthogonal vanishing directions in man-made environments. Experiments in both synthetic and real images show that our algorithms
outperform the state-of-the-art methods while keeping computation tractable. In
addition, we show for the first time results in simultaneously detecting multiple
Manhattan-world configurations. This prior information about the scene structure
is then included in a reconstruction pipeline that generates piece-wise planar
models of man-made environments from two calibrated views. Our formulation
combines SymStereo and PEARL clustering [3], and alternates between a discrete
optimization step, which merges planar surface hypotheses and discards detections
with poor support, and a continuous optimization step, which refines the plane poses.
Experiments with both indoor and outdoor stereo pairs show significant improvements
over state-of-the-art methods with respect to accuracy and robustness.
Finally, and as a third contribution to improve stereo matching in the presence
of surface slant, we extend the recent framework of Histogram Aggregation
[4]. The original algorithm uses a fronto-parallel support window for cost aggregation, leading to inaccurate results in the presence of significant surface slant. We
address the problem by considering discrete orientation hypotheses. The experimental results prove the effectiveness of the approach, which improves matching accuracy while preserving low computational complexity.
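The plane-induced homography at the heart of SymStereo is the standard one: for a plane n^T X = d expressed in the first camera's frame and a relative pose (R, t) with X2 = R X1 + t, it reads H = K2 (R + t n^T / d) K1^-1. A minimal sketch of this textbook mapping (the virtual cut plane of SymStereo is one particular choice of (n, d) intersecting the baseline; the function name is mine):

```python
import numpy as np

def plane_induced_homography(K1, K2, R, t, n, d):
    """Homography mapping view-1 pixels to view-2 pixels for 3D points on
    the plane n . X = d (plane in camera-1 coordinates, X2 = R X1 + t):
        H = K2 (R + t n^T / d) K1^-1
    """
    return K2 @ (R + np.outer(t, n) / d) @ np.linalg.inv(K1)
```

For points off the plane, the mapped pixel deviates from the true match, which is precisely what makes plane-induced warps usable as a matching signal.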
Automated 3D model generation for urban environments
Abstract
In this thesis, we present a fast approach to the automated
generation of textured 3D city models with both high detail at
ground level and complete coverage for bird's-eye views.
A ground-based facade model is acquired by driving a vehicle
equipped with two 2D laser scanners and a digital camera under
normal traffic conditions on public roads. One scanner is
mounted horizontally and is used to determine the approximate
component of relative motion along the movement of the
acquisition vehicle via scan matching; the obtained relative
motion estimates are concatenated to form an initial path.
Assuming that features such as buildings are visible from both
ground-based and airborne view, this initial path is globally
corrected by Monte-Carlo Localization techniques using an aerial
photograph or a Digital Surface Model as a global map. The
second scanner is mounted vertically and is used to capture the
3D shape of the building facades. Applying a series of automated
processing steps, a texture-mapped 3D facade model is
reconstructed from the vertical laser scans and the camera
images. In order to obtain an airborne model containing the roof
and terrain shape complementary to the facade model, a Digital
Surface Model is created from airborne laser scans, then
triangulated, and finally texture-mapped with aerial imagery.
Finally, the facade model and the airborne model are fused
into a single model usable for both walk-throughs and fly-throughs. The
developed algorithms are evaluated on a large data set acquired
in downtown Berkeley, and the results are shown and discussed.
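Concatenating the relative motion estimates from horizontal scan matching into the initial path is a plain SE(2) dead-reckoning composition; a minimal sketch under that reading (names are illustrative):

```python
import math

def compose(pose, delta):
    """Compose an SE(2) pose (x, y, theta) with a relative motion
    (dx, dy, dtheta) expressed in the current vehicle frame."""
    x, y, th = pose
    dx, dy, dth = delta
    return (x + dx * math.cos(th) - dy * math.sin(th),
            y + dx * math.sin(th) + dy * math.cos(th),
            th + dth)

def integrate(deltas, start=(0.0, 0.0, 0.0)):
    """Concatenate relative motion estimates into an initial path."""
    path = [start]
    for d in deltas:
        path.append(compose(path[-1], d))
    return path
```

Such dead-reckoned paths drift, which is why the thesis corrects the path globally with Monte-Carlo Localization against an aerial map.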
Robust and Optimal Methods for Geometric Sensor Data Alignment
Geometric sensor data alignment - the problem of finding the
rigid transformation that correctly aligns two sets of sensor
data without prior knowledge of how the data correspond - is a
fundamental task in computer vision and robotics. It is
inconvenient then that outliers and non-convexity are inherent to
the problem and present significant challenges for alignment
algorithms. Outliers are highly prevalent in sets of sensor data,
particularly when the sets overlap incompletely. Despite this,
many alignment objective functions are not robust to outliers,
leading to erroneous alignments. In addition, alignment problems
are highly non-convex, a property arising from the objective
function and the transformation. While finding a local optimum
may not be difficult, finding the global optimum is a hard
optimisation problem. These key challenges have not been fully
and jointly resolved in the existing literature, and so there is
a need for robust and optimal solutions to alignment problems.
Hence the objective of this thesis is to develop tractable
algorithms for geometric sensor data alignment that are robust to
outliers and not susceptible to spurious local optima.
This thesis makes several significant contributions to the
geometric alignment literature, founded on new insights into
robust alignment and the geometry of transformations. Firstly, a
novel discriminative sensor data representation is proposed that
has better viewpoint invariance than generative models and is
time and memory efficient without sacrificing model fidelity.
Secondly, a novel local optimisation algorithm is developed for
nD-nD geometric alignment under a robust distance measure. It
manifests a wider region of convergence and a greater robustness
to outliers and sampling artefacts than other local optimisation
algorithms. Thirdly, the first optimal solution for 3D-3D
geometric alignment with an inherently robust objective function
is proposed. It outperforms other geometric alignment algorithms
on challenging datasets due to its guaranteed optimality and
outlier robustness, and has an efficient parallel implementation.
Fourthly, the first optimal solution for 2D-3D geometric
alignment with an inherently robust objective function is
proposed. It outperforms existing approaches on challenging
datasets, reliably finding the global optimum, and has an
efficient parallel implementation. Finally, another optimal
solution is developed for 2D-3D geometric alignment, using a
robust surface alignment measure.
Ultimately, robust and optimal methods, such as those in this
thesis, are necessary to reliably find accurate solutions to
geometric sensor data alignment problems.
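For context on the building blocks involved: when correspondences are known, the optimal rigid alignment has a closed-form SVD solution (the Kabsch algorithm); the thesis targets the much harder setting where correspondences are unknown and outliers abound. A sketch of the correspondence-based special case only, not of the thesis's algorithms:

```python
import numpy as np

def kabsch(P, Q):
    """Least-squares rigid alignment of point set P onto Q (rows are
    corresponding 3D points): returns R, t minimising
    sum_i ||R p_i + t - q_i||^2."""
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)               # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # Reflection guard: force det(R) = +1.
    S = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ S @ U.T
    t = cq - R @ cp
    return R, t
```

With outliers or unknown correspondences, this least-squares objective breaks down, motivating the robust and globally optimal formulations developed in the thesis.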
- …