Autocalibration with the Minimum Number of Cameras with Known Pixel Shape
In 3D reconstruction, the recovery of the calibration parameters of the
cameras is paramount since it provides metric information about the observed
scene, e.g., measures of angles and ratios of distances. Autocalibration
enables the estimation of the camera parameters without using a calibration
device, but by enforcing simple constraints on the camera parameters. In the
absence of information about the internal camera parameters such as the focal
length and the principal point, the knowledge of the camera pixel shape is
usually the only available constraint. Given a projective reconstruction of a
rigid scene, we address the problem of the autocalibration of a minimal set of
cameras with known pixel shape and otherwise arbitrarily varying intrinsic and
extrinsic parameters. We propose an algorithm that only requires 5 cameras (the
theoretical minimum), thus halving the number of cameras required by previous
algorithms based on the same constraint. To this end, we introduce as our
basic geometric tool the six-line conic variety (SLCV), consisting of the set
of planes intersecting six given lines of 3D space in points of a conic. We
show that the set of solutions of the Euclidean upgrading problem for three
cameras with known pixel shape can be parameterized in a computationally
efficient way. This parameterization is then used to solve autocalibration from
five or more cameras, reducing the three-dimensional search space to a
two-dimensional one. We provide experiments with real images showing the good
performance of the technique.
Comment: 19 pages, 14 figures, 7 tables, J. Math. Imaging Vi
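The pixel-shape constraint mentioned in the abstract can be illustrated with a minimal sketch. It assumes the simplest known pixel shape, square pixels (zero skew, unit aspect ratio), and checks that the image of the absolute conic, omega = K^{-T} K^{-1}, then satisfies two linear conditions whatever the focal length and principal point are. The function name and the numeric intrinsics are hypothetical, and this is not the SLCV algorithm of the paper.

```python
from fractions import Fraction

def iac_square_pixels(f, u, v):
    """Image of the absolute conic, omega = K^{-T} K^{-1}, for
    K = [[f, 0, u], [0, f, v], [0, 0, 1]] (zero skew, unit aspect ratio)."""
    # Closed-form inverse of the upper-triangular calibration matrix K.
    k_inv = [[1 / f, 0, -u / f],
             [0, 1 / f, -v / f],
             [0, 0, 1]]
    # omega[i][j] = sum_k k_inv[k][i] * k_inv[k][j], i.e. (K^{-T} K^{-1})[i][j].
    return [[sum(k_inv[k][i] * k_inv[k][j] for k in range(3))
             for j in range(3)] for i in range(3)]

# Hypothetical intrinsics, chosen only for illustration.
omega = iac_square_pixels(Fraction(800), Fraction(320), Fraction(240))

# The two pixel-shape constraints hold independently of f, u and v.
assert omega[0][1] == 0
assert omega[0][0] == omega[1][1]
```

Each camera with square pixels therefore contributes two equations of this kind, which is what makes pixel shape usable when focal length and principal point vary freely.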
Vision-Based Object Recognition and 3-D Pose Estimation Using Conic Features
This thesis deals with monocular vision-based object recognition and 3-D pose estimation based on conic features. Conic features, including circles and ellipses, are frequently observed in man-made objects in the real world and are potentially robust to extract in vision-based applications. Although the 3-D pose estimation of conic features has been studied extensively since 1990, previous work has not provided a unique solution for the full set of 3-D pose parameters (i.e., three orientation and three position parameters), owing to the high nonlinearity of a general conic.
This thesis therefore revisits conic features from the perspective of geometric invariants in both 3-D space and 2-D projective space, combining conics with other geometric features. First, as the most essential step in dealing with conics, it shows that the pose parameters of a circular feature in 3-D space can be derived analytically by incorporating a coplanar point.
A procedure for recovering the pose parameters is described in detail, and its performance is evaluated and discussed in terms of pose estimation error and sensitivity. Second, it is shown that the pose of an elliptic feature can be resolved when two coplanar points are incorporated, on the basis of the polarity of two points with respect to a conic in 2-D projective space. This thesis proposes a series of algorithms to determine the 3-D pose parameters uniquely, and evaluates the proposed method in terms of estimation performance and sensitivity to point locations. Third, a pair of conics is treated, extending the incorporation scheme from point features to a second conic feature.
Under the polarity concept, this thesis proves that the problem involving a pair of conics can be formulated as the problem of one ellipse with two points, so that its solution takes the same form as in the ellipse case.
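The polarity relation that the thesis builds on can be sketched in a few lines. For a conic given by a symmetric 3x3 matrix C and a point x in homogeneous coordinates, the polar line is l = C x, and two points p, q are conjugate when p^T C q = 0. The helper names and the unit-circle example below are illustrative assumptions, not the thesis's algorithm.

```python
def polar_line(conic, point):
    """Polar line l = C x of a homogeneous point x with respect to conic C."""
    return [sum(conic[i][k] * point[k] for k in range(3)) for i in range(3)]

def are_conjugate(conic, p, q):
    """True when q lies on the polar line of p, i.e. p^T C q = 0."""
    line = polar_line(conic, p)
    return sum(line[i] * q[i] for i in range(3)) == 0

# Unit circle x^2 + y^2 - 1 = 0 written as a conic matrix.
unit_circle = [[1, 0, 0],
               [0, 1, 0],
               [0, 0, -1]]

p = [2, 0, 1]                      # the point (2, 0), outside the circle
line = polar_line(unit_circle, p)  # [2, 0, -1], i.e. the line x = 1/2

# (1/2, 5) in homogeneous form [1, 10, 2] lies on that line, so it is conjugate to p.
assert are_conjugate(unit_circle, p, [1, 10, 2])
```

The polar of (2, 0) is the chord joining the two tangent points seen from it, which is the kind of point-conic relation that survives perspective projection.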
In order to treat two or more conic objects as well as to deal with an object recognition problem, the rest of the thesis concentrates on the theoretical foundation of multiple object recognition. First, some effective modeling approaches are described. A general object model is specially designed to model multiple objects for object recognition and pose recovery in view of spatial geometry. In particular, this thesis defines a pairwise conic model that can describe the geometrical relation between two conics invariantly in 2-D projective space, which consists of a pairwise conic (PC), a pairwise conic invariant (PCI), and a pairwise conic pole (PCP). Based on the two kinds of models, an object learning and recognition system is proposed as a general framework for multiple object recognition.
Considering simplicity and flexibility in the object learning stage, this thesis introduces a semi-automatic learning scheme to construct the multiple object model from a single model image. To utilize geometric relations among multiple objects effectively in object recognition, this thesis specifies some feature functions based on the pairwise conic model, and then describes an object recognition method in the form of a linear-chain conditional random field (CRF). In particular, as a post-refinement step of the recognition, a geometric alignment procedure is also proposed in algorithmic detail to improve recognition performance under noisy conditions.
Last, the multiple object recognition method is evaluated intensively through two practical applications for service robots: place recognition and elevator button recognition. A series of experimental results supports the effectiveness of the proposed method, which maintains reliable performance under noisy conditions in the presence of perspective distortion and partial object occlusions.
Contents
Abstract
Contents
List of Tables
List of Figures
1 Introduction
1.1 Background
1.2 Observations
1.3 Research objective and expected contribution
1.4 Organization of thesis
2 3-D Pose Estimation of a Circular Feature
2.1 Introduction
2.1.1 Background
2.1.2 Problem formulation
2.1.3 Related work
2.1.4 Notations
2.2 Preliminaries: an elliptic cone in 3-D space and its homogeneous representation in 2-D projective space
2.2.1 Homogeneous representation
2.2.2 Principal planes of a cone versus diagonalization of a conic matrix Q
2.3 3-D interpretation of a circular feature for 3-D pose estimation
2.3.1 3-D orientation estimation
2.3.2 3-D position estimation
2.3.3 Composition of homogeneous transformation and discrimination for the unique solution
2.4 Experiment results
2.4.1 A numerical example
2.4.2 Evaluation of pose estimation performance
2.5 Summary
3 3-D Pose Estimation of an Elliptic Feature
3.1 Introduction
3.1.1 Background
3.1.2 Problem statement
3.1.3 Related work
3.2 Interpretation of an elliptic feature with coplanar points in 2-D projective space
3.2.1 The minimal number of points for pose estimation
3.2.2 Analysis of possible constraints for relative positions of two points to an ellipse
3.2.3 Feature selection scheme for stable homography estimation
3.3 3-D pose estimation algorithm
3.3.1 Extraction of triangular features from an elliptic object
3.3.2 Homography decomposition
3.3.3 Composition of homogeneous transformation matrix with unique solution
3.4 Experiment results
3.4.1 Experimental setup
3.4.2 Evaluation of the proposed method
3.5 Summary
4 3-D Pose Estimation of a Pair of Conic Features
4.1 Introduction
4.2 3-D pose estimation of a conic feature incorporated with line features
4.3 3-D pose estimation of a conic feature incorporated with another conic feature
4.3.1 Some examples of self-polar triangle and invariants
4.3.2 3-D pose estimation of a pair of coplanar conics
4.3.3 Examples of 3-D pose estimation of a conic feature incorporated with another conic feature
4.4 Summary
5 Multiple Object Recognition Based on Pairwise Conic Model
5.1 Introduction
5.2 Learning of geometric relation of multiple objects
5.3 Pairwise conic model
5.3.1 Definitions
5.4 Multiple object recognition based on pairwise conic model and conditional random fields
5.4.1 Graphical model for multiple object recognition
5.4.2 Linear-chain conditional random field
5.4.3 Determination of low-level feature functions for multiple object recognition
5.4.4 Range selection trick for efficiently computing the costs of low-level feature functions
5.4.5 Evaluation of observation sequence
5.4.6 Object recognition based on hierarchical CRF
5.5 Geometric alignment algorithm
5.6 Summary
6 Application to Place Recognition for Service Robots
6.1 Introduction
6.1.1 Background
6.1.2 Problem statement
6.2 Feature extraction
6.2.1 Detection of 2-D geometric shapes
6.2.2 Examples of shape feature extraction
6.3 Object modeling
6.3.1 A place model that describes multiple landmark objects
6.3.2 Pairwise conic model
6.3.3 Incorporation of non-conic features with a pairwise conic model
6.4 Place learning and recognition system
6.4.1 HCRF-based recognition
6.5 Experiment results
6.5.1 Experimental setup
6.5.2 Performance evaluation
6.6 Summary
7 Application to Elevator Button Recognition
7.1 Introduction
7.1.1 Background
7.1.2 Problem statement
7.1.3 Related work
7.2 Object modeling
7.2.1 Geometric model for multiple button objects
7.2.2 Pairwise conic model
7.3 Learning and recognition system
7.3.1 Button object learning
7.3.2 CRF-based recognition
7.4 Experiment results
7.4.1 Experimental setup
7.4.2 Performance evaluation
7.5 Summary
8 Concluding remarks
8.1 Summary
8.2 Further work
References
Summary (in Korean)
Self-calibration of turntable sequences from silhouettes
This paper addresses the problem of recovering both the intrinsic and extrinsic parameters of a camera from the silhouettes of an object in a turntable sequence. Previous silhouette-based approaches have exploited correspondences induced by epipolar tangents to estimate the image invariants under turntable motion and achieved a weak calibration of the cameras. It is known that the fundamental matrix relating any two views in a turntable sequence can be expressed explicitly in terms of the image invariants, the rotation angle, and a fixed scalar. It will be shown that the imaged circular points for the turntable plane can also be formulated in terms of the same image invariants and fixed scalar. This allows the imaged circular points to be recovered directly from the estimated image invariants, and provides constraints for the estimation of the imaged absolute conic. The camera calibration matrix can thus be recovered. A robust method for estimating the fixed scalar from image triplets is introduced, and a method for recovering the rotation angles using the estimated imaged circular points and epipoles is presented. Using the estimated camera intrinsics and extrinsics, a Euclidean reconstruction can be obtained. Experimental results on real data sequences are presented, which demonstrate the high precision achieved by the proposed method. © 2009 IEEE.
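How imaged circular points constrain the imaged absolute conic can be shown with a small sketch. It assumes the standard plane-to-image homography H = K [r1 r2 t] for a world plane Z = 0, under which the circular points (1, ±i, 0) map to h1 ± i h2, so that (h1 + i h2)^T omega (h1 + i h2) = 0 with omega = K^{-T} K^{-1}. The calibration matrix, the rotation columns and the function names are hypothetical, and the paper's own formulation in terms of image invariants is not reproduced here.

```python
from fractions import Fraction as F

def mat_vec(m, v):
    """Product of a 3x3 matrix with a 3-vector."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def circular_point_constraints(k_inv, h1, h2):
    """Real and imaginary parts of (h1 + i*h2)^T omega (h1 + i*h2),
    with omega = K^{-T} K^{-1}; both vanish for the correct calibration."""
    a, b = mat_vec(k_inv, h1), mat_vec(k_inv, h2)
    return dot(a, a) - dot(b, b), 2 * dot(a, b)

# Hypothetical calibration matrix and its inverse (exact arithmetic).
K = [[F(2), F(0), F(1)], [F(0), F(2), F(1)], [F(0), F(0), F(1)]]
K_INV = [[F(1, 2), F(0), F(-1, 2)], [F(0), F(1, 2), F(-1, 2)], [F(0), F(0), F(1)]]

# First two columns of a rotation about the x-axis (a 3-4-5 triangle keeps it exact).
r1 = [F(1), F(0), F(0)]
r2 = [F(0), F(3, 5), F(4, 5)]
h1, h2 = mat_vec(K, r1), mat_vec(K, r2)

assert circular_point_constraints(K_INV, h1, h2) == (0, 0)
```

Each pair of imaged circular points thus yields two real equations that are linear in the entries of omega.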
The Quadric Reference Surface: Theory and Applications
The conceptual component of this work is about "reference surfaces", which are the dual of the reference frames often used for shape representation purposes. The theoretical component of this work involves the question of whether one can find a unique (and simple) mapping that aligns two arbitrary perspective views of an opaque textured quadric surface in 3D, given (i) a few corresponding points in the two views, or (ii) the outline conic of the surface in one view (only) and a few corresponding points in the two views. The practical component of this work is concerned with applying the theoretical results as tools for the task of achieving full correspondence between views of arbitrary objects.
Self-calibration and motion recovery from silhouettes with two mirrors
LNCS v. 7724-7727 (pts. 1-4) entitled: Computer vision - ACCV 2012: 11th Asian Conference on Computer Vision ... 2012: revised selected papers
This paper addresses the problem of self-calibration and motion recovery from a single snapshot obtained under a setting of two mirrors. The mirrors are able to show five views of an object in one image. In this paper, the epipoles of the real and virtual cameras are first estimated from the intersection of the bitangent lines between corresponding images, from which we can easily derive the horizon of the camera plane. The imaged circular points and the angle between the mirrors can then be obtained from equal angles between the bitangent lines, by planar rectification. The silhouettes produced by reflections can be treated as a special circular motion sequence. With this observation, techniques developed for calibrating a circular motion sequence can be exploited to simplify the calibration of a single-view two-mirror system. Unlike state-of-the-art approaches, this work requires only one snapshot for self-calibrating a natural camera and recovering the poses of the two mirrors. This is more flexible than previous approaches, which require at least two images. When more than a single image is available, each image can be calibrated independently, and the problem of varying focal length does not complicate the calibration problem. After the calibration, the visual hull of the objects can be obtained from the silhouettes. Experimental results show the feasibility and precision of the proposed approach. © 2013 Springer-Verlag.
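The step of locating an epipole as the intersection of bitangent lines rests on a basic homogeneous-coordinate operation, sketched below under simple assumptions: the line through two image points is their cross product, and the intersection of two lines is again a cross product. The point coordinates and function names are made up for illustration; extracting the bitangents from silhouettes is not shown.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def line_through(p, q):
    """Homogeneous line through two image points given as (x, y)."""
    return cross([p[0], p[1], 1], [q[0], q[1], 1])

def intersection(l1, l2):
    """Intersection of two homogeneous lines, returned as (x, y)."""
    x, y, w = cross(l1, l2)
    if w == 0:
        raise ValueError("lines are parallel; the intersection is at infinity")
    return (x / w, y / w)

# Two hypothetical bitangent lines, each defined by a pair of tangency points.
bitangent_a = line_through((0, 0), (1, 1))
bitangent_b = line_through((0, 2), (2, 0))
assert intersection(bitangent_a, bitangent_b) == (1.0, 1.0)
```

With noisy data and more than two lines, a least-squares intersection would be the usual replacement for the single cross product.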
Angular variation as a monocular cue for spatial perception
Monocular cues are spatial sensory inputs which are picked up exclusively from one eye. They are mostly static features that
provide depth information and are extensively used in graphic art to create realistic representations of a scene. Since the spatial
information contained in these cues is picked up from the retinal image, the existence of a link between it and the theory of direct
perception can be conveniently assumed. According to this theory, spatial information of an environment is directly contained in the
optic array. Thus, this assumption makes possible the modeling of visual perception processes through computational approaches.
In this thesis, angular variation is considered as a monocular cue, and the concept of direct perception is adopted by a computer
vision approach that considers it as a suitable principle from which innovative techniques to calculate spatial information can be
developed.
The expected spatial information to be obtained from this monocular cue is the position and orientation of an object with respect to
the observer, which in computer vision is a well known field of research called 2D-3D pose estimation. In this thesis, the attempt to
establish the angular variation as a monocular cue and thus the achievement of a computational approach to direct perception is
carried out by the development of a set of pose estimation methods. Starting from conventional strategies to solve the pose
estimation problem, a first approach imposes constraint equations to relate object and image features. In this sense, two algorithms
based on a simple line rotation motion analysis were developed. These algorithms successfully provide pose information; however,
they depend strongly on scene data conditions. To overcome this limitation, a second approach inspired by the biological processes
performed by the human visual system was developed. It is based on the image's own content and defines a computational
approach to direct perception.
The set of developed algorithms analyzes the visual properties provided by angular variations. The aim is to gather valuable data
from which spatial information can be obtained and used to emulate a visual perception process by establishing a 2D-3D metric
relation. Since it is considered fundamental in the visual-motor coordination and consequently essential to interact with the
environment, a significant cognitive effect is produced by the application of the developed computational approach in environments
mediated by technology. In this work, this cognitive effect is demonstrated by an experimental study where a number of participants
were asked to complete an action-perception task. The main purpose of the study was to analyze visually guided behavior in
teleoperation and the cognitive effect caused by the addition of 3D information. The results showed a significant influence of the
3D aid on skill improvement, together with an enhanced sense of presence.
Monocular cues are sensory inputs captured exclusively by a single eye that aid the
perception of distance or space. They are mostly static features that provide depth
information and are widely used in graphic art to create realistic renderings of a
scene. Since the spatial information contained in these cues is extracted from the
retina, a relation between this extraction of information and the theory of direct
perception can be conveniently assumed. According to this theory, the spatial
information of everything we see is directly contained in the optic array. This
assumption therefore makes it possible to model visual perception processes through
computational approaches. In this doctoral thesis, angular variation is considered as
a monocular cue, and the concept of direct perception is adopted by an approach based
on computer vision algorithms that regard it as a suitable principle for developing
new techniques for computing spatial information.
The spatial information expected from this monocular cue is the position and
orientation of an object with respect to the observer, which in computer vision is a
well-known field of research called 2D-3D pose estimation. In this doctoral thesis,
establishing angular variation as a monocular cue and obtaining a mathematical model
that describes direct perception are carried out through the development of a set of
pose estimation methods. Starting from conventional strategies, a first approach
imposes geometric constraints as equations relating object and image features. In
this case, two algorithms based on the analysis of the rotational motion of a straight
line were developed. These algorithms successfully provide pose information. However,
they depend strongly on scene conditions. To overcome this limitation, a second
approach inspired by the biological processes performed by the human visual system
was developed. It is based on the image's own content and defines a computational
approach to direct perception.
The set of developed algorithms analyzes the visual properties provided by angular
variations. The main aim is to gather relevant data from which spatial information
can be obtained and used to emulate visual perception processes by establishing 2D-3D
metric relations. Because this relation is considered fundamental to visuomotor
coordination and consequently essential for interacting with our surroundings, a
significant cognitive effect can be produced by applying pose estimation methods in
technologically mediated environments. In this doctoral thesis, this cognitive effect
has been demonstrated by an experimental study in which a number of participants were
asked to perform an action-perception task. The main purpose of this study was to
analyze visually guided behavior in teleoperation and the cognitive effect caused by
the inclusion of 3D information. The results showed a notable influence of the 3D aid
on skill improvement, as well as an increased sense of presence.
Lunar Crater Identification in Digital Images
It is often necessary to identify a pattern of observed craters in a single
image of the lunar surface and without any prior knowledge of the camera's
location. This so-called "lost-in-space" crater identification problem is
common in both crater-based terrain relative navigation (TRN) and in automatic
registration of scientific imagery. Past work on crater identification has
largely been based on heuristic schemes, with poor performance outside of a
narrowly defined operating regime (e.g., nadir pointing images, small search
areas). This work provides the first mathematically rigorous treatment of the
general crater identification problem. It is shown when it is (and when it is
not) possible to recognize a pattern of elliptical crater rims in an image
formed by perspective projection. For the cases when it is possible to
recognize a pattern, descriptors are developed using invariant theory that
provably capture all of the viewpoint invariant information. These descriptors
may be pre-computed for known crater patterns and placed in a searchable index
for fast recognition. New techniques are also developed for computing pose from
crater rim observations and for evaluating crater rim correspondences. These
techniques are demonstrated on both synthetic and real images
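As a rough illustration of what a viewpoint-invariant descriptor for elliptical rims can look like, the sketch below evaluates one classical projective invariant of a pair of coplanar conics A and B, namely tr(A^{-1}B)^3 / det(A^{-1}B). It is unchanged when both conics are mapped by the same homography and when either conic matrix is rescaled. This is a textbook invariant used here as an assumption; it is not claimed to be the descriptor developed in the paper, and the conics and homography are made up.

```python
from fractions import Fraction

def transpose(m):
    return [[m[j][i] for j in range(3)] for i in range(3)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def cofactor(m, i, j):
    # The cyclic index form yields the signed cofactor of a 3x3 matrix directly.
    return (m[(i + 1) % 3][(j + 1) % 3] * m[(i + 2) % 3][(j + 2) % 3]
            - m[(i + 1) % 3][(j + 2) % 3] * m[(i + 2) % 3][(j + 1) % 3])

def det(m):
    return sum(m[0][j] * cofactor(m, 0, j) for j in range(3))

def inverse(m):
    d = Fraction(det(m))
    return [[cofactor(m, j, i) / d for j in range(3)] for i in range(3)]

def conic_pair_invariant(a, b):
    """tr(A^-1 B)^3 / det(A^-1 B) for two conic matrices A and B."""
    m = matmul(inverse(a), b)
    trace = m[0][0] + m[1][1] + m[2][2]
    return trace ** 3 / det(m)

def transform_conic(c, h, scale=1):
    """Conic after the point map x' = H x, times an arbitrary scale factor."""
    h_inv = inverse(h)
    mapped = matmul(matmul(transpose(h_inv), c), h_inv)
    return [[scale * x for x in row] for row in mapped]

# Hypothetical rims: a unit circle and an axis-aligned ellipse.
A = [[1, 0, 0], [0, 1, 0], [0, 0, -1]]
B = [[Fraction(1, 4), 0, 0], [0, 1, 0], [0, 0, -1]]
H = [[1, 2, 0], [0, 1, 1], [1, 0, 2]]  # an arbitrary invertible homography

before = conic_pair_invariant(A, B)
after = conic_pair_invariant(transform_conic(A, H, 3), transform_conic(B, H, 5))
assert before == after
```

Exact rational arithmetic is used so that the equality check is not blurred by rounding; with measured ellipses a tolerance would be needed instead.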
Shape description and matching using integral invariants on eccentricity transformed images
Matching occluded and noisy shapes is a problem frequently encountered in medical image analysis and more generally in computer vision. To keep track of changes inside the breast, for example, it is important for a computer aided detection system to establish correspondences between regions of interest. Shape transformations, computed both with integral invariants (II) and with geodesic distance, yield signatures that are invariant to isometric deformations, such as bending and articulations. Integral invariants describe the boundaries of planar shapes. However, they provide no information about where a particular feature lies on the boundary with regard to the overall shape structure. Conversely, eccentricity transforms (Ecc) can match shapes by signatures of geodesic distance histograms based on information from inside the shape, but they ignore the boundary information. We describe a method that combines the boundary signature of a shape obtained from II and structural information from the Ecc to yield results that improve on those of either method used separately.
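The eccentricity transform assigns to each point of a shape the geodesic distance to the point of the shape farthest from it. A minimal sketch follows, assuming a binary mask given as strings and using 4-connected step counts as a stand-in for geodesic distance; the function name and the mask encoding are hypothetical, and the paper's actual computation and histogram signatures are not reproduced.

```python
from collections import deque

def eccentricity_transform(mask):
    """Map each foreground cell ('#') to its largest within-shape step distance.

    Distances are breadth-first-search step counts on the 4-connected grid,
    so paths cannot leave the shape. For a disconnected mask, only cells
    reachable from the start cell are considered.
    """
    cells = {(r, c) for r, row in enumerate(mask)
             for c, ch in enumerate(row) if ch == "#"}
    ecc = {}
    for start in cells:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            r, c = queue.popleft()
            for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if nxt in cells and nxt not in dist:
                    dist[nxt] = dist[(r, c)] + 1
                    queue.append(nxt)
        ecc[start] = max(dist.values())
    return ecc

# A straight bar: the ends are the most eccentric cells, the centre the least.
bar = eccentricity_transform(["#####"])
assert [bar[(0, c)] for c in range(5)] == [4, 3, 2, 3, 4]
```

Because distances are measured inside the shape, bending the bar into a U leaves the values at its two ends unchanged, which is the isometry invariance the abstract refers to.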