P1AC: Revisiting Absolute Pose From a Single Affine Correspondence
We introduce a novel solution to the problem of estimating the pose of a
calibrated camera given a single observation of an oriented point and an affine
correspondence to a reference image. Affine correspondences have traditionally
been used to improve feature matching over wide baselines; however, little
previous work has considered the use of such correspondences for absolute
camera pose computation. The advantage of our approach (P1AC) is that it
requires only a single correspondence in the minimal case in comparison to the
traditional point-based approach (P3P) which requires at least three points.
Our method removes the limiting assumptions made in previous work and provides
a general solution that is applicable to large-scale image-based localization.
Our evaluation on synthetic data shows that our approach is numerically stable
and more robust to point observation noise than P3P. We also evaluate the
application of our approach for large-scale image-based localization and
demonstrate a practical reduction in the number of iterations and computation
time required to robustly localize an image.
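To make the claimed reduction in iterations concrete: the standard RANSAC bound N = log(1 - p) / log(1 - w^s) gives the number of samples needed to draw at least one all-inlier minimal set with confidence p, where w is the inlier ratio and s the minimal sample size. A minimal Python sketch comparing s = 1 (P1AC) against s = 3 (P3P); the inlier ratios are illustrative, not taken from the paper:

```python
import math

def ransac_iterations(inlier_ratio, sample_size, confidence=0.99):
    """Iterations needed so that at least one all-inlier minimal sample
    is drawn with the given confidence (standard RANSAC bound)."""
    fail_prob_per_draw = 1.0 - inlier_ratio ** sample_size
    return math.ceil(math.log(1.0 - confidence) / math.log(fail_prob_per_draw))

for w in (0.5, 0.2, 0.1):                     # illustrative inlier ratios
    n1 = ransac_iterations(w, sample_size=1)  # P1AC: one affine correspondence
    n3 = ransac_iterations(w, sample_size=3)  # P3P: three point correspondences
    print(f"inlier ratio {w}: P1AC needs {n1} iterations vs P3P {n3}")
```

The gap widens quickly as the inlier ratio drops, which is where the single-correspondence minimal case pays off in large-scale localization.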
Geometric and photometric affine invariant image registration
This thesis aims to present a solution to the correspondence problem for the registration
of wide-baseline images taken from uncalibrated cameras. We propose an affine
invariant descriptor that combines the geometry and photometry of the scene to find
correspondences between both views. The geometric affine invariant component of the
descriptor is based on the affine arc-length metric, whereas the photometry is analysed
by invariant colour moments. A graph structure represents the spatial distribution of the primitive features: nodes correspond to detected high-curvature points, while arcs represent the connectivity given by extracted contours. After matching, we refine the search for correspondences using a robust maximum-likelihood algorithm. We have evaluated the system on synthetic and real data. Propagation of the errors introduced by approximations is, however, endemic to the system.
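For reference, the equi-affine arc length underlying such descriptors is commonly written s = integral of |x'y'' - x''y'|^(1/3) dt and is invariant under area-preserving (unimodular) affine maps. A minimal numpy sketch, with a discretisation and example contour of our own rather than the thesis's:

```python
import numpy as np

def affine_arc_length(x, y):
    """Equi-affine arc length of a sampled planar contour:
    s = integral |x' y'' - x'' y'|**(1/3) dt (unit parameter step)."""
    dx, dy = np.gradient(x), np.gradient(y)
    ddx, ddy = np.gradient(dx), np.gradient(dy)
    return np.sum(np.abs(dx * ddy - ddx * dy) ** (1.0 / 3.0))

# Invariance check: a unimodular (det = 1) affine map leaves it unchanged.
t = np.linspace(0.0, 2.0 * np.pi, 2000)
x, y = 3.0 * np.cos(t), np.sin(t)        # an ellipse
A = np.array([[1.0, 0.4], [0.0, 1.0]])   # det(A) = 1 shear
xs, ys = A @ np.vstack([x, y])
print(affine_arc_length(x, y), affine_arc_length(xs, ys))  # nearly equal
```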
Design of a user interface for the analysis of multi-modal image registration
Image registration is the process of spatially aligning two or more images of a scene into a common coordinate system. Research in image registration has yielded a number of rigid and non-rigid methods capable of registering images of a scene across modalities. In addition, techniques from information visualization have been applied to medical image registration research to produce an atlas-based image registration method, which is capable of registering medical images of the same modality between subjects for comparative studies.
This thesis aims to extend research in image registration by adding to it the visual encoding of time. The visual encoding of time furthers image registration research by enabling the simultaneous analysis of the spatial and temporal relationships that exist between images. The benefit of registering images with respect to both space and time is shown through the development of a software application capable of presenting a time-space narrative of x-ray images representing a patient's medical history. This time-space narrative is assembled by performing rigid atlas-based image registration on a set of x-ray images and by visually encoding their timestamps to form an interactive timeline. The atlas-based image registration method was selected to ensure that images can be registered to a common coordinate system in cases where images do not overlap. Rigid image registration was assumed to be sufficient to provide the desired visual result.
Subsequent to its implementation, an analysis of the measured uncertainty of the image registration method was performed. The error in manual point-pair correspondence selection was measured at more than +/- 1.08 pixels under ideal conditions, and a method to calculate the unique standard error of each image registration was presented.
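Rigid registration from manually selected point pairs can be sketched with the standard Kabsch/Procrustes least-squares solution, whose per-pair residuals yield exactly the kind of per-registration error estimate discussed above. The points and noise level below are synthetic; the 1.08-pixel figure is used only to echo the reported selection error:

```python
import numpy as np

def rigid_transform_2d(src, dst):
    """Least-squares rigid (rotation + translation) alignment of 2-D
    point pairs via the Kabsch/Procrustes method. src, dst: (N, 2)."""
    src_c, dst_c = src.mean(0), dst.mean(0)
    H = (src - src_c).T @ (dst - dst_c)        # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))     # guard against reflection
    R = Vt.T @ np.diag([1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t

np.random.seed(0)
src = np.array([[10., 20.], [40., 25.], [30., 60.], [15., 50.]])
theta = np.deg2rad(12.0)
R_true = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta),  np.cos(theta)]])
dst = src @ R_true.T + np.array([5.0, -3.0]) \
      + np.random.normal(0.0, 1.08, src.shape)   # selection noise
R, t = rigid_transform_2d(src, dst)
residuals = np.linalg.norm(src @ R.T + t - dst, axis=1)
print("RMS residual (pixels):", np.sqrt(np.mean(residuals ** 2)))
```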
Object recognition using multi-view imaging
Most previous research in computer vision and image understanding has used single-view imaging data, and many techniques have been developed for it. Recently, with the rapid development and falling cost of multiple-camera systems, it has become possible to exploit many more views in image-processing tasks. This thesis considers how to use such multiple images for target object recognition.
In this context, we present two algorithms for object recognition based on scale-
invariant feature points. The first is a single-view object recognition method (SOR), which
operates on single images and uses a chirality constraint to reduce the recognition errors
that arise when only a small number of feature points are matched. The procedure is
extended in the second multi-view object recognition algorithm (MOR) which operates on
a multi-view image sequence and, by tracking feature points using a dynamic programming
method in the plenoptic domain subject to the epipolar constraint, is able to fuse feature
point matches from all the available images, resulting in more robust recognition.
We evaluated these algorithms using a number of data sets of real images capturing
both indoor and outdoor scenes. We demonstrate that MOR outperforms SOR, particularly on noisy and low-resolution images, and that, when combined with segmentation techniques, it can also recognize partially occluded objects.
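The epipolar filtering used when tracking feature points across views is commonly implemented with the Sampson distance to the constraint x2^T F x1 = 0. A minimal sketch, assuming a fundamental matrix F and matched point arrays are supplied by earlier calibration and matching steps (the names and threshold here are ours, not the thesis's):

```python
import numpy as np

def sampson_distance(F, x1, x2):
    """First-order geometric error of the epipolar constraint
    x2^T F x1 = 0. x1, x2: (N, 2) matched points; F: 3x3."""
    x1h = np.hstack([x1, np.ones((len(x1), 1))])  # homogeneous coords
    x2h = np.hstack([x2, np.ones((len(x2), 1))])
    Fx1 = x1h @ F.T      # epipolar lines in image 2, shape (N, 3)
    Ftx2 = x2h @ F       # epipolar lines in image 1
    num = np.sum(x2h * Fx1, axis=1) ** 2
    den = Fx1[:, 0]**2 + Fx1[:, 1]**2 + Ftx2[:, 0]**2 + Ftx2[:, 1]**2
    return num / den

# Keep only matches consistent with the epipolar geometry, e.g.:
# inliers = sampson_distance(F, pts1, pts2) < 1.0   # threshold in px^2
```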
Algorithms for trajectory integration in multiple views
This thesis addresses the problem of deriving a coherent and accurate localization of moving objects from partial visual information when data are generated by cameras placed at different view angles with respect to the scene. The framework is built around
applications of scene monitoring with multiple cameras. Firstly, we demonstrate how a
geometric-based solution exploits the relationships between corresponding feature points
across views and improves accuracy in object location. Then, we improve the estimation
of objects' locations with geometric transformations that account for lens distortions.
Additionally, we study the integration of the partial visual information generated by each
individual sensor and their combination into one single frame of observation that considers
object association and data fusion. Our approach is fully image-based, only relies on 2D
constructs and does not require any complex computation in 3D space. We exploit the
continuity and coherence of objects' motion when crossing cameras' fields of view. Additionally,
we work under the assumptions of a planar ground plane and a wide baseline (i.e.
cameras' viewpoints are far apart). The main contributions are: i) the development of a
framework for distributed visual sensing that accounts for inaccuracies in the geometry
of multiple views; ii) the reduction of trajectory mapping errors using a statistical-based
homography estimation; iii) the integration of a polynomial method for correcting inaccuracies
caused by the cameras' lens distortion; iv) a global trajectory reconstruction
algorithm that associates and integrates fragments of trajectories generated by each camera
…
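As a rough sketch of contributions ii) and iii), a trajectory fragment can be corrected for radial lens distortion with a polynomial model and then mapped through a ground-plane homography into a common frame where fragments are associated and fused. The coefficients, homography, and helper names below are hypothetical, and the thesis's statistical homography estimator is not reproduced here:

```python
import numpy as np

def correct_radial(pts, centre, k1, k2):
    """Polynomial radial-distortion correction (coefficients hypothetical)."""
    d = pts - centre
    r2 = np.sum(d ** 2, axis=1, keepdims=True)
    return centre + d * (1.0 + k1 * r2 + k2 * r2 ** 2)

def map_to_ground_plane(H, pts):
    """Project 2-D image points onto a common ground-plane frame
    through a homography H (3x3)."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    m = pts_h @ H.T
    return m[:, :2] / m[:, 2:3]   # back to inhomogeneous coordinates

# Hypothetical per-camera pipeline: undistort a track, then map it, e.g.:
# track_global = map_to_ground_plane(H_cam, correct_radial(track, c, k1, k2))
```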