68 research outputs found
Lunar Crater Identification in Digital Images
It is often necessary to identify a pattern of observed craters in a single
image of the lunar surface without any prior knowledge of the camera's
location. This so-called "lost-in-space" crater identification problem is
common both in crater-based terrain relative navigation (TRN) and in automatic
registration of scientific imagery. Past work on crater identification has
largely been based on heuristic schemes, with poor performance outside of a
narrowly defined operating regime (e.g., nadir pointing images, small search
areas). This work provides the first mathematically rigorous treatment of the
general crater identification problem. It is shown when it is (and when it is
not) possible to recognize a pattern of elliptical crater rims in an image
formed by perspective projection. For the cases when it is possible to
recognize a pattern, descriptors are developed using invariant theory that
provably capture all of the viewpoint invariant information. These descriptors
may be pre-computed for known crater patterns and placed in a searchable index
for fast recognition. New techniques are also developed for computing pose from
crater rim observations and for evaluating crater rim correspondences. These
techniques are demonstrated on both synthetic and real images
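The abstract does not spell out the descriptors themselves; as an illustrative sketch of the underlying invariant-theory idea (the function names and the particular scale-free combinations below are our choices, not necessarily the paper's), a pair of crater-rim conics A and B admits quantities that are unchanged by any homography, because M = A⁻¹B transforms by a similarity up to scale:

```python
import numpy as np

def transform_conic(C, H):
    """Conic of the mapped points Hx, given the conic C of points x:
    C' = H^{-T} C H^{-1} (point conics transform contravariantly)."""
    Hinv = np.linalg.inv(H)
    return Hinv.T @ C @ Hinv

def conic_pair_invariants(A, B):
    """Two projective invariants of a conic pair (3x3 symmetric matrices,
    each defined only up to scale), built from the elementary symmetric
    polynomials e1, e2, e3 of the eigenvalues of M = A^{-1} B."""
    M = np.linalg.inv(A) @ B
    e1 = np.trace(M)
    e2 = 0.5 * (e1 ** 2 - np.trace(M @ M))
    e3 = np.linalg.det(M)
    # Under a homography (and rescaling of A, B), M becomes s * H M H^{-1},
    # so e1, e2, e3 scale as s, s^2, s^3; these ratios cancel the scale.
    return e1 * e2 / e3, e1 ** 3 / e3
```

Descriptors of this kind can be computed once per known crater pattern and stored in a searchable index, since they do not depend on the viewpoint.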
Disparate View Matching
Matching of disparate views has gained significance in computer vision due to its role in many novel application areas. The ability to match images of the same scene captured during day and night, between a historic and a contemporary picture of a scene, and between aerial and ground-level views of a building facade enables novel applications ranging from loop-closure detection for structure-from-motion and re-photography to geo-localization of a street-level image using reference imagery captured from the air. The goal of this work is to develop novel features and methods that address matching problems where direct appearance-based correspondences are either difficult to obtain or infeasible because of the lack of appearance similarity altogether. To address these problems, we propose methods that span the appearance-geometry spectrum in terms of both the use of these cues as well as the ability of each method to handle variations in appearance and geometry. First, we consider the problem of geo-localization of a query street-level image using a reference database of building facades captured from a bird's eye view. To address this wide-baseline facade matching problem, a novel scale-selective self-similarity feature that avoids direct comparison of appearance between disparate facade images is presented. Next, to address image matching problems with more extreme appearance variation, a novel representation for matchable images expressed in terms of the eigen-functions of the joint graph of the two images is presented. This representation is used to derive features that are persistent across wide variations in appearance. Next, the problem setting of matching between a street-level image and a digital elevation map (DEM) is considered. Given the limited appearance information available in this scenario, the matching approach has to rely more significantly on geometric cues. 
Therefore, a purely geometric method to establish correspondences between building corners in the DEM and the visible corners in the query image is presented. Finally, to generalize this problem setting we address the problem of establishing correspondences between 3D and 2D point clouds using geometric means alone. A novel framework for incorporating purely geometric constraints into a higher-order graph matching framework is presented with specific formulations for the three-point calibrated absolute camera pose problem (P3P), two-point upright camera pose problem (Up2p) and the three-plus-one relative camera pose problem
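The joint-graph construction can be sketched in a minimal form. Assuming precomputed intra-image and cross-image affinity matrices (the particular construction below is our illustration, not the thesis's exact formulation), the low-frequency eigenvectors of the joint graph's Laplacian provide per-node features shared across both images:

```python
import numpy as np

def joint_graph_eigenfunctions(W1, W2, C, k=4):
    """Eigenfunctions (smallest-eigenvalue eigenvectors) of the Laplacian
    of the joint graph of two images.

    W1, W2 : symmetric intra-image affinity matrices (n1 x n1, n2 x n2)
    C      : cross-image affinity matrix (n1 x n2) linking the two images
    Returns the k smallest eigenvalues and their eigenvectors; the rows of
    the eigenvector matrix act as per-node features defined jointly over
    both images, and so can be compared across wide appearance changes."""
    W = np.block([[W1, C], [C.T, W2]])   # adjacency of the joint graph
    L = np.diag(W.sum(axis=1)) - W       # combinatorial graph Laplacian
    vals, vecs = np.linalg.eigh(L)       # ascending eigenvalues
    return vals[:k], vecs[:, :k]
```

Because the embedding is computed on the two images jointly, corresponding nodes receive nearby coordinates even when their raw appearance differs.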
Vision Sensors and Edge Detection
The Vision Sensors and Edge Detection book reflects a selection of recent developments within the area of vision sensors and edge detection. There are two sections in this book. The first section presents vision sensors, with applications to panoramic vision sensors, wireless vision sensors, and automated vision sensor inspection; the second covers image processing techniques such as image measurements, image transformations, filtering, and parallel computing
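Edge detection, one of the topics covered, can be illustrated with the standard 3x3 Sobel operator (a generic textbook method; the book's individual chapters may use different operators):

```python
import numpy as np

def sobel_magnitude(img):
    """Gradient magnitude via 3x3 Sobel cross-correlation, 'valid' region
    only (output is 2 pixels smaller in each dimension)."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)  # d/dx
    ky = kx.T                                                    # d/dy
    H, W = img.shape
    gx = np.zeros((H - 2, W - 2))
    gy = np.zeros((H - 2, W - 2))
    for i in range(H - 2):
        for j in range(W - 2):
            win = img[i:i + 3, j:j + 3]
            gx[i, j] = np.sum(kx * win)
            gy[i, j] = np.sum(ky * win)
    return np.hypot(gx, gy)   # strong response at intensity discontinuities
```

On a vertical step edge the response is zero in the flat regions and peaks at the columns straddling the step.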
Pose Invariant Gait Analysis And Reconstruction
One of the unique advantages of human gait is that it can be perceived from a distance. A varied range of research has been undertaken within the field of gait recognition. However, in almost all circumstances subjects have been constrained to walk fronto-parallel to the camera with a single walking speed. In this thesis we show that gait has sufficient properties that allow us to exploit the structure of articulated leg motion within single view sequences, in order to remove the unknown subject pose and reconstruct the underlying gait signature, with no prior knowledge of the camera calibration. Articulated leg motion is approximately planar, since almost all of the perceived motion is contained within a single limb swing plane. The variation of motion out of this plane is subtle and negligible in comparison to this major plane of motion. Subsequently, we can model human motion by employing a cardboard person assumption. A subject's body and leg segments may be represented by repeating spatio-temporal motion patterns within a set of bilaterally symmetric limb planes. The static features of gait are defined as quantities that remain invariant over the full range of walking motions. In total, we have identified nine static features of articulated leg motion, corresponding to the fronto-parallel view of gait, that remain invariant to the differences in the mode of subject motion. These features are hypothetically unique to each individual, and thus can be used as suitable parameters for biometric identification. We develop a stratified approach to linear trajectory gait reconstruction that uses the rigid bone lengths of planar articulated leg motion in order to reconstruct the fronto-parallel view of gait. Furthermore, subject motion commonly occurs within a fixed ground plane and is imaged by a static camera. In general, people tend to walk in straight lines with constant velocity. Imaged gait can then be split piecewise into natural segments of linear motion. 
If two or more sufficiently different imaged trajectories are available then the calibration of the camera can be determined. Subsequently, the total pattern of gait motion can be globally parameterised for all subjects within an image sequence. We present the details of a sparse method that computes the maximum likelihood estimate of this set of parameters, then conclude with a reconstruction error analysis corresponding to an example image sequence of subject motion
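One consequence of the constant-velocity, straight-line assumption can be made concrete. Equally spaced positions along a 3D line fix the cross-ratio with the line's point at infinity at -1, so the vanishing point of an imaged linear trajectory follows from just three imaged positions (a standard projective-geometry fact; the thesis's full calibration procedure is more elaborate):

```python
def vanishing_point_1d(a, b, c):
    """1D image coordinate (measured along the imaged line) of the line's
    vanishing point, given the images a, b, c of three equally spaced
    3D positions on that line. Equal spacing forces the cross-ratio
    (a, c; b, v) = -1, i.e. v is the harmonic conjugate of b w.r.t. a, c."""
    return (2.0 * a * c - b * (a + c)) / (a + c - 2.0 * b)
```

With two or more such trajectories in sufficiently different directions, the recovered vanishing points constrain the camera calibration, as the abstract describes.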
Development and evaluation of a multiscale keypoint detector based on complex wavelets
This thesis develops a multiscale keypoint detector and descriptor based on the Dual-Tree Complex Wavelet Transform (DTCWT). First, we develop a scale-space framework called the 4S-DTCWT that uses the dyadic decomposition of the DTCWT but achieves denser sampling in scale by interleaving several DTCWT trees, leading to reduced scale-related aliasing. This forms the foundation for the rest of our work. Then, we present a new DTCWT based keypoint detector (BTK), which exhibits improved spatial localisation owing to the use of a more selective cornerness measure and keypoint localisation in individual levels in the 4S-DTCWT. A number of scale refinement approaches are investigated.
The improved keypoint position and scale localisation directly leads to more robust image characterisation using DTCWT based visual descriptors. We also present some ways of speeding up both the descriptor and the matching computations. These changes make it possible to use the system in practical scenarios.
We develop a novel, fully automated framework for the evaluation of keypoint detectors and descriptors. This includes a new dataset containing 3978 calibrated images from 2 cameras of 39 different toy cars on a turntable. The dataset, calibration images, inter-camera calibration, rotational calibration and test scripts are publicly available. We establish ground truth correspondences using a three-image setup, with fixed angular separation between two of the three views, thus reducing the dependency on angular separation when compared to conventional epipolar line search.
Various keypoint detectors and descriptors were compared with DTCWT based methods using this framework. To the extent possible, we separated the evaluation of the keypoint detectors from that of the descriptors. The main conclusions were that DTCWT based methods can achieve a performance comparable, if not superior, to that of established methods. We also showed that, although repeatability of keypoint detections falls off reasonably steeply with change in viewing angle, conditioned on an associated keypoint being detected at a reasonably correct corresponding location, descriptor similarity is hardly affected by viewpoint variation.
Finally, we show how an evaluation that is based purely on the prior knowledge of the geometry of the scene can be useful in eliminating the inaccuracies involved in appearance based evaluations. This uses an enhanced epipolar constraint that exploits both positions and scales of keypoints to constrain the range of possible matches
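The scale-based part of the enhanced epipolar constraint is not detailed in this summary, but the position-only ingredient it builds on is the standard point-to-epipolar-line test, sketched here as a hedged illustration (generic formulation, not the thesis's exact scoring):

```python
import numpy as np

def skew(t):
    """3x3 cross-product matrix, so that skew(t) @ x == np.cross(t, x)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def symmetric_epipolar_distance(F, x1, x2):
    """Sum of the point-to-epipolar-line distances in both images for a
    putative match (x1, x2) in homogeneous pixel coordinates, given the
    fundamental matrix F."""
    l2 = F @ x1                      # epipolar line of x1 in image 2
    l1 = F.T @ x2                    # epipolar line of x2 in image 1
    return (abs(x2 @ l2) / np.hypot(l2[0], l2[1])
            + abs(x1 @ l1) / np.hypot(l1[0], l1[1]))
```

Thresholding this distance restricts candidate matches to a band around the epipolar line; the evaluation described above additionally uses keypoint scale to narrow the match range further.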
An empirical assessment of real-time progressive stereo reconstruction
3D reconstruction from images, the problem of recovering depth from images, is one of the most well-studied problems within computer vision, partly because it is academically interesting but also because of the significant growth in the use of 3D models. This growth can be attributed to the development of augmented reality, 3D printing and indoor mapping. Progressive stereo reconstruction is the sequential application of stereo reconstructions to reconstruct a scene. To achieve a reliable progressive stereo reconstruction, a combination of best practice algorithms needs to be used. The purpose of this research is to determine the combination of best practice algorithms that leads to the most accurate and efficient progressive stereo reconstruction, i.e. the best practice combination. In order to obtain a similarity reconstruction, the intrinsic parameters of the camera need to be known. If they are not known, they are determined by capturing ten images of a checkerboard with a known calibration pattern from different angles and using the moving plane algorithm. Thereafter, in order to perform a near real-time reconstruction, frames are acquired and reconstructed simultaneously. For the first pair of frames, keypoints are detected and matched using a best practice keypoint detection and matching algorithm. The motion of the camera between the frames is then determined by decomposing the essential matrix, which is determined from the fundamental matrix, which is in turn determined using a best practice ego-motion estimation algorithm. Finally, the keypoints are reconstructed using a best practice reconstruction algorithm. For sequential frames, each frame is paired with the previous frame and keypoints are therefore only detected in the sequential frame. 
They are detected, matched and reconstructed in the same fashion as the first pair of frames; however, to ensure that the reconstructed points are in the same scale as the points reconstructed from the first pair of frames, the motion of the camera between the frames is estimated from 3D-2D correspondences using a best practice algorithm. If the purpose of progressive reconstruction is visualization, the best practice combination algorithm for keypoint detection was found to be Speeded Up Robust Features (SURF), as it results in more reconstructed points than the Scale-Invariant Feature Transform (SIFT). SIFT is, however, more computationally efficient and thus better suited if the number of reconstructed points does not matter, for example if the purpose of progressive reconstruction is camera tracking. For all purposes, the best practice combination algorithm for matching was found to be optical flow, as it is the most efficient, and for ego-motion estimation the best practice combination algorithm was found to be the 5-point algorithm, as it is robust to points located on planes. This research is significant because the effects of the key steps of progressive reconstruction, and the choices made at each step, on the accuracy and efficiency of the reconstruction as a whole have never been studied. As a result, progressive stereo reconstruction can now be performed in near real-time on a mobile device without compromising the accuracy of reconstruction
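The reconstruction step that all of these combinations share can be sketched with standard linear (DLT) triangulation of a matched keypoint from two calibrated views (a generic textbook method, not necessarily the exact best practice reconstruction algorithm identified above):

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of a single point from two views.

    P1, P2 : 3x4 camera projection matrices
    x1, x2 : (x, y) image observations of the same 3D point
    Returns the 3D point as a length-3 array (Euclidean coordinates)."""
    # Each observation contributes two linear constraints on homogeneous X
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)   # null vector of A = least-squares solution
    X = Vt[-1]
    return X[:3] / X[3]
```

In the progressive setting the same routine is applied pair after pair, with the camera matrices of later pairs fixed by 3D-2D pose estimation so that all points share one scale.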
Geometric uncertainty models for correspondence problems in digital image processing
Many recent advances in technology rely heavily on the correct interpretation of an enormous amount of visual information. All available sources of visual data (e.g. cameras in surveillance networks, smartphones, game consoles) must be adequately processed to retrieve the most interesting user information. Therefore, computer vision and image processing techniques are attracting significant interest at the moment, and will continue to do so in the near future.
Most commonly applied image processing algorithms require a reliable solution for correspondence problems. The solution involves, first, the localization of corresponding points (image points depicting the same 3D point in the observed scene) in the different images from distinct sources, and second, the computation of consistent geometric transformations relating correspondences on scene objects.
This PhD presents a theoretical framework for solving correspondence problems with geometric features (such as points and straight lines) representing rigid objects in image sequences of complex scenes with static and dynamic cameras. The research focuses on localization uncertainty due to errors in feature detection and measurement, and its effect on each step in the solution of a correspondence problem.
Whereas most other recent methods apply statistical models for spatial localization uncertainty, this work considers a novel geometric approach. Localization uncertainty is modeled as a convex polygonal region in the image space. This model can be efficiently propagated throughout the correspondence finding procedure. It allows for an easy extension toward transformation uncertainty models, and for inferring confidence measures that verify the reliability of the outcome of the correspondence framework. Our procedure aims at finding reliable consistent transformations in small sets of ill-localized features, possibly containing a large fraction of false candidate correspondences.
The evaluation of the proposed procedure in practical correspondence problems shows that correct consistent correspondence sets are returned in over 95% of the experiments for small sets of 10-40 features contaminated with up to 400% of false positives and 40% of false negatives. The presented techniques prove to be beneficial in typical image processing applications, such as image registration and rigid object tracking
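The polygonal uncertainty model lends itself to a very direct implementation. As a minimal sketch (our illustration of the idea, not the thesis's actual propagation machinery): a convex polygon is carried through an affine transformation by mapping its vertices, and membership tests reduce to per-edge half-plane checks:

```python
import numpy as np

def map_polygon(V, A, b):
    """Propagate a convex polygonal uncertainty region through the affine
    map x -> A x + b by mapping its vertices (convexity is preserved;
    with det(A) > 0 the vertex orientation is preserved too)."""
    return V @ A.T + b

def contains(V, p, eps=1e-12):
    """Point-in-convex-polygon test; V lists vertices counter-clockwise."""
    n = len(V)
    for i in range(n):
        a, c = V[i], V[(i + 1) % n]
        edge = c - a
        # p must lie on the left of (or on) every counter-clockwise edge
        if edge[0] * (p[1] - a[1]) - edge[1] * (p[0] - a[0]) < -eps:
            return False
    return True
```

A candidate correspondence is then consistent with a transformation exactly when the transformed uncertainty polygon of one feature contains (or overlaps) the polygon of the other.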
Geometry model for marker-based localisation
This work presents a novel approach for position estimation from monocular vision. It has been shown that vision systems are capable of precise and accurate measurements and are becoming the state of the art in navigation. Navigation systems have only been integrated in commercial mobile robots since the early 2000s, and yet localisation in a dynamic environment, which forms the main building block of navigation, has no truly elegant solution. Solutions are many, and their strategies and methods differ depending on the application. For lack of a single accurate procedure, methods are combined using different sensor fusion. This thesis focuses on the use of a monocular vision sensor to develop an accurate marker-based positioning system that can be used in various outdoor applications, in agriculture for example, and in other indoor applications. Several contributions arise in this context. A main contribution is in perspective distortion correction, in which distortions are modelled in all their forms, together with a correction process. This is essential when dealing with measurements and shapes in images. Because of the lack of robustness in depth sensing with a monocular vision-based system, a second contribution is a novel spherical marker-based approach to position capture, designed and developed within the concept of relative pose estimation. In this model-based position estimation, the relative position can be extracted instantaneously without prior knowledge of the previous state of the camera, as it relies on a monocular image. This model can also compensate for the lack of knowledge of real-world scale, for example in the case of monocular Visual Simultaneous Localisation and Mapping (VSLAM). 
In addition to these contributions, experimental and simulation evidence presented in this work shows the feasibility of reading measurements such as distance and relative pose between the marker-based model and the observer, with reliability and high accuracy. The system has shown the ability to accurately track the object at the farthest possible position from low-resolution digital images and from a single viewpoint. While the main targeted application field is tracking mobile robots, other applications can profit from this concept, such as motion capture and applications related to the field of topography
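The abstract does not give the spherical-marker model itself; the kind of monocular distance measurement involved can be illustrated with a first-order pinhole relation between a sphere's known radius and its imaged size (an assumption on our part for illustration, not the thesis's actual model):

```python
def sphere_distance(f_px, radius_m, r_px):
    """First-order pinhole estimate of the distance (metres) to a sphere of
    known radius (metres) from its imaged radius in pixels: Z ~ f * R / r.
    The exact projected contour of a sphere is a conic, so this holds only
    in the small-angle regime where the contour is nearly circular."""
    return f_px * radius_m / r_px
```

A spherical marker is convenient precisely because its apparent size is independent of viewing direction, so a single viewpoint suffices for a range reading.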