948 research outputs found

    Keyframe-based monocular SLAM: design, survey, and future directions

    Get PDF
    Extensive research in the field of monocular SLAM for the past fifteen years has yielded workable systems that found their way into various applications in robotics and augmented reality. Although filter-based monocular SLAM systems were common at some time, the more efficient keyframe-based solutions are becoming the de facto methodology for building a monocular SLAM system. The objective of this paper is threefold: first, the paper serves as a guideline for people seeking to design their own monocular SLAM according to specific environmental constraints. Second, it presents a survey that covers the various keyframe-based monocular SLAM systems in the literature, detailing the components of their implementation, and critically assessing the specific strategies made in each proposed solution. Third, the paper provides insight into the direction of future research in this field, to address the major limitations still facing monocular SLAM; namely, in the issues of illumination changes, initialization, highly dynamic motion, poorly textured scenes, repetitive textures, map maintenance, and failure recovery

    LEVEL-BASED CORRESPONDENCE APPROACH TO COMPUTATIONAL STEREO

    Get PDF
    One fundamental problem in computational stereo reconstruction is correspondence. Correspondence is the method of detecting the real world object reflections in two camera views. This research focuses on correspondence, proposing an algorithm to improve such detection for low quality cameras (webcams) while trying to achieve real-time image processing. Correspondence plays an important role in computational stereo reconstruction and it has a vast spectrum of applicability. This method is useful in other areas such as structure from motion reconstruction, object detection, tracking in robot vision and virtual reality. Due to its importance, a correspondence method needs to be accurate enough to meet the requirement of such fields but it should be less costly and easy to use and configure, to be accessible by everyone. By comparing current local correspondence method and discussing their weakness and strength, this research tries to enhance an algorithm to improve previous works to achieve fast detection, less costly and acceptable accuracy to meet the requirement of reconstruction. In this research, the correspondence is divided into four stages. Two stages of preprocessing which are noise reduction and edge detection have been compared with respect to different methods available. In the next stage, the feature detection process is introduced and discussed focusing on possible solutions to reduce errors created by system or problem occurring in the scene such as occlusion. Lastly, in the final stage it elaborates different methods of displaying reconstructed result. Different sets of data are processed based on the steps involved in correspondence and the results are discussed and compared in detail. The finding shows how this system can achieve high speed and acceptable outcome despite of poor quality input. As a conclusion, some possible improvements are proposed based on ultimate outcome

    Geometric and photometric affine invariant image registration

    Get PDF
    This thesis aims to present a solution to the correspondence problem for the registration of wide-baseline images taken from uncalibrated cameras. We propose an affine invariant descriptor that combines the geometry and photometry of the scene to find correspondences between both views. The geometric affine invariant component of the descriptor is based on the affine arc-length metric, whereas the photometry is analysed by invariant colour moments. A graph structure represents the spatial distribution of the primitive features; i.e. nodes correspond to detected high-curvature points, whereas arcs represent connectivities by extracted contours. After matching, we refine the search for correspondences by using a maximum likelihood robust algorithm. We have evaluated the system over synthetic and real data. The method is endemic to propagation of errors introduced by approximations in the system.BAE SystemsSelex Sensors and Airborne System

    Image understanding and feature extraction for applications in industry and mapping

    Get PDF
    Bibliography: p. 212-220.The aim of digital photogrammetry is the automated extraction and classification of the three dimensional information of a scene from a number of images. Existing photogrammetric systems are semi-automatic requiring manual editing and control, and have very limited domains of application so that image understanding capabilities are left to the user. Among the most important steps in a fully integrated system are the extraction of features suitable for matching, the establishment of the correspondence between matching points and object classification. The following study attempts to explore the applicability of pattern recognition concepts in conjunction with existing area-based methods, feature-based techniques and other approaches used in computer vision in order to increase the level of automation and as a general alternative and addition to existing methods. As an illustration of the pattern recognition approach examples of industrial applications are given. The underlying method is then extended to the identification of objects in aerial images of urban scenes and to the location of targets in close-range photogrammetric applications. Various moment-based techniques are considered as pattern classifiers including geometric invariant moments, Legendre moments, Zernike moments and pseudo-Zernike moments. Two-dimensional Fourier transforms are also considered as pattern classifiers. The suitability of these techniques is assessed. These are then applied as object locators and as feature extractors or interest operators. Additionally the use of fractal dimension to segment natural scenes for regional classification in order to limit the search space for particular objects is considered. The pattern recognition techniques require considerable preprocessing of images. The various image processing techniques required are explained where needed. Extracted feature points are matched using relaxation based techniques in conjunction with area-based methods to 'obtain subpixel accuracy. A subpixel pattern recognition based method is also proposed and an investigation into improved area-based subpixel matching methods is undertaken. An algorithm for determining relative orientation parameters incorporating the epipolar line constraint is investigated and compared with a standard relative orientation algorithm. In conclusion a basic system that can be automated based on some novel techniques in conjunction with existing methods is described and implemented in a mapping application. This system could be largely automated with suitably powerful computers

    An empirical assessment of real-time progressive stereo reconstruction

    Get PDF
    3D reconstruction from images, the problem of reconstructing depth from images, is one of the most well-studied problems within computer vision. In part because it is academically interesting, but also because of the significant growth in the use of 3D models. This growth can be attributed to the development of augmented reality, 3D printing and indoor mapping. Progressive stereo reconstruction is the sequential application of stereo reconstructions to reconstruct a scene. To achieve a reliable progressive stereo reconstruction a combination of best practice algorithms needs to be used. The purpose of this research is to determine the combinat ion of best practice algorithms that lead to the most accurate and efficient progressive stereo reconstruction i.e the best practice combination. In order to obtain a similarity reconstruction the in t rinsic parameters of the camera need to be known. If they are not known they are determined by capturing ten images of a checkerboard with a known calibration pattern from different angles and using the moving plane algori thm. Thereafter in order to perform a near real-time reconstruction frames are acquired and reconstructed simultaneously. For the first pair of frames keypoints are detected and matched using a best practice keypoint detection and matching algorithm. The motion of the camera between the frames is then determined by decomposing the essential matrix which is determined from the fundamental matrix, which is determined using a best practice ego-motion estimation algorithm. Finally the keypoints are reconstructed using a best practice reconstruction algorithm. For sequential frames each frame is paired with t he previous frame and keypoints are therefore only detected in the sequential frame. They are detected , matched and reconstructed in the same fashion as the first pair of frames, however to ensure that the reconstructed points are in the same scale as the points reconstructed from the first pair of frames the motion of the camera between t he frames is estimated from 3D-2D correspondences using a best practice algorithm. If the purpose of progressive reconstruction is for visualization the best practice combination algorithm for keypoint detection was found to be Speeded Up Robust Features (SURF) as it results in more reconstructed points than Scale-Invariant Feature Transform (SIFT). SIFT is however more computationally efficient and thus better suited if the number of reconstructed points does not matter, for example if the purpose of progressive reconstruction is for camera tracking. For all purposes the best practice combination algorithm for matching was found to be optical flow as it is the most efficient and for ego-motion estimation the best practice combination algorithm was found to be the 5-point algorithm as it is robust to points located on planes. This research is significant as the effects of the key steps of progressive reconstruction and the choices made at each step on the accuracy and efficiency of the reconstruction as a whole have never been studied. As a result progressive stereo reconstruction can now be performed in near real-time on a mobile device without compromising the accuracy of reconstruction

    Learning and Searching Methods for Robust, Real-Time Visual Odometry.

    Full text link
    Accurate position estimation provides a critical foundation for mobile robot perception and control. While well-studied, it remains difficult to provide timely, precise, and robust position estimates for applications that operate in uncontrolled environments, such as robotic exploration and autonomous driving. Continuous, high-rate egomotion estimation is possible using cameras and Visual Odometry (VO), which tracks the movement of sparse scene content known as image keypoints or features. However, high update rates, often 30~Hz or greater, leave little computation time per frame, while variability in scene content stresses robustness. Due to these challenges, implementing an accurate and robust visual odometry system remains difficult. This thesis investigates fundamental improvements throughout all stages of a visual odometry system, and has three primary contributions: The first contribution is a machine learning method for feature detector design. This method considers end-to-end motion estimation accuracy during learning. Consequently, accuracy and robustness are improved across multiple challenging datasets in comparison to state of the art alternatives. The second contribution is a proposed feature descriptor, TailoredBRIEF, that builds upon recent advances in the field in fast, low-memory descriptor extraction and matching. TailoredBRIEF is an in-situ descriptor learning method that improves feature matching accuracy by efficiently customizing descriptor structures on a per-feature basis. Further, a common asymmetry in vision system design between reference and query images is described and exploited, enabling approaches that would otherwise exceed runtime constraints. The final contribution is a new algorithm for visual motion estimation: Perspective Alignment Search~(PAS). Many vision systems depend on the unique appearance of features during matching, despite a large quantity of non-unique features in otherwise barren environments. A search-based method, PAS, is proposed to employ features that lack unique appearance through descriptorless matching. This method simplifies visual odometry pipelines, defining one method that subsumes feature matching, outlier rejection, and motion estimation. Throughout this work, evaluations of the proposed methods and systems are carried out on ground-truth datasets, often generated with custom experimental platforms in challenging environments. Particular focus is placed on preserving runtimes compatible with real-time operation, as is necessary for deployment in the field.PhDComputer Science and EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/113365/1/chardson_1.pd

    Robust surface modelling of visual hull from multiple silhouettes

    Get PDF
    Reconstructing depth information from images is one of the actively researched themes in computer vision and its application involves most vision research areas from object recognition to realistic visualisation. Amongst other useful vision-based reconstruction techniques, this thesis extensively investigates the visual hull (VH) concept for volume approximation and its robust surface modelling when various views of an object are available. Assuming that multiple images are captured from a circular motion, projection matrices are generally parameterised in terms of a rotation angle from a reference position in order to facilitate the multi-camera calibration. However, this assumption is often violated in practice, i.e., a pure rotation in a planar motion with accurate rotation angle is hardly realisable. To address this problem, at first, this thesis proposes a calibration method associated with the approximate circular motion. With these modified projection matrices, a resulting VH is represented by a hierarchical tree structure of voxels from which surfaces are extracted by the Marching cubes (MC) algorithm. However, the surfaces may have unexpected artefacts caused by a coarser volume reconstruction, the topological ambiguity of the MC algorithm, and imperfect image processing or calibration result. To avoid this sensitivity, this thesis proposes a robust surface construction algorithm which initially classifies local convex regions from imperfect MC vertices and then aggregates local surfaces constructed by the 3D convex hull algorithm. Furthermore, this thesis also explores the use of wide baseline images to refine a coarse VH using an affine invariant region descriptor. This improves the quality of VH when a small number of initial views is given. In conclusion, the proposed methods achieve a 3D model with enhanced accuracy. Also, robust surface modelling is retained when silhouette images are degraded by practical noise

    Object recognition using multi-view imaging

    No full text
    Single view imaging data has been used in most previous research in computer vision and image understanding and lots of techniques have been developed. Recently with the fast development and dropping cost of multiple cameras, it has become possible to have many more views to achieve image processing tasks. This thesis will consider how to use the obtained multiple images in the application of target object recognition. In this context, we present two algorithms for object recognition based on scale- invariant feature points. The first is single view object recognition method (SOR), which operates on single images and uses a chirality constraint to reduce the recognition errors that arise when only a small number of feature points are matched. The procedure is extended in the second multi-view object recognition algorithm (MOR) which operates on a multi-view image sequence and, by tracking feature points using a dynamic programming method in the plenoptic domain subject to the epipolar constraint, is able to fuse feature point matches from all the available images, resulting in more robust recognition. We evaluated these algorithms using a number of data sets of real images capturing both indoor and outdoor scenes. We demonstrate that MOR is better than SOR particularly for noisy and low resolution images, and it is also able to recognize objects that are partially occluded by combining it with some segmentation techniques
    corecore