6 research outputs found

    Correspondenceless scan-to-map-scan matching of homoriented 2D scans for mobile robot localisation

    Full text link
    The objective of this study is improving the location estimate of a mobile robot capable of motion on a plane and mounted with a conventional 2D LIDAR sensor, given an initial guess for its location on a 2D map of its surroundings. Documented herein is the theoretical reasoning behind solving a matching problem between two homoriented 2D scans, one derived from the robot's physical sensor and one derived by simulating its operation within the map, in a manner that does not require the establishing of correspondences between their constituting rays. Two results are proved and subsequently shown through experiments. The first is that the true position of the sensor can be recovered with arbitrary precision when the physical sensor reports faultless measurements and there is no discrepancy between the environment the robot operates in and its perception of it by the robot. The second is that when either is affected by disturbance, the location estimate is bound in a neighbourhood of the true location whose radius is proportional to the affecting disturbance.Comment: 19 pages, 19 figure

    Stochastic constraints for vision-aided inertial navigation

    Get PDF
    Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Mechanical Engineering, 2005.Includes bibliographical references (p. 107-110).This thesis describes a new method to improve inertial navigation using feature-based constraints from one or more video cameras. The proposed method lengthens the period of time during which a human or vehicle can navigate in GPS-deprived environments. Our approach integrates well with existing navigation systems, because we invoke general sensor models that represent a wide range of available hardware. The inertial model includes errors in bias, scale, and random walk. Any camera and tracking algorithm may be used, as long as the visual output can be expressed as ray vectors extending from known locations on the sensor body. A modified linear Kalman filter performs the data fusion. Unlike traditional Simultaneous Localization and Mapping (SLAM/CML), our state vector contains only inertial sensor errors related to position. This choice allows uncertainty to be properly represented by a covariance matrix. We do not augment the state with feature coordinates. Instead, image data contributes stochastic epipolar constraints over a broad baseline in time and space, resulting in improved observability of the IMU error states. The constraints lead to a relative residual and associated relative covariance, defined partly by the state history. Navigation results are presented using high-quality synthetic data and real fisheye imagery.by David D. Diel.S.M

    On unifying sparsity and geometry for image-based 3D scene representation

    Get PDF
    Demand has emerged for next generation visual technologies that go beyond conventional 2D imaging. Such technologies should capture and communicate all perceptually relevant three-dimensional information about an environment to a distant observer, providing a satisfying, immersive experience. Camera networks offer a low cost solution to the acquisition of 3D visual information, by capturing multi-view images from different viewpoints. However, the camera's representation of the data is not ideal for common tasks such as data compression or 3D scene analysis, as it does not make the 3D scene geometry explicit. Image-based scene representations fundamentally require a multi-view image model that facilitates extraction of underlying geometrical relationships between the cameras and scene components. Developing new, efficient multi-view image models is thus one of the major challenges in image-based 3D scene representation methods. This dissertation focuses on defining and exploiting a new method for multi-view image representation, from which the 3D geometry information is easily extractable, and which is additionally highly compressible. The method is based on sparse image representation using an overcomplete dictionary of geometric features, where a single image is represented as a linear combination of few fundamental image structure features (edges for example). We construct the dictionary by applying a unitary operator to an analytic function, which introduces a composition of geometric transforms (translations, rotation and anisotropic scaling) to that function. The advantage of this approach is that the features across multiple views can be related with a single composition of transforms. We then establish a connection between image components and scene geometry by defining the transforms that satisfy the multi-view geometry constraint, and obtain a new geometric multi-view correlation model. We first address the construction of dictionaries for images acquired by omnidirectional cameras, which are particularly convenient for scene representation due to their wide field of view. Since most omnidirectional images can be uniquely mapped to spherical images, we form a dictionary by applying motions on the sphere, rotations, and anisotropic scaling to a function that lives on the sphere. We have used this dictionary and a sparse approximation algorithm, Matching Pursuit, for compression of omnidirectional images, and additionally for coding 3D objects represented as spherical signals. Both methods offer better rate-distortion performance than state of the art schemes at low bit rates. The novel multi-view representation method and the dictionary on the sphere are then exploited for the design of a distributed coding method for multi-view omnidirectional images. In a distributed scenario, cameras compress acquired images without communicating with each other. Using a reliable model of correlation between views, distributed coding can achieve higher compression ratios than independent compression of each image. However, the lack of a proper model has been an obstacle for distributed coding in camera networks for many years. We propose to use our geometric correlation model for distributed multi-view image coding with side information. The encoder employs a coset coding strategy, developed by dictionary partitioning based on atom shape similarity and multi-view geometry constraints. Our method results in significant rate savings compared to independent coding. An additional contribution of the proposed correlation model is that it gives information about the scene geometry, leading to a new camera pose estimation method using an extremely small amount of data from each camera. Finally, we develop a method for learning stereo visual dictionaries based on the new multi-view image model. Although dictionary learning for still images has received a lot of attention recently, dictionary learning for stereo images has been investigated only sparingly. Our method maximizes the likelihood that a set of natural stereo images is efficiently represented with selected stereo dictionaries, where the multi-view geometry constraint is included in the probabilistic modeling. Experimental results demonstrate that including the geometric constraints in learning leads to stereo dictionaries that give both better distributed stereo matching and approximation properties than randomly selected dictionaries. We show that learning dictionaries for optimal scene representation based on the novel correlation model improves the camera pose estimation and that it can be beneficial for distributed coding
    corecore