
    A comparative evaluation of interest point detectors and local descriptors for visual SLAM

    In this paper we compare the behavior of different interest point detectors and descriptors under the conditions required for their use as landmarks in vision-based simultaneous localization and mapping (SLAM). We evaluate the repeatability of the detectors, as well as the invariance and distinctiveness of the descriptors, under different perceptual conditions, using sequences of images representing planar objects as well as 3D scenes. We believe that this information will be useful when selecting an appropriat
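
    The repeatability measure evaluated here can be computed as the fraction of points detected in a reference image that reappear, after warping, within a small tolerance in a second image. Below is a minimal sketch under assumed inputs: a known homography H relating two views of a planar object, and OpenCV's ORB standing in for whichever detector is being evaluated (the paper itself compares several).

```python
import cv2
import numpy as np

def detector_repeatability(img_ref, img_warp, H, tol=2.0):
    """Fraction of reference keypoints that reappear within `tol` pixels
    after mapping them into the second image with homography H."""
    orb = cv2.ORB_create(nfeatures=500)          # stand-in detector
    kp_ref = orb.detect(img_ref, None)
    kp_warp = orb.detect(img_warp, None)
    if not kp_ref or not kp_warp:
        return 0.0

    # Project reference keypoints into the second image.
    pts_ref = np.float32([k.pt for k in kp_ref]).reshape(-1, 1, 2)
    pts_proj = cv2.perspectiveTransform(pts_ref, H).reshape(-1, 2)
    pts_warp = np.float32([k.pt for k in kp_warp])

    # A projected point counts as repeated if some detection lies within tol pixels.
    dists = np.linalg.norm(pts_proj[:, None, :] - pts_warp[None, :, :], axis=2)
    repeated = (dists.min(axis=1) <= tol).sum()
    return repeated / len(kp_ref)
```

    In practice, projected keypoints that fall outside the second image should be excluded from the denominator; the sketch omits that check for brevity.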

    Local Features, Structure-from-motion and View Synthesis in Spherical Video

    This thesis addresses the problem of synthesising new views from spherical video or image sequences. We propose an interest point detector and feature descriptor that allow us to robustly match local features between pairs of spherical images and use this as part of a structure-from-motion pipeline that allows us to estimate camera pose from a spherical video sequence. With pose estimates to hand, we propose methods for view stabilisation and novel viewpoint synthesis. In Chapter 3 we describe our contribution in the area of feature detection and description in spherical images. First, we present a novel representation for spherical images which uses a discrete geodesic grid composed of hexagonal pixels. Second, we extend the BRISK binary descriptor to the sphere, proposing methods for multiscale corner detection, sub-pixel position and sub-octave scale refinement, and descriptor construction in the tangent space to the sphere. In Chapter 4 we describe our contributions in the area of spherical structure-from-motion. We revisit problems from multiview geometry in the context of spherical images. We propose methods suited to spherical camera geometry for the spherical-n-point problem and calibrated spherical reconstruction. We introduce a new probabilistic interpretation of spherical structure-from-motion which uses the von Mises-Fisher distribution to model spherical feature point positions. This model provides an alternate objective function that we use in bundle adjustment. In Chapter 5 we describe our contributions in the area of view synthesis from spherical images. We exploit the camera pose estimates made by our pipeline and use these in two view synthesis applications. The first is view stabilisation, where we remove the effect of viewing-direction changes often present in first-person video. Second, we propose a method for synthesising novel viewpoints
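
    For the probabilistic model mentioned above, the von Mises-Fisher density on the unit sphere is proportional to exp(kappa * mu^T x), so, up to additive constants, maximising the likelihood of an observed unit bearing vector x is equivalent to minimising -kappa * mu^T x. The sketch below is one illustrative reading of such an objective, not the thesis implementation: the predicted mean direction mu is taken as the normalised direction from an assumed camera centre c (with rotation R) to a 3D point, and the concentration kappa is an assumed constant.

```python
import numpy as np

def vmf_residual(X, R, c, x_obs, kappa=100.0):
    """Negative log-likelihood (up to an additive constant) of an observed
    unit bearing x_obs under a von Mises-Fisher distribution whose mean
    direction is the predicted bearing of 3D point X from a spherical
    camera with rotation R and centre c."""
    mu = R @ (X - c)                 # point expressed in the camera frame
    mu = mu / np.linalg.norm(mu)     # predicted unit bearing (vMF mean direction)
    return -kappa * float(mu @ x_obs)

# Toy usage: summing this term over all observations gives an objective that
# a general-purpose optimiser (e.g. scipy.optimize.minimize) could minimise
# over camera poses and 3D points in a bundle-adjustment-style refinement.
X = np.array([1.0, 0.5, 2.0])
R = np.eye(3)
c = np.zeros(3)
x_obs = X / np.linalg.norm(X)
print(vmf_residual(X, R, c, x_obs))   # close to -kappa when prediction matches
```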

    Distributed scene reconstruction from multiple mobile platforms

    Recent research on mobile robotics has produced new designs that provide household robots with omnidirectional motion. The image sensor embedded in these devices motivates the application of 3D vision techniques on them for navigation and mapping purposes. In addition, distributed cheap-sensing systems acting as a unitary entity have recently emerged as an efficient alternative to expensive mobile equipment. In this work we present an implementation of a visual reconstruction method, structure from motion (SfM), on a low-budget, omnidirectional mobile platform, and extend this method to distributed 3D scene reconstruction with several instances of such a platform. Our approach overcomes the challenges posed by the platform. The high levels of noise produced by the image compression typical of the platform are handled by our feature filtering methods, which ensure suitable feature-matching populations for epipolar geometry estimation by means of strict quality-based feature selection. The robust pose estimation algorithms implemented, along with a novel feature tracking system, enable our incremental SfM approach to deal with the ill-conditioned inter-image configurations caused by the omnidirectional motion. The feature tracking system efficiently manages the feature scarcity produced by noise and outputs quality feature tracks, which allow robust 3D mapping of a given scene even if, due to noise, their length is shorter than what is usually assumed necessary for stable 3D reconstruction. The distributed reconstruction from multiple instances of SfM is attained by applying loop-closing techniques. Our multiple-reconstruction system merges individual 3D structures and resolves the global scale problem with minimal overlaps, whereas in the literature 3D mapping is obtained by overlapping stretches of sequences. The performance of this system is demonstrated in the two-session case. The management of noise, the stability against ill-conditioned configurations and the robustness of our SfM system are validated in a number of experiments and compared with state-of-the-art approaches. Possible future research areas are also discussed
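
    One common way to realise strict quality-based feature selection before epipolar geometry estimation is a ratio test on descriptor matches followed by RANSAC. The sketch below uses generic OpenCV primitives (ORB, Lowe ratio test, essential-matrix RANSAC) as an assumed stand-in; the thesis defines its own filtering criteria and tracking system.

```python
import cv2
import numpy as np

def filtered_epipolar_geometry(img1, img2, K, ratio=0.7):
    """Detect ORB features, keep only high-quality matches (ratio test),
    then estimate the essential matrix with RANSAC and recover relative pose."""
    orb = cv2.ORB_create(nfeatures=2000)
    kp1, des1 = orb.detectAndCompute(img1, None)
    kp2, des2 = orb.detectAndCompute(img2, None)

    matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
    knn = matcher.knnMatch(des1, des2, k=2)

    # Quality-based selection: keep a match only if it is clearly better
    # than the second-best candidate.
    good = [pair[0] for pair in knn
            if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance]

    pts1 = np.float32([kp1[m.queryIdx].pt for m in good])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in good])

    E, inliers = cv2.findEssentialMat(pts1, pts2, K,
                                      method=cv2.RANSAC, threshold=1.0)
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=inliers)
    return R, t, int((inliers > 0).sum())
```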

    Superpixel Finite Element Segmentation for RGB-D Images


    Real-time Three-dimensional Photoacoustic Imaging

    Photoacoustic (PA) imaging is a modality that combines the benefits of two prominent imaging techniques: the strong contrast inherent to optical imaging and the enhanced penetration depth and resolution of ultrasound imaging. PA waves are generated by illuminating a light-absorbing object with a short laser pulse. The deposited energy causes a pressure change in the object and, consequently, an outwardly propagating acoustic wave. Images are produced by using characteristic optical information contained within the waves. We have developed a 3D PA imaging system that uses a staring, sparse-array approach to produce real-time PA images. The technique employs a limited number of transducers and renders 3D PA images by solving a linear system model. In this thesis, the development of an omnidirectional PA source is introduced as a method to characterize the shift-variant system response. From this foundation, a technique is presented to generate an experimental estimate of the imaging operator for a PA system. This allows further characterization of the object space by two techniques: the crosstalk matrix and singular value decomposition. Finally, by coupling the results of the singular value decomposition analysis with the linear system model approach to image reconstruction, 3D PA images are produced at a frame rate of 0.7 Hz. This approach to 3D PA imaging has provided the foundation for 3D PA images to be produced at frame rates limited only by the laser repetition rate, as straightforward system improvements could see the imaging process reduced to tens of milliseconds
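
    A linear system model of this kind treats imaging as b = A x, where x is the discretised object, A the (experimentally estimated) imaging operator and b the measured PA signals; reconstruction then amounts to a regularised inverse. A minimal numpy sketch of reconstruction via a truncated SVD pseudo-inverse follows, with the operator, dimensions and truncation rank as purely illustrative assumptions.

```python
import numpy as np

def truncated_svd_reconstruction(A, b, rank):
    """Reconstruct x from measurements b = A @ x + noise using a
    truncated SVD pseudo-inverse of the imaging operator A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Keep only the `rank` largest singular values to limit noise
    # amplification from the small ones.
    Uk, sk, Vtk = U[:, :rank], s[:rank], Vt[:rank, :]
    return Vtk.T @ ((Uk.T @ b) / sk)

# Toy usage with a random operator standing in for the measured one.
rng = np.random.default_rng(0)
A = rng.standard_normal((256, 128))     # measurements x voxels (illustrative sizes)
x_true = rng.standard_normal(128)
b = A @ x_true + 0.01 * rng.standard_normal(256)
x_rec = truncated_svd_reconstruction(A, b, rank=100)
print(np.linalg.norm(x_rec - x_true) / np.linalg.norm(x_true))
```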

    Visual Localization of Mobile Robot

    This work aims to examine the current state of the art in visual localization and to find a suitable solution for an indoor mobile robotic platform equipped with a single upward-looking RGB camera with a fisheye lens. The system should be able to perform long-term global localization in changing indoor industrial or office environments. A dataset of localized omnidirectional images was captured and used to evaluate the performance of selected methods. VLAD and NetVLAD descriptors were tested in combination with a tiled panorama representation. A simple localization method based on taking the position of the most similar database image is proposed as the solution
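
    The proposed localization step, taking the pose of the most similar database image, reduces to nearest-neighbour retrieval over global descriptors. A small sketch follows, assuming precomputed VLAD or NetVLAD descriptors and their associated poses; descriptor extraction itself is not shown, and the pose format is illustrative.

```python
import numpy as np

def localize(query_desc, db_descs, db_poses):
    """Return the pose of the database image whose global descriptor is
    most similar (cosine similarity) to the query descriptor."""
    q = query_desc / np.linalg.norm(query_desc)
    D = db_descs / np.linalg.norm(db_descs, axis=1, keepdims=True)
    sims = D @ q                     # cosine similarity to every database image
    best = int(np.argmax(sims))
    return db_poses[best], sims[best]

# Toy usage with random vectors standing in for VLAD/NetVLAD descriptors.
rng = np.random.default_rng(1)
db_descs = rng.standard_normal((500, 4096))
db_poses = rng.uniform(size=(500, 3))        # x, y, heading (illustrative)
query = db_descs[42] + 0.05 * rng.standard_normal(4096)
pose, score = localize(query, db_descs, db_poses)
print(pose, score)
```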

    Scene representation and matching for visual localization in hybrid camera scenarios

    Scene representation and matching are crucial steps in a variety of tasks ranging from 3D reconstruction to virtual/augmented/mixed reality applications to robotics, among others. While approaches exist that tackle these tasks, they mostly overlook the issue of efficiency in the scene representation, which is fundamental in resource-constrained systems and for increasing computing speed. They also normally assume the use of projective cameras, while performance on systems based on other camera geometries remains suboptimal. This dissertation contributes a new, efficient scene representation method that dramatically reduces the number of 3D points. The approach sets up an optimization problem for the automated selection of the most relevant points to retain. This leads to a constrained quadratic program, which is solved optimally with a newly introduced variant of the sequential minimal optimization method. In addition, a new initialization approach is introduced for fast convergence of the method. Extensive experimentation on public benchmark datasets demonstrates that the approach quickly produces a compressed scene representation while delivering accurate pose estimates. The dissertation also contributes new methods for scene matching that go beyond the use of projective cameras. Alternative camera geometries, such as fisheye cameras, produce images with very high distortion, making current image feature point detectors and descriptors less effective, since they are designed for projective cameras. New methods based on deep learning are introduced to address this problem, so that feature detectors and descriptors can overcome distortion effects and more effectively perform feature matching between pairs of fisheye images, as well as between hybrid pairs of fisheye and perspective images. Due to the limited availability of fisheye-perspective image datasets, three datasets were collected for training and testing the methods. The results demonstrate increased detection and matching rates that outperform the current state-of-the-art methods
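
    To make the point-selection idea concrete, one illustrative (and heavily simplified) formulation is to relax binary selection of k out of n points to weights in [0, 1], rewarding a per-point relevance score and penalising pairwise redundancy. The sketch below solves that relaxed constrained quadratic program with cvxpy and a generic solver; it is not the dissertation's formulation or its sequential-minimal-optimization variant, and the relevance and redundancy terms are assumed toy quantities.

```python
import cvxpy as cp
import numpy as np

def select_points(relevance, redundancy, k):
    """Relaxed QP for choosing k of n 3D points: maximise total relevance
    while penalising pairwise redundancy, with binary selection relaxed
    to weights in [0, 1], then rounded by keeping the k largest weights."""
    n = relevance.shape[0]
    w = cp.Variable(n)
    objective = cp.Maximize(relevance @ w - cp.quad_form(w, redundancy))
    constraints = [w >= 0, w <= 1, cp.sum(w) == k]
    cp.Problem(objective, constraints).solve()
    return np.argsort(-w.value)[:k]

# Toy usage with random scores; redundancy is built PSD so the QP is convex.
rng = np.random.default_rng(2)
n = 50
relevance = rng.uniform(size=n)                  # e.g. how often a point is observed
B = rng.standard_normal((n, n)) * 0.05
redundancy = B @ B.T + 1e-3 * np.eye(n)          # symmetric positive semi-definite
print(select_points(relevance, redundancy, k=10))
```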