522 research outputs found

    Challenges in 3D scanning: Focusing on Ears and Multiple View Stereopsis

    Semantic Mapping of Road Scenes

    The problem of understanding road scenes has been at the forefront of the computer vision community for the last few years. Such understanding enables autonomous systems to navigate and interpret the surroundings in which they operate. It involves reconstructing the scene and estimating the objects present in it, such as ‘vehicles’, ‘road’, ‘pavements’ and ‘buildings’. This thesis focuses on these aspects and proposes solutions to address them. First, we propose a solution to generate a dense semantic map from multiple street-level images. This map can be imagined as a bird’s-eye view of the region, with associated semantic labels, for tens of kilometres of street-level data. We generate the overhead semantic view from street-level images. This is in contrast to existing approaches that use satellite or overhead imagery to classify urban regions, and it allows us to produce a detailed semantic map for a large-scale urban area. We then describe a method for large-scale dense 3D reconstruction of road scenes with associated semantic labels. Our method fuses depth maps, generated from stereo pairs across time, into a global 3D volume in an online fashion, in order to accommodate arbitrarily long image sequences. The object class labels estimated from the street-level stereo image sequence are used to annotate the reconstructed volume. We then exploit the scene structure in object class labelling by performing inference over a meshed representation of the scene. Labelling over the mesh solves two issues. First, images often contain redundant information, with multiple images describing the same scene; solving these images separately is slow, whereas our method is approximately an order of magnitude faster in the inference stage than standard inference in the image domain. Second, multiple images of the same scene often yield inconsistent labellings; by solving a single mesh, we remove this inconsistency across images. Our mesh-based labelling also takes into account the object layout in the scene, which is often ambiguous in the image domain, thereby increasing the accuracy of object labelling. Finally, we perform labelling and structure computation through a hierarchical robust P^N Markov Random Field defined on voxels and super-voxels given by an octree. This allows us to infer the 3D structure and the object-class labels in a principled manner, through bounded approximate minimisation of a well-defined and well-studied energy functional. In this thesis, we also introduce two object-labelled datasets created from real-world data. The 15-kilometre Yotta Labelled dataset consists of 8,000 images per camera view of roadways in the United Kingdom, with a subset annotated with object class labels; the second dataset comprises ground-truth object labels for the publicly available KITTI dataset. Both datasets are publicly available, and we hope they will be helpful to the vision research community.
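
    As a rough sketch of the online fusion step described above, the snippet below integrates per-frame depth maps into a global truncated signed-distance voxel volume with a running weighted average, so memory stays bounded for arbitrarily long sequences. The grid resolution, truncation band, pinhole camera model and all function names are assumptions made for this illustration, not the thesis implementation.

```python
# Minimal sketch of online depth-map fusion into a global voxel volume.
# Illustrative only: grid size, truncation band and camera model are assumed.
import numpy as np

VOXEL_SIZE = 0.1                       # metres per voxel (assumed)
TRUNCATION = 0.3                       # signed-distance truncation band (assumed)
GRID_SHAPE = (120, 120, 60)            # world-aligned voxel grid (assumed)

tsdf = np.zeros(GRID_SHAPE, dtype=np.float32)     # running signed-distance average
weights = np.zeros(GRID_SHAPE, dtype=np.float32)  # per-voxel observation counts

def fuse_depth_map(depth, K, cam_to_world):
    """Integrate one depth map (H x W, in metres) into the global volume."""
    # Voxel centres in world coordinates.
    ii, jj, kk = np.indices(GRID_SHAPE)
    pts_w = np.stack([ii, jj, kk], -1).reshape(-1, 3).astype(np.float32) * VOXEL_SIZE
    # Transform voxel centres into the camera frame and project (pinhole model).
    world_to_cam = np.linalg.inv(cam_to_world)
    pts_c = (world_to_cam[:3, :3] @ pts_w.T + world_to_cam[:3, 3:4]).T
    z = pts_c[:, 2]
    z_safe = np.where(z > 1e-6, z, 1.0)            # avoid division by zero
    uv = (K @ pts_c.T).T
    u = np.round(uv[:, 0] / z_safe).astype(int)
    v = np.round(uv[:, 1] / z_safe).astype(int)
    h, w = depth.shape
    visible = (z > 1e-6) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    # Truncated signed distance along the ray: observed depth minus voxel depth.
    sdf = np.zeros(z.shape, dtype=np.float32)
    sdf[visible] = depth[v[visible], u[visible]] - z[visible]
    keep = visible & (sdf > -TRUNCATION)           # drop voxels far behind the surface
    sdf = np.clip(sdf, -TRUNCATION, TRUNCATION)
    # Running weighted average keeps memory constant over time.
    idx = np.flatnonzero(keep)
    t, wgt = tsdf.reshape(-1), weights.reshape(-1)
    t[idx] = (t[idx] * wgt[idx] + sdf[idx]) / (wgt[idx] + 1.0)
    wgt[idx] += 1.0
```

    A labelled mesh of the kind the thesis performs inference over could then be extracted from the fused volume, for example with marching cubes.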

    Object-Aware Tracking and Mapping

    Reasoning about the geometric properties of digital cameras and optical physics has enabled researchers to build methods that localise cameras in 3D space from a video stream, while – often simultaneously – constructing a model of the environment. Related techniques have evolved substantially since the 1980s, leading to increasingly accurate estimates. Traditionally, however, the quality of results is strongly affected by the presence of moving objects, incomplete data, or difficult surfaces – i.e. surfaces that are not Lambertian or that lack texture. One insight of this work is that these problems can be addressed by going beyond geometric and optical constraints, in favour of object-level and semantic constraints. Incorporating specific types of prior knowledge in the inference process, such as motion or shape priors, leads to approaches with distinct advantages and disadvantages. After introducing relevant concepts in Chapter 1 and Chapter 2, methods for building object-centric maps in dynamic environments using motion priors are investigated in Chapter 5. Chapter 6 addresses the same problem as Chapter 5, but presents an approach which relies on semantic priors rather than motion cues. To fully exploit semantic information, Chapter 7 discusses the conditioning of shape representations on prior knowledge and the practical application to monocular, object-aware reconstruction systems.
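
    As a rough illustration of what conditioning a shape representation on prior knowledge can look like, the sketch below fits a low-dimensional linear (PCA-style) shape code to a partial observation and uses the code to complete the unobserved part of the shape. This generic stand-in is chosen only for brevity; the basis construction, the least-squares fit and all names are assumptions, not the learned shape representation used in the thesis.

```python
# Minimal sketch of a low-dimensional shape prior: a mean shape plus a few
# principal modes, fitted to a partial observation. Illustrative assumptions only.
import numpy as np

def build_shape_prior(training_sdfs, n_components=8):
    """training_sdfs: (N, D) array of flattened signed-distance grids."""
    mean = training_sdfs.mean(axis=0)
    centred = training_sdfs - mean
    # SVD of the centred training set gives the principal shape modes.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    basis = vt[:n_components]              # (K, D)
    return mean, basis

def fit_code(observed_sdf, mask, mean, basis):
    """Recover a latent shape code from a partial observation.

    observed_sdf: (D,) vector with valid entries where mask is True.
    """
    A = basis[:, mask].T                   # observed rows of the basis, (M, K)
    b = observed_sdf[mask] - mean[mask]
    code, *_ = np.linalg.lstsq(A, b, rcond=None)
    return code

def decode(code, mean, basis):
    """Complete shape implied by the code, including unobserved regions."""
    return mean + code @ basis
```

    A learned, non-linear decoder could play the role of the linear basis here; the linear version is simply the easiest instance of the idea to write down.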

    Voxel-Based Indoor Reconstruction From HoloLens Triangle Meshes

    Current mobile augmented reality devices are often equipped with range sensors. The Microsoft HoloLens, for instance, carries a Time-of-Flight (ToF) range camera providing coarse triangle meshes that can be used in custom applications. We suggest using these triangle meshes for the automatic generation of indoor models that can serve as a basis for augmenting their physical counterpart with location-dependent information. In this paper, we present a novel voxel-based approach for automated indoor reconstruction from unstructured three-dimensional geometries such as triangle meshes. After an initial voxelization of the input data, rooms are detected in the resulting voxel grid by segmenting connected voxel components of ceiling candidates and extruding them downwards to find floor candidates. Semantic class labels such as 'Wall', 'Wall Opening', 'Interior Object' and 'Empty Interior' are then assigned to the room voxels in between ceiling and floor by a rule-based voxel sweep algorithm. Finally, the geometry of the detected walls and their openings is refined in the voxel representation. The proposed approach is not restricted to Manhattan World scenarios and does not rely on room surfaces being planar. Comment: 8 pages, 4 figures.
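
    The rule-based labelling between ceiling and floor candidates can be pictured with a small sketch: each (x, y) column of the voxel grid is swept from its topmost occupied voxel (ceiling candidate) down to its lowest (floor candidate), and the voxels in between are marked as free interior space or interior objects. The grid layout, label set and rules below are illustrative assumptions, not the authors' exact algorithm.

```python
# Minimal sketch of a column-wise voxel sweep between ceiling and floor candidates.
# Labels and rules are illustrative assumptions, not the paper's exact algorithm.
import numpy as np

OCCUPIED = 1
CEILING, FLOOR, EMPTY_INTERIOR, INTERIOR_OBJECT = 2, 3, 4, 5

def sweep_labels(occupancy):
    """occupancy: (X, Y, Z) array of 0/1 voxels, with the Z axis pointing up."""
    labels = np.zeros_like(occupancy, dtype=np.uint8)
    X, Y, _ = occupancy.shape
    for x in range(X):
        for y in range(Y):
            column = occupancy[x, y]
            occupied = np.flatnonzero(column == OCCUPIED)
            if occupied.size < 2:
                continue                      # no ceiling/floor pair in this column
            ceiling, floor = occupied.max(), occupied.min()
            labels[x, y, ceiling] = CEILING
            labels[x, y, floor] = FLOOR
            # Voxels between floor and ceiling are either free interior space
            # or belong to interior objects such as furniture.
            interior = column[floor + 1:ceiling]
            labels[x, y, floor + 1:ceiling] = np.where(
                interior == OCCUPIED, INTERIOR_OBJECT, EMPTY_INTERIOR)
    return labels
```

    In the full pipeline described in the abstract, room detection works on connected components of ceiling candidates, and further rules on the room boundaries handle the 'Wall' and 'Wall Opening' classes.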

    Statistical Modelling and Inference in Image Analysis

    The aim of this thesis is to investigate classes of model-based approaches to statistical image analysis. We explored the properties of the models and examined the problem of parameter estimation from the original image data and, in particular, from noisy versions of the scene. We concentrated on Markov random field (MRF) models, Markov mesh random field (MMRF) models and multi-dimensional Markov chain (MDMC) models. In Chapter 2, for the one-dimensional version of Markov random fields, we developed a recursive technique which enables us to achieve maximum likelihood estimation of the underlying parameter and to carry out the EM algorithm for parameter estimation when only noisy data are available. This technique also enables us, in just a single pass, to generate a sample from a one-dimensional Markov random field. Although, unfortunately, the technique cannot be extended to two- or higher-dimensional models, it was applied to many cases in this thesis. Since, for two-dimensional Markov random fields, the density of each row (column), conditionally on all other rows (columns), takes the form of a one-dimensional Markov random field, and since the distribution of the original image, conditionally on the noisy version of the data, is still a Markov random field, the technique can be used on different forms of the conditional density of one row (column). In Chapter 3, therefore, we developed the line-relaxation method for simulating MRFs and maximum line pseudo-likelihood estimation of the parameter(s), and in Chapter 5 we developed a simultaneous procedure for parameter estimation and restoration, in which line pseudo-likelihood and a modified EM algorithm were used. The first part of Chapter 3, together with Chapter 4, concentrates on inference for two-dimensional MRFs. We obtained a matrix expression for the partition function of general models, and a more explicit form for a multi-colour Ising model, and thus located the positions of the critical points of this multi-colour model. We also examined the asymptotic properties of an asymmetric, two-colour Ising model. For general models, in Chapter 4, we explored asymptotic properties under an "independence" or "near-independence" condition, and then developed the approach of maximum approximate-likelihood estimation. For three-dimensional MMRF models, in Chapter 6, a generalization of Devijver's F-G-H algorithm is developed for restoration. In Chapter 7, the recursive technique was again used to introduce MDMC models, which form a natural extension of a Markov chain. By suitable choice of model parameters, textures can be generated that are similar to those simulated from MRFs, but the simulation procedure is computationally much more economical. The recursive technique also enables us to maximize the likelihood function of the model. For all three sorts of prior random field models considered in this thesis, we developed a simultaneous procedure for parameter estimation and image restoration when only noisy data are available. The currently restored image was used, together with the noisy data, in modified versions of the EM algorithm. In simulation studies, quite good results were obtained, in terms of estimation of parameters in both the original model and, particularly, in the noise model, and in terms of restoration.
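
    For the one-dimensional case, the recursive and matrix ideas mentioned above can be written down explicitly. The display below shows the standard transfer-matrix form for a K-colour Ising-type chain with interaction parameter β, purely as an illustration of why the partition function, and hence maximum likelihood estimation, becomes tractable by recursion; the notation is not the thesis's own.

```latex
% Illustrative K-colour Ising chain; [.] is the Iverson bracket (indicator),
% and the bold 1 in the third line is the all-ones vector of length K.
\begin{align}
  p(x_1,\dots,x_n \mid \beta)
      &= \frac{1}{Z(\beta)}
         \exp\!\Big(\beta \sum_{i=1}^{n-1} [\,x_i = x_{i+1}\,]\Big),
      \qquad x_i \in \{1,\dots,K\}, \\
  T_{jk} &= \exp\!\big(\beta\,[\,j = k\,]\big), \qquad j,k = 1,\dots,K, \\
  Z(\beta) &= \mathbf{1}^{\top} T^{\,n-1}\,\mathbf{1}, \\
  \hat{\beta} &= \arg\max_{\beta}\,
      \Big(\beta \sum_{i=1}^{n-1} [\,x_i = x_{i+1}\,] - \log Z(\beta)\Big).
\end{align}
```

    Because T is only K × K, both Z(β) and its derivative can be evaluated exactly, which is what makes single-pass sampling and maximum likelihood estimation feasible in one dimension, and why the same machinery can be reused row by row (or column by column) in the two-dimensional conditional densities mentioned above.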

    CONSISTENT MULTI-VIEW TEXTURING OF DETAILED 3D SURFACE MODELS

    A review on deep learning techniques for 3D sensed data classification

    Over the past decade, deep learning has driven progress in 2D image understanding. Despite these advances, techniques for automatically understanding 3D sensed data, such as point clouds, are comparatively immature. However, with a range of important applications, from indoor robotics navigation to national-scale remote sensing, there is high demand for algorithms that can learn to automatically understand and classify 3D sensed data. In this paper we review the current state-of-the-art deep learning architectures for processing unstructured Euclidean data. We begin by covering the background concepts and traditional methodologies. We then review the main current approaches, including RGB-D, multi-view, volumetric and fully end-to-end architecture designs. Datasets for each category are documented and explained. Finally, we give a detailed discussion of the future of deep learning for 3D sensed data, using the literature to justify the areas where future research would be most valuable. Comment: 25 pages, 9 figures. Review paper.
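
    As a concrete instance of the volumetric family of architectures covered by the review, the sketch below is a small VoxNet-style 3D convolutional network that classifies occupancy grids. The layer sizes, the 32^3 input resolution and the use of PyTorch are assumptions made for this example; the review itself surveys a far broader range of designs.

```python
# Minimal sketch of a volumetric 3D CNN classifier for occupancy grids.
# Layer sizes and input resolution are illustrative assumptions.
import torch
import torch.nn as nn

class TinyVoxelNet(nn.Module):
    def __init__(self, n_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(1, 32, kernel_size=5, stride=2), nn.ReLU(),   # 32^3 -> 14^3
            nn.Conv3d(32, 32, kernel_size=3), nn.ReLU(),            # 14^3 -> 12^3
            nn.MaxPool3d(2),                                        # 12^3 -> 6^3
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 6 * 6 * 6, 128), nn.ReLU(),
            nn.Linear(128, n_classes),
        )

    def forward(self, voxels):          # voxels: (B, 1, 32, 32, 32) occupancy
        return self.classifier(self.features(voxels))

# Example forward pass on a random occupancy grid.
logits = TinyVoxelNet()(torch.rand(2, 1, 32, 32, 32))
```

    The other families surveyed differ mainly in their input representation: RGB-D methods add depth as an extra image channel, multi-view methods aggregate several rendered views, and fully end-to-end designs consume raw point sets directly.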