Challenges in 3D scanning: Focusing on Ears and Multiple View Stereopsis

Author: Jensen Rasmus Ramsbøl
Publication venue: Technical University of Denmark
Publication date: 01/01/2013

Semantic Mapping of Road Scenes

Author: Sengupta S
Publication venue: Oxford Brookes University
Publication date: 01/01/2014

The problem of understanding road scenes has been at the forefront of the computer vision community for the last couple of years. Such understanding enables autonomous systems to navigate and understand the surroundings in which they operate. It involves reconstructing the scene and estimating the objects present in it, such as ‘vehicles’, ‘road’, ‘pavements’ and ‘buildings’. This thesis focusses on these aspects and proposes solutions to address them. First, we propose a solution to generate a dense semantic map from multiple street-level images. This map can be imagined as the bird’s eye view of the region with associated semantic labels for tens of kilometres of street level data. We generate the overhead semantic view from street level images. This is in contrast to existing approaches using satellite/overhead imagery for classification of urban regions, allowing us to produce a detailed semantic map for a large scale urban area. Then we describe a method to perform large scale dense 3D reconstruction of road scenes with associated semantic labels. Our method fuses the depth-maps, generated from the stereo pairs across time, into a global 3D volume in an online fashion, in order to accommodate arbitrarily long image sequences. The object class labels estimated from the street level stereo image sequence are used to annotate the reconstructed volume. Then we exploit the scene structure in object class labelling by performing inference over the meshed representation of the scene. By performing labelling over the mesh we solve two issues. Firstly, images often have redundant information, with multiple images describing the same scene. Solving these images separately is slow, whereas our method is approximately an order of magnitude faster in the inference stage compared to normal inference in the image domain. Secondly, multiple images, even though they describe the same scene, often result in inconsistent labelling. By solving a single mesh, we remove the inconsistency of labelling across the images. Our mesh based labelling also takes into account the object layout in the scene, which is often ambiguous in the image domain, thereby increasing the accuracy of object labelling. Finally, we perform labelling and structure computation through a hierarchical robust P^N Markov Random Field defined on voxels and super-voxels given by an octree. This allows us to infer the 3D structure and the object-class labels in a principled manner, through bounded approximate minimisation of a well defined and studied energy functional. In this thesis, we also introduce two object labelled datasets created from real world data. The 15 kilometre Yotta Labelled dataset consists of 8,000 images per camera view of the roadways of the United Kingdom, with a subset of them annotated with object class labels, and the second dataset comprises ground truth object labels for the publicly available KITTI dataset. Both datasets are publicly available and, we hope, will be helpful to the vision research community.

Oxford Brookes University: RADAR

Object-Aware Tracking and Mapping

Author: Runz Martin
Publication venue: UCL (University College London)
Publication date: 28/03/2022

Reasoning about geometric properties of digital cameras and optical physics enabled researchers to build methods that localise cameras in 3D space from a video stream, while – often simultaneously – constructing a model of the environment. Related techniques have evolved substantially since the 1980s, leading to increasingly accurate estimations. Traditionally, however, the quality of results is strongly affected by the presence of moving objects, incomplete data, or difficult surfaces – i.e. surfaces that are not Lambertian or lack texture. One insight of this work is that these problems can be addressed by going beyond geometrical and optical constraints, in favour of object level and semantic constraints. Incorporating specific types of prior knowledge in the inference process, such as motion or shape priors, leads to approaches with distinct advantages and disadvantages. After introducing relevant concepts in Chapter 1 and Chapter 2, methods for building object-centric maps in dynamic environments using motion priors are investigated in Chapter 5. Chapter 6 addresses the same problem as Chapter 5, but presents an approach which relies on semantic priors rather than motion cues. To fully exploit semantic information, Chapter 7 discusses the conditioning of shape representations on prior knowledge and the practical application to monocular, object-aware reconstruction systems.

UCL Discovery

MODELLING APPEARANCE AND GEOMETRY FROM IMAGES

Author: Melendez Francisco
Publication date: 01/08/2011

The University of Manchester - Institutional Repository

Voxel-Based Indoor Reconstruction From HoloLens Triangle Meshes

Author: Hübner P., Weinmann M., Wursthorn S.
Publication date: 01/01/2020

Current mobile augmented reality devices are often equipped with range sensors. The Microsoft HoloLens, for instance, is equipped with a Time-Of-Flight (ToF) range camera providing coarse triangle meshes that can be used in custom applications. We suggest using the triangle meshes for the automatic generation of indoor models that can serve as a basis for augmenting their physical counterpart with location-dependent information. In this paper, we present a novel voxel-based approach for automated indoor reconstruction from unstructured three-dimensional geometries like triangle meshes. After an initial voxelization of the input data, rooms are detected in the resulting voxel grid by segmenting connected voxel components of ceiling candidates and extruding them downwards to find floor candidates. Semantic class labels like 'Wall', 'Wall Opening', 'Interior Object' and 'Empty Interior' are then assigned to the room voxels in-between ceiling and floor by a rule-based voxel sweep algorithm. Finally, the geometry of the detected walls and their openings is refined in voxel representation. The proposed approach is not restricted to Manhattan World scenarios and does not rely on room surfaces being planar. Comment: 8 pages, 4 figures
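
The voxelization and downward sweep described in this abstract can be illustrated with a small sketch. It is not the authors' algorithm: it skips the connected-component segmentation of ceiling candidates and the rule-based labelling, and the function names `voxelize` and `ceiling_and_floor` are hypothetical. It only shows the idea of taking the topmost occupied voxel of each column as a ceiling candidate and sweeping down through empty space to a floor candidate.

```python
import math

def voxelize(points, voxel_size):
    """Return the set of integer (i, j, k) voxel indices hit by the 3D points."""
    return {tuple(math.floor(c / voxel_size) for c in p) for p in points}

def ceiling_and_floor(voxels):
    """For each (i, j) column return (ceiling_k, floor_k).

    ceiling_k is the topmost occupied voxel in the column; floor_k is the
    first occupied voxel found below an empty gap, or None if there is none.
    """
    columns = {}
    for i, j, k in voxels:
        columns.setdefault((i, j), set()).add(k)

    result = {}
    for col, ks in columns.items():
        top, bottom = max(ks), min(ks)
        k = top
        while k in ks:                       # pass through the ceiling slab
            k -= 1
        while k >= bottom and k not in ks:   # sweep down through empty interior
            k -= 1
        result[col] = (top, k if k >= bottom else None)
    return result

# Toy room: floor at z = 0 and ceiling at z = 2, sampled with 0.5 m voxels.
points = [(x, y, z) for x in (0.0, 0.5) for y in (0.0, 0.5) for z in (0.0, 2.0)]
points.append((1.0, 0.0, 2.0))               # a column with a ceiling but no floor
rooms = ceiling_and_floor(voxelize(points, 0.5))
```

Working on a set of occupied indices rather than a dense array keeps the sketch dependency-free; a dense grid would be the more natural choice for real meshes.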

arXiv.org e-Print Archive

TUbiblio

KITopen

Statistical Modelling and Inference in Image Analysis

Author: Qian Wei
Publication venue: ProQuest Dissertations & Theses
Publication date: 01/01/1990

The aim of the thesis is to investigate classes of model-based approaches to statistical image analysis. We explored the properties of models and examined the problem of parameter estimation from the original image data and, in particular, from noisy versions of the scene. We concentrated on Markov random field (MRF) models, Markov mesh random field (MMRF) models and Multi-dimensional Markov chain (MDMC) models. In Chapter 2, for the one-dimensional version of Markov random fields, we developed a recursive technique which enables us to achieve maximum likelihood estimation for the underlying parameter and to carry out the EM algorithm for parameter estimation when only noisy data are available. This technique also enables us, in just a single pass, to generate a sample from a one-dimensional Markov random field. Although, unfortunately, this technique cannot be extended to two- or multi-dimensional models, it was applied to many cases in this thesis. Since, for two-dimensional Markov random fields, the density of each row (column), conditionally on all other rows (columns), is of the form of a one-dimensional Markov random field, and since the distribution of the original image, conditionally on the noisy version of the data, is still a Markov random field, the technique can be used on different forms of conditional density of one row (column). In Chapter 3, therefore, we developed the line-relaxation method for simulating MRFs and maximum line pseudo-likelihood estimation of parameter(s), and in Chapter 5, we developed a simultaneous procedure of parameter estimation and restoration, in which line pseudo-likelihood and a modified EM algorithm were used. The first part of Chapter 3 and Chapter 4 concentrate on inference for two-dimensional MRFs. We obtained a matrix expression for partition functions for general models, and a more explicit form for a multi-colour Ising model, and thus located the positions of critical points of this multi-colour model. We examined the asymptotic properties of an asymmetric, two-colour Ising model. For general models, in Chapter 4, we explored asymptotic properties under an "independence" or a "near independence" condition, and then developed the approach of maximum approximate-likelihood estimation. For three-dimensional MMRF models, in Chapter 6, a generalization of Devijver's F-G-H algorithm is developed for restoration. In Chapter 7, the recursive technique was again used to introduce MDMC models, which form a natural extension of a Markov chain. By suitable choice of model parameters, textures can be generated that are similar to those simulated from MRFs, but the simulation procedure is computationally much more economical. The recursive technique also enables us to maximize the likelihood function of the model. For all three sorts of prior random field models considered in this thesis, we developed a simultaneous procedure for parameter estimation and image restoration, when only noisy data are available. The currently restored image was used, together with noisy data, in modified versions of the EM algorithm. In simulation studies, quite good results were obtained, in terms of estimation of parameters in both the original model and, particularly, in the noise model, and in terms of restoration.
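
To make the idea of recursive computation on a one-dimensional Markov random field concrete, here is a minimal sketch for a two-colour Ising chain. It uses the standard backward-recursion and forward-sampling construction, which may differ from the recursion developed in the thesis; the parameter names `beta` (coupling) and `h` (external field) and all function names are assumptions made for illustration. The same recursion yields the partition function, which is what makes likelihood evaluation tractable in one dimension.

```python
import math
import random

STATES = (-1, 1)

def backward_messages(n, beta, h):
    """m[i][s] sums exp(energy of sites i+1..n-1 and their bonds) given x_i = s."""
    m = [{s: 1.0 for s in STATES} for _ in range(n)]
    for i in range(n - 2, -1, -1):
        for s in STATES:
            m[i][s] = sum(
                math.exp(beta * s * t + h * t) * m[i + 1][t] for t in STATES
            )
    return m

def partition_function(n, beta, h):
    """Normalising constant of p(x) proportional to exp(h*sum x_i + beta*sum x_i x_{i+1})."""
    m = backward_messages(n, beta, h)
    return sum(math.exp(h * s) * m[0][s] for s in STATES)

def sample_chain(n, beta, h, rng):
    """Draw one exact sample by a single left-to-right pass over the chain."""
    m = backward_messages(n, beta, h)
    x = []
    for i in range(n):
        weights = []
        for t in STATES:
            w = math.exp(h * t) * m[i][t]
            if i > 0:
                w *= math.exp(beta * x[-1] * t)   # bond to the previous site
            weights.append(w)
        x.append(rng.choices(STATES, weights=weights)[0])
    return x

sample = sample_chain(20, beta=0.5, h=0.0, rng=random.Random(0))
```

For long chains the messages grow exponentially, so a practical version would normalise them or work in log space.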

Glasgow Theses Service

CONSISTENT MULTI-VIEW TEXTURING OF DETAILED 3D SURFACE MODELS

Publication venue: Copernicus GmbH

Crossref

A review on deep learning techniques for 3D sensed data classification

Author: Boehm Jan, Griffiths David
Publication venue: MDPI AG
Publication date: 02/06/2019

Over the past decade deep learning has driven progress in 2D image understanding. Despite these advancements, techniques for automatically understanding 3D sensed data, such as point clouds, are comparatively immature. However, with a range of important applications from indoor robotics navigation to national scale remote sensing, there is a high demand for algorithms that can learn to automatically understand and classify 3D sensed data. In this paper we review the current state-of-the-art deep learning architectures for processing unstructured Euclidean data. We begin by addressing the background concepts and traditional methodologies. We review the current main approaches, including RGB-D, multi-view, volumetric and fully end-to-end architecture designs. Datasets for each category are documented and explained. Finally, we give a detailed discussion about the future of deep learning for 3D sensed data, using literature to justify the areas where future research would be most valuable. Comment: 25 pages, 9 figures. Review paper

arXiv.org e-Print Archive

UCL Discovery

Image based human body rendering via regression & MRF energy minimization

Author: Li Xinfeng
Publication date: 01/01/2011

This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. A machine learning method for synthesising human images is explored to create new images without relying on 3D modelling. Machine learning allows the creation of new images through prediction from existing data based on the use of training images. In the present study, image synthesis is performed at two levels: contour and pixel. A class of learning-based methods is formulated to create object contours from the training images for the synthetic image, which allow pixel synthesis within the contours at the second level. The methods rely on applying robust object descriptions, dynamic learning models after appropriate motion segmentation, and machine learning-based frameworks. Image-based human image synthesis using machine learning is a research focus that has recently gained considerable attention in the field of computer graphics. It makes use of techniques from image/motion analysis in computer vision. The problem lies in the estimation of methods for image-based object configuration (i.e. segmentation, contour outline). Using the results of these analysis methods as bases, the research adopts the machine learning approach, in which human images are synthesised by executing the synthesis of contour and pixels through learning from training images. Firstly, the thesis shows how an accurate silhouette is distilled using developed background subtraction for accuracy and efficiency. The traditional vector machine approach is used to avoid ambiguities within the regression process. Images can be represented as a class of accurate and efficient vectors for single images as well as sequences. Secondly, the framework is explored using a unique view of machine learning methods, i.e., support vector regression (SVR), to obtain the convergence result of vectors for contour allocation. The changing relationship between the synthetic image and the training image is expressed as a vector and represented in functions. Finally, a pixel synthesis is performed based on belief propagation. This thesis proposes a novel image-based rendering method for colour image synthesis using SVR and belief propagation for generalisation to enable the prediction of contour and colour information from input colour images. The methods rely on using appropriately defined and robust input colour images, optimising the input contour images within a sparse SVR framework. Firstly, the thesis shows how contour can effectively and efficiently be predicted from small numbers of input contour images. In addition, the thesis exploits the sparse properties of SVR for efficiency, and makes use of SVR to estimate the regression function. The image-based rendering method employed in this study enables contour synthesis for the prediction of small numbers of input source images. This procedure avoids the use of complex models and geometry information. Secondly, the method used for human body contour colouring is extended to define eight differently connected pixels, and construct a link distance field via the belief propagation method. The link distance, which acts as the message in propagation, is transformed by improving the low-envelope method in fast distance transform. Finally, the methodology is tested by considering human facial and human body clothing information. The accuracy of the test results for the human body model confirms the efficiency of the proposed method.
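
The abstract refers to the lower-envelope method used in fast distance transforms. As background, the sketch below shows the standard one-dimensional lower-envelope distance transform for a sampled function with squared Euclidean cost, which computes d[q] = min over p of (q - p)^2 + f[p] in linear time. It is the textbook baseline, not the improved variant the thesis proposes, and the function name is an assumption. Inputs are assumed finite (use a large number rather than infinity for "no source").

```python
def distance_transform_1d(f):
    """Return d with d[q] = min_p ((q - p)**2 + f[p]) via the lower envelope of parabolas."""
    n = len(f)
    inf = float("inf")
    v = [0] * n              # grid positions of parabolas forming the envelope
    z = [0.0] * (n + 1)      # boundaries between consecutive envelope parabolas
    k = 0
    z[0], z[1] = -inf, inf
    for q in range(1, n):
        while True:
            p = v[k]
            # Abscissa where the parabolas rooted at p and q intersect.
            s = ((f[q] + q * q) - (f[p] + p * p)) / (2 * q - 2 * p)
            if s <= z[k]:
                k -= 1       # parabola p is hidden by q; drop it and retry
            else:
                break
        k += 1
        v[k] = q
        z[k] = s
        z[k + 1] = inf
    d = [0.0] * n
    k = 0
    for q in range(n):
        while z[k + 1] < q:  # advance to the parabola covering position q
            k += 1
        d[q] = (q - v[k]) ** 2 + f[v[k]]
    return d

costs = [0.0, 100.0, 100.0, 100.0, 0.0]
distances = distance_transform_1d(costs)
```

In belief propagation with quadratic pairwise costs, this transform is what lets each message be computed in time linear in the number of labels instead of quadratic.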

Brunel University Research Archive