187 research outputs found

    3D object reconstruction using computer vision: reconstruction and characterization applications for external human anatomical structures

    Doctoral thesis. Informatics Engineering. Faculdade de Engenharia, Universidade do Porto. 201

    Examination of scanner precision by analysing orthodontic parameters

    Background: 3D modelling is becoming an increasingly widespread technique in orthodontic practice. One significant open question concerns the precision of the scanner used for generating the surfaces of a 3D model of the jaw. Materials and methods: A set of identical 3D models was generated on an Atos optical 3D scanner and a Lazak Scan laboratory scanner, whose precision was established by measuring a set of 54 orthodontic parameters across all three orthodontic planes. In this manner their precision in space was explored, since the scanners are used for generating spatial models of jaws. Results: There were significant differences between parameters scanned with Atos and Lazak Scan; the smallest difference was 0.017 mm and the largest 1.109 mm. Conclusion: Based on the precision parameters, both general-purpose scanners (Atos and Lazak Scan) can be used in orthodontics. Early analyses indicate that Atos is the reference scanner in terms of precision.
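    The comparison in this abstract reduces to paired differences between the two scanners' measurements of the same parameters. Below is a minimal sketch of that computation; the parameter values are illustrative stand-ins, not the study's data (only the reported extreme differences are reproduced):

```python
import numpy as np

# Hypothetical paired measurements (mm) of the same orthodontic parameters
# on models from each scanner; values are illustrative only.
atos = np.array([35.412, 27.093, 41.850])
lazak = np.array([35.429, 27.210, 40.741])

diff = np.abs(atos - lazak)                  # per-parameter disagreement
print(f"smallest difference: {diff.min():.3f} mm")  # 0.017 mm
print(f"largest difference:  {diff.max():.3f} mm")  # 1.109 mm
```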

    Three-dimensional reconstruction of real environments using laser and intensity data

    The objective of the work presented in this thesis is to generate complete, high-resolution three-dimensional models of real-world scenes (3D geometric and texture information) from passive intensity images and active range sensors. Most 3D reconstruction systems are based either on range finders or on digital cameras, but little work has tried to combine these two kinds of sensor. Depth extraction from intensity images is complex; on the other hand, digital photographs provide additional information about the scenes that can be used to help the modelling process, in particular to define accurate surface boundaries. This makes active and passive sensors complementary in many ways, and it is the basic idea that motivates the work in this thesis. In the first part of the thesis, we concentrate on the registration between data coming from active range sensors and passive digital cameras, and on the development of tools to make this step easier, more user-independent and more precise. With this technique, a texture map for the models is computed from several digital photographs, leading to 3D models whose geometry is extracted from range data while texture information comes from the photographs. With these models, photo-realistic quality is achieved: a kind of high-resolution 3D photograph of a real scene. In the second part of the thesis, we go further in combining the datasets: the digital photographs are used as an additional source of three-dimensional information that can be valuable for defining accurate surface boundaries (where range data is less reliable), for filling holes in the data, or for increasing 3D point density in areas of interest.
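    The registration and texture-mapping step described above rests on the standard pinhole projection of range points into a calibrated photograph. A minimal sketch under that assumption follows; the function name and interface are ours, not the thesis's code:

```python
import numpy as np

def project_points(points_w, R, t, K):
    """Project 3D points (world frame) into an image with a pinhole model.

    points_w : (N, 3) array of 3D points from the range sensor
    R, t     : rotation (3, 3) and translation (3,) registering world -> camera
    K        : (3, 3) camera intrinsics
    Returns (N, 2) pixel coordinates.
    """
    cam = points_w @ R.T + t          # transform into the camera frame
    uv = cam @ K.T                    # apply intrinsics
    return uv[:, :2] / uv[:, 2:3]     # perspective division

# Once registered, each 3D vertex can look up its colour at the projected
# pixel, which is how a texture map is assembled from several photographs.
```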

    An investigation into common challenges of 3D scene understanding in visual surveillance

    Nowadays, video surveillance systems are ubiquitous. Most installations simply consist of CCTV cameras connected to a central control room and rely on human operators to interpret what they see on screen in order to, for example, detect a crime (either during or after an event). Some modern computer vision systems aim to automate the process, at least to some degree, and various algorithms have been somewhat successful in certain limited areas. However, such systems remain inefficient in general circumstances and present real challenges yet to be solved. These challenges include the ability to recognise, and ultimately predict and prevent, abnormal behaviour, or even to reliably recognise objects, for example in order to detect left luggage or suspicious objects. This thesis first studies the state of the art and identifies the major challenges and possible requirements of future automated and semi-automated CCTV technology in the field. It then presents a suite of 2D and highly novel 3D methodologies that go some way towards overcoming current limitations.

    The methods presented here are based on the analysis of object features directly extracted from the geometry of the scene. They start with a consideration of mainly existing techniques, such as the use of lines, vanishing points (VPs) and planes, applied to real scenes, followed by an investigation into the use of richer 2.5D/3D surface-normal data. In all cases the aim is to combine 2D and 3D data to obtain a better understanding of the scene, ultimately capturing what is happening within it in order to move towards automated scene analysis. Although this thesis focuses on the widespread application of video surveillance, the railway station environment is used as an example case representing typical real-world challenges; the principles extend readily elsewhere, such as to airports, motorways, households and shopping malls. The context of this research, together with an overall presentation of existing methods used in video surveillance and their challenges, is described in chapter 1.

    Common computer vision techniques such as VP detection, camera calibration, 3D reconstruction and segmentation can be applied in an effort to extract meaning in video surveillance applications. These methods are well researched in the literature, and their use is assessed against current surveillance requirements in chapter 2. While existing techniques can perform well in some contexts, such as an architectural environment composed of simple geometrical elements, their robustness and performance in feature extraction and object recognition are not sufficient to solve the key challenges encountered in the general video surveillance context. This is largely due to issues such as variable lighting, weather conditions and shadows, and in general the complexity of the real-world environment. Chapter 3 presents the research and contribution on those topics (methods to extract optimal features for a specific CCTV application), as well as their strengths and weaknesses, highlighting that the proposed algorithm obtains better results than most due to its specific design.

    The comparison of current surveillance systems and methods from the literature shows that 2D data are nonetheless used almost constantly in many applications. Indeed, industry and the research community have been intensively improving 2D feature extraction methods for as long as image analysis and scene understanding have been of interest, and this constant progress makes 2D feature extraction almost effortless nowadays thanks to a large variety of techniques. Moreover, even if 2D data do not solve all challenges in video surveillance or other applications, they still serve as a starting stage towards scene understanding and image analysis. Chapter 4 therefore explores 2D feature extraction via vanishing point detection and segmentation methods. A combination of the most common techniques and a novel approach is proposed to extract vanishing points from video surveillance environments. Segmentation techniques are also explored with the aim of determining how they can complement vanishing point detection and lead towards 3D data extraction and analysis.

    In spite of the contribution above, 2D data is insufficient for all but the simplest applications aimed at understanding a scene, where the aim is robust detection of, say, left luggage or abnormal behaviour without significant a priori information about the scene geometry. More information is therefore required in order to design a more automated and intelligent algorithm that obtains richer information from the scene geometry, and so a better understanding of what is happening within it. This can be achieved with 3D data (in addition to 2D data), allowing objects to be "classified" and, from this, a map of functionality to be inferred, describing feasible and unfeasible object functionality in a given environment. Chapter 5 presents how 3D data can be beneficial for this task and the various solutions investigated to recover 3D data, as well as some preliminary work towards plane extraction.

    It is apparent that VPs and planes give useful information about a scene's perspective and can assist in 3D data recovery within a scene. However, neither VPs nor plane detection alone allow the recovery of more complex generic object shapes, for example those composed of spheres or cylinders, and any simple model will suffer in the presence of non-Manhattan features, e.g. those introduced by an escalator. For this reason, a novel photometric-stereo-based surface normal retrieval methodology is introduced to capture the 3D geometry of the whole scene or part of it. Chapter 6 describes how photometric stereo allows the recovery of 3D information in order to obtain a better understanding of a scene, while partially overcoming some current surveillance challenges, such as the difficulty of resolving fine detail (particularly at large standoff distances) and of isolating and recognising more complex objects in real scenes, where items of interest may be obscured by complex environmental factors subject to rapid change, making, for example, the detection of suspicious objects and behaviour highly problematic. Innovative use is made of an untapped latent capability offered within modern surveillance environments to introduce a form of environmental structuring to good advantage, achieving a richer form of data acquisition. This chapter also explores the novel application of photometric stereo in such diverse applications, how the algorithm can be incorporated into an existing surveillance system, and a typical real commercial application.

    One of the most important aspects of this research is its application. While most of the research literature is based on relatively simple structured environments, the approach here is designed for real surveillance environments, such as railway stations, airports and waiting rooms, where surveillance cameras may be fixed or may in the future form part of a mobile, free-roaming robotic surveillance device that must continually reinterpret its changing environment. So, while the main focus has been railway station environments, the work has been approached in a way that allows adaptation to many other applications, such as autonomous robotics, and to motorway, shopping centre, street and home environments; all of these require a better understanding of the scene for security or safety purposes. Finally, chapter 7 presents a global conclusion and future work.
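    Chapter 6's surface-normal retrieval builds on classical Lambertian photometric stereo: stack the images, solve a per-pixel least-squares system against the known light directions, and normalise. A minimal sketch follows, assuming known distant light sources (the "environmental structuring" mentioned above); the names are our own, not the thesis's code:

```python
import numpy as np

def photometric_stereo(images, lights):
    """Recover per-pixel surface normals from images under known lights.

    images : (k, h, w) stack of grayscale images, one per light source
    lights : (k, 3) unit light-direction vectors
    Returns (h, w, 3) unit surface normals.

    Lambertian model: per pixel, i = L @ (albedo * n); solve for
    g = albedo * n by least squares, then normalise.
    """
    k, h, w = images.shape
    I = images.reshape(k, -1)                       # (k, h*w) intensities
    g, *_ = np.linalg.lstsq(lights, I, rcond=None)  # (3, h*w) scaled normals
    albedo = np.linalg.norm(g, axis=0)
    n = g / np.maximum(albedo, 1e-8)                # guard zero-albedo pixels
    return n.T.reshape(h, w, 3)
```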

    Parallelized computational 3D video microscopy of freely moving organisms at multiple gigapixels per second

    To study the behavior of freely moving model organisms such as zebrafish (Danio rerio) and fruit flies (Drosophila) across multiple spatial scales, it would be ideal to use a light microscope that can resolve 3D information over a wide field of view (FOV) at high speed and high spatial resolution. However, it is challenging to design an optical instrument that achieves all of these properties simultaneously. Existing techniques for large-FOV microscopic imaging and for 3D image measurement typically require many sequential image snapshots, thus compromising speed and throughput. Here, we present 3D-RAPID, a computational microscope based on a synchronized array of 54 cameras that can capture high-speed 3D topographic videos over a 135 cm^2 area, achieving up to 230 frames per second at throughputs exceeding 5 gigapixels (GPs) per second. 3D-RAPID features a 3D reconstruction algorithm that, for each synchronized temporal snapshot, simultaneously fuses all 54 images seamlessly into a globally consistent composite that includes a coregistered 3D height map. The self-supervised 3D reconstruction algorithm itself trains a spatiotemporally compressed convolutional neural network (CNN) that maps raw photometric images to 3D topography, using stereo overlap redundancy and ray-propagation physics as the only supervision mechanism. As a result, our end-to-end 3D reconstruction algorithm is robust to generalization errors and scales to arbitrarily long videos from arbitrarily sized camera arrays. The scalable hardware and software design of 3D-RAPID addresses a longstanding problem in the field of behavioral imaging, enabling parallelized 3D observation of large collections of freely moving organisms at high spatiotemporal throughputs, which we demonstrate in ants (Pogonomyrmex barbatus), fruit flies, and zebrafish larvae.
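    The paper's exact training loss is not given here; the sketch below only illustrates the general idea of stereo-overlap self-supervision, in which a height-map estimate implies a warp between overlapping cameras and photometric disagreement in the overlap serves as the error signal. `warp_b_to_a` is a hypothetical placeholder for the ray-propagation-based warp:

```python
import numpy as np

def overlap_consistency_loss(img_a, img_b, warp_b_to_a):
    """Generic photometric-consistency signal for stereo-overlap training.

    img_a, img_b : overlapping grayscale views, arrays of shape (h, w)
    warp_b_to_a  : hypothetical function resampling img_b into img_a's
                   frame using the current height-map estimate, returning
                   NaN where the views do not overlap
    """
    pred = warp_b_to_a(img_b)
    mask = ~np.isnan(pred)                 # score the valid overlap only
    return float(np.mean((pred[mask] - img_a[mask]) ** 2))
```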

    Compact Environment Modelling from Unconstrained Camera Platforms

    Mobile robotic systems need to perceive their surroundings in order to act independently. In this work, a perception framework is developed that interprets the data of a binocular camera in order to transform it into a compact, expressive model of the environment. This model enables a mobile system to move in a targeted way and interact with its surroundings. It is also shown how the developed methods provide a solid basis for technical assistive aids for visually impaired people.
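    A binocular camera yields depth through the standard rectified-stereo triangulation Z = fB/d, the raw measurement such a perception framework condenses into a compact model. A minimal sketch of that conversion (not the thesis's implementation):

```python
import numpy as np

def disparity_to_depth(disparity, focal_px, baseline_m):
    """Triangulate depth from a rectified binocular pair.

    For a rectified stereo rig, Z = f * B / d, where f is the focal
    length in pixels, B the baseline in metres and d the disparity in
    pixels; pixels with no disparity are mapped to infinity.
    """
    d = np.asarray(disparity, dtype=float)
    with np.errstate(divide="ignore"):
        return np.where(d > 0, focal_px * baseline_m / d, np.inf)
```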

    Automatic registration of 3D models to laparoscopic video images for guidance during liver surgery

    Laparoscopic liver interventions offer significant advantages over open surgery, such as less pain and trauma and a shorter recovery time for the patient. However, they also bring challenges for the surgeons, such as the lack of tactile feedback, the limited field of view and occluded anatomy. Augmented reality (AR) can potentially help during laparoscopic liver interventions by displaying sub-surface structures (such as tumours or vasculature). The initial registration between the 3D model extracted from the CT scan and the laparoscopic video feed is essential for an AR system, which should be efficient, robust, intuitive to use and minimally disruptive to the surgical procedure. Registration methods in laparoscopic interventions face several challenges, including deformation of the liver due to gas insufflation of the abdomen, partial visibility of the organ and the lack of prominent geometric or texture-wise landmarks. These challenges are discussed in detail and an overview of the state of the art is provided. This research project aims to provide the tools to move towards a completely automatic registration. Firstly, the importance of pre-operative planning is discussed, along with the characteristics of the liver that can be used to constrain a registration method. Secondly, to maximise the amount of information obtained before surgery, a semi-automatic surface-based method is proposed to recover the initial rigid registration irrespective of the position of the shapes. Finally, a fully automatic 3D-2D rigid global registration is proposed which estimates a global alignment of the pre-operative 3D model using a single intra-operative image. Incorporating the different liver contours can further constrain the registration, especially for partial surfaces. A robust, efficient AR system requiring no manual interaction from the surgeon will aid the translation of such approaches to the clinic.
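    The rigid 3D-2D alignment at the core of such a system can be illustrated with a perspective-n-point (PnP) solve, here via OpenCV. This is only the generic pose-from-correspondences step, not the thesis's contour-based global method; the landmark correspondences and intrinsics below are made up:

```python
import cv2
import numpy as np

# Made-up correspondences between CT-model landmarks and image pixels.
model_pts = np.array([[0.0, 0.0, 0.0],
                      [0.1, 0.0, 0.0],
                      [0.0, 0.1, 0.0],
                      [0.0, 0.0, 0.1]])      # 3D landmarks (metres)
image_pts = np.array([[320.0, 240.0],
                      [400.0, 238.0],
                      [322.0, 160.0],
                      [318.0, 245.0]])       # matching pixel locations
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])              # assumed laparoscope intrinsics

ok, rvec, tvec = cv2.solvePnP(model_pts, image_pts, K, None,
                              flags=cv2.SOLVEPNP_EPNP)
if ok:
    R, _ = cv2.Rodrigues(rvec)               # rotation aligning model to camera
```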

    Map-Based Localization for Unmanned Aerial Vehicle Navigation

    Unmanned Aerial Vehicles (UAVs) require precise pose estimation when navigating in indoor and GNSS-denied or GNSS-degraded outdoor environments. The possibility of crashing in these environments is high, as spaces are confined and contain many moving obstacles. There are many solutions for localization in GNSS-denied environments, using many different technologies. Common solutions involve setting up or using existing infrastructure, such as beacons, Wi-Fi, or surveyed targets. These solutions were avoided because the cost should be proportional to the number of users, not the coverage area. Heavy and expensive sensors, for example a high-end IMU, were also avoided. Given these requirements, a camera-based localization solution was selected for sensor pose estimation. Several camera-based localization approaches were investigated. Map-based localization methods were shown to be the most efficient because they close loops using a pre-existing map; the amount of data, and the time spent collecting it, are reduced because there is no need to re-observe the same areas multiple times.

    This dissertation proposes a solution for fully localizing a monocular camera onboard a UAV with respect to a known environment (i.e., it is assumed that a 3D model of the environment is available) for the purpose of UAV navigation in structured environments. Incremental map-based localization involves tracking a map through an image sequence; when the map is a 3D model, this task is referred to as model-based tracking. A by-product of the tracker is the relative 3D pose (position and orientation) between the camera and the object being tracked. State-of-the-art solutions advocate that tracking geometry is more robust than tracking image texture, because edges are more invariant to changes in object appearance and lighting. However, model-based trackers have been limited to tracking small, simple objects in small environments. An assessment was therefore performed in tracking larger, more complex building models in larger environments. A state-of-the-art model-based tracker, ViSP (Visual Servoing Platform), was applied to tracking outdoor and indoor buildings using a UAV's low-cost camera. The assessment revealed weaknesses at large scales. Specifically, ViSP failed when tracking was lost and needed to be manually re-initialized. Failure occurred when there was a lack of model features in the camera's field of view, and because of rapid camera motion. Experiments revealed that ViSP achieved positional accuracies similar to single point positioning solutions obtained from single-frequency (L1) GPS observations, with standard deviations around 10 metres. These errors were considered large, given that the geometric accuracy of the 3D model used in the experiments was 10 to 40 cm.

    The first contribution of this dissertation increases the performance of the localization system by combining ViSP with map-building incremental localization, also referred to as simultaneous localization and mapping (SLAM). Experimental results in both indoor and outdoor environments show that sub-metre positional accuracies were achieved while reducing the number of tracking losses throughout the image sequence. It is shown that by integrating model-based tracking with SLAM, not only does SLAM improve model-tracking performance, but the model-based tracker also alleviates the computational expense of SLAM's loop-closing procedure, improving runtime performance. Experiments also revealed that ViSP was unable to handle occlusions when a complete 3D building model was used, resulting in large errors in its pose estimates.

    The second contribution is a novel map-based incremental localization algorithm that improves tracking performance and increases pose estimation accuracy over ViSP. The novelty of this algorithm is an efficient matching process that identifies corresponding linear features between the UAV's RGB image data and a large, complex, untextured 3D model. The proposed model-based tracker improved positional accuracies from 10 m (obtained with ViSP) to 46 cm in outdoor environments, and from an unattainable result using ViSP to 2 cm in large indoor environments.

    The main disadvantage of any incremental algorithm is that it requires the camera pose of the first frame, and initialization is often a manual process. The third contribution is therefore a map-based absolute localization algorithm that automatically estimates the camera pose when no prior pose information is available. The method benefits from vertical line matching to register the reference model views with a set of initial input images via geometric hashing. Results demonstrate that sub-metre positional accuracies were achieved, and a proposed enhancement of conventional geometric hashing produced more correct matches (75% of the correct matches were identified, compared to 11%) while reducing the number of incorrect matches by 80%.
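    Geometric hashing, named in the third contribution, stores model features in a table indexed by basis-invariant coordinates so that a query view can vote for matching model bases. Below is a toy 2D point-based sketch of the voting scheme (the thesis's variant works on vertical lines and is more elaborate; the quantisation and names are ours):

```python
import numpy as np
from collections import defaultdict

def hash_key(p, q, x, quant=0.25):
    """Coordinates of point x in the basis defined by the ordered pair (p, q)."""
    e1 = q - p
    e2 = np.array([-e1[1], e1[0]])               # perpendicular axis
    u, v = np.linalg.solve(np.column_stack([e1, e2]), x - p)
    return (round(u / quant), round(v / quant))  # quantised, hashable key

def build_table(model_pts):
    """Offline stage: index every model point under every ordered basis pair."""
    table = defaultdict(list)
    n = len(model_pts)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            for k in range(n):
                if k in (i, j):
                    continue
                key = hash_key(model_pts[i], model_pts[j], model_pts[k])
                table[key].append((i, j))        # remember the basis
    return table

def vote(table, query_pts, basis=(0, 1)):
    """Online stage: features seen under a query basis vote for model bases."""
    votes = defaultdict(int)
    p, q = query_pts[basis[0]], query_pts[basis[1]]
    for k, x in enumerate(query_pts):
        if k in basis:
            continue
        for model_basis in table.get(hash_key(p, q, x), []):
            votes[model_basis] += 1
    return max(votes.items(), key=lambda kv: kv[1]) if votes else None
```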