12 research outputs found

    3D Geometry Reconstruction from Discrete Volumetric Data

    Conversion of discrete volumetric data to a boundary representation is a fairly common operation today. The standard approach to this problem is the well-known Marching Cubes algorithm which, although simple and robust, generates low-quality output that requires subsequent post-processing. This master's thesis studies alternative algorithms for isosurface extraction from volume data. The reader is acquainted with the fundamentals of the topic and with the principles of the Hierarchical Iso-Surface Extraction method, an independent implementation of which was developed and tested as part of this work.
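
    The abstract above contrasts the standard Marching Cubes pipeline with the hierarchical method studied in the thesis. As a purely illustrative sketch of that baseline step only (the sphere volume, grid size, and iso-level are assumptions, not data from the thesis), an iso-surface can be extracted with scikit-image's Marching Cubes implementation:

    # Minimal sketch: iso-surface extraction from a scalar volume with the
    # standard Marching Cubes algorithm (scikit-image implementation).
    # The synthetic sphere volume and the iso-level are illustrative choices.
    import numpy as np
    from skimage import measure

    # Signed-distance-like field of a sphere of radius 0.5 on a 64^3 grid.
    x, y, z = np.mgrid[-1:1:64j, -1:1:64j, -1:1:64j]
    volume = np.sqrt(x**2 + y**2 + z**2) - 0.5

    # verts: (V, 3) vertex positions, faces: (F, 3) triangle index triples.
    verts, faces, normals, values = measure.marching_cubes(volume, level=0.0)
    print(verts.shape, faces.shape)

    The raw triangle mesh produced this way is typically what the post-processing mentioned above has to clean up.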

    Single-view 3D body and cloth reconstruction under complex poses

    Recent advances in 3D human shape reconstruction from single images have shown impressive results, leveraging deep networks that model the so-called implicit function to learn the occupancy status of arbitrarily dense 3D points in space. However, while current algorithms based on this paradigm, like PIFuHD (Saito et al., 2020), are able to estimate accurate geometry of the human shape and clothes, they require high-resolution input images and are not able to capture complex body poses. Most training and evaluation is performed on 1k-resolution images of humans standing in front of the camera under neutral body poses. In this paper, we leverage publicly available data to extend existing implicit function-based models to deal with images of humans that can have arbitrary poses and self-occluded limbs. We argue that the representation power of the implicit function is not sufficient to simultaneously model details of the geometry and of the body pose. We therefore propose a coarse-to-fine approach in which we first learn an implicit function that maps the input image to a 3D body shape with a low level of detail, but which correctly fits the underlying human pose, despite its complexity. We then learn a displacement map, conditioned on the smoothed surface and on the input image, which encodes the high-frequency details of the clothes and body. In the experimental section, we show that this coarse-to-fine strategy represents a very good trade-off between shape detail and pose correctness, comparing favorably to the most recent state-of-the-art approaches. Our code will be made publicly available. This work is supported by the Spanish government with the projects MoHuCo PID2020-120049RB-I00 and María de Maeztu Seal of Excellence MDM-2016-0656. Peer reviewed. Postprint (author's final draft).
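
    The coarse-to-fine idea described above can be sketched, in a heavily simplified and hedged form, as two small networks: an implicit function that predicts occupancy for 3D points from an image feature, and a displacement head that adds high-frequency detail on top of the smooth surface. The use of a single global image feature, the layer sizes, and the class names are assumptions for illustration; the paper's actual architecture (pixel-aligned features, a full displacement map) is not reproduced here.

    import torch
    import torch.nn as nn

    class CoarseOccupancy(nn.Module):
        """Stage 1 (sketch): image feature + 3D point -> occupancy in [0, 1]."""
        def __init__(self, feat_dim=256):
            super().__init__()
            self.mlp = nn.Sequential(
                nn.Linear(feat_dim + 3, 256), nn.ReLU(),
                nn.Linear(256, 256), nn.ReLU(),
                nn.Linear(256, 1), nn.Sigmoid())

        def forward(self, feat, pts):          # feat: (B, F), pts: (B, N, 3)
            f = feat.unsqueeze(1).expand(-1, pts.shape[1], -1)
            return self.mlp(torch.cat([f, pts], dim=-1)).squeeze(-1)

    class DisplacementHead(nn.Module):
        """Stage 2 (sketch): per-vertex offset encoding clothing/body detail."""
        def __init__(self, feat_dim=256):
            super().__init__()
            self.mlp = nn.Sequential(
                nn.Linear(feat_dim + 3, 128), nn.ReLU(),
                nn.Linear(128, 1))

        def forward(self, feat, verts):        # verts from the coarse surface
            f = feat.unsqueeze(1).expand(-1, verts.shape[1], -1)
            return self.mlp(torch.cat([f, verts], dim=-1)).squeeze(-1)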

    Remeshing Visual Hull Approximation by Displaced Butterfly Subdivision Surfaces

    Multiple dataset visualization (MDV) framework for scalar volume data

    Many applications require comparative analysis of multiple datasets representing different samples, conditions, time instants, or views in order to develop a better understanding of the scientific problem/system under consideration. One effective approach for such analysis is visualization of the data. In this PhD thesis, we propose an innovative multiple dataset visualization (MDV) approach in which two or more datasets of a given type are rendered concurrently in the same visualization. MDV is an important concept for cases where it is not possible to make an inference based on one dataset, and comparisons between many datasets are required to reveal cross-correlations among them. The proposed MDV framework, which deals with some fundamental issues that arise when several datasets are visualized together, follows a multithreaded architecture consisting of three core components: data preparation/loading, visualization, and rendering. The visualization module, the major focus of this study, currently deals with isosurface extraction and texture-based rendering techniques. For isosurface extraction, our all-in-memory approach keeps the datasets under consideration and the corresponding geometric data in memory; alternatively, the only-polygons-or-points-in-memory approach keeps only the geometric data in memory. To address the issues related to storage and computation, we develop adaptive data coherency and multiresolution schemes. The inter-dataset coherency scheme exploits the similarities among datasets to approximate portions of the isosurfaces of datasets using the isosurface of one or more reference datasets, whereas the intra/inter-dataset multiresolution scheme processes the selected portions of each data volume at varying levels of resolution. The graphics hardware-accelerated approaches adopted for MDV include volume clipping, isosurface extraction and volume rendering, which use 3D textures and advanced per-fragment operations. With appropriate user-defined threshold criteria, we find that the various MDV techniques maintain a linear time-N relationship, improve geometry generation and rendering time, and increase the maximum N that can be handled (N: number of datasets). Finally, we demonstrate the effectiveness and usefulness of the proposed MDV by visualizing 3D scalar data (representing electron density distributions in magnesium oxide and magnesium silicate) from parallel quantum mechanical simulations.
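
    As a hedged illustration of the basic MDV setting (not of the thesis's coherency or multiresolution schemes), the sketch below extracts iso-surfaces from N volumes concurrently so that the resulting geometry could be rendered side by side; the synthetic datasets, the thread pool, and the iso-level are assumptions for illustration.

    from concurrent.futures import ThreadPoolExecutor
    import numpy as np
    from skimage import measure

    def extract(volume, level):
        # Per-dataset iso-surface extraction (geometry only, no rendering).
        verts, faces, _, _ = measure.marching_cubes(volume, level=level)
        return verts, faces

    # N synthetic datasets standing in for samples, conditions, or time instants.
    rng = np.random.default_rng(0)
    datasets = [rng.random((32, 32, 32)) for _ in range(4)]

    # Process the datasets concurrently, mirroring the multithreaded design.
    with ThreadPoolExecutor() as pool:
        surfaces = list(pool.map(lambda v: extract(v, 0.5), datasets))

    for i, (verts, faces) in enumerate(surfaces):
        print(f"dataset {i}: {len(verts)} vertices, {len(faces)} triangles")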

    Stability and Expressiveness of Deep Generative Models

    In recent years, deep learning has revolutionized both machine learning and computer vision. Many classical computer vision tasks (e.g. object detection and semantic segmentation), which traditionally were very challenging, can now be solved using supervised deep learning techniques. While supervised learning is a powerful tool when labeled data is available and the task under consideration has a well-defined output, these conditions are not always satisfied. One promising approach in this case is given by generative modeling. In contrast to purely discriminative models, generative models can deal with uncertainty and learn powerful models even when labeled training data is not available.
    However, while current approaches to generative modeling achieve promising results, they suffer from two aspects that limit their expressiveness: (i) some of the most successful approaches to modeling image data are no longer trained using optimization algorithms, but instead employ algorithms whose dynamics are not well understood, and (ii) generative models are often limited by the memory requirements of the output representation. We address both problems in this thesis. In the first part, we introduce a theory which enables us to better understand the training dynamics of Generative Adversarial Networks (GANs), one of the most promising approaches to generative modeling. We tackle this problem by introducing minimal example problems of GAN training which can be understood analytically, and then gradually increase the complexity of these examples. By doing so, we gain new insights into the training dynamics of GANs and derive new regularizers that also work well for general GANs. Our new regularizers enable us, for the first time, to train a GAN at one megapixel resolution without having to gradually increase the resolution of the training distribution. In the second part of this thesis, we consider output representations in 3D for generative models and 3D reconstruction techniques. By introducing implicit representations to deep learning, we are able to extend many techniques that work in 2D to the 3D domain without sacrificing their expressiveness.
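
    The regularizers are not spelled out in the abstract; as a hedged example of the kind of GAN regularizer studied in this line of work, a zero-centered gradient penalty on real samples can be written as follows (the discriminator callable and the weight gamma are placeholders, not the thesis's exact formulation):

    import torch

    def gradient_penalty(discriminator, real_images, gamma=10.0):
        # Penalize the squared gradient norm of the discriminator at real data
        # points, discouraging the training dynamics from oscillating.
        real_images = real_images.clone().requires_grad_(True)
        scores = discriminator(real_images)
        grads, = torch.autograd.grad(
            outputs=scores.sum(), inputs=real_images, create_graph=True)
        per_sample = grads.pow(2).reshape(grads.shape[0], -1).sum(dim=1)
        return 0.5 * gamma * per_sample.mean()

    Such a penalty would simply be added to the discriminator loss at each training step.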

    Representation, Visualization and Manipulation of Three-Dimensional Medical Data: A Study on the Foundations of Immersive Surgical Simulation

    Master's dissertation - Universidade Federal de Santa Catarina, Centro Tecnológico, Programa de Pós-Graduação em Ciência da Computação. Three-dimensional patient data are used in many medical and hospital settings, supporting diagnosis and providing guidance during surgical procedures. However, although very useful, these data are quite inflexible: they do not allow the user to interact with or manipulate them. Employing computer graphics and virtual reality techniques to represent these data would overcome these difficulties, producing individual representations adapted to each patient and enabling surgical planning and computer-assisted surgery, among other possibilities. The representation of these data and the forms of manipulation must include a set of elements and meet certain requirements in order to achieve realism in the applications; otherwise, the use of these techniques would bring little advantage. By analyzing the elements and requirements to be met, a dependency graph is built that shows the techniques and computational structures needed to obtain realistic immersive virtual environments. This graph identifies data structures for representing solids as the key component for this kind of application. To meet these needs, a data structure is presented that can represent a broad class of spatial topologies and allows fast access to elements and their neighborhoods, together with methods for building such a structure. An application for measuring arteries using the structure and methods mentioned above is also presented, along with the results obtained with them.
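
    The abstract does not name the data structure, so the following is only a hedged, minimal half-edge-style sketch of the general kind of representation that gives fast access to mesh elements and their neighborhoods; all class and field names are illustrative, not the dissertation's design.

    from dataclasses import dataclass, field

    @dataclass
    class HalfEdge:
        origin: int        # vertex this half-edge starts at
        twin: int = -1     # opposite half-edge across the shared edge
        next: int = -1     # next half-edge around the same face
        face: int = -1     # face bordered by this half-edge

    @dataclass
    class SolidMesh:
        vertices: list = field(default_factory=list)    # (x, y, z) tuples
        half_edges: list = field(default_factory=list)  # HalfEdge records

        def edges_from_vertex(self, he_index):
            """Iterate over half-edges leaving the origin vertex of he_index."""
            start = he_index
            while True:
                yield self.half_edges[he_index]
                twin = self.half_edges[he_index].twin
                if twin == -1:               # reached a boundary edge
                    break
                he_index = self.half_edges[twin].next
                if he_index == start:        # completed the one-ring
                    break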

    Neural Scene Representations for 3D Reconstruction and Generative Modeling

    With the increasing technologization of society, we use machines for more and more complex tasks, ranging from driving assistance to video conferencing, to exploring planets. The scene representation, i.e., how sensory data is converted into a compact description of the environment, is fundamental to both the success and the safety of such systems. A promising approach for developing robust, adaptive, and powerful scene representations is offered by learning-based systems that adapt themselves from observations. Indeed, deep learning has revolutionized computer vision in recent years. In particular, better model architectures, large amounts of training data, and more powerful computing devices have enabled deep learning systems with unprecedented performance, and they now set the state of the art in many benchmarks, ranging from image classification, to object detection, to semantic segmentation. Despite these successes, the way these systems operate is still fundamentally different from human cognition. In particular, most approaches operate in the 2D domain, while humans understand that images are projections of the three-dimensional world. In addition, they often do not follow a compositional understanding of scenes, which is fundamental to human reasoning. In this thesis, our goal is to develop scene representations that enable autonomous agents to navigate and act robustly and safely in complex environments while reasoning compositionally in 3D. To this end, we first propose a novel output representation for deep learning-based 3D reconstruction and generative modeling. We find that, compared to previous representations, our neural field-based approach does not require 3D space to be discretized, achieving reconstructions at arbitrary resolution with a constant memory footprint. Next, we develop a differentiable rendering technique to infer these neural field-based 3D shape and texture representations from 2D observations and find that this allows us to scale to more complex, real-world scenarios. Subsequently, we combine our novel 3D shape representation with a spatially and temporally continuous vector field to model non-rigid shapes in motion. We observe that our novel 4D representation can be used for various discriminative and generative tasks, ranging from 4D reconstruction to 4D interpolation, to motion transfer. Finally, we develop an object-centric generative model that can generate 3D scenes in a compositional manner and that allows for photorealistic renderings of generated scenes. We find that our model not only improves image fidelity but also enables more controllable scene generation and image synthesis than prior work, while training only from raw, unposed image collections.
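
    A hedged sketch of the neural-field output representation mentioned above: a small coordinate MLP maps a 3D point to an occupancy value, so the shape can be queried at any resolution without storing a voxel grid. The layer sizes and the query grid are illustrative assumptions, not the thesis's architecture.

    import torch
    import torch.nn as nn

    # Coordinate network: 3D point -> occupancy (constant memory footprint).
    occupancy_net = nn.Sequential(
        nn.Linear(3, 128), nn.ReLU(),
        nn.Linear(128, 128), nn.ReLU(),
        nn.Linear(128, 1))

    def query(points):                      # points: (N, 3) at any density
        with torch.no_grad():
            return torch.sigmoid(occupancy_net(points)).squeeze(-1)

    # The resolution is chosen at query time, not baked into the representation.
    coords = torch.stack(torch.meshgrid(
        *[torch.linspace(-1, 1, 64)] * 3, indexing="ij"), dim=-1).reshape(-1, 3)
    occupancy = query(coords)               # (64**3,) values in [0, 1]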

    Hierarchical Iso-Surface Extraction

    [Figure 1: First three levels and final result of our hierarchical iso-surface extraction algorithm.]
    In this paper we present a novel approach to iso-surface extraction which is based on a multiresolution volume data representation and hierarchically approximates the iso-surface with a semiregular mesh. After having generated a hierarchy of volumes, we extract the iso-surface from the coarsest resolution with a standard Marching Cubes algorithm, apply a simple mesh decimation strategy to improve the shape of the triangles, and use the result as a base mesh. We then iteratively fit the mesh to the iso-surface at the finer volume levels, subdividing it adaptively in order to correctly reconstruct local features. We also take care to generate an even vertex distribution over the iso-surface so that the final result consists of triangles with good aspect ratios. The advantage of this approach, as opposed to the standard method of extracting the iso-surface from the finest resolution with Marching Cubes, is that it generates a mesh with subdivision connectivity.
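
    A hedged sketch of the control flow described in this abstract follows; the decimation, fitting, and adaptive subdivision steps are pass-through stubs standing in for the paper's mesh-processing operations, so only the overall pipeline shape is illustrated, not the authors' implementation.

    import numpy as np
    from skimage import measure

    def decimate(mesh):
        return mesh        # stub: would improve triangle shape of the base mesh

    def fit_to_isosurface(mesh, volume, iso):
        return mesh        # stub: would move vertices onto the finer iso-surface

    def adaptive_subdivide(mesh, volume, iso):
        return mesh        # stub: would refine the mesh near local features

    def build_hierarchy(volume, levels):
        """Volume pyramid by repeated 2x downsampling, coarsest level first."""
        pyramid = [volume]
        for _ in range(levels - 1):
            pyramid.append(pyramid[-1][::2, ::2, ::2])
        return pyramid[::-1]

    def hierarchical_extraction(volume, iso, levels=4):
        pyramid = build_hierarchy(volume, levels)
        verts, faces, _, _ = measure.marching_cubes(pyramid[0], level=iso)
        mesh = decimate((verts, faces))                 # base mesh
        for finer_volume in pyramid[1:]:                # finer and finer volumes
            mesh = fit_to_isosurface(mesh, finer_volume, iso)
            mesh = adaptive_subdivide(mesh, finer_volume, iso)
        return mesh

    # Illustrative call on a synthetic distance field of a sphere.
    x, y, z = np.mgrid[-1:1:64j, -1:1:64j, -1:1:64j]
    mesh = hierarchical_extraction(np.sqrt(x**2 + y**2 + z**2), iso=0.5)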