39 research outputs found

    Deep-learning the Latent Space of Light Transport

    Get PDF
    We suggest a method to directly deep-learn light transport, i.e., the mapping from a 3D geometry-illumination-material configuration to a shaded 2D image. While many previous learning methods have employed 2D convolutional neural networks applied to images, we show for the first time that light transport can be learned directly in 3D. The benefit of 3D over 2D is that the former can also correctly capture illumination effects related to occluded and/or semi-transparent geometry. To learn 3D light transport, we represent the 3D scene as an unstructured 3D point cloud, which is projected to the 2D output image during rendering. Thus, we suggest a two-stage operator comprising a 3D network that first transforms the point cloud into a latent representation, which a dedicated 3D-2D network then projects to the 2D output image in a second step. We show that our approach results in improved quality in terms of temporal coherence while retaining most of the computational efficiency of common 2D methods. As a consequence, the proposed two-stage operator serves as a valuable extension to modern deferred shading approaches.
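    A minimal PyTorch sketch of how such a two-stage operator could be structured: stage 1 learns per-point latent features in 3D, stage 2 splats them to the image plane and decodes a shaded image. All module names, dimensions, and the nearest-pixel splatting are illustrative assumptions, not the authors' implementation.

        import torch
        import torch.nn as nn

        class Stage1PointEncoder(nn.Module):
            """Maps each point (xyz + normal + material) to a latent feature."""
            def __init__(self, in_dim=9, latent_dim=32):
                super().__init__()
                self.mlp = nn.Sequential(
                    nn.Linear(in_dim, 64), nn.ReLU(),
                    nn.Linear(64, latent_dim))

            def forward(self, points):          # points: (N, in_dim)
                return self.mlp(points)         # (N, latent_dim)

        def splat_to_image(xy, feats, res=64):
            """Project latent point features to a 2D grid (xy in [-1, 1])."""
            img = torch.zeros(res, res, feats.shape[1])
            px = ((xy + 1) * 0.5 * (res - 1)).long().clamp(0, res - 1)
            img[px[:, 1], px[:, 0]] = feats     # last write wins; real code would z-buffer
            return img.permute(2, 0, 1)         # (latent_dim, res, res)

        class Stage2Decoder(nn.Module):
            """3D-2D network: decodes the splatted latent image into RGB."""
            def __init__(self, latent_dim=32):
                super().__init__()
                self.conv = nn.Sequential(
                    nn.Conv2d(latent_dim, 32, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(32, 3, 3, padding=1))

            def forward(self, latent_img):      # (latent_dim, H, W)
                return self.conv(latent_img.unsqueeze(0))  # (1, 3, H, W)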

    Point based graphics rendering with unified scalability solutions.

    Get PDF
    Standard real-time 3D graphics rendering algorithms use brute-force polygon rendering, with complexity linear in the number of polygons and little regard for limiting processing to data that contributes to the image. Modern hardware can now render smaller scenes to pixel levels of detail, relaxing surface connectivity requirements. Sub-linear scalability optimizations are typically self-contained, requiring specific data structures, without shared functions and data. A new point-based rendering algorithm, 'Canopy', is investigated that combines multiple typically sub-linear scalability solutions using a small core of data structures. Specifically, locale management, hierarchical view volume culling, backface culling, occlusion culling, level of detail, and depth ordering are addressed. To demonstrate versatility further, shadows and collision detection are examined. Polygon models are voxelized with interpolated attributes to provide points. A scene tree is constructed, based on a BSP tree of points, with compressed attributes. The scene tree is embedded in a compressed, partitioned, procedurally based scene graph architecture that mimics conventional systems with groups, instancing, inlines, and basic read-on-demand rendering from backing store. Hierarchical scene tree refinement constructs an image-space equivalent, the image tree, by projecting object-space scene node points to form image node equivalents. An image graph of image nodes is maintained, describing image- and object-space occlusion relationships, hierarchically refined in front-to-back order to a specified threshold while occlusion culling with occluder fusion. Visible nodes at medium levels of detail are refined further to rasterization scales. Occlusion culling defines a set of visible nodes that can support caching for temporal coherence. Occlusion culling is approximate, so it may not suit critical applications. Quality and performance are tested against standard rendering. Although the algorithm has an O(f) upper bound in the scene size f, it is shown to scale sub-linearly in practice. Scenes that would conventionally require several hundred billion polygons are rendered at interactive frame rates with minimal graphics hardware support.
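    A hedged sketch of one of the combined scalability solutions named above, hierarchical view-volume culling over a scene tree: a whole subtree is rejected as soon as its bounding sphere lies outside any frustum plane. The Node layout and plane convention are illustrative assumptions, not Canopy's actual data structures.

        from dataclasses import dataclass, field

        @dataclass
        class Node:
            center: tuple          # bounding-sphere center (x, y, z)
            radius: float          # bounding-sphere radius
            points: list = field(default_factory=list)   # leaf payload
            children: list = field(default_factory=list)

        def sphere_outside(plane, center, radius):
            # plane = (a, b, c, d) with inward-facing normal; the sphere is
            # outside if it lies entirely on the negative side.
            a, b, c, d = plane
            dist = a * center[0] + b * center[1] + c * center[2] + d
            return dist < -radius

        def cull(node, frustum_planes, visible):
            """Recursively collect scene-tree nodes intersecting the view volume."""
            for plane in frustum_planes:
                if sphere_outside(plane, node.center, node.radius):
                    return                      # whole subtree rejected
            if node.children:
                for child in node.children:
                    cull(child, frustum_planes, visible)
            else:
                visible.append(node)            # leaf survives all planes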

    H3DNET: A Deep Learning Framework for Hierarchical 3D Object Classification

    Get PDF
    Thesis (M.S.), School of Computing and Engineering, University of Missouri--Kansas City, 2017. Thesis advisor: Yugyung Lee. Includes bibliographical references (pages 81-83).
    Deep learning has received a lot of attention in fields such as speech recognition and image classification because of its ability to learn multiple levels of features from raw data. 3D deep learning, in contrast, is relatively new but in high demand given its research value. Current research and usage of deep learning for 3D data suffer from a limited ability to process large volumes of data as well as low performance, especially as the number of classes in the classification task increases. One of the open questions is whether an efficient and accurate 3D deep learning model can be built with large-scale 3D data. In this thesis, we design a hierarchical framework for 3D deep learning, called H3DNET, which can build a 3D DL model in a distributed and scalable manner. In the H3DNET framework, a learning problem is addressed in two stages: divide and conquer. At the divide stage, a learning problem is split into several smaller problems. At the conquer stage, an optimized solution is used to solve these smaller subproblems for better learning performance. This involves training models and optimizing them with refined divisions. Inference achieves efficiency and high accuracy with fuzzy classification by applying this two-step approach hierarchically. The H3DNET framework was implemented in TensorFlow, which can use GPU computation in parallel to build 3D neural networks. We evaluated the H3DNET framework on 3D object classification with the MODELNET10 and MODELNET40 datasets to check its efficiency. The evaluation results verified that the H3DNET framework supports hierarchical 3D deep learning with 3D images in a scalable manner. The classification accuracy is higher than that of the state-of-the-art VOXNET [7] and POINTNET.
    Contents: Introduction -- Background and related work -- The hierarchical 3D net of 3D object classification -- Results and evaluation -- Conclusion and future work
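    A hedged sketch of the divide-and-conquer inference described above: a coarse model first routes a sample to a group of similar classes, then a per-group fine model resolves the final label. The grouping and the predict_proba interface are illustrative assumptions, not the thesis code.

        import numpy as np

        class HierarchicalClassifier:
            def __init__(self, coarse_model, fine_models, groups):
                self.coarse = coarse_model        # predicts a group id
                self.fine = fine_models           # group id -> fine model
                self.groups = groups              # group id -> list of class names

            def predict(self, voxel_grid):
                # Divide: route the sample to one of the smaller subproblems.
                group_probs = self.coarse.predict_proba(voxel_grid)
                gid = int(np.argmax(group_probs))
                # Conquer: the specialized model only discriminates within its
                # group, so adding classes grows the hierarchy instead of
                # widening one flat softmax.
                fine_probs = self.fine[gid].predict_proba(voxel_grid)
                return self.groups[gid][int(np.argmax(fine_probs))]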

    Volume comparison: application to the analysis of the behavior of biomechanical organ models

    Full text link
    This work proposes several metrics for validating the behavior of a biomechanical model. These metrics are the classic overlap-based segmentation validation coefficients and distances between volumes. Together they quantify the degree of fit between the biomechanical model and the real behavior. Lago Ángel, MÁ. (2011). Comparación de volúmenes: aplicación al análisis del comportamiento de modelos biomecánicos de órganos. http://hdl.handle.net/10251/11309
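    A minimal numpy sketch of the classic overlap coefficients the abstract refers to (Dice and Jaccard) computed between two binary volumes; distance-based metrics such as the Hausdorff distance would be computed analogously from the volume surfaces. Array names and sizes are illustrative.

        import numpy as np

        def dice(a: np.ndarray, b: np.ndarray) -> float:
            """Dice coefficient: 2|A intersect B| / (|A| + |B|); 1.0 is a perfect match."""
            a, b = a.astype(bool), b.astype(bool)
            inter = np.logical_and(a, b).sum()
            return 2.0 * inter / (a.sum() + b.sum())

        def jaccard(a: np.ndarray, b: np.ndarray) -> float:
            """Jaccard index: |A intersect B| / |A union B|."""
            a, b = a.astype(bool), b.astype(bool)
            inter = np.logical_and(a, b).sum()
            union = np.logical_or(a, b).sum()
            return inter / union

        # Example: compare a simulated organ deformation against ground truth.
        model = np.zeros((64, 64, 64), bool); model[10:40, 10:40, 10:40] = True
        truth = np.zeros((64, 64, 64), bool); truth[12:42, 10:40, 10:40] = True
        print(f"Dice={dice(model, truth):.3f}, Jaccard={jaccard(model, truth):.3f}")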

    Interactive volume ray tracing

    Get PDF
    Volume rendering is one of the most interesting, but certainly also most demanding, topics in scientific visualization. In contrast to surface models, volumetric data represent a semi-transparent medium in a 3D field. Applications range from medical examinations and the simulation of physical processes to visual art. Most of these applications demand interactivity with respect to viewing and visualization parameters. The ray tracing algorithm, although it inherently simulates light interaction with participating media, was always considered too slow. Instead, most researchers followed object-order algorithms better suited to graphics adapters, although such approaches often suffer from either low quality or a lack of flexibility. The other alternative is to speed up the ray tracing algorithm so that it becomes competitive for volumetric visualization tasks. Since the advent of modern graphics adapters, research in this area has slowed, although some limitations of GPUs, e.g. limited graphics board memory and a tedious programming model, remain a problem. The two methods discussed in this thesis are therefore purely software-based, since software implementations allow for a far better optimization process before porting algorithms to hardware.
    The first method is called the implicit kd-tree, a hierarchical spatial acceleration structure originally developed for iso-surface rendering of regular data sets; it now also supports semi-transparent rendering and time-dependent data visualization, and has been used in non-volume-rendering applications. The second algorithm uses so-called Plücker coordinates, providing fast incremental traversal for data sets consisting of tetrahedral or hexahedral primitives. Both algorithms are highly optimized to support interactive rendering of volumetric data sets and are therefore major contributions towards a flexible and interactive volume ray tracing framework.
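    A hedged numpy sketch of the Plücker-coordinate side test that underlies this kind of incremental traversal: the sign of the permuted inner product tells on which side a ray passes a directed edge, and consistent signs against a face's edges identify it as the exit face of a cell. Function names are illustrative, not the thesis implementation.

        import numpy as np

        def plucker(p, q):
            """Plücker coordinates (direction, moment) of the line through p and q."""
            d = q - p
            return d, np.cross(p, q)

        def side(line_a, line_b):
            """Permuted inner product; its sign encodes relative orientation."""
            da, ma = line_a
            db, mb = line_b
            return np.dot(da, mb) + np.dot(ma, db)

        def ray_hits_triangle(origin, direction, v0, v1, v2):
            """A ray passes through a triangle iff it has a consistent
            orientation against all three directed edges."""
            ray = (direction, np.cross(origin, origin + direction))
            signs = [np.sign(side(ray, plucker(a, b)))
                     for a, b in ((v0, v1), (v1, v2), (v2, v0))]
            nonzero = [s for s in signs if s != 0]
            return all(s == nonzero[0] for s in nonzero)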

    Robust and Efficient Deep Visual Learning

    Get PDF
    The past decade was marked by significant progress in the field of artificial intelligence and statistical learning. However, the most impressive modern models come in the form of computationally expensive black boxes, and most of them lack the ability to reason robustly about the confidence of their predictions. Being able to quantify model uncertainty and recognize failure scenarios is crucial when incorporating models into complex decision-making pipelines, e.g. autonomous driving or medical image analysis systems; it is also important to keep the computational cost of these models low. In the present thesis, the aforementioned desired properties of robustness and efficiency of deep learning models are studied and developed in three specific realms of computer vision. First, we investigate deep probabilistic models that allow uncertainty quantification, i.e. models that "know what they do not know". Here, we propose a novel model for the task of angular regression that allows probabilistic object pose estimation from 2D images. We also showcase how the general deep density estimation paradigm can be adapted and utilized in two other real-world applications, ball trajectory prediction and brain imaging. Next, we turn to the field of 3D shape analysis and rendering. We propose a method for efficient encoding of 3D point clouds, a type of data that is hard to handle with conventional learning algorithms due to its unordered nature. We show that simple neural networks that use the developed encoding as input can match the performance of state-of-the-art methods on various point cloud processing tasks while using orders of magnitude fewer floating-point operations. Finally, we explore the emerging field of neural rendering and develop a framework that connects classic deformable 3D body models with modern image-to-image translation neural networks. This combination allows efficient photorealistic human avatar rendering in a controlled manner, with flexible camera control and the ability to change body pose and shape appearance. The thesis concludes with a discussion of the presented methods, including current limitations and future research directions.
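    A hedged PyTorch sketch of probabilistic angular regression in the spirit described above: a network head predicts the mean angle mu and concentration kappa of a von Mises distribution and is trained by negative log-likelihood, so a low kappa signals high uncertainty. The von Mises choice and layer sizes are assumptions for illustration, not necessarily the thesis' exact model.

        import torch
        import torch.nn as nn

        class VonMisesHead(nn.Module):
            def __init__(self, feat_dim=128):
                super().__init__()
                self.mu_head = nn.Linear(feat_dim, 2)      # (cos mu, sin mu)
                self.kappa_head = nn.Linear(feat_dim, 1)   # concentration > 0

            def forward(self, feats):
                cs = nn.functional.normalize(self.mu_head(feats), dim=-1)
                mu = torch.atan2(cs[:, 1], cs[:, 0])
                kappa = nn.functional.softplus(self.kappa_head(feats)).squeeze(-1)
                return mu, kappa

        def von_mises_nll(theta, mu, kappa):
            """Negative log-likelihood of angles theta under VM(mu, kappa).
            For very large kappa, torch.special.i0e improves numerical stability."""
            log_norm = torch.log(2 * torch.pi * torch.special.i0(kappa))
            return (log_norm - kappa * torch.cos(theta - mu)).mean()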