294 research outputs found

    High performance visualization through graphics hardware and integration issues in an electric power grid Computer-Aided-Design application

    Get PDF
    [Resumo]Esta tese presenta os resultados de traballos de investigación levados a cabo para mellora-la visualización dunha ferramenta de planificación de redes eléctricas. A primeira actuación consistiu en actualiza-lo seu motor de debuxo para explota-lo alto rendemento ofrecido por DirectX. Unha forma fundamental de mellora-lo rendemento dunha visualización é a reducción dos datos involucrados posta que canto mellor sexa seu volume mellor será. o tempo necesario para xera-la visualización. Ademáis a reducción de datos pode mellora-la calidade estética da visualización, facéndoa máis lexible. Presentáronse dous aproximacións á reducción de datos: mediante bases de datos especiais e mediante hardware gráfico. Non son exc1uíntes dado que a información das redes eléctricas normalmente provén dunha base de datos e, empregando as extensións espaciais que proporcionan a maioría dos sistemas de xestión de bases de datos modernos, pódese procesala infonnación para adecuala á visualización a realizar . O hardware gráfico moderno está equipado con potentes GPUs con gran capacidade para a computación paralela de alto rendemento e propósito xeral. Presentáronse tres técnicas que non só debuxan a visualización das redes eléctricas, senón que integran o proceso a simplificación das redes eléctricas para tratar de xera-la visualización máis rápidamente. Según sigan a evoluciona-las capacidades do hardware gráfico novas técnicas podeán ser implementadas e as existentes mellorarán o seu rendemento de forma casi transparente o que fai que este campo sexa altamente dinámico e unha fonte de innovación.[Resumen]Esta tesis presenta los resultados de trabajos de investigación llevados a cabo para mejorar la visualización de una herramienta de planificación de redes eléctricas. La primera actuación consistió en actualizar su motor de dibujo para explotar el alto rendimiento ofrecido por DirectX. Una forma fundamental de mejorar el rendimiento de una visualización es la reducción de los datos involucrados, puesto que cuanto menor sea su volumen, menor será el tiempo necesario para generar la visualización. Además, la reducción de datos puede mejorar la calidad estética de la visualización, haciéndola más legible. Se presentaron dos aproximaciones a la reducción de datos: mediante bases de datos espaciales y mediante hardware gráfico. No son excluyentes, dado que la información de las redes eléctricas normalmente proviene de una base de datos y, utilizando las extensiones espaciales que proporcionan la mayoría de sistemas de gestión de bases de datos modernos, se puede procesar la información para adecuadarla a la visualización a realizar. El hardware gráfico moderno está equipado con potentes GPUs con gran capacidad para la computación paralela de alto rendimiento y propósito general. Se han presentado tres técnicas que no solo dibujan la visualización de las redes eléctricas, sino que integran en el proceso la simplificación de las redes eléctricas para tratar de generar la visualización más rápidamente. Según sigan evolucionando las capacidades del hardware gráfico, nuevas técnicas podrán ser implementadas y las existentes mejorarán su rendimiento de forma casi transparente, lo que hace que este campo sea altamente dinámico y una fuente de innovación

    Doctor of Philosophy

    Get PDF
    dissertationVisualizing surfaces is a fundamental technique in computer science and is frequently used across a wide range of fields such as computer graphics, biology, engineering, and scientific visualization. In many cases, visualizing an interface between boundaries can provide meaningful analysis or simplification of complex data. Some examples include physical simulation for animation, multimaterial mesh extraction in biophysiology, flow on airfoils in aeronautics, and integral surfaces. However, the quest for high-quality visualization, coupled with increasingly complex data, comes with a high computational cost. Therefore, new techniques are needed to solve surface visualization problems within a reasonable amount of time while also providing sophisticated visuals that are meaningful to scientists and engineers. In this dissertation, novel techniques are presented to facilitate surface visualization. First, a particle system for mesh extraction is parallelized on the graphics processing unit (GPU) with a red-black update scheme to achieve an order of magnitude speed-up over a central processing unit (CPU) implementation. Next, extending the red-black technique to multiple materials showed inefficiencies on the GPU. Therefore, we borrow the underlying data structure from the closest point method, the closest point embedding, and the particle system solver is switched to hierarchical octree-based approach on the GPU. Third, to demonstrate that the closest point embedding is a fast, flexible data structure for surface particles, it is adapted to unsteady surface flow visualization at near-interactive speeds. Finally, the closest point embedding is a three-dimensional dense structure that does not scale well. Therefore, we introduce a closest point sparse octree that allows the closest point embedding to scale to higher resolution. Further, we demonstrate unsteady line integral convolution using the closest point method

    Comparison of two approaches for web-based 3D visualization of smart building sensor data

    Get PDF
    Abstract. This thesis presents a comparative study on two different approaches for visualizing sensor data collected from smart buildings on the web using 3D virtual environments. The sensor data is provided by sensors that are deployed in real buildings to measure several environmental parameters including temperature, humidity, air quality and air pressure. The first approach uses the three.js WebGL framework to create the 3D model of a smart apartment where sensor data is illustrated with point and wall visualizations. Point visualizations show sensor values at the real locations of the sensors using text, icons or a mixture of the two. Wall visualizations display sensor values inside panels placed on the interior walls of the apartment. The second approach uses the Unity game engine to create the 3D model of a 4-floored hospice where sensor data is illustrated with aforementioned point visualizations and floor visualizations, where the sensor values are shown on the floor around the location of the sensors in form of color or other effects. The two approaches are compared with respect to their technical performance in terms of rendering speed, model size and request size, and with respect to the relative advantages and disadvantages of the two development environments as experienced in this thesis

    Ambient occlusion and shadows for molecular graphics

    Get PDF
    Computer based visualisations of molecules have been produced as early as the 1950s to aid researchers in their understanding of biomolecular structures. An important consideration for Molecular Graphics software is the ability to visualise the 3D structure of the molecule in a clear manner. Recent advancements in computer graphics have led to improved rendering capabilities of the visualisation tools. The capabilities of current shading languages allow the inclusion of advanced graphic effects such as ambient occlusion and shadows that greatly improve the comprehension of the 3D shapes of the molecules. This thesis focuses on finding improved solutions to the real time rendering of Molecular Graphics on modern day computers. The methods of calculating ambient occlusion and both hard and soft shadows are examined and implemented to give the user a more complete experience when navigating large molecular structures

    GPGPU-Enabled Physics Based Deformed Model Simulation

    Get PDF
    Computer simulation techniques are widely adopted nowadays in many areas like manufacturing, engineering, graphics, animation, virtual reality and so on. However, the standard finite element based simulation is notorious for its expensive computation. To address this challenge, I present a GPU-based parallel implementation for simulating large elastic deformation. Classic modal analysis provides a set of orthonormal bases vectors, which span a spectral space encoding the dynamics of the elastic body. As each basis vector is orthogonal to each other, the computation is completely decoupled and can be well-fit into the modern GPGPU platform. We further explore the latest feature of NVIDIA CUDA so that the result of GPU computation can be directly used for upcoming rendering/visualization and a significant amount of overheads for transmitting data from client GPU and host CPU via the PCI-Express bus are avoided. Real-time simulation is made possible with this technique for many cases that otherwise is not possible

    Towards Expressive and Versatile Visualization-as-a-Service (VaaS)

    Get PDF
    The rapid growth of data in scientific visualization has posed significant challenges to the scalability and availability of interactive visualization tools. These challenges can be largely attributed to the limitations of traditional monolithic applications in handling large datasets and accommodating multiple users or devices. To address these issues, the Visualization-as-a-Service (VaaS) architecture has emerged as a promising solution. VaaS leverages cloud-based visualization capabilities to provide on-demand and cost-effective interactive visualization. Existing VaaS has been simplistic by design with focuses on task-parallelism with single-user-per-device tasks for predetermined visualizations. This dissertation aims to extend the capabilities of VaaS by exploring data-parallel visualization services with multi-device support and hypothesis-driven explorations. By incorporating stateful information and enabling dynamic computation, VaaS\u27 performance and flexibility for various real-world applications is improved. This dissertation explores the history of monolithic and VaaS architectures, the design and implementations of 3 new VaaS applications, and a final exploration of the future of VaaS. This research contributes to the advancement of interactive scientific visualization, addressing the challenges posed by large datasets and remote collaboration scenarios

    Shader optimization and specialization

    Get PDF
    In the field of real-time graphics for computer games, performance has a significant effect on the player’s enjoyment and immersion. Graphics processing units (GPUs) are hardware accelerators that run small parallelized shader programs to speed up computationally expensive rendering calculations. This thesis examines optimizing shader programs and explores ways in which data patterns on both the CPU and GPU can be analyzed to automatically speed up rendering in games. Initially, the effect of traditional compiler optimizations on shader source-code was explored. Techniques such as loop unrolling or arithmetic reassociation provided speed-ups on several devices, but different GPU hardware responded differently to each set of optimizations. Analyzing execution traces from numerous popular PC games revealed that much of the data passed from CPU-based API calls to GPU-based shaders is either unused, or remains constant. A system was developed to capture this constant data and fold it into the shaders’ source-code. Re-running the game’s rendering code using these specialized shader variants resulted in performance improvements in several commercial games without impacting their visual quality

    Contributions to virtual reality

    Get PDF
    153 p.The thesis contributes in three Virtual Reality areas: ¿ Visual perception: a calibration algorithm is proposed to estimate stereo projection parameters in head-mounted displays, so that correct shapes and distances can be perceived, and calibration and control procedures are proposed to obtain desired accommodation stimuli at different virtual distances.¿ Immersive scenarios: the thesis analyzes several use cases demanding varying degrees of immersion and special, innovative visualization solutions are proposed to fulfil their requirements. Contributions focus on machinery simulators, weather radar volumetric visualization and manual arc welding simulation.¿ Ubiquitous visualization: contributions are presented to scenarios where users access interactive 3D applications remotely. The thesis follows the evolution of Web3D standards and technologies to propose original visualization solutions for volume rendering of weather radar data, e-learning on energy efficiency, virtual e-commerce and visual product configurators

    Exploiting frame coherence in real-time rendering for energy-efficient GPUs

    Get PDF
    The computation capabilities of mobile GPUs have greatly evolved in the last generations, allowing real-time rendering of realistic scenes. However, the desire for processing complex environments clashes with the battery-operated nature of smartphones, for which users expect long operating times per charge and a low-enough temperature to comfortably hold them. Consequently, improving the energy-efficiency of mobile GPUs is paramount to fulfill both performance and low-power goals. The work of the processors from within the GPU and their accesses to off-chip memory are the main sources of energy consumption in graphics workloads. Yet most of this energy is spent in redundant computations, as the frame rate required to produce animations results in a sequence of extremely similar images. The goal of this thesis is to improve the energy-efficiency of mobile GPUs by designing micro-architectural mechanisms that leverage frame coherence in order to reduce the redundant computations and memory accesses inherent in graphics applications. First, we focus on reducing redundant color computations. Mobile GPUs typically employ an architecture called Tile-Based Rendering, in which the screen is divided into tiles that are independently rendered in on-chip buffers. It is common that more than 80% of the tiles produce exactly the same output between consecutive frames. We propose Rendering Elimination (RE), a mechanism that accurately determines such occurrences by computing and storing signatures of the inputs of all the tiles in a frame. If the signatures of a tile across consecutive frames are the same, the colors computed in the preceding frame are reused, saving all computations and memory accesses associated to the rendering of the tile. We show that RE vastly outperforms related schemes found in the literature, achieving a reduction of energy consumption of 37% and execution time of 33% with minimal overheads. Next, we focus on reducing redundant computations of fragments that will eventually not be visible. In real-time rendering, objects are processed in the order they are submitted to the GPU, which usually causes that the results of previously-computed objects are overwritten by new objects that turn occlude them. Consequently, whether or not a particular object will be occluded is not known until the entire scene has been processed. Based on the fact that visibility tends to remain constant across consecutive frames, we propose Early Visibility Resolution (EVR), a mechanism that predicts visibility based on information obtained in the preceding frame. EVR first computes and stores the depth of the farthest visible point after rendering each tile. Whenever a tile is rendered in the following frame, primitives that are farther from the observer than the stored depth are predicted to be occluded, and processed after the ones predicted to be visible. Additionally, this visibility prediction scheme is used to improve Rendering Elimination’s equal tile detection capabilities by not adding primitives predicted to be occluded in the signature. With minor hardware costs, EVR is shown to provide a reduction of energy consumption of 43% and execution time of 39%. Finally, we focus on reducing computations in tiles with low spatial frequencies. GPUs produce pixel colors by sampling triangles once per pixel and performing computations on each sampling location. However, most screen regions do not include sufficient detail to require high sampling rates, leading to a significant amount of energy wasted computing the same color for neighboring pixels. Given that spatial frequencies are maintained across frames, we propose Dynamic Sampling Rate, a mechanism that analyzes the spatial frequencies of tiles and determines the best sampling rate for them, which is applied in the following frame. Results show that Dynamic Sampling Rate significantly reduces processor activity, yielding energy savings of 40% and execution time reductions of 35%.La capacitat de càlcul de les GPU mòbils ha augmentat en gran mesura en les darreres generacions, permetent el renderitzat de paisatges complexos en temps real. Nogensmenys, el desig de processar escenes cada vegada més realistes xoca amb el fet que aquests dispositius funcionen amb bateries, i els usuaris n’esperen llargues durades i una temperatura prou baixa com per a ser agafats còmodament. En conseqüència, millorar l’eficiència energètica de les GPU mòbils és essencial per a aconseguir els objectius de rendiment i baix consum. Els processadors de la GPU i els seus accessos a memòria són els principals consumidors d’energia en càrregues gràfiques, però molt d’aquest consum és malbaratat en càlculs redundants, ja que les animacions produïdes s¿aconsegueixen renderitzant una seqüència d’imatges molt similars. L’objectiu d’aquesta tesi és millorar l’eficiència energètica de les GPU mòbils mitjançant el disseny de mecanismes microarquitectònics que aprofitin la coherència entre imatges per a reduir els càlculs i accessos redundants inherents a les aplicacions gràfiques. Primerament, ens centrem en reduir càlculs redundants de colors. A les GPU mòbils, sovint s'empra una arquitectura anomenada Tile-Based Rendering, en què la pantalla es divideix en regions que es processen independentment dins del xip. És habitual que més del 80% de les regions de pantalla produeixin els mateixos colors entre imatges consecutives. Proposem Rendering Elimination (RE), un mecanisme que determina acuradament aquests casos computant una signatura de les entrades de totes les regions. Si les signatures de dues imatges són iguals, es reutilitzen els colors calculats a la imatge anterior, el que estalvia tots els càlculs i accessos a memòria de la regió. RE supera àmpliament propostes relacionades de la literatura, aconseguint una reducció del consum energètic del 37% i del temps d’execució del 33%. Seguidament, ens centrem en reduir càlculs redundants en fragments que eventualment no seran visibles. En aplicacions gràfiques, els objectes es processen en l’ordre en què son enviats a la GPU, el que sovint causa que resultats ja processats siguin sobreescrits per nous objectes que els oclouen. Per tant, no se sap si un objecte serà visible o no fins que tota l’escena ha estat processada. Fonamentats en el fet que la visibilitat tendeix a ser constant entre imatges, proposem Early Visibility Resolution (EVR), un mecanisme que prediu la visibilitat basat en informació obtinguda a la imatge anterior. EVR computa i emmagatzema la profunditat del punt visible més llunyà després de processar cada regió de pantalla. Quan es processa una regió a la imatge següent, es prediu que les primitives més llunyanes a el punt guardat seran ocloses i es processen després de les que es prediuen que seran visibles. Addicionalment, aquest esquema de predicció s’empra en millorar la detecció de regions redundants de RE al no afegir les primitives que es prediu que seran ocloses a les signatures. Amb un cost de maquinari mínim, EVR aconsegueix una millora del consum energètic del 43% i del temps d’execució del 39%. Finalment, ens centrem a reduir càlculs en regions de pantalla amb poca freqüència espacial. Les GPU actuals produeixen colors mostrejant els triangles una vegada per cada píxel i fent càlculs a cada localització mostrejada. Però la majoria de regions no tenen suficient detall per a necessitar altes freqüències de mostreig, el que implica un malbaratament d’energia en el càlcul del mateix color en píxels adjacents. Com les freqüències tendeixen a mantenir-se en el temps, proposem Dynamic Sampling Rate (DSR)¸ un mecanisme que analitza les freqüències de les regions una vegada han estat renderitzades i en determina la menor freqüència de mostreig a la que es poden processar, que s’aplica a la següent imatge..

    Exploiting frame coherence in real-time rendering for energy-efficient GPUs

    Get PDF
    The computation capabilities of mobile GPUs have greatly evolved in the last generations, allowing real-time rendering of realistic scenes. However, the desire for processing complex environments clashes with the battery-operated nature of smartphones, for which users expect long operating times per charge and a low-enough temperature to comfortably hold them. Consequently, improving the energy-efficiency of mobile GPUs is paramount to fulfill both performance and low-power goals. The work of the processors from within the GPU and their accesses to off-chip memory are the main sources of energy consumption in graphics workloads. Yet most of this energy is spent in redundant computations, as the frame rate required to produce animations results in a sequence of extremely similar images. The goal of this thesis is to improve the energy-efficiency of mobile GPUs by designing micro-architectural mechanisms that leverage frame coherence in order to reduce the redundant computations and memory accesses inherent in graphics applications. First, we focus on reducing redundant color computations. Mobile GPUs typically employ an architecture called Tile-Based Rendering, in which the screen is divided into tiles that are independently rendered in on-chip buffers. It is common that more than 80% of the tiles produce exactly the same output between consecutive frames. We propose Rendering Elimination (RE), a mechanism that accurately determines such occurrences by computing and storing signatures of the inputs of all the tiles in a frame. If the signatures of a tile across consecutive frames are the same, the colors computed in the preceding frame are reused, saving all computations and memory accesses associated to the rendering of the tile. We show that RE vastly outperforms related schemes found in the literature, achieving a reduction of energy consumption of 37% and execution time of 33% with minimal overheads. Next, we focus on reducing redundant computations of fragments that will eventually not be visible. In real-time rendering, objects are processed in the order they are submitted to the GPU, which usually causes that the results of previously-computed objects are overwritten by new objects that turn occlude them. Consequently, whether or not a particular object will be occluded is not known until the entire scene has been processed. Based on the fact that visibility tends to remain constant across consecutive frames, we propose Early Visibility Resolution (EVR), a mechanism that predicts visibility based on information obtained in the preceding frame. EVR first computes and stores the depth of the farthest visible point after rendering each tile. Whenever a tile is rendered in the following frame, primitives that are farther from the observer than the stored depth are predicted to be occluded, and processed after the ones predicted to be visible. Additionally, this visibility prediction scheme is used to improve Rendering Elimination’s equal tile detection capabilities by not adding primitives predicted to be occluded in the signature. With minor hardware costs, EVR is shown to provide a reduction of energy consumption of 43% and execution time of 39%. Finally, we focus on reducing computations in tiles with low spatial frequencies. GPUs produce pixel colors by sampling triangles once per pixel and performing computations on each sampling location. However, most screen regions do not include sufficient detail to require high sampling rates, leading to a significant amount of energy wasted computing the same color for neighboring pixels. Given that spatial frequencies are maintained across frames, we propose Dynamic Sampling Rate, a mechanism that analyzes the spatial frequencies of tiles and determines the best sampling rate for them, which is applied in the following frame. Results show that Dynamic Sampling Rate significantly reduces processor activity, yielding energy savings of 40% and execution time reductions of 35%.La capacitat de càlcul de les GPU mòbils ha augmentat en gran mesura en les darreres generacions, permetent el renderitzat de paisatges complexos en temps real. Nogensmenys, el desig de processar escenes cada vegada més realistes xoca amb el fet que aquests dispositius funcionen amb bateries, i els usuaris n’esperen llargues durades i una temperatura prou baixa com per a ser agafats còmodament. En conseqüència, millorar l’eficiència energètica de les GPU mòbils és essencial per a aconseguir els objectius de rendiment i baix consum. Els processadors de la GPU i els seus accessos a memòria són els principals consumidors d’energia en càrregues gràfiques, però molt d’aquest consum és malbaratat en càlculs redundants, ja que les animacions produïdes s¿aconsegueixen renderitzant una seqüència d’imatges molt similars. L’objectiu d’aquesta tesi és millorar l’eficiència energètica de les GPU mòbils mitjançant el disseny de mecanismes microarquitectònics que aprofitin la coherència entre imatges per a reduir els càlculs i accessos redundants inherents a les aplicacions gràfiques. Primerament, ens centrem en reduir càlculs redundants de colors. A les GPU mòbils, sovint s'empra una arquitectura anomenada Tile-Based Rendering, en què la pantalla es divideix en regions que es processen independentment dins del xip. És habitual que més del 80% de les regions de pantalla produeixin els mateixos colors entre imatges consecutives. Proposem Rendering Elimination (RE), un mecanisme que determina acuradament aquests casos computant una signatura de les entrades de totes les regions. Si les signatures de dues imatges són iguals, es reutilitzen els colors calculats a la imatge anterior, el que estalvia tots els càlculs i accessos a memòria de la regió. RE supera àmpliament propostes relacionades de la literatura, aconseguint una reducció del consum energètic del 37% i del temps d’execució del 33%. Seguidament, ens centrem en reduir càlculs redundants en fragments que eventualment no seran visibles. En aplicacions gràfiques, els objectes es processen en l’ordre en què son enviats a la GPU, el que sovint causa que resultats ja processats siguin sobreescrits per nous objectes que els oclouen. Per tant, no se sap si un objecte serà visible o no fins que tota l’escena ha estat processada. Fonamentats en el fet que la visibilitat tendeix a ser constant entre imatges, proposem Early Visibility Resolution (EVR), un mecanisme que prediu la visibilitat basat en informació obtinguda a la imatge anterior. EVR computa i emmagatzema la profunditat del punt visible més llunyà després de processar cada regió de pantalla. Quan es processa una regió a la imatge següent, es prediu que les primitives més llunyanes a el punt guardat seran ocloses i es processen després de les que es prediuen que seran visibles. Addicionalment, aquest esquema de predicció s’empra en millorar la detecció de regions redundants de RE al no afegir les primitives que es prediu que seran ocloses a les signatures. Amb un cost de maquinari mínim, EVR aconsegueix una millora del consum energètic del 43% i del temps d’execució del 39%. Finalment, ens centrem a reduir càlculs en regions de pantalla amb poca freqüència espacial. Les GPU actuals produeixen colors mostrejant els triangles una vegada per cada píxel i fent càlculs a cada localització mostrejada. Però la majoria de regions no tenen suficient detall per a necessitar altes freqüències de mostreig, el que implica un malbaratament d’energia en el càlcul del mateix color en píxels adjacents. Com les freqüències tendeixen a mantenir-se en el temps, proposem Dynamic Sampling Rate (DSR)¸ un mecanisme que analitza les freqüències de les regions una vegada han estat renderitzades i en determina la menor freqüència de mostreig a la que es poden processar, que s’aplica a la següent imatge...Postprint (published version
    • …
    corecore