61 research outputs found

    Exploiting frame coherence in real-time rendering for energy-efficient GPUs

    The computation capabilities of mobile GPUs have greatly evolved over the last generations, allowing real-time rendering of realistic scenes. However, the desire to process complex environments clashes with the battery-operated nature of smartphones, whose users expect long operating times per charge and a temperature low enough to hold the device comfortably. Consequently, improving the energy efficiency of mobile GPUs is paramount to fulfilling both performance and low-power goals. The processors within the GPU and their accesses to off-chip memory are the main sources of energy consumption in graphics workloads, yet much of this energy is spent on redundant computations, since the frame rates required to produce animations result in sequences of extremely similar images. The goal of this thesis is to improve the energy efficiency of mobile GPUs by designing micro-architectural mechanisms that leverage frame coherence to reduce the redundant computations and memory accesses inherent in graphics applications.

    First, we focus on reducing redundant color computations. Mobile GPUs typically employ an architecture called Tile-Based Rendering, in which the screen is divided into tiles that are rendered independently in on-chip buffers. It is common for more than 80% of the tiles to produce exactly the same output in consecutive frames. We propose Rendering Elimination (RE), a mechanism that accurately detects such occurrences by computing and storing signatures of the inputs of all the tiles in a frame. If the signatures of a tile are the same across consecutive frames, the colors computed in the preceding frame are reused, saving all computations and memory accesses associated with rendering the tile. We show that RE vastly outperforms related schemes in the literature, reducing energy consumption by 37% and execution time by 33% with minimal overheads.
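    As a rough software sketch of the control flow Rendering Elimination describes, the snippet below hashes each tile's inputs and reuses the previous frame's colors when the signature is unchanged. The names (tile_signature, render_tile) and the choice of hash are illustrative assumptions; the thesis computes the signatures in GPU hardware rather than software.

```python
# Sketch of the Rendering Elimination idea: hash each tile's inputs and skip
# re-rendering when the hash matches the one stored for the previous frame.
import hashlib
from typing import Callable, Dict, Tuple

TileId = Tuple[int, int]  # (tile_x, tile_y) position on screen


def tile_signature(tile_inputs: bytes) -> bytes:
    """Signature over everything that influences the tile's colors
    (geometry, uniforms, textures referenced by the tile)."""
    return hashlib.blake2b(tile_inputs, digest_size=8).digest()


def render_frame(tiles: Dict[TileId, bytes],
                 render_tile: Callable[[TileId], bytes],
                 prev_sigs: Dict[TileId, bytes],
                 prev_colors: Dict[TileId, bytes]):
    sigs, colors = {}, {}
    for tid, inputs in tiles.items():
        sig = tile_signature(inputs)
        sigs[tid] = sig
        if prev_sigs.get(tid) == sig:
            colors[tid] = prev_colors[tid]  # reuse: no shading, no memory writes
        else:
            colors[tid] = render_tile(tid)  # inputs changed (or first frame): render
    return sigs, colors
```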
    Next, we focus on reducing redundant computations on fragments that will eventually not be visible. In real-time rendering, objects are processed in the order in which they are submitted to the GPU, which often causes the results of previously processed objects to be overwritten by new objects that in turn occlude them. Consequently, whether a particular object will be occluded is not known until the entire scene has been processed. Based on the observation that visibility tends to remain constant across consecutive frames, we propose Early Visibility Resolution (EVR), a mechanism that predicts visibility from information obtained in the preceding frame. EVR computes and stores the depth of the farthest visible point after rendering each tile. Whenever the tile is rendered in the following frame, primitives that are farther from the observer than the stored depth are predicted to be occluded and are processed after the ones predicted to be visible. In addition, this visibility prediction is used to improve Rendering Elimination's detection of unchanged tiles by excluding primitives predicted to be occluded from the signatures. With minor hardware costs, EVR reduces energy consumption by 43% and execution time by 39%.
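    The per-tile reordering that EVR performs can be sketched as follows, assuming a software model in which each tile remembers the farthest visible depth from the previous frame; the data structures are hypothetical, and mispredicted primitives are simply rendered later, so correctness is preserved.

```python
# Sketch of Early Visibility Resolution's reordering step: primitives deeper
# than last frame's farthest-visible depth for a tile are predicted occluded
# and deferred until after the predicted-visible ones.
from dataclasses import dataclass
from typing import Dict, List, Tuple

TileId = Tuple[int, int]


@dataclass
class Primitive:
    min_depth: float      # depth of the primitive's closest point within the tile
    payload: object = None


def reorder_for_tile(prims: List[Primitive],
                     prev_far_depth: Dict[TileId, float],
                     tid: TileId) -> List[Primitive]:
    threshold = prev_far_depth.get(tid, float("inf"))
    predicted_visible = [p for p in prims if p.min_depth <= threshold]
    predicted_occluded = [p for p in prims if p.min_depth > threshold]
    # Shade likely-visible primitives first so the depth test can cheaply
    # discard most of the deferred ones when they are finally processed.
    return predicted_visible + predicted_occluded
```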
    Finally, we focus on reducing computations in tiles with low spatial frequencies. GPUs produce pixel colors by sampling triangles once per pixel and performing computations at each sampling location. However, most screen regions do not contain enough detail to require such high sampling rates, so a significant amount of energy is wasted computing the same color for neighboring pixels. Given that spatial frequencies are maintained across frames, we propose Dynamic Sampling Rate (DSR), a mechanism that analyzes the spatial frequencies of tiles once they have been rendered and determines the lowest sampling rate at which they can be processed, which is applied in the following frame. Results show that DSR significantly reduces processor activity, yielding energy savings of 40% and execution time reductions of 35%.
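    A toy version of the per-tile decision DSR makes is sketched below: estimate how much detail a rendered tile contains and pick a coarser shading rate for it in the next frame. The gradient-energy metric, the thresholds, and the rate levels are assumptions chosen for illustration, not the thesis' exact frequency analysis.

```python
# Sketch of a Dynamic-Sampling-Rate-style decision: low-detail tiles are
# shaded at a reduced rate in the next frame.
import numpy as np


def choose_sampling_rate(tile_rgb: np.ndarray,
                         thresholds=(2.0, 8.0)) -> float:
    """tile_rgb: (H, W, 3) array of 8-bit color values (0-255) for this frame.
    Returns the samples-per-pixel rate to use for this tile next frame
    (1.0 = one sample per pixel, 0.25 = one sample per 2x2 pixel block)."""
    luma = tile_rgb @ np.array([0.299, 0.587, 0.114])
    gy, gx = np.gradient(luma)
    detail = float(np.mean(np.hypot(gx, gy)))  # mean gradient magnitude
    low, high = thresholds
    if detail < low:
        return 0.25   # very smooth tile: shade once per 2x2 pixels
    if detail < high:
        return 0.5    # moderate detail: shade once per pixel pair
    return 1.0        # high-frequency tile: keep full sampling rate
```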

    Design for scalability in 3D computer graphics architectures

    Reaaliaikaisten antialiasointimenetelmien vertailu virtuaalilaseilla (Comparison of real-time anti-aliasing methods on a virtual reality headset)

    Virtual reality and head-mounted devices have gained popularity in the past few years. Their increased field of view, combined with a display that sits close to the eyes, has increased the importance of anti-aliasing, i.e., softening the visible jagged edges that result from insufficient rendering resolution. This thesis reviews the elementary theory of real-time rendering, anti-aliasing, and virtual reality. Based on that theory and a review of recent studies, multisample anti-aliasing (MSAA), fast approximate anti-aliasing (FXAA), and temporal anti-aliasing (TAA) were implemented in a real-time deferred rendering engine and compared in terms of both subjective image quality and objective performance. Within the scope of this thesis, only each method's ability to prevent or lessen jagged edges and the flickering of small, detailed geometry is examined.

    Performance was measured on two different machines: the FXAA implementation was the fastest, with a 3% performance impact, and required the least memory; the TAA performance impact was 10-11%; and MSAA's impact ranged from 22% to 62% depending on the sample count. Each technique's ability to prevent or reduce aliasing was examined by measuring the visual quality and fatigue reported by participants. Each anti-aliasing method was presented in a 3D scene viewed through an Oculus Rift CV1. The results indicate that 4xMSAA and 2xMSAA had clearly the best visual quality and left participants the least fatigued. FXAA appeared visually inferior but did not cause significant fatigue. TAA appeared slightly blurry to most of the participants, which caused them to experience more fatigue. This study emphasizes the need to understand the human visual system when developing real-time graphics for virtual reality applications.
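    To illustrate why the MSAA cost scales with sample count while FXAA and TAA do not depend on one, the sketch below shows the resolve step that averages the N color samples stored per pixel; the array shapes and names are illustrative only, not the thesis implementation.

```python
# Minimal sketch of an MSAA resolve: every pixel keeps N samples that must be
# stored, moved, and averaged, which is why cost grows with the sample count.
import numpy as np


def resolve_msaa(samples: np.ndarray) -> np.ndarray:
    """samples: (H, W, N, 3) color samples per pixel (N = 2, 4, ...).
    Returns the (H, W, 3) anti-aliased image by box-filtering the samples."""
    return samples.mean(axis=2)
```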

    Hardware accelerated volume texturing.

    The emergence of volume graphics, a subfield of computer graphics, has been evident for the last 15 years. Growing out of scientific visualization problems, volume graphics has established itself as an important field in general computer graphics. However, the general graphics community still favours established surface graphics techniques, owing to well-founded, mature methods and a complete pipeline from software through to display hardware. This enables real-time applications to be constructed with ease and used by a wide range of end users, thanks to the readily available graphics hardware adopted by many computer manufacturers. Volume graphics, by contrast, has traditionally been restricted to high-end systems because of the complexity of rendering volume datasets: either specialised graphics hardware or powerful computers were required to generate images, many of them not in real time.

    Although there have been specialised hardware solutions to the volume rendering problem, the adoption of the volume dataset as a primitive relies on end users with commodity hardware being able to display images at interactive rates. The recent emergence of programmable consumer-level graphics hardware now allows these platforms to compute volume rendering at interactive rates, although most work in this field is directed towards scientific visualisation. The work in this thesis addresses the issues in providing real-time volume graphics techniques to the general graphics community using commodity graphics hardware. Real-time texturing of volumetric data is explored as an important set of techniques for delivering volume datasets as a general graphics primitive. The main contributions of this work are: the introduction of efficient acceleration techniques; interactive display of amorphous phenomena modelled outside an object defined in a volume dataset; interactive procedural texture synthesis for volume data; 2D texturing techniques and extensions for volume data in real time; and a flexible surface detail mapping algorithm that removes many previous restrictions. Parts of this work have been presented at the 4th International Workshop on Volume Graphics and published in Volume Graphics 2005.
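    The core primitive behind hardware volume texturing is a filtered lookup into a 3D texture. The sketch below shows a trilinear sample of a scalar volume in NumPy, purely as an illustration of that operation rather than code from the thesis.

```python
# Trilinear sampling of a 3D volume texture: the filtered lookup that
# consumer graphics hardware accelerates for volume texturing.
import numpy as np


def sample_volume(volume: np.ndarray, u: float, v: float, w: float) -> float:
    """volume: (D, H, W) scalar field; (u, v, w) in [0, 1]^3 map to (x, y, z).
    Returns the trilinearly filtered value at that texture coordinate."""
    D, H, W = volume.shape
    # Map normalised coordinates to voxel space and split into integer + fraction.
    x, y, z = u * (W - 1), v * (H - 1), w * (D - 1)
    x0, y0, z0 = int(x), int(y), int(z)
    x1, y1, z1 = min(x0 + 1, W - 1), min(y0 + 1, H - 1), min(z0 + 1, D - 1)
    fx, fy, fz = x - x0, y - y0, z - z0
    # Interpolate along x, then y, then z (8 voxel fetches per sample).
    c00 = volume[z0, y0, x0] * (1 - fx) + volume[z0, y0, x1] * fx
    c10 = volume[z0, y1, x0] * (1 - fx) + volume[z0, y1, x1] * fx
    c01 = volume[z1, y0, x0] * (1 - fx) + volume[z1, y0, x1] * fx
    c11 = volume[z1, y1, x0] * (1 - fx) + volume[z1, y1, x1] * fx
    c0 = c00 * (1 - fy) + c10 * fy
    c1 = c01 * (1 - fy) + c11 * fy
    return float(c0 * (1 - fz) + c1 * fz)
```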