191 research outputs found
Interactive global illumination on the CPU
Computing realistic physically-based global illumination in real-time remains one
of the major goals in the fields of rendering and visualisation; one that has not
yet been achieved due to its inherent computational complexity. This thesis focuses
on CPU-based interactive global illumination approaches with an aim to
develop generalisable hardware-agnostic algorithms. Interactive ray tracing relies
on spatial and cache coherency to achieve interactive rates, which conflicts
with the needs of global illumination solutions, which require a large number of incoherent
secondary rays to be computed. Methods that reduce the total number of
rays that need to be processed, such as selective rendering, were investigated to
determine how best they can be utilised.
The impact that selective rendering has on interactive ray tracing was analysed
and quantified and two novel global illumination algorithms were developed,
with the structured methodology used presented as a framework. Adaptive Interleaved
Sampling is a generalisable approach that combines interleaved sampling
with an adaptive approach, which uses efficient component-specific adaptive guidance
methods to drive the computation. Results of up to 11 frames per second
were demonstrated for multiple components including participating media. Temporal Instant Caching is a caching scheme for accelerating the computation of
diffuse interreflections to interactive rates. This approach achieved frame rates
exceeding 9 frames per second for the majority of scenes. Validation of the results
for both approaches showed little perceptual difference when comparing
against a gold-standard path-traced image. Further research into caching led to
the development of a new wait-free data access control mechanism for sharing the
irradiance cache among multiple rendering threads on a shared memory parallel
system. By not serialising accesses to the shared data structure, the irradiance
values were shared among all the threads without any overhead or contention
when reading and writing simultaneously. This new approach achieved efficiencies
between 77% and 92% for 8 threads when calculating static images and animations.
This work demonstrates that, due to the flexibility of the CPU, CPU-based
algorithms remain a valid and competitive choice for achieving global illumination
interactively, and an alternative to the generally brute-force GPU-centric
algorithms.
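The efficiency figures quoted in the abstract above can be related to speed-up through the usual definition, efficiency = speed-up / thread count. The following is a minimal sketch assuming that standard definition (the thesis may measure efficiency differently); the function names are illustrative and not taken from the thesis.

```python
def parallel_efficiency(speedup, num_threads):
    """Efficiency under the standard definition: speed-up divided by thread count."""
    if num_threads <= 0:
        raise ValueError("num_threads must be positive")
    return speedup / num_threads


def implied_speedup(efficiency, num_threads):
    """Invert the definition to recover the speed-up implied by an efficiency."""
    if num_threads <= 0:
        raise ValueError("num_threads must be positive")
    return efficiency * num_threads


# Efficiencies reported in the abstract for 8 threads.
low = implied_speedup(0.77, 8)
high = implied_speedup(0.92, 8)
```

Under this assumed definition, the reported range corresponds to speed-ups of roughly 6.2x to 7.4x on 8 threads.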
Workload distribution for ray tracing in multi-core systems
One of the features that made interactive ray tracing possible over the last few years was the careful exploitation of the computational power and parallelism available on modern multicore processors. Multithreaded interactive ray tracing engines have to share the workload (rays to be processed) among rendering threads. This may be achieved by storing tasks on a shared FIFO queue, accessed by all threads. Accessing this shared data structure requires a data access control mechanism, which ensures that the data structure is not corrupted. This access mechanism must incur minimal overheads so that performance is not penalized. This paper proposes a lock-free data access control mechanism for such a queue, which avoids all locks by carefully reordering instructions. This technique
is compared with a classical lock-based approach and with a conservative local technique, where each thread
maintains its local queue of tasks and shares nothing with other threads. Although the local approach outperforms
the other two due to very good load balancing conditions, we demonstrate that the lock-free approach outperforms
the lock-based one for large processor counts. Efficient and reliable sharing of data structures within a shared
memory system is becoming a very relevant problem with the advent of many-core processors. Lock-free approaches are a promising way of achieving this goal.
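Two of the three strategies compared in the abstract above, the shared FIFO queue and the per-thread local queue, can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: Python's `queue.Queue` is itself lock-based, so the shared variant corresponds to the classical lock-based approach, and the lock-free variant is not shown because it depends on low-level atomic instructions. The function names and the `shade` callback are hypothetical.

```python
import queue
import threading


def render_shared_queue(tasks, num_threads, shade):
    """All threads pull tasks from one shared FIFO queue (lock-based in CPython)."""
    work = queue.Queue()
    for task in tasks:
        work.put(task)
    results = {}
    results_lock = threading.Lock()

    def worker():
        while True:
            try:
                task = work.get_nowait()
            except queue.Empty:
                return  # no work left
            value = shade(task)
            with results_lock:
                results[task] = value

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


def render_local_queues(tasks, num_threads, shade):
    """Each thread owns a private slice of the tasks and shares nothing."""
    partial = [dict() for _ in range(num_threads)]

    def worker(index):
        # Static interleaved partition: thread i takes tasks i, i+n, i+2n, ...
        for task in tasks[index::num_threads]:
            partial[index][task] = shade(task)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    merged = {}
    for part in partial:
        merged.update(part)
    return merged
```

The local variant avoids all synchronisation during rendering but only balances well when tasks have similar cost, which matches the "very good load balancing conditions" the abstract mentions.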
High-fidelity rendering on shared computational resources
The generation of high-fidelity imagery is a computationally expensive process
and parallel computing has been traditionally employed to alleviate this cost.
However, traditional parallel rendering has been restricted to expensive shared
memory or dedicated distributed processors. In contrast, parallel computing on
shared resources, such as a computational or desktop grid, offers a low-cost alternative. However, prevalent rendering systems are currently incapable of seamlessly handling such shared resources, as these resources suffer from high latencies, restricted
bandwidth and volatility. A conventional approach of rescheduling failed jobs in
a volatile environment inhibits performance by using redundant computations.
Instead, clever task subdivision along with image reconstruction techniques provides an unrestrictive fault-tolerance mechanism, which is highly suitable for
high-fidelity rendering. This thesis presents novel fault-tolerant parallel rendering algorithms for effectively tapping the enormous inexpensive computational
power provided by shared resources.
A first-of-its-kind system for fully dynamic high-fidelity interactive rendering
on idle resources is presented, which is key to providing immediate feedback
on the changes made by a user. The system achieves interactivity by monitoring
and adapting computations according to run-time variations in the computational
power and employs a spatio-temporal image reconstruction technique for enhancing the visual fidelity. Furthermore, the algorithms described for time-constrained offline rendering of still images and animation sequences make it possible to deliver
the results within a user-defined time limit. These novel methods enable the employment
of variable resources in deadline-driven environments.
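The idea of combining task subdivision with image reconstruction for fault tolerance, described in the abstract above, can be illustrated with a toy sketch. It assumes a hypothetical interleaved pixel-to-task pattern and a simple neighbour-averaging reconstruction; neither is taken from the thesis, whose actual subdivision and spatio-temporal reconstruction are not specified here.

```python
def assign_task(x, y, num_tasks):
    """Hypothetical interleaving: spread each task's pixels across the image."""
    return (x + 2 * y) % num_tasks


def render_with_failures(width, height, num_tasks, failed, shade):
    """Render every pixel whose task survived; pixels of failed tasks stay None."""
    image = [[None] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if assign_task(x, y, num_tasks) not in failed:
                image[y][x] = shade(x, y)
    return image


def reconstruct(image):
    """Fill each missing pixel with the mean of its available 4-neighbours."""
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(height):
        for x in range(width):
            if image[y][x] is not None:
                continue
            neighbours = [
                image[ny][nx]
                for nx, ny in ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1))
                if 0 <= nx < width and 0 <= ny < height and image[ny][nx] is not None
            ]
            if neighbours:
                out[y][x] = sum(neighbours) / len(neighbours)
    return out
```

With four tasks and this pattern, the horizontal and vertical neighbours of any pixel belong to other tasks, so losing a single task never leaves a hole without surviving neighbours and no job needs to be rescheduled.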
High-fidelity graphics using unconventional distributed rendering approaches
High-fidelity rendering requires a substantial amount of computational resources to accurately simulate lighting in virtual environments. While desktop computing, with the aid of modern graphics hardware, has shown promise in delivering realistic rendering at interactive rates, real-time rendering of moderately complex scenes is still unachievable on the majority of desktop machines and the vast plethora of mobile computing devices that have recently become commonplace. This work provides a wide range of computing devices with high-fidelity rendering capabilities via oft-unused distributed computing paradigms. It speeds up the rendering process on formerly capable devices and provides full functionality to incapable devices. Novel scheduling and rendering algorithms have been designed to best take advantage of the characteristics of these systems and demonstrate the efficacy of such distributed methods. The first is a novel system that provides multiple clients with parallel resources for rendering a single task, and adapts in real-time to the number of concurrent requests. The second is a distributed algorithm for the remote asynchronous computation of the indirect diffuse component, which is merged with locally-computed direct lighting for a full global illumination solution. The third is a method for precomputing indirect lighting information for dynamically-generated multi-user environments by using the aggregated resources of the clients themselves. The fourth is a novel peer-to-peer system for improving the rendering performance in multi-user environments through the sharing of computation results, propagated via a mechanism based on epidemiology. The results demonstrate that the boundaries of the distributed computing typically used for computer graphics can be significantly and successfully expanded by adapting alternative distributed methods
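The epidemiology-based propagation mentioned for the fourth contribution above can be illustrated with a generic push-gossip simulation. This is a sketch of the general technique only, assuming each informed peer forwards a result to one uniformly random peer per round; the thesis's actual protocol is not described in the abstract, and `gossip_rounds` is a hypothetical name.

```python
import random


def gossip_rounds(num_peers, seed=0, max_rounds=1000):
    """Simulate push gossip; return (rounds used, number of informed peers)."""
    rng = random.Random(seed)
    informed = {0}  # peer 0 holds the freshly computed result
    rounds = 0
    while len(informed) < num_peers and rounds < max_rounds:
        # Every informed peer pushes the result to one random peer.
        targets = {rng.randrange(num_peers) for _ in informed}
        informed |= targets
        rounds += 1
    return rounds, len(informed)
```

Because the informed set can at most double per round, at least log2(n) rounds are needed to reach n peers; in practice push gossip needs only a small multiple of that.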
Software-Defined Infrastructure for IoT-based Energy Systems
Internet of Things (IoT) devices are becoming an essential part of our everyday lives. These physical devices are connected to the internet and can measure or control the environment around us. Further, IoT devices are increasingly being used to monitor buildings, farms, health, and transportation. As these connected devices become more pervasive, these devices will generate vast amounts of data that can be used to gain insights and build intelligence into the system. At the same time, large-scale deployment of these devices will raise new challenges in efficiently managing and controlling them.
In this thesis, I argue that IoT devices need programmability and need to provide software controls in order to be managed efficiently. Further, they will need data-driven modeling techniques to process and analyze vast amounts of data from heterogeneous devices and derive actionable insights. My thesis explores the problems posed by software-defined IoT energy infrastructure. I present four techniques that use systems and machine learning principles to design, analyze and deploy the next generation of smart IoT energy systems.
First, I discuss how current state-of-the-art LIDAR-based approaches in identifying ideal locations on rooftops for deploying energy systems such as solar do not scale to many regions of the world. To address the challenges, I propose DeepRoof, a data-driven approach that uses deep learning to estimate the solar potential of roofs using satellite imagery and identify ideal locations for installation. We evaluate our approach on different types of roof and show that our technique is comparable to LIDAR-based methods.
Second, I study how excessive solar can cause problems in the grid and examine how programmatic control of the solar output can prevent congestion in the electric grid. Further, I present a decentralized approach that can control the solar arrays in a grid-friendly manner. Also, my approach provides flexible control of solar output, and I show that such mechanisms allow for higher solar penetration in the grid.
Third, I discuss the challenges in community-owned (and shared) distributed energy resources that do not provide independent control to users. To address this, I propose vSolar, an approach to virtualize the solar arrays and energy storage that allows independent control. Further, I show how, using vSolar, users can exercise independent control, implement their custom energy sharing policies, and reduce energy costs through energy trading.
Finally, I present the challenges and the high-throughput requirements of enabling a peer-to-peer energy trading platform using permissioned blockchains. I propose FabricPlus, an enhanced Hyperledger Fabric blockchain that contains a series of optimizations to enable high-throughput transactions. FabricPlus increases the transaction throughput manyfold without requiring any changes to its external interfaces. I also show considerable performance improvement over the baseline Fabric.
Ray Tracing Gems
This book is a must-have for anyone serious about rendering in real time. With the announcement of new ray tracing APIs and hardware to support them, developers can easily create real-time applications with ray tracing as a core component. As ray tracing on the GPU becomes faster, it will play a more central role in real-time rendering. Ray Tracing Gems provides key building blocks for developers of games, architectural applications, visualizations, and more. Experts in rendering share their knowledge by explaining everything from nitty-gritty techniques that will improve any ray tracer to mastery of the new capabilities of current and future hardware.
What you'll learn:
The latest ray tracing techniques for developing real-time applications in multiple domains
Guidance, advice, and best practices for rendering applications with Microsoft DirectX Raytracing (DXR)
How to implement high-performance graphics for interactive visualizations, games, simulations, and more
Who this book is for:
Developers who are looking to leverage the latest APIs and GPU technology for real-time rendering and ray tracing
Students looking to learn about best practices in these areas
Enthusiasts who want to understand and experiment with their new GPU
Analysis and parallel implementation of an individually based algae model
The focus of this research is the analysis and parallel implementation of an individually based algae population model. Analysis included examination of the sensitivity of the population's dynamics to increases of 25% and 50% and decreases of 25% and 50% of 58 of the model's parameters from a reference set of values under three different nutrient-limiting conditions. Results indicate that the most sensitive parameters are those either directly or indirectly associated with the construction of protein. Analysis of the model also included examination of the influence of fluctuating temperatures on uptake of nutrients. Results indicate that while external nutrients are abundant, temperature influences the system, but when external nutrients become limiting, temperature effects diminish.
Parallel implementation included analysis of a pre-existing algae code in order to identify avenues for parallelization. To accommodate the identified parallelization avenues, the original code was restructured and subsequently parallelized. Results from the parallel model were then compared with results from the sequential model to determine accuracy, and speed-up issues were addressed. It was determined that the parallel model, in its current form, offers no advantage over the sequential model.
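The sensitivity analysis described in the abstract above, perturbing each parameter by plus or minus 25% and 50% from a reference set, is a one-at-a-time scheme that can be sketched as follows. The Monod-style growth function and its three parameters are a hypothetical stand-in for the algae model, which has 58 analysed parameters and is not reproduced here.

```python
def monod_growth(params):
    """Hypothetical toy model: Monod-style nutrient-limited growth rate."""
    return (
        params["max_rate"]
        * params["nutrient"]
        / (params["half_saturation"] + params["nutrient"])
    )


# Decreases and increases of 25% and 50%, as in the abstract.
FACTORS = (0.5, 0.75, 1.25, 1.5)


def sensitivity(model, reference, factors=FACTORS):
    """Relative change in model output when one parameter at a time is scaled."""
    base = model(reference)
    table = {}
    for name in reference:
        row = {}
        for factor in factors:
            perturbed = dict(reference)
            perturbed[name] = reference[name] * factor
            row[factor] = (model(perturbed) - base) / base
        table[name] = row
    return table


reference = {"max_rate": 2.0, "half_saturation": 1.0, "nutrient": 1.0}
table = sensitivity(monod_growth, reference)
```

Parameters whose rows show the largest relative changes are the most sensitive; in this toy model the output scales linearly with `max_rate`, so a 25% increase there yields a 25% change in output.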
Practical photon mapping in hardware
Photon mapping is a popular global illumination algorithm that can reproduce a wide range of visual effects including indirect illumination, color bleeding and caustics on complex diffuse, glossy, and specular surfaces modeled using arbitrary geometric primitives. However, the large amount of computation and the tremendous memory bandwidth required, on the order of terabytes per second, make photon mapping prohibitively expensive for interactive applications. In this dissertation I present three techniques that work together to reduce the bandwidth requirements of photon mapping by over an order of magnitude. These are combined in a hardware architecture that can provide interactive performance on moderately-sized indirectly-illuminated scenes using a pre-computed photon map.
1. The computations of the naive photon map algorithm are efficiently reordered, generating exactly the same image, but with an order of magnitude less bandwidth due to an easily cacheable sequence of memory accesses.
2. The irradiance caching algorithm is modified to allow fine-grain parallel execution by removing the sequential dependency between pixels. The bandwidth requirements of scenes with diffuse surfaces and low geometric complexity are reduced by an additional 40% or more.
3. Generating final gather rays in proportion to both the incident radiance and the reflectance functions requires fewer final gather rays for images of the same quality. Combined Importance Sampling is simple to implement, cheap to compute, compatible with query reordering, and can reduce bandwidth requirements by an order of magnitude.
Functional simulation of a practical and scalable hardware architecture based on these three techniques shows that an implementation that would fit within a host workstation will achieve interactive rates. This architecture is therefore a candidate for the next generation of graphics hardware.
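The third technique above samples final gather rays in proportion to both incident radiance and reflectance. A discrete toy version shows why that helps: when the sampling probability is exactly proportional to the product, every sample contributes the same value and the estimator has no variance. This sketch assumes a handful of direction bins with known radiance and reflectance values; it is not the dissertation's implementation, and `estimate_reflected` is a hypothetical name.

```python
import random


def estimate_reflected(radiance, reflectance, num_samples, rng, combined=True):
    """Monte Carlo estimate of sum_i radiance[i] * reflectance[i] over direction bins."""
    n = len(radiance)
    product = [l * f for l, f in zip(radiance, reflectance)]
    if combined:
        total = sum(product)
        if total == 0.0:
            return 0.0  # nothing is reflected from any bin
        # Sample bins in proportion to radiance times reflectance.
        pdf = [p / total for p in product]
    else:
        # Baseline: uniform sampling over bins.
        pdf = [1.0 / n] * n
    picks = rng.choices(range(n), weights=pdf, k=num_samples)
    # Standard importance-sampling estimator: average of f(x) / p(x).
    return sum(product[i] / pdf[i] for i in picks) / num_samples
```

In a real renderer the incident radiance is only known approximately (for example from the photon map), so the sampling density is proportional to an estimate of the product and some variance remains.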
Efficient Methods for Computational Light Transport
In this thesis we present contributions to different challenges of computational light transport. Light transport algorithms are present in many modern applications, from image generation for visual effects to real-time object detection. Light is a rich source of information that allows us to understand and represent our surroundings, but obtaining and processing this information presents many challenges due to its complex interactions with matter. This thesis provides advances in this subject from two different perspectives: steady-state algorithms, where the speed of light is assumed infinite, and transient-state algorithms, which deal with light as it travels not only through space but also through time. Our steady-state contributions address problems in both offline and real-time rendering. We target variance reduction in offline rendering by proposing a new efficient method for participating media rendering. In real-time rendering, we target the energy constraints of mobile devices by proposing a power-efficient rendering framework for real-time graphics applications. In the transient state, we first formalize light transport simulation in this domain, and present new efficient sampling methods and algorithms for transient rendering. Finally, we demonstrate the potential of simulated data to correct multipath interference in Time-of-Flight cameras, one of the pathological problems in transient imaging.
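The distinction between steady-state and transient light transport drawn in the abstract above can be made concrete with a small sketch: a transient renderer bins each light path's contribution by its arrival time, path length divided by the speed of light, while a steady-state renderer simply sums all contributions. The path list and bin layout are illustrative assumptions and not the thesis's method.

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second, in vacuum


def transient_histogram(paths, bin_width_s, num_bins):
    """Accumulate (path_length_m, radiance) pairs into time-of-arrival bins."""
    histogram = [0.0] * num_bins
    for length_m, radiance in paths:
        arrival_s = length_m / SPEED_OF_LIGHT
        index = int(arrival_s / bin_width_s)
        if 0 <= index < num_bins:  # paths arriving after the last bin are dropped
            histogram[index] += radiance
    return histogram


def steady_state(paths):
    """Steady-state radiance ignores arrival time and sums every contribution."""
    return sum(radiance for _, radiance in paths)
```

Summing the transient histogram over all bins recovers the steady-state value whenever no path falls outside the recorded time range.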