High-fidelity rendering on shared computational resources
The generation of high-fidelity imagery is computationally expensive, and parallel computing has traditionally been employed to alleviate this cost. However, traditional parallel rendering has been restricted to expensive shared-memory or dedicated distributed processors. In contrast, parallel computing on shared resources, such as a computational or desktop grid, offers a low-cost alternative. Prevalent rendering systems, however, cannot currently handle such shared resources seamlessly, because these resources suffer from high latencies, restricted bandwidth and volatility. The conventional approach of rescheduling failed jobs in a volatile environment inhibits performance through redundant computation. Instead, clever task subdivision combined with image reconstruction techniques provides an unrestrictive fault-tolerance mechanism that is highly suitable for high-fidelity rendering. This thesis presents novel fault-tolerant parallel rendering algorithms for effectively tapping the enormous, inexpensive computational power provided by shared resources.
A first-of-its-kind system for fully dynamic, high-fidelity interactive rendering on idle resources is presented, which is key to providing immediate feedback on changes made by a user. The system achieves interactivity by monitoring run-time variations in the available computational power and adapting its computations accordingly, and it employs a spatio-temporal image reconstruction technique to enhance visual fidelity. Furthermore, the algorithms described for time-constrained offline rendering of still images and animation sequences make it possible to deliver results within a user-defined time limit. These novel methods enable the use of variable resources in deadline-driven environments.
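As a rough illustration of how task subdivision and image reconstruction can stand in for rescheduling, the Python sketch below interleaves pixels across four tasks so that a lost task leaves only isolated holes, then fills each hole from its rendered neighbours. The 2x2 interleaving pattern, the neighbour-averaging filter, and the names `render_pixel`, `task_of`, `render_with_failures` and `reconstruct` are assumptions made for this example; the abstract does not specify the subdivision or reconstruction scheme used in the thesis.

```python
def render_pixel(x, y):
    """Hypothetical stand-in for an expensive shading computation."""
    return float(x + y)


def task_of(x, y):
    """Interleave pixels over 4 tasks so 4-neighbours never share a task."""
    return (x % 2) + 2 * (y % 2)


def render_with_failures(width, height, failed_tasks):
    """Render every pixel whose task survived; leave the rest as None."""
    image = [[None] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if task_of(x, y) not in failed_tasks:
                image[y][x] = render_pixel(x, y)
    return image


def reconstruct(image):
    """Fill each missing pixel with the mean of its rendered 4-neighbours."""
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(height):
        for x in range(width):
            if image[y][x] is not None:
                continue
            neighbours = [
                image[ny][nx]
                for nx, ny in ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1))
                if 0 <= nx < width and 0 <= ny < height
                and image[ny][nx] is not None
            ]
            if neighbours:  # stays None if no neighbour was rendered
                out[y][x] = sum(neighbours) / len(neighbours)
    return out


if __name__ == "__main__":
    partial = render_with_failures(4, 4, failed_tasks={3})
    for row in reconstruct(partial):
        print(row)
```

Because neighbouring pixels belong to different tasks, a single failed task never removes a contiguous region, so no job has to be re-run to obtain a complete image. The price is an approximation error at the filled pixels.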
Practical photon mapping in hardware
Photon mapping is a popular global illumination algorithm that can reproduce a wide range of visual effects, including indirect illumination, color bleeding and caustics, on complex diffuse, glossy, and specular surfaces modeled using arbitrary geometric primitives. However, the large amount of computation and the tremendous memory bandwidth required (terabytes per second) make photon mapping prohibitively expensive for interactive applications. In this dissertation I present three techniques that work together to reduce the bandwidth requirements of photon mapping by over an order of magnitude. These are combined in a hardware architecture that can provide interactive performance on moderately sized, indirectly illuminated scenes using a pre-computed photon map.
1. The computations of the naive photon map algorithm are efficiently reordered, generating exactly the same image but with an order of magnitude less bandwidth, due to an easily cacheable sequence of memory accesses.
2. The irradiance caching algorithm is modified to allow fine-grain parallel execution by removing the sequential dependency between pixels. The bandwidth requirements of scenes with diffuse surfaces and low geometric complexity are reduced by an additional 40% or more.
3. Generating final gather rays in proportion to both the incident radiance and the reflectance functions requires fewer final gather rays for images of the same quality. Combined Importance Sampling is simple to implement, cheap to compute, compatible with query reordering, and can reduce bandwidth requirements by an order of magnitude.
Functional simulation of a practical and scalable hardware architecture based on these three techniques shows that an implementation that would fit within a host workstation will achieve interactive rates. This architecture is therefore a candidate for the next generation of graphics hardware.
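The idea behind sampling in proportion to both incident radiance and reflectance can be sketched on a discretised set of directions. The Python example below is a minimal illustration, not the dissertation's implementation: the direction bins, the per-bin radiance and reflectance values, and the function names `product_pdf`, `sample_bin` and `estimate` are all assumptions introduced here. It builds a probability distribution proportional to the product of the two quantities and uses it in a standard Monte Carlo estimator.

```python
import random


def product_pdf(radiance, reflectance):
    """Discrete pdf proportional to radiance[i] * reflectance[i]."""
    weights = [l * f for l, f in zip(radiance, reflectance)]
    total = sum(weights)
    if total == 0.0:
        # Nothing contributes: fall back to a uniform distribution.
        return [1.0 / len(weights)] * len(weights)
    return [w / total for w in weights]


def sample_bin(pdf, u):
    """Pick a bin by inverting the cumulative distribution at u in [0, 1)."""
    cumulative = 0.0
    for i, p in enumerate(pdf):
        cumulative += p
        if u < cumulative:
            return i
    # Guard against rounding: return the last bin with non-zero probability.
    return max(i for i, p in enumerate(pdf) if p > 0.0)


def estimate(radiance, reflectance, pdf, n_samples, rng):
    """Monte Carlo estimate of sum_i radiance[i] * reflectance[i]."""
    total = 0.0
    for _ in range(n_samples):
        i = sample_bin(pdf, rng.random())
        total += radiance[i] * reflectance[i] / pdf[i]
    return total / n_samples


if __name__ == "__main__":
    radiance = [0.0, 4.0, 1.0, 0.5]      # hypothetical incident radiance per bin
    reflectance = [0.9, 0.1, 0.8, 0.2]   # hypothetical reflectance per bin
    rng = random.Random(1)
    uniform = [0.25] * 4
    combined = product_pdf(radiance, reflectance)
    print("uniform sampling :", estimate(radiance, reflectance, uniform, 16, rng))
    print("combined sampling:", estimate(radiance, reflectance, combined, 16, rng))
```

When the distribution is exactly proportional to the product, every sample contributes the same value, so the estimate is exact regardless of the sample count. In practice the radiance is only known approximately (for example from the photon map), so the variance is reduced rather than eliminated, which is consistent with the claim that fewer final gather rays are needed for the same image quality.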
Software-Defined Infrastructure for IoT-based Energy Systems
Internet of Things (IoT) devices are becoming an essential part of our everyday lives. These physical devices are connected to the internet and can measure or control the environment around us. Further, IoT devices are increasingly being used to monitor buildings, farms, health, and transportation. As these connected devices become more pervasive, they will generate vast amounts of data that can be used to gain insights and build intelligence into the system. At the same time, large-scale deployment of these devices will raise new challenges in efficiently managing and controlling them.
In this thesis, I argue that IoT devices need programmability and software controls in order to be managed efficiently. Further, they will need data-driven modeling techniques to process and analyze vast amounts of data from heterogeneous devices and derive actionable insights. My thesis explores the problems posed by software-defined IoT energy infrastructure. I present four techniques that use systems and machine learning principles to design, analyze and deploy the next generation of smart IoT energy systems.
First, I discuss how current state-of-the-art LIDAR-based approaches to identifying ideal rooftop locations for deploying energy systems such as solar do not scale to many regions of the world. To address this challenge, I propose DeepRoof, a data-driven approach that uses deep learning to estimate the solar potential of roofs from satellite imagery and identify ideal locations for installation. We evaluate our approach on different types of roof and show that our technique is comparable to LIDAR-based methods.
Second, I study how excess solar generation can cause problems in the grid and examine how programmatic control of solar output can prevent congestion in the electric grid. Further, I present a decentralized approach that controls solar arrays in a grid-friendly manner. My approach also provides flexible control of solar output, and I show that such mechanisms allow for higher solar penetration in the grid.
Third, I discuss the challenges of community-owned (and shared) distributed energy resources that do not provide independent control to users. To address this, I propose vSolar, an approach that virtualizes solar arrays and energy storage to allow independent control. Further, I show how, using vSolar, users can exercise independent control, implement their own custom energy sharing policies, and reduce energy costs through energy trading.
Finally, I present the challenges and the high-throughput requirements of enabling a peer-to-peer energy trading platform using permissioned blockchains. I propose FabricPlus, an enhanced Hyperledger Fabric blockchain that contains a series of optimizations to enable high-throughput transactions. FabricPlus increases the transaction throughput manyfold without requiring any changes to its external interfaces. I also show considerable performance improvement over the baseline Fabric.
Hardware acceleration of photon mapping
PhD Thesis
The quest for realism in computer-generated graphics has yielded a range of algorithmic
techniques, the most advanced of which are capable of rendering images at close to photorealistic
quality. Due to the realism available, it is now commonplace that computer graphics are used in
the creation of movie sequences, architectural renderings, medical imagery and product
visualisations.
This work concentrates on the photon mapping algorithm [1, 2], a physically based global
illumination rendering algorithm. Photon mapping excels in producing highly realistic, physically
accurate images.
A drawback of photon mapping, however, is its rendering time, which can be significantly longer
than that of other, albeit less realistic, algorithms. Not surprisingly, this increase in execution time is
associated with a high computational cost. This computation is usually performed using the
general purpose central processing unit (CPU) of a personal computer (PC), with the algorithm
implemented as a software routine. Other options available for processing these algorithms
include desktop PC graphics processing units (GPUs) and custom designed acceleration hardware
devices.
GPUs tend to be efficient when dealing with less realistic rendering solutions such as rasterisation;
however, with their recent drive towards increased programmability, they can also be used to
process more realistic algorithms. A drawback to the use of GPUs is that these algorithms often
have to be reworked to make optimal use of the limited resources available.
There are very few custom hardware devices available for acceleration of the photon mapping
algorithm. Ray-tracing is the predecessor to photon mapping, and although not capable of
producing the same physical accuracy and therefore realism, there are similarities between the
algorithms. There have been several hardware prototypes, and at least one commercial offering,
created with the goal of accelerating ray-trace rendering [3]. However, properties making many of
these proposals suitable for the acceleration of ray-tracing are not shared by photon mapping.
There are even fewer proposals for acceleration of the additional functions found only in photon
mapping.
All of these approaches to algorithm acceleration offer limited scalability. GPUs are inherently
difficult to scale, while many of the custom hardware devices available thus far make use of large
processing elements and complex acceleration data structures.
In this work we make use of three novel approaches in the design of highly scalable specialised
hardware structures for the acceleration of the photon mapping algorithm. Increased scalability is
gained through:
• The use of a brute-force approach in place of the commonly used smart approach, thus
eliminating much data pre-processing, complex data structures and large processing units
often required.
• The use of Logarithmic Number System (LNS) arithmetic computation, which facilitates a
reduction in processing area requirement.
• A novel redesign of the photon inclusion test, used within the photon search method of
the photon mapping algorithm. This allows an intelligent memory structure to be used for
the search.
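To make the first two points concrete, the Python sketch below shows (a) why Logarithmic Number System arithmetic can shrink processing hardware, since multiplication and division of log-encoded values reduce to addition and subtraction, and (b) a textbook brute-force photon search that tests every photon against the query radius with no acceleration structure. This is an illustration under assumptions, not the thesis design: the sketch handles positive values only, uses floating-point logarithms rather than fixed-point hardware, and the function names are invented here. The redesigned photon inclusion test mentioned in the third point is not described in the abstract, so only the standard squared-distance test is shown.

```python
import math


# --- Logarithmic Number System (positive values only, for brevity) ---

def to_lns(x):
    """Encode a positive real as its base-2 logarithm."""
    if x <= 0.0:
        raise ValueError("this sketch only encodes positive values")
    return math.log2(x)


def from_lns(e):
    """Decode a log-domain value back to a real number."""
    return 2.0 ** e


def lns_mul(a, b):
    """Multiplication becomes a single addition in the log domain."""
    return a + b


def lns_div(a, b):
    """Division becomes a single subtraction in the log domain."""
    return a - b


def lns_add(a, b):
    """Addition needs the correction term log2(1 + 2^(lo - hi)).

    Hardware typically approximates this term with a lookup table;
    here it is simply computed directly.
    """
    hi, lo = max(a, b), min(a, b)
    return hi + math.log2(1.0 + 2.0 ** (lo - hi))


# --- Brute-force photon search (no kd-tree or other spatial structure) ---

def photons_in_radius(photons, query, radius):
    """Return every photon whose squared distance to query is <= radius^2."""
    r2 = radius * radius
    found = []
    for p in photons:
        d2 = sum((pc - qc) ** 2 for pc, qc in zip(p, query))
        if d2 <= r2:  # standard inclusion test, applied to every photon
            found.append(p)
    return found


if __name__ == "__main__":
    print(from_lns(lns_mul(to_lns(3.0), to_lns(5.0))))  # approximately 15
    print(from_lns(lns_add(to_lns(3.0), to_lns(5.0))))  # approximately 8
    photons = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
    print(photons_in_radius(photons, (0.0, 0.0, 0.0), 1.0))
```

The linear scan does more work per query than a tree traversal, but every photon is processed identically and independently, which is the kind of regularity that maps well onto many small, replicated hardware units.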
The design uses two hardware structures, each of which accelerates one core rendering function.
Renderings produced using field programmable gate array (FPGA) based prototypes are presented,
along with details of 90 nm synthesised versions of the designs which show that close to an
order-of-magnitude speedup over a software implementation is possible. Due to the scalable nature of
the design, it is likely that any advantage can be maintained in the face of improving processor
speeds.
Significantly, due to the brute-force approach adopted, it is possible to eliminate an often-used
software acceleration method. This means that the device can interface almost directly to a front-end
modelling package, minimising much of the pre-processing required by most other proposals.