    Hardware acceleration of photon mapping

    PhD Thesis. The quest for realism in computer-generated graphics has yielded a range of algorithmic techniques, the most advanced of which can render images at close to photorealistic quality. Because of the realism available, computer graphics are now commonplace in the creation of movie sequences, architectural renderings, medical imagery and product visualisations. This work concentrates on the photon mapping algorithm [1, 2], a physically based global illumination rendering algorithm. Photon mapping excels at producing highly realistic, physically accurate images. A drawback of photon mapping, however, is its rendering time, which can be significantly longer than that of other, albeit less realistic, algorithms. This long execution time reflects the algorithm's high computational cost. The computation is usually performed on the general-purpose central processing unit (CPU) of a personal computer (PC), with the algorithm implemented as a software routine. Other options for processing these algorithms include desktop PC graphics processing units (GPUs) and custom-designed acceleration hardware. GPUs tend to be efficient at less realistic rendering solutions such as rasterisation, but with their recent drive towards increased programmability they can also be used to process more realistic algorithms. A drawback to using GPUs is that these algorithms often have to be reworked to make optimal use of the limited resources available. Very few custom hardware devices exist for accelerating the photon mapping algorithm. Ray tracing is the predecessor of photon mapping, and although it cannot produce the same physical accuracy and therefore realism, the two algorithms share similarities. Several hardware prototypes, and at least one commercial offering, have been created with the goal of accelerating ray-traced rendering [3]. However, the properties that make many of these proposals suitable for accelerating ray tracing are not shared by photon mapping, and there are even fewer proposals for accelerating the functions found only in photon mapping. All of these approaches to algorithm acceleration offer limited scalability: GPUs are inherently difficult to scale, while many of the custom hardware devices proposed thus far rely on large processing elements and complex acceleration data structures.
    In this work we use three novel approaches in the design of highly scalable, specialised hardware structures for the acceleration of the photon mapping algorithm. Increased scalability is gained through:
    • the use of a brute-force approach in place of the commonly used smart approach, eliminating much of the data pre-processing, the complex data structures and the large processing units otherwise required;
    • the use of Logarithmic Number System (LNS) arithmetic, which reduces the processing area requirement;
    • a novel redesign of the photon inclusion test used within the photon search method of the photon mapping algorithm, which allows an intelligent memory structure to be used for the search (a conventional form of this test is sketched after this abstract).
    The design uses two hardware structures, both of which accelerate one core rendering function. Renderings produced using field programmable gate array (FPGA) based prototypes are presented, along with details of 90 nm synthesised versions of the designs, which show that close to an order-of-magnitude speedup over a software implementation is possible. Due to the scalable nature of the design, it is likely that this advantage can be maintained in the face of improving processor speeds. Significantly, the brute-force approach makes it possible to eliminate an often-used software acceleration method, so the device can interface almost directly to a front-end modelling package, minimising much of the pre-processing required by most other proposals.
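    Editorial note: the abstract does not spell out the redesigned inclusion test, so the C++ sketch below only illustrates the conventional test it replaces, a brute-force gather that accepts a photon lying within the search radius of the shading point. Names and types are illustrative, not taken from the thesis; in the hardware described above this comparison would be replicated across many parallel units, and LNS arithmetic would turn its multiplications into additions, which is one source of the reported area savings.

        #include <cstddef>
        #include <vector>

        // Hypothetical photon record: position only, which is all the inclusion
        // test needs. A real photon map also stores power and incident direction.
        struct Photon { float x, y, z; };

        // Conventional photon inclusion test: a photon contributes to the radiance
        // estimate at the shading point (px, py, pz) if it lies within the search
        // radius r. Comparing squared distances avoids a square root.
        bool photon_included(const Photon& ph, float px, float py, float pz, float r)
        {
            const float dx = ph.x - px;
            const float dy = ph.y - py;
            const float dz = ph.z - pz;
            return dx * dx + dy * dy + dz * dz <= r * r;
        }

        // Brute-force search: test every photon against the query point. No kd-tree
        // or other acceleration structure is needed, which is what makes the
        // approach attractive for simple, massively replicated hardware units.
        std::vector<std::size_t> gather_photons(const std::vector<Photon>& map,
                                                float px, float py, float pz, float r)
        {
            std::vector<std::size_t> hits;
            for (std::size_t i = 0; i < map.size(); ++i)
                if (photon_included(map[i], px, py, pz, r))
                    hits.push_back(i);
            return hits;
        }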

    Sparse Volumetric Deformation

    Volume rendering is becoming increasingly popular as applications require realistic solid shape representations with seamless texture mapping and accurate filtering. However, rendering sparse volumetric data is difficult because of the limited memory and processing capabilities of current hardware. To address these limitations, the volumetric information can be stored at progressive resolutions in the hierarchical branches of a tree structure and sampled according to the region of interest. Only a partial region of the full dataset is then processed, so massive volumetric scenes can be rendered efficiently.
    The problem with this approach is that it currently supports only static scenes, because it is difficult to accurately deform massive numbers of volume elements and reconstruct the scene hierarchy in real time. A further problem is that deformation distorts the shape where more than one volume element tries to occupy the same location, and gaps appear where deformation stretches the elements apart by more than one discrete location. It is also challenging to efficiently support sophisticated deformations at hierarchical resolutions, such as character skinning or physically based animation. These types of deformation are expensive and require a control structure (for example a cage or skeleton) that maps to a set of features to accelerate the deformation process. The difficulty is that the varying volume hierarchy reflects different feature sizes, and manipulating the features at the original resolution is too expensive; the control structure must therefore also capture features hierarchically, according to the varying volumetric resolution.
    This thesis investigates deforming and rendering massive amounts of dynamic volumetric content. The proposed approach efficiently deforms hierarchical volume elements without introducing artifacts and supports both ray casting and rasterization renderers. This enables light transport to be modeled both accurately and efficiently, with applications in real-time rendering and computer animation. Sophisticated volumetric deformation, including character animation, is also supported in real time; this is achieved by automatically generating a control skeleton that is mapped to the varying feature resolution of the volume hierarchy. The output deformations are demonstrated in massive dynamic volumetric scenes.
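    Editorial note: the abstract describes storing volume data at progressive resolutions in a tree and sampling only the region of interest, but the thesis's actual data layout is not given here. The C++ sketch below shows the general idea only, assuming a standard sparse voxel octree: each node holds a filtered value for its region, and traversal stops at the requested level of detail or where the tree was never refined. All names are illustrative.

        #include <array>
        #include <cstdint>
        #include <memory>

        // Illustrative sparse voxel tree node: a filtered value for the whole
        // region, plus optional children that refine it at the next resolution.
        struct VoxelNode {
            float value = 0.0f;                              // e.g. filtered density
            std::array<std::unique_ptr<VoxelNode>, 8> child; // null = not refined
        };

        // Sample at integer voxel coordinates (x, y, z) inside a volume that is
        // 2^maxDepth voxels per side, descending at most lodDepth levels so only
        // the region of interest is traversed at fine resolution.
        float sample(const VoxelNode& root, uint32_t x, uint32_t y, uint32_t z,
                     uint32_t maxDepth, uint32_t lodDepth)
        {
            const VoxelNode* node = &root;
            for (uint32_t d = 0; d < lodDepth && d < maxDepth; ++d) {
                const uint32_t shift = maxDepth - 1 - d;     // octant bit at this level
                const uint32_t idx = ((x >> shift) & 1u)
                                   | (((y >> shift) & 1u) << 1)
                                   | (((z >> shift) & 1u) << 2);
                if (!node->child[idx])
                    break;                                   // coarser data is all we have
                node = node->child[idx].get();
            }
            return node->value;
        }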

    Generalized Neighbor Search using Commodity Hardware Acceleration

    Tree-based Nearest Neighbor Search (NNS) is hard to parallelize on GPUs. However, newer Nvidia GPUs are equipped with Ray Tracing (RT) cores that can build a spatial tree called a Bounding Volume Hierarchy (BVH) to accelerate graphics rendering. Recent work has proposed using RT cores to implement NNS, but all of these approaches are limited by a hardware-imposed constraint to a single distance metric: the Euclidean distance. We propose and implement two approaches for generalized distance computation, filter-refine and monotone transformation, each of which allows non-Euclidean nearest neighbor queries to be performed in terms of Euclidean distances. We find that our reductions lower the time spent on distance computations during the search, thereby improving the overall performance of the NNS.
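    Editorial note: the paper's own constructions are not detailed in this abstract. As an illustration of the monotone-transformation idea only, the C++ sketch below uses the standard cosine-similarity case: after normalising all vectors to unit length, Euclidean nearest neighbours coincide with cosine-similarity nearest neighbours, so a Euclidean-only backend (such as one built on RT-core BVH traversal) can answer cosine queries unchanged.

        #include <cmath>
        #include <cstddef>
        #include <vector>

        // For unit vectors u and v, ||u - v||^2 = 2 - 2 * cos(u, v), which is a
        // monotone (decreasing) function of the cosine similarity. Normalising the
        // data and the query therefore turns a cosine NNS query into a Euclidean one.
        std::vector<float> normalised(const std::vector<float>& v)
        {
            float norm2 = 0.0f;
            for (float x : v) norm2 += x * x;
            const float inv = 1.0f / std::sqrt(norm2);
            std::vector<float> u(v.size());
            for (std::size_t i = 0; i < v.size(); ++i) u[i] = v[i] * inv;
            return u;
        }

    The filter-refine strategy named in the abstract conventionally works the other way around: the cheap Euclidean search produces candidate neighbours, which are then re-checked under the true metric.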

    A graphics architecture for ray tracing and photon mapping

    Recently, methods were developed to render various global illumination effects on rasterization GPUs, among them hardware-based ray tracing and photon mapping. However, due to the inherent architectural limitations of current GPUs, the efficiency and throughput of these methods remained low. In this thesis, we propose a coherent rendering system that addresses these issues. First, we introduce new photon mapping and ray tracing acceleration algorithms that facilitate data coherence and spatial locality, and eliminate unnecessary random memory accesses. A high-level abstraction of the combined ray tracing and photon mapping streaming pipeline is introduced. Based on this abstraction, an efficient ray tracing and photon mapping GPU is designed. Using an event-driven simulator developed for this GPU, we verify and validate the proposed algorithms and architecture. Simulation results show better interactive performance compared to current GPUs.
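    Editorial note: the abstract names data coherence and spatial locality as goals but does not describe the algorithms, so the sketch below only illustrates one common way such locality is obtained in photon mapping: binning photons into a uniform grid and storing them sorted by cell, so a gather reads a few contiguous ranges instead of scattered records. The grid layout and names are illustrative assumptions, not this thesis's design.

        #include <cmath>
        #include <cstdint>
        #include <vector>

        struct Photon { float x, y, z; };

        // Photons stored sorted by grid cell; cellStart[c] .. cellStart[c + 1] is the
        // contiguous range of photons in cell c, so gathering over the cells that
        // overlap a search radius touches only contiguous memory.
        struct PhotonGrid {
            float cellSize;
            uint32_t res;                        // cells per axis over [0, res * cellSize)
            std::vector<Photon>  photons;        // sorted by cell index
            std::vector<uint32_t> cellStart;     // size res^3 + 1
        };

        static uint32_t cell_of(const PhotonGrid& g, const Photon& p)
        {
            const auto clampi = [&](float v) {
                int c = static_cast<int>(std::floor(v / g.cellSize));
                if (c < 0) c = 0;
                if (c >= static_cast<int>(g.res)) c = g.res - 1;
                return static_cast<uint32_t>(c);
            };
            return (clampi(p.z) * g.res + clampi(p.y)) * g.res + clampi(p.x);
        }

        // Counting-sort the photons by cell index (two passes over the data).
        PhotonGrid build_grid(const std::vector<Photon>& photons, float cellSize, uint32_t res)
        {
            PhotonGrid g{cellSize, res, {}, std::vector<uint32_t>(res * res * res + 1, 0)};
            for (const Photon& p : photons) ++g.cellStart[cell_of(g, p) + 1];
            for (std::size_t c = 1; c < g.cellStart.size(); ++c) g.cellStart[c] += g.cellStart[c - 1];
            g.photons.resize(photons.size());
            std::vector<uint32_t> cursor(g.cellStart.begin(), g.cellStart.end() - 1);
            for (const Photon& p : photons) g.photons[cursor[cell_of(g, p)]++] = p;
            return g;
        }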

    Hardware Accelerators for Animated Ray Tracing

    Future graphics processors are likely to incorporate hardware accelerators for real-time ray tracing, in order to render increasingly complex lighting effects in interactive applications. However, ray tracing poses difficulties when drawing scenes with dynamic content, such as animated characters and objects. In dynamic scenes, the spatial data structures used to accelerate ray tracing are invalidated on each animation frame and need to be rapidly updated. Tree update is a complex subtask in its own right, and becomes highly expensive in complex scenes. Both ray tracing and tree update are highly memory-intensive tasks, and rendering systems are increasingly bandwidth-limited, so research on accelerator hardware has focused on architectural techniques that optimize away off-chip memory traffic. Dynamic scene support is further complicated by the recent introduction of compressed trees, which use low-precision numbers for storage and computation. Such compression reduces both the arithmetic and memory bandwidth cost of ray tracing, but adds to the complexity of tree update.
    This thesis proposes methods to cope with dynamic scenes in hardware-accelerated ray tracing, with a focus on reducing traffic to external memory. Firstly, a hardware architecture is designed for linear bounding volume hierarchy construction, an algorithm which is a basic building block in most state-of-the-art software tree builders. The algorithm is rearranged into a streaming form which reduces memory traffic to one-third of that generated by software implementations of the same algorithm. Secondly, an algorithm is proposed for compressing bounding volume hierarchies in a streaming manner as they are output from a hardware builder, instead of performing compression as a post-processing pass. With the proposed method, compression therefore reduces the overall cost of tree update rather than increasing it. The last main contribution of this thesis is an evaluation of shallow bounding volume hierarchies, common in software ray tracing, for use in hardware pipelines; these are found to be more energy-efficient than binary hierarchies. The results in this thesis confirm that dynamic scene support may become a bottleneck in real-time ray tracing, and they advance the state of the art on tree update in terms of energy efficiency and of the complexity of scenes that can be handled in real time on resource-constrained platforms.
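    Editorial note: the abstract does not reproduce the builder's pipeline, but linear BVH construction conventionally starts by assigning each primitive a Morton code and sorting on it, so spatially nearby primitives become neighbours in memory and the tree can be emitted in a streaming fashion. The C++ sketch below shows that standard building block only; the 10-bit-per-axis key width and the [0, 1) quantisation range are illustrative assumptions.

        #include <cstdint>

        // Spread the lower 10 bits of x so that there are two zero bits between
        // consecutive data bits (standard bit-interleaving trick).
        static uint32_t expand_bits(uint32_t x)
        {
            x &= 0x3ffu;
            x = (x | (x << 16)) & 0x030000ffu;
            x = (x | (x << 8))  & 0x0300f00fu;
            x = (x | (x << 4))  & 0x030c30c3u;
            x = (x | (x << 2))  & 0x09249249u;
            return x;
        }

        // 30-bit Morton code for a point with coordinates in [0, 1). Sorting
        // primitives by this key is the first stage of a linear BVH build:
        // primitives that are close in space end up close in the sorted order.
        uint32_t morton3d(float x, float y, float z)
        {
            const auto quantise = [](float v) {
                if (v < 0.0f) v = 0.0f;
                if (v > 0.999999f) v = 0.999999f;
                return static_cast<uint32_t>(v * 1024.0f);
            };
            return (expand_bits(quantise(x)) << 2)
                 | (expand_bits(quantise(y)) << 1)
                 |  expand_bits(quantise(z));
        }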

    Towards Fully Dynamic Surface Illumination in Real-Time Rendering using Acceleration Data Structures

    The improvements in GPU hardware, including hardware-accelerated ray tracing, and the push for fully dynamic, realistic-looking video games have been driving more research into the use of ray tracing in real-time applications. The work described in this thesis covers multiple aspects, such as optimisations, adapting existing offline methods to real-time constraints, and adding effects that were hard to simulate without the new hardware, all working towards fully dynamic surface illumination rendering in real time.
    Our first main area of research concerns photon-based techniques, commonly used to render caustics. As many photons can be required for good coverage of the scene, an efficient approach for detecting which ones contribute to a pixel is essential. We improve that process by adapting and extending an existing acceleration data structure; if performance is paramount, we present an approximation which trades off some quality for a 2–3× improvement in rendering time. Tracing all the photons, especially when long paths are needed, had become the highest cost. As most paths do not change from frame to frame, we introduce a validation procedure allowing the reuse of as many as possible, even in the presence of dynamic lights and objects. Previous algorithms for associating pixels and photons do not robustly handle specular materials, so we designed an approach leveraging ray tracing hardware to allow caustics to be visible in mirrors or behind transparent objects.
    Our second research focus switches from a light-based perspective to a camera-based one, to improve the picking of light sources when shading: photon-based techniques are wonderful for caustics, but not as efficient for direct lighting estimation. When a scene has thousands of lights, only a handful can be evaluated at any given pixel due to time constraints. Current selection methods in video games are fast but introduce bias. By adapting an acceleration data structure from offline rendering that stochastically chooses a light source based on its importance, we provide unbiased direct lighting evaluation at about 30 fps. To support dynamic scenes, we organise it as a two-level system, making it possible to update only the parts containing moving lights, and to do so more efficiently.
    We worked on top of the new ray tracing hardware to handle lighting situations that previously proved too challenging, and presented optimisations relevant for future algorithms in that space. These contributions will help reduce some artistic constraints when designing new virtual scenes for real-time applications.
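    Editorial note: the abstract describes stochastically choosing a light according to its importance so that direct lighting stays unbiased, but the adapted offline data structure is not given here. The C++ sketch below shows only the underlying estimator on a flat array of weights (names and the weight source are illustrative): one light is picked in proportion to its weight and its contribution is divided by the returned probability. The two-level hierarchy described above would replace the linear scan so only a logarithmic number of weights is touched per pixel.

        #include <cstddef>
        #include <random>
        #include <vector>

        // Result of a stochastic light pick: the chosen light and the probability
        // with which it was chosen, needed to keep the shading estimate unbiased.
        struct LightSample { std::size_t index; float pdf; };

        // Pick one light with probability proportional to its weight (assumes at
        // least one weight is positive). The caller divides the light's contribution
        // by 'pdf' so that the expected value matches summing over all lights.
        LightSample pick_light(const std::vector<float>& weights, std::mt19937& rng)
        {
            float total = 0.0f;
            for (float w : weights) total += w;
            std::uniform_real_distribution<float> uni(0.0f, total);
            float u = uni(rng);
            for (std::size_t i = 0; i < weights.size(); ++i) {
                if (u < weights[i] || i + 1 == weights.size())
                    return {i, weights[i] / total};
                u -= weights[i];
            }
            return {0, weights[0] / total}; // not reached
        }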

    Faster Ray Tracing through Hierarchy Cut Code

    We propose a novel ray reordering technique that accelerates ray tracing by encoding and sorting rays prior to traversal. Instead of spatial coordinates, our method encodes rays according to cuts of the hierarchical acceleration structure; we call this encoding the hierarchy cut code. This approach adapts better to the acceleration structure and produces a more reliable encoding. We also propose a compression scheme that decreases the sorting overhead by using a shorter sorting key. In addition, based on the phenomenon of boundary drift, we explain theoretically why existing reordering methods cannot achieve better performance simply by using longer sorting keys. Experiments demonstrate that our method accelerates secondary-ray tracing by up to 1.81×, outperforming existing methods. These results confirm the effectiveness of the hierarchy cut code and indicate that reordering techniques can achieve greater performance improvements, which warrants further research.
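    Editorial note: the abstract does not define the hierarchy cut code itself, so the C++ sketch below shows only the generic ray-reordering skeleton it plugs into: compute a sorting key per ray, sort the ray indices by key, and traverse the rays in that order so nearby rays reuse the same parts of the acceleration structure. The Ray layout and key signature are assumptions; the paper's contribution is the key function, whereas earlier methods derive it from ray origins and directions.

        #include <algorithm>
        #include <cstddef>
        #include <cstdint>
        #include <numeric>
        #include <vector>

        struct Ray { float ox, oy, oz, dx, dy, dz; };

        // Reorder rays for coherent traversal: 'key' maps each ray to a sorting key
        // (for this paper, the hierarchy cut code; for older schemes, a code built
        // from origin and direction). Returns the ray indices in traversal order.
        std::vector<uint32_t> reorder(const std::vector<Ray>& rays,
                                      uint32_t (*key)(const Ray&))
        {
            std::vector<uint32_t> keys(rays.size());
            for (std::size_t i = 0; i < rays.size(); ++i) keys[i] = key(rays[i]);

            std::vector<uint32_t> order(rays.size());
            std::iota(order.begin(), order.end(), 0u);
            std::sort(order.begin(), order.end(),
                      [&](uint32_t a, uint32_t b) { return keys[a] < keys[b]; });
            return order;
        }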