Decoupled Sampling for Real-Time Graphics Pipelines
We propose decoupled sampling, an approach that decouples shading from visibility sampling in order to enable motion blur and depth-of-field at reduced cost. More generally, it enables extensions of modern real-time graphics pipelines that provide controllable shading rates to trade off quality for performance. It can be thought of as a generalization of GPU-style multisample antialiasing (MSAA) to support unpredictable shading rates, with arbitrary mappings from visibility to shading samples as introduced by motion blur, depth-of-field, and adaptive shading. It is inspired by the Reyes architecture in offline rendering, but targets real-time pipelines by driving shading from visibility samples as in GPUs, and removes the need for micropolygon dicing or rasterization. Decoupled sampling works by defining a many-to-one hash from visibility to shading samples, and using a buffer to memoize shading samples and exploit reuse across visibility samples. We present extensions of two modern GPU pipelines to support decoupled sampling: a GPU-style sort-last fragment architecture, and a Larrabee-style sort-middle pipeline. We study the architectural implications and derive end-to-end performance estimates on real applications through an instrumented functional simulator. We demonstrate high-quality motion blur and depth-of-field, as well as variable and adaptive shading rates.
Decoupled Sampling for Graphics Pipelines
We propose a generalized approach to decoupling shading from visibility sampling in graphics pipelines, which we call decoupled sampling. Decoupled sampling enables stochastic supersampling of motion and defocus blur at reduced shading cost, as well as controllable or adaptive shading rates which trade off shading quality for performance. It can be thought of as a generalization of multisample antialiasing (MSAA) to support complex and dynamic mappings from visibility to shading samples, as introduced by motion and defocus blur and adaptive shading. It works by defining a many-to-one hash from visibility to shading samples, and using a buffer to memoize shading samples and exploit reuse across visibility samples. Decoupled sampling is inspired by the Reyes rendering architecture, but like traditional graphics pipelines, it shades fragments rather than micropolygon vertices, decoupling shading from the geometry sampling rate. Also unlike Reyes, decoupled sampling only shades fragments after precise computation of visibility, reducing overshading.
We present extensions of two modern graphics pipelines to support decoupled sampling: a GPU-style sort-last fragment architecture, and a Larrabee-style sort-middle pipeline. We study the architectural implications of decoupled sampling and blur, and derive end-to-end performance estimates on real applications through an instrumented functional simulator. We demonstrate high-quality motion and defocus blur, as well as variable and adaptive shading rates.
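The memoization idea in the abstract above (a many-to-one hash from visibility samples to shading samples, backed by a buffer that lets visibility samples reuse shading results) can be illustrated with a minimal sketch. This is not the authors' implementation: the key layout, the `shading_rate` parameter, the unbounded dictionary used as the buffer, and every function name are assumptions chosen for illustration.

```python
import math
from typing import Callable, Dict, Tuple

# Hypothetical shading-sample key: (primitive id, shading-grid x, shading-grid y).
ShadingKey = Tuple[int, int, int]
Color = Tuple[float, float, float]


def shading_key(prim_id: int, u: float, v: float, shading_rate: int) -> ShadingKey:
    """Many-to-one map: quantize shading-space coordinates to a grid cell.

    Every visibility sample that lands in the same cell of the same
    primitive maps to the same shading sample.
    """
    return (prim_id, math.floor(u * shading_rate), math.floor(v * shading_rate))


class MemoizedShader:
    """Shades each shading sample at most once and reuses the result."""

    def __init__(self, shade: Callable[[ShadingKey], Color]) -> None:
        self.shade = shade
        self.buffer: Dict[ShadingKey, Color] = {}  # assumed unbounded here
        self.invocations = 0

    def lookup(self, prim_id: int, u: float, v: float, shading_rate: int) -> Color:
        key = shading_key(prim_id, u, v, shading_rate)
        if key not in self.buffer:
            # Miss: run the (expensive) shader once for this shading sample.
            self.buffer[key] = self.shade(key)
            self.invocations += 1
        return self.buffer[key]


def demo() -> Tuple[int, int]:
    """Resolve 16 visibility samples on one primitive at a shading rate of 4."""
    shader = MemoizedShader(lambda key: (key[1] / 4.0, key[2] / 4.0, 0.0))
    samples = [(i / 16.0, 0.1) for i in range(16)]
    for u, v in samples:
        shader.lookup(prim_id=0, u=u, v=v, shading_rate=4)
    return len(samples), shader.invocations
```

In this toy setup, `demo()` resolves 16 visibility samples with only 4 shader invocations, because groups of four samples hash to the same shading-grid cell. A hardware design would presumably use a fixed-size buffer with an eviction policy rather than an unbounded dictionary.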
A survey of real-time crowd rendering
In this survey we review, classify and compare existing approaches for real-time crowd rendering. We first give an overview of character animation techniques, as they are closely tied to crowd rendering performance, and then we analyze the state of the art in crowd rendering. We discuss different representations for level-of-detail (LoD) rendering of animated characters, including polygon-based, point-based, and image-based techniques, and review different criteria for runtime LoD selection. Besides LoD approaches, we review classic acceleration schemes, such as frustum culling and occlusion culling, and describe how they can be adapted to handle crowds of animated characters. We also discuss specific acceleration techniques for crowd rendering, such as primitive pseudo-instancing, palette skinning, and dynamic key-pose caching, which benefit from current graphics hardware. We also address other factors affecting performance and realism of crowds such as lighting, shadowing, clothing and variability. Finally we provide an exhaustive comparison of the most relevant approaches in the field. Peer reviewed. Postprint (author's final draft).
Massively Parallel Ray Tracing Algorithm Using GPU
Ray tracing is a technique for generating an image by tracing the path of
light through pixels in an image plane and simulating the effects of
high-quality global illumination at a heavy computational cost. Because of its
high computational complexity, it cannot meet the requirements of real-time
rendering. The emergence of many-core architectures makes it possible to
significantly reduce the running time of ray tracing algorithms by exploiting
their powerful floating-point capability. In this paper, we present a new GPU
implementation and optimization of ray tracing that accelerates the rendering
process.
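As a rough illustration of why ray tracing suits many-core hardware, the sketch below generates one primary ray per pixel of the image plane. Each ray depends only on its own pixel coordinates, so the per-pixel work could in principle be distributed across GPU threads. The pinhole camera model, the 90-degree field of view, and the function names are assumptions for illustration, not details from the paper.

```python
import math
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def primary_ray(px: int, py: int, width: int, height: int,
                fov_deg: float = 90.0) -> Vec3:
    """Unit direction of the ray through the center of pixel (px, py).

    Assumes a pinhole camera at the origin looking down the -z axis.
    """
    aspect = width / height
    scale = math.tan(math.radians(fov_deg) / 2.0)
    # Map the pixel center to [-1, 1] screen coordinates, then to camera space.
    x = (2.0 * (px + 0.5) / width - 1.0) * aspect * scale
    y = (1.0 - 2.0 * (py + 0.5) / height) * scale
    length = math.sqrt(x * x + y * y + 1.0)
    return (x / length, y / length, -1.0 / length)


def generate_rays(width: int, height: int) -> List[Vec3]:
    """One independent ray per pixel; on a GPU each could map to one thread."""
    return [primary_ray(px, py, width, height)
            for py in range(height)
            for px in range(width)]
```

Because no iteration of the loop reads the result of another, the list comprehension in `generate_rays` stands in for what would be a parallel kernel launch in a GPU implementation.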
Enabling a High Throughput Real Time Data Pipeline for a Large Radio Telescope Array with GPUs
The Murchison Widefield Array (MWA) is a next-generation radio telescope
currently under construction in the remote Western Australia Outback. Raw data
will be generated continuously at 5 GiB/s, grouped into 8 s cadences. This high
throughput motivates the development of on-site, real-time processing and
reduction in preference to archiving, transport and off-line processing. Each
batch of 8 s data must be completely reduced before the next batch arrives.
Maintaining real-time operation will require a sustained performance of around
2.5 TFLOP/s (including convolutions, FFTs, interpolations and matrix
multiplications). We describe a scalable heterogeneous computing pipeline
implementation, exploiting both the high computing density and FLOP-per-Watt
ratio of modern GPUs. The architecture is highly parallel within and across
nodes, with all major processing elements performed by GPUs. Necessary
scatter-gather operations along the pipeline are loosely synchronized between
the nodes hosting the GPUs. The MWA will be a frontier scientific instrument
and a pathfinder for planned peta- and exascale facilities.
Comment: Version accepted by Comp. Phys. Com
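The figures quoted in the abstract above imply a per-batch data volume and compute budget that are easy to work out. The short calculation below is only back-of-the-envelope arithmetic on the stated numbers (5 GiB/s, 8 s cadence, roughly 2.5 TFLOP/s); the variable names are illustrative, and the sustained rate is treated as exact for simplicity.

```python
GIB = 2 ** 30  # bytes per gibibyte

data_rate_bytes_per_s = 5 * GIB   # raw data rate from the abstract
cadence_s = 8                     # length of one batch in seconds
sustained_flop_per_s = 2.5e12     # approximate figure from the abstract

# Data that must be fully reduced before the next batch arrives.
batch_bytes = data_rate_bytes_per_s * cadence_s
batch_gib = batch_bytes / GIB

# Floating-point work available per batch at the sustained rate.
flop_per_batch = sustained_flop_per_s * cadence_s

# Average arithmetic performed per byte of raw input.
flop_per_byte = sustained_flop_per_s / data_rate_bytes_per_s

print(f"batch size: {batch_gib:.0f} GiB")
print(f"work per batch: {flop_per_batch:.1e} FLOP")
print(f"intensity: {flop_per_byte:.1f} FLOP per byte")
```

Each 8 s batch is therefore 40 GiB of raw data, with a budget of about 2 × 10^13 floating-point operations, or several hundred operations per input byte.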