2,908 research outputs found

    Understanding Next-Generation VR: Classifying Commodity Clusters for Immersive Virtual Reality

    Get PDF
    Commodity clusters offer the ability to deliver higher performance computer graphics at lower prices than traditional graphics supercomputers. Immersive virtual reality systems demand notoriously high computational requirements to deliver adequate real-time graphics, leading to the emergence of commodity clusters for immersive virtual reality. Such clusters deliver the graphics power needed by leveraging the combined power of several computers to meet the demands of real-time interactive immersive computer graphics.However, the field of commodity cluster-based virtual reality is still in early stages of development and the field is currently adhoc in nature and lacks order. There is no accepted means for comparing approaches and implementers are left with instinctual or trial-and-error means for selecting an approach.This paper provides a classification system that facilitates understanding not only of the nature of different clustering systems but also the interrelations between them. The system is built from a new model for generalized computer graphics applications, which is based on the flow of data through a sequence of operations over the entire context of the application. Prior models and classification systems have been too focused in context and application whereas the system described here provides a unified means for comparison of works within the field

    Decoupled Sampling for Graphics Pipelines

    Get PDF
    We propose a generalized approach to decoupling shading from visibility sampling in graphics pipelines, which we call decoupled sampling. Decoupled sampling enables stochastic supersampling of motion and defocus blur at reduced shading cost, as well as controllable or adaptive shading rates which trade off shading quality for performance. It can be thought of as a generalization of multisample antialiasing (MSAA) to support complex and dynamic mappings from visibility to shading samples, as introduced by motion and defocus blur and adaptive shading. It works by defining a many-to-one hash from visibility to shading samples, and using a buffer to memoize shading samples and exploit reuse across visibility samples. Decoupled sampling is inspired by the Reyes rendering architecture, but like traditional graphics pipelines, it shades fragments rather than micropolygon vertices, decoupling shading from the geometry sampling rate. Also unlike Reyes, decoupled sampling only shades fragments after precise computation of visibility, reducing overshading. We present extensions of two modern graphics pipelines to support decoupled sampling: a GPU-style sort-last fragment architecture, and a Larrabee-style sort-middle pipeline. We study the architectural implications of decoupled sampling and blur, and derive end-to-end performance estimates on real applications through an instrumented functional simulator. We demonstrate high-quality motion and defocus blur, as well as variable and adaptive shading rates

    Real-Time Global Illumination for VR Applications

    Full text link
    Real-time global illumination in VR systems enhances scene realism by incorporating soft shadows, reflections of objects in the scene, and color bleeding. The Virtual Light Field (VLF) method enables real-time global illumination rendering in VR. The VLF has been integrated with the Extreme VR system for realtime GPU-based rendering in a Cave Automatic Virtual Environment

    GPU Cost Estimation for Load Balancing in Parallel Ray Tracing

    Get PDF
    Interactive ray tracing has seen enormous progress in recent years. However, advanced rendering techniques requiring many million rays per second are still not feasible at interactive speed, and are only possible by means of highly parallel ray tracing. When using compute clusters, good load balancing is crucial in order to fully exploit the available computational power, and to not suffer from the overhead involved by synchronization barriers. In this paper, we present a novel GPU method to compute a costmap: a per-pixel cost estimate of the ray tracing rendering process. We show that the cost map is a powerful tool to improve load balancing in parallel ray tracing, and it can be used for adaptive task partitioning and enhanced dynamic load balancing. Its effectiveness has been proven in a parallel ray tracer implementation tailored for a cluster of workstations

    Decoupled Sampling for Real-Time Graphics Pipelines

    Get PDF
    We propose decoupled sampling, an approach that decouples shading from visibility sampling in order to enable motion blur and depth-of-field at reduced cost. More generally, it enables extensions of modern real-time graphics pipelines that provide controllable shading rates to trade off quality for performance. It can be thought of as a generalization of GPU-style multisample antialiasing (MSAA) to support unpredictable shading rates, with arbitrary mappings from visibility to shading samples as introduced by motion blur, depth-of-field, and adaptive shading. It is inspired by the Reyes architecture in offline rendering, but targets real-time pipelines by driving shading from visibility samples as in GPUs, and removes the need for micropolygon dicing or rasterization. Decoupled Sampling works by defining a many-to-one hash from visibility to shading samples, and using a buffer to memoize shading samples and exploit reuse across visibility samples. We present extensions of two modern GPU pipelines to support decoupled sampling: a GPU-style sort-last fragment architecture, and a Larrabee-style sort-middle pipeline. We study the architectural implications and derive end-to-end performance estimates on real applications through an instrumented functional simulator. We demonstrate high-quality motion blur and depth-of-field, as well as variable and adaptive shading rates

    Scalable Real-Time Rendering for Extremely Complex 3D Environments Using Multiple GPUs

    Get PDF
    In 3D visualization, real-time rendering of high-quality meshes in complex 3D environments is still one of the major challenges in computer graphics. New data acquisition techniques like 3D modeling and scanning have drastically increased the requirement for more complex models and the demand for higher display resolutions in recent years. Most of the existing acceleration techniques using a single GPU for rendering suffer from the limited GPU memory budget, the time-consuming sequential executions, and the finite display resolution. Recently, people have started building commodity workstations with multiple GPUs and multiple displays. As a result, more GPU memory is available across a distributed cluster of GPUs, more computational power is provided throughout the combination of multiple GPUs, and a higher display resolution can be achieved by connecting each GPU to a display monitor (resulting in a tiled large display configuration). However, using a multi-GPU workstation may not always give the desired rendering performance due to the imbalanced rendering workloads among GPUs and overheads caused by inter-GPU communication. In this dissertation, I contribute a multi-GPU multi-display parallel rendering approach for complex 3D environments. The approach has the capability to support a high-performance and high-quality rendering of static and dynamic 3D environments. A novel parallel load balancing algorithm is developed based on a screen partitioning strategy to dynamically balance the number of vertices and triangles rendered by each GPU. The overhead of inter-GPU communication is minimized by transferring only a small amount of image pixels rather than chunks of 3D primitives with a novel frame exchanging algorithm. The state-of-the-art parallel mesh simplification and GPU out-of-core techniques are integrated into the multi-GPU multi-display system to accelerate the rendering process

    Efficient algorithms for occlusion culling and shadows

    Get PDF
    The goal of this research is to develop more efficient techniques for computing the visibility and shadows in real-time rendering of three-dimensional scenes. Visibility algorithms determine what is visible from a camera, whereas shadow algorithms solve the same problem from the viewpoint of a light source. In rendering, a lot of computational resources are often spent on primitives that are not visible in the final image. One visibility algorithm for reducing the overhead is occlusion culling, which quickly discards the objects or primitives that are obstructed from the view by other primitives. A new method is presented for performing occlusion culling using silhouettes of meshes instead of triangles. Additionally, modifications are suggested to occlusion queries in order to reduce their computational overhead. The performance of currently available graphics hardware depends on the ordering of input primitives. A new technique, called delay streams, is proposed as a generic solution to order-dependent problems. The technique significantly reduces the pixel processing requirements by improving the efficiency of occlusion culling inside graphics hardware. Additionally, the memory requirements of order-independent transparency algorithms are reduced. A shadow map is a discretized representation of the scene geometry as seen by a light source. Typically the discretization causes difficult aliasing issues, such as jagged shadow boundaries and incorrect self-shadowing. A novel solution is presented for suppressing all types of aliasing artifacts by providing the correct sampling points for shadow maps, thus fully abandoning the previously used regular structures. Also, a simple technique is introduced for limiting the shadow map lookups to the pixels that get projected inside the shadow map. The fillrate problem of hardware-accelerated shadow volumes is greatly reduced with a new hierarchical rendering technique. The algorithm performs per-pixel shadow computations only at visible shadow boundaries, and uses lower resolution shadows for the parts of the screen that are guaranteed to be either fully lit or fully in shadow. The proposed techniques are expected to improve the rendering performance in most real-time applications that use 3D graphics, especially in computer games. More efficient algorithms for occlusion culling and shadows are important steps towards larger, more realistic virtual environments.reviewe

    Parallel Algorithms for Rendering Large 3D Models on a Graphics Cluster

    Get PDF
    We address the problem of distributing rendering computations for real-time display of very complex three dimensional (3D) scenes using a graphics cluster. The rendering of 3D scenes is increasingly being carried out using at least two different programs on the graphics processing unit (GPU): a vertex shader program for vertex (geometry) processing, and a fragment shader program for pixel (colour) processing. With fragment shader programs becoming more and more time consuming, distributing load solely based on geometry -- as is done in most contemporary systems -- can cause significant load imbalance and redundant work. In this thesis we propose a number of parallel rendering algorithms which divide the traditional cluster rendering pipeline into two different phases: one which primarily concerns itself over vertex operations to generate depth information, and a second which primarily concerns itself over fragment operations. By performing communication between these two phases, each node can perform fewer fragment operations with little overhead over traditional cluster rendering algorithms. We also propose a number of load-balancing algorithms which utilize the information gained earlier in the pipeline to improve the management of GPU resources. The techniques are implemented on a graphics cluster and experimental results demonstrate significant improvements in rendering performance
    corecore