229 research outputs found

    Integrating Occlusion Culling and Hardware Instancing for Efficient Real-Time Rendering of Building Information Models

    Get PDF
    This paper presents an efficient approach for integrating occlusion culling and hardware instancing. The work is primarily targeted at Building Information Models (BIM), which typically share characteristics addressed by these two acceleration techniques separately – high level of occlusion and frequent reuse of building components. Together, these two acceleration techniques complement each other and allows large and complex BIMs to be rendered in real-time. Specifically, the proposed method takes advantage of temporal coherence and uses a lightweight data transfer strategy to provide an efficient hardware instancing implementation. Compared to only using occlusion culling, additional speedups of 1.25x-1.7x is achieved for rendering large BIMs received from real-world projects. These speedups are measured in viewpoints that represents the worst case scenarios in terms of rendering performance when only occlusion culling is utilized

    Efficient algorithms for occlusion culling and shadows

    Get PDF
    The goal of this research is to develop more efficient techniques for computing the visibility and shadows in real-time rendering of three-dimensional scenes. Visibility algorithms determine what is visible from a camera, whereas shadow algorithms solve the same problem from the viewpoint of a light source. In rendering, a lot of computational resources are often spent on primitives that are not visible in the final image. One visibility algorithm for reducing the overhead is occlusion culling, which quickly discards the objects or primitives that are obstructed from the view by other primitives. A new method is presented for performing occlusion culling using silhouettes of meshes instead of triangles. Additionally, modifications are suggested to occlusion queries in order to reduce their computational overhead. The performance of currently available graphics hardware depends on the ordering of input primitives. A new technique, called delay streams, is proposed as a generic solution to order-dependent problems. The technique significantly reduces the pixel processing requirements by improving the efficiency of occlusion culling inside graphics hardware. Additionally, the memory requirements of order-independent transparency algorithms are reduced. A shadow map is a discretized representation of the scene geometry as seen by a light source. Typically the discretization causes difficult aliasing issues, such as jagged shadow boundaries and incorrect self-shadowing. A novel solution is presented for suppressing all types of aliasing artifacts by providing the correct sampling points for shadow maps, thus fully abandoning the previously used regular structures. Also, a simple technique is introduced for limiting the shadow map lookups to the pixels that get projected inside the shadow map. The fillrate problem of hardware-accelerated shadow volumes is greatly reduced with a new hierarchical rendering technique. The algorithm performs per-pixel shadow computations only at visible shadow boundaries, and uses lower resolution shadows for the parts of the screen that are guaranteed to be either fully lit or fully in shadow. The proposed techniques are expected to improve the rendering performance in most real-time applications that use 3D graphics, especially in computer games. More efficient algorithms for occlusion culling and shadows are important steps towards larger, more realistic virtual environments.reviewe

    Fast scalable visualization techniques for interactive billion-particle walkthrough

    Get PDF
    This research develops a comprehensive framework for interactive walkthrough involving one billion particles in an immersive virtual environment to enable interrogative visualization of large atomistic simulation data. As a mixture of scientific and engineering approaches, the framework is based on four key techniques: adaptive data compression based on space-filling curves, octree-based visibility and occlusion culling, predictive caching based on machine learning, and scalable data reduction based on parallel and distributed processing. In terms of parallel rendering, this system combines functional parallelism, data parallelism, and temporal parallelism to improve interactivity. The visualization framework will be applicable not only to material simulation, but also to computational biology, applied mathematics, mechanical engineering, and nanotechnology, etc

    A survey of real-time crowd rendering

    Get PDF
    In this survey we review, classify and compare existing approaches for real-time crowd rendering. We first overview character animation techniques, as they are highly tied to crowd rendering performance, and then we analyze the state of the art in crowd rendering. We discuss different representations for level-of-detail (LoD) rendering of animated characters, including polygon-based, point-based, and image-based techniques, and review different criteria for runtime LoD selection. Besides LoD approaches, we review classic acceleration schemes, such as frustum culling and occlusion culling, and describe how they can be adapted to handle crowds of animated characters. We also discuss specific acceleration techniques for crowd rendering, such as primitive pseudo-instancing, palette skinning, and dynamic key-pose caching, which benefit from current graphics hardware. We also address other factors affecting performance and realism of crowds such as lighting, shadowing, clothing and variability. Finally we provide an exhaustive comparison of the most relevant approaches in the field.Peer ReviewedPostprint (author's final draft

    View space linking, solid node compression and binary space partitioning for visibility determination in 3D walk-throughs

    Get PDF
    Today\u27s 3D games consumers are expecting more and more quality in their games. To enable high quality graphics at interactive rates, games programmers employ a technique known as hidden surface removal (HSR) or polygon culling. HSR is not just applicable to games; it may also be applied to any application that requires quality and interactive rates, including medical, military and building applications. One such commonly used technique for HSR is the binary space partition (BSP) tree, which is used for 3D ‘walk-throughs’, otherwise known as 3D static environments or first person shooters. Recent developments in 3D accelerated hardware technology do not mean that HSR is becoming redundant; in fact, HSR is increasingly becoming more important to the graphics pipeline. The well established potentially visible sets (PSV) BSP tree algorithm is used as a platform for exploring three enhanced algorithms; View Space Lighting, Solid Node Compression and hardware accelerated occlusion are shown to reducing the amounts of nodes that are traversed in a BSP tree, improving tree travel efficiency. These algorithms are proven (in cases) to improve overall efficiency

    Large Model Visualization : Techniques and Applications

    Get PDF
    The size of datasets in scientific computing is rapidly increasing. This increase is caused by a boost of processing power in the past years, which in turn was invested in an increase of the accuracy and the size of the models. A similar trend enabled a significant improvement of medical scanners; more than 1000 slices of a resolution of 512x512 can be generated by modern scanners in daily practice. Even in computer-aided engineering typical models eas-ily contain several million polygons. Unfortunately, the data complexity is growing faster than the rendering performance of modern computer systems. This is not only due to the slower growing graphics performance of the graphics subsystems, but in particular because of the significantly slower growing memory bandwidth for the transfer of the geometry and image data from the main memory to the graphics accelerator. Large model visualization addresses this growing divide between data complexity and rendering performance. Most methods focus on the reduction of the geometric or pixel complexity, and hence also the memory bandwidth requirements are reduced. In this dissertation, we discuss new approaches from three different research areas. All approaches target at the reduction of the processing complexity to achieve an interactive visualization of large datasets. In the second part, we introduce applications of the presented ap-proaches. Specifically, we introduce the new VIVENDI system for the interactive virtual endoscopy and other applications from mechanical engineering, scientific computing, and architecture.The size of datasets in scientific computing is rapidly increasing. This increase is caused by a boost of processing power in the past years, which in turn was invested in an increase of the accuracy and the size of the models. A similar trend enabled a significant improvement of medical scanners; more than 1000 slices of a resolution of 512x512 can be generated by modern scanners in daily practice. Even in computer-aided engineering typical models eas-ily contain several million polygons. Unfortunately, the data complexity is growing faster than the rendering performance of modern computer systems. This is not only due to the slower growing graphics performance of the graphics subsystems, but in particular because of the significantly slower growing memory bandwidth for the transfer of the geometry and image data from the main memory to the graphics accelerator. Large model visualization addresses this growing divide between data complexity and rendering performance. Most methods focus on the reduction of the geometric or pixel complexity, and hence also the memory bandwidth requirements are reduced. In this dissertation, we discuss new approaches from three different research areas. All approaches target at the reduction of the processing complexity to achieve an interactive visualization of large datasets. In the second part, we introduce applications of the presented ap-proaches. Specifically, we introduce the new VIVENDI system for the interactive virtual endoscopy and other applications from mechanical engineering, scientific computing, and architecture

    Object Hierarchies for Efficient Rendering

    Get PDF
    This thesis covers the efficient visualization of complex 3d scenes using various rendering methods such as photo-realistic and real-time rendering. Especially the important role of bounding volume hierarchies is discussed in detail in the context of illumination and visibility algorithms. We present a novel approach for automatic generation of object hierarchies and apply the resulting data structure to several rendering techniques. In the field of ray tracing we describe a novel ray acceleration method that combines objects hierarchies and regular grids. We demonstrate how radiosity computations may benefit from available scene hierarchies to determine the radiant flux between object clusters. Finally, we present an adaptive interactive rendering algorithm that may dramatically reduce the number of visibility tests in an occlusion culling framework for interactive real-time visualization.Diese Dissertation untersucht unterschiedliche Verfahren zur effizienten Visualisierung grosser dreidimensionaler Szenengeometrien, sowohl im Bereich des Photorealismus wie auch bei der Echtzeit-Visualisierung. Hierbei wird insbesondere die Nützlichkeit von Hüllkörperhierarchien bei der Beleuchtungsrechnung und bei der Beantwortung von Sichtbarkeitsfragen herausgearbeitet. Ein neuartiges, kostenbasiertes Verfahren zur automatischen Konstruktion von Objekthierarchien wird präsentiert sowie dessen Anwendung für alle gängigen Darstellungsverfahren. Zusätzlich beschreibt diese Disseration im Bereich Ray Tracing ein neues Verfahren zur Szenenstrukturierung, welches die Vorteile von Hüllkörperhierarchien und regulären Gittern kombiniert. Im Bereich der Radiosity wird gezeigt, wie sich Szenenhierarchien ideal zur Berechnung des Lichtflusses zwischen Objekt-Clustern nutzen lassen und im Bereich Echtzeit-Rendering wird ein adaptives Verfahren vorgestellt, dass die Zahl teurer Sichtbarkeitstests beim Occlusion-Culling deutlich reduziert

    Reducing redundancy of real time computer graphics in mobile systems

    Get PDF
    The goal of this thesis is to propose novel and effective techniques to eliminate redundant computations that waste energy and are performed in real-time computer graphics applications, with special focus on mobile GPU micro-architecture. Improving the energy-efficiency of CPU/GPU systems is not only key to enlarge their battery life, but also allows to increase their performance because, to avoid overheating above thermal limits, SoCs tend to be throttled when the load is high for a large period of time. Prior studies pointed out that the CPU and especially the GPU are the principal energy consumers in the graphics subsystem, being the off-chip main memory accesses and the processors inside the GPU the primary energy consumers of the graphics subsystem. First, we focus on reducing redundant fragment processing computations by means of improving the culling of hidden surfaces. During real-time graphics rendering, objects are processed by the GPU in the order they are submitted by the CPU, and occluded surfaces are often processed even though they will end up not being part of the final image. When the GPU realizes that an object or part of it is not going to be visible, all activity required to compute its color and store it has already been performed. We propose a novel architectural technique for mobile GPUs, Visibility Rendering Order (VRO), which reorders objects front-to-back entirely in hardware to maximize the culling effectiveness of the GPU and minimize overshading, hence reducing execution time and energy consumption. VRO exploits the fact that the objects in graphics animated applications tend to keep its relative depth order across consecutive frames (temporal coherence) to provide the feeling of smooth transition. VRO keeps visibility information of a frame, and uses it to reorder the objects of the following frame. VRO just requires adding a small hardware to capture the visibility information and use it later to guide the rendering of the following frame. Moreover, VRO works in parallel with the graphics pipeline, so negligible performance overheads are incurred. We illustrate the benefits of VRO using various unmodified commercial 3D applications for which VRO achieves 27% speed-up and 14.8% energy reduction on average. Then, we focus on avoiding redundant computations related to CPU Collision Detection (CD). Graphics applications such as 3D games represent a large percentage of downloaded applications for mobile devices and the trend is towards more complex and realistic scenes with accurate 3D physics simulations. CD is one of the most important algorithms in any physics kernel since it identifies the contact points between the objects of a scene and determines when they collide. However, real-time accurate CD is very expensive in terms of energy consumption. We propose Render Based Collision Detection (RBCD), a novel energy-efficient high-fidelity CD scheme that leverages some intermediate results of the rendering pipeline to perform CD, so that redundant tasks are done just once. Comparing RBCD with a conventional CD completely executed in the CPU, we show that its execution time is reduced by almost three orders of magnitude (600x speedup), because most of the CD task of our model comes for free by reusing the image rendering intermediate results. Although not necessarily, such a dramatic time improvement may result in better frames per second if physics simulation stays in the critical path. However, the most important advantage of our technique is the enormous energy savings that result from eliminating a long and costly CPU computation and converting it into a few simple operations executed by a specialized hardware within the GPU. Our results show that the energy consumed by CD is reduced on average by a factor of 448x (i.e., by 99.8\%). These dramatic benefits are accompanied by a higher fidelity CD analysis (i.e., with finer granularity), which improves the quality and realism of the application.El objetivo de esta tesis es proponer técnicas efectivas y originales para eliminar computaciones inútiles que aparecen en aplicaciones gráficas, con especial énfasis en micro-arquitectura de GPUs. Mejorar la eficiencia energética de los sistemas CPU/GPU no es solo clave para alargar la vida de la batería, sino también incrementar su rendimiento. Estudios previos han apuntado que la CPU y especialmente la GPU son los principales consumidores de energía en el sub-sistema gráfico, siendo los accesos a memoria off-chip y los procesadores dentro de la GPU los principales consumidores de energía del sub-sistema gráfico. Primero, nos hemos centrado en reducir computaciones redundantes de la fase de fragment processing mediante la mejora en la eliminación de superficies ocultas. Durante el renderizado de gráficos en tiempo real, los objetos son procesados por la GPU en el orden en el que son enviados por la CPU, y las superficies ocultas son a menudo procesadas incluso si no no acaban formando parte de la imagen final. Cuando la GPU averigua que el objeto o parte de él no es visible, toda la actividad requerida para computar su color y guardarlo ha sido realizada. Proponemos una técnica arquitectónica original para GPUs móviles, Visibility Rendering Order (VRO), la cual reordena los objetos de delante hacia atrás por completo en hardware para maximizar la efectividad del culling de la GPU y así minimizar el overshading, y por lo tanto reducir el tiempo de ejecución y el consumo de energía. VRO explota el hecho de que los objetos de las aplicaciones gráficas animadas tienden a mantener su orden relativo en profundidad a través de frames consecutivos (coherencia temporal) para proveer animaciones con transiciones suaves. Dado que las relaciones de orden en profundidad entre objetos son testeadas en la GPU, VRO introduce costes mínimos en energía. Solo requiere añadir una pequeña unidad hardware para capturar la información de visibilidad. Además, VRO trabaja en paralelo con el pipeline gráfico, por lo que introduce costes insignificantes en tiempo. Ilustramos los beneficios de VRO usango varias aplicaciones 3D comerciales para las cuales VRO consigue un 27% de speed-up y un 14.8% de reducción de energía en media. En segundo lugar, evitamos computaciones redundantes relacionadas con la Detección de Colisiones (CD) en la CPU. Las aplicaciones gráficas animadas como los juegos 3D representan un alto porcentaje de las aplicaciones descargadas en dispositivos móviles y la tendencia es hacia escenas más complejas y realistas con simulaciones físicas 3D precisas. La CD es uno de los algoritmos más importantes entre los kernel de físicas dado que identifica los puntos de contacto entre los objetos de una escena. Sin embargo, una CD en tiempo real y precisa es muy costosa en términos de consumo energético. Proponemos Render Based Collision Detection (RBCD), una técnica energéticamente eficiente y preciso de CD que utiliza resultados intermedios del rendering pipeline para realizar la CD. Comparando RBCD con una CD convencional completamente ejecutada en la CPU, mostramos que el tiempo de ejecución es reducido casi tres órdenes de magnitud (600x speedup), porque la mayoría de la CD de nuestro modelo reusa resultados intermedios del renderizado de la imagen. Aunque no es así necesariamente, esta espectacular en tiempo puede resultar en mejores frames por segundo si la simulación de físicas está en el camino crítico. Sin embargo, la ventaja más importante de nuestra técnica es el enorme ahorro de energía que resulta de eliminar las largas y costosas computaciones en la CPU, sustituyéndolas por unas pocas operaciones ejecutadas en un hardware especializado dentro de la GPU. Nuestros resultados muestran que la energía consumida por la CD es reducidad en media por un factor de 448x. Estos dramáticos beneficios vienen acompañados de una mayor fidelidad en la CD (i.e. con granularidad más fina)Postprint (published version
    corecore