
    Fast Rendering of Forest Ecosystems with Dynamic Global Illumination

    Real-time rendering of large-scale forest ecosystems remains a challenging problem, in that important global illumination effects, such as leaf transparency and inter-object light scattering, are difficult to capture given tight timing constraints and scenes that typically contain hundreds of millions of primitives. We propose a new lighting model, adapted from a model previously used to light convective clouds and other participating media, together with GPU ray tracing, in order to achieve these global illumination effects while maintaining near real-time performance. The lighting model is based on a lattice-Boltzmann method in which reflectance, transmittance, and absorption parameters are taken from measurements of real plants. The lighting model is solved as a preprocessing step, requires only seconds on a single GPU, and allows dynamic lighting changes at run-time. The ray tracing engine, which runs on one or multiple GPUs, combines multiple acceleration structures to achieve near real-time performance for large, complex scenes. Both the preprocessing step and the ray tracing engine make extensive use of NVIDIA's Compute Unified Device Architecture (CUDA).
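    The abstract does not give the exact update rule, so the following is only the generic form that lattice-Boltzmann lighting models take, with the collision matrix left as a placeholder rather than the authors' fitted values: per-cell photon densities f_i associated with the lattice directions e_i stream to neighbouring cells and are redistributed by a matrix Theta built from the measured reflectance, transmittance, and absorption,

        f_i(\mathbf{x} + \mathbf{e}_i\,\Delta x,\; t + \Delta t) \;=\; \sum_j \Theta_{ij}(\mathbf{x})\, f_j(\mathbf{x}, t), \qquad \sum_i \Theta_{ij}(\mathbf{x}) \le 1,

    where column sums of Theta strictly below one model absorption; relaxing this iteration to a steady state corresponds to the seconds-long preprocessing step described above, and the resulting per-cell illumination is what the ray tracing engine later samples.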

    Developing Efficient Discrete Simulations on Multicore and GPU Architectures

    In this paper we show how to efficiently implement parallel discrete simulations on multicore and GPU architectures through a real example of an application: a cellular automata model of laser dynamics. We describe the techniques employed to build and optimize the implementations using the OpenMP and CUDA frameworks. We have evaluated the performance on two different hardware platforms that represent different target market segments: a high-end platform for scientific computing, using an Intel Xeon Platinum 8259CL server with 48 cores and an NVIDIA Tesla V100 GPU, both running on the Amazon Web Services (AWS) cloud; and a consumer-oriented platform, using an Intel Core i9-9900K CPU and an NVIDIA GeForce GTX 1050 Ti GPU. Performance results were compared and analyzed in detail. We show that excellent performance and scalability can be obtained on both platforms, and we identify some important issues that cause performance degradation on them. We also found that current multicore CPUs with large core counts can deliver performance very close to that of GPUs, and even identical in some cases. Funding: Ministerio de Economía, Industria y Competitividad, Gobierno de España (MINECO) and the Agencia Estatal de Investigación (AEI) of Spain, co-financed by FEDER funds (EU), grant TIN2017-89842.
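    The paper's transition rule and optimizations are not reproduced in the abstract, so the sketch below only illustrates the general shape of a GPU cellular-automaton step for a two-field (photons and excited electrons) laser model: a double-buffered 2D toroidal grid updated synchronously, one CUDA thread per cell. The rule and all constants are placeholders, not the model from the paper.

        #include <cstdio>
        #include <utility>
        #include <cuda_runtime.h>

        // One synchronous CA update step over a 2D toroidal grid, double-buffered.
        __global__ void ca_step(int w, int h,
                                const int* photons_in, int* photons_out,
                                const int* electrons_in, int* electrons_out)
        {
            int x = blockIdx.x * blockDim.x + threadIdx.x;
            int y = blockIdx.y * blockDim.y + threadIdx.y;
            if (x >= w || y >= h) return;
            int idx = y * w + x;

            // Photons in the von Neumann neighbourhood (periodic boundaries).
            int nbr = photons_in[y * w + (x + 1) % w]
                    + photons_in[y * w + (x + w - 1) % w]
                    + photons_in[((y + 1) % h) * w + x]
                    + photons_in[((y + h - 1) % h) * w + x];

            int p = photons_in[idx];
            int e = electrons_in[idx];

            // Placeholder rule: an excited cell that sees neighbouring photons emits
            // and de-excites; otherwise photons decay by one and empty cells are
            // re-pumped deterministically. Not the rule used in the paper.
            if (e > 0 && nbr > 0) { p += 1; e -= 1; }
            else if (p > 0)       { p -= 1; }
            if (e == 0)           { e = 1; }

            photons_out[idx]   = p;
            electrons_out[idx] = e;
        }

        int main()
        {
            const int w = 256, h = 256, n = w * h;
            int *p0, *p1, *e0, *e1;
            cudaMallocManaged(&p0, n * sizeof(int)); cudaMallocManaged(&p1, n * sizeof(int));
            cudaMallocManaged(&e0, n * sizeof(int)); cudaMallocManaged(&e1, n * sizeof(int));
            for (int i = 0; i < n; ++i) { p0[i] = (i % 97 == 0); e0[i] = 1; }

            dim3 block(16, 16), grid((w + 15) / 16, (h + 15) / 16);
            for (int step = 0; step < 100; ++step) {
                ca_step<<<grid, block>>>(w, h, p0, p1, e0, e1);
                std::swap(p0, p1);   // double buffering: output becomes next input
                std::swap(e0, e1);
            }
            cudaDeviceSynchronize();

            long total = 0;
            for (int i = 0; i < n; ++i) total += p0[i];
            printf("photons after 100 steps: %ld\n", total);
            return 0;
        }

    The same double-buffered update maps naturally onto an OpenMP parallel loop over cells, which is one way the CPU and GPU implementations can be kept directly comparable.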

    Flux-Limited Diffusion for Multiple Scattering in Participating Media

    For the rendering of multiple scattering effects in participating media, methods based on the diffusion approximation are an extremely efficient alternative to Monte Carlo path tracing. However, in sufficiently transparent regions, the classical diffusion approximation suffers from non-physical radiative fluxes, which leads to a poor match to correct light transport. In particular, this prevents the application of the classical diffusion approximation to heterogeneous media, where opaque material is embedded within transparent regions. To address this limitation, we introduce flux-limited diffusion, a technique from the astrophysics domain. This method provides a better approximation to light transport than the classical diffusion approximation, particularly when applied to heterogeneous media, and hence broadens the applicability of diffusion-based techniques. We provide an algorithm for flux-limited diffusion, which is validated against transport theory for a point light source in an infinite homogeneous medium. We further demonstrate that our implementation of flux-limited diffusion produces more accurate renderings of multiple scattering in various heterogeneous datasets than the classical diffusion approximation, by comparing both methods to ground-truth renderings obtained via volumetric path tracing. Comment: Accepted in Computer Graphics Forum.
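    For orientation, the closure can be written compactly; the limiter shown is the Levermore-Pomraning form commonly used in astrophysics and is only an example, since the abstract does not state which limiter the authors adopt. The fluence phi drives a diffusive flux whose magnitude is capped so that it can never exceed the free-streaming limit:

        \mathbf{F} \;=\; -\frac{\lambda(R)}{\sigma_t}\,\nabla\phi, \qquad R \;=\; \frac{|\nabla\phi|}{\sigma_t\,\phi}, \qquad \lambda_{\mathrm{LP}}(R) \;=\; \frac{1}{R}\left(\coth R - \frac{1}{R}\right).

    As R approaches zero (optically thick regions) the limiter tends to 1/3 and classical diffusion is recovered; as R grows large (transparent regions) it tends to 1/R, so the flux magnitude never exceeds phi, which suppresses exactly the non-physical fluxes described above.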

    Real-time smoke rendering using compensated ray marching

    We present a real-time algorithm called compensated ray marching for rendering of smoke under dynamic low-frequency environment lighting. Our approach is based on a decomposition of the input smoke animation, represented as a sequence of volumetric density fields, into a set of radial basis functions (RBFs) and a sequence of residual fields. To expedite rendering, the source radiance distribution within the smoke is computed from only the low-frequency RBF approximation of the density fields, since the high-frequency residuals have little impact on global illumination under low-frequency environment lighting. Furthermore, in computing source radiances the contributions from single and multiple scattering are evaluated at only the RBF centers and then approximated at other points in the volume using an RBF-based interpolation. A slice-based integration of these source radiances along each view ray is then performed to render the final image. The high-frequency residual fields, which are a critical component in the local appearance of smoke, are compensated back into the radiance integral during this ray march to generate images of high detail. The runtime algorithm, which includes both light transfer simulation and ray marching, can be easily implemented on the GPU, and thus allows for real-time manipulation of viewpoint and lighting, as well as interactive editing of smoke attributes such as extinction cross section, scattering albedo, and phase function. Only moderate preprocessing time and storage are needed. This approach provides the first method for real-time smoke rendering that includes single and multiple scattering while generating results comparable in quality to offline algorithms like ray tracing.
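    The abstract describes the runtime as a march along each view ray that accumulates precomputed source radiance while the high-frequency residual density is compensated back in. The CUDA kernel below is only a minimal sketch of such a front-to-back march under assumed conventions: an axis-aligned orthographic view marching along +z through a dense voxel grid of already-compensated density, with a matching grid of precomputed source radiance. None of the names, layouts or constants come from the paper.

        #include <cstdio>
        #include <cuda_runtime.h>

        __global__ void ray_march(int nx, int ny, int nz, const float* density,
                                  const float* source_radiance, float sigma_t,
                                  float step, float* image)
        {
            int px = blockIdx.x * blockDim.x + threadIdx.x;   // one thread per pixel
            int py = blockIdx.y * blockDim.y + threadIdx.y;
            if (px >= nx || py >= ny) return;

            float T = 1.0f;   // transmittance accumulated front to back
            float L = 0.0f;   // radiance toward the camera
            for (int z = 0; z < nz; ++z) {                    // march along +z
                int v = (z * ny + py) * nx + px;
                float rho   = density[v];                     // compensated density
                float alpha = 1.0f - __expf(-sigma_t * rho * step);
                L += T * alpha * source_radiance[v];          // precomputed source radiance
                T *= 1.0f - alpha;
                if (T < 1e-3f) break;                         // early ray termination
            }
            image[py * nx + px] = L;
        }

        int main()
        {
            const int nx = 64, ny = 64, nz = 64, nvox = nx * ny * nz;
            float *density, *src, *image;
            cudaMallocManaged(&density, nvox * sizeof(float));
            cudaMallocManaged(&src, nvox * sizeof(float));
            cudaMallocManaged(&image, nx * ny * sizeof(float));
            for (int i = 0; i < nvox; ++i) { density[i] = 0.2f; src[i] = 1.0f; }

            dim3 block(16, 16), grid((nx + 15) / 16, (ny + 15) / 16);
            ray_march<<<grid, block>>>(nx, ny, nz, density, src, 4.0f, 1.0f / nz, image);
            cudaDeviceSynchronize();
            printf("centre pixel radiance: %f\n", image[(ny / 2) * nx + nx / 2]);
            return 0;
        }

    Early termination once the transmittance drops below a small threshold is a standard optimization for this kind of march and is included in the sketch.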

    Balancing Fidelity and Performance in Iridal Light Transport Simulations Aimed at Interactive Applications

    Specific light transport models based on first-principles approaches have been proposed for complex organic materials such as human skin and blood. The driving force behind these efforts has been the high-fidelity reproduction of material appearance attributes without one having to rely on the manipulation of ad hoc parameters. These models, however, are usually considered excessively time consuming for rendering applications requiring interactive rates. In this thesis, we address this open problem with respect to one of the most challenging of these organic materials, namely the human iris. More specifically, we present a framework that consists of the careful configuration of the algorithms employed by a biophysically-based iridal light transport model on the CUDA (Compute Unified Device Architecture) parallel computing platform. We then investigate the sensitivity of iridal appearance attributes to key model running parameters, namely spectral resolution and number of sample rays, in order to obtain a practical balance between appearance fidelity and performance on this platform. The results of our investigation indicate that predictive light transport simulations can be effectively employed in the generation of iridal images that are not only believable, but also controlled by biophysically meaningful parameters. Although our investigation is centered on the human iris, we believe that it can be viewed as a proof of concept, and the proposed configuration strategies and parameter space explorations can be employed to obtain similar results for other organic materials.

    Interactive Rendering of Scattering and Refraction Effects in Heterogeneous Media

    In this dissertation we investigate the problem of interactive and real-time visualization of single scattering, multiple scattering and refraction effects in heterogeneous volumes. Our proposed solutions span a variety of use scenarios: from a very fast yet physically-based approximation to a physically accurate simulation of microscopic light transmission. We add to the state of the art by introducing a novel precomputation and sampling strategy, a system for efficiently parallelizing the computation of different volumetric effects, and a new and fast version of the Discrete Ordinates Method. Finally, we also present a collateral work on real-time 3D acquisition devices

    Heterogeneous parallel algorithms for computational fluid dynamics on unstructured meshes

    The frontiers of computational fluid dynamics (CFD) are constantly expanding and eagerly demanding more computational resources. Currently, we are experiencing a rapid evolution in high-performance computing systems driven by power-consumption constraints. New HPC nodes incorporate accelerators that are used as math co-processors to increase the throughput and the FLOP-per-watt ratio. On the other hand, multi-core CPUs have turned into energy-efficient system-on-chip architectures: the main components of the node are fused and integrated into a single chip, reducing energy costs. Nowadays, several institutions and governments are investing in the research and development of different aspects of HPC that could lead to the next generations of supercomputers. These initiatives have framed the problem as the exascale challenge, a goal that can only be achieved by incorporating major changes in computer architecture, memory design and network interfaces. The CFD community faces an important challenge: keeping pace with the rapid changes in HPC resources. Codes and formulations need to be redesigned in order to exploit the different levels of parallelism and the complex memory hierarchies of the new heterogeneous systems. The main characteristics demanded of the new CFD software are memory awareness, extreme concurrency, modularity and portability. This thesis is devoted to the study of refactoring a CFD algorithm for the adoption of new technologies. Our application context is the solution of incompressible flows (DNS or LES) on unstructured meshes. The first approach used GPUs to accelerate the Poisson solver, which is the most computationally intensive part of our application. The positive results obtained in this first step motivated us to port the complete time-integration phase of our application, which required a major redesign of the code. We propose a portable implementation model for CFD applications. The main idea is to substitute stencil data structures and kernels with algebraic storage formats and operators; by doing so, the algorithm is restructured into a minimal set of algebraic operations. The implementation strategy consisted in creating a low-level algebraic layer for computations on CPUs and GPUs, and a high-level, user-friendly discretization layer for CPUs that is fully localized at the preprocessing stage, where performance does not play an important role. As a result, at the time-integration phase the code relies on only three algebraic kernels: the sparse matrix-vector product (SpMV), the linear combination of two vectors (AXPY) and the dot product (DOT); a minimal sketch of two of these kernels follows this entry. Such a simple set of basic linear algebra operations naturally provides the desired portability to any computing architecture. Special attention was paid to the development of data structures compatible with the stream-processing model. A detailed performance analysis was carried out for both sequential and parallel execution, engaging up to 128 GPUs in a hybrid CPU/GPU supercomputer. Moreover, we tested the portable implementation model of the TermoFluids code on the Mont-Blanc mobile-based supercomputer. The redesign of the kernels exploits a heterogeneous execution model using both computing devices, CPU and GPU, of the ARM-based nodes. The load balancing between the two computing devices uses a tabu-search strategy that tunes the workload distribution during the preprocessing stage.
A comparison of the Mont-Blanc prototypes with high-end supercomputers in terms of achieved net performance and energy consumption provided some guidelines on the behavior of CFD applications on ARM-based architectures. Finally, we present a memory-aware, auto-tuned Poisson solver for problems with one Fourier-diagonalizable direction. This work was developed and tested on the BlueGene/Q Vesta supercomputer, and aims at demonstrating the relevance of vectorization and memory awareness for fully exploiting modern energy-efficient CPUs.
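    As an illustration of how small this algebraic kernel set is, here is a minimal CUDA sketch of two of the three operations, AXPY and DOT (a CSR SpMV kernel follows the same pattern); the sizes, launch configuration and host-side final reduction are illustrative assumptions, not code from TermoFluids.

        #include <cstdio>
        #include <cuda_runtime.h>

        // y <- a*x + y (AXPY)
        __global__ void axpy(int n, double a, const double* x, double* y)
        {
            int i = blockIdx.x * blockDim.x + threadIdx.x;
            if (i < n) y[i] = a * x[i] + y[i];
        }

        // Block-wise partial dot products; final reduction is done on the host here.
        // Assumes blockDim.x is a power of two.
        __global__ void dot_partial(int n, const double* x, const double* y, double* block_sums)
        {
            extern __shared__ double cache[];
            int i = blockIdx.x * blockDim.x + threadIdx.x;
            cache[threadIdx.x] = (i < n) ? x[i] * y[i] : 0.0;
            __syncthreads();
            for (int s = blockDim.x / 2; s > 0; s >>= 1) {
                if (threadIdx.x < s) cache[threadIdx.x] += cache[threadIdx.x + s];
                __syncthreads();
            }
            if (threadIdx.x == 0) block_sums[blockIdx.x] = cache[0];
        }

        int main()
        {
            const int n = 1 << 20, threads = 256, blocks = (n + threads - 1) / threads;
            double *x, *y, *partial;
            cudaMallocManaged(&x, n * sizeof(double));
            cudaMallocManaged(&y, n * sizeof(double));
            cudaMallocManaged(&partial, blocks * sizeof(double));
            for (int i = 0; i < n; ++i) { x[i] = 1.0; y[i] = 2.0; }

            axpy<<<blocks, threads>>>(n, 0.5, x, y);   // y = 0.5*x + y
            dot_partial<<<blocks, threads, threads * sizeof(double)>>>(n, x, y, partial);
            cudaDeviceSynchronize();

            double dot = 0.0;
            for (int b = 0; b < blocks; ++b) dot += partial[b];
            printf("dot(x, y) = %f\n", dot);           // expect 2.5 * n
            cudaFree(x); cudaFree(y); cudaFree(partial);
            return 0;
        }

    Keeping the time-integration loop expressed in terms of these few kernels is what gives the portability claimed above: only this thin algebraic layer needs a native implementation for each target architecture.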

    Realistic simulation and animation of clouds using SkewT-LogP diagrams

    Clouds and weather are important topics in computer graphics, in particular in the simulation and animation of natural phenomena, because the simulation of natural phenomena, clouds included, finds applications in movies, games and flight simulators. However, existing techniques in computer graphics only offer simplified cloud representations, made possible through fake dynamics that mimic reality. The problem that this work addresses is how to achieve realistic simulations of cloud formation and evolution that are suitable for virtual environments, i.e., clouds with physically-based dynamics over time.
Techniques for cloud simulation are available within the area of meteorology, but numerical weather prediction systems based on physical laws are computationally expensive and provide more numerical accuracy than is required in computer graphics. In computer graphics, we often need to direct and adjust physical features, or even to bend reality, to meet artistic goals, which is a key factor that makes computer graphics distinct from the physical sciences. However, purely physically-based simulations evolve their solutions according to pre-set physics rules that are notoriously difficult to control. In order to face these challenges we have developed a new, lightweight, physically-based cloud simulation scheme that simulates the dynamic properties of cloud formation. This new model avoids solving the physically-based equations typically used to simulate the formation of clouds by solving these equations explicitly using SkewT/LogP thermodynamic diagrams. The system incorporates a weather model that uses real data to simulate the parameters related to cloud formation. It is especially suitable for the simulation of cumulus clouds, which result from a convective process. This approach not only reduces the computational costs of previous physically-based methods, but also provides a technique to control the shape and dynamics of clouds by handling the cloud levels in SkewT/LogP diagrams. In this thesis, we have also tackled a new challenge, related to the simulation of orographic clouds. To our knowledge, this is the first attempt to simulate this type of cloud formation. The novelty of this method lies in the fact that these clouds are non-convective, so different atmospheric levels have to be determined. Moreover, since orographic clouds form over mountains, we also have to determine the mountain's influence on the cloud motion. In summary, this thesis presents a set of algorithms for the modelling and simulation of cumulus and orographic clouds, taking advantage of SkewT/LogP diagrams for the first time in the field of computer graphics.
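    As a concrete example of the kind of level read off such a diagram, the cumulus cloud base is tied to the lifting condensation level; the closed-form expression below is the standard Espy approximation, shown only for illustration, since the thesis derives its levels directly from the SkewT/LogP diagram. For a surface parcel with temperature T and dew point T_d, both in degrees Celsius,

        z_{\mathrm{LCL}} \;\approx\; 125\,(T - T_d)\ \text{m},

    so a parcel lifted above roughly this height saturates and, if convection continues, marks the base of the simulated cumulus cloud.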

    Physically-based Cloud Rendering on GPU

    The rendering of participating media is an interesting and important problem without a simple solution. Even among the wide variety of participating media, clouds stand out as an especially difficult case because of properties that make their simulation even harder. The work presented in this thesis attempts to provide a solution to this problem and, moreover, to make the proposed method work at interactive rendering speeds. The main criteria in designing this method were its physical plausibility and maximal utilization of specific cloud properties that help to balance the complex nature of clouds. As a result, the proposed method builds on the well-known photon mapping algorithm, but modifies it in several ways to obtain interactive and temporally coherent results. This is further helped by designing the method in a way that allows its implementation on contemporary GPUs, taking advantage of their massively parallel computational power. We implement a prototype of the method in an application that renders a single realistic cloud at interactive frame rates, and discuss possible extensions of the proposed technique that would allow its use in various practical industrial applications.
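    For reference, the quantity such a volumetric photon map is used to estimate is the in-scattered radiance; the expression below is the standard radiance estimate for participating media from volumetric photon mapping, gathered from the k nearest photons within a sphere of radius r, and does not reflect the thesis's GPU-oriented modifications:

        L_i(\mathbf{x}, \vec{\omega}) \;\approx\; \frac{1}{\sigma_s(\mathbf{x})} \sum_{p=1}^{k} f(\mathbf{x}, \vec{\omega}_p, \vec{\omega})\, \frac{\Delta\Phi_p}{\tfrac{4}{3}\pi r^3},

    where sigma_s is the scattering coefficient, f the phase function, and Delta Phi_p the flux carried by photon p; this gather step is the part an interactive variant has to make cheap on the GPU.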