16 research outputs found

    Harvesting graphics power for MD simulations

    We discuss an implementation of molecular dynamics (MD) simulations on a graphics processing unit (GPU) in the NVIDIA CUDA language. We tested our code on a modern GPU, the NVIDIA GeForce 8800 GTX. Results are presented for two MD algorithms, suitable for short-ranged and long-ranged interactions, and for a congruential shift random number generator. The performance of the GPU is compared to that of its main-processor counterpart. We achieve speedups of up to 80-, 40- and 150-fold, respectively. With the newest generation of GPUs, one can run standard MD simulations at 10^7 flops/$. Comment: 12 pages, 5 figures. Submitted to Mol. Si

    Autómata de Lattice Boltzmann para modelar la difusión óptica en materiales translúcidos (Lattice Boltzmann automaton for modeling optical diffusion in translucent materials)

    Probing translucent objects with laser light in the near-infrared range is a technique for gathering tomographic information that is increasingly used in medical diagnosis and industrial inspection. This work presents a strategy for simulating the diffusion of visible light in translucent materials based on the lattice Boltzmann method (LBM). LBM is a cellular automaton that simulates transport phenomena at the macroscopic level through a mesoscopic representation; it is very easy to implement and highly parallelizable. In our case, photon transport in matter is modeled by a collision and absorption matrix defined in each cell of the simulated spatial domain. The supporting grid is three-dimensional, and the results are visualized by superimposing the elements of a triangular mesh. The model was validated against experimental data measured on a laboratory phantom. Possible applications of the automaton in a visualization engine are also presented. Sociedad Argentina de Informática e Investigación Operativ
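As a rough illustration of the collide, absorb, and stream cycle that the abstract above describes, here is a minimal sketch. It is not the authors' model: it assumes a one-dimensional, two-velocity lattice with periodic boundaries, a single scalar relaxation rate `omega`, and a uniform per-step `absorption` fraction, where the paper uses a three-dimensional grid and a per-cell collision and absorption matrix. The function name and all parameters are hypothetical.

```python
def lbm_diffusion_absorption(rho0, steps, omega=1.0, absorption=0.0):
    """Toy 1D two-velocity lattice Boltzmann diffusion with uniform absorption.

    rho0 is the initial density per cell. omega and absorption are
    illustrative assumptions, not values taken from the paper.
    """
    # Split the initial density equally between the two directions.
    f_right = [0.5 * r for r in rho0]
    f_left = [0.5 * r for r in rho0]
    keep = 1.0 - absorption  # fraction that survives each step

    for _ in range(steps):
        for i in range(len(rho0)):
            # Collision: relax both populations toward the local equilibrium,
            # then apply absorption as a uniform loss.
            f_eq = 0.5 * (f_right[i] + f_left[i])
            f_right[i] = keep * (f_right[i] + omega * (f_eq - f_right[i]))
            f_left[i] = keep * (f_left[i] + omega * (f_eq - f_left[i]))
        # Streaming on a periodic lattice: shift each population one cell.
        f_right = f_right[-1:] + f_right[:-1]
        f_left = f_left[1:] + f_left[:1]

    return [r + l for r, l in zip(f_right, f_left)]


# A point source in the middle of a 64-cell lattice.
source = [0.0] * 64
source[32] = 1.0
lossless = lbm_diffusion_absorption(source, steps=20)
absorbing = lbm_diffusion_absorption(source, steps=20, absorption=0.05)
```

In this sketch the collision step conserves the local density, so without absorption the total stays constant while the pulse spreads; with absorption the total decays geometrically at every step.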

    Interactive deformation and visualization of level set surfaces using graphics hardware

    Technical report. Deformable isosurfaces, implemented with level-set methods, have demonstrated a great potential in visualization for applications such as segmentation, surface processing, and surface reconstruction. Their usefulness has been limited, however, by two problems. First, 3D level sets are relatively slow to compute. Second, their formulation usually entails several free parameters that can be difficult to tune correctly for specific applications. The second problem is compounded by the first. This paper presents a solution to these challenges by describing graphics processor (GPU) based algorithms for solving and visualizing level-set solutions at interactive rates. Our efficient GPU-based solution relies on packing the level-set isosurface data into a dynamic, sparse texture format. As the level set moves, this sparse data structure is updated via a novel GPU to CPU message passing scheme. When the level-set solver is integrated with a real-time volume renderer operating on the same p

    A data parallel approach to genetic programming using programmable graphics hardware


    A holistic scalable implementation approach of the lattice Boltzmann method for CPU/GPU heterogeneous clusters

    This is the author accepted manuscript. The final version is available from MDPI via the DOI in this record. Heterogeneous clusters are a widely utilized class of supercomputers assembled from different types of computing devices, for instance CPUs and GPUs, providing a huge computational potential. Programming them in a scalable way exploiting the maximal performance introduces numerous challenges such as optimizations for different computing devices, dealing with multiple levels of parallelism, the application of different programming models, work distribution, and hiding of communication with computation. We utilize the lattice Boltzmann method for fluid flow as a representative of a scientific computing application and develop a holistic implementation for large-scale CPU/GPU heterogeneous clusters. We review and combine a set of best practices and techniques ranging from optimizations for the particular computing devices to the orchestration of tens of thousands of CPU cores and thousands of GPUs. Eventually, we come up with an implementation using all the available computational resources for the lattice Boltzmann method operators. Our approach shows excellent scalability behavior making it future-proof for heterogeneous clusters of the upcoming architectures on the exaFLOPS scale. Parallel efficiencies of more than 90% are achieved leading to 2,604.72 GLUPS utilizing 24,576 CPU cores and 2,048 GPUs of the CPU/GPU heterogeneous cluster Piz Daint and computing more than 6.8 · 10^9 lattice cells. This work was supported by the German Research Foundation (DFG) as part of the Transregional Collaborative Research Centre “Invasive Computing” (SFB/TR 89). In addition, this work was supported by a grant from the Swiss National Supercomputing Centre (CSCS) under project ID d68. We further thank the Max Planck Computing & Data Facility (MPCDF) and the Global Scientific Information and Computing Center (GSIC) for providing computational resources

    A framework for digital watercolor

    This research develops an extendible framework for reproducing watercolor in a digital environment, with a focus on interactivity using the GPU. The framework uses the lattice Boltzmann method, a relatively new approach to fluid dynamics, and the Kubelka-Munk reflectance model to capture the optical properties of watercolor. The work is demonstrated through several paintings produced using the system

    Animation de phénomènes gazeux basée sur la simulation d'un modèle de fluide à phase unique sur le GPU (Animation of gaseous phenomena based on simulating a single-phase fluid model on the GPU)

    This thesis deals with the passive animation of natural phenomena. Animation is called passive when it is direct, with no control and no inverse dynamics. In particular, the natural phenomena treated are gaseous phenomena, and more precisely those modeled by a single-phase fluid. First, the field of fluid animation synthesized by simulating a physical model is introduced, along with the problem addressed. The document comprises three contributions that address the problem at different levels. The first work presents a method for solving the Navier-Stokes equations in a single iteration on the GPU (Graphical Processing Unit). The method is so simple that it could be implemented in less than a working day in Fx-Composer (video: http://www.youtube.com/watch?v=PScfTOKbSpU). Besides being extremely fast on the GPU, this method makes fluid animation much more accessible and can be used for different purposes: as an introduction to fluid animation through an easy-to-implement method, or to quickly add visual effects to a video game or other interactive application. The second contribution addresses the problem at the level of fluid visualization. It develops an explicit and unconditionally stable method for numerically solving the convection-diffusion equation used to simulate the density of a gas that is both diffused and transported through the domain by a velocity vector field, which in our case represents the motion of a fluid. The third article addresses the problem at the level of computational complexity and reduces 3D fire animation to a 2D computation carried out strictly in screen space. The complexity of a screen-space animation is constant for a given image resolution, since the computations are performed only on the pixels of the screen (or on a sub-resolution of them)

    Accelerating Missile Threat Engagement Simulations Using Personal Computer Graphics Cards

    The 453rd Electronic Warfare Squadron supports on-going military operations by providing battlefield commanders with aircraft ingress and egress routes that minimize the risk of shoulder or ground-fired missile attacks on our aircraft. To determine these routes, the 453rd simulates engagements between ground-to-air missiles and allied aircraft to determine the probability of a successful attack. The simulations are computationally expensive, often requiring two hours for a single 10-second missile engagement. Hundreds of simulations are needed to perform a complete risk assessment, which includes evaluating the effectiveness of countermeasures such as flares, chaff, jammers, and missile warning systems. Thus, the need for faster simulations is acute. This research speeds up these mission-critical simulations by using inexpensive commodity PC graphics cards to perform the intensive image processing computations used to simulate a heat-seeking missile's tracking system. The innovative techniques developed in this research reduce execution time by 33% and incorporate a user-selectable fidelity feature to perform high-fidelity simulations when required. Furthermore, these image processing computations use only 5% of the available computational capacity of the graphics cards, providing a ready source of additional computational power for future simulation enhancements. Analysts can now meet shorter suspenses with more accurate products, ultimately enhancing the safety of Air Force pilots and their weapon systems. With ongoing operations in Iraq and Afghanistan, and a growing threat at home and abroad posed by the proliferation of man-portable missiles, the speed of these simulations plays an important role in protecting forces and saving lives

    Parallel fluid dynamics for the film and animation industries

    Includes bibliographical references (leaves 142-149). The creation of automated fluid effects for film and media using computer simulations is popular, as artist time is reduced and greater realism can be achieved through the use of numerical simulation of physical equations. The fluid effects in today’s films and animations have large scenes with high detail requirements. With these requirements, the time taken by such automated approaches is large. To solve this, cluster environments making use of hundreds or more CPUs have been used. This overcomes the processing power and memory limitations of a single computer and allows very large scenes to be created. One of the newer methods for fluid simulation is the Lattice Boltzmann Method (LBM). This is a cellular-automaton type of algorithm, which parallelizes easily. An important part of the process of parallelization is load balancing: the distribution of computation amongst the available computing resources in the cluster. To date, the parallelization of the Lattice Boltzmann method only makes use of static load balancing. Instead, it is possible to make use of dynamic load balancing, which adjusts the computation distribution as the simulation progresses. Here, we investigate the use of the LBM in conjunction with a Volume of Fluid (VOF) surface representation in a parallel environment with the aim of producing large scale scenes for the film and animation industries. The VOF method tracks mass exchange between cells of the LBM. In particular, we implement the new dynamic load balancing algorithm to improve the efficiency of the fluid simulation using this method. Fluid scenes from films and animations have two important requirements: the amount of detail and the spatial resolution of the fluid. These aspects of the VOF LBM are explored by considering the time for scene creation using a single and multi-CPU implementation of the method. The scalability of the method is studied by plotting the run time, speedup and efficiency of scene creation against the number of CPUs. From such plots, an estimate is obtained of the feasibility of creating scenes of a given level of detail. Such estimates enable the recommendation of architectures for creation of specific scenes. Using a parallel implementation of the VOF LBM method we successfully create large scenes with great detail. In general, considering the significant amounts of communication required for the parallel method, it is shown to scale well, favouring scenes with greater detail. The scalability studies show that the new dynamic load balancing algorithm improves the efficiency of the parallel implementation, but only when using a lower number of CPUs. In fact, for a larger number of CPUs, the dynamic algorithm reduces the efficiency. We hypothesise the latter effect can be removed by making use of a centralized load balancing decision instead of the current decentralized approach. The use of a cluster comprising 200 CPUs is recommended for the production of large scenes of a grid size 600^3 in a reasonable time frame
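The speedup and efficiency measures mentioned in the abstract above are usually computed from measured run times as speedup = T(1) / T(p) and efficiency = speedup / p, where p is the number of CPUs. The sketch below assumes those standard definitions; the function name and the run times are made-up illustrations, not results from the thesis.

```python
def scaling_table(run_times):
    """Return (cpus, speedup, efficiency) rows from wall-clock run times.

    run_times maps a CPU count to a run time in seconds and must include
    an entry for 1 CPU, which serves as the baseline.
    """
    baseline = run_times[1]
    table = []
    for cpus in sorted(run_times):
        speedup = baseline / run_times[cpus]   # how many times faster than 1 CPU
        efficiency = speedup / cpus            # fraction of ideal linear scaling
        table.append((cpus, speedup, efficiency))
    return table


# Hypothetical timings, for illustration only.
example_times = {1: 1000.0, 10: 125.0, 100: 20.0}
table = scaling_table(example_times)
for cpus, speedup, efficiency in table:
    print(f"{cpus:4d} CPUs  speedup {speedup:6.1f}  efficiency {efficiency:5.2f}")
```

An efficiency that drops as CPUs are added, as in these invented numbers, is the kind of trend such plots are meant to expose.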