
    Simulation of reaction diffusion processes over biologically relevant size and time scales using multi-GPU workstations

    Simulation of in vivo cellular processes with the reaction–diffusion master equation (RDME) is a computationally expensive task. Our previous software enabled simulation of inhomogeneous biochemical systems for small bacteria over long time scales using the MPD-RDME method on a single GPU. Simulations of larger eukaryotic systems exceed the on-board memory capacity of individual GPUs, and long-time simulations of modest-sized cells such as yeast are impractical on a single GPU. We present a new multi-GPU parallel implementation of the MPD-RDME method based on a spatial decomposition approach that supports dynamic load balancing for workstations containing GPUs of varying performance and memory capacity. We take advantage of high-performance features of CUDA for peer-to-peer GPU memory transfers and evaluate the performance of our algorithms on state-of-the-art GPU devices. We present parallel efficiency and performance results for simulations using multiple GPUs as system size, particle counts, and number of reactions grow. We also demonstrate multi-GPU performance in simulations of the Min protein system in E. coli. Moreover, our multi-GPU decomposition and load-balancing approach can be generalized to other lattice-based problems.
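    The abstract includes no code, but the combination of spatial decomposition and CUDA peer-to-peer transfers it describes can be sketched roughly as follows. This is a minimal illustration under assumed names and slab layout (exchangeHalo, one ghost plane per side), not the authors' implementation.

```cuda
// Minimal sketch: halo exchange between two GPUs holding adjacent z-slabs
// of a 3D lattice, each padded with one ghost plane per side. All names
// and the layout are illustrative assumptions, not the paper's code.
#include <cuda_runtime.h>

// planeSize = nx * ny lattice sites; each slab stores nzLocal interior
// planes at indices 1..nzLocal, with ghost planes at 0 and nzLocal + 1.
void exchangeHalo(float* d_slab0, float* d_slab1,
                  size_t planeSize, size_t nzLocal) {
    size_t bytes = planeSize * sizeof(float);
    // Last interior plane of GPU 0 -> lower ghost plane of GPU 1.
    cudaMemcpyPeer(d_slab1, 1,
                   d_slab0 + nzLocal * planeSize, 0, bytes);
    // First interior plane of GPU 1 -> upper ghost plane of GPU 0.
    cudaMemcpyPeer(d_slab0 + (nzLocal + 1) * planeSize, 0,
                   d_slab1 + planeSize, 1, bytes);
}

int main() {
    // Direct peer access must be enabled once in each direction.
    cudaSetDevice(0); cudaDeviceEnablePeerAccess(1, 0);
    cudaSetDevice(1); cudaDeviceEnablePeerAccess(0, 0);
    // ... allocate padded slabs on each device, run one reaction-diffusion
    // timestep per GPU, then call exchangeHalo() before the next step ...
    return 0;
}
```

    Dynamic load balancing of the kind the abstract mentions would then amount to adjusting each device's nzLocal according to measured per-step times.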

    Docking of low-molecular-weight ligands on the plant FtsZ protein: applying CUDA technology to accelerate the computations

    The opportunities for applying CUDA technology to accelerate computations in structural biology and bioinformatics are reviewed and analyzed. Using the HEX 6.1 software as an example, we performed a comparative analysis of the performance gain and the quality of results for rigid docking of low-molecular-weight compounds of different classes on the surface of the FtsZ protein from Arabidopsis thaliana. Several potential binding sites of benzimidazoles on the plant FtsZ were identified.

    An Architecture Approach for 3D Render Distribution using Mobile Devices in Real Time

    Nowadays, video games such as Massively Multiplayer Online Games (MMOGs) have become cultural mediators. Mobile games account for a large share of downloads and potential revenue in the application market. Although the processing power of mobile devices keeps increasing, poor network connectivity can bottleneck Gaming as a Service (GaaS). To enhance performance in this digital ecosystem, processing tasks are distributed between thin-client devices and robust servers. This research follows a divide-and-conquer approach: the volumetric surfaces in a sequence of game scenes are subdivided with a KD-tree, reducing each surface to small sets of points. Reconstruction efficiency improves because data searches are confined to small, local regions. Processes are modeled as a finite set of states built with Hidden Markov Models whose domains are configured by heuristics. Six tests that control the states of each heuristic, including the number of intervals, were carried out to validate the proposed model; the validation concludes that the model optimizes the frames-per-second response over a sequence of interactions.
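    The abstract gives no implementation details, but the KD-tree subdivision it relies on can be sketched as below: recursive median splits along alternating axes until each leaf holds a small local set of points. The names, the leaf size, and the host-side formulation are assumptions for illustration, not the authors' system.

```cuda
// Host-side sketch of KD-tree subdivision: points from a scene's volumetric
// surface are split at the median along alternating axes until each leaf
// holds a small set. Names and leafSize are illustrative assumptions.
#include <algorithm>
#include <vector>

struct Point { float x, y, z; };

struct KdNode {
    std::vector<Point> pts;            // populated only at leaves
    KdNode *left = nullptr, *right = nullptr;
};

KdNode* build(std::vector<Point> pts, int depth = 0, size_t leafSize = 32) {
    KdNode* node = new KdNode;
    if (pts.size() <= leafSize) {      // small local region: stop splitting
        node->pts = std::move(pts);
        return node;
    }
    int axis = depth % 3;              // cycle through x, y, z
    auto key = [axis](const Point& p) {
        return axis == 0 ? p.x : axis == 1 ? p.y : p.z;
    };
    // A median split keeps the two halves balanced, so later searches touch
    // only small, local regions of the surface.
    size_t mid = pts.size() / 2;
    std::nth_element(pts.begin(), pts.begin() + mid, pts.end(),
                     [&](const Point& a, const Point& b) { return key(a) < key(b); });
    std::vector<Point> lo(pts.begin(), pts.begin() + mid);
    std::vector<Point> hi(pts.begin() + mid, pts.end());
    node->left  = build(std::move(lo), depth + 1, leafSize);
    node->right = build(std::move(hi), depth + 1, leafSize);
    return node;
}
```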

    Computational Methods in Science and Engineering: Proceedings of the Workshop SimLabs@KIT, November 29–30, 2010, Karlsruhe, Germany

    In this proceedings volume we provide a compilation of articles covering applications from different research fields, ranging from capacity up to capability computing. Besides classical computing aspects such as parallelization, the focus of these proceedings is on multi-scale approaches and methods for tackling algorithmic and data complexity. Practical aspects of using the HPC infrastructure and the tools and software available at the SCC are also presented.

    I-Light Symposium 2005 Proceedings

    I-Light was made possible by a special appropriation by the State of Indiana. The research described at the I-Light Symposium has been supported by numerous grants from several sources. Any opinions, findings, conclusions, or recommendations expressed in the 2005 I-Light Symposium Proceedings are those of the researchers and authors and do not necessarily reflect the views of the granting agencies. Indiana University Office of the Vice President for Research and Information Technology; Purdue University Office of the Vice President for Information Technology and CIO.

    Solving Hyperbolic PDEs using Accelerator Architectures

    Accelerator architectures are used to accelerate the simulation of nonlinear hyperbolic PDEs. Three different architectures are investigated: a multicore CPU using threading, IBM's Cell processor, and Nvidia's Tesla GPUs. Speed-ups of between 40× and 75× relative to a single CPU core in single precision are obtained using the Cell processor and the GPU. The three implementations are extended to parallel computing clusters by making use of the Message Passing Interface (MPI). The resulting hybrid-parallel code is investigated for performance and scalability on both a GPU cluster and a Cell cluster.
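    The abstract stays at the architecture level; as a concrete illustration of the kind of kernel such codes run, below is a minimal CUDA sketch of one explicit time step for the 1D linear advection equation u_t + a u_x = 0 using the Lax-Friedrichs scheme. The choice of scheme, the periodic grid, and all names are assumptions, not the thesis code.

```cuda
// Minimal sketch: one Lax-Friedrichs step for u_t + a*u_x = 0 on a periodic
// 1D grid. Scheme, names, and parameters are illustrative assumptions.
#include <cuda_runtime.h>

__global__ void laxFriedrichsStep(const float* u, float* uNew,
                                  int n, float a, float dt, float dx) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    int left  = (i - 1 + n) % n;   // periodic boundaries
    int right = (i + 1) % n;
    // u_i^{n+1} = (u_{i-1} + u_{i+1})/2 - (a*dt/(2*dx)) * (u_{i+1} - u_{i-1})
    uNew[i] = 0.5f * (u[left] + u[right])
            - 0.5f * a * dt / dx * (u[right] - u[left]);
}

// Host side: launch once per time step, then swap the u/uNew pointers, e.g.
// laxFriedrichsStep<<<(n + 255) / 256, 256>>>(d_u, d_uNew, n, a, dt, dx);
```

    An MPI extension of the kind the abstract describes would partition the grid across ranks and exchange one boundary cell per neighbour between steps.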

    Graphics Processing Unit Accelerated Coarse-Grained Protein-Protein Docking

    Graphics processing unit (GPU) architectures are increasingly used for general-purpose computing, providing the means to migrate algorithms from the SISD paradigm, synonymous with CPU architectures, to the SIMD paradigm. Generally programmable commodity multi-core hardware can yield significant speed-ups for migrated codes. Because of their computational complexity, molecular simulations in particular stand to benefit from GPU acceleration. Coarse-grained molecular models offer reduced complexity compared to the traditional, computationally expensive, all-atom models. However, while coarse-grained models are much less computationally expensive than the all-atom approach, the pairwise energy calculations required at each iteration of the algorithm remain a computational bottleneck for a serial implementation. In this work, we describe a GPU implementation of the Kim-Hummer coarse-grained model for protein docking simulations, using a Replica Exchange Monte Carlo (REMC) method. Our highly parallel implementation vastly increases the size and time scales accessible to molecular simulation. We describe in detail the complex process of migrating the algorithm to a GPU as well as the effect of various GPU approaches and optimisations on algorithm speed-up. Our benchmarking and profiling show that the GPU implementation scales very favourably compared to a CPU implementation. Small reference simulations benefit from a modest speed-up of between 4 and 10 times, while large simulations, containing many thousands of residues, benefit from asynchronous GPU acceleration to a far greater degree and exhibit speed-ups of up to 1400 times. We demonstrate the utility of our system on some model problems. We investigate the effects of macromolecular crowding, using a repulsive crowder model, and find our results to agree with those predicted by scaled particle theory. We also perform initial studies into the simulation of viral capsid assembly, demonstrating the crude assembly of capsid pieces into a small fragment. This is the first implementation of REMC docking on a GPU, and the resulting speed-ups alter the tractability of large-scale simulations: simulations that would otherwise require months or years can be performed in days or weeks using a GPU.
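    The pairwise energy evaluation named as the bottleneck maps naturally onto GPU threads, with one thread accumulating the interaction energy of one residue against all others. The sketch below shows this common formulation; the Lennard-Jones-style pair term and all names are assumptions for illustration, not the Kim-Hummer potential as implemented in the paper.

```cuda
// Minimal sketch of a pairwise energy kernel: thread i sums the interaction
// of residue i with every residue j != i. The simple 12-6 pair term and all
// names are illustrative assumptions, not the paper's potential.
#include <cuda_runtime.h>

struct Residue { float x, y, z; float eps, sigma; };

__global__ void pairwiseEnergy(const Residue* r, float* eHalf, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    float e = 0.0f;
    for (int j = 0; j < n; ++j) {
        if (j == i) continue;
        float dx = r[i].x - r[j].x;
        float dy = r[i].y - r[j].y;
        float dz = r[i].z - r[j].z;
        float r2 = dx * dx + dy * dy + dz * dz;
        float sig = 0.5f * (r[i].sigma + r[j].sigma);   // mixing rules
        float eps = sqrtf(r[i].eps * r[j].eps);
        float s6  = powf(sig * sig / r2, 3.0f);          // (sigma/r)^6
        e += 4.0f * eps * (s6 * s6 - s6);                // pair counted twice
    }
    eHalf[i] = 0.5f * e;  // halve so the final sum counts each pair once
}
// Total energy per MC step: reduce eHalf on the device (e.g. thrust::reduce).
```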

    Optimisation of the first-principles code Octopus for massively parallel architectures: application to light-harvesting complexes

    Computer simulation has become a powerful technique for assisting scientists in developing novel insights into the basic phenomena underlying a wide variety of complex physical systems. The work reported in this thesis concerns the use of massively parallel computers to simulate, at the electronic-structure level, the fundamental features that control the initial stages of the harvesting and transfer of solar energy in green plants, which initiate the photosynthetic process. Currently available supercomputer facilities offer the possibility of using hundreds of thousands of computing cores. However, obtaining a linear speed-up from HPC systems is far from trivial, so great effort must be devoted to understanding the nature of the scientific code, the methods of parallel execution, the data-communication requirements of multi-process calculations, the efficient use of available memory, and so on. This thesis deals with all of these themes, with a clear objective in mind: the electronic-structure simulation of complete macromolecular complexes, namely the Light Harvesting Complex II, with the aim of understanding its physical behaviour. To simulate this complex, we have used (with the assistance of the PRACE consortium) some of the most powerful supercomputers in Europe to run Octopus, a scientific software package for Density Functional Theory and Time-Dependent Density Functional Theory calculations. Results obtained with Octopus were analysed in depth to identify the main obstacles to optimal scaling on thousands of cores. Many problems emerged, chiefly the poor performance of the Poisson solver, high memory requirements, and the transfer of large quantities of complex data structures among processes. All of these problems were overcome, and the new version reaches very high performance on massively parallel systems: tests run efficiently on up to 128K processors, and we have thus been able to complete the largest TDDFT calculations performed to date. At the conclusion of this work it has been possible to study the Light Harvesting Complex II as originally envisioned.
    Acknowledgements: University of the Basque Country (UPV/EHU), University of Coimbra, Red Española de Supercomputación (RES), Jülich Supercomputing Centre (JSC), Rechenzentrum Garching, Cineca, Barcelona Supercomputing Center (BSC), CeSViMa, European Research Council Advanced Grant DYNamo (ERC-2010-AdG-267374), Spanish Grant FIS2013-46159-C3-1-P, Grupos Consolidados UPV/EHU del Gobierno Vasco (IT578-13 and IT395-10), European Community FP7 project CRONOS (grant 280879-2), COST Actions CM1204 (XLIC) and MP1306 (EUSpec). The ALDAPA research group belongs to the Basque Advanced Informatics Laboratory (BAILab), supported by the University of the Basque Country UPV/EHU (grant UFI11/45). Peer reviewed.
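    The Poisson solver singled out as the main scaling obstacle is, in spectral form, a conceptually simple operation. Below is a minimal CUDA/cuFFT sketch of a periodic 3D Poisson solve, dividing each Fourier mode of the density by -|k|². This only illustrates the idea; Octopus's actual solvers, and the optimisations developed in the thesis, are different and considerably more sophisticated. All names, units, and grid assumptions here are illustrative.

```cuda
// Minimal sketch: periodic 3D Poisson solve with cuFFT, lap(phi) = rho.
// Each Fourier mode is scaled by -1/|k|^2. Not the Octopus solver.
#include <cufft.h>
#include <cuda_runtime.h>

__global__ void divideByMinusK2(cufftComplex* f, int n, float twoPiOverL) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx >= n * n * n) return;
    int i = idx / (n * n), j = (idx / n) % n, k = idx % n;
    // Map array indices to signed frequencies.
    auto freq = [n, twoPiOverL](int m) {
        return twoPiOverL * (m <= n / 2 ? m : m - n);
    };
    float kx = freq(i), ky = freq(j), kz = freq(k);
    float k2 = kx * kx + ky * ky + kz * kz;
    float scale = (k2 > 0.0f) ? -1.0f / k2 : 0.0f;  // zero mode set to zero
    f[idx].x *= scale;
    f[idx].y *= scale;
}

void solvePoisson(cufftComplex* d_rho, int n, float boxLength) {
    cufftHandle plan;
    cufftPlan3d(&plan, n, n, n, CUFFT_C2C);
    cufftExecC2C(plan, d_rho, d_rho, CUFFT_FORWARD);
    int total = n * n * n;
    divideByMinusK2<<<(total + 255) / 256, 256>>>(d_rho, n, 6.2831853f / boxLength);
    cufftExecC2C(plan, d_rho, d_rho, CUFFT_INVERSE);
    // cuFFT's inverse transform is unnormalized: scale the result by
    // 1.0f / total on the caller's side.
    cufftDestroy(plan);
}
```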