
    Connected component identification and cluster update on GPU

    Cluster identification tasks occur in a multitude of contexts in physics and engineering, such as cluster algorithms for simulating spin models, percolation simulations, segmentation problems in image processing, and network analysis. While it has been shown that graphics processing units (GPUs) can deliver speedups of two to three orders of magnitude over serial CPU codes for local, naturally parallel problems such as single-spin-flip update simulations of spin models, the situation is considerably more complicated for the non-local problem of cluster or connected component identification. I discuss the suitability of different approaches to parallelizing cluster labeling and cluster update algorithms for GPUs and compare their performance to serial implementations.
    Comment: 15 pages, 14 figures, one table, submitted to PR
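    The serial CPU baseline such GPU labeling schemes are typically measured against can be sketched with a standard union-find pass over the grid. The grid layout, function names, and 4-connectivity below are illustrative assumptions for this sketch, not the paper's actual code.

```python
# Serial union-find baseline for cluster (connected component)
# identification on a 2D occupancy grid. Illustrative sketch only.

def find(parent, x):
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path halving keeps trees shallow
        x = parent[x]
    return x

def union(parent, a, b):
    ra, rb = find(parent, a), find(parent, b)
    if ra != rb:
        parent[rb] = ra

def label_clusters(grid):
    """Return the number of 4-connected clusters of occupied (1) sites."""
    h, w = len(grid), len(grid[0])
    parent = {(i, j): (i, j) for i in range(h) for j in range(w) if grid[i][j]}
    for i in range(h):
        for j in range(w):
            if not grid[i][j]:
                continue
            if i > 0 and grid[i - 1][j]:     # merge with site above
                union(parent, (i - 1, j), (i, j))
            if j > 0 and grid[i][j - 1]:     # merge with site to the left
                union(parent, (i, j - 1), (i, j))
    return len({find(parent, site) for site in parent})
```

    The non-local merging in `union` is precisely what makes a GPU port non-trivial: unlike a single-spin flip, one merge can touch labels anywhere in the lattice.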

    Status and Future Perspectives for Lattice Gauge Theory Calculations to the Exascale and Beyond

    In this and a set of companion whitepapers, the USQCD Collaboration lays out a program of science and computing for lattice gauge theory. These whitepapers describe how calculations using lattice QCD (and other gauge theories) can aid the interpretation of ongoing and upcoming experiments in particle and nuclear physics, as well as inspire new ones.
    Comment: 44 pages. 1 of USQCD whitepapers

    Association of polymers and small solute molecules with phospholipid membranes

    This work is devoted to the development and practical application of several theoretical and simulation methods, including molecular dynamics, Monte Carlo, and mean-field calculations, to understand the physical properties of the lipid bilayer system as well as the interaction of nano-scale objects in contact with lipid bilayers. In particular, the work covers the following topics: 1. Optimization and equilibrium properties of lipid bilayers using Single Chain Mean Field theory. 2. Development of a model for, and study of, the equilibrium properties of bilayers with oxidized lipids using mean-field and molecular dynamics methods. 3. Study of the equilibrium properties of bilayers with nanoparticles using mean-field and Monte Carlo methods. 4. Optimization of polymer translocation through the membrane using a GPU technique. 5. Statistical methods applied to the surface properties of micro blades disrupting lipid bilayers.

    SIMULATeQCD: A simple multi-GPU lattice code for QCD calculations

    The rise of exascale supercomputers has fueled competition among GPU vendors, driving lattice QCD developers to write code that supports multiple APIs. Moreover, new developments in algorithms and physics research require frequent updates to existing software. These challenges have to be balanced against constantly changing personnel. At the same time, there is a wide range of applications for HISQ fermions in QCD studies. This situation encourages the development of software featuring a HISQ action that is flexible, high-performing, open source, easy to use, and easy to adapt. In this technical paper, we explain the design strategy, provide implementation details, list available algorithms and modules, and show key performance indicators for SIMULATeQCD, a simple multi-GPU lattice code for large-scale QCD calculations, mainly developed and used by the HotQCD collaboration. The code is publicly available on GitHub.
    Comment: 17 pages, 7 figures
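    The basic observable any lattice gauge code evaluates is the plaquette. As a heavily simplified illustration of the structure of such a computation (SIMULATeQCD itself implements the far more involved HISQ action for SU(3) across multiple GPUs), here is a serial plaquette average for a toy 2D U(1) theory; the names and lattice setup are assumptions of this sketch.

```python
import cmath

# Toy 2D U(1) lattice: each link carries a phase theta[x][y][mu],
# mu = 0 (x-direction) or 1 (y-direction), with periodic boundaries.

def average_plaquette(theta, L):
    """Average of Re exp(i * plaquette phase) over all L*L plaquettes."""
    total = 0.0
    for x in range(L):
        for y in range(L):
            # Phase around the elementary square at (x, y):
            # U_x(x,y) U_y(x+1,y) U_x(x,y+1)^-1 U_y(x,y)^-1
            p = (theta[x][y][0] + theta[(x + 1) % L][y][1]
                 - theta[x][(y + 1) % L][0] - theta[x][y][1])
            total += cmath.exp(1j * p).real
    return total / (L * L)

L = 4
cold = [[[0.0, 0.0] for _ in range(L)] for _ in range(L)]  # all links trivial
print(average_plaquette(cold, L))  # cold start: every plaquette is exactly 1
```

    In a production code this loop is the innermost kernel, parallelized one lattice site per GPU thread, which is why portability across vendor APIs matters so much.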

    Neural network learns physical rules for copolymer translocation through amphiphilic barriers

    Recent developments in computer processing power have led to new paradigms for how problems in many-body physics, and especially polymer physics, can be addressed. Parallel processors can be exploited to generate millions of molecular configurations in complex environments per second, and concomitant free-energy landscapes can be estimated. Databases that are complete in terms of polymer sequences and architecture form a powerful training basis for cross-checking and verifying machine learning-based models. We employ an exhaustive enumeration of polymer sequence space to benchmark the prediction made by a neural network. In our example, we consider the translocation time of a copolymer through a lipid membrane as a function of its sequence of hydrophilic and hydrophobic units. First, we demonstrate that massively parallel Rosenbluth sampling for all possible sequences of a polymer allows for meaningful dynamic interpretation in terms of the mean first escape times through the membrane. Second, we train a multi-layer neural network on logarithmic translocation times and show, by reducing the training set to a narrow window of translocation times, that the neural network develops an internal representation of the physical rules for sequence-controlled diffusion barriers. Based on the narrow training set, the network approximates the order of magnitude of translocation times in a window that is several orders of magnitude wider than the training window. We investigate how prediction accuracy depends on the distance of unexplored sequences from the training window. © 2020, The Author(s)
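    The "complete database" idea rests on the sequence space being exhaustively enumerable. A minimal sketch of that enumeration, with a made-up per-sequence descriptor standing in for the paper's Rosenbluth-sampled translocation time:

```python
from itertools import product

# Enumerate every copolymer of N hydrophobic (H) / hydrophilic (P) units
# and attach a sequence-dependent score. The descriptor below is a toy
# stand-in, NOT the paper's translocation time; N is likewise illustrative.

N = 10  # polymer length for this sketch

def hydrophobic_blockiness(seq):
    """Toy descriptor: length of the longest run of hydrophobic units."""
    best = run = 0
    for unit in seq:
        run = run + 1 if unit == "H" else 0
        best = max(best, run)
    return best

database = {"".join(s): hydrophobic_blockiness(s)
            for s in product("HP", repeat=N)}
print(len(database))  # 2**10 = 1024 sequences, complete by construction
```

    Because the database covers all 2^N sequences, a network trained on any subset can be checked against ground truth everywhere, including far outside its training window.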

    Architectures and GPU-Based Parallelization for Online Bayesian Computational Statistics and Dynamic Modeling

    Recent work demonstrates that coupling Bayesian computational statistics methods with dynamic models can facilitate the analysis of complex systems associated with diverse time series, including those involving social and behavioural dynamics. Particle Markov Chain Monte Carlo (PMCMC) methods constitute a particularly powerful class of Bayesian methods combining aspects of batch Markov Chain Monte Carlo (MCMC) and the sequential Monte Carlo method of Particle Filtering (PF). PMCMC can flexibly combine theory-capturing dynamic models with diverse empirical data. Online machine learning is a subcategory of machine learning algorithms characterized by sequential, incremental execution as new data arrives, which can give updated results and predictions with growing sequences of available incoming data. While many machine learning and statistical methods have been adapted to online algorithms, PMCMC is one example of the many methods whose compatibility with and adaptation to online learning remains unclear. In this thesis, I proposed a data-streaming solution supporting PF and PMCMC methods with dynamic epidemiological models and demonstrated several successful applications. By constructing an automated, easy-to-use streaming system, analytic applications and simulation models gain access to arriving real-time data, shortening the time gap between data and resulting model-supported insight. The well-defined architecture design emerging from the thesis would substantially expand traditional simulation models' potential by allowing such models to be offered as continually updated services. Contingent on sufficiently fast execution time, simulation models within this framework can consume the incoming empirical data in real time and generate informative predictions on an ongoing basis as new data points arrive.
In a second line of work, I investigated the platform's flexibility and capability by extending this system to support a powerful class of PMCMC algorithms with dynamic models while ameliorating such algorithms' traditionally severe performance limitations. Specifically, this work designed and implemented a GPU-enabled parallel version of a PMCMC method with dynamic simulation models. The resulting codebase has enabled researchers to adapt their models to state-of-the-art statistical inference methods and to ensure that the computation-heavy PMCMC method can perform significant sampling between successive arrivals of new data points. Investigating this method's impact with several realistic PMCMC application examples showed that GPU-based acceleration allows for up to 160x speedup compared to a corresponding CPU-based version not exploiting parallelism. The GPU-accelerated PMCMC and the streaming processing system complement each other, jointly providing researchers with a powerful toolset to greatly accelerate learning and secure additional insight from the high-velocity data increasingly prevalent within social and behavioural spheres. The design philosophy applied supported a platform with broad generalizability and potential for ready future extensions. The thesis discusses common barriers and difficulties in designing and implementing such systems and offers solutions to solve or mitigate them.
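    The per-particle propagate/weight/resample cycle that the GPU work parallelizes can be sketched with a minimal serial bootstrap particle filter. The 1D Gaussian random-walk state below is a toy stand-in for the thesis's dynamic epidemiological models, and all names are assumptions of this sketch.

```python
import math, random

# Minimal bootstrap particle filter: 1D random-walk state, Gaussian
# observations. On a GPU, the two list comprehensions in the loop run
# with one thread per particle, since particles evolve independently.

def particle_filter(observations, n_particles=1000, proc_sd=1.0, obs_sd=1.0):
    particles = [0.0] * n_particles
    estimates = []
    for y in observations:
        # Propagate every particle through the process model.
        particles = [x + random.gauss(0.0, proc_sd) for x in particles]
        # Weight particles by the observation likelihood.
        weights = [math.exp(-0.5 * ((y - x) / obs_sd) ** 2) for x in particles]
        total = sum(weights)
        weights = [w / total for w in weights]
        estimates.append(sum(w * x for w, x in zip(weights, particles)))
        # Multinomial resampling keeps the particle cloud from degenerating.
        particles = random.choices(particles, weights=weights, k=n_particles)
    return estimates

random.seed(42)
estimates = particle_filter([1.0, 2.0, 3.0, 4.0], n_particles=2000)
```

    PMCMC wraps a filter like this inside an MCMC loop over model parameters, which is why per-step filter speed determines how much sampling fits between data arrivals.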

    Enumeration and simulation of lattice polymers as models for compact biological macromolecules

    Polymers are the main building blocks of many biological systems, and thus polymer models are important tools for our understanding. One such biological system is the large-scale organisation of chromatin. A key question here is how, during cell division, the chromosomes can separate without entanglement and knotting. One proposal is that this is achieved by a specific spatial organisation of the chromosomes, known as the "fractal globule". Using Monte Carlo simulations, we found that fractal globules are unstable and thus cannot represent the biological system without further ingredients. Another proposal is that topological effects cause spatial separation of the chromosomes. These topological effects can be studied using simulations of nonconcatenated ring polymers. Using graphics processing units (GPUs), very detailed and long simulations were carried out. From these a picture emerged in which ring polymers move much more slowly than was found in previous studies. A second biological system studied here is the folded state of the protein, modeled by the Hamiltonian walk. Here, instead of simulations, we exactly enumerated all Hamiltonian walks of the 4x4x4 cube. Interestingly, simulations show that for larger systems many more walks exist than previously estimated.
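    The exact-enumeration idea can be shown on a cube small enough for a few lines of code: a depth-first search counting all Hamiltonian walks on the 2x2x2 lattice. The thesis enumerates the far larger 4x4x4 cube, which requires specialized methods; this sketch only illustrates the principle.

```python
from itertools import product

# Count all Hamiltonian walks (self-avoiding walks visiting every site)
# on the 2x2x2 cubic lattice by exhaustive depth-first search.

L = 2
sites = list(product(range(L), repeat=3))

def neighbours(s):
    x, y, z = s
    for dx, dy, dz in ((1,0,0), (-1,0,0), (0,1,0), (0,-1,0), (0,0,1), (0,0,-1)):
        n = (x + dx, y + dy, z + dz)
        if all(0 <= c < L for c in n):
            yield n

def count_walks(path, visited):
    if len(path) == len(sites):
        return 1  # every site visited: one complete Hamiltonian walk
    return sum(count_walks(path + [n], visited | {n})
               for n in neighbours(path[-1]) if n not in visited)

total = sum(count_walks([s], {s}) for s in sites)  # directed walks
print(total)  # 144 directed Hamiltonian walks on the 2x2x2 cube
```

    The count grows explosively with cube size, which is why the 4x4x4 enumeration is a substantial computational result rather than a direct scale-up of this search.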

    Efficient Algorithms And Optimizations For Scientific Computing On Many-Core Processors

    Designing efficient algorithms for many-core and multicore architectures requires different strategies to best exploit the hardware resources of those architectures. Researchers have ported many scientific applications to modern many-core and multicore parallel architectures, achieving significant speedups over running on single CPU cores. While many applications have achieved significant speedups, some still require more effort to accelerate due to their inherently serial behavior. One class of applications with this serial behavior is Monte Carlo simulations. Monte Carlo simulations have been used to simulate many problems in statistical physics and statistical mechanics that were not possible to simulate using molecular dynamics. While there is a fair number of well-known and recognized GPU molecular dynamics codes, existing Monte Carlo ensemble simulations have not been ported to the GPU, so they are relatively slow and cannot run large systems in a reasonable amount of time. Due to these shortcomings of existing Monte Carlo ensemble codes, and due to researchers' interest in a fast Monte Carlo simulation framework that can simulate large systems, a new GPU framework called GOMC was implemented to simulate different particle- and molecular-based force fields and ensembles. GOMC simulates different Monte Carlo ensembles such as the canonical, grand canonical, and Gibbs ensembles. This work describes many challenges in developing a GPU Monte Carlo code for such ensembles and how I addressed these challenges. This work also describes efficient many-core and multicore large-scale energy calculations for the Monte Carlo Gibbs ensemble using cell lists. Designing Monte Carlo molecular simulations is challenging, as they have less computation and parallelism when compared to similar molecular dynamics applications.
The modified cell list allows for greater speedup gains for energy calculations on both many-core and multicore architectures when compared to implementations without conventional cell lists. The work presents results and analysis of the cell list algorithms for each of the parallel architectures using top-of-the-line GPUs, CPUs, and Intel’s Phi coprocessors. It also evaluates the performance of the cell list algorithms for different problem sizes and different radial cutoffs, and compares two cell list approaches: a hybrid MPI+OpenMP approach and a hybrid MPI+CUDA approach. The cell list methods are evaluated on a small cluster of multicore CPUs, Intel Phi coprocessors, and GPUs, with performance results evaluated using different combinations of MPI processes, threads, and problem sizes. Another application presented in this dissertation involves understanding the properties of crystalline materials and their design and control. Recent developments include the introduction of new models to simulate system behavior and properties that are of large experimental and theoretical interest. One of those models is the Phase-Field Crystal (PFC) model. The PFC model has enabled researchers to simulate 2D and 3D crystal structures and study defects such as dislocations and grain boundaries. In this work, GPUs are used to accelerate various dynamic properties of polycrystals in the 2D PFC model. Some properties require very intensive computation that may involve hundreds of thousands of atoms. The GPU implementation has achieved significant speedups of more than 46 times for some large system simulations.
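    A conventional cell list, the structure the modified variant builds on, can be sketched briefly: the periodic box is divided into cells at least one cutoff wide, so each particle's interaction partners need only be searched in the 27 surrounding cells instead of among all N particles. Names and the cubic periodic box below are assumptions of this sketch, not GOMC's implementation.

```python
import math, random

def dist_pbc(a, b, box):
    """Minimum-image distance in a cubic periodic box."""
    return math.sqrt(sum(min(abs(x - y), box - abs(x - y)) ** 2
                         for x, y in zip(a, b)))

def neighbour_pairs(positions, box, cutoff):
    """All pairs (i, j), i < j, closer than cutoff, found via a cell list."""
    n = max(1, int(box // cutoff))       # cells per side, each >= cutoff wide
    size = box / n
    cells = {}
    for idx, p in enumerate(positions):  # bin particles into cells
        key = tuple(int(c / size) % n for c in p)
        cells.setdefault(key, []).append(idx)
    pairs = set()
    for (cx, cy, cz), members in cells.items():
        for dx in (-1, 0, 1):            # scan only the 27 neighbouring cells
            for dy in (-1, 0, 1):
                for dz in (-1, 0, 1):
                    other = ((cx + dx) % n, (cy + dy) % n, (cz + dz) % n)
                    for i in members:
                        for j in cells.get(other, ()):
                            if i < j and dist_pbc(positions[i],
                                                  positions[j], box) < cutoff:
                                pairs.add((i, j))
    return pairs
```

    At fixed density this reduces the energy-evaluation cost from O(N^2) to O(N), which is what makes large-system Monte Carlo moves affordable on every architecture compared in the dissertation.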

    Prototyping Parallel Simulations on Manycore Architectures Using Scala: A Case Study

    In the manycore era, every simulation practitioner can take advantage of the computing horsepower delivered by the available high-performance computing devices. From multicore CPUs (Central Processing Units) to thousand-thread GPUs (Graphics Processing Units), several architectures are now able to offer great speed-ups to simulations. However, it is often tricky to harness them properly, and even more complicated to implement several variants of the same model to compare the parallelizations. Thus, simulation practitioners would mostly benefit from a simple way to evaluate the potential benefits of choosing one platform or another to parallelize their simulations. In this work, we study the ability of the Scala programming language to fulfill this need. We compare the features of two frameworks in this study: Scala Parallel Collections and ScalaCL. Both provide facilities to set up a data-parallelism approach on Scala collections. The capabilities of the two frameworks are benchmarked with three simulation models as well as a large set of parallel architectures. According to our results, these two Scala frameworks should be considered by the simulation community to quickly prototype parallel simulations, and to choose the target platform on which investing in an optimized development will be rewarding.
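    The data-parallel collection pattern both Scala frameworks expose (a `.par` or OpenCL-backed `map` over a collection) can be illustrated outside Scala as well. The Python analogy below, with an invented toy kernel, shows the same "swap the execution strategy, keep the map" idea; `ThreadPoolExecutor` stands in for the worker pool, and CPU-bound Python would need processes (or a GPU backend, as with ScalaCL) for real speed-ups.

```python
from concurrent.futures import ThreadPoolExecutor

# Analogy to Scala Parallel Collections' coll.par.map(f): the same
# per-replication kernel runs either as a plain sequential map or through
# a pool of workers, with identical results. The kernel is a made-up
# stand-in for one stochastic simulation replication.

def kernel(seed):
    x = seed
    for _ in range(1000):                 # deterministic LCG-style scramble
        x = (x * 1103515245 + 12345) % 2**31
    return x

seeds = list(range(100))
sequential = [kernel(s) for s in seeds]   # ordinary map
with ThreadPoolExecutor() as pool:        # "parallel collection" map
    parallel = list(pool.map(kernel, seeds))
assert sequential == parallel  # parallelization must not change model output
```

    This result-equivalence check is exactly what makes such frameworks attractive for prototyping: the model code stays fixed while the execution strategy is swapped and benchmarked.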