224 research outputs found

    A histogram-free multicanonical Monte Carlo algorithm for the basis expansion of density of states

    We report a new multicanonical Monte Carlo (MC) algorithm to obtain the density of states (DOS) for physical systems with continuous state variables in statistical mechanics. Our algorithm is able to obtain an analytical form for the DOS expressed in a chosen basis set, instead of a numerical array of finite resolution as in previous variants of this class of MC methods such as the multicanonical (MUCA) sampling and Wang-Landau (WL) sampling. This is enabled by storing the visited states directly in a data set and avoiding the explicit collection of a histogram. This practice also has the advantage of avoiding undesirable artificial errors caused by the discretization and binning of continuous state variables. Our results show that this scheme is capable of obtaining converged results with a much reduced number of Monte Carlo steps, leading to a significant speedup over existing algorithms. Comment: 8 pages, 6 figures. Paper accepted in the Platform for Advanced Scientific Computing Conference (PASC '17), June 26 to 28, 2017, Lugano, Switzerland

    High-dimensional hierarchical models and massively parallel computing

    This work expounds a computationally expedient strategy for the fully Bayesian treatment of high-dimensional hierarchical models. Most steps in a Markov chain Monte Carlo routine for such models are either conditionally independent draws or low-dimensional draws based on summary statistics of parameters at higher levels of the hierarchy. We construct both sets of steps using parallelized algorithms designed to take advantage of the immense parallel computing power of general-purpose graphics processing units while avoiding the severe memory transfer bottleneck. We apply our strategy to RNA-sequencing (RNA-seq) data analysis, a multiple-testing, low-sample-size scenario where hierarchical models provide a way to borrow information across genes. Our approach is solidly tractable, and it performs well under several metrics of estimation, posterior inference, and gene detection. Best-case-scenario empirical Bayes counterparts perform equally well, lending support to existing empirical Bayes approaches in RNA-seq. Finally, we attempt to improve the robustness of estimation and inference of our RNA-seq model using alternate hierarchical distributions.

    GPU accelerated population annealing algorithm

    Population annealing is a promising recent approach for Monte Carlo simulations in statistical physics, in particular for the simulation of systems with complex free-energy landscapes. It is a hybrid method, combining importance sampling through Markov chains with elements of sequential Monte Carlo in the form of population control. While it appears to provide algorithmic capabilities for the simulation of such systems that are roughly comparable to those of more established approaches such as parallel tempering, it is intrinsically much more suitable for massively parallel computing. Here, we tap into this structural advantage and present a highly optimized implementation of the population annealing algorithm on GPUs that promises speed-ups of several orders of magnitude as compared to a serial implementation on CPUs. While the sample code is for simulations of the 2D ferromagnetic Ising model, it should be easily adapted for simulations of other spin models, including disordered systems. Our code includes implementations of some advanced algorithmic features that have only recently been suggested, namely the automatic adaptation of temperature steps and a multi-histogram analysis of the data at different temperatures. Comment: 12 pages, 3 figures and 5 tables, code at https://github.com/LevBarash/PAisin
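The combination described in this abstract (Markov-chain sweeps plus population control by resampling) can be illustrated with a small serial sketch. This is not the authors' GPU code: the function names, the lattice size `L=4`, the fixed temperature schedule, the population size, and the number of sweeps are all assumptions chosen for illustration, and the adaptive temperature steps and multi-histogram analysis mentioned above are not included.

```python
import math
import random

def energy(spins, L):
    """Energy of an L x L ferromagnetic Ising configuration (J = 1, periodic boundaries)."""
    e = 0
    for i in range(L):
        for j in range(L):
            s = spins[i * L + j]
            # Count each bond once: right and down neighbours.
            e -= s * (spins[i * L + (j + 1) % L] + spins[((i + 1) % L) * L + j])
    return e

def metropolis_sweep(spins, L, beta, rng):
    """One sweep of L*L single-spin-flip Metropolis updates at inverse temperature beta."""
    for _ in range(L * L):
        i, j = rng.randrange(L), rng.randrange(L)
        s = spins[i * L + j]
        nn = (spins[i * L + (j + 1) % L] + spins[i * L + (j - 1) % L]
              + spins[((i + 1) % L) * L + j] + spins[((i - 1) % L) * L + j])
        d_e = 2 * s * nn  # energy change if this spin is flipped
        if d_e <= 0 or rng.random() < math.exp(-beta * d_e):
            spins[i * L + j] = -s

def population_annealing(L=4, target_size=200, betas=None, sweeps=2, seed=0):
    """Anneal a population of replicas from high to low temperature (illustrative sketch)."""
    rng = random.Random(seed)
    if betas is None:
        betas = [0.1 * k for k in range(11)]  # assumed schedule: beta = 0.0 ... 1.0
    pop = [[rng.choice((-1, 1)) for _ in range(L * L)] for _ in range(target_size)]
    mean_energies = []
    for k, beta in enumerate(betas):
        if k > 0:
            # Population control: resample replicas with weight exp(-dbeta * E).
            dbeta = beta - betas[k - 1]
            energies = [energy(s, L) for s in pop]
            e_min = min(energies)  # shift energies for numerical stability
            weights = [math.exp(-dbeta * (e - e_min)) for e in energies]
            total = sum(weights)
            new_pop = []
            for s, w in zip(pop, weights):
                tau = target_size * w / total  # expected number of copies
                n = int(tau)
                if rng.random() < tau - n:
                    n += 1
                new_pop.extend(list(s) for _ in range(n))
            pop = new_pop
        # Markov-chain part: equilibrate every replica at the new temperature.
        for s in pop:
            for _ in range(sweeps):
                metropolis_sweep(s, L, beta, rng)
        mean_energies.append(sum(energy(s, L) for s in pop) / len(pop) / (L * L))
    return betas, mean_energies, len(pop)
```

Calling `population_annealing()` returns the temperature schedule, the mean energy per spin at each step, and the final population size. The inner loop over replicas is the part that maps naturally onto parallel hardware, since each replica is updated independently between resampling steps.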

    Accelerating molecular dynamics simulations with population annealing

    Population annealing is a powerful tool for large-scale Monte Carlo simulations. We adapt this method to molecular dynamics simulations and demonstrate its excellent accelerating effect by simulating the folding of a short peptide commonly used to gauge the performance of algorithms. The method is compared to the well-established parallel tempering approach and is found to yield similar performance for the same computational resources. In contrast to other methods, however, population annealing scales to a nearly arbitrary number of parallel processors and it is thus a unique tool that enables molecular dynamics to tap into the massively parallel computing power available in supercomputers that is so much needed for a range of difficult computational problems.

    Parallel Tempering Simulation of the three-dimensional Edwards-Anderson Model with Compact Asynchronous Multispin Coding on GPU

    Monte Carlo simulations of the Ising model play an important role in the field of computational statistical physics, and they have revealed many properties of the model over the past few decades. However, the effect of frustration due to random disorder, in particular the possible spin glass phase, remains a crucial but poorly understood problem. One of the obstacles in the Monte Carlo simulation of random frustrated systems is their long relaxation time, making an efficient parallel implementation on state-of-the-art computation platforms highly desirable. The Graphics Processing Unit (GPU) is such a platform that provides an opportunity to significantly enhance the computational performance and thus gain new insight into this problem. In this paper, we present optimization and tuning approaches for the CUDA implementation of the spin glass simulation on GPUs. We discuss the integration of various design alternatives, such as GPU kernel construction with minimal communication, memory tiling, and look-up tables. We present a binary data format, Compact Asynchronous Multispin Coding (CAMSC), which provides an additional 28.4% speedup compared with the traditionally used Asynchronous Multispin Coding (AMSC). Our overall design sustains a performance of 33.5 picoseconds per spin flip attempt for simulating the three-dimensional Edwards-Anderson model with parallel tempering, which significantly improves the performance over existing GPU implementations. Comment: 15 pages, 18 figures
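The general idea behind multispin coding can be sketched as follows. This is a generic illustration under stated assumptions, not the CAMSC format from the paper: it assumes that bit `r` of each word holds the spin of replica `r` at one lattice site, uses a 1D periodic chain rather than the 3D Edwards-Anderson lattice, and all function names are hypothetical. A single XOR between neighbouring words then marks broken ferromagnetic bonds for every replica at once.

```python
import random

def pack_site(spins_across_replicas):
    """Pack one site's spin from every replica into an integer (bit r = 1 means spin up)."""
    word = 0
    for r, s in enumerate(spins_across_replicas):
        if s == 1:
            word |= 1 << r
    return word

def broken_bonds_packed(words, n_replicas):
    """Count broken bonds per replica on a periodic chain using bitwise XOR."""
    counts = [0] * n_replicas
    n_sites = len(words)
    for i in range(n_sites):
        # One XOR compares site i with its neighbour in all replicas simultaneously.
        diff = words[i] ^ words[(i + 1) % n_sites]
        # Unpacking per replica is only for checking; real codes keep working bitwise.
        for r in range(n_replicas):
            counts[r] += (diff >> r) & 1
    return counts

def broken_bonds_naive(chain):
    """Reference count of broken bonds for a single replica, one spin at a time."""
    n = len(chain)
    return sum(1 for i in range(n) if chain[i] != chain[(i + 1) % n])

rng = random.Random(1)
n_sites, n_replicas = 8, 32  # 32 replicas fit in one 32-bit word
replicas = [[rng.choice((-1, 1)) for _ in range(n_sites)] for _ in range(n_replicas)]
words = [pack_site([replicas[r][i] for r in range(n_replicas)]) for i in range(n_sites)]
packed_counts = broken_bonds_packed(words, n_replicas)
```

Python integers are arbitrary precision, so the 32-bit word size here is only a convention; on a GPU the same layout would use fixed-width machine words.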