A histogram-free multicanonical Monte Carlo algorithm for the basis expansion of density of states
We report a new multicanonical Monte Carlo (MC) algorithm to obtain the
density of states (DOS) for physical systems with continuous state variables in
statistical mechanics. Our algorithm is able to obtain an analytical form for
the DOS expressed in a chosen basis set, instead of a numerical array of finite
resolution as in previous variants of this class of MC methods such as the
multicanonical (MUCA) sampling and Wang-Landau (WL) sampling. This is enabled
by storing the visited states directly in a data set and avoiding the explicit
collection of a histogram. This practice also has the advantage of avoiding
undesirable artificial errors caused by the discretization and binning of
continuous state variables. Our results show that this scheme is capable of
obtaining converged results with a much reduced number of Monte Carlo steps,
leading to a significant speedup over existing algorithms.
Comment: 8 pages, 6 figures. Paper accepted in the Platform for Advanced
Scientific Computing Conference (PASC '17), June 26 to 28, 2017, Lugano,
Switzerland.
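The contrast between a binned histogram and an analytical expansion built from stored samples can be illustrated with a generic orthogonal-series estimator. The sketch below is not the algorithm of the paper: it assumes a Legendre basis on [-1, 1], a hypothetical test density (1 + x)/2, and made-up function names, and it estimates a normalized density from samples and not a density of states from a multicanonical iteration.

```python
import random

def legendre(k_max, x):
    """Return [P_0(x), ..., P_k_max(x)] using Bonnet's recurrence."""
    values = [1.0, x]
    for k in range(1, k_max):
        # (k + 1) P_{k+1}(x) = (2k + 1) x P_k(x) - k P_{k-1}(x)
        values.append(((2 * k + 1) * x * values[k] - k * values[k - 1]) / (k + 1))
    return values[: k_max + 1]

def expansion_coefficients(samples, k_max):
    """Estimate c_k = (2k + 1)/2 * mean(P_k(x)) from stored samples in [-1, 1]."""
    sums = [0.0] * (k_max + 1)
    for x in samples:
        for k, p in enumerate(legendre(k_max, x)):
            sums[k] += p
    n = len(samples)
    return [(2 * k + 1) / 2 * s / n for k, s in enumerate(sums)]

def density(coeffs, x):
    """Evaluate the analytical estimate sum_k c_k P_k(x) at any x in [-1, 1]."""
    return sum(c * p for c, p in zip(coeffs, legendre(len(coeffs) - 1, x)))

# Hypothetical data: samples drawn from the density (1 + x)/2 by inverting
# its cumulative distribution (1 + x)^2 / 4. No bins are used anywhere.
rng = random.Random(0)
samples = [2.0 * rng.random() ** 0.5 - 1.0 for _ in range(20000)]
coeffs = expansion_coefficients(samples, k_max=4)
```

Because the test density equals 0.5 P_0(x) + 0.5 P_1(x), the first two estimated coefficients should be close to 0.5 and the higher ones close to zero, and `density(coeffs, x)` can be evaluated at any resolution.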
High-dimensional hierarchical models and massively parallel computing
This work expounds a computationally expedient strategy for the fully Bayesian treatment of high-dimensional hierarchical models. Most steps in a Markov chain Monte Carlo routine for such models are either conditionally independent draws or low-dimensional draws based on summary statistics of parameters at higher levels of the hierarchy. We construct both sets of steps using parallelized algorithms designed to take advantage of the immense parallel computing power of general-purpose graphics processing units while avoiding the severe memory transfer bottleneck. We apply our strategy to RNA-sequencing (RNA-seq) data analysis, a multiple-testing, low-sample-size scenario where hierarchical models provide a way to borrow information across genes. Our approach is solidly tractable, and it performs well under several metrics of estimation, posterior inference, and gene detection. Best-case-scenario empirical Bayes counterparts perform equally well, lending support to existing empirical Bayes approaches in RNA-seq. Finally, we attempt to improve the robustness of estimation and inference of our RNA-seq model using alternate hierarchical distributions.
GPU accelerated population annealing algorithm
Population annealing is a promising recent approach for Monte Carlo
simulations in statistical physics, in particular for the simulation of systems
with complex free-energy landscapes. It is a hybrid method, combining
importance sampling through Markov chains with elements of sequential Monte
Carlo in the form of population control. While it appears to provide
algorithmic capabilities for the simulation of such systems that are roughly
comparable to those of more established approaches such as parallel tempering,
it is intrinsically much more suitable for massively parallel computing. Here,
we tap into this structural advantage and present a highly optimized
implementation of the population annealing algorithm on GPUs that promises
speed-ups of several orders of magnitude as compared to a serial implementation
on CPUs. While the sample code is for simulations of the 2D ferromagnetic Ising
model, it should be easily adapted for simulations of other spin models,
including disordered systems. Our code includes implementations of some
advanced algorithmic features that have only recently been suggested, namely
the automatic adaptation of temperature steps and a multi-histogram analysis of
the data at different temperatures.
Comment: 12 pages, 3 figures and 5 tables, code at
https://github.com/LevBarash/PAisin
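The hybrid structure described in this abstract (Markov chain updates combined with population control) can be sketched in a few lines of serial Python. This is a minimal illustration and not the GPU code the abstract refers to: it assumes a fixed population size with multinomial resampling, a hand-picked temperature schedule instead of adaptive steps, and a tiny 4 x 4 lattice; all function names and parameters are hypothetical.

```python
import math
import random

def energy(spins, L):
    """Energy of an L x L periodic ferromagnetic Ising lattice (J = 1)."""
    e = 0
    for i in range(L):
        for j in range(L):
            s = spins[i * L + j]
            e -= s * (spins[((i + 1) % L) * L + j] + spins[i * L + (j + 1) % L])
    return e

def metropolis_sweep(spins, L, beta, rng):
    """One sweep of single-spin-flip Metropolis updates at inverse temperature beta."""
    for _ in range(L * L):
        i, j = rng.randrange(L), rng.randrange(L)
        nb = (spins[((i + 1) % L) * L + j] + spins[((i - 1) % L) * L + j]
              + spins[i * L + (j + 1) % L] + spins[i * L + (j - 1) % L])
        d_e = 2 * spins[i * L + j] * nb
        if d_e <= 0 or rng.random() < math.exp(-beta * d_e):
            spins[i * L + j] *= -1

def population_annealing(L=4, R=50, betas=(0.0, 0.2, 0.4, 0.6), sweeps=5, seed=1):
    """Anneal R replicas from beta = 0 to the last entry of betas."""
    rng = random.Random(seed)
    # At beta = 0 uniformly random configurations are exact equilibrium samples.
    pop = [[rng.choice((-1, 1)) for _ in range(L * L)] for _ in range(R)]
    for beta_old, beta_new in zip(betas, betas[1:]):
        # Population control: weight each replica by exp(-(beta' - beta) E) ...
        weights = [math.exp(-(beta_new - beta_old) * energy(s, L)) for s in pop]
        # ... and resample R replicas in proportion to the weights (assumption:
        # fixed-size multinomial resampling).
        pop = [list(s) for s in rng.choices(pop, weights=weights, k=R)]
        # Importance sampling: equilibrate each replica at the new temperature.
        for s in pop:
            for _ in range(sweeps):
                metropolis_sweep(s, L, beta_new, rng)
    return pop
```

The inner loop over replicas is the part that parallelizes naturally, since replicas interact only during the resampling step.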
Accelerating molecular dynamics simulations with population annealing
Population annealing is a powerful tool for large-scale Monte Carlo
simulations. We adapt this method to molecular dynamics simulations and
demonstrate its excellent accelerating effect by simulating the folding of a
short peptide commonly used to gauge the performance of algorithms. The method
is compared to the well established parallel tempering approach and is found to
yield similar performance for the same computational resources. In contrast to
other methods, however, population annealing scales to a nearly arbitrary
number of parallel processors and it is thus a unique tool that enables
molecular dynamics to tap into the massively parallel computing power available
in supercomputers that is so much needed for a range of difficult computational
problems.
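For comparison, the parallel tempering approach mentioned above exchanges configurations between neighbouring temperatures using the standard Metropolis criterion min(1, exp[(beta_i - beta_j)(E_i - E_j)]). The sketch below shows only that exchange step; the function names are hypothetical, and the energies are assumed to come from whatever simulation (Monte Carlo or molecular dynamics) drives each replica.

```python
import math
import random

def swap_probability(beta_i, beta_j, energy_i, energy_j):
    """Metropolis acceptance probability for exchanging two replicas."""
    exponent = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if exponent >= 0 else math.exp(exponent)

def attempt_swaps(betas, energies, rng):
    """Try one exchange between each pair of neighbouring temperatures.

    energies[k] is the energy of the configuration currently held at betas[k];
    the list is updated in place. Returns the number of accepted exchanges.
    """
    accepted = 0
    for k in range(len(betas) - 1):
        p = swap_probability(betas[k], betas[k + 1], energies[k], energies[k + 1])
        if rng.random() < p:
            energies[k], energies[k + 1] = energies[k + 1], energies[k]
            accepted += 1
    return accepted
```

An exchange that moves the lower-energy configuration to the colder temperature is always accepted; the reverse move is accepted with an exponentially suppressed probability.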
Parallel Tempering Simulation of the three-dimensional Edwards-Anderson Model with Compact Asynchronous Multispin Coding on GPU
Monte Carlo simulations of the Ising model play an important role in the
field of computational statistical physics, and they have revealed many
properties of the model over the past few decades. However, the effect of
frustration due to random disorder, in particular the possible spin glass
phase, remains a crucial but poorly understood problem. One of the obstacles in
the Monte Carlo simulation of random frustrated systems is their long
relaxation time, making an efficient parallel implementation on state-of-the-art
computation platforms highly desirable. The Graphics Processing Unit (GPU) is
such a platform that provides an opportunity to significantly enhance the
computational performance and thus gain new insight into this problem. In this
paper, we present optimization and tuning approaches for the CUDA
implementation of the spin glass simulation on GPUs. We discuss the integration
of various design alternatives, such as GPU kernel construction with minimal
communication, memory tiling, and look-up tables. We present a binary data
format, Compact Asynchronous Multispin Coding (CAMSC), which provides an
additional speedup compared with the traditionally used Asynchronous
Multispin Coding (AMSC). Our overall design sustains a performance of 33.5
picoseconds per spin flip attempt for simulating the three-dimensional
Edwards-Anderson model with parallel tempering, which significantly improves
the performance over existing GPU implementations.
Comment: 15 pages, 18 figures
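The idea behind asynchronous multispin coding can be shown with plain integers: each bit of a word holds the spin of the same site in a different replica, so one XOR evaluates a bond for all replicas at once. The sketch below is a simplified illustration under stated assumptions (a one-dimensional ring instead of the three-dimensional lattice, 64 replicas per word, hypothetical names and random test data); it does not reproduce the compact CAMSC layout or any CUDA kernel from the paper.

```python
import random

REPLICAS = 64  # one bit per replica, so one 64-bit word covers 64 systems

def pack(bits):
    """Pack a list of 0/1 values (one per replica) into a single integer word."""
    word = 0
    for r, bit in enumerate(bits):
        word |= bit << r
    return word

def unsatisfied_words(spin_words, coupling_words):
    """For each bond (i, i+1) on a ring, return a word whose bit r is 1
    when that bond is unsatisfied in replica r.

    Spin bits: 1 = up, 0 = down. Coupling bits: 0 = J = +1, 1 = J = -1.
    A bond is unsatisfied exactly when s_i XOR s_j XOR J equals 1.
    """
    n = len(spin_words)
    return [spin_words[i] ^ spin_words[(i + 1) % n] ^ coupling_words[i]
            for i in range(n)]

def unsatisfied_count(words, replica):
    """Number of unsatisfied bonds in one replica, read back bit by bit."""
    return sum((w >> replica) & 1 for w in words)

# Hypothetical test data: random spins and random +/-1 couplings per replica.
rng = random.Random(0)
N_SITES = 8
spins = [[rng.randrange(2) for _ in range(REPLICAS)] for _ in range(N_SITES)]
bonds = [[rng.randrange(2) for _ in range(REPLICAS)] for _ in range(N_SITES)]
spin_words = [pack(site) for site in spins]
coupling_words = [pack(bond) for bond in bonds]
words = unsatisfied_words(spin_words, coupling_words)
```

Each entry of `words` replaces 64 separate bond evaluations with a pair of XOR operations, which is where the speedup of multispin coding comes from.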