Rapid Computation of Thermodynamic Properties Over Multidimensional Nonbonded Parameter Spaces using Adaptive Multistate Reweighting
We show how thermodynamic properties of molecular models can be computed over
a large, multidimensional parameter space by combining multistate reweighting
analysis with a linear basis function approach. This approach reduces the
computational cost to estimate thermodynamic properties from molecular
simulations for over 130,000 tested parameter combinations from over a thousand
CPU years to tens of CPU days. This speed increase is achieved primarily by
computing the potential energy as a linear combination of basis functions,
computed from either modified simulation code or as the difference of energy
between two reference states, which can be done without any simulation code
modification. The thermodynamic properties are then estimated with the
Multistate Bennett Acceptance Ratio (MBAR) as a function of multiple model
parameters without the need to define a priori how the states are connected by
a pathway. Instead, we adaptively sample a set of points in parameter space to
create mutual configuration space overlap. The existence of regions of poor
configuration space overlap is detected by analyzing the eigenvalues of the
sampled states' overlap matrix. The configuration space overlap to sampled
states is monitored alongside the mean and maximum uncertainty to determine
convergence, as neither the uncertainty nor the configuration space overlap
alone is a sufficient metric of convergence.
This adaptive sampling scheme is demonstrated by estimating with high
precision the solvation free energies of charged particles of Lennard-Jones
plus Coulomb functional form. We also compute entropy, enthalpy, and radial
distribution functions of unsampled parameter combinations using only the data
from these sampled states and use the free energy estimates to examine the
deviation of simulations from the Born approximation to the solvation free
energy.
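
The basis function idea in the abstract above can be illustrated for the Lennard-Jones term alone. This is a minimal sketch, not the authors' code: it assumes the standard 12-6 form U = 4ε[(σ/r)^12 - (σ/r)^6], and the function names (`lj_basis`, `lj_energy_from_basis`, `lj_energy_direct`) are hypothetical. The point is that the sums over pair distances are computed once per sampled configuration, after which the energy at any (ε, σ) is a linear combination of two stored numbers.

```python
def lj_basis(distances):
    """Return the two configuration-dependent basis functions.

    h12 = 4 * sum(r^-12) and h6 = 4 * sum(r^-6) depend only on the
    sampled pair distances, so they are computed once per configuration.
    (Assumed decomposition for the plain 12-6 Lennard-Jones form.)
    """
    h12 = 4.0 * sum(r ** -12 for r in distances)
    h6 = 4.0 * sum(r ** -6 for r in distances)
    return h12, h6


def lj_energy_from_basis(h12, h6, epsilon, sigma):
    """Energy at (epsilon, sigma) as a linear combination of the basis."""
    return epsilon * sigma ** 12 * h12 - epsilon * sigma ** 6 * h6


def lj_energy_direct(distances, epsilon, sigma):
    """Reference evaluation that loops over every pair distance again."""
    total = 0.0
    for r in distances:
        x = (sigma / r) ** 6
        total += 4.0 * epsilon * (x * x - x)
    return total


# Hypothetical pair distances from one sampled configuration.
distances = [0.9, 1.1, 1.5, 2.3]
h12, h6 = lj_basis(distances)

# Re-evaluating many parameter combinations now costs two multiplications each.
energies = {
    (eps, sig): lj_energy_from_basis(h12, h6, eps, sig)
    for eps in (0.5, 1.0, 2.0)
    for sig in (0.8, 1.0, 1.2)
}
```

Energies tabulated this way for every sampled configuration and every parameter combination are the kind of input a multistate reweighting estimator such as MBAR consumes; the Coulomb term and the adaptive sampling loop are left out of this sketch.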
PaPaS: A Portable, Lightweight, and Generic Framework for Parallel Parameter Studies
The current landscape of scientific research is widely based on modeling and
simulation, typically with complexity in the simulation's flow of execution and
parameterization properties. Execution flows are not necessarily
straightforward since they may need multiple processing tasks and iterations.
Furthermore, parameter and performance studies are common approaches used to
characterize a simulation, often requiring traversal of a large parameter
space. High-performance computers offer practical resources at the expense of
users handling the setup, submission, and management of jobs. This work
presents the design of PaPaS, a portable, lightweight, and generic workflow
framework for conducting parallel parameter and performance studies. Workflows
are defined using parameter files based on keyword-value pairs syntax, thus
removing from the user the overhead of creating complex scripts to manage the
workflow. A parameter set consists of any combination of environment variables,
files, partial file contents, and command line arguments. PaPaS is being
developed in Python 3 with support for distributed parallelization using SSH,
batch systems, and C++ MPI. The PaPaS framework will run as user processes, and
can be used in single/multi-node and multi-tenant computing systems. An example
simulation using the BehaviorSpace tool from NetLogo and a matrix multiply
using OpenMP are presented as parameter and performance studies, respectively.
The results demonstrate that the PaPaS framework offers a simple method for
defining and managing parameter studies, while increasing resource utilization.
Comment: 8 pages, 6 figures, PEARC '18: Practice and Experience in Advanced
Research Computing, July 22-26, 2018, Pittsburgh, PA, US
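
The notion of a parameter set traversing a large parameter space can be sketched as a Cartesian product over keyword-value pairs. The snippet below is only an illustration of that idea: it does not use PaPaS or its parameter file syntax, and the function name `expand_parameter_study` and the example keys are hypothetical.

```python
from itertools import product


def expand_parameter_study(parameters):
    """Expand keyword-value pairs into one dict per parameter combination.

    A value given as a list or tuple is swept; any other value is held
    fixed across all combinations.
    """
    names = list(parameters)
    value_lists = [
        list(parameters[name])
        if isinstance(parameters[name], (list, tuple))
        else [parameters[name]]
        for name in names
    ]
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


# Hypothetical performance study: sweep thread count and matrix size.
tasks = expand_parameter_study(
    {"threads": [1, 2, 4], "matrix_size": [256, 512], "binary": "./matmul"}
)
for task in tasks:
    print(task)
```

Each resulting dict describes one independent task, which is what makes such studies straightforward to distribute over nodes or batch jobs.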
Quantum Mechanics with Trajectories: Quantum Trajectories and Adaptive Grids
Although the foundations of the hydrodynamical formulation of quantum
mechanics were laid over 50 years ago, it has only been within the past few
years that viable computational implementations have been developed. One
approach to solving the hydrodynamic equations uses quantum trajectories as the
computational tool. The trajectory equations of motion are described and
methods for implementation are discussed, including fitting of the fields to
Gaussian clusters.
Comment: Prepared for CiSE, Computing in Science and Engineering IEEE/AIP
special issue on computational chemistry
Metric for attractor overlap
We present the first general metric for attractor overlap (MAO) facilitating
an unsupervised comparison of flow data sets. The starting point is two or more
attractors, i.e., ensembles of states representing different operating
conditions. The proposed metric generalizes the standard Hilbert-space distance
between two snapshots to snapshot ensembles of two attractors. A reduced-order
analysis for big data and many attractors is enabled by coarse-graining the
snapshots into representative clusters with corresponding centroids and
population probabilities. For a large number of attractors, MAO is augmented by
proximity maps for the snapshots, the centroids, and the attractors, giving
scientifically interpretable visual access to the closeness of the states. The
coherent structures belonging to the overlap and disjoint states between these
attractors are distilled by few representative centroids. We employ MAO for two
quite different actuated flow configurations: (1) a two-dimensional wake of the
fluidic pinball with vortices in a narrow frequency range and (2)
three-dimensional wall turbulence with broadband frequency spectrum manipulated
by spanwise traveling transversal surface waves. MAO compares and classifies
these actuated flows in agreement with physical intuition. For instance, the
first feature coordinate of the attractor proximity map correlates with drag
for the fluidic pinball and for the turbulent boundary layer. MAO has a large
spectrum of potential applications ranging from a quantitative comparison
between numerical simulations and experimental particle-image velocimetry data
to the analysis of simulations representing a myriad of different operating
conditions.
Comment: 33 pages, 20 figures
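
The coarse-graining step mentioned in the abstract above (snapshots reduced to clusters with centroids and population probabilities) can be sketched as a nearest-centroid assignment. This is an assumption-laden illustration, not the paper's MAO definition: the centroids are taken as given, the Euclidean norm stands in for the Hilbert-space distance, and the function names are hypothetical.

```python
import math


def distance(a, b):
    """Euclidean distance between two snapshots given as sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def population_probabilities(snapshots, centroids):
    """Fraction of snapshots whose nearest centroid is each cluster."""
    counts = [0] * len(centroids)
    for snapshot in snapshots:
        nearest = min(
            range(len(centroids)),
            key=lambda k: distance(snapshot, centroids[k]),
        )
        counts[nearest] += 1
    return [count / len(snapshots) for count in counts]


# Hypothetical two-dimensional snapshots of one attractor and two centroids.
snapshots = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2), (1.0, 1.0)]
centroids = [(0.0, 0.0), (1.0, 1.0)]
probabilities = population_probabilities(snapshots, centroids)
```

Comparing attractors then operates on a handful of centroids and their probabilities per attractor instead of on every snapshot, which is what makes the analysis tractable for large data sets.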
High-speed detection of emergent market clustering via an unsupervised parallel genetic algorithm
We implement a master-slave parallel genetic algorithm (PGA) with a bespoke
log-likelihood fitness function to identify emergent clusters within price
evolutions. We use graphics processing units (GPUs) to implement a PGA and
visualise the results using disjoint minimal spanning trees (MSTs). We
demonstrate that our GPU PGA, implemented on a commercially available general
purpose GPU, is able to recover stock clusters in sub-second speed, based on a
subset of stocks in the South African market. This represents a pragmatic
choice for low-cost, scalable parallel computing and is significantly faster
than a prototype serial implementation in an optimised C-based
fourth-generation programming language, although the results are not directly
comparable due to compiler differences. Combined with fast online intraday
correlation matrix estimation from high frequency data for cluster
identification, the proposed implementation offers cost-effective,
near-real-time risk assessment for financial practitioners.
Comment: 10 pages, 5 figures, 4 tables, More thorough discussion of
implementation
Atomistic studies of thin film growth
We present here a summary of some recent techniques used for atomistic
studies of thin film growth and morphological evolution. Specific attention is
given to a new kinetic Monte Carlo technique in which the usage of unique
labeling schemes of the environment of the diffusing entity allows the
development of a closed data base of 49 single atom diffusion processes for
periphery motion. The activation energy barriers and diffusion paths are
calculated using reliable many-body interatomic potentials. The application of
the technique to the diffusion of 2-dimensional Cu clusters on Cu(111) shows
interesting trends in the diffusion rate and in the frequencies of the
microscopic mechanisms which are responsible for the motion of the clusters, as
a function of cluster size and temperature. The results are compared with those
obtained from yet another novel kinetic Monte Carlo technique in which an open
data base of the energetics and diffusion paths of microscopic processes is
continuously updated as needed. Comparisons are made with experimental data
where available.
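
A single step of a generic rejection-free kinetic Monte Carlo scheme, driven by a table of activation barriers, might look like the sketch below. It is not the labeling-based technique of the abstract above: the Arrhenius form with a fixed prefactor of 1e12 Hz, the barrier values, and the function names are all assumptions chosen for illustration.

```python
import math
import random

K_B = 8.617333262e-5  # Boltzmann constant in eV/K


def arrhenius_rate(barrier_ev, temperature_k, prefactor_hz=1e12):
    """Rate of a thermally activated process (assumed Arrhenius form)."""
    return prefactor_hz * math.exp(-barrier_ev / (K_B * temperature_k))


def kmc_step(barriers_ev, temperature_k, rng):
    """Pick one process with probability proportional to its rate.

    Returns the index of the chosen process and the time increment,
    drawn from an exponential distribution with the total rate.
    """
    rates = [arrhenius_rate(b, temperature_k) for b in barriers_ev]
    total = sum(rates)
    threshold = rng.random() * total
    cumulative = 0.0
    chosen = len(rates) - 1  # fallback guards against round-off
    for index, rate in enumerate(rates):
        cumulative += rate
        if threshold < cumulative:
            chosen = index
            break
    dt = -math.log(1.0 - rng.random()) / total
    return chosen, dt


# Hypothetical barriers (eV) for three diffusion processes at 300 K.
rng = random.Random(0)
process, dt = kmc_step([0.30, 0.45, 0.60], 300.0, rng)
```

In a closed database approach the barrier list would be looked up from precomputed entries keyed by the local environment; in an open one, missing entries would be computed and stored when first encountered.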
Towards a lightweight generic computational grid framework for biological research
Background: An increasing number of scientific research projects require access to large-scale computational resources. This is particularly true in the biological field, whether to facilitate the analysis of large high-throughput data sets, or to perform large numbers of complex simulations, a characteristic of the emerging field of systems biology. Results: In this paper we present a lightweight generic framework for combining disparate computational resources at multiple sites (ranging from local computers and clusters to established national Grid services). A detailed guide describing how to set up the framework is available from the following URL: http://igrid-ext.cryst.bbk.ac.uk/portal_guide/. Conclusion: This approach is particularly (but not exclusively) appropriate for large-scale biology projects with multiple collaborators working at different national or international sites. The framework is relatively easy to set up, hides the complexity of Grid middleware from the user, and provides access to resources through a single, uniform interface. It has been developed as part of the European ImmunoGrid project.