Parallelization of a Dynamic Monte Carlo Algorithm: a Partially Rejection-Free Conservative Approach
We experiment with a massively parallel implementation of an algorithm for
simulating the dynamics of metastable decay in kinetic Ising models. The
parallel scheme is directly applicable to a wide range of stochastic cellular
automata where the discrete events (updates) are Poisson arrivals. For high
performance, we utilize a continuous-time, asynchronous parallel version of the
n-fold way rejection-free algorithm. Each processing element carries an l×l
block of spins, and we employ the fast SHMEM-library routines on the Cray T3E
distributed-memory parallel architecture. Different processing elements have
different local simulated times. To ensure causality, the algorithm handles the
asynchrony in a conservative fashion. Despite relatively low utilization and an
intricate relationship between the average time increment and the size of the
spin blocks, we find that for sufficiently large l the algorithm outperforms
its corresponding parallel Metropolis (non-rejection-free) counterpart. As an
example application, we present results for metastable decay in a model
ferromagnetic or ferroelectric film, observed with a probe of area smaller than
the total system.
Comment: 17 pages, 7 figures, RevTex; submitted to the Journal of
Computational Physics
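As an illustration of the rejection-free idea named in the abstract above, here
is a minimal serial sketch of one continuous-time update. It is not the paper's
parallel SHMEM implementation. It assumes Metropolis flip rates, periodic
boundaries, and a hypothetical function name (nfold_step), and it recomputes
every rate at each step instead of maintaining the spin classes that the n-fold
way uses for efficiency.

```python
import math
import random


def nfold_step(spins, L, beta, H, rng, J=1.0):
    """Perform one rejection-free flip on an L x L Ising lattice.

    Every call flips exactly one spin and returns the simulated time
    increment in Monte Carlo steps per spin. All names and the choice of
    Metropolis rates are assumptions made for this sketch.
    """
    rates = []
    for i in range(L):
        for j in range(L):
            # Sum of the four nearest neighbours with periodic boundaries.
            nn = (spins[(i + 1) % L][j] + spins[(i - 1) % L][j]
                  + spins[i][(j + 1) % L] + spins[i][(j - 1) % L])
            # Energy change if spin (i, j) is flipped.
            dE = 2.0 * spins[i][j] * (J * nn + H)
            # Metropolis flip probability, used here as the flip rate.
            rates.append(1.0 if dE <= 0.0 else math.exp(-beta * dE))

    total = sum(rates)
    # Pick the spin to flip with probability proportional to its rate.
    k = rng.choices(range(L * L), weights=rates)[0]
    i, j = divmod(k, L)
    spins[i][j] = -spins[i][j]
    # Waiting time of a Poisson process with the total flip rate.
    return -math.log(1.0 - rng.random()) / total


# Usage: start from all spins up in a field that favours spin down.
rng = random.Random(1)
L = 4
lattice = [[1] * L for _ in range(L)]
dt = nfold_step(lattice, L, beta=0.5, H=-0.5, rng=rng)
```

Because no proposed move is rejected, the cost of a step does not grow when
flip probabilities are small; the simulated time increment grows instead.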
The Parallelism Motifs of Genomic Data Analysis
Genomic data sets are growing dramatically as the cost of sequencing
continues to decline and small sequencing devices become available. Enormous
community databases store and share this data with the research community, but
some of these genomic data analysis problems require large scale computational
platforms to meet both the memory and computational requirements. These
applications differ from scientific simulations that dominate the workload on
high end parallel systems today and place different requirements on programming
support, software libraries, and parallel architectural design. For example,
they involve irregular communication patterns such as asynchronous updates to
shared data structures. We consider several problems in high performance
genomics analysis, including alignment, profiling, clustering, and assembly for
both single genomes and metagenomes. We identify some of the common
computational patterns or motifs that help inform parallelization strategies
and compare our motifs to some of the established lists, arguing that at least
two key patterns, sorting and hashing, are missing.
Dependability in Aggregation by Averaging
Aggregation is an important building block of modern distributed
applications, allowing the determination of meaningful properties (e.g. network
size, total storage capacity, average load, majorities, etc.) that are used to
direct the execution of the system. However, most existing aggregation
algorithms exhibit significant dependability issues when considered for use
in real application environments. In this paper, we reveal some
dependability issues of aggregation algorithms based on iterative averaging
techniques, giving some directions to solve them. This class of algorithms is
considered robust (when compared to common tree-based approaches), being
independent from the used routing topology and providing an aggregation result
at all nodes. However, their robustness is strongly challenged, and their
correctness often compromised, when the assumptions about their working
environment are made more realistic. The correctness of this class of algorithms
relies on the maintenance of a fundamental invariant, commonly designated as
"mass conservation". We will argue that this main invariant is often broken in
practical settings, and that additional mechanisms and modifications are
required to maintain it, incurring some degradation of the algorithms'
performance. In particular, we discuss the behavior of three representative
algorithms (Push-Sum Protocol, Push-Pull Gossip protocol, and Distributed Random
Grouping) under asynchronous and faulty (with message loss and node crashes)
environments. More specifically, we propose and evaluate two new versions of
the Push-Pull Gossip protocol, which solve its message interleaving problem
(evidenced even in a synchronous operation mode).
Comment: 14 pages. Presented in Inforum 200
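The "mass conservation" invariant mentioned in the abstract above can be
illustrated with a small simulation. The sketch below assumes a synchronous,
fully connected network and a simple per-message loss probability; the function
name push_sum and its parameters are hypothetical and are not taken from the
paper. Each node holds a (sum, weight) pair, keeps half, and sends the other
half to a random peer, so the totals stay constant unless a message is lost.

```python
import random


def push_sum(values, rounds, loss_prob=0.0, seed=0):
    """Simulate synchronous Push-Sum averaging (illustrative sketch).

    Returns the final per-node sums and weights. The estimate at node i
    is s[i] / w[i]. The topology and loss model are assumptions.
    """
    rng = random.Random(seed)
    n = len(values)
    s = [float(v) for v in values]
    w = [1.0] * n
    for _ in range(rounds):
        next_s = [0.0] * n
        next_w = [0.0] * n
        for i in range(n):
            half_s, half_w = s[i] / 2.0, w[i] / 2.0
            # Each node keeps one half of its mass.
            next_s[i] += half_s
            next_w[i] += half_w
            # The other half goes to a uniformly chosen different node.
            j = rng.randrange(n - 1)
            if j >= i:
                j += 1
            # A lost message removes its mass from the system.
            if rng.random() >= loss_prob:
                next_s[j] += half_s
                next_w[j] += half_w
        s, w = next_s, next_w
    return s, w


# Usage: without loss, the total mass is conserved and estimates agree.
s, w = push_sum([10, 20, 30, 40], rounds=100)
estimates = [si / wi for si, wi in zip(s, w)]

# With message loss, the total weight falls below the number of nodes,
# which is the broken invariant the abstract refers to.
s_lossy, w_lossy = push_sum([10, 20, 30, 40], rounds=100, loss_prob=0.2)
```

In the lossless run every estimate approaches the true average of 25, while in
the lossy run the sums and weights leak mass, so nothing guarantees that the
ratios still converge to the correct value.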