Sequential Monte Carlo for Graphical Models
We propose a new framework for using sequential Monte Carlo (SMC)
algorithms for inference in probabilistic graphical models (PGMs). Via a
sequential decomposition of the PGM we find a sequence of auxiliary
distributions defined on a monotonically increasing sequence of probability
spaces. By targeting these auxiliary distributions using SMC we are able to
approximate the full joint distribution defined by the PGM. One of the key
merits of the SMC sampler is that it provides an unbiased estimate of the
partition function of the model. We also show how it can be used within a
particle Markov chain Monte Carlo framework in order to construct
high-dimensional block-sampling algorithms for general PGMs.
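To make the idea concrete, here is a minimal sketch (not the paper's implementation) of the scheme on a small binary chain MRF: variables are added one at a time, so each auxiliary target is defined on a larger space than the last, and the running product of average particle weights yields the unbiased partition-function estimate. The potential values and uniform proposal are illustrative assumptions.

```python
# SMC over a sequential decomposition of a 4-variable binary chain MRF.
# The product of per-step mean weights is an unbiased estimate of Z.
import itertools
import random

# Pairwise potential on the chain x1-x2-x3-x4 (illustrative values).
PSI = [[1.0, 0.5], [0.5, 2.0]]
T = 4  # number of variables

def exact_partition():
    """Brute-force Z: sum the product of potentials over all 2^T configurations."""
    z = 0.0
    for x in itertools.product((0, 1), repeat=T):
        w = 1.0
        for i in range(T - 1):
            w *= PSI[x[i]][x[i + 1]]
        z += w
    return z

def smc_partition(n_particles, rng):
    """Run SMC on the sequential decomposition; return the Z estimate."""
    particles = [[] for _ in range(n_particles)]
    z_hat = 1.0
    for t in range(T):
        weights = []
        for p in particles:
            x_t = rng.randint(0, 1)            # uniform proposal, q = 1/2
            # Incremental weight: new potential factor / proposal density.
            inc = 1.0 if t == 0 else PSI[p[-1]][x_t]
            p.append(x_t)
            weights.append(inc / 0.5)
        z_hat *= sum(weights) / n_particles    # running product of weight means
        # Multinomial resampling keeps particles on high-weight prefixes.
        particles = [list(p) for p in
                     rng.choices(particles, weights=weights, k=n_particles)]
    return z_hat
```

With enough particles the estimate concentrates around the brute-force value; the sequential decomposition here is trivial (left to right along the chain), whereas the paper's framework covers general PGMs.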
Locality and Singularity for Store-Atomic Memory Models
Robustness is a correctness notion for concurrent programs running under
relaxed consistency models. The task is to check that the relaxed behavior
coincides (up to traces) with sequential consistency (SC). Although
computationally simple on paper (robustness has been shown to be
PSPACE-complete for TSO, PGAS, and Power), building a practical robustness
checker remains a challenge. The problem is that the various relaxations lead
to a dramatic number of computations, only a few of which violate robustness.
In the present paper, we set out to reduce the search space for robustness
checkers. We focus on store-atomic consistency models and establish two
completeness results. The first result, called locality, states that a
non-robust program always contains a violating computation where only one
thread delays commands. The second result, called singularity, is even stronger
but restricted to programs without lightweight fences. It states that there is
a violating computation where a single store is delayed.
As an application of the results, we derive a linear-size source-to-source
translation of robustness to SC-reachability. It applies to general programs,
regardless of the data domain and potentially with an unbounded number of
threads and with unbounded buffers. We have implemented the translation and
verified, for the first time, PGAS algorithms in a fully automated fashion. For
TSO, our analysis outperforms existing tools.
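The singularity result can be illustrated on the classic store-buffering litmus test (a hypothetical sketch, not the paper's checker): under SC the two loads cannot both read 0, but delaying a single store past the later load of the same thread reaches exactly that outcome.

```python
# Store-buffering litmus test: each thread writes its own flag, then
# reads the other's. Under SC, (r1, r2) = (0, 0) is unreachable; with
# one delayed store (store-load reordering, as in TSO) it is reachable.
import itertools

# Ops are (kind, variable, value-or-register).
THREAD1 = [("st", "x", 1), ("ld", "y", "r1")]
THREAD2 = [("st", "y", 1), ("ld", "x", "r2")]

def interleavings(a, b):
    """All merges of two sequences preserving each sequence's own order."""
    if not a:
        yield tuple(b); return
    if not b:
        yield tuple(a); return
    for rest in interleavings(a[1:], b):
        yield (a[0],) + rest
    for rest in interleavings(a, b[1:]):
        yield (b[0],) + rest

def run(schedule):
    """Execute one schedule against shared memory; return (r1, r2)."""
    mem = {"x": 0, "y": 0}
    regs = {}
    for kind, var, arg in schedule:
        if kind == "st":
            mem[var] = arg
        else:
            regs[arg] = mem[var]
    return (regs["r1"], regs["r2"])

sc_outcomes = {run(s) for s in interleavings(THREAD1, THREAD2)}
# Delay exactly one store: commit thread 1's store after its load.
delayed_outcomes = {run(s)
                    for s in interleavings([THREAD1[1], THREAD1[0]], THREAD2)}
```

The violating computation needs only this single delayed store, which is what makes a linear-size reduction to SC-reachability plausible: the checker need not track arbitrary buffer contents.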
Type oriented parallel programming for Exascale
Whilst there have been great advances in HPC hardware and software in recent
years, the languages and models that we use to program these machines have
remained much more static. This is not for lack of effort, but because the
foundations on which many programming languages are built lack the
expressivity required for parallel work. The result is an implicit trade-off
between programmability and performance, made worse by the fact that, whilst
many scientific users are experts within their own fields, they are not HPC
experts.
Type oriented programming looks to address this by encoding the complexity of
a language via the type system. Most of the language functionality is contained
within a loosely coupled type library that can be flexibly used to control many
aspects such as parallelism. Because this approach is high level, much
information is available during compilation for optimisation; in the absence
of type information, the compiler can apply sensible defaults, thus supporting
both the expert programmer and the novice alike.
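The flavour of this can be sketched in ordinary Python (the names Block, Cyclic, and partition are illustrative assumptions, not the paper's type library): parallelism details live in loosely coupled type attributes, and code that supplies no type information falls back to a sensible default.

```python
# Sketch of type-oriented control of data distribution: the "type"
# carried by an array decides how it is partitioned across processes,
# and omitting it yields a sensible default.
from dataclasses import dataclass

@dataclass
class Block:
    """Type attribute: contiguous block distribution across processes."""
    nprocs: int

@dataclass
class Cyclic:
    """Type attribute: round-robin (cyclic) distribution across processes."""
    nprocs: int

DEFAULT = Block(nprocs=4)  # novice code gets a sensible default

def partition(data, dist=None):
    """Split `data` according to the distribution carried by its type."""
    dist = dist or DEFAULT
    if isinstance(dist, Block):
        size = -(-len(data) // dist.nprocs)      # ceiling division
        return [data[i:i + size] for i in range(0, len(data), size)]
    if isinstance(dist, Cyclic):
        return [data[p::dist.nprocs] for p in range(dist.nprocs)]
    raise TypeError(f"unknown distribution {dist!r}")
```

For example, `partition(list(range(8)))` uses the default block distribution, while adding the type information `Cyclic(2)` switches the same call site to a cyclic layout; in the paper's setting this choice is resolved statically by the compiler rather than at run time.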
We demonstrate that, at no performance or scalability penalty when running on
up to 8196 cores of a Cray XE6 system, codes written in this type oriented
manner provide improved programmability. The programmer can write simple,
implicitly parallel HPC code at a high level and then explicitly tune it by
adding additional type information if required.

Comment: As presented at the Exascale Applications and Software Conference
(EASC), 9th-11th April 201