10,497 research outputs found
Robustness against Power is PSPACE-complete
Power is a RISC architecture developed by IBM, Freescale, and several other
companies and implemented in a series of POWER processors. The architecture
features a relaxed memory model providing very weak guarantees with respect to
the ordering and atomicity of memory accesses.
Due to these weaknesses, some programs that are correct under sequential
consistency (SC) show undesirable effects when run under Power. We call these
programs not robust against the Power memory model. Formally, a program is
robust if every computation under Power has the same data and control
dependencies as some SC computation.
Our contribution is a decision procedure for robustness of concurrent
programs against the Power memory model. It is based on three ideas. First, we
reformulate robustness in terms of the acyclicity of a happens-before relation.
Second, we prove that among the computations with cyclic happens-before
relation there is one in a certain normal form. Finally, we reduce the
existence of such a normal-form computation to a language emptiness problem.
Altogether, this yields a PSPACE algorithm for checking robustness against
Power. We complement it by a matching lower bound to show PSPACE-completeness
A Theory of Partitioned Global Address Spaces
Partitioned global address space (PGAS) is a parallel programming model for
the development of applications on clusters. It provides a global address space
partitioned among the cluster nodes, and is supported in programming languages
like C, C++, and Fortran by means of APIs. In this paper we provide a formal
model for the semantics of single instruction, multiple data programs using
PGAS APIs. Our model reflects the main features of popular real-world APIs such
as SHMEM, ARMCI, GASNet, GPI, and GASPI.
A key feature of PGAS is the support for one-sided communication: a node may
directly read and write the memory located at a remote node, without explicit
synchronization with the processes running on the remote side. One-sided
communication increases performance by decoupling process synchronization from
data transfer, but requires the programmer to reason about appropriate
synchronizations between reads and writes. As a second contribution, we propose
and investigate robustness, a criterion for correct synchronization of PGAS
programs. Robustness corresponds to acyclicity of a suitable happens-before
relation defined on PGAS computations. The requirement is finer than the
classical data race freedom and rules out most false error reports.
Our main result is an algorithm for checking robustness of PGAS programs. The
algorithm makes use of two insights. Using combinatorial arguments we first
show that, if a PGAS program is not robust, then there are computations in a
certain normal form that violate happens-before acyclicity. Intuitively,
normal-form computations delay remote accesses in an ordered way. We then
devise an algorithm that checks for cyclic normal-form computations.
Essentially, the algorithm is an emptiness check for a novel automaton model
that accepts normal-form computations in streaming fashion. Altogether, we
prove the robustness problem is PSpace-complete
Locality and Singularity for Store-Atomic Memory Models
Robustness is a correctness notion for concurrent programs running under
relaxed consistency models. The task is to check that the relaxed behavior
coincides (up to traces) with sequential consistency (SC). Although
computationally simple on paper (robustness has been shown to be
PSPACE-complete for TSO, PGAS, and Power), building a practical robustness
checker remains a challenge. The problem is that the various relaxations lead
to a dramatic number of computations, only few of which violate robustness.
In the present paper, we set out to reduce the search space for robustness
checkers. We focus on store-atomic consistency models and establish two
completeness results. The first result, called locality, states that a
non-robust program always contains a violating computation where only one
thread delays commands. The second result, called singularity, is even stronger
but restricted to programs without lightweight fences. It states that there is
a violating computation where a single store is delayed.
As an application of the results, we derive a linear-size source-to-source
translation of robustness to SC-reachability. It applies to general programs,
regardless of the data domain and potentially with an unbounded number of
threads and with unbounded buffers. We have implemented the translation and
verified, for the first time, PGAS algorithms in a fully automated fashion. For
TSO, our analysis outperforms existing tools
Towards the fast and robust optimal design of Wireless Body Area Networks
Wireless body area networks are wireless sensor networks whose adoption has
recently emerged and spread in important healthcare applications, such as the
remote monitoring of health conditions of patients. A major issue associated
with the deployment of such networks is represented by energy consumption: in
general, the batteries of the sensors cannot be easily replaced and recharged,
so containing the usage of energy by a rational design of the network and of
the routing is crucial. Another issue is represented by traffic uncertainty:
body sensors may produce data at a variable rate that is not exactly known in
advance, for example because the generation of data is event-driven. Neglecting
traffic uncertainty may lead to wrong design and routing decisions, which may
compromise the functionality of the network and have very bad effects on the
health of the patients. In order to address these issues, in this work we
propose the first robust optimization model for jointly optimizing the topology
and the routing in body area networks under traffic uncertainty. Since the
problem may result challenging even for a state-of-the-art optimization solver,
we propose an original optimization algorithm that exploits suitable linear
relaxations to guide a randomized fixing of the variables, supported by an
exact large variable neighborhood search. Experiments on realistic instances
indicate that our algorithm performs better than a state-of-the-art solver,
fast producing solutions associated with improved optimality gaps.Comment: Authors' manuscript version of the paper that was published in
Applied Soft Computin
QuickCSG: Fast Arbitrary Boolean Combinations of N Solids
QuickCSG computes the result for general N-polyhedron boolean expressions
without an intermediate tree of solids. We propose a vertex-centric view of the
problem, which simplifies the identification of final geometric contributions,
and facilitates its spatial decomposition. The problem is then cast in a single
KD-tree exploration, geared toward the result by early pruning of any region of
space not contributing to the final surface. We assume strong regularity
properties on the input meshes and that they are in general position. This
simplifying assumption, in combination with our vertex-centric approach,
improves the speed of the approach. Complemented with a task-stealing
parallelization, the algorithm achieves breakthrough performance, one to two
orders of magnitude speedups with respect to state-of-the-art CPU algorithms,
on boolean operations over two to dozens of polyhedra. The algorithm also
outperforms GPU implementations with approximate discretizations, while
producing an output without redundant facets. Despite the restrictive
assumptions on the input, we show the usefulness of QuickCSG for applications
with large CSG problems and strong temporal constraints, e.g. modeling for 3D
printers, reconstruction from visual hulls and collision detection
- …