69,074 research outputs found
Conflict-free star-access in parallel memory systems
We study conflict-free data distribution schemes in parallel memories in multiprocessor system architectures. Given a host graph G, the problem is to map the nodes of G into memory modules such that any instance of a template type T in G can be accessed without memory conflicts. A conflict occurs if two or more nodes of T are mapped to the same memory module. The mapping algorithm should: (i) be fast in terms of data access (possibly mapping each node in constant time); (ii) minimize the required number of memory modules for accessing any instance in G of the given template type; and (iii) guarantee load balancing on the modules. In this paper, we consider conflict-free access to star templates. i.e., to any node of G along with all of its neighbors. Such a template type arises in many classical algorithms like breadth-first search in a graph, message broadcasting in networks, and nearest neighbor based approximation in numerical computation. We consider the star-template access problem on two specific host graphs-tori and hypercubes-that are also popular interconnection network topologies. The proposed conflict-free mappings on these graphs are fast, use an optimal or provably good number of memory modules, and guarantee load balancing. (C) 2006 Elsevier Inc. All rights reserved
Locality-Adaptive Parallel Hash Joins Using Hardware Transactional Memory
Previous work [1] has claimed that the best performing implementation of in-memory hash joins is based on (radix-)partitioning of the build-side input. Indeed, despite the overhead of partitioning, the benefits from increased cache-locality and synchronization free parallelism in the build-phase outweigh the costs when the input data is randomly ordered. However, many datasets already exhibit significant spatial locality (i.e., non-randomness) due to the way data items enter the database: through periodic ETL or trickle loaded in the form of transactions. In such cases, the first benefit of partitioning — increased locality — is largely irrelevant. In this paper, we demonstrate how hardware transactional memory (HTM) can render the other benefit, freedom from synchronization, irrelevant as well. Specifically, using careful analysis and engineering, we develop an adaptive hash join implementation that outperforms parallel radix-partitioned hash joins as well as sort-merge joins on data with high spatial locality. In addition, we show how, through lightweight (less than 1% overhead) runtime monitoring of the transaction abort rate, our implementation can detect inputs with low spatial locality and dynamically fall back to radix-partitioning of the build-side input. The result is a hash join implementation that is more than 3 times faster than the state-of-the-art on high-locality data and never more than 1% slower
The Entity Registry System: Implementing 5-Star Linked Data Without the Web
Linked Data applications often assume that connectivity to data repositories
and entity resolution services are always available. This may not be a valid
assumption in many cases. Indeed, there are about 4.5 billion people in the
world who have no or limited Web access. Many data-driven applications may have
a critical impact on the life of those people, but are inaccessible to those
populations due to the architecture of today's data registries. In this paper,
we propose and evaluate a new open-source system that can be used as a
general-purpose entity registry suitable for deployment in poorly-connected or
ad-hoc environments.Comment: 16 pages, authors are listed in alphabetical orde
On partial order semantics for SAT/SMT-based symbolic encodings of weak memory concurrency
Concurrent systems are notoriously difficult to analyze, and technological
advances such as weak memory architectures greatly compound this problem. This
has renewed interest in partial order semantics as a theoretical foundation for
formal verification techniques. Among these, symbolic techniques have been
shown to be particularly effective at finding concurrency-related bugs because
they can leverage highly optimized decision procedures such as SAT/SMT solvers.
This paper gives new fundamental results on partial order semantics for
SAT/SMT-based symbolic encodings of weak memory concurrency. In particular, we
give the theoretical basis for a decision procedure that can handle a fragment
of concurrent programs endowed with least fixed point operators. In addition,
we show that a certain partial order semantics of relaxed sequential
consistency is equivalent to the conjunction of three extensively studied weak
memory axioms by Alglave et al. An important consequence of this equivalence is
an asymptotically smaller symbolic encoding for bounded model checking which
has only a quadratic number of partial order constraints compared to the
state-of-the-art cubic-size encoding.Comment: 15 pages, 3 figure
Cost-effective HPC clustering for computer vision applications
We will present a cost-effective and flexible realization of high performance computing (HPC) clustering and its potential in solving computationally intensive problems in computer vision. The featured software foundation to support the parallel programming is the GNU parallel Knoppix package with message passing interface (MPI) based Octave, Python and C interface capabilities. The implementation is especially of interest in applications where the main objective is to reuse the existing hardware infrastructure and to maintain the overall budget cost. We will present the benchmark results and compare and contrast the performances of Octave and MATLAB
- …