1,420,334 research outputs found
The Glasgow Parallel Reduction Machine: Programming Shared-memory Many-core Systems using Parallel Task Composition
We present the Glasgow Parallel Reduction Machine (GPRM), a novel, flexible
framework for parallel task-composition based many-core programming. We allow
the programmer to structure programs into task code, written as C++ classes,
and communication code, written in a restricted subset of C++ with functional
semantics and parallel evaluation. In this paper we discuss the GPRM, the
virtual machine framework that enables the parallel task composition approach.
We focus the discussion on GPIR, the functional language used as the
intermediate representation of the bytecode running on the GPRM. Using examples
in this language we show the flexibility and power of our task composition
framework. We demonstrate the potential using an implementation of a merge sort
algorithm on a 64-core Tilera processor, as well as on a conventional Intel
quad-core processor and an AMD 48-core processor system. We also compare our
framework with OpenMP tasks in a parallel pointer chasing algorithm running on
the Tilera processor. Our results show that the GPRM programs outperform the
corresponding OpenMP codes on all test platforms, and can greatly facilitate
writing of parallel programs, in particular non-data parallel algorithms such
as reductions.Comment: In Proceedings PLACES 2013, arXiv:1312.221
Efficient Processing of Spatial Joins Using R-Trees
Abstract: In this paper, we show that spatial joins are very suitable to be processed on a parallel hardware platform. The parallel system is equipped with a so-called shared virtual memory which is well-suited for the design and implementation of parallel spatial join algorithms. We start with an algorithm that consists of three phases: task creation, task assignment and parallel task execu-tion. In order to reduce CPU- and I/O-cost, the three phases are processed in a fashion that pre-serves spatial locality. Dynamic load balancing is achieved by splitting tasks into smaller ones and reassigning some of the smaller tasks to idle processors. In an experimental performance compar-ison, we identify the advantages and disadvantages of several variants of our algorithm. The most efficient one shows an almost optimal speed-up under the assumption that the number of disks is sufficiently large. Topics: spatial database systems, parallel database systems
Toward Contention Analysis for Parallel Executing Real-Time Tasks
In measurement-based probabilistic timing analysis, the execution conditions imposed to tasks as measurement scenarios, have a strong impact to the worst-case execution time estimates. The scenarios and their effects on the task execution behavior have to be deeply investigated. The aim has to be to identify and to guarantee the scenarios that lead to the maximum measurements, i.e. the worst-case scenarios, and use them to assure the worst-case execution time estimates.
We propose a contention analysis in order to identify the worst contentions that a task can suffer from concurrent executions. The work focuses on the interferences on shared resources (cache memories and memory buses) from parallel executions in multi-core real-time systems. Our approach consists of searching for possible task contenders for parallel executions, modeling their contentiousness, and classifying the measurement scenarios accordingly. We identify the most contentious ones and their worst-case effects on task execution times. The measurement-based probabilistic timing analysis is then used to verify the analysis proposed, qualify the scenarios with contentiousness, and compare them. A parallel execution simulator for multi-core real-time system is developed and used for validating our framework.
The framework applies heuristics and assumptions that simplify the system behavior. It represents a first step for developing a complete approach which would be able to guarantee the worst-case behavior
- …
