1,420,334 research outputs found

    The Glasgow Parallel Reduction Machine: Programming Shared-memory Many-core Systems using Parallel Task Composition

    Get PDF
    We present the Glasgow Parallel Reduction Machine (GPRM), a novel, flexible framework for parallel task-composition based many-core programming. We allow the programmer to structure programs into task code, written as C++ classes, and communication code, written in a restricted subset of C++ with functional semantics and parallel evaluation. In this paper we discuss the GPRM, the virtual machine framework that enables the parallel task composition approach. We focus the discussion on GPIR, the functional language used as the intermediate representation of the bytecode running on the GPRM. Using examples in this language we show the flexibility and power of our task composition framework. We demonstrate the potential using an implementation of a merge sort algorithm on a 64-core Tilera processor, as well as on a conventional Intel quad-core processor and an AMD 48-core processor system. We also compare our framework with OpenMP tasks in a parallel pointer chasing algorithm running on the Tilera processor. Our results show that the GPRM programs outperform the corresponding OpenMP codes on all test platforms, and can greatly facilitate writing of parallel programs, in particular non-data parallel algorithms such as reductions.Comment: In Proceedings PLACES 2013, arXiv:1312.221

    Efficient Processing of Spatial Joins Using R-Trees

    Get PDF
    Abstract: In this paper, we show that spatial joins are very suitable to be processed on a parallel hardware platform. The parallel system is equipped with a so-called shared virtual memory which is well-suited for the design and implementation of parallel spatial join algorithms. We start with an algorithm that consists of three phases: task creation, task assignment and parallel task execu-tion. In order to reduce CPU- and I/O-cost, the three phases are processed in a fashion that pre-serves spatial locality. Dynamic load balancing is achieved by splitting tasks into smaller ones and reassigning some of the smaller tasks to idle processors. In an experimental performance compar-ison, we identify the advantages and disadvantages of several variants of our algorithm. The most efficient one shows an almost optimal speed-up under the assumption that the number of disks is sufficiently large. Topics: spatial database systems, parallel database systems

    Toward Contention Analysis for Parallel Executing Real-Time Tasks

    Get PDF
    In measurement-based probabilistic timing analysis, the execution conditions imposed to tasks as measurement scenarios, have a strong impact to the worst-case execution time estimates. The scenarios and their effects on the task execution behavior have to be deeply investigated. The aim has to be to identify and to guarantee the scenarios that lead to the maximum measurements, i.e. the worst-case scenarios, and use them to assure the worst-case execution time estimates. We propose a contention analysis in order to identify the worst contentions that a task can suffer from concurrent executions. The work focuses on the interferences on shared resources (cache memories and memory buses) from parallel executions in multi-core real-time systems. Our approach consists of searching for possible task contenders for parallel executions, modeling their contentiousness, and classifying the measurement scenarios accordingly. We identify the most contentious ones and their worst-case effects on task execution times. The measurement-based probabilistic timing analysis is then used to verify the analysis proposed, qualify the scenarios with contentiousness, and compare them. A parallel execution simulator for multi-core real-time system is developed and used for validating our framework. The framework applies heuristics and assumptions that simplify the system behavior. It represents a first step for developing a complete approach which would be able to guarantee the worst-case behavior
    corecore