
    A hierarchic task-based programming model for distributed heterogeneous computing

    Distributed computing platforms are evolving into heterogeneous ecosystems of Clusters, Grids, and Clouds whose computing nodes combine processors with different core architectures, accelerators (e.g. GPUs, FPGAs), and different memory and storage devices, in order to achieve better performance with lower energy consumption. As a consequence of this heterogeneity, programming applications for these distributed heterogeneous platforms becomes a complex task. In addition to the complexity of developing an application for distributed platforms, developers must now also deal with the complexity of the different computing devices inside each node. In this article, we present a programming model that aims to facilitate the development and execution of applications on current and future distributed heterogeneous parallel architectures. This programming model is based on the hierarchical composition of the COMP Superscalar and Omp Superscalar programming models, which allows developers to implement infrastructure-agnostic applications. The underlying runtime enables applications to adapt to the infrastructure without the need to maintain different versions of the code. Our programming model proposal has been evaluated on real platforms in terms of heterogeneous resource usage, performance, and adaptation.

    This work has been supported by the European Commission through the Horizon 2020 Research and Innovation program under contract 687584 (TANGO project), by the Spanish Government under contract TIN2015-65316 and grant SEV-2015-0493 (Severo Ochoa Program), and by Generalitat de Catalunya under contracts 2014-SGR-1051 and 2014-SGR-1272.

    Peer Reviewed. Postprint (author's final draft)
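
    To make the task-based style concrete, below is a minimal Python sketch of the idea the abstract describes: functions marked as tasks return immediately, and the runtime resolves data dependencies between them before each task body runs. The task decorator and thread-pool "runtime" here are illustrative stand-ins only, not the actual COMP Superscalar/Omp Superscalar API.

        from concurrent.futures import Future, ThreadPoolExecutor

        _executor = ThreadPoolExecutor(max_workers=4)   # stand-in for the runtime

        def task(fn):
            # Mark a function as a task: calls return a Future immediately.
            # Future arguments are resolved before the body runs, which is
            # how the data-dependency graph is encoded implicitly.
            def submit(*args):
                def run():
                    resolved = [a.result() if isinstance(a, Future) else a
                                for a in args]
                    return fn(*resolved)
                return _executor.submit(run)
            return submit

        @task
        def increment(x):
            return x + 1

        @task
        def add(a, b):
            return a + b

        if __name__ == "__main__":
            a = increment(1)        # runs asynchronously
            b = increment(2)        # independent of a, may run in parallel
            total = add(a, b)       # depends on both increments
            print(total.result())   # -> 5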

    Scaling Monte Carlo Tree Search on Intel Xeon Phi

    Many algorithms have been parallelized successfully on the Intel Xeon Phi coprocessor, especially those with regular, balanced, and predictable data access patterns and instruction flows. Irregular and unbalanced algorithms are harder to parallelize efficiently. They are present, for instance, in artificial intelligence search algorithms such as Monte Carlo Tree Search (MCTS). In this paper we study the scaling behavior of MCTS, on a highly optimized real-world application, on real hardware. The Intel Xeon Phi allows shared-memory scaling studies up to 61 cores and 244 hardware threads. We compare work-stealing (Cilk Plus and TBB) and work-sharing (FIFO scheduling) approaches. Interestingly, we find that a straightforward thread pool with a work-sharing FIFO queue shows the best performance. A crucial element of this high performance is controlling the grain size, an approach that we call Grain Size Controlled Parallel MCTS. A subsequent comparison with Xeon CPUs shows an even clearer performance distinction between the different threading libraries. We achieve, to the best of our knowledge, the fastest implementation of a parallel MCTS on the 61-core Intel Xeon Phi using a real application (a speedup of 47 relative to a sequential run).

    Comment: 8 pages, 9 figures
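
    As a rough illustration of the work-sharing approach the paper favors (not the authors' code), the Python sketch below runs playouts through a thread pool fed by a FIFO queue, with the grain size setting how many playouts each queued work item performs. The playout function is a trivial stand-in for a real MCTS simulation.

        import queue
        import random
        import threading

        def playout(rng):
            # Trivial stand-in for one MCTS simulation from the root:
            # returns 1 for a win, 0 for a loss.
            return 1 if rng.random() < 0.5 else 0

        def run_mcts(total_playouts, grain_size, n_threads):
            work = queue.Queue()        # shared FIFO queue: work-sharing
            results = []
            lock = threading.Lock()

            # The grain size fixes how many playouts one work item does:
            # too small and queue contention dominates; too large and load
            # balance suffers. This trade-off is what grain size control tunes.
            for start in range(0, total_playouts, grain_size):
                work.put(min(grain_size, total_playouts - start))

            def worker():
                rng = random.Random(threading.get_ident())
                while True:
                    try:
                        n = work.get_nowait()
                    except queue.Empty:
                        return          # queue drained: worker exits
                    wins = sum(playout(rng) for _ in range(n))
                    with lock:
                        results.append(wins)

            threads = [threading.Thread(target=worker) for _ in range(n_threads)]
            for t in threads:
                t.start()
            for t in threads:
                t.join()
            return sum(results)

        if __name__ == "__main__":
            print(run_mcts(total_playouts=10_000, grain_size=250, n_threads=8))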

    Eliciting Expertise

    Since the last edition of this book there have been rapid developments in the use and exploitation of formally elicited knowledge. Previously (Shadbolt and Burton, 1995), the emphasis was on eliciting knowledge for the purpose of building expert or knowledge-based systems. These systems are computer programs intended to solve real-world problems, achieving the same level of accuracy as human experts. Knowledge engineering is the discipline that has evolved to support the whole process of specifying, developing, and deploying knowledge-based systems (Schreiber et al., 2000). This chapter will discuss the problem of knowledge elicitation for knowledge-intensive systems in general.

    Towards an Adaptive Skeleton Framework for Performance Portability

    The proliferation of widely available, but very different, parallel architectures makes the ability to deliver good parallel performance on a range of architectures, or performance portability, highly desirable. Irregularly-parallel problems, where the number and size of tasks is unpredictable, are particularly challenging and require dynamic coordination. The paper outlines a novel approach to delivering portable parallel performance for irregularly parallel programs. The approach combines declarative parallelism with JIT technology, dynamic scheduling, and dynamic transformation. We present the design of an adaptive skeleton library, with a task graph implementation, JIT trace costing, and adaptive transformations. We outline the architecture of the prototype adaptive skeleton execution framework in Pycket, describing tasks, serialisation, and the current scheduler. We report a preliminary evaluation of the prototype framework using 4 micro-benchmarks and a small case study on two NUMA servers (24 and 96 cores) and a small cluster (17 hosts, 272 cores). Key results include Pycket delivering good sequential performance, e.g. almost as fast as C for some benchmarks; good absolute speedups on all architectures (up to 120 on 128 cores for sumEuler); and that the adaptive transformations do improve performance.
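
    As an illustration of the skeleton idea (not the paper's Pycket framework), the following Python sketch shows a chunked parallel map whose chunk size is the kind of knob an adaptive transformation would retune at run time, using a brute-force sumEuler, one of the abstract's benchmarks, as the irregular workload.

        from concurrent.futures import ProcessPoolExecutor
        from math import gcd

        def euler(n):
            # Brute-force Euler totient: per-element cost varies with n,
            # which makes the workload irregular.
            return sum(1 for k in range(1, n + 1) if gcd(n, k) == 1)

        def _sum_chunk(chunk):
            return sum(euler(n) for n in chunk)

        def sum_euler(limit, chunk_size, workers=4):
            # A chunked parallel map skeleton: chunk_size is the
            # granularity an adaptive transformation would retune.
            ns = range(1, limit + 1)
            chunks = [list(ns[i:i + chunk_size])
                      for i in range(0, limit, chunk_size)]
            with ProcessPoolExecutor(max_workers=workers) as pool:
                return sum(pool.map(_sum_chunk, chunks))

        if __name__ == "__main__":
            # Larger chunks lower scheduling overhead; smaller chunks spread
            # the irregular per-element cost more evenly across workers.
            print(sum_euler(2000, chunk_size=100))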

    MaSiF: Machine learning guided auto-tuning of parallel skeletons

    Get PDF

    HitFlow: A Dataflow Programming Model for Hybrid Distributed- and Shared-Memory Systems

    Dataflow programming consists of developing a program by describing its sequential stages and the interactions between them. The runtime systems supporting this kind of programming are responsible for exploiting the parallelism by concurrently executing the different stages as soon as their dependencies are met. In this paper we introduce a new parallel programming model and framework based on the dataflow paradigm. It presents a new combination of features that makes it easy to map programs to shared or distributed memory, exploiting data locality and affinity to obtain the same performance as optimized coarse-grain MPI programs. These features include: it is a unique one-tier model that supports hybrid shared- and distributed-memory systems with the same abstractions; it can express arbitrarily linked activities, including non-nested cycles; it internally uses a distributed work-stealing mechanism to allow Multiple-Producer/Multiple-Consumer configurations; and it has a runtime mechanism for the reconfiguration of the dependencies and communication channels, which also allows the creation of task-to-task data affinities. We present an evaluation using examples of different classes of applications. Experimental results show that programs generated using this framework deliver good performance in hybrid distributed- and shared-memory environments, with a development effort similar to that of other dataflow programming models oriented to shared memory.

    Supported by MICINN (Spain) and the ERDF program of the European Union: HomProg-HetSys project (TIN2014-58876-P), PCAS project (TIN2017-88614-R), CAPAP-H6 (TIN2016-81840-REDT), and COST Program Action IC1305: Network for Sustainable Ultrascale Computing (NESUS); by Junta de Castilla y León, project PROPHET (VA082P17); and by the computing facilities of Extremadura Research Centre for Advanced Technologies (CETA-CIEMAT), funded by the European Regional Development Fund (ERDF). CETA-CIEMAT belongs to CIEMAT and the Government of Spain.
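
    As a minimal illustration of the dataflow paradigm described above (not the HitFlow API), the Python sketch below wires sequential stages together with channels; each stage fires as soon as an input arrives, and a sentinel value closes the pipeline.

        import queue
        import threading

        END = object()  # sentinel that closes a channel

        def stage(fn, inbox, outbox):
            # One dataflow stage: consumes items from its input channel as
            # soon as they arrive, applies fn, and forwards the result.
            def run():
                while True:
                    item = inbox.get()
                    if item is END:
                        if outbox is not None:
                            outbox.put(END)   # propagate shutdown downstream
                        return
                    result = fn(item)
                    if outbox is not None:
                        outbox.put(result)
            t = threading.Thread(target=run)
            t.start()
            return t

        if __name__ == "__main__":
            a_to_b, b_to_c = queue.Queue(), queue.Queue()   # channels
            results = []
            workers = [
                stage(lambda x: x * x, a_to_b, b_to_c),     # transform stage
                stage(results.append, b_to_c, None),        # sink stage
            ]
            for x in range(5):
                a_to_b.put(x)                               # feed the pipeline
            a_to_b.put(END)
            for t in workers:
                t.join()
            print(results)  # [0, 1, 4, 9, 16]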