1,547 research outputs found

The Integration of Task and Data Parallel Skeletons

Author: Cole M. (Murray)
Kuchen H. (Herbert)
Publication date: 01/08/2004

We describe a skeletal parallel programming library which integrates task and data parallel constructs within an API for C++. Traditional skeletal requirements for higher orderness and polymorphism are achieved through exploitation of operator overloading and templates, while the underlying parallelism is provided by MPI. We present a case study describing two algorithms for the travelling salesman problem
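As a rough illustration of what "skeleton" means in the abstract above, the following Python sketch shows skeletons as higher-order functions that take user code as arguments and hide how the work is scheduled. It is not the library's C++/MPI API: the names `map_skeleton`, `fold_skeleton` and `compose` are hypothetical, only the data-parallel side is shown, and a thread pool stands in for MPI processes (CPython threads will not speed up CPU-bound work).

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def map_skeleton(f, workers=4):
    """Data-parallel map: apply f to every element, preserving input order."""
    def run(xs):
        with ThreadPoolExecutor(max_workers=workers) as pool:
            return list(pool.map(f, xs))
    return run


def fold_skeleton(op, init):
    """Sequential fold that combines the elements with op, starting from init."""
    def run(xs):
        return reduce(op, xs, init)
    return run


def compose(*skeletons):
    """Chain skeletons so the output of one becomes the input of the next."""
    def run(x):
        for skeleton in skeletons:
            x = skeleton(x)
        return x
    return run


# Hypothetical usage: sum of squares of 1..4.
sum_of_squares = compose(
    map_skeleton(lambda x: x * x),
    fold_skeleton(lambda a, b: a + b, 0),
)
result = sum_of_squares(range(1, 5))
```

The user supplies only the element-wise function and the combining operator; the choice of worker pool stays inside the skeleton.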

Münstersches Informations und Archivsystem für Multimediale Inhalte

Towards Generic Scalable Parallel Combinatorial Search

Author: Archibald Blair
De Beule Jan
Maier Patrick
Stewart Robert
Trinder Phil
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2017

Combinatorial search problems in mathematics, e.g. in finite geometry, are notoriously hard; a state-of-the-art backtracking search algorithm can easily take months to solve a single problem. There is clearly demand for parallel combinatorial search algorithms scaling to hundreds of cores and beyond. However, backtracking combinatorial searches are challenging to parallelise due to their sensitivity to search order and due to their irregularly shaped search trees. Moreover, scaling parallel search to hundreds of cores generally requires highly specialist parallel programming expertise. This paper proposes a generic scalable framework for solving hard combinatorial problems. Key elements are distributed memory task parallelism (to achieve scale), work stealing (to cope with irregularity), and generic algorithmic skeletons for combinatorial search (to reduce the parallelism expertise required). We outline two implementations: a mature Haskell Tree Search Library (HTSL) based around algorithmic skeletons and a prototype C++ Tree Search Library (CTSL) that uses hand coded applications. Experiments on maximum clique problems and on a problem in finite geometry, the search for spreads in H(4,2^2), show that (1) CTSL consistently outperforms HTSL on sequential runs, and (2) both libraries scale to 200 cores, e.g. speeding up spreads search by a factor of 81 (HTSL) and 60 (CTSL), respectively. This demonstrates the potential of our generic framework for scaling parallel combinatorial search to large distributed memory platforms.

Enlighten: Research Data (University of Glasgow)

Crossref

Heriot Watt Pure

Stirling Online Research Repository (RIOXX)

Ghent University Academic Bibliography

Sheffield Hallam University Research Archive

Archivsystem Ask23

Enlighten

Stirling Online Research Repository

Replicable parallel branch and bound search

Author: Abu-Khzam
Alba
Aldinucci
Archibald
Bernard Gendron
Blair Archibald
Blumofe
Bomze
Butenko
Chandra
Chu
Ciaran McCreesh
Cole
Dean
Depolli
de Bruin
Eblen
Everitt
Fukagawa
Hall
Harvey
Jones
Konc
Lai
Laporte
Li
Maier
Martello
Matoušek
McCreesh
Moisan
Morrison
Nikolaev
Okubo
Olivier
Patrick Maier
Phil Trinder
Pisinger
Poldner
Prim
Prosser
Regula
Reinders
Reinelt
Robert Stewart
Salkin
San Segundo
Segundo
Tomita
Trienekens
Vogels
Walsh
Wu
Xiang
Yan
Publication venue: 'Elsevier BV'
Publication date: 12/07/2017

Combinatorial branch and bound searches are a common technique for solving global optimisation and decision problems. Their performance often depends on good search order heuristics, refined over decades of algorithms research. Parallel search necessarily deviates from the sequential search order, sometimes dramatically and unpredictably, e.g. by distributing work at random. This can disrupt effective search order heuristics and lead to unexpected and highly variable parallel performance. The variability makes it hard to reason about the parallel performance of combinatorial searches. This paper presents a generic parallel branch and bound skeleton, implemented in Haskell, with replicable parallel performance. The skeleton aims to preserve the search order heuristic by distributing work in an ordered fashion, closely following the sequential search order. We demonstrate the generality of the approach by applying the skeleton to 40 instances of three combinatorial problems: Maximum Clique, 0/1 Knapsack and Travelling Salesperson. The overheads of our Haskell skeleton are reasonable: giving slowdown factors of between 1.9 and 6.2 compared with a class-leading, dedicated, and highly optimised C++ Maximum Clique solver. We demonstrate scaling up to 200 cores of a Beowulf cluster, achieving speedups of 100x for several Maximum Clique instances. We demonstrate low variance of parallel performance across all instances of the three combinatorial problems and at all scales up to 200 cores, with median Relative Standard Deviation (RSD) below 2%. Parallel solvers that do not follow the sequential search order exhibit far higher variance, with median RSD exceeding 85% for Knapsack
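To make the idea of a generic branch and bound skeleton concrete, here is a minimal sequential sketch in Python. It is not the paper's Haskell skeleton and contains no parallelism or ordered work distribution; it only shows how a search can be parameterised by problem-specific functions while the skeleton fixes the traversal order. All names (`branch_and_bound`, `solve_knapsack`, the node layout) are hypothetical, and 0/1 Knapsack is used because the abstract lists it as one of the three test problems.

```python
def branch_and_bound(root, children, value, upper_bound):
    """Depth-first maximising search; children are visited in heuristic order."""
    best_node, best_value = root, value(root)
    stack = [root]
    while stack:
        node = stack.pop()
        if upper_bound(node) <= best_value:
            continue  # prune: this subtree cannot beat the incumbent
        if value(node) > best_value:
            best_node, best_value = node, value(node)
        # Push in reverse so the first (heuristically best) child is popped first.
        stack.extend(reversed(children(node)))
    return best_node, best_value


def solve_knapsack(items, capacity):
    """items: list of (profit, weight) pairs; returns the best total profit."""
    # Search order heuristic: consider items by decreasing profit density.
    items = sorted(items, key=lambda pw: pw[0] / pw[1], reverse=True)

    # A node is (index of next item to decide, profit so far, remaining capacity).
    def children(node):
        i, profit, remaining = node
        if i == len(items):
            return []
        p, w = items[i]
        kids = []
        if w <= remaining:
            kids.append((i + 1, profit + p, remaining - w))  # take item i
        kids.append((i + 1, profit, remaining))              # skip item i
        return kids

    def value(node):
        return node[1]

    def upper_bound(node):
        # Fractional relaxation: fill greedily, then take a fraction of one item.
        i, bound, remaining = node
        for p, w in items[i:]:
            if w <= remaining:
                bound, remaining = bound + p, remaining - w
            else:
                return bound + p * remaining / w
        return bound

    _, best = branch_and_bound((0, 0, capacity), children, value, upper_bound)
    return best


best_profit = solve_knapsack([(60, 10), (100, 20), (120, 30)], 50)
```

In this sketch the order in which `children` returns nodes is the search order heuristic; a parallel version that hands out subtrees at random would visit them in a different order, which is the source of the performance variability the abstract describes.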

arXiv.org e-Print Archive

Enlighten: Research Data (University of Glasgow)

Crossref

Heriot Watt Pure

Stirling Online Research Repository (RIOXX)

Sheffield Hallam University Research Archive

Enlighten

Stirling Online Research Repository

A Skeleton Based Programming Paradigm for Mobile Multi-Agents on Distributed Systems and Its Realization within the MAGDA Mobile Agents Platform

Publication venue: 'Hindawi Limited'
Publication date: 01/01/2008

Crossref

The Münster Skeleton Library 'Muesli': A Comprehensive Overview

Author: Ciechanowicz P. (Philipp)
Kuchen H. (Herbert)
Poldner M. (Michael)
Publication date: 29/08/2012

Münstersches Informations und Archivsystem für Multimediale Inhalte

On Designing Multicore-aware Simulators for Biological Systems

Author: Aldinucci Marco
Coppo Mario
Damiani Ferruccio
Drocco Maurizio
Torquati Massimo
Troina Angelo
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 13/10/2010

The stochastic simulation of biological systems is an increasingly popular technique in bioinformatics. It is often an enlightening technique, but it can be computationally expensive. We discuss the main opportunities to speed it up on multi-core platforms, which pose new challenges for parallelisation techniques. These opportunities are developed in two general families of solutions involving both the single simulation and a bulk of independent simulations (either replicas or derived from a parameter sweep). The proposed solutions are tested by parallelising the CWC simulator (Calculus of Wrapped Compartments) with the FastFlow programming framework, which enables fast development and efficient execution on multi-cores. Comment: 19 pages + cover pag

arXiv.org e-Print Archive

CiteSeerX

Crossref

Archivio della Ricerca - Università di Pisa

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Institutional Research Information System University of Turin

A Breezing Proof of the KMW Bound

Author: Coupette Corinna
Lenzen Christoph
Publication date: 16/09/2020

In their seminal paper from 2004, Kuhn, Moscibroda, and Wattenhofer (KMW) proved a hardness result for several fundamental graph problems in the LOCAL model: For any (randomized) algorithm, there are input graphs with $n$ nodes and maximum degree $\Delta$ on which $\Omega(\min\{\sqrt{\log n/\log \log n},\log \Delta/\log \log \Delta\})$ (expected) communication rounds are required to obtain polylogarithmic approximations to a minimum vertex cover, minimum dominating set, or maximum matching. Via reduction, this hardness extends to symmetry breaking tasks like finding maximal independent sets or maximal matchings. Today, more than 15 years later, there is still no proof of this result that is easy on the reader. Setting out to change this, in this work, we provide a fully self-contained and simple proof of the KMW lower bound. The key argument is algorithmic, and it relies on an invariant that can be readily verified from the generation rules of the lower bound graphs. Comment: 21 pages, 6 figure

arXiv.org e-Print Archive

Crossref