Towards an Adaptive Skeleton Framework for Performance Portability

Author: Maier Patrick
Morton John Magnus
Trinder Phil
Publication venue: School of Computing Science, University of Glasgow
Publication date: 21/12/2015

The proliferation of widely available, but very different, parallel architectures makes the ability to deliver good parallel performance on a range of architectures, or performance portability, highly desirable. Irregularly parallel problems, where the number and size of tasks are unpredictable, are particularly challenging and require dynamic coordination. The paper outlines a novel approach to delivering portable parallel performance for irregularly parallel programs. The approach combines declarative parallelism with JIT technology, dynamic scheduling, and dynamic transformation. We present the design of an adaptive skeleton library, with a task graph implementation, JIT trace costing, and adaptive transformations. We outline the architecture of the prototype adaptive skeleton execution framework in Pycket, describing tasks, serialisation, and the current scheduler. We report a preliminary evaluation of the prototype framework using 4 micro-benchmarks and a small case study on two NUMA servers (24 and 96 cores) and a small cluster (17 hosts, 272 cores). Key results include: Pycket delivers good sequential performance, e.g. almost as fast as C for some benchmarks; absolute speedups are good on all architectures (up to 120 on 128 cores for sumEuler); and the adaptive transformations do improve performance.
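The abstract names sumEuler without defining it. In the parallel functional programming literature the name usually refers to summing Euler's totient function over an integer range, and the sketch below assumes that reading; it is not taken from the paper. The function names `totient` and `sum_euler` are illustrative. The workload is irregular in the abstract's sense because the naive totient of k costs about k gcd calls, so equal-sized index ranges carry unequal amounts of work.

```python
from math import gcd


def totient(k: int) -> int:
    """Naive Euler totient: count j in 1..k that are coprime to k."""
    return sum(1 for j in range(1, k + 1) if gcd(j, k) == 1)


def sum_euler(lo: int, hi: int) -> int:
    """Sum of totient(k) for k in lo..hi inclusive (assumed meaning of sumEuler)."""
    return sum(totient(k) for k in range(lo, hi + 1))


def naive_work(lo: int, hi: int) -> int:
    """Number of gcd calls the naive totient makes over lo..hi."""
    return sum(range(lo, hi + 1))


if __name__ == "__main__":
    print(sum_euler(1, 10))
    # Two ranges of equal length but very different cost:
    print(naive_work(1, 100), naive_work(901, 1000))
```

A scheduler that splits the range into equal-length tasks would therefore see tasks of very different duration, which is the situation where dynamic scheduling and task-granularity adaptation are expected to matter.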

Enlighten

On the Stability of Community Detection Algorithms on Longitudinal Citation Data

Author: Bommarito II Michael James
Katz Daniel Martin
Zelner Jon
Publication date: 17/08/2009

There are fundamental differences between citation networks and other classes of graphs. In particular, given that citation networks are directed and acyclic, methods developed primarily for use with undirected social network data may face obstacles. This is particularly true for the dynamic development of community structure in citation networks. Namely, it is neither clear when it is appropriate to employ existing community detection approaches nor is it clear how to choose among existing approaches. Using simulated data, we attempt to clarify the conditions under which one should use existing methods and which of these algorithms is appropriate in a given context. We hope this paper will serve as both a useful guidepost and an encouragement to those interested in the development of more targeted approaches for use with longitudinal citation data.

Comment: 17 pages, 7 figures, presenting at Applications of Social Network Analysis 2009, ETH Zurich. Edit, August 17, 2009: updated abstract, figures, text clarification.

arXiv.org e-Print Archive

Crossref

OpenSIUC

Critical Path Scheduling Parallel Programs on an Unbounded Number of Processors

Author: Butelle Franck
Hakem Mourad
Publication venue: World Scientific Publishing House Ltd
Publication date: 01/01/2006

In this paper we present an efficient algorithm for compile-time scheduling and clustering of parallel programs onto parallel processing systems with distributed memory, called Dynamic Critical Path Scheduling (DCPS). DCPS is superior to several other algorithms from the literature in terms of computational complexity, processor consumption, and solution quality. DCPS has a time complexity of O(e + v log v), as opposed to O((e + v) log v) for the DSC algorithm, which is the best known algorithm. Experimental results demonstrate the superiority of DCPS over the DSC algorithm.
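The abstract does not describe how DCPS works, so the following is only a minimal sketch of the quantity that critical-path schedulers are built around: the longest path through a weighted task graph. It is not the DCPS or DSC algorithm. The representation (a `cost` dict of computation costs, an `edges` dict of communication costs) and the function name `critical_path_length` are assumptions made for illustration; e and v in the complexity bounds presumably denote the numbers of edges and vertices of such a graph.

```python
from collections import defaultdict, deque


def critical_path_length(cost, edges):
    """Return (critical path length, bottom level of every task).

    cost:  dict task -> computation cost
    edges: dict (u, v) -> communication cost from u to v
    The bottom level of a task is the longest path from it to an exit task,
    counting its own cost and the communication costs along the way.
    """
    succs = defaultdict(list)
    indeg = {t: 0 for t in cost}
    for (u, v), c in edges.items():
        succs[u].append((v, c))
        indeg[v] += 1

    # Kahn's algorithm gives a topological order in O(v + e).
    order = []
    ready = deque(t for t, d in indeg.items() if d == 0)
    while ready:
        u = ready.popleft()
        order.append(u)
        for v, _ in succs[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)
    if len(order) != len(cost):
        raise ValueError("task graph contains a cycle")

    # Visit tasks in reverse topological order so successors are done first.
    blevel = {}
    for u in reversed(order):
        blevel[u] = cost[u] + max((c + blevel[v] for v, c in succs[u]), default=0)
    return max(blevel.values(), default=0), blevel


if __name__ == "__main__":
    cost = {"a": 2, "b": 3, "c": 1, "d": 2}
    edges = {("a", "b"): 1, ("a", "c"): 4, ("b", "d"): 1, ("c", "d"): 1}
    print(critical_path_length(cost, edges)[0])

    # Clustering a and c on one processor removes their communication cost.
    clustered = dict(edges)
    clustered[("a", "c")] = 0
    print(critical_path_length(cost, clustered)[0])
```

In the example, setting the communication cost of the edge from a to c to zero models placing both tasks in one cluster, which shortens the critical path from 10 to 9. Clustering algorithms of this family repeatedly make such decisions; how DCPS chooses them is not stated in the abstract.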

HAL-Paris 13

OPTIMIZING LARGE COMBINATIONAL NETWORKS FOR K-LUT BASED FPGA MAPPING

Author: Alexandru E. ŞUŞU
Cornel POPESCU
George CULEA
Ioana FĂGĂRĂŞAN
Ion I. BUCUR

Optimizing by partitioning is a central problem in VLSI design automation, addressing circuit manufacturability. Circuit partitioning has multiple applications in VLSI design. One of the most common is dividing combinational circuits (usually large ones) that will not fit on a single package among a number of packages. Partitioning is of practical importance for k-LUT based FPGA circuit implementation. This work presents a multilevel, multi-resource partitioning algorithm for partitioning large combinational circuits in order to make efficient use of existing, commercially available FPGA packages.

Keywords: two-way partitioning, multi-way partitioning, recursive partitioning, flat partitioning, critical path, cutting cones, bottom-up clusters, top-down min-cut

Research Papers in Economics

JIT costing adaptive skeletons for performance portability

Author: Maier Patrick
Morton John Magnus
Trinder Phil
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2016

The proliferation of widely available, but very different, parallel architectures makes the ability to deliver good parallel performance on a range of architectures, or performance portability, highly desirable. Irregular parallel problems, where the number and size of tasks are unpredictable, are particularly challenging and require dynamic coordination. The paper outlines a novel approach to delivering portable parallel performance for irregular parallel programs. The approach combines JIT compiler technology with dynamic scheduling and dynamic transformation of declarative parallelism. We specify families of algorithmic skeletons plus equations for rewriting skeleton expressions. We present the design of a framework that unfolds skeletons into task graphs, dynamically schedules tasks, and dynamically rewrites skeletons, guided by a lightweight JIT trace-based cost model, to adapt the number and granularity of tasks for the architecture. We outline the system architecture and prototype implementation in Racket/Pycket. As the current prototype does not yet automatically perform dynamic rewriting, we present results based on manual offline rewriting, demonstrating that (i) the system scales to hundreds of cores given enough parallelism of suitable granularity, and (ii) the JIT trace cost model predicts granularity accurately enough to guide rewriting towards a good adaptive transformation.
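The abstract mentions equations for rewriting skeleton expressions but does not list them. A typical equation of this kind, assumed here purely for illustration, says that a parallel map over a list equals a parallel map over chunks of the list followed by concatenation; the chunk size then controls task granularity. The sketch below uses Python threads rather than the paper's Racket/Pycket framework, and the names `par_map`, `chunks`, and `par_map_chunked` are hypothetical. The chunk size `k` stands in for a value that, in the paper's design, a cost model would guide.

```python
from concurrent.futures import ThreadPoolExecutor


def par_map(f, xs, workers=4):
    """Parallel map skeleton: one task per element, results in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(f, xs))


def chunks(xs, k):
    """Split a sequence into consecutive blocks of at most k elements."""
    if k < 1:
        raise ValueError("chunk size must be positive")
    return [xs[i:i + k] for i in range(0, len(xs), k)]


def par_map_chunked(f, xs, k, workers=4):
    """Rewritten skeleton: one task per block, then flatten the results.

    Assumed equation: par_map f xs == concat (par_map (map f) (chunks k xs)).
    """
    blocks = par_map(lambda block: [f(x) for x in block], chunks(xs, k), workers)
    return [y for block in blocks for y in block]


if __name__ == "__main__":
    xs = list(range(10))
    square = lambda x: x * x
    # Both sides of the equation give the same result; only the task count differs.
    print(par_map(square, xs) == par_map_chunked(square, xs, k=3))
```

With ten elements and `k=3`, the left-hand side creates ten tasks and the right-hand side four. The example shows only the shape of the rewrite: CPython threads do not speed up CPU-bound work, so it says nothing about the performance results reported in the paper.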

Enlighten: Research Data (University of Glasgow)

Crossref

Sheffield Hallam University Research Archive

Enlighten

A survey of behavioral-level partitioning systems

Author: Vahid Frank
Publication venue: eScholarship, University of California
Publication date: 30/10/1991

Many approaches have been developed to partition a system's behavioral description before a structural implementation is synthesized. We highlight the foundations and motivations for behavioral partitioning. We survey behavioral partitioning approaches, discussing the abstraction levels, goals, major steps, and key assumptions of each.

eScholarship - University of California

Taxonomy and clustering in collaborative systems: the case of the on-line encyclopedia Wikipedia

Author: Caldarelli G.
Capocci A.
Rao F.
Publication venue: IOP Publishing
Publication date: 16/10/2007

In this paper we investigate the nature and structure of the relation between imposed classifications and real clustering in a particular case of a scale-free network given by the on-line encyclopedia Wikipedia. We find a statistical similarity in the distributions of community sizes, both when using the top-down approach of the category division present in the archive and when using the bottom-up procedure of community detection given by an algorithm based on the spectral properties of the graph. Despite the statistically similar behaviour, the two methods provide a rather different division of the articles, thereby signaling that the nature and presence of power laws is a general feature of these systems and cannot be used as a benchmark to evaluate the suitability of a clustering method.

Comment: 5 pages, 3 figures, epl2 style

arXiv.org e-Print Archive

Crossref

EDP Sciences OAI-PMH repository (1.2.0)

IMT Institutional Repository

Morphological annotation of Korean with Directly Maintainable Resources

Author: Berlocher Ivan
Huh Hyun-Gue
Laporte Eric
Nam Jee-Sun
Publication date: 01/01/2006

This article describes an exclusively resource-based method of morphological annotation of written Korean text. Korean is an agglutinative language. Our annotator is designed to process text before the operation of a syntactic parser. In its present state, it annotates one-stem words only. The output is a graph of morphemes annotated with accurate linguistic information. The granularity of the tagset is 3 to 5 times finer than that of usual tagsets. A comparison with a reference annotated corpus showed that it achieves 89% recall without any corpus training. The language resources used by the system are lexicons of stems, transducers of suffixes, and transducers for the generation of allomorphs. All can be easily updated, which allows users to control the evolution of the system's performance. It has been claimed that morphological annotation of Korean text could only be performed by a morphological analysis module accessing a lexicon of morphemes. We show that it can also be performed directly with a lexicon of words and without applying morphological rules at annotation time, which speeds up annotation to 1,210 words/s. The lexicon of words is obtained from the maintainable language resources through a fully automated compilation process.

arXiv.org e-Print Archive

CiteSeerX

HAL-Ecole des Ponts ParisTech

HAL - UPEC / UPEM