
6,104 research outputs found

Memory performance of AND-parallel Prolog on shared-memory architectures

Author: Hermenegildo Manuel V.
Tick Evan
Publication venue: Facultad de Informática (UPM)
Publication date: 01/08/1988

The goal of the RAP-WAM AND-parallel Prolog abstract architecture is to provide inference speeds significantly beyond those of sequential systems, while supporting Prolog semantics and preserving sequential performance and storage efficiency. This paper presents simulation results supporting these claims, with special emphasis on memory performance on a two-level shared-memory multiprocessor organization. Several solutions to the cache coherency problem are analyzed. It is shown that RAP-WAM offers good locality and storage efficiency and that it can effectively take advantage of broadcast caches. It is argued that speeds in excess of 2 MLIPS on real applications exhibiting medium parallelism can be attained with current technology.

Archivo Digital UPM

Porting Decision Tree Algorithms to Multicore using FastFlow

Author: A.C. Sodan
I. Park
J.E. Gehrke
J.R. Quinlan
K. Asanovic
M. Aldinucci
M. Cole
M. Coppola
M. Joshi
M. Vanneschi
M. Zaki
M.K. Sreenivas
R. Jin
R.D. Blumofe
S. Ruggieri
T. Lim
W. Thies
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010

The whole computer hardware industry embraced multicores. For these machines, the extreme optimisation of sequential algorithms is no longer sufficient to squeeze the real machine power, which can be only exploited via thread-level parallelism. Decision tree algorithms exhibit natural concurrency that makes them suitable to be parallelised. This paper presents an approach for easy-yet-efficient porting of an implementation of the C4.5 algorithm on multicores. The parallel porting requires minimal changes to the original sequential code, and it is able to exploit up to 7X speedup on an Intel dual-quad core machine.

arXiv.org e-Print Archive

CiteSeerX

Crossref

Archivio della Ricerca - Università di Pisa

UnipiEprints

A Test Suite for High-Performance Parallel Java

Author: Brunett Sharon
Gollnick Torsten
Hauser Jochem
Ludewig Thorsten
Muylaert Jean
Williams Roy D.
Winkelmann Ralf
Publication venue: 'California Institute of Technology Library'
Publication date: 01/01/1999

The Java programming language has a number of features that make it attractive for writing high-quality, portable parallel programs. A pure object formulation, strong typing and the exception model make programs easier to create, debug, and maintain. The elegant threading provides a simple route to parallelism on shared-memory machines. Anticipating great improvements in numerical performance, this paper presents a suite of simple programs that indicate how a pure Java Navier-Stokes solver might perform. The suite includes a parallel Euler solver. We present results from a 32-processor Hewlett-Packard machine and a 4-processor Sun server. While speedup is excellent on both machines, indicating a high-quality thread scheduler, the single-processor performance needs much improvement.

Caltech Authors

The FORCE: A highly portable parallel programming language

Author: Alaghband Gita
Benten Muhammad S.
Jakob Ruediger
Jordan Harry F.

Here, it is explained why the FORCE parallel programming language is easily portable among six different shared-memory multiprocessors, and how a two-level macro preprocessor makes it possible to hide low-level machine dependencies and to build machine-independent high-level constructs on top of them. These FORCE constructs make it possible to write portable parallel programs largely independent of the number of processes and the specific shared-memory multiprocessor executing them.

NASA Technical Reports Server

Compiling Vector Pascal to the XeonPhi

Author: Bik
Budd
Chamberlain
Cockshott
Ewing
Grelck
Iverson
Keßler
Krishnaiyer
Lin
Pater
Perrott
Scholz
Siebert
Snyder
Tousimojarad
Publication venue: 'Wiley'
Publication date: 26/03/2015

Intel's XeonPhi is a highly parallel x86 architecture chip. It has a number of novel features which make it a particularly challenging target for the compiler writer. This paper describes the techniques used to port the Glasgow Vector Pascal Compiler to this architecture and assesses its performance by comparing the XeonPhi with three other machines running the same algorithms.

Enlighten: Research Data (University of Glasgow)

Crossref

Enlighten

Methods for design and evaluation of integrated hardware/software systems for concurrent computation

Author: Pratt Terrence W.

Two testbed programming environments to support the evaluation of a wide range of parallel architectures have been implemented under the program Parallel Implementation of Scientific Computing Environments (PISCES). The PISCES 1 environment was applied to two areas of aerospace interest: a sparse matrix iterative equation solver and a dynamic scene analysis system. Currently, the NICE/SPAR testbed system for structural analysis is being modified for parallel operation under PISCES 2; the PISCES 1 applications are also being adapted for PISCES 2. A new formal model of concurrent computation has been developed, based on the mathematical system known as H-graph semantics together with a timed Petri net model of the parallel aspects of a system.

NASA Technical Reports Server