
    Programming Parallel Computers

    This paper is from a keynote address to the IEEE International Conference on Computer Languages, October 9, 1988. Keynote addresses are expected to be provocative (and perhaps even entertaining), but not necessarily scholarly. The reader should be warned that this talk was prepared with these expectations in mind. Parallel computers offer the potential of great speed at low cost. The promise of parallelism is limited by the ability to program parallel machines effectively. This paper explores the opportunities and the problems of parallel computing. Technological and economic trends are studied with a view towards determining where the field of parallel computing is going. An approach to parallel programming, called UNITY, is described. UNITY was developed by Jay Misra and myself, and is described in [Chandy]. Extensions to UNITY are discussed; these extensions were motivated by discussions with Chuck Seitz.
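    For readers unfamiliar with the notation: a UNITY program is a set of guarded multiple-assignment statements executed repeatedly, in any fair order, until a fixed point is reached. The following Python sketch (an illustration of that execution model, not code from the paper) emulates it for a small sorting program, with random selection standing in for fair scheduling:

```python
import random

def unity_execute(state, statements, max_steps=100_000):
    """Repeatedly apply an enabled statement, chosen at random as a
    stand-in for fair selection, until no guard holds (fixed point)."""
    for _ in range(max_steps):
        enabled = [s for s in statements if s[0](state)]
        if not enabled:          # fixed point: the program has terminated
            return state
        guard, effect = random.choice(enabled)
        effect(state)
    raise RuntimeError("no fixed point within step budget")

def sort_statements(n):
    """One statement per adjacent pair: swap if out of order.
    Sorting by local swaps is a classic UNITY example."""
    stmts = []
    for i in range(n - 1):
        guard = lambda a, i=i: a[i] > a[i + 1]
        def effect(a, i=i):
            a[i], a[i + 1] = a[i + 1], a[i]
        stmts.append((guard, effect))
    return stmts

xs = [5, 3, 8, 1, 4]
print(unity_execute(xs, sort_statements(len(xs))))  # [1, 3, 4, 5, 8]
```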

    PROGRAPE-1: A Programmable, Multi-Purpose Computer for Many-Body Simulations

    We have developed PROGRAPE-1 (PROgrammable GRAPE-1), a programmable multi-purpose computer for many-body simulations. The main difference between PROGRAPE-1 and "traditional" GRAPE systems is that the former uses FPGA (Field Programmable Gate Array) chips as the processing elements, while the latter rely on hardwired pipeline processors specialized for gravitational interactions. Since the logic implemented in FPGA chips can be reconfigured, we can use PROGRAPE-1 to calculate not only gravitational interactions but also other forms of interactions, such as the van der Waals force, hydrodynamical interactions in SPH calculations, and so on. PROGRAPE-1 comprises two Altera EPF10K100 FPGA chips, each of which contains nominally 100,000 gates. To evaluate the programmability and performance of PROGRAPE-1, we implemented a pipeline for gravitational interaction similar to that of GRAPE-3. One pipeline fit into a single FPGA chip, which operated at a 16 MHz clock. Thus, for gravitational interaction, PROGRAPE-1 provided a speed equivalent to 0.96 Gflops. PROGRAPE will prove useful for a wide range of particle-based simulations in which the calculation cost of interactions other than gravity is high, such as the evaluation of SPH interactions. Comment: 20 pages with 9 figures; submitted to PASJ.
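    For context, the computation such a pipeline hardwires is the softened pairwise gravitational interaction, evaluated over all particle pairs. A plain Python sketch of the direct-summation loop (illustrative only; the softening length and N-body units are assumptions, and the actual system implements this in FPGA logic):

```python
import math

def gravitational_accelerations(pos, mass, eps=0.01):
    """Direct-summation N-body accelerations with Plummer softening:
    a_i = sum_j G * m_j * (r_j - r_i) / (|r_j - r_i|^2 + eps^2)^(3/2).
    This O(N^2) pairwise loop is the work a GRAPE-style pipeline hardwires."""
    G = 1.0  # gravitational constant in N-body units (assumption)
    n = len(pos)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dx = [pos[j][k] - pos[i][k] for k in range(3)]
            r2 = dx[0]**2 + dx[1]**2 + dx[2]**2 + eps**2
            f = G * mass[j] / (r2 * math.sqrt(r2))
            for k in range(3):
                acc[i][k] += f * dx[k]
    return acc

print(gravitational_accelerations([[0, 0, 0], [1, 0, 0]], [1.0, 1.0]))
```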

    GPU-Accelerated Algorithms for Compressed Signals Recovery with Application to Astronomical Imagery Deblurring

    Compressive sensing promises to enable bandwidth-efficient on-board compression of astronomical data by lifting the encoding complexity from the source to the receiver. The signal is recovered off-line, exploiting GPUs' parallel computation capabilities to speed up the reconstruction process. However, inherent GPU hardware constraints limit the size of the recoverable signal and the speedup practically achievable. In this work, we design parallel algorithms that exploit the properties of circulant matrices for efficient GPU-accelerated recovery of sparse signals. Our approach reduces the memory requirements, allowing us to recover very large signals with limited memory. In addition, it achieves a tenfold signal recovery speedup thanks to ad hoc parallelization of matrix-vector multiplications and matrix inversions. Finally, we demonstrate our algorithms in a typical application of circulant matrices: deblurring a sparse astronomical image in the compressed domain.
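    The circulant property being exploited is standard: a circulant matrix is diagonalized by the DFT, so a matrix-vector product needs only the matrix's first column and a pair of FFTs, reducing storage from O(n^2) to O(n). A minimal NumPy sketch of that identity (not the paper's GPU code):

```python
import numpy as np

def circulant_matvec(c, x):
    """Multiply the circulant matrix C defined by its first column c
    by x, via C = F^{-1} diag(F c) F where F is the DFT. Only the
    first column is stored: O(n) memory, O(n log n) time."""
    return np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)))

# Check against the explicit dense circulant matrix.
rng = np.random.default_rng(0)
c = rng.standard_normal(8)
x = rng.standard_normal(8)
C = np.array([np.roll(c, k) for k in range(8)]).T  # dense circulant
assert np.allclose(C @ x, circulant_matvec(c, x))
```

Inversion follows the same route, dividing by the eigenvalues np.fft.fft(c) instead of multiplying, which is what lets large deblurring problems fit in limited GPU memory.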

    Practical Parallel Programming, by Gregory V. Wilson


    An Introduction to Message Passing Paradigms


    A Data Type for Efficient Representation of Other Data Types

    A self-organizing, monomorphic data type, denoted a sequence, has been conceived to address certain concerns that arise in programming parallel computers. A sequence in the present sense can be regarded abstractly as a vector, set, bag, queue, or other construct. Heretofore, in programming a parallel computer, it has been necessary for the programmer to state explicitly, at the outset, which parts of the program and the underlying data structures must be represented in parallel form. Not only is this requirement suboptimal from the perspective of implementation; it also requires that the programmer have an intimate understanding of the underlying parallel structure. The present sequence data type overcomes both the implementation and parallel-structure obstacles. In so doing, it provides a unified means by which the programmer can represent a data structure for natural and automatic decomposition to a parallel computing architecture. Sequences exhibit the behavioral and structural characteristics of vectors, but the underlying representations are automatically synthesized from combinations of the programmer's advice and execution-use metrics. Sequences can vary bidirectionally between sparseness and density, making them excellent choices for many kinds of algorithms. The novelty and benefit of this behavior lie in the fact that it can relieve programmers of the details of implementations. The creation of a sequence decouples a conceptual representation from an implementation. The underlying representation of a sequence is a hybrid composed of vectors, linked lists, connected blocks, and hash tables. The internal structure of a sequence can automatically change from time to time on the basis of how it is being used. Those portions of a sequence where elements have not been added or removed can be as efficient as vectors. As elements are inserted and removed in a given portion, different methods are used to provide an access and memory strategy optimized for that portion and the use to which it is put.
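    As a rough illustration of the adaptation idea, the toy Python class below (hypothetical names and thresholds; a sketch of the concept, not the hybrid representation described above) records where edits occur and reorganizes a contiguous vector into small blocks once mid-sequence edits dominate the observed use:

```python
class AdaptiveSequence:
    """Toy self-reorganizing sequence. Starts as one contiguous list
    (fast random access); once enough edits land mid-sequence, splits
    into small blocks so each edit only shifts one block's elements."""
    BLOCK = 32  # block size after reorganization (arbitrary choice)

    def __init__(self, items=()):
        self.blocks = [list(items)]   # one block == a plain vector
        self.edits = 0
        self.mid_edits = 0

    def __len__(self):
        return sum(len(b) for b in self.blocks)

    def _locate(self, i):
        """Map a logical index to (block, offset within block)."""
        for b in self.blocks:
            if i <= len(b):
                return b, i
            i -= len(b)
        raise IndexError(i)

    def insert(self, i, x):
        self.edits += 1
        if 0 < i < len(self):         # a mid-sequence edit
            self.mid_edits += 1
        b, off = self._locate(i)
        b.insert(off, x)
        self._maybe_reorganize()

    def _maybe_reorganize(self):
        # Usage metric: if most edits are mid-sequence and we are still
        # one big vector, switch to blocks; edits become cheap from then on.
        if (len(self.blocks) == 1 and self.edits >= 16
                and self.mid_edits / self.edits > 0.5):
            flat = self.blocks[0]
            self.blocks = [flat[k:k + self.BLOCK]
                           for k in range(0, len(flat), self.BLOCK)] or [[]]

    def to_list(self):
        return [x for b in self.blocks for x in b]
```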

    A language comparison for scientific computing on MIMD architectures

    Choleski's method for solving banded, symmetric, positive definite systems is implemented on a multiprocessor computer using three FORTRAN-based parallel programming languages: the Force, PISCES, and Concurrent FORTRAN. The capabilities of the languages for expressing parallelism and their user-friendliness are discussed, including readability of the code, debugging assistance offered, and expressiveness of the languages. The performance of the different implementations is compared. It is argued that PISCES, using the Force for medium-grained parallelism, is the appropriate choice for programming Choleski's method on the multiprocessor computer Flex/32.
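    For context, Choleski's method factors a symmetric positive definite matrix A as L L^T, and within each column step the updates to the rows below the diagonal are mutually independent; that independence is the medium-grained parallelism the three languages must express. A sequential NumPy sketch with the parallelizable loop marked (illustrative; not the Flex/32 code):

```python
import numpy as np

def cholesky_banded(A, bw):
    """Choleski factorization A = L @ L.T for a symmetric positive
    definite matrix with half-bandwidth bw. The inner i-loop's updates
    are independent of one another, so they can be distributed across
    processes: the parallelism the compared languages express."""
    n = len(A)
    L = np.zeros_like(A, dtype=float)
    for k in range(n):
        L[k, k] = np.sqrt(A[k, k] - L[k, :k] @ L[k, :k])
        # Parallelizable: each row i in the band below the diagonal
        # can be updated concurrently.
        for i in range(k + 1, min(k + bw + 1, n)):
            L[i, k] = (A[i, k] - L[i, :k] @ L[k, :k]) / L[k, k]
    return L

A = np.array([[4., 2., 0.], [2., 5., 1.], [0., 1., 3.]])
L = cholesky_banded(A, bw=1)
assert np.allclose(L @ L.T, A)
```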

    State-of-the-Art in Parallel Computing with R

    R is a mature open-source programming language for statistical computing and graphics. Many areas of statistical research are experiencing rapid growth in the size of data sets, and methodological advances drive increased use of simulations; a common approach to both is parallel computing. This paper presents an overview of techniques for parallel computing with R on computer clusters, on multi-core systems, and in grid computing. It reviews sixteen different packages, comparing them on their state of development, the parallel technology used, and their usability, acceptance, and performance. Two packages (snow, Rmpi) stand out as particularly useful for general use on computer clusters. Packages for grid computing are still in development, with only one package currently available to the end user. For multi-core systems, four different packages exist, but a number of issues pose challenges to early adopters. The paper concludes with ideas for further developments in high-performance computing with R. Example code is available in the appendix.
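    The dominant pattern these packages expose is a parallel apply over independent replicates (for example, snow's parLapply across a cluster). Since the pattern rather than the language is the point, here is an analogous sketch using Python's standard multiprocessing module (an analogy only, not one of the reviewed R packages):

```python
from multiprocessing import Pool
import random

def one_simulation(seed):
    """One independent replicate: an embarrassingly parallel job of the
    same shape that snow's parLapply or Rmpi would distribute."""
    rng = random.Random(seed)
    return sum(rng.gauss(0, 1) for _ in range(10_000)) / 10_000

if __name__ == "__main__":
    with Pool(processes=4) as pool:                    # ~ makeCluster(4)
        means = pool.map(one_simulation, range(100))   # ~ parLapply
    print(sum(means) / len(means))
```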

    Parallel Computer Needs at Dartmouth College

    To determine the need for a parallel computer on campus, a committee of the Graduate Program in Computer Science surveyed selected Dartmouth College faculty and students in December 1991 and January 1992. We hope that the information in this report can be used by many groups on campus, including the Computer Science graduate program and DAGS summer institute, Kiewit's NH Supercomputer Initiative, and numerous researchers hoping to collaborate with people in other disciplines. We found significant interest in parallel supercomputing on campus. An on-campus parallel supercomputing facility would not only support numerous courses and research projects but would also provide a locus for intellectual activity in parallel computing, encouraging interdisciplinary collaboration. We believe that this report is a first step in that direction.
