
12 research outputs found

Evaluation of High Performance Fortran through Application Kernels

Author: Fox Geoffrey C.
Hawick Ken
Yau Hon W.
Publication venue: SURFACE at Syracuse University
Publication date: 01/01/1997
Field of study

Since the definition of the High Performance Fortran (HPF) standard, we have been maintaining a suite of application kernel codes with the aim of using them to evaluate the available compilers. This paper presents the results and conclusions from this study, for sixteen codes, on compilers from IBM, DEC, and the Portland Group Inc. (PGI), and on three machines: a DEC Alphafarm, an IBM SP-2, and a Cray T3D. From this, we hope to show the prospective HPF user that scalable performance is possible with modest effort, yet also where the current weaknesses lie.

Syracuse University Research Facility and Collaborative Environment

Scalability and performance of MPI, HPF and OpenMP on an Origin 2000

Author: Guan Zhe
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2002
Field of study

Digital Repository @ Iowa State University (ISU)

High Performance Fortran Implementations: A Survey

Author
Publication venue: 'Hindawi Limited'
Publication date: 01/01/1997
Field of study

Crossref

Partial parallelization of VMEC system

Author: Zhou Mei
Publication venue: University of Montana, Maureen and Mike Mansfield Library
Publication date: 01/01/1999
Field of study

University of Montana

HPF to OpenMP on the Origin2000: a case study

Author: Brieger Leesa
Publication venue: 'Wiley'
Publication date: 30/11/2000
Field of study

The geophysics group at CRS4 has long developed echo reconstruction codes in HPF on distributed-memory machines. Now, however, with the arrival of shared-memory machines and their native OpenMP compilers, the transfer to OpenMP would seem to present the logical next step in our code development strategy. Recent experience with porting one of our important HPF codes to OpenMP does not bear this out, at least not on the Origin2000. The OpenMP code suffers from the immaturity of the standard, and the operating system's handling of UNIX threads seems to severely penalize OpenMP performance. On the other hand, the HPF code on the Origin2000 is fast, scalable and not disproportionately sensitive to load on the machine.

P-arch

High Performance Fortran Comes of Age: Guest Editors' Introduction

Author
Publication venue: 'Hindawi Limited'
Publication date: 01/01/1997
Field of study

Crossref

DDT: a research tool for automatic data distribution in HPF

Author: Ayguadé Parra Eduard
García Almiñana Jordi
Gironès Medina Mercè
Grande Ayan Ma. Luz
Labarta Mancho Jesús José
Publication venue: 'Hindawi Limited'
Publication date: 01/01/1997
Field of study

This article describes the main features and implementation of our automatic data distribution research tool. The tool (DDT) accepts programs written in Fortran 77 and generates High Performance Fortran (HPF) directives to map arrays onto the memories of the processors and parallelize loops, and executable statements to remap these arrays. DDT works by identifying a set of computational phases (procedures and loops). The algorithm builds a search space of candidate solutions for these phases, which is explored to find the combination that minimizes the overall cost; this cost includes data movement cost and computation cost. The movement cost reflects the cost of accessing remote data during the execution of a phase and the remapping costs that have to be paid in order to execute the phase with the selected mapping. The computation cost includes the cost of executing a phase in parallel according to the selected mapping and the owner-computes rule. The tool supports interprocedural analysis and uses control flow information to identify how phases are sequenced during the execution of the application.
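The cost model in this abstract can be illustrated with a small sketch. The abstract does not say how DDT explores its search space, so the dynamic program below is an assumption, as are the strictly linear phase sequence, the flat remapping cost, the cost numbers, and the function name `choose_mappings`. It only shows how per-phase costs and remapping costs combine into one total to be minimized.

```python
from typing import Dict, List, Tuple


def choose_mappings(
    phase_costs: List[Dict[str, float]], remap_cost: float
) -> Tuple[float, List[str]]:
    """Pick one mapping per phase so that total cost is minimal.

    phase_costs[i][m] is a hypothetical cost (computation plus remote
    accesses) of running phase i under mapping m. remap_cost is a
    hypothetical flat cost paid whenever two consecutive phases use
    different mappings. Phases are assumed to run in list order.
    """
    # best[m] = (cheapest total so far, mappings chosen) for plans ending in m
    best = {m: (c, [m]) for m, c in phase_costs[0].items()}
    for costs in phase_costs[1:]:
        new_best = {}
        for m, c in costs.items():
            candidates = [
                (prev_cost + (0.0 if prev_m == m else remap_cost) + c, path + [m])
                for prev_m, (prev_cost, path) in best.items()
            ]
            new_best[m] = min(candidates, key=lambda t: t[0])
        best = new_best
    return min(best.values(), key=lambda t: t[0])


# Illustrative numbers only; BLOCK and CYCLIC are HPF distribution formats.
phases = [
    {"BLOCK": 10, "CYCLIC": 14},
    {"BLOCK": 20, "CYCLIC": 8},
    {"BLOCK": 9, "CYCLIC": 12},
]
cheap_cost, cheap_plan = choose_mappings(phases, remap_cost=1)
dear_cost, dear_plan = choose_mappings(phases, remap_cost=5)
```

With cheap remapping, the sketch switches mapping at each phase; with expensive remapping, it keeps a single mapping throughout, which is the trade-off the abstract describes between movement cost and computation cost.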

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Directory of Open Access Journals

Compiler Techniques for Optimizing Communication and Data Distribution for Distributed-Memory Computers

Author: Palermo Daniel Joseph
Publication venue: Coordinated Science Laboratory, University of Illinois at Urbana-Champaign
Publication date: 01/05/1996
Field of study

Advanced Research Projects Agency (ARPA); National Aeronautics and Space Administration; Ope

Illinois Digital Environment for Access to Learning and Scholarship Repository

HPCCP/CAS Workshop Proceedings 1998

Author: Mata Ellen
Schulbach Catherine
Publication venue
Publication date
Field of study

This publication is a collection of extended abstracts of presentations given at the HPCCP/CAS (High Performance Computing and Communications Program/Computational Aerosciences Project) Workshop held on August 24-26, 1998, at NASA Ames Research Center, Moffett Field, California. The objective of the Workshop was to bring together the aerospace high performance computing community, consisting of airframe and propulsion companies, independent software vendors, university researchers, and government scientists and engineers. The Workshop was sponsored by the HPCCP Office at NASA Ames Research Center. The Workshop consisted of over 40 presentations, including an overview of NASA's High Performance Computing and Communications Program and the Computational Aerosciences Project; ten sessions of papers representative of the high performance computing research conducted within the Program by the aerospace industry, academia, NASA, and other government laboratories; two panel sessions; and a special presentation by Mr. James Bailey

NASA Technical Reports Server

Code-Optimierung im Polyedermodell - Effizienzsteigerung von parallelen Schleifensätzen

Author: Faber Peter
Publication venue
Publication date: 21/10/2008
Field of study

A safe basis for automatic loop parallelization is the polyhedron model which represents the iteration domain of a loop nest as a polyhedron in $\mathbb{Z}^n$. However, turning the parallel loop program in the model to efficient code meets with several obstacles, due to which performance may deteriorate seriously -- especially on distributed memory architectures. We introduce a fine-grained model of the computation performed and show how this model can be applied to create efficient code
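As a minimal illustration of the iteration-domain idea mentioned in this abstract, the sketch below describes a triangular loop nest by affine inequalities and enumerates the integer points that satisfy them. It is not taken from the thesis: the helper name `integer_points`, the brute-force enumeration over a bounding box, and the example nest are all assumptions chosen for clarity.

```python
from itertools import product
from typing import List, Sequence, Tuple


def integer_points(
    A: Sequence[Sequence[int]],
    b: Sequence[int],
    box: Sequence[Tuple[int, int]],
) -> List[Tuple[int, ...]]:
    """Return all integer points x inside `box` that satisfy A x <= b.

    Each row of A with its entry in b is one affine inequality. `box`
    gives inclusive (low, high) bounds per dimension and only limits
    the brute-force search; it is not part of the polyhedron itself.
    """
    points = []
    for x in product(*(range(lo, hi + 1) for lo, hi in box)):
        if all(
            sum(a * xi for a, xi in zip(row, x)) <= bound
            for row, bound in zip(A, b)
        ):
            points.append(x)
    return points


# Loop nest: for i in range(N): for j in range(i + 1): ...
N = 4
A = [
    [-1, 0],   # -i     <= 0      (i >= 0)
    [1, 0],    #  i     <= N - 1
    [0, -1],   # -j     <= 0      (j >= 0)
    [-1, 1],   #  j - i <= 0      (j <= i)
]
b = [0, N - 1, 0, 0]
domain = integer_points(A, b, box=[(0, N - 1), (0, N - 1)])
```

The points in `domain` are exactly the (i, j) pairs the loop nest visits, so transformations of the nest can be reasoned about as operations on this set of inequalities.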