
    Automating FEA programming

    In this paper we briefly describe a combined symbolic and numeric approach for solving mathematical models on parallel computers. An experimental software system, PIER, is being developed in Common Lisp to synthesize the computationally intensive, domain-formulation-dependent phases of finite element analysis (FEA) solution methods. Quantities needed for the domain formulation, such as shape functions and element stiffness matrices, are derived automatically using symbolic mathematical computation. The problem-specific information and derived formulae are then used to generate (parallel) numerical code for the FEA solution steps. A constructive approach to specifying a numerical program design is taken. The code generator compiles application-oriented input specifications into (parallel) FORTRAN77 routines with the help of built-in knowledge of the particular problem, the numerical solution methods, and the target computer.
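    The paper itself derives formulae symbolically in Common Lisp and emits FORTRAN77; purely as a hedged illustration of the kind of numerical code such a generator produces, the Java sketch below assembles the global stiffness matrix of a 1D bar from 2-node linear elements, whose element matrix (EA/L)[[1,-1],[-1,1]] stands in for a symbolically derived formula. All names here are illustrative and are not taken from PIER.

```java
/** Illustrative only: assemble a global stiffness matrix for a 1D bar
 *  discretized into 2-node linear elements. The element matrix
 *  k_e = (E*A/L) * [[1,-1],[-1,1]] stands in for the formulae a symbolic
 *  front end such as PIER would derive; the loop mirrors the kind of
 *  numerical code it would then generate (FORTRAN77 in the paper). */
public class BarStiffnessDemo {
    static double[][] assemble(int nElems, double E, double A, double length) {
        int nNodes = nElems + 1;
        double le = length / nElems;          // uniform element length
        double k = E * A / le;                // scalar factor of the element matrix
        double[][] K = new double[nNodes][nNodes];
        for (int e = 0; e < nElems; e++) {    // element-by-element assembly
            K[e][e]         += k;
            K[e][e + 1]     += -k;
            K[e + 1][e]     += -k;
            K[e + 1][e + 1] += k;
        }
        return K;
    }

    public static void main(String[] args) {
        double[][] K = assemble(4, 210e9, 1e-4, 1.0);   // 4-element steel bar
        for (double[] row : K) System.out.println(java.util.Arrays.toString(row));
    }
}
```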

    Efficient Machine-Independent Programming of High-Performance Multiprocessors

    Parallel computing is regarded by most computer scientists as the most likely approach for significantly improving computing power for scientists and engineers. Advances in programming languages and parallelizing compilers are making parallel computers easier to use by providing a high-level, portable programming model that protects software investment. However, experience has shown that simply finding parallelism is not always sufficient for obtaining good performance from today's multiprocessors. The goal of this project is to develop advanced compiler analysis of the data and computation decompositions, thread placement, communication, synchronization, and memory-system effects needed to take advantage of performance-critical elements in modern parallel architectures.
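    The abstract names data and computation decompositions without detail; as a generic illustration only (hand-written, not this project's compiler output), the sketch below shows the kind of block decomposition such a compiler would derive automatically, splitting a loop's iteration space evenly across threads.

```java
import java.util.ArrayList;
import java.util.List;

/** Hand-written block decomposition of a loop across threads, illustrating
 *  (not reproducing) the kind of computation decomposition a parallelizing
 *  compiler would generate automatically. */
public class BlockDecompositionDemo {
    public static void main(String[] args) throws InterruptedException {
        int n = 1_000_000, nThreads = 4;
        double[] a = new double[n];
        List<Thread> threads = new ArrayList<>();
        for (int t = 0; t < nThreads; t++) {
            int lo = t * n / nThreads;            // block-wise split of iterations
            int hi = (t + 1) * n / nThreads;
            Thread th = new Thread(() -> {
                for (int i = lo; i < hi; i++) a[i] = Math.sin(i) * Math.cos(i);
            });
            threads.add(th);
            th.start();
        }
        for (Thread th : threads) th.join();
        System.out.println("a[42] = " + a[42]);
    }
}
```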

    Refactoring Sequential Java Code for Concurrency via Concurrent Libraries

    Parallelizing existing sequential programs to run efficiently on multicores is hard. The Java 5 package java.util.concurrent (j.u.c.) supports writing concurrent programs: much of the complexity of writing thread-safe and scalable programs is hidden in the library. To use this package, programmers still need to reengineer existing code. This is tedious because it requires changing many lines of code, error-prone because programmers can use the wrong APIs, and omission-prone because programmers can miss opportunities to use the enhanced APIs. This paper presents our tool, CONCURRENCER, which enables programmers to refactor sequential code into parallel code that uses j.u.c. concurrent utilities. CONCURRENCER does not require any program annotations, although the transformations are very involved: they span multiple program statements and use custom program analysis. A find-and-replace tool cannot perform such transformations. Empirical evaluation shows that CONCURRENCER refactors code effectively: CONCURRENCER correctly identifies and applies transformations that some open-source developers overlooked, and the converted code exhibits good speedup.
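    As a hedged, hand-written example of the kind of transformation described (not actual CONCURRENCER output), the sketch below shows a sequential counter refactored from a lock-protected int to java.util.concurrent.atomic.AtomicInteger.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hand-written before/after in the spirit of the paper: replacing a
 *  lock-protected int counter with j.u.c.'s AtomicInteger.
 *  (Not actual CONCURRENCER output.) */
public class CounterRefactoringDemo {
    // Before: thread safety via synchronized methods on a plain int.
    static class SequentialCounter {
        private int value;
        synchronized void increment() { value++; }
        synchronized int get() { return value; }
    }

    // After: the same contract using AtomicInteger, with no explicit locking.
    static class ConcurrentCounter {
        private final AtomicInteger value = new AtomicInteger();
        void increment() { value.incrementAndGet(); }
        int get() { return value.get(); }
    }

    public static void main(String[] args) throws InterruptedException {
        ConcurrentCounter c = new ConcurrentCounter();
        Thread t1 = new Thread(() -> { for (int i = 0; i < 100_000; i++) c.increment(); });
        Thread t2 = new Thread(() -> { for (int i = 0; i < 100_000; i++) c.increment(); });
        t1.start(); t2.start(); t1.join(); t2.join();
        System.out.println(c.get());   // 200000
    }
}
```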

    ADIFOR–Generating Derivative Codes from Fortran Programs


    Compilation techniques for irregular problems on parallel machines

    Massively parallel computers have ushered in the era of teraflop computing. Even though large and powerful machines are being built, they are used by only a fraction of the computing community. The fundamental reason for this situation is that parallel machines are difficult to program. Development of compilers that automatically parallelize programs will greatly increase the use of these machines. A large class of scientific problems can be categorized as irregular computations. In this class of computation, the data access patterns are known only at runtime, creating significant difficulties for a parallelizing compiler to generate efficient parallel codes. Some compilers with very limited abilities to parallelize simple irregular computations exist, but the methods used by these compilers fail for any non-trivial application code. This research presents the development of compiler transformation techniques that can be used to effectively parallelize an important class of irregular programs. A central aim of these transformation techniques is to generate code that aggressively prefetches data. Program slicing methods are used as part of the code generation process. In this approach, a program written in a data-parallel language, such as HPF, is transformed so that it can be executed on a distributed memory machine. An efficient compiler runtime support system has been developed that performs data movement and software caching.
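    Compilers for such runtime-only access patterns commonly split each irregular loop into an inspector and an executor; the sketch below is a generic, single-address-space illustration of that idea (assumed, not taken from this thesis): a slice of the loop touching only the indirection array records which elements will be needed, so a runtime system could prefetch and cache them before the real computation runs.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Generic inspector/executor sketch (not code from the thesis): the inspector
 *  runs the slice of the loop that computes the access pattern, recording which
 *  elements of x will be needed; the executor then performs the computation. */
public class InspectorExecutorDemo {
    public static void main(String[] args) {
        int[] edge = {3, 0, 7, 3, 5, 1};          // indirection array, known only at runtime
        double[] x = new double[8];
        for (int i = 0; i < x.length; i++) x[i] = i * 1.5;
        double[] y = new double[edge.length];

        // Inspector: derived from a slice of the loop below.
        Set<Integer> needed = new LinkedHashSet<>();
        for (int i = 0; i < edge.length; i++) needed.add(edge[i]);
        System.out.println("prefetch x at indices " + needed);   // stand-in for data movement

        // Executor: the original computation, run after the data is local.
        for (int i = 0; i < edge.length; i++) y[i] = 2.0 * x[edge[i]];
        System.out.println("y[0] = " + y[0]);
    }
}
```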

    Distributed memory compiler design for sparse problems

    A compiler and runtime support mechanism is described and demonstrated. The methods presented are capable of solving a wide range of sparse and unstructured problems in scientific computing. The compiler takes as input a FORTRAN 77 program enhanced with specifications for distributing data, and it outputs a message-passing program that runs on a distributed memory computer. The runtime support for this compiler is a library of primitives designed to efficiently support irregular patterns of distributed array accesses and irregular distributed array partitions. A variety of Intel iPSC/860 performance results obtained through the use of this compiler are presented.
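    Runtime primitives for irregular distributed array accesses typically gather the off-processor elements a loop will reference into local ghost storage before the loop runs; the sketch below simulates that idea in a single Java process with two hypothetical "processors" holding block-distributed halves of an array. Names and layout are illustrative and are not the library's actual primitives.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative single-process simulation (not the paper's library): an array of
 *  8 elements is block-distributed over two "processors"; processor 0 inspects
 *  its indirection array, gathers the off-processor elements it needs into a
 *  ghost buffer, and then runs its loop purely on local + ghost data. */
public class SparseGatherDemo {
    public static void main(String[] args) {
        double[] owned0 = {0, 1, 2, 3};           // proc 0 owns global indices 0..3
        double[] owned1 = {4, 5, 6, 7};           // proc 1 owns global indices 4..7
        int[] access = {1, 6, 2, 5};              // proc 0's irregular global references

        // "Inspector": find which referenced indices live on the other processor.
        List<Integer> offProc = new ArrayList<>();
        for (int g : access) if (g >= 4) offProc.add(g);

        // "Gather": fetch those elements into a ghost buffer (stands in for message passing).
        double[] ghost = new double[offProc.size()];
        for (int i = 0; i < offProc.size(); i++) ghost[i] = owned1[offProc.get(i) - 4];

        // "Executor": proc 0's loop reads only local or ghost storage.
        double sum = 0;
        int ghostPos = 0;
        for (int g : access) sum += (g < 4) ? owned0[g] : ghost[ghostPos++];
        System.out.println("sum = " + sum);       // 1 + 6 + 2 + 5 = 14
    }
}
```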

    An integrated compile-time/run-time software distributed shared memory system

    On a distributed memory machine, hand-coded message passing leads to the most efficient execution, but it is difficult to use. Parallelizing compilers can approach the performance of hand-coded message passing by translating data-parallel programs into message-passing programs, but efficient execution is limited to those programs for which precise analysis can be carried out. Shared memory is easier to program than message passing, and its domain is not constrained by the limitations of parallelizing compilers, but it lags in performance. Our goal is to close that performance gap while retaining the benefits of shared memory. In other words, our goal is (1) to make shared memory as efficient as message passing, whether hand-coded or compiler-generated, (2) to retain its ease of programming, and (3) to retain the broader class of applications it supports.

    To this end we have designed and implemented an integrated compile-time and run-time software DSM system. The programming model remains identical to the original pure run-time DSM system, and no user intervention is required to obtain the benefits of our system. The compiler computes data access patterns for the individual processors. It then performs a source-to-source transformation, inserting calls into the program to inform the run-time system of the computed data access patterns. The run-time system uses this information to aggregate communication, to aggregate data and synchronization into a single message, to eliminate consistency overhead, and to replace global synchronization with point-to-point synchronization wherever possible.

    We extended the ParaScope programming environment to perform the required analysis, and we augmented the TreadMarks run-time DSM library to take advantage of the analysis. We used six Fortran programs to assess the performance benefits: Jacobi, 3D-FFT, Integer Sort, Shallow, Gauss, and Modified Gram-Schmidt, each with two different data set sizes. The experiments were run on an 8-node IBM SP/2 using user-space communication. Compiler optimization in conjunction with the augmented run-time system achieves substantial execution time improvements in comparison to the base TreadMarks, ranging from 4% to 59% on 8 processors. Relative to message-passing implementations of the same applications, the compile-time run-time system is 0-29% slower than message passing, while the base run-time system is 5-212% slower. For the five programs that XHPF could parallelize (all except IS), the execution times achieved by the compiler-optimized shared memory programs are within 9% of XHPF.
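    The transformation described inserts calls that tell the run-time which shared data each processor is about to access, so the fetch can be aggregated; the sketch below mimics only the shape of that transformation in Java. DsmRuntime and willRead are hypothetical names, not TreadMarks' actual interface, and the whole example is an assumed illustration rather than the system's code.

```java
/** Shape of the compiler-inserted access-pattern hint, in illustrative Java:
 *  before the compute loop, a call tells the run-time which range of the shared
 *  array this processor will read, so the run-time can fetch it in one
 *  aggregated message instead of faulting it in piecemeal.
 *  DsmRuntime and willRead are hypothetical, not TreadMarks' API. */
public class DsmHintDemo {
    /** Stand-in for the DSM run-time's access-pattern interface. */
    static class DsmRuntime {
        void willRead(String array, int lo, int hi) {
            // A real run-time would aggregate the fetch of array[lo..hi] here.
            System.out.println("prefetch " + array + "[" + lo + ".." + hi + "]");
        }
    }

    public static void main(String[] args) {
        DsmRuntime runtime = new DsmRuntime();
        double[] a = new double[1000];            // conceptually shared data
        double[] b = new double[1000];
        int lo = 0, hi = 499;                     // this processor's block

        runtime.willRead("a", lo, hi);            // <- call the compiler would insert

        for (int i = lo; i <= hi; i++)            // original data-parallel loop body
            b[i] = 2.0 * a[i];
        System.out.println("b[0] = " + b[0]);
    }
}
```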