146 research outputs found
Fine grain software pipelining of non-vectorizable nested loops
This paper presents a new technique to parallelize nested loops at the statement level. It transforms sequential nested loops, either vectorizable or not, into parallel ones. Previously, the wavefront method was used to parallelize non-vectorizable nested loops. However, in order to reduce the complexity of parallelization, the wavefront method regards an iteration as an unbreakable scheduling unit and extracts parallelism by overlapping iterations. Our technique takes a statement rather than an iteration as the scheduling unit and exploits parallelism by overlapping the statements in all dimensions. In this paper, we show how this finer grain parallelization can be achieved with reasonable computational complexity, and we show the effectiveness of the resulting method in exploiting parallelism.
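As a rough illustration of the iteration-level wavefront method that the abstract contrasts itself with (not the paper's statement-level technique), the sketch below groups the iterations of a two-deep loop nest into wavefronts. It assumes a hypothetical recurrence in which iteration (i, j) depends only on (i-1, j) and (i, j-1); the function names are invented for this example.

```python
def wavefront_schedule(n, m):
    """Group iterations (i, j) of an n-by-m loop nest into wavefronts.

    Assumes each iteration depends only on (i-1, j) and (i, j-1), so all
    iterations sharing the same value of i + j are mutually independent.
    """
    fronts = [[] for _ in range(n + m - 1)]
    for i in range(n):
        for j in range(m):
            fronts[i + j].append((i, j))
    return fronts


def run_wavefront(n, m):
    """Evaluate a[i][j] = a[i-1][j] + a[i][j-1] front by front."""
    # First row and first column are initialised to 1 (illustrative choice).
    a = [[1] * m for _ in range(n)]
    for front in wavefront_schedule(n, m):
        # Every iteration inside one front could run in parallel;
        # here they are executed sequentially for simplicity.
        for i, j in front:
            if i > 0 and j > 0:
                a[i][j] = a[i - 1][j] + a[i][j - 1]
    return a
```

Each whole iteration is a scheduling unit here, which is exactly the restriction the paper proposes to lift by scheduling individual statements.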
Parafrase restructuring of FORTRAN code for parallel processing
Parafrase transforms a FORTRAN code, subroutine by subroutine, into a parallel code for a vector and/or shared-memory multiprocessor system. Parafrase is not a compiler; it transforms a code and provides information for a vector or concurrent process. Parafrase uses data dependences to reveal parallelism among instructions. The data dependence test distinguishes between recurrences and statements that can be directly vectorized or parallelized. A number of transformations are required to build a data dependence graph.
A Rules Based Approach to Analyze Data Dependent Transformation Strategies of a Supercompiler for Parallel Computers.
A supercompiler is a program that attempts to automatically restructure serial code into an equivalent parallel form. This restructuring is achieved through the application of various transformation strategies designed to remove data dependences. A data dependence is a relation between two programming statements that prevents those two statements from being executed in parallel. This research develops a rules based system to analyze the various data dependent transformation strategies of a supercompiler for parallel computers. With the information obtained from user input and the automated analysis of a program segment, this rules based analysis will be able to determine which of the available transformation strategies is the optimal one to be applied for a particular program segment.
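To make the notion of a data dependence concrete, here is a minimal sketch of the classic GCD dependence test, a standard textbook check and not necessarily the one used by the system described above. It assumes one statement writes `a[c1*i + k1]` and another reads `a[c2*j + k2]` for integer loop indices; the function name is hypothetical, and loop bounds are ignored, so a result of `True` only means a dependence cannot be ruled out.

```python
from math import gcd


def gcd_test_may_depend(c1, k1, c2, k2):
    """Return False only when the GCD test proves the accesses independent.

    The equation c1*i + k1 == c2*j + k2 has an integer solution (i, j)
    exactly when gcd(c1, c2) divides (k2 - k1).
    """
    g = gcd(c1, c2)
    if g == 0:
        # Both subscripts are constants: they conflict only if equal.
        return k1 == k2
    return (k2 - k1) % g == 0
```

For example, a write to `a[2*i]` and a read of `a[2*j + 1]` touch even and odd elements respectively, so the test reports independence and the two statements could be run in parallel.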
Per Aspera ad Astra: On the Way to Parallel Processing
Computational Science and Engineering is being established as a third category of scientific methodology; this innovative discipline supports and supplements the traditional categories, theory and experiment, in order to solve the problems arising from complex systems challenging science and technology. While the successes of the past two decades in scientific computing have been achieved essentially by the technical breakthrough of vector supercomputers, today the discussion about the future of supercomputing is focused on massively parallel computers. The discrepancy, however, between peak performance and the sustained performance achievable with algorithmic kernels, software packages, and real applications is still disappointingly high. An important issue is the choice of programming model. While Message Passing on parallel computers with distributed memory is the only efficient programming paradigm available today, from a user's point of view it is hard to imagine that this programming model, rather than Shared Virtual Memory, will be capable of serving as the central basis for bringing computing on massively parallel systems from a sheer computer science trend to the technological breakthrough needed to deal with the large applications of the future; this is especially true for commercial applications, where explicitly programming the data communication via Message Passing may turn out to be a huge software-technological barrier which nobody might be willing to surmount.
KFA Jülich is one of the largest big-science research centres in Europe; its scientific and engineering activities range from fundamental research to applied science and technology. KFA's Central Institute for Applied Mathematics (ZAM) runs the large-scale computing facilities and network systems at KFA and provides communication services as well as general-purpose and supercomputer capacity, also to the HLRZ ("Höchstleistungsrechenzentrum"), established in 1987 in order to further enhance and promote computational science in Germany. Thus, at KFA, and in particular driven by ZAM, supercomputing has received high priority for more than ten years. What particle accelerators mean to experimental physics, supercomputers mean to Computational Science and Engineering: supercomputers are the accelerators of theory.
Applying an abstract data structure description approach to parallelizing scientific pointer programs
Even though impressive progress has been made in the area of parallelizing scientific programs with arrays, the application of similar techniques to programs with pointer data structures has remained difficult. Unlike arrays, which have a small number of well-defined properties that can be utilized by a parallelizing compiler, pointer data structures are used to implement a wide variety of structures that exhibit a much more diverse set of properties. The complexity and diversity of such properties mean that, in general, scientific programs with pointer data structures cannot be effectively analyzed by an optimizing and parallelizing compiler. In order to provide a system in which the compiler can fully utilize the properties of different types of pointer data structures, we have developed a mechanism for the Abstract Description of Data Structures (ADDS). With our approach, the programmer can explicitly describe important properties such as dimensionality of the pointer data structure, independence of dimensions, and direction of traversal. These abstract descriptions of pointer data structures are then used by the compiler to guide analysis, optimization, and parallelization. In this paper we summarize the ADDS approach through the use of numerous examples of data structures used in scientific computations, we illustrate how such declarations are natural and non-tedious to specify, and we show how the ADDS declarations can be used to improve compile-time analysis. In order to demonstrate the viability of our approach, we show how such techniques can be used to parallelize an important class of scientific codes which naturally use recursive pointer data structures. In particular, we use our approach to develop the parallelization of an N-body simulation that is based on a relatively complicated pointer data structure, and we report the speedup results for a Sequent multiprocessor.
Pipeline synthesis and optimization for reconfigurable custom computing machines
This paper presents a pipeline synthesis and optimization technique for high-level language programming of reconfigurable Custom Computing Machines. The circuit synthesis generates hardware accelerators from a sequential program which exploit the reconfigurable hardware's parallelism. Program loops are transformed to structural hardware specifications. The optimization algorithm uses integer linear programming to balance and pipeline the circuit's registers. This global optimization determines the minimal amount of flip-flops necessary for an optimal pipeline throughput. It also considers the irregular flip-flop distribution on FPGAs. Standard interface circuitry and a runtime system provide the connection between the accelerator unit and its host computer. An integrated compiler invokes the synthesis and produces a program which downloads, calls and controls its hardware accelerators automatically.
On the Super-computational Background of the Research Centre Jülich
KFA Jülich is one of the largest big-science research centres in Europe; its scientific and engineering activities range from fundamental research to applied science and technology. KFA's Central Institute for Applied Mathematics (ZAM) runs the large-scale computing facilities and network systems at KFA and provides communication services as well as general-purpose and supercomputer capacity, also for the HLRZ ("Höchstleistungsrechenzentrum"), established in 1987 in order to further enhance and promote computational science in Germany. Thus, at KFA, and in particular driven by ZAM, supercomputing has received high priority for more than ten years. What particle accelerators mean to experimental physics, supercomputers mean to Computational Science and Engineering: supercomputers are the accelerators of theory.
Intermediate Code Generation for Portable Scalable, Compilers. Architecture Independent Data Parallelism: The Preliminaries
This paper introduces the goals of the Portable, Scalable, Architecture Independent (PSI) Compiler Project for Data Parallel Languages at the University of Missouri-Rolla. A goal of this project is to produce a subcompiler for data parallel scientific programming languages such as HPF (High Performance Fortran), where the input grammar is translated to a three-address code intermediate language. Ultimately we plan to integrate our work into automated synthesis systems for scientific programming, because we feel that it should not be necessary to learn complicated programming techniques to use multiprocessor computers or networks of computers effectively. This paper shows how to compile a data parallel language to an arbitrary multiprocessor topology or network of CPUs given the number of processors, the length of the vector registers, and the total number of components in an array, assuming a message passing, distributed memory paradigm of send and receive. We emphasize that this paradigm is amenable not only to machines such as the CM5 and NCube but also to LAN and WAN connected architectures. We do automatic program partitioning and mapping to processing elements of a multiprocessor architecture or distributed network of machines. No programmer intervention is required; hence, no errors will be introduced through data decomposition.
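The kind of automatic data decomposition mentioned above can be illustrated with a simple block partition. This is a generic sketch, not the PSI compiler's actual mapping: it assumes only that the number of processors and the total number of array components are known, and the function name is invented for the example.

```python
def block_partition(total, procs):
    """Split indices 0..total-1 into contiguous half-open ranges, one per processor.

    The first (total % procs) processors receive one extra element, so
    range sizes differ by at most one.
    """
    base, extra = divmod(total, procs)
    ranges = []
    start = 0
    for p in range(procs):
        size = base + (1 if p < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges
```

In a message passing setting, each processor would own the slice given by its range and exchange boundary elements with its neighbours via send and receive.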