31 research outputs found

    An APL-to-C compiler for the IBM RISC System/6000

    No full text

    On performance and space usage improvements for parallelized compiled APL code

    No full text

    An Array Operation Synthesis Scheme to Optimize Fortran 90 Programs

    No full text
    An increasing number of programming languages, such as Fortran 90 and APL, provide a rich set of intrinsic array functions and array expressions. These constructs, which constitute an important part of data-parallel languages, provide excellent opportunities for compiler optimizations. In this paper, we present a new approach to combining consecutive data access patterns of array constructs into a composite access function to the source arrays. Our scheme is based on the composition of access functions, which is similar to the composition of mathematical functions. Our new scheme can handle not only data movements between arrays of different numbers of dimensions and segmented array operations but also masked array expressions and multiple-source array operations. As a result, our proposed scheme is the first synthesis scheme that can synthesize Fortran 90 RESHAPE, EOSHIFT, MERGE, and WHERE constructs together. Experimental results show speedups from 1.21 to 2.95 for code fragments from rea..
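A minimal sketch of the idea behind this abstract: each array operation is represented by an access function mapping result indices to source indices, and consecutive operations are fused by composing those functions instead of materializing intermediates. All names and the 1-D circular-shift/reversal example are illustrative, not the paper's actual scheme.

```python
import numpy as np

def shift_access(offset, n):
    """Access function for a 1-D circular shift: result[i] = source[(i + offset) % n]."""
    return lambda i: (i + offset) % n

def reverse_access(n):
    """Access function for a 1-D reversal: result[i] = source[n - 1 - i]."""
    return lambda i: n - 1 - i

def compose(f, g):
    """If op2 (access g) follows op1 (access f), the fused access is f(g(i))."""
    return lambda i: f(g(i))

n = 8
src = np.arange(n)

# Naive evaluation: one temporary array per operation.
tmp = np.array([src[shift_access(2, n)(i)] for i in range(n)])
naive = np.array([tmp[reverse_access(n)(i)] for i in range(n)])

# Synthesized evaluation: a single composite access function, no temporaries.
acc = compose(shift_access(2, n), reverse_access(n))
fused = np.array([src[acc(i)] for i in range(n)])

assert (naive == fused).all()   # both yield [1, 0, 7, 6, 5, 4, 3, 2]
```

The same composition law extends to multi-dimensional index maps, which is what lets constructs like RESHAPE and EOSHIFT be synthesized into one traversal.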

    Exploitation of APL data parallelism on a shared-memory MIMD machine

    No full text

    A Function-Composition Approach to Synthesize Fortran 90 Array Operations

    No full text
    An increasing number of programming languages, such as Fortran 90 and APL, provide a rich set of intrinsic array functions and array expressions. These constructs, which constitute an important part of data-parallel languages, provide excellent opportunities for compiler optimizations. In this paper, we present a new approach to combining consecutive array operations or array expressions into a composite access function of the source arrays. Our scheme is based on the composition of access functions, which is analogous to the composition of mathematical functions. Our new scheme can handle not only data movements between arrays with different numbers of dimensions and multiple-clause array operations but also masked array expressions and multiple-source array operations. As a result, our proposed scheme is the first synthesis scheme that can collectively synthesize Fortran 90 RESHAPE, EOSHIFT, MERGE, array reduction operations, and WHERE constructs. In addition, we also discuss the case..

    Partial Redundancy Elimination on Predicated Code

    No full text
    Partial redundancy elimination (PRE) is one of the most widespread optimizations in compilers. However, current PRE techniques are inadequate for predicated code, i.e., programs where instructions are guarded by a 1-bit register that dynamically controls whether the effect of an instruction should be committed or nullified. In fact, to avoid corrupting the semantics, they must be overly conservative, making them close to useless. Since predicated code will become more and more common with the advent of the IA-64 architecture, we present here a family of PRE algorithms tailored for predicated code. Conceptually, the basic member of this family can be considered the counterpart of the busy code motion of [20]. It can easily be tuned by two orthogonal means: first, by adjusting the power of a preprocess that feeds it with information on predication; second, by relaxing or strengthening the constraints on synthesizing the predicates that control the movability of computations. Together with extensions towards lazy code motion, this results in a family of PRE algorithms spanning a range from tame to quite aggressive, which is illustrated by various meaningful examples.
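For readers unfamiliar with PRE itself, here is a minimal before/after sketch of the classic (non-predicated) transformation the abstract builds on: an expression computed on one path and again after the join is partially redundant, and PRE inserts it on the other path so the join-point evaluation becomes a reuse. The functions and values are illustrative, not from the paper.

```python
def before(a, b, p):
    if p:
        x = a + b        # a + b computed on this path ...
    else:
        x = 0
    y = a + b            # ... and again unconditionally here: partially redundant
    return x + y

def after(a, b, p):
    if p:
        t = a + b        # already available on this path
        x = t
    else:
        x = 0
        t = a + b        # PRE inserts the computation on the other path ...
    y = t                # ... so the evaluation after the join becomes a reuse
    return x + y

# The transformation preserves semantics on both paths.
assert before(3, 4, True) == after(3, 4, True)
assert before(3, 4, False) == after(3, 4, False)
```

The paper's point is that when the branch is replaced by predicated instructions guarded by 1-bit registers, deciding where `t` may safely be computed requires reasoning about predicate relations, which conventional PRE cannot do.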

    Probabilistic memory disambiguation and its application to data speculation

    No full text

    Probabilistic Memory Disambiguation and its Application to Data Speculation

    No full text
    In many code streams, a load instruction is on a critical path and is followed by a chain of operations that depend on the load. Executing a load from memory often incurs a long latency on a modern microprocessor. To hide this latency and shorten the critical path, one approach is to issue the load as early as possible in a non-blocking way. However, to preserve the semantics of a program on a conventional architecture, a load from a memory location must be executed after all of the corresponding aliasing stores that potentially write to that location. This constraint often prevents an instruction scheduler from moving the load early enough to reduce the critical-path length. This aliasing problem can be greatly alleviated with architectural support for data speculation, which makes it possible to speculate on a load and on the instructions that use its result, that is, to initiate the load before being sure that all of the previous aliasing stores have completed. Cl..
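The speculate-and-check pattern the abstract refers to can be sketched as follows. On IA-64 this is realized in hardware (advance loads with a later check and recovery); here it is merely modeled in plain Python with an explicit alias test, and all names are illustrative.

```python
def original(mem, p, q):
    mem[q] = 42          # store that may alias the later load
    x = mem[p]           # load must stay after the store on a conventional machine
    return x * 2

def speculated(mem, p, q):
    x = mem[p]           # speculative load hoisted above the possibly aliasing store
    mem[q] = 42
    if p == q:           # check: did the store overwrite the speculated location?
        x = mem[p]       # recovery: redo the load with the up-to-date value
    return x * 2

# Semantics are preserved whether or not the load and store alias.
assert original([0] * 8, 3, 3) == speculated([0] * 8, 3, 3)   # aliasing case
assert original([0] * 8, 3, 5) == speculated([0] * 8, 3, 5)   # non-aliasing case
```

The profitability of hoisting the load this way depends on how rarely the recovery path fires, which is exactly what a probabilistic memory-disambiguation analysis tries to estimate.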