33 research outputs found

    An Efficient Framework for Performing Execution-Constraint-Sensitive Transformations That Increase Instruction-Level Parallelism

    No full text
    263 p.Thesis (Ph.D.)--University of Illinois at Urbana-Champaign, 1997.The increasing amount of instruction-level parallelism required to fully utilize high issue-rate processors forces the compiler to perform increasingly advanced transformations, many of which require adding extra operations in order to remove those dependences constraining performance. Although aggressive application of these transformations is necessary in order to realize the full performance potential, overly-aggressive application can negate their benefit or even degrade performance. This thesis investigates a general framework for applying these transformations at schedule time, which is typically the only time the processor's execution constraints are visible to the compiler. Feedback from the instruction scheduler is then used to aggressively and intelligently apply these transformations. This results in consistently better performance than traditional application methods because the application of transformations can now be more fully adapted to the processor's execution constraints. Techniques for optimizing the processor's machine description for efficient use by the scheduler, and for incrementally updating the dependence graph after performing each transformation, allow the utilization of scheduler feedback with relatively small compile-time overhead.U of I OnlyRestricted to the U of I community idenfinitely during batch ingest of legacy ETD

    An Efficient Framework For Performing Execution-Constraint-Sensitive Transformations That Increase Instruction-Level Parallelism

    No full text
    The increasing amount of instruction-level parallelism required to fully utilize high issue-rate processors forces the compiler to perform increasingly advanced transformations, many of which require adding extra operations in order to remove those dependences constraining performance. Although aggressive application of these transformations is necessary in order to realize the full performance potential, overly-aggressive application can negate their benefit or even degrade performance. This thesis investigates a general framework for applying these transformations at schedule time, which is typically the only time the processor's execution constraints are visible to the compiler. Feedback from the instruction scheduler is then used to aggressively and intelligently apply these transformations. This results in consistently better performance than traditional application methods because the application of transformations can now be more fully adapted to the processor's execution constraints. Techniques for optimizing the processor's machine description for efficient use by the scheduler, and for incrementally updating the dependence graph after performing each transformation, allow the utilization of scheduler feedback with relatively small compile-time overhead

    Data relocation and prefetching for programs with large data sets

    No full text
    Numerical applications frequently contain nested loop structures that process large arrays of data. The execution of these loop structures often produces memory preference patterns that poorly utilize data caches. Limited associativity and cache capacity result in cache con ict misses. Also, non-unit stride access patterns can cause low utilization of cache lines. Data copying has been proposed and investigated in order to reduce the cache con ict misses [1][2], but this technique has a high execution overhead since it does the copy operations entirely in software. We propose a combined hardware and software technique called data relocation and prefetching which eliminates much of the overhead of data copying through the use of special hardware. Furthermore, by relocating the data while performing software prefetching, the overhead of copying the data can be reduced further. Experimental results for data relocation and prefetching are encouraging and show a large improvement incache performance. Index terms- Cache con icts, data copying, data relocation, program optimization, software prefetching.

    Optimization of Machine Descriptions for Efficient Use

    No full text
    A machine description facility allows compiler writers to specify machine execution constraints to the optimization and scheduling phases of an instruction-level parallelism (ILP) optimizing compiler. The machine description (MDES) facility should support quick development and easy maintenance of machine execution constraint descriptions by compiler writers. However, the facility should also allow compact representation and efficient usage of the MDES during compilation. This paper advocates a model that allows compiler writers to develop the MDES in a high-level language, which is then translated into a low-level representation for efficient use by the compiler. The discrepancy between the requirements of the high-level language and the low-level representation is reconciled with a collection of transformations that derive efficient low-level representations from the easy-to-understand high-level descriptions. In order to support these transformations, a novel approach to representing..
    corecore