33 research outputs found

CLOMP: Accurately Characterizing OpenMP Application Overheads

Author: Bronis R. de Supinski
Greg Bronevetsky
John Gyllenhaal
W.D. Collins
Publication venue: 'Springer Science and Business Media LLC'

An Efficient Framework for Performing Execution-Constraint-Sensitive Transformations That Increase Instruction-Level Parallelism

Author: Gyllenhaal John Christopher

263 p. Thesis (Ph.D.)--University of Illinois at Urbana-Champaign, 1997.

The increasing amount of instruction-level parallelism required to fully utilize high issue-rate processors forces the compiler to perform increasingly advanced transformations, many of which require adding extra operations in order to remove those dependences constraining performance. Although aggressive application of these transformations is necessary in order to realize the full performance potential, overly-aggressive application can negate their benefit or even degrade performance. This thesis investigates a general framework for applying these transformations at schedule time, which is typically the only time the processor's execution constraints are visible to the compiler. Feedback from the instruction scheduler is then used to aggressively and intelligently apply these transformations. This results in consistently better performance than traditional application methods because the application of transformations can now be more fully adapted to the processor's execution constraints. Techniques for optimizing the processor's machine description for efficient use by the scheduler, and for incrementally updating the dependence graph after performing each transformation, allow the utilization of scheduler feedback with relatively small compile-time overhead.

U of I Only. Restricted to the U of I community indefinitely during batch ingest of legacy ETD.

Illinois Digital Environment for Access to Learning and Scholarship Repository

An Efficient Framework For Performing Execution-Constraint-Sensitive Transformations That Increase Instruction-Level Parallelism

Author: John Christopher Gyllenhaal

CiteSeerX

Data relocation and prefetching for programs with large data sets

Author: Grant Haab
John Gyllenhaal
Wen-mei Hwu
Yoji Yamada
Publication date: 01/01/1994

Numerical applications frequently contain nested loop structures that process large arrays of data. The execution of these loop structures often produces memory reference patterns that poorly utilize data caches. Limited associativity and cache capacity result in cache conflict misses. Also, non-unit stride access patterns can cause low utilization of cache lines. Data copying has been proposed and investigated in order to reduce the cache conflict misses [1][2], but this technique has a high execution overhead since it does the copy operations entirely in software. We propose a combined hardware and software technique called data relocation and prefetching which eliminates much of the overhead of data copying through the use of special hardware. Furthermore, by relocating the data while performing software prefetching, the overhead of copying the data can be reduced further. Experimental results for data relocation and prefetching are encouraging and show a large improvement in cache performance.

Index terms: cache conflicts, data copying, data relocation, program optimization, software prefetching.

CiteSeerX

Crossref

Code scheduling for VLIW/superscalar processors with limited register files

Author: Ellis
Freudenberger S.
John C. Gyllenhaal
Tokuzo Kiyohara
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref

Optimization of Machine Descriptions for Efficient Use

Author: B. Ramakrishna Rau
John Gyllenhaal
Wen-mei W. Hwu

A machine description facility allows compiler writers to specify machine execution constraints to the optimization and scheduling phases of an instruction-level parallelism (ILP) optimizing compiler. The machine description (MDES) facility should support quick development and easy maintenance of machine execution constraint descriptions by compiler writers. However, the facility should also allow compact representation and efficient usage of the MDES during compilation. This paper advocates a model that allows compiler writers to develop the MDES in a high-level language, which is then translated into a low-level representation for efficient use by the compiler. The discrepancy between the requirements of the high-level language and the low-level representation is reconciled with a collection of transformations that derive efficient low-level representations from the easy-to-understand high-level descriptions. In order to support these transformations, a novel approach to representing..

CiteSeerX