A spill code minimization algorithm for loops

Author: Kolson David J.
Nicolau Alexandru
Publication venue: eScholarship, University of California
Publication date: 29/06/1992

Loops are the main source of parallelism in applications. Finding an optimal register allocation for loops has been an open problem for some time. Here, optimal means minimizing spills from registers to memory. In this paper we address this problem and present an optimal but exponential algorithm that allocates registers to loop bodies such that the spill code is minimal. We also show heuristic modifications to the algorithm that perform as well in practice as the exponential approach. Finally, we examine the algorithm's feasibility in production compilers.

eScholarship - University of California

Complete and Practical Universal Instruction Selection

Author: Blindell G. H.
Boender J.
Buchwald S.
Eckstein E.
Floch A.
Gebotys C. H.
Johnson N.
Land A. H.
Lattner C.
Lee C.
Lozano R. C.
Nethercote N.
Tanaka H.
Wilson T.
Živojnović V.
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref

Path splitting--a technique for improving data flow analysis

Author: Poletto Massimiliano Antonio
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/1995

Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1995. Includes bibliographical references (p. 83-87). By Massimiliano Antonio Poletto.

DSpace@MIT

Eliminating Branches using a Superoptimizer and the GNU C Compiler

Author: Richard Kenner
Torbjörn Granlund
Publication date: 01/01/1992

Although this paper uses the RS/6000 for all its examples, the techniques described here are applicable to most machines.

CiteSeerX

Eliminating branches using a superoptimizer and the GNU C compiler

Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/1992

Crossref

Eliminating branches using a superoptimizer and the GNU C compiler

Author: Richard Kenner
Torbjörn Granlund
Warren Henry
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref

Application-Specific Memory Subsystems

Author: Wingbermuehle Joseph George
Publication venue: Washington University Open Scholarship
Publication date: 15/05/2015

The disparity in performance between processors and main memories has led computer architects to incorporate large cache hierarchies in modern computers. These cache hierarchies are designed to be general-purpose in that they strive to provide the best possible performance across a wide range of applications. However, such a memory subsystem does not necessarily provide the best possible performance for a particular application. Although general-purpose memory subsystems are desirable when the work-load is unknown and the memory subsystem must remain fixed, when this is not the case a custom memory subsystem may be beneficial. For example, in an application-specific integrated circuit (ASIC) or a field-programmable gate array (FPGA) designed to run a particular application, a custom memory subsystem optimized for that application would be desirable. In addition, when there are tunable parameters in the memory subsystem, it may make sense to change these parameters depending on the application being run. Such a situation arises today with FPGAs and, to a lesser extent, GPUs, and it is plausible that general-purpose computers will begin to support greater flexibility in the memory subsystem in the future. In this dissertation, we first show that it is possible to create application-specific memory subsystems that provide much better performance than a general-purpose memory subsystem. In addition, we show a way to discover such memory subsystems automatically using a superoptimization technique on memory address traces gathered from applications. This allows one to generate a custom memory subsystem with little effort. We next show that our memory subsystem superoptimization technique can be used to optimize for objectives other than performance. As an example, we show that it is possible to reduce the number of writes to the main memory, which can be useful for main memories with limited write durability, such as flash or Phase-Change Memory (PCM). 
Finally, we show how to superoptimize memory subsystems for streaming applications, which are a class of parallel applications. In particular, we show that, through the use of ScalaPipe, we can author and deploy streaming applications targeting FPGAs with superoptimized memory subsystems. ScalaPipe is a domain-specific language (DSL) embedded in the Scala programming language for generating streaming applications that can be implemented on CPUs and FPGAs. Using the ScalaPipe implementation, we are able to demonstrate actual performance improvements using the superoptimized memory subsystem with applications implemented in hardware.

Washington University St. Louis: Open Scholarship