Search CORE

9 research outputs found

Code scheduling for VLIW/superscalar processors with limited register files

Author: Ellis
Freudenberger S.
John C. Gyllenhaal
Tokuzo Kiyohara
Publication venue: 'Association for Computing Machinery (ACM)'

Register connection: A new approach to adding registers into instruction set architectures

Author: Richard Hank
Roger Bringmann
Sadun Anik
Scott Mahlke
Tokuzo Kiyohara
Wen-mei Hwu
William Chen
Publication venue: ACM Press
Publication date: 01/01/1993

Code optimization and scheduling for superscalar and superpipelined processors often increase the register requirement of programs. For existing instruction sets with a small to moderate number of registers, this increased register requirement can be a factor that limits the effectiveness of the compiler. In this paper, we introduce a new architectural method for adding a set of extended registers into an architecture. Using a novel concept of connection, this method allows the data stored in the extended registers to be accessed by instructions that apparently reference core registers. Furthermore, we address the technical issues involved in applying the new method to an architecture: instruction set extension, procedure call convention, context switching considerations, upward compatibility, efficient implementation, compiler support, and performance. Experimental results based on a prototype compiler and execution driven simulation show that the proposed method can significantly improve the performance of superscalar processors with a small or moderate number of registers.
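The idea of "connection" described in the abstract can be illustrated with a toy model. The sketch below is not the paper's design: the class name `RegisterConnectionFile`, the `connect` operation, and the register counts are all hypothetical, chosen only to show how an instruction that names a core register could end up reading or writing an extended register.

```python
class RegisterConnectionFile:
    """Toy model (an assumption, not the paper's hardware): core register
    numbers are redirected through a connection table."""

    def __init__(self, num_core=8, num_extended=8):
        self.num_core = num_core
        # Physical storage: core registers first, then extended registers.
        self.physical = [0] * (num_core + num_extended)
        # Initially every core register number is connected to itself.
        self.connection = list(range(num_core))

    def connect(self, core_index, physical_index):
        """Hypothetical 'connect' operation: redirect a core register number."""
        if not 0 <= core_index < self.num_core:
            raise IndexError("core register number out of range")
        if not 0 <= physical_index < len(self.physical):
            raise IndexError("physical register number out of range")
        self.connection[core_index] = physical_index

    def read(self, core_index):
        # The instruction names a core register; the table picks the storage.
        return self.physical[self.connection[core_index]]

    def write(self, core_index, value):
        self.physical[self.connection[core_index]] = value


rf = RegisterConnectionFile(num_core=4, num_extended=4)
rf.write(1, 10)    # lands in core register 1
rf.connect(1, 6)   # core number 1 now refers to extended register 6
rf.write(1, 99)    # same instruction encoding, different storage
rf.connect(1, 1)   # restore the default connection
```

In this model the instruction encoding never changes: only the connection table decides which physical register a core register number reaches, which is one way to read the abstract's phrase "apparently reference core registers".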

CiteSeerX

Tolerating Data Access Latency with Register Preloading

Author: Pohua P. Chang
Scott A. Mahlke
Tokuzo Kiyohara
Wen-mei W. Hwu
William Y. Chen

By exploiting fine grain parallelism, superscalar processors can potentially increase the performance of future supercomputers. However, supercomputers typically have a long access delay to their first level memory which can severely restrict the performance of superscalar processors. Compilers attempt to move load instructions far enough ahead to hide this latency. However, conventional movement of load instructions is limited by data dependence analysis. This paper introduces a simple hardware scheme, referred to as preload register update, to allow the compiler to move load instructions even in the presence of inconclusive data dependence analysis results. Preload register update keeps the load destination registers coherent when load instructions are moved past store instructions that reference the same location. With this addition, superscalar processors can more effectively tolerate longer data access latencies.

CiteSeerX

Tolerating Data Access Latency with Register Preloading

Author: Pohua P. Chang
Scott A. Mahlke
Tokuzo Kiyohara
Wen-mei W. Hwu
William Chen
William Y. Chen
Publication date: 01/01/1992

By exploiting fine grain parallelism, superscalar processors can potentially increase the performance of future supercomputers. However, supercomputers typically have a long access delay to their first level memory which can severely restrict the performance of superscalar processors. Compilers attempt to move load instructions far enough ahead to hide this latency. However, conventional movement of load instructions is limited by data dependence analysis. This paper introduces a simple hardware scheme, referred to as preload register update, to allow the compiler to move load instructions even in the presence of inconclusive data dependence analysis results. Preload register update keeps the load destination registers coherent when load instructions are moved past store instructions that reference the same location. With this addition, superscalar processors can more effectively tolerate longer data access latencies. Keywords: data dependence analysis, load latency, register file, re..
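The coherence property described in the abstract can be sketched in a few lines. This is a simplified model under stated assumptions, not the paper's hardware: the class `PreloadMachine`, its method names, and the rule that tracking ends at the first use of the register are all hypothetical choices made for illustration.

```python
class PreloadMachine:
    """Toy model (an assumption, not the paper's design) of keeping a
    preloaded register coherent with later stores to the same address."""

    def __init__(self, num_regs=4):
        self.memory = {}
        self.regs = [0] * num_regs
        # Address each register was preloaded from, or None if untracked.
        self.preload_addr = [None] * num_regs

    def preload(self, reg, addr):
        """A load hoisted above stores that may alias with it."""
        self.regs[reg] = self.memory.get(addr, 0)
        self.preload_addr[reg] = addr

    def store(self, addr, value):
        self.memory[addr] = value
        # Forward the stored value to any register preloaded from addr.
        for reg, tracked in enumerate(self.preload_addr):
            if tracked == addr:
                self.regs[reg] = value

    def use(self, reg):
        """First use of the register; this sketch assumes tracking ends here."""
        self.preload_addr[reg] = None
        return self.regs[reg]


m = PreloadMachine()
m.memory[100] = 1
m.preload(0, 100)   # load moved ahead of the store below
m.store(100, 7)     # store to the same location updates register 0
value = m.use(0)    # sees the stored value, not the stale one
```

The point of the sketch is that the compiler may hoist the load even when it cannot prove the store targets a different address, because a conflicting store repairs the register instead of leaving it stale.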

CiteSeerX

Crossref

Register connection

Author: Richard Hank
Roger Bringmann
Sadun Anik
Scott Mahlke
Tokuzo Kiyohara
Wen-Mei Hwu
William Chen
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref

The superblock: An effective technique for VLIW and superscalar compilation

Author: A. Aho
A. Aiken
B.R. Rau
D. Bernstein
D.J. Kuck
Daniel M. Lavery
F.C. Chow
G. Kane
G.J. Chaitin
Grant E. Haab
H.S. Warren Jr.
Intel.
J. Ellis
J.A. Fisher
John G. Holm
M.A. Schuette
M.D. Smith
N.P. Jouppi
Nancy J. Warter
P.P. Chang
Pohua P. Chang
R. Gupta
R.P. Colwell
R.W. Horst
Richard E. Hank
Roger A. Bringmann
Roland G. Ouellette
Scott A. Mahlke
T. Nakatani
Tokuzo Kiyohara
W.W. Hwu
W.Y. Chen
Wen-Mei W. Hwu
William Y. Chen
Publication venue: 'Springer Science and Business Media LLC'

Crossref