Inter-Block Code Motion without Copies
Abstract of dissertation: Inter-Block Code Motion without Copies. Code motion is an important optimization for any compiler, and the need to include instruction scheduling in compilers for instruction-level-parallel (ILP) architectures makes code motion even more important in compilers for such architectures. Currently popular global scheduling techniques such as trace scheduling allow inter-block code motion during scheduling, but require compensation copies, which may make them less useful for many popular ILP architectures. This work measures the amount of inter-block code motion possible when compensation copies are not allowed and develops a global scheduling technique, dominator-path scheduling, which relies on such inter-block motion without copies. Tests show that dominator-path scheduling outperforms trace scheduling for a test suite of C programs compiled for the IBM RISC System/6000, a popular superscalar computer. Philip H. Sweany, Department of Computer Science, Colorado State Unive..
Dominator-Path Scheduling - A Global Scheduling Method
Dominator-path scheduling performs global instruction scheduling of paths in the dominator tree. Unlike other global scheduling methods, dominator-path scheduling does not require copies of operations to preserve program semantics. In a limited test suite for a typical superscalar architecture, dominator-path scheduling produces schedules requiring 8.3% fewer cycles than local scheduling alone.
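To make "paths in the dominator tree" concrete, here is a minimal sketch that computes dominators for a small control-flow graph with the standard iterative data-flow method and then walks one dominator-tree path. The CFG, the block names, and the helper names (`dominators`, `immediate_dominators`, `dominator_path`) are hypothetical and are not taken from the paper. The sketch assumes every block is reachable from the entry, and it shows only the path construction, not the scheduling itself.

```python
def dominators(cfg, entry):
    """Iteratively compute the dominator set of every block in `cfg`.

    `cfg` maps each block to its list of successors. Assumes every
    block is reachable from `entry`.
    """
    nodes = set(cfg)
    preds = {n: set() for n in nodes}
    for n, succs in cfg.items():
        for s in succs:
            preds[s].add(n)
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            common = set(nodes)
            for p in preds[n]:
                common &= dom[p]
            new = {n} | common
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


def immediate_dominators(dom, entry):
    """The immediate dominator is the strict dominator closest to the block."""
    idom = {}
    for n, ds in dom.items():
        if n != entry:
            # Strict dominators form a chain; the closest has the largest set.
            idom[n] = max(ds - {n}, key=lambda d: len(dom[d]))
    return idom


def dominator_path(idom, block):
    """Blocks on the dominator-tree path from the entry down to `block`."""
    path = [block]
    while path[-1] in idom:
        path.append(idom[path[-1]])
    return path[::-1]


# Hypothetical diamond-shaped CFG: A branches to B and C, which rejoin at D.
cfg = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
idom = immediate_dominators(dominators(cfg, "A"), "A")
print(dominator_path(idom, "D"))
```

In this example the dominator-tree path to D contains only A and D, because neither B nor C executes on every route to D.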
Instruction Scheduling Using Simulated Annealing
Most nodes of modern massively-parallel computing systems contain processors that use instruction-level parallelism to increase the speed of the individual processor. To achieve the greatest possible speedup, the compiler must perform instruction scheduling so that instructions are presented to the processor in the most efficient order. Instruction scheduling is a compiler problem that, due to its NP-complete nature, requires heuristic solutions for any significant program. One stochastic search technique that shows promise for instruction scheduling is simulated annealing (SA). Simulated annealing can be thought of as modified hill-climbing that includes "occasional perturbations" in the search space under investigation, to avoid getting stuck in a local minimum or maximum. The process gets its name from the fact that it closely follows the physical process of annealing, which is the gradual cooling of a liquid until it solidifies. We implemented an SA-driven instruction scheduler in our compiler for instruction-level parallel (ILP) architectures. This allows us to compare our SA instruction scheduler with a more traditional list scheduling approach. Experimental comparison of 114 data dependence graphs shows an efficiency improvement of more than 6% when using an SA-driven scheduler over results obtainable with the standard list scheduling technique.
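The hill-climbing-with-perturbations idea can be sketched as follows. This is an illustrative toy, not the scheduler described in the abstract: the single-issue machine model, the adjacent-swap move, the cooling parameters, and the four-operation example are all assumptions made for the sketch.

```python
import math
import random


def schedule_length(order, preds, latency):
    """Cycles until the last result is ready on a single-issue machine."""
    issue = {}
    cycle = 0
    for op in order:
        # Stall until every operand of `op` has been produced.
        ready = max((issue[p] + latency[p] for p in preds[op]), default=0)
        cycle = max(cycle, ready)
        issue[op] = cycle
        cycle += 1
    return max(issue[op] + latency[op] for op in order)


def anneal(order, preds, latency, temp=5.0, cooling=0.95, steps=2000, seed=0):
    """Search dependence-respecting orders; `preds` holds direct predecessors."""
    rng = random.Random(seed)
    current = list(order)
    cur_cost = schedule_length(current, preds, latency)
    best, best_cost = list(current), cur_cost
    for _ in range(steps):
        i = rng.randrange(len(current) - 1)
        a, b = current[i], current[i + 1]
        if a in preds[b]:
            continue  # swapping would break a data dependence
        current[i], current[i + 1] = b, a
        new_cost = schedule_length(current, preds, latency)
        delta = new_cost - cur_cost
        # Always accept improvements; accept worse orders with a
        # probability that shrinks as the temperature drops.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            cur_cost = new_cost
            if cur_cost < best_cost:
                best, best_cost = list(current), cur_cost
        else:
            current[i], current[i + 1] = a, b  # undo the swap
        temp = max(temp * cooling, 1e-3)
    return best, best_cost


# Hypothetical example: two 3-cycle loads, each feeding a 1-cycle use.
preds = {"ld1": [], "use1": ["ld1"], "ld2": [], "use2": ["ld2"]}
latency = {"ld1": 3, "use1": 1, "ld2": 3, "use2": 1}
start = ["ld1", "use1", "ld2", "use2"]
best, best_cost = anneal(start, preds, latency)
print(schedule_length(start, preds, latency), best_cost)
```

Swapping only adjacent, directly independent operations keeps every candidate order legal, so the cost function never has to penalize invalid schedules.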
Dominator-Path Scheduling - A Global Scheduling Method
Dominator-path scheduling performs global instruction scheduling of paths in the dominator tree. Unlike other global scheduling methods, dominator-path scheduling does not require copies of operations to preserve program semantics. In a limited test suite for a typical superscalar architecture, dominator-path scheduling produces schedules requiring 8.3% fewer cycles than local scheduling alone. 1 Introduction. Architectures exhibiting instruction-level parallelism (ILP), such as superscalar and superpipelined machines, are currently popular. To best exploit instruction-level parallelism in these machines, an instruction scheduling phase is required during compilation. Instruction scheduling is typically classified as local if it considers code only within a basic block and global if it schedules multiple basic blocks at once. Local scheduling methods are well known (see [Bea91] for one summary). Local instruction scheduling's largest impediment is its inability to consider context from..
Abstract
Frequency-Based List Scheduling (FBLS) extends standard List Scheduling by considering execution frequencies within a schedule. This is useful for global instruction scheduling methods that schedule groups of basic blocks, called meta-blocks, as though they were a single block. Traditional local schedulers operate on the premise that each instruction is executed the same number of times as every other instruction in the "block", an unwarranted assumption for meta-blocks. This assumption can lead meta-block schedulers to produce inefficient code. FBLS provides an answer to this problem by considering the differing execution frequencies within meta-blocks when scheduling operations. To evaluate our contention that FBLS is a useful extension to standard list scheduling, we implemented FBLS and compared it to standard list scheduling within the context of dominator-path scheduling [1], a meta-block global scheduling algorithm. Experimental results show overall run-time improvement of 10.9% for the Livermore Loops.
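A small worked calculation shows why the equal-frequency assumption can mislead a meta-block scheduler. The cost model (cycles spent in each block multiplied by that block's execution count), the block names, and all numbers are hypothetical. They illustrate the problem the abstract describes and are not the FBLS algorithm itself.

```python
def weighted_cycles(cycles_per_block, freq):
    """Total dynamic cycles: each block's length times how often it runs."""
    return sum(freq[block] * cycles
               for block, cycles in cycles_per_block.items())


# Hypothetical meta-block of two basic blocks with different frequencies.
freq = {"B1": 100, "B2": 10}

# Schedule A leaves a long-latency operation in the rarely executed B2.
schedule_a = {"B1": 4, "B2": 6}
# Schedule B hoists it into B1: one cycle shorter overall, but B1 grows.
schedule_b = {"B1": 5, "B2": 4}

for name, sched in (("A", schedule_a), ("B", schedule_b)):
    print(name, sum(sched.values()), weighted_cycles(sched, freq))
```

Schedule B wins on static length (9 cycles against 10) yet costs more dynamic cycles under this model (540 against 460), which is the kind of outcome a frequency-aware priority is meant to avoid.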
Proceedings of the 29th Annual Hawaii International Conference on System Sciences, 1996. Extending List Scheduling to Consider Execution Frequency
Frequency-Based List Scheduling (FBLS) extends standard List Scheduling by considering execution frequencies within a schedule. This is useful for global instruction scheduling methods that schedule groups of basic blocks, called meta-blocks, as though they were a single block. Traditional local schedulers operate on the premise that each instruction is executed the same number of times as every other instruction in the "block", an unwarranted assumption for meta-blocks. This assumption can lead meta-block schedulers to produce inefficient code. FBLS provides an answer to this problem by considering the differing execution frequencies within meta-blocks when scheduling operations. To evaluate our contention that FBLS is a useful extension to standard list scheduling, we implemented FBLS and compared it to standard list scheduling within the context of dominator-path scheduling [1], a meta-block global scheduling algorithm. Experimental results show overall run-time improvement of 10.9% for the Livermore Loops.
Extending List Scheduling to Consider Execution Frequency
Frequency-Based List Scheduling (FBLS) extends standard List Scheduling by considering execution frequencies within a schedule. This is useful for global instruction scheduling methods that schedule groups of basic blocks, called meta-blocks, as though they were a single block. Traditional local schedulers operate on the premise that each instruction is executed the same number of times as every other instruction in the "block", an unwarranted assumption for meta-blocks. This assumption can lead meta-block schedulers to produce inefficient code. FBLS provides an answer to this problem by considering the differing execution frequencies within meta-blocks when scheduling operations. To evaluate our contention that FBLS is a useful extension to standard list scheduling, we implemented FBLS and compared it to standard list scheduling within the context of dominator-path scheduling [1], a meta-block global scheduling algorithm. Experimental results show overall run-time improvement of 10.9% f..