    Register Allocation After Classical SSA Elimination is NP-Complete

    Abstract. Chaitin proved that register allocation is equivalent to graph coloring and hence NP-complete. Recently, Bouchez, Brisk, and Hack have proved independently that the interference graph of a program in static single assignment (SSA) form is chordal and therefore colorable in linear time. Can we use the result of Bouchez et al. to do register allocation in polynomial time by first transforming the program to SSA form, then performing register allocation, and finally doing the classical SSA elimination that replaces φ-functions with copy instructions? In this paper we show that the answer is no, unless P = NP: register allocation after classical SSA elimination is NP-complete. Chaitin’s proof technique does not work for programs after classical SSA elimination; instead we use a reduction from the graph coloring problem for circular arc graphs.
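
The chordality result mentioned in this abstract can be illustrated with a small sketch. It assumes the interference graph is given as an adjacency dictionary (the names `mcs_order`, `color_chordal`, and the toy graph are hypothetical, not taken from the paper). It orders vertices by maximum cardinality search and colors them greedily; on a chordal graph this uses the minimum number of colors. This simple version is quadratic, not the linear-time variant the abstract refers to.

```python
def mcs_order(adj):
    """Maximum cardinality search: repeatedly visit the unvisited
    vertex that has the most already-visited neighbours."""
    weight = {v: 0 for v in adj}
    unvisited = set(adj)
    order = []
    while unvisited:
        v = max(sorted(unvisited), key=lambda u: weight[u])
        order.append(v)
        unvisited.remove(v)
        for w in adj[v]:
            if w in unvisited:
                weight[w] += 1
    return order


def color_chordal(adj):
    """Greedy coloring in MCS order. On a chordal graph the already
    colored neighbours of each vertex form a clique, so the number of
    colors used equals the size of the largest clique."""
    colors = {}
    for v in mcs_order(adj):
        used = {colors[w] for w in adj[v] if w in colors}
        c = 0
        while c in used:
            c += 1
        colors[v] = c
    return colors


# Toy interference graph (assumed for illustration): a, b, c are
# simultaneously live; d interferes only with c.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
colors = color_chordal(adj)
```

Here vertices stand for variables and colors for registers; the paper's point is that this easy coloring step does not survive classical SSA elimination, which the sketch does not model.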

    SSA Elimination after Register Allocation

    Abstract. Compilers such as gcc use static-single-assignment (SSA) form as an intermediate representation and usually perform SSA elimination before register allocation. But the order could as well be the opposite: the recent approach of SSA-based register allocation performs SSA elimination after register allocation. SSA elimination before register allocation is straightforward and standard, while previously described approaches to SSA elimination after register allocation have shortcomings; in particular, they have problems with implementing copies between memory locations. We present spill-free SSA elimination, a simple and efficient algorithm for SSA elimination after register allocation that avoids increasing the number of spilled variables. We also present three optimizations of the core algorithm. Our experiments show that spill-free SSA elimination takes less than five percent of the total compilation time of a JIT compiler. Our optimizations reduce the number of memory accesses by more than 9% and improve the program execution time by more than 1.8%.

    Nearly optimal register allocation with PBQP

    Abstract. For irregular architectures global register allocation remains a challenging problem, and has received a lot of attention in recent years. The classical graph-colouring analogy used by Chaitin and Briggs is not adequate for irregular architectures featuring non-orthogonal instruction sets and irregular register sets. Previous work [1, 2] on register allocation based on partitioned boolean quadratic programming (PBQP) has demonstrated that this approach is effective for highly irregular architectures and small benchmarks. However, experiments have shown that the heuristic used for non-reducible nodes performs poorly for larger benchmarks and more regular architectures. In this paper we present a new heuristic for PBQP, which significantly outperforms the old heuristic, and produces register allocations equal to those of the state-of-the-art graph-colouring approach. We also present a new solver for PBQP which is based on branch-and-bound and is able to solve register allocations optimally. The branch-and-bound solver allows PBQP to be used as a progressive register allocator, where programmers may explicitly trade extra compile time for a better register allocation. Experiments were conducted using the register allocation problems in the SPEC2000 benchmark suite as input, with IA-32 as the target architecture. Using an optimal solver for PBQP we were able to solve 97.4% of the register allocation problems in SPEC2000 optimally.
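
As a rough illustration of the PBQP formulation named in this abstract, the sketch below evaluates a tiny instance by exhaustive search. It is not the paper's heuristic or its branch-and-bound solver; the function name `solve_pbqp`, the cost values, and the three-option encoding (two registers plus a spill option) are assumptions made for the example. Each node has a cost vector over its options, each edge has a cost matrix over option pairs, and an infinite entry forbids two interfering variables from sharing a register.

```python
from itertools import product

INF = float("inf")


def solve_pbqp(node_costs, edge_costs):
    """Exhaustively minimise the sum of node costs plus edge costs.

    node_costs: one cost vector per node, indexed by chosen option.
    edge_costs: dict mapping (i, j) to a matrix indexed by
                [option of node i][option of node j].
    Only practical for very small instances.
    """
    best_cost, best_choice = INF, None
    for choice in product(*(range(len(c)) for c in node_costs)):
        cost = sum(c[k] for c, k in zip(node_costs, choice))
        cost += sum(m[choice[i]][choice[j]]
                    for (i, j), m in edge_costs.items())
        if cost < best_cost:
            best_cost, best_choice = cost, choice
    return best_cost, best_choice


# Options per variable: 0 = register r0, 1 = register r1, 2 = spill.
# Spill costs are made-up numbers for illustration.
node_costs = [[0, 0, 10], [0, 0, 4], [0, 0, 7]]

# Two interfering variables may not share a register; spilling is
# always compatible.
interfere = [[INF, 0, 0],
             [0, INF, 0],
             [0, 0, 0]]
edge_costs = {(0, 1): interfere, (0, 2): interfere, (1, 2): interfere}

cost, choice = solve_pbqp(node_costs, edge_costs)
```

With three mutually interfering variables and only two registers, one variable has to be spilled, and the search picks the one with the cheapest spill cost.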

    Split Register Allocation: Linear Complexity Without the Performance Penalty

    Traditional bytecode language tool chains distribute the roles among offline and online compilers. Verification and code compaction are typically assigned to of…
    inria-00551513