890 research outputs found

    On the Complexity of Spill Everywhere under SSA Form

    Get PDF
    Compilation for embedded processors can be either aggressive (time consuming cross-compilation) or just in time (embedded and usually dynamic). The heuristics used in dynamic compilation are highly constrained by limited resources, time and memory in particular. Recent results on the SSA form open promising directions for the design of new register allocation heuristics for embedded systems and especially for embedded compilation. In particular, heuristics based on tree scan with two separated phases -- one for spilling, then one for coloring/coalescing -- seem good candidates for designing memory-friendly, fast, and competitive register allocators. Still, also because of the side effect on power consumption, the minimization of loads and stores overhead (spilling problem) is an important issue. This paper provides an exhaustive study of the complexity of the ``spill everywhere'' problem in the context of the SSA form. Unfortunately, conversely to our initial hopes, many of the questions we raised lead to NP-completeness results. We identify some polynomial cases but that are impractical in JIT context. Nevertheless, they can give hints to simplify formulations for the design of aggressive allocators.Comment: 10 page

    A Polynomial Spilling Heuristic: Layered Allocation

    Get PDF
    Register allocation is one of the most important, and one of the oldest compiler optimizations. Its purpose is to map temporary variables to either machine registers or main memory locations and explicit load/store instructions. The latter option is referred to as spilling. This paper addresses the minimization of the spill code overhead, one of the di fficult problems in register allocation. We devised a heuristic approach called layered. It is rooted in the recent advances in SSA-based register allocation. As opposed to the conventional incremental spilling approaches, our method incrementally allocates clusters of variables. We describe a new polynomial method, the layered-optimal allocator, and demonstrate its quasi-optimiality on standard benchmarks and on two architectures.L'allocation de registres est l'une des premiÚres et des plus importantes optimisations effectuées par les compilateurs. Elle a pour but d'associer aux variables temporaires du programme des registres de la machine ou des locations mémoires et d'insérer, dans le code, des instructions de load/store explicites, appelées vidage. Dans ce papier, nous nous intéressons à la minimisation des latences mémoires dues au code de vidage, un des problÚmes difficiles en allocation de registres. Nous proposons une approche heuristique d'allocation par couches. Ce travail se base sur les récentes avancées en allocation de registres sous SSA. Contrairement à l'approche conventionnelle de vidage incrémental, notre méthode alloue les variables de maniÚre incrémentale par groupe. Nous comparons notre approche, appelée allocation-optimale par couche, aux méthodes de l'état de l'art à une approche optimale et nous montrons l'allocation-optimale par couche est quasi-optimale sur des benchmarks standard et sur deux architectures différentes

    Survey on Combinatorial Register Allocation and Instruction Scheduling

    Full text link
    Register allocation (mapping variables to processor registers or memory) and instruction scheduling (reordering instructions to increase instruction-level parallelism) are essential tasks for generating efficient assembly code in a compiler. In the last three decades, combinatorial optimization has emerged as an alternative to traditional, heuristic algorithms for these two tasks. Combinatorial optimization approaches can deliver optimal solutions according to a model, can precisely capture trade-offs between conflicting decisions, and are more flexible at the expense of increased compilation time. This paper provides an exhaustive literature review and a classification of combinatorial optimization approaches to register allocation and instruction scheduling, with a focus on the techniques that are most applied in this context: integer programming, constraint programming, partitioned Boolean quadratic programming, and enumeration. Researchers in compilers and combinatorial optimization can benefit from identifying developments, trends, and challenges in the area; compiler practitioners may discern opportunities and grasp the potential benefit of applying combinatorial optimization

    Optimistic chordal coloring: a coalescing heuristic forSSAform programs

    Get PDF
    The interference graph for a procedure in Static Single Assignment (SSA) Form is chordal. Since the k-colorability problem can be solved in polynomial-time for chordal graphs, this result has generated interest in SSA-based heuristics for spilling and coalescing. Since copies can be folded during SSA construction, instances of the coalescing problem under SSA have fewer affinities than traditional methods. This paper presents Optimistic Chordal Coloring (OCC), a coalescing heuristic for chordal graphs. OCC was evaluated on interference graphs from embedded/multimedia benchmarks: in all cases, OCC found the optimal solution, and ran, on average, 2.30× faster than Iterated Register Coalescin

    COMPARISON OF INSTRUCTION SCHEDULING AND REGISTER ALLOCATION FOR MIPS AND HPL-PD ARCHITECTURE FOR EXPLOITATION OF INSTRUCTION LEVEL PARALLELISM

    Get PDF
    The integrated approaches for instruction scheduling and register allocation have been promising area of research for code generation and compiler optimization. In this paper we have proposed an integrated algorithm for instruction scheduling and register allocation and implemented it for compiler optimization in machine description in trimaran infrastructure for exploitation of Instruction level parallelism. Our implementation in trimaran infrastructure shows that our scheduler reduces the number of active live ranges dealt with linear scan allocator. As a result only few spills were needed and the quality of the code generated was improved. For our experiments we used 20 benchmarks available with trimaran infrastructure for HPL-PD architecture. We compare some of these results with results obtained by Haijing Tang et al (2013) performed by LLVM compiler on MIPS architecture. For our experimental work we added machine description (MDES) targeted to HL-PD architecture. The implemented algorithm is based on subgraph isomorphism. The input program is represented in the form of directed acyclic graph (DAG). The vertices of the DAG represent the instructions, input and output operands of the program, while the edges represent dependencies among the instructions
    • 

    corecore