617 research outputs found

    Register Allocation Via Coloring of Chordal Graphs

    Full text link
    Abstract. We present a simple algorithm for register allocation which is competitive with the iterated register coalescing algorithm of George and Appel. We base our algorithm on the observation that 95 % of the methods in the Java 1.5 library have chordal interference graphs when compiled with the JoeQ compiler. A greedy algorithm can optimally color a chordal graph in time linear in the number of edges, and we can eas-ily add powerful heuristics for spilling and coalescing. Our experiments show that the new algorithm produces better results than iterated regis-ter coalescing for settings with few registers and comparable results for settings with many registers.

    Optimistic chordal coloring: a coalescing heuristic forSSAform programs

    Get PDF
    The interference graph for a procedure in Static Single Assignment (SSA) Form is chordal. Since the k-colorability problem can be solved in polynomial-time for chordal graphs, this result has generated interest in SSA-based heuristics for spilling and coalescing. Since copies can be folded during SSA construction, instances of the coalescing problem under SSA have fewer affinities than traditional methods. This paper presents Optimistic Chordal Coloring (OCC), a coalescing heuristic for chordal graphs. OCC was evaluated on interference graphs from embedded/multimedia benchmarks: in all cases, OCC found the optimal solution, and ran, on average, 2.30× faster than Iterated Register Coalescin

    Parallel Copy Elimination on Data Dependence Graphs

    Get PDF
    Register allocation regained much interest in recent years due to the development of decoupled strategies that split the problem into separate phases: spilling, register assignment, and copy elimination. Traditional approaches to copy elimination during register allocation are based on interference graphs and register coalescing. Variables are represented as nodes in a graph, which are coalesced, if they can be assigned the same register. However, decoupled approaches strive to avoid interference graphs and thus often resort to local recoloring. A common assumption of existing coalescing and recoloring approaches is that the original ordering of the instructions in the program is not changed. This work presents an extension of a local recoloring technique called Parallel Copy Motion. We perform code motion on data dependence graphs in order to eliminate useless copies and reorder instructions, while at the same time a valid register assignment is preserved. Our results show that even after traditional register allocation with coalescing our technique is able to eliminate an additional 3% (up to 9%) of the remaining copies and reduce the weighted costs of register copies by up to 25% for the SPECINT 2000 benchmarks. In comparison to Parallel Copy Motion, our technique removes 11% (up to 20%) more copies and up to 39% more of the copy costs

    Parameterized Construction of Program Representations for Sparse Dataflow Analyses

    Get PDF
    Data-flow analyses usually associate information with control flow regions. Informally, if these regions are too small, like a point between two consecutive statements, we call the analysis dense. On the other hand, if these regions include many such points, then we call it sparse. This paper presents a systematic method to build program representations that support sparse analyses. To pave the way to this framework we clarify the bibliography about well-known intermediate program representations. We show that our approach, up to parameter choice, subsumes many of these representations, such as the SSA, SSI and e-SSA forms. In particular, our algorithms are faster, simpler and more frugal than the previous techniques used to construct SSI - Static Single Information - form programs. We produce intermediate representations isomorphic to Choi et al.'s Sparse Evaluation Graphs (SEG) for the family of data-flow problems that can be partitioned per variables. However, contrary to SEGs, we can handle - sparsely - problems that are not in this family

    Parallel Copy Motion

    Get PDF
    International audienceRecent results on the static single assignment (SSA) form open promising directions for the design of new register allocation heuristics for just-in-time (JIT) compilation. In particular, heuris- tics based on tree scans with two decoupled phases, one for spilling, one for splitting/coloring/coalescing, seem good candidates for de- signing memory-friendly, fast, and competitive register allocators. Another class of register allocators, well-suited for JIT compilation, are those based on linear scans. Most of them perform coalesc- ing poorly but also do live-range splitting (mostly on control-flow edges) to avoid spilling. This leads to a large amount of register-to- register copies inside basic blocks but also, implicitly, on critical edges, i.e., edges that flow from a block with several successors to a block with several predecessors. This paper presents a new back-end optimization that we call parallel copy motion. The technique is to move copy instructions in a register-allocated code from a program point, possibly an edge, to another. In contrast with a classical scheduler that must preserve data dependences, our copy motion also permutes register assign- ments so that a copy can "traverse" all instructions of a basic block, except those with conflicting register constraints. Thus, parallel copies can be placed either where the scheduling has some empty slots (for multiple-issues architectures), or where fewer copies are necessary because some variables are dead at this point. Moreover, to the cost of some code compensations (namely, the reverse of the copy), a copy can also be moved out from a critical edge. This pro- vides a simple solution to avoid critical-edge splitting, especially useful when the compiler cannot split it, as it is the case for abnor- mal edges. This compensation technique also enables the schedul- ing/motion of the copy in the successor or predecessor basic block. Experiments with the SPECint benchmarks suite and our own benchmark suite show that we can now apply broadly an SSA-based register allocator: all procedures, even with abnormal edges, can be treated. Simple strategies for moving copies from edges and locally inside basic block show significant average improvements (4% for SPECint and 3% for our suite), with no degradation. It let us believe that the approach is promising, and not only for improving coalescing in fast register allocators

    Studies on register allocation for instruction-level parallel processors

    Get PDF
    制度:新 ; 文部省報告番号:甲2570号 ; 学位の種類:博士(工学) ; 授与年月日:2008/3/15 ; 早大学位記番号:新472

    A New Verified Compiler Backend for CakeML

    Get PDF
    We have developed and mechanically verified a new compiler backend for CakeML. Our new compiler features a sequence of intermediate languages that allows it to incrementally compile away high-level features and enables verification at the right levels of semantic detail. In this way, it resembles mainstream (unverified) compilers for strict functional languages. The compiler supports efficient curried multi-argument functions, configurable data representations, exceptions that unwind the call stack, register allocation, and more. The compiler targets several architectures: x86-64, ARMv6, ARMv8, MIPS-64, and RISC-V. In this paper, we present the overall structure of the compiler, including its 12 intermediate languages, and explain how everything fits together. We focus particularly on the interaction between the verification of the register allocator and the garbage collector, and memory representations. The entire development has been carried out within the HOL4 theorem prover.Engineering and Physical Sciences Research Counci

    Revisiting Out-of-SSA Translation for Correctness, Code Quality, and Efficiency

    Get PDF
    Compared to the previous versions, the only change is correcting an awful typo that made Algorithm 1 wrong. Line 18 is not "if b = loc(pred(b))" but simply "if b = loc(b)".Static single assignment (SSA) form is an intermediate program representation in which many code optimizations can be performed with fast and easy-to-implement algorithms. However, some of these optimizations create situations where the SSA variables arising from the same original variable now have overlapping live ranges. This complicates the translation out of SSA code into standard code. There are three issues to consider: correctness, code quality (elimination of copies), and algorithm efficiency (speed and memory footprint). Briggs et al. proposed patches to correct the initial approach of Cytron et al. A cleaner and more general approach was proposed by Sreedhar et al., along with techniques to reduce the number of generated copies. We propose a new approach based on coalescing and a precise view of interferences, in which correctness and optimizations are separated. Our approach is provably correct and simpler to implement, with no patches or particular cases as in previous solutions, while reducing the number of generated copies. Also, experiments with SPEC CINT2000 show that it is 2x faster and 10x less memory-consuming than the Method~III of Sreedhar et al., which makes it suitable for just-in-time compilation
    corecore