916 research outputs found

    Survey on Combinatorial Register Allocation and Instruction Scheduling

    Register allocation (mapping variables to processor registers or memory) and instruction scheduling (reordering instructions to increase instruction-level parallelism) are essential tasks for generating efficient assembly code in a compiler. In the last three decades, combinatorial optimization has emerged as an alternative to traditional, heuristic algorithms for these two tasks. Combinatorial optimization approaches can deliver optimal solutions according to a model, can precisely capture trade-offs between conflicting decisions, and are more flexible at the expense of increased compilation time. This paper provides an exhaustive literature review and a classification of combinatorial optimization approaches to register allocation and instruction scheduling, with a focus on the techniques that are most applied in this context: integer programming, constraint programming, partitioned Boolean quadratic programming, and enumeration. Researchers in compilers and combinatorial optimization can benefit from identifying developments, trends, and challenges in the area; compiler practitioners may discern opportunities and grasp the potential benefit of applying combinatorial optimization

    Register allocation by graph coloring under full live-range splitting

    Register allocation is often a two-phase approach: spilling of registers to memory, followed by coalescing of registers. Extreme live-range splitting (i.e. live-range splitting after each statement) enables optimal solutions based on ILP, for both spilling and coalescing. However, while the solutions are easily found for spilling, for coalescing they are more elusive. This difficulty stems from the huge size of interference graphs resulting from live-range splitting. This paper focuses on coalescing in the context of extreme live-range splitting. It presents some theoretical properties that give rise to an algorithm for reducing interference graphs. This reduction consists mainly in finding and removing useless splitting points. It is followed by a graph decomposition based on clique separators. The reduction and decomposition are general enough that any coalescing algorithm can be applied afterwards. Our strategy for reducing and decomposing interference graphs preserves the optimality of coalescing. When used together with an optimal coalescing algorithm (e.g. ILP), optimal solutions are much more easily found. The strategy has been tested on a standard benchmark, the optimal coalescing challenge. For this benchmark, the cutting-plane algorithm for optimal coalescing (the only optimal algorithm for coalescing) runs 300 times faster when combined with our strategy. Moreover, we provide all the optimal solutions of the optimal coalescing challenge, including the three instances that were previously unsolved

    Optimistic chordal coloring: a coalescing heuristic for SSA form programs

    The interference graph for a procedure in Static Single Assignment (SSA) Form is chordal. Since the k-colorability problem can be solved in polynomial time for chordal graphs, this result has generated interest in SSA-based heuristics for spilling and coalescing. Since copies can be folded during SSA construction, instances of the coalescing problem under SSA have fewer affinities than traditional methods. This paper presents Optimistic Chordal Coloring (OCC), a coalescing heuristic for chordal graphs. OCC was evaluated on interference graphs from embedded/multimedia benchmarks: in all cases, OCC found the optimal solution, and ran, on average, 2.30× faster than Iterated Register Coalescing
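
The polynomial-time colorability of chordal graphs mentioned in this abstract can be illustrated with a short sketch. It is not the OCC heuristic from the paper; it only shows the standard textbook result that greedy coloring along a maximum cardinality search order uses the minimum number of colors on a chordal graph. The graph `adj` and the function names are hypothetical, chosen for illustration.

```python
def mcs_order(adj):
    """Maximum cardinality search: repeatedly visit the unvisited vertex
    that has the most already-visited neighbours. On a chordal graph the
    reverse of this visit order is a perfect elimination ordering."""
    weight = {v: 0 for v in adj}
    unvisited = set(adj)
    order = []
    while unvisited:
        # sorted() makes tie-breaking deterministic
        v = max(sorted(unvisited), key=lambda u: weight[u])
        order.append(v)
        unvisited.remove(v)
        for w in adj[v]:
            if w in unvisited:
                weight[w] += 1
    return order


def greedy_color(adj, order):
    """Give each vertex the smallest color not used by its colored neighbours."""
    color = {}
    for v in order:
        used = {color[w] for w in adj[v] if w in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color


# Hypothetical chordal interference graph: two triangles sharing edge b-c,
# plus a pendant vertex e. Vertices stand for variables, edges for interference.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c", "e"},
    "e": {"d"},
}

order = mcs_order(adj)
color = greedy_color(adj, order)
registers_needed = len(set(color.values()))
print(order, color, registers_needed)
```

Under these assumptions, the graph is k-colorable exactly when `registers_needed <= k`, which is why a spill-free assignment can be checked without search once the program is in SSA form.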

    Experimental and theoretical study of combustion jet ignition

    A combustion jet ignition system was developed to generate turbulent jets of combustion products containing free radicals and to discharge them as ignition sources into a combustible medium. In order to understand the ignition and the inflammation processes caused by combustion jets, the fluid mechanical properties of turbulent jets with and without combustion were studied theoretically and experimentally. Experiments using a specially designed igniter, with a prechamber to build up and control the stagnation pressure upstream of the orifice, were conducted to investigate the formation processes of turbulent jets of combustion products. The penetration speed of combustion jets was found to be constant initially and then to decrease monotonically as the turbulent jets of combustion products travel closer to the wall. This initial penetration speed of combustion jets is proportional to the initial stagnation pressure upstream of the orifice for the same stoichiometric mixture. Computer simulations using Chorin's Random Vortex Method, implemented with the flame propagation algorithm for the theoretical model of turbulent jets with and without combustion, were performed to study the turbulent jet flow field. In the formation processes of the turbulent jets, the large-scale eddy structure of turbulence, the so-called coherent structure, dominates the entrainment and mixing processes. The large-scale eddy structure of turbulent jets in this study is constructed from a series of vortex pairs, which are organized in the form of a staggered array of vortex clouds generating local recirculation flow patterns

    Parallel Copy Elimination on Data Dependence Graphs

    Register allocation has regained much interest in recent years due to the development of decoupled strategies that split the problem into separate phases: spilling, register assignment, and copy elimination. Traditional approaches to copy elimination during register allocation are based on interference graphs and register coalescing. Variables are represented as nodes in a graph, which are coalesced if they can be assigned the same register. However, decoupled approaches strive to avoid interference graphs and thus often resort to local recoloring. A common assumption of existing coalescing and recoloring approaches is that the original ordering of the instructions in the program is not changed. This work presents an extension of a local recoloring technique called Parallel Copy Motion. We perform code motion on data dependence graphs in order to eliminate useless copies and reorder instructions, while at the same time preserving a valid register assignment. Our results show that even after traditional register allocation with coalescing our technique is able to eliminate an additional 3% (up to 9%) of the remaining copies and reduce the weighted costs of register copies by up to 25% for the SPECINT 2000 benchmarks. In comparison to Parallel Copy Motion, our technique removes 11% (up to 20%) more copies and up to 39% more of the copy costs

    Enhancing the capabilities of LIGO time-frequency plane searches through clustering

    One class of gravitational wave signals LIGO is searching for consists of short duration bursts of unknown waveforms. Potential sources include core collapse supernovae, gamma ray burst progenitors, and mergers of binary black holes or neutron stars. We present a density-based clustering algorithm to improve the performance of time-frequency searches for such gravitational-wave bursts when they are extended in time and/or frequency, and not sufficiently well known to permit matched filtering. We have implemented this algorithm as an extension to the QPipeline, a gravitational-wave data analysis pipeline for the detection of bursts, which currently determines the statistical significance of events based solely on the peak significance observed in minimum uncertainty regions of the time-frequency plane. Density-based clustering improves the performance of such a search by considering the aggregate significance of arbitrarily shaped regions in the time-frequency plane and rejecting the isolated minimum uncertainty features expected from the background detector noise. In this paper, we present test results for simulated signals and demonstrate that density-based clustering improves the performance of the QPipeline for signals extended in time and/or frequency. Comment: 17 pages, 6 figures. Submitted to CQG on Dec 12, 2008; accepted on June 18, 200
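
The idea in this abstract, grouping nearby time-frequency tiles and discarding isolated ones, can be sketched with a generic DBSCAN-style procedure. This is not the QPipeline implementation: the tile layout `(time_bin, freq_bin, significance)`, the parameters `eps` and `min_pts`, and the choice to sum significance per cluster are all assumptions made for illustration.

```python
from math import hypot


def region_query(tiles, i, eps):
    """Indices of all tiles within distance eps of tile i (including i itself)."""
    ti, fi, _ = tiles[i]
    return [j for j, (t, f, _) in enumerate(tiles)
            if hypot(t - ti, f - fi) <= eps]


def dbscan(tiles, eps, min_pts):
    """Label each tile with a cluster id (>= 0) or -1 for isolated noise."""
    labels = [None] * len(tiles)  # None means not yet visited
    cluster = -1
    for i in range(len(tiles)):
        if labels[i] is not None:
            continue
        neighbours = region_query(tiles, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = -1  # too sparse: provisionally noise
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border tile joins, but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = region_query(tiles, j, eps)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # dense tile: keep growing the cluster
    return labels


# Hypothetical tiles: four adjacent weak tiles and one isolated loud tile.
tiles = [(0, 10, 3.0), (1, 10, 2.5), (2, 11, 4.0), (3, 11, 2.0), (50, 90, 6.0)]
labels = dbscan(tiles, eps=1.5, min_pts=2)

# Assumed aggregate: sum of tile significances within each cluster.
aggregate = {}
for (_, _, sig), label in zip(tiles, labels):
    if label >= 0:
        aggregate[label] = aggregate.get(label, 0.0) + sig
print(labels, aggregate)
```

In this toy input the isolated tile has the highest peak significance but is rejected as noise, while the extended region is kept and scored by its combined significance.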