3,743 research outputs found

    Survey on Combinatorial Register Allocation and Instruction Scheduling

    Register allocation (mapping variables to processor registers or memory) and instruction scheduling (reordering instructions to increase instruction-level parallelism) are essential tasks for generating efficient assembly code in a compiler. In the last three decades, combinatorial optimization has emerged as an alternative to traditional, heuristic algorithms for these two tasks. Combinatorial optimization approaches can deliver optimal solutions according to a model, can precisely capture trade-offs between conflicting decisions, and are more flexible, at the expense of increased compilation time. This paper provides an exhaustive literature review and a classification of combinatorial optimization approaches to register allocation and instruction scheduling, with a focus on the techniques that are most applied in this context: integer programming, constraint programming, partitioned Boolean quadratic programming, and enumeration. Researchers in compilers and combinatorial optimization can benefit from identifying developments, trends, and challenges in the area; compiler practitioners may discern opportunities and grasp the potential benefit of applying combinatorial optimization.
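
    As a toy illustration of the combinatorial view (a hypothetical model, not taken from any of the surveyed papers), the Python sketch below enumerates every dependence-respecting ordering of a tiny basic block and keeps the schedule with the shortest makespan on a two-issue machine. Real combinatorial approaches replace this brute-force enumeration with integer programming or constraint programming solvers, but the notion of an optimum "according to a model" is the same.

    # Toy optimal scheduler: enumerate all orderings of a small basic block and
    # keep the one with the shortest makespan on a machine that can issue at
    # most ISSUE_WIDTH instructions per cycle.  Hypothetical example data.
    from itertools import permutations

    ISSUE_WIDTH = 2
    deps = {"a": set(), "b": set(), "c": {"a"}, "d": {"a", "b"}, "e": {"c", "d"}}

    def makespan(order):
        """Finish cycle of the last instruction, or None if the order breaks a dependence."""
        cycle_of = {}                      # instruction -> cycle it is issued in
        cycle, slots = 0, ISSUE_WIDTH
        for instr in order:
            if not deps[instr] <= cycle_of.keys():
                return None                # an operand has not been scheduled yet
            ready = max((cycle_of[d] + 1 for d in deps[instr]), default=0)
            if ready > cycle or slots == 0:
                cycle, slots = max(ready, cycle + 1), ISSUE_WIDTH
            cycle_of[instr] = cycle
            slots -= 1
        return max(cycle_of.values()) + 1

    best = min((p for p in permutations(deps) if makespan(p) is not None), key=makespan)
    print(best, makespan(best))            # an optimal 3-cycle schedule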

    Optimizing construction of scheduled data flow graph for on-line testability

    The objective of this work is to develop a new methodology for behavioural synthesis, using a synthesis flow better suited to the scheduling of independent calculations and to non-concurrent on-line testing. The traditional behavioural synthesis process can be defined as the compilation of an algorithmic specification into an architecture composed of a data path and a controller. This synthesis flow generally involves scheduling, resource allocation, generation of the data path, and controller synthesis. Experiments have shown that optimization begun at high-level synthesis improves the performance of the result, yet current tools offer synthesis optimizations only from the RTL level onwards. This justifies the development of an optimization methodology that takes effect from the behavioural specification and accompanies the synthesis process through its various stages. In this paper we propose the use of algebraic properties (commutativity, associativity, and distributivity) to transform the readable mathematical formulas of an algorithmic specification into formulas that can be evaluated efficiently. This effectively reduces the execution time of the scheduled calculations and increases the possibilities for testability.
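
    As a small, hypothetical example of the kind of algebraic rewriting proposed above (not the authors' tool), the sketch below uses associativity to turn a left-linear chain of additions into a balanced expression tree: the depth of the tree, and hence the number of dependent evaluation steps the scheduler must serialize, drops from n-1 to roughly log2(n).

    # Rebalance a chain of associative additions: ((a+b)+c)+d has depth 3,
    # while the rewritten (a+b)+(c+d) has depth 2, so (a+b) and (c+d) can be
    # evaluated in the same control step.  Hypothetical illustration only.

    def balanced_sum(terms):
        """Nested ('+', left, right) tree of minimal depth over the given terms."""
        if len(terms) == 1:
            return terms[0]
        mid = len(terms) // 2
        return ("+", balanced_sum(terms[:mid]), balanced_sum(terms[mid:]))

    def depth(tree):
        """Number of dependent additions on the longest path of the expression tree."""
        if isinstance(tree, str):
            return 0
        return 1 + max(depth(tree[1]), depth(tree[2]))

    chain = ("+", ("+", ("+", "a", "b"), "c"), "d")   # as written: ((a+b)+c)+d
    tree = balanced_sum(["a", "b", "c", "d"])         # after rewriting: (a+b)+(c+d)
    print(depth(chain), depth(tree))                  # 3 dependent steps vs. 2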

    Coarse-grained reconfigurable array architectures

    Coarse-Grained Reconfigurable Array (CGRA) architectures accelerate the same inner loops that benefit from the high ILP support in VLIW architectures. By executing non-loop code on other cores, however, CGRAs can focus on such loops and execute them more efficiently. This chapter discusses the basic principles of CGRAs and the wide range of design options available to a CGRA designer, covering a large number of existing CGRA designs. The impact of different options on flexibility, performance, and power-efficiency is discussed, as well as the need for compiler support. The ADRES CGRA design template is studied in more detail as a use case to illustrate the need for design space exploration, for compiler support, and for the manual fine-tuning of source code.
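
    To make the loop-mapping idea concrete, here is a deliberately simplified, hypothetical sketch (not the ADRES mapping itself) that places the operations of an inner-loop body acc += a[i] * b[i] onto a 2x2 array of processing elements, one operation per PE per cycle, and checks the result against a scalar reference; routing, register files, and modulo scheduling are ignored.

    # Place each operation of the loop body on one PE in one cycle of the
    # steady state, then print the resulting schedule.  Hypothetical sketch.
    placement = {                        # operation -> (cycle, PE)
        "load a[i]":   (0, "PE(0,0)"),
        "load b[i]":   (0, "PE(0,1)"),   # the two loads run in parallel
        "a[i] * b[i]": (1, "PE(1,0)"),   # consumes both loads
        "acc += prod": (2, "PE(1,1)"),   # consumes the product
    }
    for op, (cycle, pe) in sorted(placement.items(), key=lambda kv: kv[1]):
        print(f"cycle {cycle}: {pe} executes {op}")

    # Scalar reference for the same computation.
    a, b = [1, 2, 3, 4], [5, 6, 7, 8]
    print("acc =", sum(x * y for x, y in zip(a, b)))   # 70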

    Integer linear programming vs. graph-based methods in code generation

    A common characteristic of many applications is that they are aimed at the high-volume consumer market, which is extremely cost-sensitive, yet many of them impose stringent performance demands on the underlying system. Code generation must therefore take into account the restrictions and features of the target architecture while satisfying these performance demands. High-level language compilers are often unable to generate code meeting these requirements. One reason is the phase-coupling problem between instruction scheduling and register allocation: many compilers perform these tasks separately, with each phase ignorant of the requirements of the other. Commonly, each task is accomplished using heuristic methods. As the goals of the two phases often conflict, whichever phase is performed first imposes constraints on the other, sometimes producing inefficient code. Integer linear programming (ILP) provides an integrated approach to the combined instruction scheduling and register allocation problem; this way, optimal solutions can be found, albeit at the cost of high compilation times. In our experiments, we considered the 32-bit DSP ADSP-2106x as the target processor. We examined two different ILP formulations and compared them with conventional approaches, including list scheduling and the critical path method. Moreover, we investigated approximations based on the ILP formulations; this way, compilation time can be reduced considerably while still producing near-optimal results. From the results of our implementation, we conclude that integrating ILP formulations into conventional global algorithms is a promising method for generating high-quality code.
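
    For contrast with the combinatorial formulation, the hypothetical sketch below implements the kind of conventional heuristic the paper uses as a baseline: list scheduling with critical-path priorities, where each instruction's priority is the length of the longest dependence chain starting at it and the highest-priority ready instructions are issued first (toy data, unit latencies, not the paper's ADSP-2106x model).

    # Critical-path list scheduling over a small dependence graph.
    from functools import lru_cache

    ISSUE_WIDTH = 2
    succs = {"a": ["c", "d"], "b": ["d"], "c": ["e"], "d": ["e"], "e": []}
    preds = {i: [p for p, ss in succs.items() if i in ss] for i in succs}

    @lru_cache(maxsize=None)
    def priority(i):
        """Length of the longest dependence chain starting at instruction i."""
        return 1 + max((priority(s) for s in succs[i]), default=0)

    issued, cycle = {}, 0                 # instruction -> cycle it was issued in
    while len(issued) < len(succs):
        ready = [i for i in succs if i not in issued
                 and all(issued.get(p, cycle) < cycle for p in preds[i])]
        for i in sorted(ready, key=priority, reverse=True)[:ISSUE_WIDTH]:
            issued[i] = cycle             # issue the highest-priority ready ops
        cycle += 1
    print(issued, "makespan:", max(issued.values()) + 1)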