15 research outputs found

    The complexity of multidimensional periodic scheduling

    We discuss the computational complexity of the multidimensional periodic scheduling problem. This problem originates from the assignment of periodic tasks to processing units over time and is related to the design of high-performance video signal processors. We present a model of multidimensional periodic operations and introduce the multidimensional periodic scheduling problem. Next, we show that this problem and two related sub-problems are NP-hard. Furthermore, we identify several special cases induced by practical situations, some of which are proven to be polynomially solvable.

    Integer linear programming vs. graph-based methods in code generation

    A common characteristic of many applications is that they are aimed at the high-volume consumer market, which is extremely cost-sensitive. However, many of them impose stringent performance demands on the underlying system. Code generation must therefore take into account the restrictions and features of the target architecture while satisfying these performance demands. High-level language compilers are often unable to generate code meeting these requirements. One reason is the phase coupling problem between instruction scheduling and register allocation. Many compilers perform these tasks separately, with each phase ignorant of the requirements of the other. Commonly, each task is accomplished using heuristic methods. As the goals of the two phases often conflict, whichever phase is performed first imposes constraints on the other, sometimes producing inefficient code. Integer linear programming (ILP) provides an integrated approach to the combined instruction scheduling and register allocation problem. This way, optimal solutions can be found, albeit at the cost of high compilation times. In our experiments, we considered the 32-bit DSP ADSP-2106x as the target processor. We have examined two different ILP formulations and compared them with conventional approaches including list scheduling and the critical path method. Moreover, we have investigated approximations based on the ILP formulations; this way, compilation time can be reduced considerably while still producing near-optimal results. From the results of our implementation, we have concluded that integrating ILP formulations into conventional global algorithms is a promising method for generating high-quality code.

    RAPID EXPLORATION OF COST-PERFORMANCE TRADEOFFS USING DOMINANCE EFFECT DURING DESIGN OF HARDWARE ACCELERATORS

    Modern Very Large Scale Integration (VLSI) designs require a tradeoff between cost efficiency and performance (circuit speed). Furthermore, the Design Space Exploration (DSE) of the cost-performance tradeoffs for multi-objective VLSI designs should itself be fast and efficient. This paper presents a novel accelerated DSE approach for exploring the cost-performance tradeoffs of modular multi-objective (tri-parametric, viz. cost, execution time and power consumption) VLSI hardware accelerators using hierarchical criterion analysis. The final design point is selected after the tradeoffs are explored using the proposed approach. When applied to various benchmarks, the proposed approach yielded significant acceleration of the exploration process compared to existing approaches with multi-parametric objectives.

    Improved force-directed scheduling in high-throughput digital signal processing

    This paper discusses improved force-directed scheduling and its application in the design of high-throughput DSP systems, such as real-time video VLSI circuits. We present a mathematical justification of the technique of force-directed scheduling, introduced by Paulin and Knight (1989), and we show how the algorithm can be used to find cost-effective time assignments and resource allocations, allowing trade-offs between processing units and memories. Furthermore, we present modifications that improve the effectiveness and the efficiency of the algorithm. The significance of the improvements is illustrated by an empirical performance analysis based on a number of problem instances.

    EXECUTION TIME – AREA TRADEOFF IN GA USING RESIDUAL LOAD DECODER: INTEGRATED EXPLORATION OF CHAINING BASED SCHEDULE AND ALLOCATION IN HLS FOR HARDWARE ACCELERATORS

    Design space exploration is an indispensable part of High Level Synthesis (HLS) design of hardware accelerators. This paper presents a novel technique for area-execution time tradeoff using residual load decoding heuristics in genetic algorithms (GA) for integrated design space exploration (DSE) of scheduling and allocation. This approach is also able to resolve issues encountered during DSE of data paths for hardware accelerators, such as the accuracy of the solution found and the total exploration time. The integrated solution found by the proposed approach satisfies the user-specified constraints on hardware area and total execution time (not just latency), while at the same time offering a twofold unified solution of chaining-based schedule and allocation. The cost function proposed in the genetic algorithm approach takes into account the functional units, multiplexers and demultiplexers needed during implementation. The proposed exploration system (ExpSys) was tested on a large number of benchmarks drawn from the literature to assess its efficiency. Results indicate an average improvement in Quality of Results (QoR) of more than 26% when compared to a recent, well-known GA-based exploration method.

    Parallel Algorithms for Force Directed Scheduling of Flattened and Hierarchical Signal Flow Graphs

    Coordinated Science Laboratory was formerly known as Control Systems Laboratory. National Science Foundation (NSF) / MIP-9320854; Semiconductor Research Corporation / SRC 95-DP-109; Advanced Research Projects Agency (ARPA) / DAA-H04-94-G-027

    Optimization with Potts neural networks in high level synthesis



    Power and memory optimization techniques in embedded systems design

    Embedded systems incur tight constraints on power consumption and memory (which impacts size) in addition to other constraints such as weight and cost. This dissertation addresses two key factors in embedded system design, namely minimization of power consumption and of memory requirements. The first part of this dissertation considers the problem of optimizing power consumption (peak power as well as average power) in high-level synthesis (HLS). The second part deals with memory usage optimization, mainly targeting a restricted class of computations expressed as loops accessing large data arrays that arise in scientific computing, such as the coupled cluster and configuration interaction methods in quantum chemistry. First, a mixed-integer linear programming (MILP) formulation is presented for the scheduling problem in HLS using multiple supply voltages in order to optimize peak power as well as average power and energy consumption. For large designs, the MILP formulation may not be suitable; therefore, a two-phase iterative linear programming formulation and a power-resource-saving heuristic are presented to solve this problem. In addition, a new heuristic that uses an adaptation of the well-known force-directed scheduling heuristic is presented for the same problem. Next, this work considers the problem of module selection simultaneously with scheduling for minimizing peak and average power consumption. Then, the problem of power consumption (peak and average) in synchronous sequential designs is addressed. A solution integrating basic retiming and multiple-voltage scheduling (MVS) is proposed and evaluated. A two-stage algorithm, namely power-oriented retiming followed by an MVS technique for peak and/or average power optimization, is presented. Memory optimization is addressed next. Dynamic memory usage optimization during the evaluation of a special class of interdependent large data arrays is considered. Finally, this dissertation develops a novel integer linear programming (ILP) formulation for static memory optimization using the well-known fusion technique, by encoding the legality rules for loop fusion of a special class of loops using logical constraints over binary decision variables and a highly effective approximation of memory usage.