
    Exact Scheduling Strategies based on Bipartite Graph Matching

    Scheduling is one of the central tasks in high-level synthesis. In recent publications a bipartite graph matching formulation has been introduced to prune the search space of schedulers. In this paper, we improve that formulation and introduce two novel aspects related to the way the search space is traversed, namely problem formulation and bottleneck identification. The approach results in a very run-time-efficient branch-and-bound scheduler searching for a correct ordering of operations from which a schedule can be derived in linear time. The results show that the use of these bipartite graph matching strategies leads to the most run-time-efficient exact scheduler to date.
    1. Introduction
    In high-level synthesis (HLS), the data path of a clocked VLSI circuit consisting of arithmetic modules (functional units), registers and interconnection units is synthesized from an algorithmic description represented by a data flow graph (DFG) [McFa90]. One of the central tasks in HLS is sch..
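
The abstract above does not spell out its matching formulation, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes unit-latency operations of a single type, each with an execution interval of cycle steps, and a hypothetical `units` count of identical functional units per cycle step. If no matching assigns every operation to its own (cycle step, unit) slot, no schedule can exist, so that part of the search space can be pruned. The names `max_matching` and `may_be_schedulable` are illustrative.

```python
def max_matching(intervals, units):
    """Size of a maximum matching between operations and (cycle step, unit) slots.

    intervals: {operation: (first_step, last_step)}, both bounds inclusive.
    units: assumed number of identical functional units available per step.
    """
    match = {}  # slot -> operation currently assigned to it

    def try_assign(op, seen):
        # Augmenting-path search: take a free slot, or evict an occupant
        # that can be moved to another slot in its own interval.
        lo, hi = intervals[op]
        for step in range(lo, hi + 1):
            for unit in range(units):
                slot = (step, unit)
                if slot in seen:
                    continue
                seen.add(slot)
                if slot not in match or try_assign(match[slot], seen):
                    match[slot] = op
                    return True
        return False

    return sum(try_assign(op, set()) for op in intervals)


def may_be_schedulable(intervals, units):
    """Necessary (not sufficient) condition: every operation gets its own slot."""
    return max_matching(intervals, units) == len(intervals)
```

Because precedence constraints are ignored here, a successful matching does not guarantee a schedule; only a failed matching is conclusive.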

    Memory Arbitration and Cache Management in Stream-Based Systems

    With the ongoing advancements in VLSI technology, the performance of an embedded system is determined to a large extent by the communication of data and instructions. This results in new methods for on- and off-chip communication and caching schemes. In this paper, we use an arbitration scheme that exploits the characteristics of continuous 'media' streams while minimizing the latency for random (e.g. CPU) memory accesses to background memory. We also introduce a novel caching scheme for a stream-based multiprocessor architecture, to limit as much as possible the amount of on-chip buffering required to guarantee the throughput of the continuous streams. With these two schemes we can build an architecture for media processing with optimal flexibility at run-time while performance guarantees can be determined at compile-time.

    Execution Interval Analysis under Resource Constraints

    Execution intervals are commonly used in high-level synthesis systems to identify the relation between operations and the cycle steps in which they possibly can be scheduled. These intervals are normally based on the ASAP (as soon as possible) and ALAP (as late as possible) values of operations under the assumption of unlimited resources. In this paper a novel and much more accurate execution interval analysis is presented for designs on which resource constraints are imposed. The analysis prunes the search space of schedulers without limiting the solution space and therefore enhances the quality of schedulers. The method is based on a bipartite graph matching formulation and runs in polynomial time. Well-known benchmarks show the positive effects of the approach on scheduling results and run times.
    1. Introduction
    In high-level synthesis a data path consisting of modules, registers, and interconnection units is synthesized from a description represented by a data flow graph (DFG) ..
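
The conventional ASAP/ALAP intervals mentioned above (the unlimited-resource baseline, not the paper's resource-constrained analysis) can be sketched as follows. This assumes unit-latency operations, cycle steps numbered from 0, and a hypothetical `deadline` giving the number of available cycle steps; the function name and input encoding are illustrative.

```python
from collections import defaultdict


def execution_intervals(ops, edges, deadline):
    """Return {op: (asap, alap)} assuming unlimited resources.

    ops: list of operation names; edges: (producer, consumer) pairs of the DFG.
    deadline: assumed number of cycle steps, so the last step is deadline - 1.
    """
    succs, preds = defaultdict(list), defaultdict(list)
    for u, v in edges:
        succs[u].append(v)
        preds[v].append(u)

    # Topological order (Kahn's algorithm); the list grows while it is scanned.
    indeg = {op: len(preds[op]) for op in ops}
    order = [op for op in ops if indeg[op] == 0]
    for u in order:
        for v in succs[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                order.append(v)
    if len(order) != len(ops):
        raise ValueError("data flow graph contains a cycle")

    asap, alap = {}, {}
    for op in order:  # earliest step: one after the latest predecessor
        asap[op] = max((asap[p] + 1 for p in preds[op]), default=0)
    for op in reversed(order):  # latest step: one before the earliest successor
        alap[op] = min((alap[s] - 1 for s in succs[op]), default=deadline - 1)
    return {op: (asap[op], alap[op]) for op in ops}
```

An operation whose ASAP value exceeds its ALAP value signals that the deadline cannot be met even with unlimited resources.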

    Efficient Code Generation for In-House DSP-Cores

    A balance between efficiency and flexibility is obtained by developing a relatively large number of in-house DSP-cores, each for a relatively small application area. These cores are programmed using existing ASIC synthesis tools which are modified for this purpose. The key problem is to model conflicts arising from the instruction set. A class of instruction sets is defined for which conflicts can be modelled statically before scheduling. The approach is illustrated with a real-life example.
    1. Introduction
    Depending on the desired flexibility and on the importance of area (cost) and power dissipation, different options exist for the implementation of signal processing algorithms. At one end of the design space general purpose processors offer flexibility. Many applications can be programmed on the same processor but often at a high cost (area and dissipation). At the other end of the design space ASICs offer cost effective solutions because they are tailored towards a specific applicatio..

    Fast System-Level Area-Delay Curve Prediction

    In this paper a unified approach of lower bound functional area and cycle budget estimations is presented to predict area-delay characteristics of designs at system level. The estimations are mainly based on relaxing precedence constraints in a behavioral design description and are the most accurate estimations reported to date.
    1. Introduction
    In high-level synthesis, a data path consisting of modules (i.e. functional units), registers and interconnection units is synthesized from a behavioral design description [McFa90]. Such a description is represented by a data flow graph (DFG), which is a translation of an algorithmic specification in a hardware description language [Eijnd92]. The ability to predict the area-delay characteristics of designs without actually implementing them is important in producing quality designs in a reasonable time, and is therefore an important part of an (interactive) system design environment [Fleu93]. If a design will be part of a larger system, then..

    Constraint analysis for DSP code generation

    Code generation methods for digital signal processing (DSP) applications are hampered by the combination of tight timing constraints imposed by the performance requirements of DSP algorithms and resource constraints imposed by a hardware architecture. In this paper, we present a method for register binding and instruction scheduling based on the exploitation and analysis of the combination of resource and timing constraints. The analysis identifies implicit sequencing relations between operations in addition to the preceding constraints. Without the explicit modeling of these sequencing constraints, a scheduler is often not capable of finding a solution that satisfies the timing and resource constraints. The presented approach results in an efficient method to obtain high-quality instruction schedules with low register requirement

    Module Selection and Scheduling using Unrestricted Libraries

    Most high-level synthesis schedulers are only capable of mapping an operation to one specific module type. To ensure a full design space exploration, a synthesis system should however select freely from a library containing modules with a large variety in delay, area and so on. A novel module selection and scheduling approach is presented, which allows the full use of such unrestricted libraries. Extensive benchmark results show very fast running times and optimal solutions. Hence this approach clearly illustrates the advantages of synthesis tools which can fully cope with unrestricted libraries, as they lead to designs with less module area.
    1. Introduction
    Many high-level synthesis systems are being developed [McFa90], which synthesize a data path from a description represented by a data flow graph (DFG). In a time constrained design, operations are mapped on hardware modules selected by the system. To ensure a full design space exploration, a system should select freely from a l..

    Heterogeneous multiprocessor for the management of real-time video and graphics streams

    This paper presents an application domain driven approach to the design of embedded systems on silicon, and it shows how this approach is used to design a chip for a multi-window TV application. We discuss all major design steps in a logical order starting with an application domain analysis. This leads to the choice of Kahn data flow graphs as the programming paradigm for high-throughput signal applications. Based on this analysis we designed a multiprocessor architecture which uses a run-time reconfiguration. Finally, attention is directed towards the physical implementation and the deep-submicron problems we had to solve. The result is a chip that can manage up to 25 internal real-time video streams. The chip combines the flexibility of a programmable solution with the cost effectiveness of a consumer produc