    A scheduling algorithm for multiport memory minimization in datapath synthesis

    Abstract: In this paper, we present a new scheduling algorithm that generates area-efficient register-transfer-level datapaths with multiport memories. The proposed algorithm assigns each operation to a specific control step such that maximal sharing of functional units is achieved with a minimal number of memory ports, while satisfying the given constraints. We propose a measure of multiport memory cost based on the MAV (Multiple Access Variable), defined as a variable accessed in several control steps; the overall memory cost is reduced by distributing the MAVs evenly across the control steps. Compared with previous approaches on several benchmarks from the literature, the proposed algorithm generates datapaths with fewer memory modules and interconnection structures by reflecting the memory cost in the scheduling process.
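
    As a rough illustration of the MAV idea (not the paper's algorithm), the sketch below counts, for a hypothetical schedule represented as variable-access sets per control step, how many MAVs any single step touches; a scheduler that balances this count across steps needs fewer memory ports. The schedule representation and the cost itself are assumptions made for illustration.

```python
from collections import defaultdict

def mav_port_pressure(schedule):
    """Worst-case number of MAVs accessed in any one control step.

    `schedule` maps a control step to the set of variables it accesses;
    a Multiple Access Variable (MAV) is one accessed in more than one
    control step. This toy count stands in for memory-port pressure.
    """
    uses = defaultdict(int)
    for variables in schedule.values():
        for v in variables:
            uses[v] += 1
    mavs = {v for v, n in uses.items() if n > 1}
    return max(len(step_vars & mavs) for step_vars in schedule.values())

# 'a' is a MAV (accessed in steps 0 and 1); 'b' and 'c' are single-access.
print(mav_port_pressure({0: {"a", "b"}, 1: {"a", "c"}}))  # -> 1
```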

    Parallelization of Stochastic Evolution for Cell Placement

    VLSI physical design and its related problems, such as placement and channel routing, carry inherent complexities that are best dealt with using iterative heuristics. However, the major drawback of these iterative heuristics has been the large runtime involved in reaching acceptable solutions, especially when optimizing for multiple objectives. Among the acceleration techniques proposed, parallelization is one promising method. Distributed-memory and shared-memory multiprocessor systems have gained considerable research attention in recent years, and parallel computing has attracted both researchers and manufacturers aiming to reduce time to market. Our objective is to exploit the benefits of parallel computing for the time-consuming placement problem in VLSI. Finding the best placement of n modules is a hard problem, so enumerative search techniques, especially brute-force ones, are unacceptable for circuits in which n (the number of modules) is large. Constructive and iterative heuristics therefore play the key role and are frequently used. We use Stochastic Evolution to find near-optimal solutions to the placement problem; the major task is the parallelization of Stochastic Evolution using different techniques and the comparison of the resulting parallel versions based on the results achieved. The parallelization is carried out using MPI (Message Passing Interface) on a distributed-memory multiprocessor system, and the results are expected to show nearly linear speedup as the number of processors increases.
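
    A minimal sketch of what such an MPI parallelization can look like, assuming mpi4py and a toy placement model: each rank runs an independent stochastic search from a different seed and the best result is gathered at the end. The cost function and move below are placeholders, not the thesis's actual wirelength model, move set, or parallelization strategy.

```python
import random
from mpi4py import MPI  # assumes an MPI installation plus mpi4py

def cost(placement):                  # placeholder metric, not real wirelength
    return sum(abs(i - cell) for i, cell in enumerate(placement))

def perturb(placement):               # placeholder move: swap two cells
    p = placement[:]
    i, j = random.sample(range(len(p)), 2)
    p[i], p[j] = p[j], p[i]
    return p

comm = MPI.COMM_WORLD
random.seed(comm.Get_rank())          # diversify each rank's search
best = list(range(32))
random.shuffle(best)
for _ in range(20000):                # independent greedy stochastic search
    cand = perturb(best)
    if cost(cand) < cost(best):
        best = cand
results = comm.gather((cost(best), comm.Get_rank()), root=0)
if comm.Get_rank() == 0:
    best_cost, winner = min(results)
    print(f"best cost {best_cost} found by rank {winner}")
```

    Run with, e.g., `mpiexec -n 4 python stoce_mpi.py`. Note that real Stochastic Evolution also accepts some cost-increasing moves via a gain threshold, which this greedy loop omits.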

    Multiple objective optimisation of data and control paths in a behavioural silicon compiler

    The objective of this research was to implement an 'intelligent' silicon compiler that provides the ability to automatically explore the design space and optimise a design, given as a behavioural description, with respect to multiple objectives. The objective has been met by the implementation of the MOODS Silicon Compiler. The user submits goals or objectives to the system, which automatically finds near-optimal solutions. As objectives may conflict, trade-offs between synthesis tasks are essential, and consequently the tasks must execute simultaneously. Tasks are decomposed into behaviour-preserving transformations which, owing to their completeness, can be applied in any sequence to a multi-level representation of the design. Accurate evaluation of the design is ensured by feeding technology-dependent information up to a cost function. The cost function guides a simulated annealing algorithm in applying transformations to iteratively optimise the design. Simulated annealing provides an abstraction from the transformations and the designer's objectives. This abstraction avoids the construction of tailored heuristics that pre-program trade-offs into a system. Pre-programmed trade-offs, used in most systems, assume a particular shape for the trade-off curve and are inappropriate because trade-offs are technology dependent. The lack of pre-programmed trade-offs allows MOODS to adapt to changes in technology or library cells. The choice of cells and their subsequent sharing are based on the user's criteria expressed in the cost function, rather than being pre-programmed into the system. The results show that implementations created by MOODS are better than or equal to those achieved by other systems. Comparisons with other systems highlighted the importance of specifying all of a design's data, as missing data misrepresents the design and leads to misleading comparisons. The MOODS synthesis system includes an efficient method for automated design space exploration in which a varied set of near-optimal implementations can be produced from a single behavioural specification. Design space exploration is an important aspect of designing by high-level synthesis and of the development of synthesis systems: it gives the designer a perspicuous characterisation of a design's design space, allowing alternative designs to be investigated.
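
    A generic sketch of the optimization style described, assuming nothing about MOODS internals: a simulated annealing loop that is handed a set of behaviour-preserving transformations and a user-supplied weighted cost function, so that the trade-offs live in the cost function rather than being pre-programmed into the optimizer. All names and the toy design space are illustrative assumptions.

```python
import math, random

def anneal(initial, transforms, cost, t0=1.0, alpha=0.95, inner=100):
    """Annealer that knows nothing about the objectives: `transforms`
    are behaviour-preserving moves (design -> design) and `cost` is a
    user-supplied weighted multi-objective function."""
    current = best = initial
    t = t0
    while t > 1e-3:
        for _ in range(inner):
            cand = random.choice(transforms)(current)
            delta = cost(cand) - cost(current)
            if delta < 0 or random.random() < math.exp(-delta / t):
                current = cand
                if cost(current) < cost(best):
                    best = current
        t *= alpha  # geometric cooling schedule
    return best

# Toy design space: a single number standing in for an implementation,
# with conflicting "area" and "delay" objectives blended by weights.
cost = lambda x: 0.7 * abs(x - 3) + 0.3 * abs(x - 7)
moves = [lambda x: x + random.uniform(-1, 1)]
print(round(anneal(0.0, moves, cost), 2))  # settles near x = 3
```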

    Address generator synthesis

    Problems related to the integration of fault tolerant aircraft electronic systems

    Problems related to the design of the hardware for an integrated aircraft electronic system are considered. Taxonomies of concurrent systems are reviewed and a new taxonomy is proposed. An informal methodology intended to identify feasible regions of the taxonomic design space is described, and specific tools are recommended for use in the methodology. Based on the methodology, a preliminary strawman integrated fault-tolerant aircraft electronic system is proposed. Next, problems related to the programming and control of integrated aircraft electronic systems are discussed. Issues of system resource management, including the scheduling and allocation of real-time periodic tasks in a multiprocessor environment, are treated in detail. The role of software design in integrated fault-tolerant aircraft electronic systems is discussed. Conclusions and recommendations for further work are included.
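
    For background on the periodic-task scheduling issues mentioned, the classical Liu and Layland rate-monotonic utilization test is a compact worked example; it is standard scheduling background, not the report's own method.

```python
def rm_schedulable(tasks):
    """Liu & Layland sufficient test for rate-monotonic scheduling of
    periodic tasks on one processor: total utilization <= n(2^(1/n) - 1).
    `tasks` is a list of (compute_time, period) pairs."""
    n = len(tasks)
    utilization = sum(c / t for c, t in tasks)
    return utilization <= n * (2 ** (1 / n) - 1)

# Utilization 0.683 is under the 3-task bound of ~0.780, so schedulable.
print(rm_schedulable([(1, 4), (2, 6), (1, 10)]))  # -> True
```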

    A Framework for Cosynthesis of Memory and Communication Architectures for MPSoC

    MULTI-SCALE SCHEDULING TECHNIQUES FOR SIGNAL PROCESSING SYSTEMS

    A variety of hardware platforms for signal processing has emerged, from distributed systems such as Wireless Sensor Networks (WSNs), to parallel systems such as multicore Programmable Digital Signal Processors (PDSPs), multicore General Purpose Processors (GPPs), and Graphics Processing Units (GPUs), to heterogeneous combinations of parallel and distributed devices. When a signal processing application is implemented on one of these platforms, performance depends critically on the scheduling techniques, which allocate computation and communication resources among competing processing tasks to optimize metrics such as power consumption, throughput, latency, and accuracy. Signal processing systems on such platforms typically involve multiple levels of processing and communication hierarchy: network-level, chip-level, and processor-level in a structural context, and application-level, subsystem-level, component-level, and operation- or instruction-level in a behavioral context. In this thesis, we target scheduling issues that carefully address and integrate scheduling considerations at different levels of these structural and behavioral hierarchies. The core contributions of the thesis are as follows.

    Considering both the network level and the chip level, we propose an adaptive scheduling algorithm for WSNs designed for event detection. Our algorithm exploits discrepancies among the detection accuracies of individual sensors, derived from a collaborative training process, to allow each sensor to operate more energy-efficiently while the network satisfies given constraints on overall detection accuracy.

    Considering the chip level and the processor level, we incorporate both temperature and process variations to develop new scheduling methods for throughput maximization on multicore processors. In particular, we study how to process a large number of threads at high speed without violating a given maximum temperature constraint, targeting multicore processors whose cores may operate at different frequencies and different levels of leakage. We develop speed-selection and thread-assignment schedulers based on the notion of a core's steady-state temperature.

    Considering the application, component, and operation levels, we develop a new dataflow-based design flow within the targeted dataflow interchange format (TDIF) design tool. This multiprocessor system-on-chip (MPSoC)-oriented design flow, called TDIF-PPG, is geared towards analysis and mapping of embedded DSP applications on MPSoCs. An important feature of TDIF-PPG is its ability to integrate graph-level parallelism and actor-level parallelism into the application mapping process: graph-level parallelism is exposed by the dataflow graph application representation in TDIF, and actor-level parallelism is modeled by a novel model for multiprocessor dataflow graph implementation that we call the Parallel Processing Group (PPG) model.

    Building on the contribution above, we formulate a new type of parallel task scheduling problem, Parallel Actor Scheduling (PAS), for chip-level MPSoC mapping of DSP systems represented as synchronous dataflow (SDF) graphs. In contrast to traditional SDF-based scheduling techniques, which focus on exploiting graph-level (inter-actor) parallelism, the PAS problem targets the integrated exploitation of both intra- and inter-actor parallelism for platforms in which individual actors can be parallelized across multiple processing units. We address a special case of the PAS problem in which all of the actors in the DSP application or subsystem being optimized can be parallelized, and for this special case we develop and experimentally evaluate a two-phase scheduling framework with three work flows: particle swarm optimization (PSO) with a mixed integer programming formulation, PSO with a simulated annealing engine, and PSO with a fast heuristic based on list scheduling. We then extend the scheduling framework to the general PAS problem, in which some actors cannot be parallelized.
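
    As an illustration of the list-scheduling flavour of fast heuristic named above (shown standalone here, without the PSO layer the thesis pairs it with), the toy below greedily assigns the longest ready task to the earliest-free processor. The graph representation, execution times, and tie-breaking rule are assumptions for illustration.

```python
import heapq

def list_schedule(tasks, deps, exec_time, n_procs):
    """Classic list scheduling of a task graph onto homogeneous processors.
    tasks: iterable of task ids; deps: {task: set of predecessors};
    exec_time: {task: duration}; returns {task: (processor, start_time)}."""
    ready = [t for t in tasks if not deps.get(t)]
    finish, placed = {}, {}
    procs = [(0.0, p) for p in range(n_procs)]   # (free_at, proc_id)
    heapq.heapify(procs)
    while ready:
        t = max(ready, key=lambda x: exec_time[x])  # longest ready task first
        ready.remove(t)
        free_at, p = heapq.heappop(procs)           # earliest-free processor
        start = max(free_at,
                    max((finish[d] for d in deps.get(t, ())), default=0.0))
        finish[t] = start + exec_time[t]
        placed[t] = (p, start)
        heapq.heappush(procs, (finish[t], p))
        for u in tasks:                             # release newly ready tasks
            if u not in placed and u not in ready \
               and all(d in finish for d in deps.get(u, ())):
                ready.append(u)
    return placed

g = {"a": set(), "b": {"a"}, "c": {"a"}, "d": {"b", "c"}}
print(list_schedule(g, g, {"a": 2, "b": 3, "c": 1, "d": 2}, 2))
```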