60 research outputs found

    Semantics-preserving cosynthesis of cyber-physical systems

    Get PDF

    Ant Colony Heuristic for Mapping and Scheduling Tasks and Communications on Heterogeneous Embedded Systems

    Get PDF
    To exploit the power of modern heterogeneous multiprocessor embedded platforms on partitioned applications, the designer usually needs to efficiently map and schedule all the tasks and the communications of the application, respecting the constraints imposed by the target architecture. Since the problem is heavily constrained, common methods used to explore such design space usually fail, obtaining low-quality solutions. In this paper, we propose an ant colony optimization (ACO) heuristic that, given a model of the target architecture and the application, efficiently executes both scheduling and mapping to optimize the application performance. We compare our approach with several other heuristics, including simulated annealing, tabu search, and genetic algorithms, on the performance to reach the optimum value and on the potential to explore the design space. We show that our approach obtains better results than other heuristics by at least 16% on average, despite an overhead in execution time. Finally, we validate the approach by scheduling and mapping a JPEG encoder on a realistic target architecture

    A Dynamically Constrained Genetic Algorithm For Hardware-software Partitioning

    Get PDF
    In this article, we describe the application of an enhanced genetic algorithm to the problem of hardware-software codesign. Starting from a source code written in a high-level language our algorithm determines, using a dynamically-weighted fitness function, the most interesting code parts of the program to be implemented in hardware, given a limited amount of resources, in order to achieve the greatest overall execution speedup. The novelty of our approach resides in the tremendous reduction of the search space obtained by specific optimizations passes that are conducted on each generation. Moreover, by considering different granularities during the evolution process, very fast and effective convergence (in the order of a few seconds) can thus be attained. The partitioning obtained can then be used to build the different functional units of a processor well suited for a large customization, thanks to its architecture that uses only one instruction, Move

    Minimizing System Modification in an Incremental Design Approach

    Get PDF
    In this paper we present an approach to mapping and scheduling of distributed embedded systems for hard real-time applications, aiming at minimizing the system modification cost. We consider an incremental design process that starts from an already existing system running a set of applications. We are interested to implement new functionality so that the already running applications are disturbed as little as possible and there is a good chance that, later, new functionality can easily be added to the resulted system. The mapping and scheduling problem are considered in the context of a realistic communication model based on a TDMA protocol

    Weighted Round Robin Configuration for Worst-Case Delay Optimization in Network-on-Chip

    Get PDF
    We propose an approach for computing the end-to-end delay bound of individual variable bit-rate flows in a FIFO multiplexer with aggregate scheduling under Weighted Round Robin (WRR) policy. To this end, we use network calculus to derive per-flow end-to-end equivalent service curves employed for computing Least Upper Delay Bounds (LUDBs) of individual flows. Since real time applications are going to meet guaranteed services with lower delay bounds, we optimize weights in WRR policy to minimize LUDBs while satisfying performance constraints. We formulate two constrained delay optimization problems, namely, Minimize-Delay and Multiobjective optimization. Multi-objective optimization has both total delay bounds and their variance as minimization objectives. The proposed optimizations are solved using a genetic algorithm. A Video Object Plane Decoder (VOPD) case study exhibits 15.4% reduction of total worst-case delays and 40.3% reduction on the variance of delays when compared with round robin policy. The optimization algorithm has low run-time complexity, enabling quick exploration of large design spaces. We conclude that an appropriate weight allocation can be a valuable instrument for delay optimization in on-chip network designs

    A CAD Tool for Synthesizing Optimized Variants of Altera\u27s Nios II Soft-Core Processor

    Get PDF
    Soft-core processors offer embedded system designers the benefits of customization, flexibility and reusability. Altera\u27s NIOS II soft-core processor is a popular, commercially available soft-core processor that can be implemented on a variety of Altera FPGAs. In this thesis, the Nios II soft-core processor from Altera Corporation was studied and a VHDL implementation, called UW_Nios II, was developed. UW_Nios II was developed to enable us to perform design space exploration (DSE) for the Nios II processor. It was evaluated and compared with Altera Nios II and shown to be competitive. SCBuild is an existing CAD tool that was developed to enable DSE of soft-core processors. We modified SCBuild to automatically explore the design space of the UW_Nios II using a genetic algorithm. This tool can accurately estimate the area and critical path delay of different variants of the UW_Nios II on a field programmable gate array. Through experiments conducted using SCBuild, it was shown that employing a genetic algorithm to explore the design space of parameterized Nios II core, with a large design space, helps designers find optimized variants of UW_Nios II

    A hybrid genetic algorithm for constrained hardware- software partitioning

    Get PDF
    In this article, we propose a novel partitioning method for hardware-software codesign based on a genetic algorithm that has been enhanced for this specific task. Given a high- level program and an area constraint, our software considers different granularities levels to discover the most interesting blocks to be implemented in ad hoc functional units that can then be used as new instructions in a Move processor. Various optimizations are conducted to obtain a clean, very fast (in the order of a few seconds) and efficient partitioning on programs ranging from a few to several hundreds of lines of code
    • …
    corecore