11 research outputs found

    Memory Bounds for the Distributed Execution of a Hierarchical Synchronous Data-Flow Graph

    Get PDF
    International audienceThis paper presents an application analysis technique to define the boundary of shared memory requirements of Multiprocessor System-on-Chip (MPSoC) in early stages of development. This technique is part of a rapid prototyping process and is based on the analysis of a hierarchical Synchronous Data-Flow (SDF) graph description of the system application. The analysis does not require any knowledge of the system architecture, the mapping or the scheduling of the system application tasks. The initial step of the method consists of applying a set of transformations to the SDF graph so as to reveal its memory characteristics. These transformations produce a weighted graph that represents the different memory objects of the application as well as the memory allocation constraints due to their relationships. The memory boundaries are then derived from this weighted graph using analogous graph theory problems, in particular the Maximum-Weight Clique (MWC) problem. Stateof-the-art algorithms to solve these problems are presented and a heuristic approach is proposed to provide a near-optimal solution of the MWC problem. A performance evaluation of the heuristic approach is presented, and is based on hierarchical SDF graphs of realistic applications. This evaluation shows the efficiency of proposed heuristic approach in finding near optimal solutions

    Memory Bounds for the Distributed Execution of a Hierarchical Synchronous Data-Flow Graph

    Get PDF
    International audienceThis paper presents an application analysis technique to define the boundary of shared memory requirements of Multiprocessor System-on-Chip (MPSoC) in early stages of development. This technique is part of a rapid prototyping process and is based on the analysis of a hierarchical Synchronous Data-Flow (SDF) graph description of the system application. The analysis does not require any knowledge of the system architecture, the mapping or the scheduling of the system application tasks. The initial step of the method consists of applying a set of transformations to the SDF graph so as to reveal its memory characteristics. These transformations produce a weighted graph that represents the different memory objects of the application as well as the memory allocation constraints due to their relationships. The memory boundaries are then derived from this weighted graph using analogous graph theory problems, in particular the Maximum-Weight Clique (MWC) problem. Stateof-the-art algorithms to solve these problems are presented and a heuristic approach is proposed to provide a near-optimal solution of the MWC problem. A performance evaluation of the heuristic approach is presented, and is based on hierarchical SDF graphs of realistic applications. This evaluation shows the efficiency of proposed heuristic approach in finding near optimal solutions

    A study of memory-aware scheduling in message driven parallel programs

    Full text link
    Abstract—This paper presents a simple, but powerful memory-aware scheduling mechanism that adaptively schedules tasks in a message driven distributed-memory parallel program. The scheduler adapts its behavior whenever memory usage exceeds a threshold by scheduling tasks known to reduce memory usage. The usefulness of the scheduler and its low overhead are demonstrated in the context of an LU matrix factorization program. In the LU program, only a single additional line of code is required to make use of the new general-purpose memory-aware scheduling mechanism. Without memory-aware scheduling, the LU program can only run with small problem sizes, but with the new memory-aware scheduling, the program scales to larger problem sizes. I

    Power aware data and memory management for dynamic applications

    Get PDF
    In recent years, the semiconductor industry has turned its focus towards heterogeneous multiprocessor platforms. They are an economically viable solution for coping with the growing setup and manufacturing cost of silicon systems. Furthermore, their inherent flexibility perfectly supports the emerging market of interactive, mobile data and content services. The platform’s performance and energy depend largely on how well the data-dominated services are mapped on the memory subsystem. A crucial aspect thereby is how efficient data is transferred between the different memory layers. Several compilation techniques have been developed to optimally use the available bandwidth. Unfortunately, they do not take the interaction between multiple threads into account and do not deal with the dynamic behaviour of these novel applications. The main limitations of current techniques are outlined and an approach for dealing with them is introduced

    A review on optimization techniques for the deployment and scheduling of distributed real-time systems

    Get PDF
    RESUMEN: En las ultimas tres décadas, se ha realizado un gran número de propuestas sobre la optimización del despliegue y planificación de sistemas de tiempo real distribuidos bajo diferentes enfoques algorítmicos que aportan soluciones aceptables a este problema catalogado como NP-difícil. En la actualidad, la mayor parte de los sistemas utilizados en el sector industrial son sistemas de criticidad mixta en los que se puede usar la planificación cíclica, las prioridades fijas y el particionado, que proporciona aislamiento temporal y espacial a las aplicaciones. Así, en este artículo se realiza una revisión de los trabajos publicados sobre este tema y se presenta un análisis de las diferentes soluciones aportadas para sistemas de tiempo real distribuidos basados en las políticas de planificación que se están usando en la práctica. Como resultado de la comparación, se presenta una tabla a modo de guía en la que se relacionan los trabajos revisados y se caracterizan sus soluciones.ABSTRACT: In the last three decades, a large number of proposals has been carried out for the optimization of the deployment and scheduling of distributed real-time systems under different algorithmic approaches that provide acceptable solutions for this NP-hard problem. Nowadays, most of the systems used in industry are mixed-criticallity systems which use cyclic scheduling, fixed-priority scheduling and partitioning, which provides both temporal and spatial isolation in the execution of applications. Thus, in this work a review of the works published on this topic is performed, as well as an analysis of the different proposed solutions for distributed real-time systems based on the scheduling policies that are used in practice. As a result of the comparison, a table intended as a guide is elaborated in which all the reviewed works are reported and their solutions are characterized.Este trabajo ha sido financiado en parte por el Gobierno de España y los fondos FEDER (AEI /FEDER, UE) en el proyecto TIN2017-86520-C3-3-R (PRECON-I4)

    Power-efficient data management for dynamic applications

    Get PDF
    In recent years, the semiconductor industry has turned its focus towards heterogeneous multi-processor platforms. They are an economically viable solution for coping with the growing setup and manufacturing cost of silicon systems. Furthermore, their inherent flexibility also perfectly supports the emerging market of interactive, mobile data and content services. The platform's performance and energy depend largely on how well the data-dominated services are mapped on the memory subsystem. A crucial aspect thereby is how efficient data is transferred between the different memory layers. Several compilation techniques have been developed to optimally use the available bandwidth. Unfortunately, they do not take the interaction between multiple threads running on the different processors into account, only locally optimize the bandwidth nor deal with the dynamic behavior of these applications. The contributions of this chapter are to outline the main limitations of current techniques and to introduce an approach for dealing with the dynamic multi-threaded of our application domain

    A constructive algorithm for memory-aware task assignment and scheduling

    No full text
    This paper presents a constructive algorithm for memory-aware task assignment and scheduling, which is a part of the prototype system MATAS. The algorithm is well suited for image and video processing applications which have hard memory constraints as well as constraints on cost, execution time, and resource usage. Our algorithm takes into account code and data memory constraints together with the other constraints. It can create pipelined implementations. The algorithm finds a task assignment, a schedule, and data and code memory placement in memory. Infeasible solutions caused by memory fragmentation are avoided. The experiments show that our memory-aware algorithm reduces memory utilization comparing to greedy scheduling algorithm which has time minimization objective. Moreover, memory-aware algorithm is able to find task assignment and schedule when time minimization algorithm fails. MATAS can create pipelined implementations, therefore the design throughput is increased

    A constructive algorithm for memory-aware task assignment and scheduling

    No full text

    A constructive algorithm for memory-aware task assignment and scheduling

    No full text
    This paper presents a constructive algorithm for memory-aware task assignment and scheduling, which is a part of the prototype system MATAS. The algorithm is well suited for image and video processing applications which have hard memory constraints as well as constraints on cost, execution time, and resource usage. Our algorithm takes into account code and data memory constraints together with the other constraints. It can create pipelined implementations. The algorithm finds a task assignment, a schedule, and data and code memory placement in memory. Infeasible solutions caused by memory fragmentation are avoided. The experiments show that our memory-aware algorithm reduces memory utilization compared to greedy scheduling algorithm which has time minimization objective. Moreover, memory-aware algorithm is able to find task assignment and schedule when time minimization algorithm fails. MATAS can create pipelined implementations, therefore the design throughput is increased

    Global Constraint Catalog, 2nd Edition

    Get PDF
    This report presents a catalogue of global constraints where each constraint is explicitly described in terms of graph properties and/or automata and/or first order logical formulae with arithmetic. When available, it also presents some typical usage as well as some pointers to existing filtering algorithms
    corecore