362 research outputs found

    FPGA Implementation of Data Flow Graphs for Digital Signal Processing Applications

    Get PDF
    A rapid growth in digital signal processing applications has increased the requirement for high-speed digital systems. Multiprocessor systems are the best choice for these applications. A prior sequence of operations should be applied to the operations that described the nature of these applications before hardware implementation is produced. These operations should be scheduled and hardware allocated. This paper proposes a new scheduling technique for digital signal processing (DSP) applications has been represented by data flow graphs (DFGs). In addition, hardware allocation is implemented in the form of embedded system. A proposed scheduling technique also achieves the optimal scheduling of a DFG at design time. The optimality criteria considered in this algorithm are the maximum throughput within the available hardware resources. The maximum throughput is achieved by arranging the DFG nodes according to their inter-related data dependencies. Then, two nodes can be clustered into one compound task to reduce the overall execution time by minimizing the number of tasks to be executed that minimizing the number of cycles to execute them. Then each task is presented in form of instruction to be executed in the hardware system. A hardware system is composed of one or multiple homogenous pipelined processing elements and it is designed to meet the maximum-rate schedule.  Two implementations are proposed of the system architecture according to the number of the processing elements, namely:  the serial system and the parallel system. The serial system comprises one processing element where all tasks are processed sequentially, whilst the parallel system has four processing elements to execute tasks concurrently. These systems consist mainly of seven units: central shared memory, state table, multiway function unit buffer, execution array, processing element/s, instruction buffer and the address generation unit. The hardware components were built on an FPGA chip using Verilog HDL. In synthesis results, the parallel system has better system performance by 25.5% than the serial system. While the serial system requires smaller area size, which described by the number of slice registers and the number of the slice lookup tables (LUTs) than the parallel one. The relationship between the number of instructions that are executed in both systems, and the system area and the system performance that presented by system frequency, are studied. By increasing memories size in both systems, the system performance isn’t affected as in a serial system, and it is slightly decreased as the parallel system by 1.5% to 4.5%. In terms of the systems area, both serial system area and parallel system area are increased and in some cases are doubled. The proposed scheduling technique is shown to outperform the retaining technique, which we have chosen to compare with.  The serial system has better performance by 19.3% higher system frequency than a retiming technique. And the parallel system also outperforms the retaining technique by 51.2% higher system frequency in synthesis results

    Applying Real-Time Scheduling Theory to the Synchronous Data Flow Model of Computation

    Get PDF
    Schedulability analysis techniques that are well understood within the real-time scheduling community are applied to the analysis of recurrent real-time workloads that are modeled using the synchronous data-flow graph (SDFG) model. An enhancement to the standard SDFG model is proposed, that permits the specification of a real-time latency constraint between a specified input and a specified output of an SDFG. A technique is derived for transforming such an enhanced SDFG to a collection of traditional 3-parameter sporadic tasks, thereby allowing for the analysis of systems of SDFG tasks using the methods and algorithms that have previously been developed within the real-time scheduling community for the analysis of systems of such sporadic tasks. The applicability of this approach is illustrated by applying prior results from real-time scheduling theory to construct an exact preemptive uniprocessor schedulability test for collections of recurrent processes that are each represented using the enhanced SDFG model

    Proceedings of the first international workshop on Investigating dataflow in embedded computing architectures (IDEA 2015), January 21, 2015, Amsterdam, The Netherlands

    Get PDF
    IDEA '15 held at HiPEAC 2015, Amsterdam, The Netherlands on January 21st, 2015 is the rst workshop on Investigating Data ow in Embedded computing Architectures. This technical report comprises of the proceedings of IDEA '15. Over the years, data ow has been gaining popularity among Embedded Systems researchers around Europe and the world. However, research on data ow is limited to small pockets in dierent communities without a common forum for discussion. The goal of the workshop was to provide a platform to researchers and practitioners to present work on modelling and analysis of present and future high performance embedded computing architectures using data ow. Despite being the rst edition of the workshop, it was very pleasant to see a total of 14 submissions, out of which 6 papers were selected following a thorough reviewing process. All the papers were reviewed by at least 5 reviewers. This workshop could not have become a reality without the help of a Technical Program Committee (TPC). The TPC members not only did the hard work to give helpful reviews in time, but also participated in extensive discussion following the reviewing process, leading to an excellent workshop program and very valuable feedback to authors. Likewise, the Organisation Committee also deserves acknowledgment to make this workshop a successful event. We take this opportunity to thank everyone who contributed in making this workshop a success

    Proceedings of the first international workshop on Investigating dataflow in embedded computing architectures (IDEA 2015), January 21, 2015, Amsterdam, The Netherlands

    Get PDF
    IDEA '15 held at HiPEAC 2015, Amsterdam, The Netherlands on January 21st, 2015 is the rst workshop on Investigating Data ow in Embedded computing Architectures. This technical report comprises of the proceedings of IDEA '15. Over the years, data ow has been gaining popularity among Embedded Systems researchers around Europe and the world. However, research on data ow is limited to small pockets in dierent communities without a common forum for discussion. The goal of the workshop was to provide a platform to researchers and practitioners to present work on modelling and analysis of present and future high performance embedded computing architectures using data ow. Despite being the rst edition of the workshop, it was very pleasant to see a total of 14 submissions, out of which 6 papers were selected following a thorough reviewing process. All the papers were reviewed by at least 5 reviewers. This workshop could not have become a reality without the help of a Technical Program Committee (TPC). The TPC members not only did the hard work to give helpful reviews in time, but also participated in extensive discussion following the reviewing process, leading to an excellent workshop program and very valuable feedback to authors. Likewise, the Organisation Committee also deserves acknowledgment to make this workshop a successful event. We take this opportunity to thank everyone who contributed in making this workshop a success

    REAL-TIME SCHEDULING ON ASYMMETRIC MULTIPROCESSOR PLATFORMS

    Get PDF
    Real-time scheduling analysis is crucial for time-critical systems, in which provable timing guarantees are more important than observed raw performance. Techniques for real-time scheduling analysis initially targeted uniprocessor platforms but have since evolved to encompass multiprocessor platforms. However, work directed at multiprocessors has largely focused on symmetric platforms, in which every processor is identical. Today, it is common for a multiprocessor to include heterogeneous processing elements, as this offers advantages with respect to size, weight, and power (SWaP) limitations. As a result, realizing modern real-time systems on asymmetric multiprocessor platforms is an inevitable trend. Unfortunately, principles and mechanisms regarding real-time scheduling on such platforms are relatively lacking. The goal of this dissertation is to enrich such principles and mechanisms, by bridging existing analysis for symmetric multiprocessor platforms to asymmetric ones and by developing new techniques that are unique for asymmetric multiprocessor platforms. The specific contributions are threefold. First, for a platform consisting of processors that differ with respect to processing speeds only, this dissertation shows that the preemptive global earliest-deadline-first (G-EDF) scheduler is optimal for scheduling soft real-time (SRT) task systems. Furthermore, it shows that semi-partitioned scheduling, which is a hybrid of conventional global and partitioned scheduling approaches, can be applied to optimally schedule both hard real-time (HRT) and SRT task systems. Second, on platforms that consist of processors with different functionalities, tasks that belong to different functionalities may process the same source data consecutively and therefore have producer/consumer relationships among them, which are represented by directed acyclic graphs (DAGs). End-to-end response-time bounds for such DAGs are derived in this dissertation under a G-EDF-based scheduling approach, and it is shown that such bounds can be improved by a linear-programming-based deadline-setting technique. Third, processor virtualization can lead a symmetric physical platform to be asymmetric. In fact, for a designated virtual-platform capacity, there exist an infinite number of allocation schemes for virtual processors and a choice must be made. In this dissertation, a particular asymmetric virtual-processor allocation scheme, called minimum-parallelism (MP) form, is shown to dominate all other schemes including symmetric ones.Doctor of Philosoph

    Hypercube algorithms on mesh connected multicomputers

    Get PDF
    A new methodology named CALMANT (CC-cube Algorithms on Meshes and Tori) for mapping a type of algorithm that we call CC-cube algorithm onto multicomputers with hypercube, mesh, or torus interconnection topology is proposed. This methodology is suitable when the initial problem can be expressed as a set of processes that communicate through a hypercube topology (a CC-cube algorithm). There are many important algorithms that fit into the CC-cube type. CALMANT is based on three different techniques: (a) the standard embedding to assign the processes of the algorithm to the nodes of the mesh multicomputer; (b) the communication pipelining technique to increase the level of communication parallelism inherent in the CC-cube algorithms; and (c) optimal message-scheduling algorithms proposed in this work in order to avoid conflicts and minimizing in this way the communication time. Although CALMANT is proposed for multicomputers with different interconnection network topologies, the paper only focuses on the particular case of meshes.Peer ReviewedPostprint (published version

    Ordonnancement hybride des applications flots de données sur des systèmes embarqués multi-coeurs

    Get PDF
    Les systèmes embarqués sont de plus en plus présents dans l'industrie comme dans la vie quotidienne. Une grande partie de ces systèmes comprend des applications effectuant du traitement intensif des données: elles utilisent de nombreux filtres numériques, où les opérations sur les données sont répétitives et ont un contrôle limité. Les graphes "flots de données", grâce à leur déterminisme fonctionnel inhérent, sont très répandus pour modéliser les systèmes embarqués connus sous le nom de "data-driven". L'ordonnancement statique et périodique des graphes flot de données a été largement étudié, surtout pour deux modèles particuliers: SDF et CSDF. Dans cette thèse, on s'intéresse plus particulièrement à l'ordonnancement périodique des graphes CSDF. Le problème consiste à identifier des séquences périodiques infinies d'actionnement des acteurs qui aboutissent à des exécutions complètes à buffers bornés. L'objectif est de pouvoir aborder ce problème sous des angles différents : maximisation de débit, minimisation de la latence et minimisation de la capacité des buffers. La plupart des travaux existants proposent des solutions pour l'optimisation du débit et négligent le problème d'optimisation de la latence et propose même dans certains cas des ordonnancements qui ont un impact négatif sur elle afin de conserver les propriétés de périodicité. On propose dans cette thèse un ordonnancement hybride, nommé Self-Timed Périodique (STP), qui peut conserver les propriétés d'un ordonnancement périodique et à la fois améliorer considérablement sa performance en terme de latence.One of the most important aspects of parallel computing is its close relation to the underlying hardware and programming models. In this PhD thesis, we take dataflow as the basic model of computation, as it fits the streaming application domain. Cyclo-Static Dataflow (CSDF) is particularly interesting because this variant is one of the most expressive dataflow models while still being analyzable at design time. Describing the system at higher levels of abstraction is not sufficient, e.g. dataflow have no direct means to optimize communication channels generally based on shared buffers. Therefore, we need to link the dataflow MoCs used for performance analysis of the programs, the real time task models used for timing analysis and the low-level model used to derive communication times. This thesis proposes a design flow that meets these challenges, while enabling features such as temporal isolation and taking into account other challenges such as predictability and ease of validation. To this end, we propose a new scheduling policy noted Self-Timed Periodic (STP), which is an execution model combining Self-Timed Scheduling (STS) with periodic scheduling. In STP scheduling, actors are no longer strictly periodic but self-timed assigned to periodic levels: the period of each actor under periodic scheduling is replaced by its worst-case execution time. Then, STP retains some of the performance and flexibility of self-timed schedule, in which execution times of actors need only be estimates, and at the same time makes use of the fact that with a periodic schedule we can derive a tight estimation of the required performance metrics

    Dataflow Analysis for Multiprocessor Systems with Non-Starvation-Free Schedulers

    Get PDF
    Dataflow analysis techniques are suitable for the temporal analysis of real-time stream processing applications. However, the applicability of these models is currently limited to systems with starvation-free schedulers, such as Time-Division Multiplexing (TDM) schedulers. Removal of this limitation would broaden the application domain of dataflow analysis techniques significantly. In this paper we present a temporal analysis technique for Homogeneous Synchronous Dataflow (HSDF) graphs, that is also applicable for systems with non-starvation-free schedulers. Unlike existing dataflow analysis techniques, the proposed analysis technique makes use of an enabling-jitter characterization and iterative fixed-point computation. The presented approach is applicable for arbitrary (cyclic) graph topologies. Buffer capacity constraints are taken into account during the analysis and sufficient buffer capacities can be determined afterwards. The approach presented in this paper is the first approach that considers non-starvation-free schedulers in combination with arbitrary HSDF graphs. The proposed dataflow analysis technique is implemented in a tool. This tool is used to evaluate the analysis technique using examples that illustrate some important differences with other temporal analysis methods. The case-study discusses how the method presented in this paper can be used to solve a problem with the inaccuracy of the temporal analysis results of a real-time stream processing system. This stream processing system consists of an FM receiver together with a DAB receiver application which both share a Digital Signal Processor (DSP)
    • …
    corecore