24 research outputs found

    System-level design of energy-efficient sensor-based human activity recognition systems: a model-based approach

    Get PDF
    This thesis contributes an evaluation of state-of-the-art dataflow models of computation regarding their suitability for a model-based design and analysis of human activity recognition systems, in terms of expressiveness and analyzability, as well as model accuracy. Different aspects of state-of-the-art human activity recognition systems have been modeled and analyzed. Based on existing methods, novel analysis approaches have been developed to acquire extra-functional properties like processor utilization, data communication rates, and finally energy consumption of the system

    PRUNE: Dynamic and Decidable Dataflow for Signal Processing on Heterogeneous Platforms

    Get PDF
    The majority of contemporary mobile devices and personal computers are based on heterogeneous computing platforms that consist of a number of CPU cores and one or more Graphics Processing Units (GPUs). Despite the high volume of these devices, there are few existing programming frameworks that target full and simultaneous utilization of all CPU and GPU devices of the platform. This article presents a dataflow-flavored Model of Computation (MoC) that has been developed for deploying signal processing applications to heterogeneous platforms. The presented MoC is dynamic and allows describing applications with data dependent run-time behavior. On top of the MoC, formal design rules are presented that enable application descriptions to be simultaneously dynamic and decidable. Decidability guarantees compile-time application analyzability for deadlock freedom and bounded memory. The presented MoC and the design rules are realized in a novel Open Source programming environment "PRUNE" and demonstrated with representative application examples from the domains of image processing, computer vision and wireless communications. Experimental results show that the proposed approach outperforms the state-of-the-art in analyzability, flexibility and performance.Comment: This is the author's version of an article that has been published in this journal. Changes were made to this version by the publisher prior to publicatio

    Timing analysis of synchronous data flow graphs

    Get PDF
    Consumer electronic systems are getting more and more complex. Consequently, their design is getting more complicated. Typical systems built today are made of different subsystems that work in parallel in order to meet the functional re- quirements of the demanded applications. The types of applications running on such systems usually have inherent timing constraints which should be realized by the system. The analysis of timing guarantees for parallel systems is not a straightforward task. One important category of applications in consumer electronic devices are multimedia applications such as an MP3 player and an MPEG decoder/encoder. Predictable design is the prominent way of simultaneously managing the design complexity of these systems and providing timing guarantees. Timing guarantees cannot be obtained without using analyzable models of computation. Data flow models proved to be a suitable means for modeling and analysis of multimedia applications. Synchronous Data Flow Graphs (SDFGs) is a data flow model of computation that is traditionally used in the domain of Digital Signal Processing (DSP) platforms. Owing to the structural similarity between DSP and multimedia applications, SDFGs are suitable for modeling multimedia applications as well. Besides, various performance metrics can be analyzed using SDFGs. In fact, the combination of expressivity and analysis potential makes SDFGs very interesting in the domain of multimedia applications. This thesis contributes to SDFG analysis. We propose necessary and sufficient conditions to analyze the integrity of SDFGs and we provide techniques to capture prominent performance metrics, namely, throughput and latency. These perfor- mance metrics together with the mentioned sanity checks (conditions) build an appropriate basis for the analysis of the timing behavior of modeled applications. An SDFG is a graph with actors as vertices and channels as edges. Actors represent basic parts of an application which need to be executed. Channels represent data dependencies between actors. Streaming applications essentially continue their execution indefinitely. Therefore, one of the key properties of an SDFG which models such an application is liveness, i.e., whether all actors can run infinitely often. For example, one is usually not interested in a system which completely or partially deadlocks. Another elementary requirement known as boundedness, is whether an implementation of an SDFG is feasible using a lim- ited amount of memory. Necessary and sufficient conditions for liveness and the different types of boundedness are given, as well as algorithms for checking those conditions. Throughput analysis of SDFGs is an important step for verifying throughput requirements of concurrent real-time applications, for instance within design-space exploration activities. In fact, the main reason that SDFGs are used for mod- eling multimedia applications is analysis of the worst-case throughput, as it is essential for providing timing guarantees. Analysis of SDFGs can be hard, since the worst-case complexity of analysis algorithms is often high. This is also true for throughput analysis. In particular, many algorithms involve a conversion to another kind of data flow graph, namely, a homogenous data flow graph, whose size can be exponentially larger than the size of the original graph and in practice often is much larger. The thesis presents a method for throughput analysis of SD- FGs which is based on explicit state-space exploration, avoiding the mentioned conversion. The method, despite its worst-case complexity, works well in practice, while existing methods often fail. Since the state-space exploration method is akin to the simulation of the graph, the result can be easily obtained as a byproduct in existing simulation tools. In various contexts, such as design-space exploration or run-time reconfigu- ration, many throughput computations are required for varying actor execution times. The computations need to be fast because typically very limited resources or time can be dedicated to the analysis. In this thesis, we present methods to compute throughput of an SDFG where execution times of actors can be param- eters. As a result, the throughput of these graphs is obtained in the form of a function of these parameters. Calculation of throughput for different actor exe- cution times is then merely an evaluation of this function for specific parameter values, which is much faster than the standard throughput analysis. Although throughput is a very useful performance indicator for concurrent real-time applications, another important metric is latency. Especially for appli- cations such as video conferencing, telephony and games, latency beyond a certain limit cannot be tolerated. The final contribution of this thesis is an algorithm to determine the minimal achievable latency, providing an execution scheme for executing an SDFG with this latency. In addition, a heuristic is proposed for optimizing latency under a throughput constraint. This heuristic gives optimal latency and throughput results in most cases

    Worst-case temporal analysis of real-time dynamic streaming applications

    Get PDF

    A Compositional Approach to Embedded System Design

    Get PDF
    An important observable trend in embedded system design is the growing system complexity. Besides the sheer increase of functionality, the growing complexity has another dimension which is the resulting heterogeneity with respect to the different functions and components of an embedded system. This means that functions from different application domains are tightly coupled in a single embedded system. It is established industry practice that specialized specification languages and design environments are used in each application domain. The resulting heterogeneity of the specification is increased even further by reused components (legacy code, IP). Since there is little hope that a single suitable language will replace this heterogeneous set of languages, multi-language design is becoming increasingly important for complex embedded systems. The key problems in the context of multi-language design are the safe integration of the differently specified subsystems and the optimized implementation of the whole system. Both require the reliable validation of the system function as well as of the non-functional system properties. Current cosimulation-based approaches are well suited for functional validation and debugging. However, these approaches are less powerful for the validation of non-functional system properties. In this dissertation, a novel compositional approach to embedded system design is presented which augments existing cosimulation-based design flows with formal analysis capabilities regarding non-functional system properties such as timing or power consumption. Starting from a truly multi-language specification, the system is transformed into an abstract internal design representation which serves as basis for system-wide analysis and optimization.Ein wesentlicher Trend im Entwurf eingebetteter Systeme ist die steigende Komplexität der zu entwerfenden Systeme. Neben der zunehmenden Funktionalität hat die steigende Komplexität eine weitere Dimension: die resultierende Heterogenität bezüglich der verschiedenen Funktionen und Komponenten eines eingebetteten Systems. Dies bedeutet, daß Funktionen aus verschiedenen Anwendungsbereichen in einem einzelnen System eng miteinander kooperieren. Es ist in der industriellen Praxis etabliert, daß in jedem Anwendungsbereich spezialisierte Spezifikationssprachen zum Einsatz kommen. Da wenig Hoffnung besteht, daß eine einzige geeignete Sprache diesen heterogenen Mix von Sprachen ersetzen wird, gewinnt der mehrsprachige Entwurf für komplexe eingebettete Systeme an Bedeutung. Die Hauptprobleme im Bereich des mehrsprachigen Entwurfs sind die sichere Integration der verschieden spezifizierten Teilsysteme und die optimierte Implementierung des gesamten Systems. Beide Probleme verlangen eine zuverlässige Validierung der Systemfunktion sowie der nichtfunktionalen Systemeigenschaften. Heutige cosimulationsbasierte Ansätze aus Forschung und Industrie sind gut geeignet für die funktionale Validierung und Fehlersuche, haben aber Schwächen bei der Validierung nichtfunktionaler Systemeigenschaften. In der vorliegenden Arbeit wird ein neuartiger kompositionaler Ansatz für den Entwurf eingebetteter Systeme vorgestellt, der existierende cosimulationsbasierte Entwurfsflüsse um Fähigkeiten zur Analyse nichtfunktionaler Systemeigenschaften ergänzt. Ausgehend von einer mehrsprachigen Spezifikation, wird das System in eine abstrakte homogene interne Darstellung transformiert, die als Grundlage für die systemweite Analyse und Optimierung dient

    A fine-grained parallel dataflow-inspired architecture for streaming applications

    Get PDF
    Data driven streaming applications are quite common in modern multimedia and wireless applications, like for example video and audio processing. The main components of these applications are Digital Signal Processing (DSP) algorithms. These algorithms are not extremely complex in terms of their structure and the operations that make up the algorithms are fairly simple (usually binary mathematical operations like addition and multiplication). What makes it challenging to implement and execute these algorithms efficiently is their large degree of fine-grained parallelism and the required throughput. DSP algorithms can usually be described as dataflow graphs with nodes corresponding to operations and edges between the nodes expressing data dependencies. A node fires, i.e. executes, as soon as all required input data has arrived at its input edge(s). \ud \ud To execute DSP algorithms efficiently while maintaining flexibility, coarse-grained reconfigurable arrays (CGRAs) can be used. CGRAs are composed of a set of small, reconfigurable cores, interconnected in e.g. a two dimensional array. Each core by itself is not very powerful, yet the complete array of cores forms an efficient architecture with a high throughput due to its ability to efficiently execute operations in parallel. \ud \ud In this thesis, we present a CGRA targeted at data driven streaming DSP applications that contain a large degree of fine grained parallelism, such as matrix manipulations or filter algorithms. Along with the architecture, also a programming language is presented that can directly describe DSP applications as dataflow graphs which are then automatically mapped and executed on the architecture. In contrast to previously published work on CGRAs, the guiding principle and inspiration for the presented CGRA and its corresponding programming paradigm is the dataflow principle. \ud \ud The result of this work is a completely integrated framework targeted at streaming DSP algorithms, consisting of a CGRA, a programming language and a compiler. The complete system is based on dataflow principles. We conclude that by using an architecture that is based on dataflow principles and a corresponding programming paradigm that can directly express dataflow graphs, DSP algorithms can be implemented in a very intuitive and straightforward manner

    Software tools for the rapid development of signal processing and communications systems on configurable platforms

    Get PDF
    Programmers and engineers in the domains of high performance computing (HPC) and electronic system design have a shared goal: to define a structure for coordination and communication between nodes in a highly parallel network of processing tasks. Practitioners in both of these fields have recently encountered additional constraints that motivate the use of multiple types of processing device in a hybrid or heterogeneous platform, but constructing a working "program" to be executed on such an architecture is very time-consuming with current domain-specific design methodologies. In the field of HPC, research has proposed solutions involving the use of alternative computational devices such as FPGAs (field-programmable gate arrays), since these devices can exhibit much greater performance per unit of power consumption. The appeal of integrating these devices into traditional microprocessor-based systems is mitigated, however, by the greater difficulty in constructing a system for the resulting hybrid platform. In the field of electronic system design, a similar problem of integration exists. Many of the highly parallel FPGA-based systems that Xilinx and its customers produce for applications such as telecommunications and video processing require the additional use of one or more microprocessors, but coordinating the interactions between existing FPGA cores and software running on the microprocessors is difficult. The aim of my project is to improve the design flow for hybrid systems by proposing, firstly, an abstract representation of these systems and their components which captures in metadata their different models of computation and communication; secondly, novel design checking, exploration and optimisation techniques based around this metadata; and finally, a novel design methodology in which component and system metadata is used to generate software simulation models. The effectiveness of this approach will be evaluated through the implementation of two physical-layer telecommunications system models that meet the requirements of the 3GPP "LTE" standard, which is commercially relevant to Xilinx and many other organisations
    corecore