34 research outputs found
Dynamic Behavior Specification and Design Space Exploration Techniques for Real-Time Embedded Systems
Thesis (Ph.D.) -- Seoul National University Graduate School: Department of Electrical and Computer Engineering, August 2016. Soonhoi Ha.
As the number of processors on a chip increases and more functions are integrated, the system status changes dynamically due to factors such as workload variation, QoS requirements, and unexpected component failure. At the same time, the computational complexity of user applications is steadily increasing; video and graphics applications are two major driving forces in smart mobile devices, which define the main application domain of interest in this dissertation. A systematic design methodology is therefore required to implement such complex systems, which combine dynamically changing behavior with computation-intensive workloads that can be parallelized.
A model-based approach is one of the representative approaches to parallel embedded software development. In particular, the HOPES framework has been proposed as a design environment for parallel embedded software that supports all design steps: system specification, performance estimation, design space exploration, and automatic code generation. Unlike other design environments, it introduces a novel programming-platform concept called CIC (Common Intermediate Code), which can be understood as a generic execution model of heterogeneous multiprocessor architectures. The CIC task model is based on a process network model, but it can be refined to the SDF (Synchronous Data Flow) model, which has very desirable features for static analyzability as well as parallel processing. However, the SDF model has limited expressive power, especially for system-level specification and for the dynamically changing behavior of an application.
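The static analyzability attributed to SDF above rests on its balance equations: for every edge, the tokens produced per schedule iteration must equal the tokens consumed. The following is a minimal Python sketch of that standard consistency check, not code from the HOPES framework; the function name, the edge tuple layout, and the three-actor example are assumptions made for illustration.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def repetition_vector(actors, edges):
    """Solve the SDF balance equations for a connected graph.

    edges is a list of (src, dst, produced, consumed) tuples (assumed layout).
    Returns the smallest positive integer firing count per actor, or raises
    ValueError if the graph is disconnected or the rates are inconsistent.
    """
    rate = {actors[0]: Fraction(1)}
    changed = True
    while changed:  # propagate relative firing rates along the edges
        changed = False
        for src, dst, produced, consumed in edges:
            if src in rate and dst not in rate:
                rate[dst] = rate[src] * produced / consumed
                changed = True
            elif dst in rate and src not in rate:
                rate[src] = rate[dst] * consumed / produced
                changed = True
    if len(rate) != len(actors):
        raise ValueError("graph is not connected")
    for src, dst, produced, consumed in edges:
        if rate[src] * produced != rate[dst] * consumed:
            raise ValueError("inconsistent rates: no bounded-memory schedule")
    # Scale the fractional rates to the smallest integer solution.
    scale = reduce(lambda a, b: a * b // gcd(a, b),
                   (r.denominator for r in rate.values()))
    counts = {a: int(r * scale) for a, r in rate.items()}
    common = reduce(gcd, counts.values())
    return {a: n // common for a, n in counts.items()}


# Hypothetical pipeline: A produces 2 tokens, B consumes 3; B produces 1, C consumes 2.
example = repetition_vector(["A", "B", "C"],
                            [("A", "B", 2, 3), ("B", "C", 1, 2)])
```

In this example A fires three times, B twice, and C once per iteration, so every edge returns to its initial token count. A graph for which no such vector exists cannot be scheduled statically with bounded buffers.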
To overcome this weakness, this dissertation proposes an extended CIC task model based on dataflow and FSM models that specifies the dynamic behavior of the system while distinguishing inter- and intra-application dynamism. At the top level, each application is specified as a dataflow task, and the dynamic behavior is modeled as a control task that supervises the execution of applications. Inside a dataflow task, dynamic behavior is specified in a manner similar to FSM-based SADF: an SDF task may have multiple behaviors, and a tabular specification of an FSM, called the MTM (Mode Transition Machine), describes the mode transition rules for the SDF graph. We call this the MTM-SDF model, which is classified as a multi-mode dataflow model in the dissertation. It assumes that an application has a finite number of behaviors (or modes) and that each behavior (mode) is represented by an SDF graph. This enables compile-time scheduling of each graph to maximize the throughput for varying numbers of allocated processors, and storage of the resulting scheduling information.
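The tabular mode-transition idea described above can be sketched as a small table-driven state machine that selects which SDF graph is active. This is a simplified illustration only: the abstract does not give the MTM format, so the class names, the event-only transition rule, and the "playback"/"conference" modes below are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:
    source: str  # mode in which the rule applies
    event: str   # hypothetical trigger name
    target: str  # mode entered when the rule fires


class ModeTransitionMachine:
    """Table-driven FSM that selects the active mode of a multi-mode task."""

    def __init__(self, initial, transitions, graphs):
        modes = ({initial}
                 | {t.source for t in transitions}
                 | {t.target for t in transitions})
        missing = modes - set(graphs)
        if missing:
            raise ValueError(f"modes without an SDF graph: {sorted(missing)}")
        self.mode = initial
        self.graphs = graphs  # mode name -> actors of that mode's SDF graph
        self._table = {(t.source, t.event): t.target for t in transitions}

    def fire(self, event):
        """Apply one transition rule; events with no matching rule are ignored."""
        self.mode = self._table.get((self.mode, event), self.mode)
        return self.mode

    def active_graph(self):
        return self.graphs[self.mode]


# Hypothetical two-mode multimedia terminal.
mtm = ModeTransitionMachine(
    initial="playback",
    transitions=[Transition("playback", "call_start", "conference"),
                 Transition("conference", "call_end", "playback")],
    graphs={"playback": ["reader", "video_dec", "display"],
            "conference": ["camera", "video_enc", "video_dec", "display"]},
)
```

Because the set of modes is finite and each mode maps to one SDF graph, every graph can be scheduled at compile time and the stored schedule looked up when `fire` changes the mode.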
Also, a multiprocessor scheduling technique is proposed for a multi-mode dataflow graph. While several scheduling techniques exist for multi-mode dataflow models, none allows task migration between modes. Observing that the resource requirement can be further reduced if task migration is allowed, we propose a multiprocessor scheduling technique for a multi-mode dataflow graph that considers task migration between modes. Based on a genetic algorithm, the proposed technique schedules all SDF graphs in all modes simultaneously to minimize the resource requirement. To satisfy the throughput constraint, the proposed technique calculates the actual throughput requirement of each mode and the output buffer size needed to tolerate throughput jitter.
Given the specified task graph and scheduling results, the CIC translator generates parallelized code for the target architecture. The CIC translator is therefore extended to support the extended features of the CIC task model. At the application level, it is extended to support multiprocessor code generation for an MTM-SDF graph according to the given static scheduling results. Multiprocessor code generation for four different scheduling policies is also supported for an MTM-SDF graph: fully-static, self-timed, static-assignment, and fully-dynamic. At the system level, the CIC translator is extended to generate code that implements the system request APIs, together with data structures for the static scheduling results and configurable task parameters.
Through preliminary experiments with a multi-mode multimedia terminal example, the viability of the proposed methodology is verified.
Chapter 1 Introduction
1.1 Motivation
1.2 Contribution
1.3 Dissertation organization
Chapter 2 Background
2.1 Related work
2.1.1 Compiler-based approach
2.1.2 Language-based approach
2.1.3 Model-based approach
2.2 HOPES framework
2.3 Common Intermediate Code (CIC) Model
Chapter 3 Dynamic Behavior Specification
3.1 Problem definition
3.1.1 System-level dynamic behavior
3.1.2 Application-level dynamic behavior
3.2 Related work
3.3 Motivational example
3.4 Control task specification for system-level dynamism
3.4.1 Internal specification
3.4.2 Action scripts
3.5 MTM-SDF specification for application-level dynamism
3.5.1 MTM specification
3.5.2 Task graph specification
3.5.3 Execution semantics of an MTM-SDF graph
Chapter 4 Multiprocessor Scheduling of a Multi-mode Dataflow Graph
4.1 Related work
4.2 Motivational example
4.2.1 Throughput requirement calculation considering mode transition delay
4.2.2 Task migration between mode transition
4.3 Problem definition
4.4 Throughput requirement analysis
4.4.1 Mode transition delay
4.4.2 Arrival curves of the output buffer
4.4.3 Buffer size determination
4.4.4 Throughput requirement analysis
4.5 Proposed MMDF scheduling framework
4.5.1 Optimization problem
4.5.2 GA configuration
4.5.3 Fitness function
4.5.4 Local optimization technique
4.6 Experimental results
4.6.1 MMDF scheduling technique
4.6.2 Scalability of the Proposed Framework
Chapter 5 Multiprocessor Code Generation for the Extended CIC Model
5.1 CIC translator
5.2 Code generation for application-level dynamism
5.2.1 Function call-style code generation (fully-static, self-timed)
5.2.2 Thread-style code generation (static-assignment, fully-dynamic)
5.3 Code generation for system-level dynamism
5.4 Experimental results
Chapter 6 Conclusion and Future Work
Bibliography
Abstract (in Korean)
Modeling and Mapping of Optimized Schedules for Embedded Signal Processing Systems
The demand for Digital Signal Processing (DSP) in embedded systems has been increasing rapidly due to the proliferation of multimedia- and communication-intensive devices such as pervasive tablets and smart phones. Efficient implementation of embedded DSP systems requires integration of diverse hardware and software components, as well as dynamic workload distribution across heterogeneous computational resources. The former implies increased complexity of application modeling and analysis, but also brings enhanced potential for achieving improved energy consumption, cost or performance. The latter results from the increased use of dynamic behavior in embedded DSP applications. Furthermore, parallel programming is highly relevant in many embedded DSP areas due to the development and use of Multiprocessor System-On-Chip (MPSoC) technology. The need for efficient cooperation among different devices supporting diverse parallel embedded computations motivates high-level modeling that expresses dynamic signal processing behaviors and supports efficient task scheduling and hardware mapping.
Starting with dynamic modeling, this thesis develops a systematic design methodology that supports functional simulation and hardware mapping of dynamic reconfiguration based on Parameterized Synchronous Dataflow (PSDF) graphs. By building on the DIF (Dataflow Interchange Format), which is a design language and associated software package for developing and experimenting with dataflow-based design techniques for signal processing systems, we have developed a novel tool for functional simulation of PSDF specifications. This simulation tool allows designers to model applications in PSDF and simulate their functionality, including use of the dynamic parameter reconfiguration capabilities offered by PSDF. With the help of this simulation tool, our design methodology helps to map PSDF specifications into efficient implementations on field programmable gate arrays (FPGAs). Furthermore, valid schedules can be derived from the PSDF models at runtime to adapt hardware configurations based on changing data characteristics or operational requirements. Under certain conditions, efficient quasi-static schedules can be applied to reduce overhead and enhance predictability in the scheduling process.
Motivated by the fact that scheduling is critical to performance and to efficient use of dynamic reconfiguration, we have focused on a methodology for schedule design, which complements the emphasis on automated schedule construction in the existing literature on dataflow-based design and implementation. In particular, we have proposed a dataflow-based schedule design framework called the dataflow schedule graph (DSG), which provides a graphical framework for schedule construction based on dataflow semantics, and can also be used as an intermediate representation target for automated schedule generation. Our approach to applying the DSG in this thesis emphasizes schedule construction as a design process rather than an outcome of the synthesis process. Our approach employs dataflow graphs for representing both application models and schedules that are derived from them. By providing a dataflow-integrated framework for unambiguously representing, analyzing, manipulating, and interchanging schedules, the DSG facilitates effective codesign of dataflow-based application models and schedules for execution of these models.
As multicore processors are deployed in an increasing variety of embedded image processing systems, effective utilization of resources such as multiprocessor system-on-chip (MPSoC) devices, and effective handling of implementation concerns such as memory management and I/O, become critical to developing efficient embedded implementations. However, the diversity and complexity of applications and architectures in embedded image processing systems make the mapping of applications onto MPSoCs difficult. We help to address this challenge through a structured design methodology that is built upon the DSG modeling framework. We refer to this methodology as the DEIPS methodology (DSG-based design and implementation of Embedded Image Processing Systems). The DEIPS methodology provides a unified framework for joint consideration of DSG structures and the application graphs from which they are derived, which allows designers to integrate considerations of parallelization and resource constraints together with the application modeling process. We demonstrate the DEIPS methodology through case studies on practical embedded image processing systems.
HARDWARE AND SOFTWARE ARCHITECTURES FOR ENERGY- AND RESOURCE-EFFICIENT SIGNAL PROCESSING SYSTEMS
For a large class of digital signal processing (DSP) systems, design and implementation of hardware and software is challenging due to stringent constraints on energy and resource requirements. In this thesis, we develop methods to address this challenge by proposing new constraint-aware system design methods for DSP systems, and energy- and resource-optimized designs of key DSP subsystems that are relevant across various application areas. In addition to general methods for optimizing energy consumption and resource utilization, we present streamlined designs that are specialized to efficiently address platform-dependent constraints.
We focus on two specific aspects in development of energy- and resource-optimized design techniques:
(1) Application-specific systems and architectures for energy- and resource-efficient design.
First, we address challenges in efficient implementation of wireless sensor network building energy monitoring systems (WSNBEMSs). We develop new energy management schemes in order to maximize system lifetime for WSNBEMSs, and demonstrate that system lifetime can be improved significantly without affecting monitoring accuracy.
We also present a resource-efficient field programmable gate array (FPGA) architecture for implementation of orthogonal frequency division multiplexing (OFDM) systems. We have demonstrated that our design provides at least 8.8% enhancement in terms of resource efficiency compared to Xilinx FFT v7.1 within the same OFDM configuration.
(2) Dataflow-based methods for structured design and implementation of energy- and resource-efficient DSP systems.
First, we introduce a dataflow-based design approach based on integrating interrupt-based signal acquisition in the context of parameterized synchronous dataflow (PSDF) modeling. We demonstrate that by applying our approach, energy- and resource-efficient embedded software can be derived systematically from high-level models of the functional structure of dynamic, data-driven application systems (DDDASs).
Also, we present an in-depth development of lightweight dataflow-Verilog (LWDF-V), which is an integration of the LWDF programming model with the Verilog hardware description language (HDL), and we demonstrate the utility of LWDF-V for design and implementation of digital systems for signal processing. We emphasize efficient integration of LWDF with HDLs, and the application of LWDF-V to design DSP systems with dynamic parameters on FPGA platforms.
Integrated input modeling and memory management for image processing applications
Image processing applications often demand powerful calculations and real-time performance with low power and energy consumption. Programmable hardware provides inherent parallelism and flexibility making it a good implementation choice for this application domain. In this work we introduce a new modeling technique combining Cyclo-Static Dataflow (CSDF) base model semantics and Homogeneous Parameterized Dataflow (HPDF) meta-modeling framework, which exposes more levels of parallelism than previous models and can be used to reduce buffer sizes. We model two different applications and show how we can achieve efficient scheduling and memory organization, which is crucial for this application domain, since large amounts of data are processed, and storing intermediate results usually requires the use of off-chip resources, causing slower data access and higher power consumption. We also designed a reusable wishbone compliant memory controller module that can be used to access the Xilinx Multimedia Board's memory chips using single accesses or burst mode.
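The claim that cyclo-static rates can reduce buffer sizes can be illustrated on a single edge. The sketch below is an assumption-laden toy rather than the modeling technique of the work above: it simulates one producer and one consumer under a simple demand-driven schedule (the consumer fires as soon as it has enough tokens, otherwise the producer fires) and reports the peak number of buffered tokens. The function name and the rate sequences are hypothetical.

```python
def max_buffer_occupancy(produce, consume, firings):
    """Peak token count on one edge under a demand-driven schedule.

    produce and consume are the cyclic per-phase rates of the producer and
    consumer; a plain SDF actor is the special case of a one-element list.
    firings is the total number of actor firings to simulate.
    """
    tokens = peak = 0
    p_phase = c_phase = 0
    for _ in range(firings):
        need = consume[c_phase % len(consume)]
        if tokens >= need:
            tokens -= need  # consumer fires its current phase
            c_phase += 1
        else:
            tokens += produce[p_phase % len(produce)]  # producer fires
            p_phase += 1
            peak = max(peak, tokens)
    return peak


# Same total of 4 tokens per cycle, expressed as SDF and as two CSDF phases.
sdf_peak = max_buffer_occupancy([4], [4], firings=8)
csdf_peak = max_buffer_occupancy([2, 2], [2, 2], firings=8)
```

Splitting each firing into two phases lets the consumer start after the first two tokens arrive, so the edge needs room for two tokens instead of four under this schedule.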
Design methodology for embedded computer vision systems
Computer vision has emerged as one of the most popular domains of embedded applications. Though various new powerful embedded platforms to support such applications have emerged in recent years, there is a distinct lack of efficient domain-specific synthesis techniques for optimized implementation of such systems. In this thesis, four different aspects that contribute to efficient design and synthesis of such systems are explored:
(1) Graph Transformations: Dataflow modeling is widely used in digital signal processing (DSP) systems. However, support for dynamic behavior in such systems exists mainly at the modeling level and there is a lack of optimized synthesis techniques for these models. New transformation techniques for efficient system-on-chip (SoC) design methods are proposed and implemented for cyclo-static dataflow and its parameterized version (parameterized cyclo-static dataflow) -- two powerful models that allow dynamic reconfigurability and phased behavior in DSP systems.
(2) Design Space Exploration: The broad range of target platforms along with the complexity of applications provides a vast design space, calling for efficient tools to explore this space and produce effective design choices. A novel architectural level design methodology based on a formalism called multirate synchronization graphs is presented along with methods for performance evaluation.
(3) Multiprocessor Communication Interface: Efficient code synthesis for emerging new parallel architectures is an important and sparsely-explored problem. A widely-encountered problem in this regard is efficient communication between processors running different sub-systems. A widely used tool in the domain of general-purpose multiprocessor clusters is MPI (Message Passing Interface). However, this does not scale well for embedded DSP systems. A new, powerful and highly optimized communication interface for multiprocessor signal processing systems is presented in this work that is based on the integration of relevant properties of MPI with dataflow semantics.
(4) Parameterized Design Framework for Particle Filters: Particle filter systems constitute an important class of applications used in a wide number of fields. An efficient design and implementation framework for such systems has been implemented based on the observation that a large number of such applications exhibit similar properties. The key properties of such applications are identified and parameterized appropriately to realize different systems that represent useful trade-off points in the space of possible implementations.
Overview of the MPEG Reconfigurable Video Coding Framework
Video coding technology in the last 20 years has evolved producing a variety of different and complex algorithms and coding standards. So far the specification of such standards, and of the algorithms that build them, has been done case by case providing monolithic textual and reference software specifications in different forms and programming languages. However, very little attention has been given to provide a specification formalism that explicitly presents common components between standards, and the incremental modifications of such monolithic standards. The MPEG Reconfigurable Video Coding (RVC) framework is a new ISO standard currently under its final stage of standardization, aiming at providing video codec specifications at the level of library components instead of monolithic algorithms. The new concept is to be able to specify a decoder of an existing standard or a completely new configuration that may better satisfy application-specific constraints by selecting standard components from a library of standard coding algorithms. The possibility of dynamic configuration and reconfiguration of codecs also requires new methodologies and new tools for describing the new bitstream syntaxes and the parsers of such new codecs. The RVC framework is based on the usage of a new actor/dataflow-oriented language called CAL for the specification of the standard library and instantiation of the RVC decoder model. This language has been specifically designed for modeling complex signal processing systems. CAL dataflow models expose the intrinsic concurrency of the algorithms by employing the notions of actor programming and dataflow. The paper gives an overview of the concepts and technologies building the standard RVC framework and the non standard tools supporting the RVC model from the instantiation and simulation of the CAL model to software and/or hardware code synthesis.
Design Tools for Dynamic, Data-Driven, Stream Mining Systems
The proliferation of sensing devices and cost- and energy-efficient embedded processors has contributed to an increasing interest in adaptive stream mining (ASM) systems. In this class of signal processing systems, knowledge is extracted from data streams in real-time as the data arrives, rather than in a store-now, process-later fashion. The evolution of machine learning methods in many application areas has contributed to demands for efficient and accurate information extraction from streams of data arriving at distributed, mobile, and heterogeneous processing nodes. To enhance accuracy, and meet the stringent constraints in which they must be deployed, it is important for ASM systems to be effective in adapting knowledge extraction approaches and processing configurations based on data characteristics and operational conditions. In this thesis, we address these challenges in design and implementation of ASM systems. We develop systematic methods and supporting design tools for ASM systems that integrate (1) foundations of dataflow modeling for high level signal processing system design, and (2) the paradigm of Dynamic Data-Driven Application Systems (DDDAS). More specifically, the contributions of this thesis can be broadly categorized into three major directions:
1. We develop a new design framework that systematically applies dataflow methodologies for high level signal processing system design, and adaptive stream mining based on dynamic topologies of classifiers. In particular, we introduce a new design environment, called the lightweight dataflow for dynamic data driven application systems environment (LiD4E). LiD4E provides formal semantics, rooted in dataflow principles, for design and implementation of a broad class of stream mining topologies. Using this novel application of dataflow methods, LiD4E facilitates the efficient and reliable mapping and adaptation of classifier topologies into implementations on embedded platforms.
2. We introduce new design methods for data-driven digital signal processing (DSP) systems that are targeted to resource- and energy-constrained embedded environments, such as unmanned aerial vehicles (UAVs), mobile communication platforms, and wireless sensor networks. We develop a design and implementation framework for multi-mode, data driven embedded signal processing systems, where application modes with complementary trade-offs are selected, configured, executed, and switched dynamically, in a data-driven manner. We demonstrate the utility of our proposed new design methods on an energy-constrained, multi-mode face detection application.
3. We introduce new methods for multiobjective, system-level optimization that have been incorporated into the LiD4E design tool described previously. More specifically, we develop new methods for integrated modeling and optimization of real-time stream mining constraints, multidimensional stream mining performance (e.g., precision and recall), and energy efficiency. Using a design methodology centered on data-driven control of and coordination between alternative dataflow subsystems for stream mining (classification modes), we develop systematic methods for exploring complex, multidimensional design spaces associated with dynamic stream mining systems, and deriving sets of Pareto-optimal system configurations that can be switched among based on data characteristics and operating constraints.
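The notion of Pareto-optimal system configurations used in the third contribution can be made concrete with a short filter. This is a generic sketch of Pareto dominance over precision, recall, and energy, not the optimization method of LiD4E; the `Config` type, its field names, and the four sample configurations are invented for illustration.

```python
from typing import NamedTuple


class Config(NamedTuple):
    name: str
    precision: float  # higher is better
    recall: float     # higher is better
    energy: float     # lower is better (hypothetical unit: joules per frame)


def dominates(a, b):
    """True if a is at least as good as b on every metric and better on one."""
    no_worse = (a.precision >= b.precision and a.recall >= b.recall
                and a.energy <= b.energy)
    better = (a.precision > b.precision or a.recall > b.recall
              or a.energy < b.energy)
    return no_worse and better


def pareto_front(configs):
    """Keep only the configurations that no other configuration dominates."""
    return [c for c in configs
            if not any(dominates(other, c) for other in configs)]


candidates = [
    Config("A", 0.90, 0.80, 5.0),
    Config("B", 0.85, 0.80, 6.0),  # dominated by A
    Config("C", 0.70, 0.60, 2.0),  # cheapest
    Config("D", 0.95, 0.50, 7.0),  # most precise
]
front = pareto_front(candidates)
```

Only the surviving configurations are worth switching among at run time, since each of them is the best choice for some weighting of accuracy against energy.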
PRUNE: Dynamic and Decidable Dataflow for Signal Processing on Heterogeneous Platforms
The majority of contemporary mobile devices and personal computers are based
on heterogeneous computing platforms that consist of a number of CPU cores and
one or more Graphics Processing Units (GPUs). Despite the high volume of these
devices, there are few existing programming frameworks that target full and
simultaneous utilization of all CPU and GPU devices of the platform.
This article presents a dataflow-flavored Model of Computation (MoC) that has
been developed for deploying signal processing applications to heterogeneous
platforms. The presented MoC is dynamic and allows describing applications with
data dependent run-time behavior. On top of the MoC, formal design rules are
presented that enable application descriptions to be simultaneously dynamic and
decidable. Decidability guarantees compile-time application analyzability for
deadlock freedom and bounded memory.
The presented MoC and the design rules are realized in a novel Open Source
programming environment "PRUNE" and demonstrated with representative
application examples from the domains of image processing, computer vision and
wireless communications. Experimental results show that the proposed approach
outperforms the state-of-the-art in analyzability, flexibility and performance.
Comment: This is the author's version of an article that has been published in
this journal. Changes were made to this version by the publisher prior to
publicatio
mAPN: Modeling, Analysis, and Exploration of Algorithmic and Parallelism Adaptivity
The use of parallel embedded systems is increasing. They are getting
more complex due to integrating multiple functionalities in one application or
running numerous ones concurrently. This concerns a wide range of applications,
including streaming applications, commonly used in embedded systems. These
applications must implement adaptable and reliable algorithms to deliver the
required performance under varying circumstances (e.g., running applications on
the platform, input data, platform variety, etc.). Given the complexity of
streaming applications, target systems, and adaptivity requirements, designing
such systems with traditional programming models is daunting. This is why
model-based strategies with an appropriate Model of Computation (MoC) have long
been studied for embedded system design. This work provides algorithmic
adaptivity on top of parallelism for dynamic dataflow to express larger sets of
variants. We present a multi-Alternative Process Network (mAPN), a high-level
abstract representation in which several variants of the same application
coexist in the same graph expressing different implementations. We introduce
mAPN properties and its formalism to describe various local implementation
alternatives. Furthermore, mAPNs are enriched with metadata to provide the
alternatives with quantitative annotations in terms of a specific metric. To
help the user analyze the rich space of variants, we propose a methodology to
extract feasible variants under user and hardware constraints. At the core of
the methodology is an algorithm for computing global metrics of an execution of
different alternatives from a compact mAPN specification. We validate our
approach by exploring several possible variants created for the Automatic
Subtitling Application (ASA) on two hardware platforms.
Comment: 26 PAGES JOURNAL PAPE
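The variant-extraction step described in the mAPN abstract above can be pictured with a brute-force sketch. It is not the algorithm of that work: it assumes that each node lists its local alternatives with one annotated metric, that the global metric is simply the sum of the chosen local values, and that the only constraint is an upper bound. The node names, implementation names, and latency figures are hypothetical.

```python
from itertools import product


def feasible_variants(alternatives, max_total):
    """Enumerate one alternative per node and keep combinations within budget.

    alternatives maps node name -> {implementation name: metric value}.
    Returns (variant, total) pairs sorted by total, where variant maps each
    node to the chosen implementation. The metric is assumed to be additive.
    """
    nodes = list(alternatives)
    feasible = []
    for choice in product(*(alternatives[n].items() for n in nodes)):
        total = sum(value for _, value in choice)
        if total <= max_total:
            variant = {node: impl for node, (impl, _) in zip(nodes, choice)}
            feasible.append((variant, total))
    return sorted(feasible, key=lambda item: item[1])


# Hypothetical two-node subtitling pipeline with latency annotations in ms.
subtitling = {
    "speech_to_text": {"small_model": 40, "large_model": 90},
    "translate": {"rule_based": 10, "neural": 50},
}
variants = feasible_variants(subtitling, max_total=100)
```

Exhaustive enumeration grows exponentially with the number of nodes, which is one reason a compact specification and a dedicated algorithm for global metrics matter in practice.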