Search CORE

34 research outputs found

Dataflow-Based Mapping of Computer Vision Algorithms onto FPGAs

Author
Publication venue: Springer
Publication date
Field of study

실시간 임베디드 시스템을 위한 동적 행위 명세 및 설계 공간 탐색 기법

Author: 정한웅
Publication venue: 서울대학교 대학원
Publication date: 01/08/2016
Field of study

학위논문 (박사)-- 서울대학교 대학원 : 전기·컴퓨터공학부, 2016. 8. 하순회.하나의 칩에 집적되는 프로세서의 개수가 많아지고, 많은 기능들이 통합됨에 따라, 연산양의 변화, 서비스의 품질, 예상치 못한 시스템 요소의 고장 등과 같은 다양한 요소들에 의해 시스템의 상태가 동적으로 변화하게 된다. 반면에, 본 논문에서 주된 관심사를 가지는 스마트 폰 장치에서 주로 사용되는 비디오, 그래픽 응용들의 경우, 계산 복잡도가 지속적으로 증가하고 있다. 따라서, 이렇게 동적으로 변하는 행위를 가지면서도 병렬성을 내제한 계산 집약적인 연산을 포함하는 복잡한 시스템을 구현하기 위해서는 체계적인 설계 방법론이 고도로 요구된다. 모델 기반 방법론은 병렬 임베디드 소프트웨어 개발을 위한 대표적인 방법 중 하나이다. 특히, 시스템 명세, 정적 성능 분석, 설계 공간 탐색, 그리고 자동 코드 생성까지의 모든 설계 단계를 지원하는 병렬 임베디드 소프트웨어 설계 환경으로서, HOPES 프레임워크가 제시되었다. 다른 설계 환경들과는 다르게, 이기종 멀티프로세서 아키텍처에서의 일반적인 수행 모델로서, 공통 중간 코드 (CIC) 라고 부르는 프로그래밍 플랫폼이라는 새로운 개념을 소개하였다. CIC 태스크 모델은 프로세스 네트워크 모델에 기반하고 있지만, SDF 모델로 구체화될 수 있기 때문에, 병렬 처리뿐만 아니라 정적 분석이 용이하다는 장점을 가진다. 하지만, SDF 모델은 응용의 동적인 행위를 명세할 수 없다는 표현상의 제약을 가진다. 이러한 제약을 극복하고, 시스템의 동적 행위를 응용 외부와 내부로 구분하여 명세하기 위해, 본 논문에서는 데이터 플로우와 유한상태기 (FSM) 모델에 기반하여 확장된 CIC 태스크 모델을 제안한다. 상위 수준에서는, 각 응용은 데이터 플로우 태스크로 명세 되며, 동적 행위는 응용들의 수행을 감독하는 제어 태스크로 모델 된다. 데이터 플로우 태스크 내부는, 유한상태기 기반의 SADF 모델과 유사한 형태로 동적 행위가 명세 된다SDF 태스크는 복수개의 행위를 가질 수 있으며, 모드 전환기 (MTM)이라고 불리는 유한 상태기의 테이블 형태의 명세를 통해 SDF 그래프의 모드 전환 규칙을 명세 한다. 이를 MTM-SDF 그래프라고 부르며, 복수 모드 데이터 플로우 모델 중 하나라 구분된다. 응용은 유한한 행위 (또는 모드)를 가지며, 각 행위 (모드)는 SDF 그래프로 표현되는 것을 가정한다. 이를 통해 다양한 프로세서 개수에 대해 단위시간당 처리량을 최대화하는 컴파일-시간 스케줄링을 수행하고, 스케줄 결과를 저장할 수 있도록 한다. 또한, 복수 모드 데이터 플로우 그래프를 위한 멀티프로세서 스케줄링 기법을 제시한다. 복수 모드 데이터 플로우 그래프를 위한 몇몇 스케줄링 기법들이 존재하지만, 모드 사이에 태스크 이주를 허용한 기법들은 존재하지 않는다. 하지만 태스크 이주를 허용하게 되면 자원 요구량을 줄일 수 있다는 발견을 통해, 본 논문에서는 모드 사이의 태스크 이주를 허용하는 복수 모드 데이터 플로우 그래프를 위한 멀티프로세서 스케줄링 기법을 제안한다. 유전 알고리즘에 기반하여, 제안하는 기법은 자원 요구량을 최소화하기 위해 각 모드에 해당하는 모든 SDF 그래프를 동시에 스케줄 한다. 주어진 단위 시간당 처리량 제약을 만족시키기 위해, 제안하는 기법은 각 모드 별로 실제 처리량 요구량을 계산하며, 처리량의 불규칙성을 완화하기 위한 출력 버퍼의 크기를 계산한다. 명세된 태스크 그래프와 스케줄 결과로부터, HOPES 프레임워크는 대상 아키텍처를 위한 자동 코드 생성을 지원한다. 이를 위해 자동 코드 생성기는 CIC 태스크 모델의 확장된 특징들을 지원하도록 확장되었다. 응용 수준에서는 MTM-SDF 그래프를 주어진 정적 스케줄링 결과를 따르는 멀티프로세서 코드를 생성하도록 확장되었다. 또한, 네 가지 서로 다른 스케줄링 정책 (fully-static, self-timed, static-assignment, fully-dynamic)에 대한 멀티프로세서 코드 생성을 지원한다. 시스템 수준에서는 지원하는 시스템 요청 API에 대한 실제 구현 코드를 생성하며, 정적 스케줄 결과와 태스크들의 제어 가능한 속성들에 대한 자료 구조 코드를 생성한다. 복수 모드 멀티미디어 터미널 예제를 통한 기초적인 실험들을 통해, 제안하는 방법론의 타당성을 보인다.As the number of processors in a chip increases, and more functions are integrated, the system status will change dynamically due to various factors such as the workload variation, QoS requirement, and unexpected component failure. On the other hand, computation-complexity of user applications is also steadily increasingvideo and graphics applications are two major driving forces in smart mobile devices, which define the main application domain of interest in this dissertation. So, a systematic design methodology is highly required to implement such complex systems which contain dynamically changed behavior as well as computation-intensive workload that can be parallelized. A model-based approach is one of representative approaches for parallel embedded software development. Especially, HOPES framework is proposed which is a design environment for parallel embedded software supporting the overall design steps: system specification, performance estimation, design space exploration, and automatic code generation. Distinguished from other design environments, it introduces a novel concept of programming platform, called CIC (Common Intermediate Code) that can be understood as a generic execution model of heterogeneous multiprocessor architecture. The CIC task model is based on a process network model, but it can be refined to the SDF (Synchronous Data Flow) model, since it has a very desirable features for static analyzability as well as parallel processing. However, the SDF model has a typical weakness of expression capability, especially for the system-level specification and dynamically changed behavior of an application. To overcome this weakness, in this dissertation, we propose an extended CIC task model based on dataflow and FSM models to specify the dynamic behavior of the system distinguishing inter- and intra-application dynamism. At the top-level, each application is specified by a dataflow task and the dynamic behavior is modeled as a control task that supervises the execution of applications. Inside a dataflow task, it specifies the dynamic behavior using a similar way as FSM-based SADFan SDF task may have multiple behaviors and a tabular specification of an FSM, called MTM (Mode Transition Machine), describes the mode transition rules for the SDF graph. We call it to MTM-SDF model which is classified as multi-mode dataflow models in the dissertation. It assumes that an application has a finite number of behaviors (or modes) and each behavior (mode) is represented by an SDF graph. It enables us to perform compile-time scheduling of each graph to maximize the throughput varying the number of allocated processors, and store the scheduling information. Also, a multiprocessor scheduling technique is proposed for a multi-mode dataflow graph. While there exist several scheduling techniques for multi-mode dataflow models, no one allows task migration between modes. By observing that the resource requirement can be additionally reduced if task migration is allowed, we propose a multiprocessor scheduling technique of a multi-mode dataflow graph considering task migration between modes. Based on a genetic algorithm, the proposed technique schedules all SDF graphs in all modes simultaneously to minimize the resource requirement. To satisfy the throughput constraint, the proposed technique calculates the actual throughput requirement of each mode and the output buffer size for tolerating throughput jitter. For the specified task graph and scheduling results, the CIC translator generates parallelized code for the target architecture. Therefore the CIC translator is extended to support extended features of the CIC task model. In application-level, it is extended to support multiprocessor code generation for an MTM-SDF graph considering the given static scheduling results. Also, multiprocessor code generation of four different scheduling policies are supported for an MTM-SDF graph: fully-static, self-timed, static-assignment, and fully-dynamic. In system-level, the CIC translator is extended to support code generation for implementation of system request APIs and data structures for the static scheduling results and configurable task parameters. Through preliminary experiments with a multi-mode multimedia terminal example, the viability of the proposed methodology is verified.Chapter 1 Introduction 1 1.1 Motivation 1 1.2 Contribution 7 1.3 Dissertation organization 9 Chapter 2 Background 10 2.1 Related work 10 2.1.1 Compiler-based approach 10 2.1.2 Language-based approach 11 2.1.3 Model-based approach 15 2.2 HOPES framework 19 2.3 Common Intermediate Code (CIC) Model 21 Chapter 3 Dynamic Behavior Specification 26 3.1 Problem definition 26 3.1.1 System-level dynamic behavior 26 3.1.2 Application-level dynamic behavior 27 3.2 Related work 28 3.3 Motivational example 31 3.4 Control task specification for system-level dynamism 33 3.4.1 Internal specification 33 3.4.2 Action scripts 38 3.5 MTM-SDF specification for application-level dynamism 44 3.5.1 MTM specification 44 3.5.2 Task graph specification 45 3.5.3 Execution semantic of an MTM-SDF graph 46 Chapter 4 Multiprocessor Scheduling of an Multi-mode Dataflow Graph 50 4.1 Related work 51 4.2 Motivational example 56 4.2.1 Throughput requirement calculation considering mode transition delay 56 4.2.2 Task migration between mode transition 58 4.3 Problem definition 61 4.4 Throughput requirement analysis 65 4.4.1 Mode transition delay 66 4.4.2 Arrival curves of the output buffer 70 4.4.3 Buffer size determination 71 4.4.4 Throughput requirement analysis 73 4.5 Proposed MMDF scheduling framework 75 4.5.1 Optimization problem 75 4.5.2 GA configuration 76 4.5.3 Fitness function 78 4.5.4 Local optimization technique 79 4.6 Experimental results 81 4.6.1 MMDF scheduling technique 83 4.6.2 Scalability of the Proposed Framework 88 Chapter 5 Multiprocessor Code Generation for the Extended CIC Model 89 5.1 CIC translator 89 5.2 Code generation for application-level dynamism 91 5.2.1 Function call-style code generation (fully-static, self-timed) 94 5.2.2 Thread-style code generation (static-assignment, fully-dynamic) 98 5.3 Code generation for system-level dynamism 101 5.4 Experimental results 105 Chapter 6 Conclusion and Future Work 107 Bibliography 109 초록 125Docto

SNU Open Repository and Archive

Modeling and Mapping of Optimized Schedules for Embedded Signal Processing Systems

Author: Wu Hsiang-Huang
Publication venue
Publication date: 01/01/2013
Field of study

The demand for Digital Signal Processing (DSP) in embedded systems has been increasing rapidly due to the proliferation of multimedia- and communication-intensive devices such as pervasive tablets and smart phones. Efficient implementation of embedded DSP systems requires integration of diverse hardware and software components, as well as dynamic workload distribution across heterogeneous computational resources. The former implies increased complexity of application modeling and analysis, but also brings enhanced potential for achieving improved energy consumption, cost or performance. The latter results from the increased use of dynamic behavior in embedded DSP applications. Furthermore, parallel programming is highly relevant in many embedded DSP areas due to the development and use of Multiprocessor System-On-Chip (MPSoC) technology. The need for efficient cooperation among different devices supporting diverse parallel embedded computations motivates high-level modeling that expresses dynamic signal processing behaviors and supports efficient task scheduling and hardware mapping. Starting with dynamic modeling, this thesis develops a systematic design methodology that supports functional simulation and hardware mapping of dynamic reconfiguration based on Parameterized Synchronous Dataflow (PSDF) graphs. By building on the DIF (Dataflow Interchange Format), which is a design language and associated software package for developing and experimenting with dataflow-based design techniques for signal processing systems, we have developed a novel tool for functional simulation of PSDF specifications. This simulation tool allows designers to model applications in PSDF and simulate their functionality, including use of the dynamic parameter reconfiguration capabilities offered by PSDF. With the help of this simulation tool, our design methodology helps to map PSDF specifications into efficient implementations on field programmable gate arrays (FPGAs). Furthermore, valid schedules can be derived from the PSDF models at runtime to adapt hardware configurations based on changing data characteristics or operational requirements. Under certain conditions, efficient quasi-static schedules can be applied to reduce overhead and enhance predictability in the scheduling process. Motivated by the fact that scheduling is critical to performance and to efficient use of dynamic reconfiguration, we have focused on a methodology for schedule design, which complements the emphasis on automated schedule construction in the existing literature on dataflow-based design and implementation. In particular, we have proposed a dataflow-based schedule design framework called the dataflow schedule graph (DSG), which provides a graphical framework for schedule construction based on dataflow semantics, and can also be used as an intermediate representation target for automated schedule generation. Our approach to applying the DSG in this thesis emphasizes schedule construction as a design process rather than an outcome of the synthesis process. Our approach employs dataflow graphs for representing both application models and schedules that are derived from them. By providing a dataflow-integrated framework for unambiguously representing, analyzing, manipulating, and interchanging schedules, the DSG facilitates effective codesign of dataflow-based application models and schedules for execution of these models. As multicore processors are deployed in an increasing variety of embedded image processing systems, effective utilization of resources such as multiprocessor systemon-chip (MPSoC) devices, and effective handling of implementation concerns such as memory management and I/O become critical to developing efficient embedded implementations. However, the diversity and complexity of applications and architectures in embedded image processing systems make the mapping of applications onto MPSoCs difficult. We help to address this challenge through a structured design methodology that is built upon the DSG modeling framework. We refer to this methodology as the DEIPS methodology (DSG-based design and implementation of Embedded Image Processing Systems). The DEIPS methodology provides a unified framework for joint consideration of DSG structures and the application graphs from which they are derived, which allows designers to integrate considerations of parallelization and resource constraints together with the application modeling process. We demonstrate the DEIPS methodology through cases studies on practical embedded image processing systems

CiteSeerX

Digital Repository at the University of Maryland

HARDWARE AND SOFTWARE ARCHITECTURES FOR ENERGY- AND RESOURCE-EFFICIENT SIGNAL PROCESSING SYSTEMS

Author: Cho Inkeun
Publication venue
Publication date: 01/01/2014
Field of study

For a large class of digital signal processing (DSP) systems, design and implementation of hardware and software is challenging due to stringent constraints on energy and resource requirements. In this thesis, we develop methods to address this challenge by proposing new constraint-aware system design methods for DSP systems, and energy- and resource-optimized designs of key DSP subsystems that are relevant across various application areas. In addition to general methods for optimizing energy consumption and resource utilization, we present streamlined designs that are specialized to efficiently address platform-dependent constraints. We focus on two specific aspects in development of energy- and resource-optimized design techniques: (1) Application-specific systems and architectures for energy- and resource- efficient design. First, we address challenges in efficient implementation of wireless sensor network building energy monitoring systems (WSNBEMSs). We develop new energy management schemes in order to maximize system lifetime for WSNBEMSs, and demonstrate that system lifetime can be improved significantly without affecting monitoring accuracy. We also present resource efficient, field programmable gate array (FPGA) architecture for implementation of orthogonal frequency division multiplexing (OFDM) systems. We have demonstrated that our design provides at least 8.8% enhancement in terms of resource efficiency compared to Xilinx FFT v7.1 within the same OFDM configuration. (2) Dataflow-based methods for structured design and implementation of energy- and resource- efficient DSP systems. First, we introduce a dataflow-based design approach based on integrating interrupt-based signal acquisition in context of parameterized synchronous dataflow (PSDF) modeling. We demonstrate that by applying our approach, energy- and resource-efficient embedded software can be derived systematically from high level models of dynamic, data-driven applications systems (DDDASs) functional structure. Also, we present an in-depth development of lightweight dataflow-Verilog (LWDF-V), which is an integration of the LWDF programming model with the Verilog hardware description language (HDL), and we demonstrate the utility of LWDF-V for design and implementation of digital systems for signal processing. We emphasize efficient of LWDF with HDLs, and emphasize the application of LWDF-V to design DSP systems with dynamic parameters on FPGA platforms

Digital Repository at the University of Maryland

Integrated input modeling and memory management for image processing applications

Author: Haim Fiorella
Publication venue
Publication date
Field of study

Image processing applications often demand powerful calculations and real-time performance with low power and energy consumption. Programmable hardware provides inherent parallelism and flexibility making it a good implementation choice for this application domain. In this work we introduce a new modeling technique combining Cyclo-Static Dataflow (CSDF) base model semantics and Homogeneous Parameterized Dataflow (HPDF) meta-modeling framework, which exposes more levels of parallelism than previous models and can be used to reduce buffer sizes. We model two different applications and show how we can achieve efficient scheduling and memory organization, which is crucial for this application domain, since large amounts of data are processed, and storing intermediate results usually requires the use of off-chip resources, causing slower data access and higher power consumption. We also designed a reusable wishbone compliant memory controller module that can be used to access the Xilinx Multimedia Board’s memory chips using single accesses or burst mode

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Design methodology for embedded computer vision systems

Author: Saha Sankalita
Publication venue
Publication date: 27/11/2007
Field of study

Computer vision has emerged as one of the most popular domains of embedded appli¬cations. Though various new powerful embedded platforms to support such applica¬tions have emerged in recent years, there is a distinct lack of efficient domain-specific synthesis techniques for optimized implementation of such systems. In this thesis, four different aspects that contribute to efficient design and synthesis of such systems are explored: (1) Graph Transformations: Dataflow modeling is widely used in digital signal processing (DSP) systems. However, support for dynamic behavior in such systems exists mainly at the modeling level and there is a lack of optimized synthesis tech¬niques for these models. New transformation techniques for efficient system-on-chip (SoC) design methods are proposed and implemented for cyclo-static dataflow and its parameterized version (parameterized cyclo-static dataflow) -- two powerful models that allow dynamic reconfigurability and phased behavior in DSP systems. (2) Design Space Exploration: The broad range of target platforms along with the complexity of applications provides a vast design space, calling for efficient tools to explore this space and produce effective design choices. A novel architectural level design methodology based on a formalism called multirate synchronization graphs is presented along with methods for performance evaluation. (3) Multiprocessor Communication Interface: Efficient code synthesis for emerg¬ing new parallel architectures is an important and sparsely-explored problem. A widely-encountered problem in this regard is efficient communication between pro¬cessors running different sub-systems. A widely used tool in the domain of general-purpose multiprocessor clusters is MPI (Message Passing Interface). However, this does not scale well for embedded DSP systems. A new, powerful and highly optimized communication interface for multiprocessor signal processing systems is presented in this work that is based on the integration of relevant properties of MPI with dataflow semantics. (4) Parameterized Design Framework for Particle Filters: Particle filter systems constitute an important class of applications used in a wide number of fields. An effi¬cient design and implementation framework for such systems has been implemented based on the observation that a large number of such applications exhibit similar prop¬erties. The key properties of such applications are identified and parameterized appro¬priately to realize different systems that represent useful trade-off points in the space of possible implementations

Digital Repository at the University of Maryland

Overview of the MPEG Reconfigurable Video Coding Framework

Author: Bhattacharyya Shuvra S.
Eker Johan
Janneck Jörn W.
Lucarz Christophe
Mattavelli Marco
Raulet Mickaël
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2011
Field of study

International audienceVideo coding technology in the last 20 years has evolved producing a variety of different and complex algorithms and coding standards. So far the specification of such standards, and of the algorithms that build them, has been done case by case providing monolithic textual and reference software specifications in different forms and programming languages. However, very little attention has been given to provide a specification formalism that explicitly presents common components between standards, and the incremental modifications of such monolithic standards. The MPEG Reconfigurable Video Coding (RVC) framework is a new ISO standard currently under its final stage of standardization, aiming at providing video codec specifications at the level of library components instead of monolithic algorithms. The new concept is to be able to specify a decoder of an existing standard or a completely new configuration that may better satisfy application-specific constraints by selecting standard components from a library of standard coding algorithms. The possibility of dynamic configuration and reconfiguration of codecs also requires new methodologies and new tools for describing the new bitstream syntaxes and the parsers of such new codecs. The RVC framework is based on the usage of a new actor/ dataflow oriented language called CAL for the specification of the standard library and instantiation of the RVC decoder model. This language has been specifically designed for modeling complex signal processing systems. CAL dataflow models expose the intrinsic concurrency of the algorithms by employing the notions of actor programming and dataflow. The paper gives an overview of the concepts and technologies building the standard RVC framework and the non standard tools supporting the RVC model from the instantiation and simulation of the CAL model to software and/or hardware code synthesis

HAL-CentraleSupelec

Lund University Publications

HAL-Rennes 1

Design Tools for Dynamic, Data-Driven, Stream Mining Systems

Author: Sudusinghe Kishan Palintha
Publication venue
Publication date: 01/01/2015
Field of study

The proliferation of sensing devices and cost- and energy-efficient embedded processors has contributed to an increasing interest in adaptive stream mining (ASM) systems. In this class of signal processing systems, knowledge is extracted from data streams in real-time as the data arrives, rather than in a store-now, process later fashion. The evolution of machine learning methods in many application areas has contributed to demands for efficient and accurate information extraction from streams of data arriving at distributed, mobile, and heterogeneous processing nodes. To enhance accuracy, and meet the stringent constraints in which they must be deployed, it is important for ASM systems to be effective in adapting knowledge extraction approaches and processing configurations based on data characteristics and operational conditions. In this thesis, we address these challenges in design and implementation of ASM systems. We develop systematic methods and supporting design tools for ASM systems that integrate (1) foundations of dataflow modeling for high level signal processing system design, and (2) the paradigm on Dynamic Data-Driven Application Systems (DDDAS). More specifically, the contributions of this thesis can be broadly categorized in to three major directions: 1. We develop a new design framework that systematically applies dataflow methodologies for high level signal processing system design, and adaptive stream mining based on dynamic topologies of classifiers. In particular, we introduce a new design environment, called the lightweight dataflow for dynamic data driven application systems environment (LiD4E). LiD4E provides formal semantics, rooted in dataflow principles, for design and implementation of a broad class of stream mining topologies. Using this novel application of dataflow methods, LiD4E facilitates the efficient and reliable mapping and adaptation of classifier topologies into implementations on embedded platforms. 2. We introduce new design methods for data-driven digital signal processing (DSP) systems that are targeted to resource- and energy-constrained embedded environments, such as unmanned areal vehicles (UAVs), mobile communication platforms, and wireless sensor networks. We develop a design and implementation framework for multi-mode, data driven embedded signal processing systems, where application modes with complementary trade-offs are selected, configured, executed, and switched dynamically, in a data-driven manner. We demonstrate the utility of our proposed new design methods on an energy-constrained, multi-mode face detection application. 3. We introduce new methods for multiobjective, system-level optimization that have been incorporated into the LiD4E design tool described previously. More specifically, we develop new methods for integrated modeling and optimization of real-time stream mining constraints, multidimensional stream mining performance (e.g., precision and recall), and energy efficiency. Using a design methodology centered on data-driven control of and coordination between alternative dataflow subsystems for stream mining (classification modes), we develop systematic methods for exploring complex, multidimensional design spaces associated with dynamic stream mining systems, and deriving sets of Pareto-optimal system configurations that can be switched among based on data characteristics and operating constraints

Digital Repository at the University of Maryland

PRUNE: Dynamic and Decidable Dataflow for Signal Processing on Heterogeneous Platforms

Author: Bhattacharyya Shuvra S.
Boutellier Jani
Huttunen Heikki
Wu Jiahao
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2018
Field of study

The majority of contemporary mobile devices and personal computers are based on heterogeneous computing platforms that consist of a number of CPU cores and one or more Graphics Processing Units (GPUs). Despite the high volume of these devices, there are few existing programming frameworks that target full and simultaneous utilization of all CPU and GPU devices of the platform. This article presents a dataflow-flavored Model of Computation (MoC) that has been developed for deploying signal processing applications to heterogeneous platforms. The presented MoC is dynamic and allows describing applications with data dependent run-time behavior. On top of the MoC, formal design rules are presented that enable application descriptions to be simultaneously dynamic and decidable. Decidability guarantees compile-time application analyzability for deadlock freedom and bounded memory. The presented MoC and the design rules are realized in a novel Open Source programming environment "PRUNE" and demonstrated with representative application examples from the domains of image processing, computer vision and wireless communications. Experimental results show that the proposed approach outperforms the state-of-the-art in analyzability, flexibility and performance.Comment: This is the author's version of an article that has been published in this journal. Changes were made to this version by the publisher prior to publicatio

arXiv.org e-Print Archive

Trepo - Institutional Repository of Tampere University

mAPN: Modeling, Analysis, and Exploration of Algorithmic and Parallelism Adaptivity

Author: Bouraoui Hasna
Castrillon Jeronimo
Jerad Chadlia
Romdhani Omar
Publication venue
Publication date: 15/07/2022
Field of study

Using parallel embedded systems these days is increasing. They are getting more complex due to integrating multiple functionalities in one application or running numerous ones concurrently. This concerns a wide range of applications, including streaming applications, commonly used in embedded systems. These applications must implement adaptable and reliable algorithms to deliver the required performance under varying circumstances (e.g., running applications on the platform, input data, platform variety, etc.). Given the complexity of streaming applications, target systems, and adaptivity requirements, designing such systems with traditional programming models is daunting. This is why model-based strategies with an appropriate Model of Computation (MoC) have long been studied for embedded system design. This work provides algorithmic adaptivity on top of parallelism for dynamic dataflow to express larger sets of variants. We present a multi-Alternative Process Network (mAPN), a high-level abstract representation in which several variants of the same application coexist in the same graph expressing different implementations. We introduce mAPN properties and its formalism to describe various local implementation alternatives. Furthermore, mAPNs are enriched with metadata to Provide the alternatives with quantitative annotations in terms of a specific metric. To help the user analyze the rich space of variants, we propose a methodology to extract feasible variants under user and hardware constraints. At the core of the methodology is an algorithm for computing global metrics of an execution of different alternatives from a compact mAPN specification. We validate our approach by exploring several possible variants created for the Automatic Subtitling Application (ASA) on two hardware platforms.Comment: 26 PAGES JOURNAL PAPE

arXiv.org e-Print Archive