194,290 research outputs found

    Scalable Design Space Exploration via Answer Set Programming

    Get PDF
    The design of embedded systems is becoming continuously more complex such that the application of efficient high level design methods are crucial for competitive results regarding design time and performance. Recently, advances in Boolean constraint solvers for Answer Set Programming (ASP) allow for easy integration of background theories and more control over the solving process. The goal of this research is to leverage those advances for system level design space exploration while using specialized techniques from electronic design automation that drive new application-originated ideas for multi-objective combinatorial optimization

    DESIGN SPACE EXPLORATION FOR SIGNAL PROCESSING SYSTEMS USING LIGHTWEIGHT DATAFLOW GRAPHS

    Get PDF
    Digital signal processing (DSP) is widely used in many types of devices, including mobile phones, tablets, personal computers, and numerous forms of embedded systems. Implementation of modern DSP applications is very challenging in part due to the complex design spaces that are involved. These design spaces involve many kinds of configurable parameters associated with the signal processing algorithms that are used, as well as different ways of mapping the algorithms onto the targeted platforms. In this thesis, we develop new algorithms, software tools and design methodologies to systematically explore the complex design spaces that are involved in design and implementation of signal processing systems. To improve the efficiency of design space exploration, we develop and apply compact system level models, which are carefully formulated to concisely capture key properties of signal processing algorithms, target platforms, and algorithm-platform interactions. Throughout the thesis, we develop design methodologies and tools for integrating new compact system level models and design space exploration methods with lightweight dataflow (LWDF) techniques for design and implementation of signal processing systems. LWDF is a previously-introduced approach for integrating new forms of design space exploration and system-level optimization into design processes for DSP systems. LWDF provides a compact set of retargetable application programming interfaces (APIs) that facilitates the integration of dataflow-based models and methods. Dataflow provides an important formal foundation for advanced DSP system design, and the flexible support for dataflow in LWDF facilitates experimentation with and application of novel design methods that are founded in dataflow concepts. Our developed methodologies apply LWDF programming to facilitate their application to different types of platforms and their efficient integration with platform-based tools for hardware/software implementation. Additionally, we introduce novel extensions to LWDF to improve its utility for digital hardware design and adaptive signal processing implementation. To address the aforementioned challenges of design space exploration and system optimization, we present a systematic multiobjective optimization framework for dataflow-based architectures. This framework builds on the methodology of multiobjective evolutionary algorithms and derives key system parameters subject to time-varying and multidimensional constraints on system performance. We demonstrate the framework by applying LWDF techniques to develop a dataflow-based architecture that can be dynamically reconfigured to realize strategic configurations in the underlying parameter space based on changing operational requirements. Secondly, we apply Markov decision processes (MDPs) for design space exploration in adaptive embedded signal processing systems. We propose a framework, known as the Hierarchical MDP framework for Compact System-level Modeling (HMCSM), which embraces MDPs to enable autonomous adaptation of embedded signal processing under multidimensional constraints and optimization objectives. The framework integrates automated, MDP-based generation of optimal reconfiguration policies, dataflow-based application modeling, and implementation of embedded control software that carries out the generated reconfiguration policies. Third, we present a new methodology for design and implementation of signal processing systems that are targeted to system-on-chip (SoC) platforms. The methodology is centered on the use of LWDF concepts and methods for applying principles of dataflow design at different layers of abstraction. The development processes integrated in our approach are software implementation, hardware implementation, hardware-software co-design, and optimized application mapping. The proposed methodology facilitates development and integration of signal processing hardware and software modules that involve heterogeneous programming languages and platforms. Through three case studies involving complex applications, we demonstrate the effectiveness of the proposed contributions for compact system level design and design space exploration: a digital predistortion (DPD) system, a reconfigurable channelizer for wireless communication, and a deep neural network (DNN) for vehicle classification

    Design methodology for embedded computer vision systems

    Get PDF
    Computer vision has emerged as one of the most popular domains of embedded appli¬cations. Though various new powerful embedded platforms to support such applica¬tions have emerged in recent years, there is a distinct lack of efficient domain-specific synthesis techniques for optimized implementation of such systems. In this thesis, four different aspects that contribute to efficient design and synthesis of such systems are explored: (1) Graph Transformations: Dataflow modeling is widely used in digital signal processing (DSP) systems. However, support for dynamic behavior in such systems exists mainly at the modeling level and there is a lack of optimized synthesis tech¬niques for these models. New transformation techniques for efficient system-on-chip (SoC) design methods are proposed and implemented for cyclo-static dataflow and its parameterized version (parameterized cyclo-static dataflow) -- two powerful models that allow dynamic reconfigurability and phased behavior in DSP systems. (2) Design Space Exploration: The broad range of target platforms along with the complexity of applications provides a vast design space, calling for efficient tools to explore this space and produce effective design choices. A novel architectural level design methodology based on a formalism called multirate synchronization graphs is presented along with methods for performance evaluation. (3) Multiprocessor Communication Interface: Efficient code synthesis for emerg¬ing new parallel architectures is an important and sparsely-explored problem. A widely-encountered problem in this regard is efficient communication between pro¬cessors running different sub-systems. A widely used tool in the domain of general-purpose multiprocessor clusters is MPI (Message Passing Interface). However, this does not scale well for embedded DSP systems. A new, powerful and highly optimized communication interface for multiprocessor signal processing systems is presented in this work that is based on the integration of relevant properties of MPI with dataflow semantics. (4) Parameterized Design Framework for Particle Filters: Particle filter systems constitute an important class of applications used in a wide number of fields. An effi¬cient design and implementation framework for such systems has been implemented based on the observation that a large number of such applications exhibit similar prop¬erties. The key properties of such applications are identified and parameterized appro¬priately to realize different systems that represent useful trade-off points in the space of possible implementations

    Optimization of Discrete-parameter Multiprocessor Systems using a Novel Ergodic Interpolation Technique

    Full text link
    Modern multi-core systems have a large number of design parameters, most of which are discrete-valued, and this number is likely to keep increasing as chip complexity rises. Further, the accurate evaluation of a potential design choice is computationally expensive because it requires detailed cycle-accurate system simulation. If the discrete parameter space can be embedded into a larger continuous parameter space, then continuous space techniques can, in principle, be applied to the system optimization problem. Such continuous space techniques often scale well with the number of parameters. We propose a novel technique for embedding the discrete parameter space into an extended continuous space so that continuous space techniques can be applied to the embedded problem using cycle accurate simulation for evaluating the objective function. This embedding is implemented using simulation-based ergodic interpolation, which, unlike spatial interpolation, produces the interpolated value within a single simulation run irrespective of the number of parameters. We have implemented this interpolation scheme in a cycle-based system simulator. In a characterization study, we observe that the interpolated performance curves are continuous, piece-wise smooth, and have low statistical error. We use the ergodic interpolation-based approach to solve a large multi-core design optimization problem with 31 design parameters. Our results indicate that continuous space optimization using ergodic interpolation-based embedding can be a viable approach for large multi-core design optimization problems.Comment: A short version of this paper will be published in the proceedings of IEEE MASCOTS 2015 conferenc

    Learning to infer: RL-based search for DNN primitive selection on Heterogeneous Embedded Systems

    Full text link
    Deep Learning is increasingly being adopted by industry for computer vision applications running on embedded devices. While Convolutional Neural Networks' accuracy has achieved a mature and remarkable state, inference latency and throughput are a major concern especially when targeting low-cost and low-power embedded platforms. CNNs' inference latency may become a bottleneck for Deep Learning adoption by industry, as it is a crucial specification for many real-time processes. Furthermore, deployment of CNNs across heterogeneous platforms presents major compatibility issues due to vendor-specific technology and acceleration libraries. In this work, we present QS-DNN, a fully automatic search based on Reinforcement Learning which, combined with an inference engine optimizer, efficiently explores through the design space and empirically finds the optimal combinations of libraries and primitives to speed up the inference of CNNs on heterogeneous embedded devices. We show that, an optimized combination can achieve 45x speedup in inference latency on CPU compared to a dependency-free baseline and 2x on average on GPGPU compared to the best vendor library. Further, we demonstrate that, the quality of results and time "to-solution" is much better than with Random Search and achieves up to 15x better results for a short-time search
    corecore