5,842 research outputs found

    Dynamic Power Management for Neuromorphic Many-Core Systems

    Full text link
    This work presents a dynamic power management architecture for neuromorphic many core systems such as SpiNNaker. A fast dynamic voltage and frequency scaling (DVFS) technique is presented which allows the processing elements (PE) to change their supply voltage and clock frequency individually and autonomously within less than 100 ns. This is employed by the neuromorphic simulation software flow, which defines the performance level (PL) of the PE based on the actual workload within each simulation cycle. A test chip in 28 nm SLP CMOS technology has been implemented. It includes 4 PEs which can be scaled from 0.7 V to 1.0 V with frequencies from 125 MHz to 500 MHz at three distinct PLs. By measurement of three neuromorphic benchmarks it is shown that the total PE power consumption can be reduced by 75%, with 80% baseline power reduction and a 50% reduction of energy per neuron and synapse computation, all while maintaining temporary peak system performance to achieve biological real-time operation of the system. A numerical model of this power management model is derived which allows DVFS architecture exploration for neuromorphics. The proposed technique is to be used for the second generation SpiNNaker neuromorphic many core system

    The Design of a System Architecture for Mobile Multimedia Computers

    Get PDF
    This chapter discusses the system architecture of a portable computer, called Mobile Digital Companion, which provides support for handling multimedia applications energy efficiently. Because battery life is limited and battery weight is an important factor for the size and the weight of the Mobile Digital Companion, energy management plays a crucial role in the architecture. As the Companion must remain usable in a variety of environments, it has to be flexible and adaptable to various operating conditions. The Mobile Digital Companion has an unconventional architecture that saves energy by using system decomposition at different levels of the architecture and exploits locality of reference with dedicated, optimised modules. The approach is based on dedicated functionality and the extensive use of energy reduction techniques at all levels of system design. The system has an architecture with a general-purpose processor accompanied by a set of heterogeneous autonomous programmable modules, each providing an energy efficient implementation of dedicated tasks. A reconfigurable internal communication network switch exploits locality of reference and eliminates wasteful data copies

    Toolflows for Mapping Convolutional Neural Networks on FPGAs: A Survey and Future Directions

    Get PDF
    In the past decade, Convolutional Neural Networks (CNNs) have demonstrated state-of-the-art performance in various Artificial Intelligence tasks. To accelerate the experimentation and development of CNNs, several software frameworks have been released, primarily targeting power-hungry CPUs and GPUs. In this context, reconfigurable hardware in the form of FPGAs constitutes a potential alternative platform that can be integrated in the existing deep learning ecosystem to provide a tunable balance between performance, power consumption and programmability. In this paper, a survey of the existing CNN-to-FPGA toolflows is presented, comprising a comparative study of their key characteristics which include the supported applications, architectural choices, design space exploration methods and achieved performance. Moreover, major challenges and objectives introduced by the latest trends in CNN algorithmic research are identified and presented. Finally, a uniform evaluation methodology is proposed, aiming at the comprehensive, complete and in-depth evaluation of CNN-to-FPGA toolflows.Comment: Accepted for publication at the ACM Computing Surveys (CSUR) journal, 201

    Energy Efficient Branch and Bound based On-Chip Irregular Network Design

    Get PDF
    Here we present a technique which construct the topology for heterogeneous SoC, (Application Specific NoC) such that total Dynamic communication energy is optimized. The topology is certain to satisfy the constraints of node degree as well the link length. We first layout the topology by finding the shortest path between traffic characteristics with the branch and bound optimization technique. Deadlock is dealt with escape routing using Spanning tree. Investigation outcome show that the proposed design methodology is fast and achieves significant dynamic energy gain

    Nature-Inspired Interconnects for Self-Assembled Large-Scale Network-on-Chip Designs

    Get PDF
    Future nano-scale electronics built up from an Avogadro number of components needs efficient, highly scalable, and robust means of communication in order to be competitive with traditional silicon approaches. In recent years, the Networks-on-Chip (NoC) paradigm emerged as a promising solution to interconnect challenges in silicon-based electronics. Current NoC architectures are either highly regular or fully customized, both of which represent implausible assumptions for emerging bottom-up self-assembled molecular electronics that are generally assumed to have a high degree of irregularity and imperfection. Here, we pragmatically and experimentally investigate important design trade-offs and properties of an irregular, abstract, yet physically plausible 3D small-world interconnect fabric that is inspired by modern network-on-chip paradigms. We vary the framework's key parameters, such as the connectivity, the number of switch nodes, the distribution of long- versus short-range connections, and measure the network's relevant communication characteristics. We further explore the robustness against link failures and the ability and efficiency to solve a simple toy problem, the synchronization task. The results confirm that (1) computation in irregular assemblies is a promising and disruptive computing paradigm for self-assembled nano-scale electronics and (2) that 3D small-world interconnect fabrics with a power-law decaying distribution of shortcut lengths are physically plausible and have major advantages over local 2D and 3D regular topologies

    Energy Efficient Network Generation for Application Specific NoC

    Get PDF
    Networks-on-Chip is emerging as a communication platform for future complex SoC designs, composed of a large number of homogenous or heterogeneous processing resources. Most SoC platforms are customized to the domainspecific requirements of their applications, which communicate in a specific, mostly irregular way. The specific but often diverse communication requirements among cores of the SoC call for the design of application-specific network of SoC for improved performance in terms of communication energy, latency, and throughput. In this work, we propose a methodology for the design of customized irregular network architecture of SoC. The proposed method exploits priori knowledge of the application2019;s communication characteristic to generate an energy optimized network and corresponding routing tables
    • …
    corecore