18 research outputs found

    Iteratively Training Look-Up Tables for Network Quantization

    Operating deep neural networks (DNNs) on devices with limited resources requires the reduction of their memory as well as computational footprint. Popular reduction methods are network quantization or pruning, which either reduce the word length of the network parameters or remove weights from the network if they are not needed. In this article we discuss a general framework for network reduction which we call `Look-Up Table Quantization` (LUT-Q). For each layer, we learn a value dictionary and an assignment matrix to represent the network weights. We propose a special solver which combines gradient descent and a one-step k-means update to learn both the value dictionaries and assignment matrices iteratively. This method is very flexible: by constraining the value dictionary, many different reduction problems such as non-uniform network quantization, training of multiplierless networks, network pruning or simultaneous quantization and pruning can be implemented without changing the solver. This flexibility of the LUT-Q method allows us to use the same method to train networks for different hardware capabilities. Comment: Copyright 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other work.
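    The one-step k-means update mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' solver: the gradient descent step is omitted, the weights are a flat list rather than a layer tensor, and the names `kmeans_step` and `quantize` are hypothetical helpers introduced only for this illustration.

```python
def kmeans_step(weights, dictionary):
    """One k-means update: assign each weight, then recompute the dictionary.

    Hypothetical helper for illustration; not taken from the paper.
    """
    # Assignment step: index of the nearest dictionary value for each weight.
    assignments = [
        min(range(len(dictionary)), key=lambda k: abs(w - dictionary[k]))
        for w in weights
    ]
    # Update step: each entry becomes the mean of the weights assigned to it.
    new_dictionary = list(dictionary)
    for k in range(len(dictionary)):
        members = [w for w, a in zip(weights, assignments) if a == k]
        if members:  # entries with no assigned weights are left unchanged
            new_dictionary[k] = sum(members) / len(members)
    return new_dictionary, assignments


def quantize(dictionary, assignments):
    """Rebuild the quantized weights from the dictionary and assignments."""
    return [dictionary[a] for a in assignments]


# Toy example (assumed values): six weights, a dictionary with three entries.
weights = [-0.8, -1.0, 0.2, 0.0, 1.2, 0.8]
dictionary, assignments = kmeans_step(weights, [-1.0, 0.0, 1.0])
quantized = quantize(dictionary, assignments)
```

    In this sketch every weight is stored as a small index into the dictionary, which is where the memory reduction comes from. Constraining the dictionary, for example by never updating an entry fixed at zero, would be one way to express pruning in the same loop, though that variant is an assumption here and not shown.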

    Energy-efficient Hardware Accelerator Design for Convolutional Neural Network

    Department of Electrical Engineering. A convolutional neural network (CNN) is a class of deep neural network that shows superior performance in handling images. Because of this performance, CNNs are widely used to classify an object or detect the position of an object in an image. A CNN can be implemented on either edge devices or cloud servers. Since cloud servers have high computational capabilities, a CNN on the cloud can perform a large number of tasks at once with high throughput. However, a CNN on the cloud requires a long round-trip time. To run inference on an image, data from a sensor must be uploaded to the cloud server, and the information processed by the CNN is then transferred to the user. If an application requires a rapid response in a certain situation, the long round-trip time of the cloud is a critical issue. On the other hand, an edge device has a very short latency, even though it has limited computing resources. In addition, since the edge device does not require the transmission of images over a network, its performance is not affected by network bandwidth. Because of these features, it is efficient to use the cloud for CNN computing in most cases, but the edge device is preferred in some applications. For example, a CNN algorithm for an autonomous car requires rapid responses. A CNN on the cloud requires transmission and reception of images through the network, so it cannot respond quickly to users. This problem becomes more serious when a high-resolution input image is required. The edge device, by contrast, does not require the data transmission and can respond very quickly. Edge devices are also suitable for CNN applications involving privacy or security. However, because the edge device has limited energy resources, the energy efficiency of the CNN accelerator is a very important issue. An embedded CNN accelerator consists of off-chip memory, a host CPU and a hardware accelerator.
The hardware accelerator consists of the main controller, a global buffer and arrays of processing elements (PEs). It also has a separate compression module and activation module. In this dissertation, we propose energy-efficient designs in three different parts. First, we propose a time-multiplexing PE to increase the energy efficiency of multipliers. Based on the fact that the feature maps have small values, which are defined as non-outliers, we increase the energy efficiency of computing non-outliers. To further improve the energy efficiency of the PE, approximate computing is also introduced. A method to optimize the trade-off between accuracy and energy is also proposed. Second, we investigate an energy-efficient accuracy recovery circuit. For the implementation of a CNN on the edge, CNN loops are usually tiled. During tiling of CNN loops, accuracy can be degraded. We analyze the accuracy reduction due to tiling and recover accuracy by extending et al. of partial sums with very small energy overhead. Third, we reduce the energy consumption of DRAM accesses. A CNN requires massive data transmission between on-chip and off-chip memory. The energy consumption of data transmission accounts for a large portion of total energy consumption. We propose a spatial correlation-aware compression algorithm to reduce the transmission of feature maps. In each of these three levels, this dissertation proposes novel optimization and design flows which increase the energy efficiency of a CNN accelerator on the edge.

    Applications of MATLAB in Science and Engineering

    The book consists of 24 chapters illustrating a wide range of areas where MATLAB tools are applied. These areas include mathematics, physics, chemistry and chemical engineering, mechanical engineering, biological (molecular biology) and medical sciences, communication and control systems, digital signal, image and video processing, and system modeling and simulation. Many interesting problems have been included throughout the book, and its contents will be beneficial for students and professionals across a wide range of interests.

    Novel arithmetic implementations using cellular neural network arrays.

    The primary goal of this research is to explore the use of arrays of analog self-synchronized cells, the cellular neural network (CNN) paradigm, in the implementation of novel digital arithmetic architectures. In exploring this paradigm we also discover that the implementation of these CNN arrays produces very low system noise; that is, noise generated by the rapid switching of current through power supply die connections, so-called di/dt noise. With the migration to sub-100-nanometer process technology, signal integrity is becoming a critical issue when integrating analog and digital components onto the same chip, and so the CNN architectural paradigm offers a potential solution to this problem. A typical example is the replacement of conventional digital circuitry adjacent to sensitive bio-sensors in a SoC Bio-Platform. The focus of this research is therefore to discover novel approaches to building low-noise digital arithmetic circuits using analog cellular neural networks, essentially implementing asynchronous digital logic but with the same circuit components as used in analog circuit design. We address our exploration by first improving upon previous research into CNN binary arithmetic arrays. The second phase of our research introduces a logical extension of the binary arithmetic method to implement binary signed-digit (BSD) arithmetic. To this end, a new class of CNNs that has three stable states is introduced, and is used to implement arithmetic circuits that use binary inputs and outputs but internally use the BSD number representation. Finally, we develop CNN arrays for a 2-dimensional number representation (the Double-base Number System - DBNS). A novel adder architecture is described in detail that performs the addition as well as reducing the representation for further processing; the design incorporates an innovative self-programmable array.
Extensive simulations have shown that our new architectures can reduce system noise by almost 70dB and crosstalk by more than 23dB over standard digital implementations. Dept. of Electrical and Computer Engineering. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis2005 .I27. Source: Dissertation Abstracts International, Volume: 66-11, Section: B, page: 6159. Thesis (Ph.D.)--University of Windsor (Canada), 2005.
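    For readers unfamiliar with the Double-base Number System named in the abstract above, the following sketch shows one common way to write an integer as a sum of terms of the form 2^a * 3^b, using a simple greedy search. It is software-only and says nothing about the thesis's CNN adder or its reduction step; the function names `largest_2_3_term` and `to_dbns` are hypothetical and chosen only for this illustration.

```python
def largest_2_3_term(n):
    """Return (value, a, b) with value = 2**a * 3**b the largest such term <= n."""
    best = (1, 0, 0)
    p3, b = 1, 0
    while p3 <= n:
        # For this power of three, multiply by two as long as we stay <= n.
        value, a = p3, 0
        while value * 2 <= n:
            value *= 2
            a += 1
        if value > best[0]:
            best = (value, a, b)
        p3 *= 3
        b += 1
    return best


def to_dbns(n):
    """Greedy double-base expansion: list of (a, b) exponent pairs summing to n."""
    terms = []
    while n > 0:
        value, a, b = largest_2_3_term(n)
        terms.append((a, b))
        n -= value
    return terms


def from_dbns(terms):
    """Evaluate a list of (a, b) pairs back to an integer."""
    return sum(2**a * 3**b for a, b in terms)


# 127 = 108 + 18 + 1 = 2^2*3^3 + 2^1*3^2 + 2^0*3^0
terms = to_dbns(127)
```

    The representation is two-dimensional in the sense that each term is indexed by a pair of exponents, so a number can be laid out as a sparse grid of (a, b) cells. The greedy expansion is only one possible choice and is not guaranteed to give the shortest representation.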

    Digital Filters and Signal Processing

    Digital filters, together with signal processing, are being employed in new technologies and information systems, and are implemented in different areas and applications. Digital filters and signal processing can be used at no cost and adapted to different cases with great flexibility and reliability. This book presents advanced developments in digital filters and signal processing methods, covering different case studies. They present the main essence of the subject, with the principal approaches to the most recent mathematical models that are being employed worldwide.

    High-speed fir filter design and optimization using artificial intelligence techniques

    Ph.D. (Doctor of Philosophy)