An architecture for multi-layer feed-forward neural networks.

Author: Nosratinia Aria
Publication venue: 'University of Windsor Leddy Library'
Publication date: 01/01/1991

Feed-forward neural networks can perform classifications and generalizations that are difficult to achieve with any other known method, and their performance matches or surpasses that of conventional methods. To use the potential of these networks to the fullest, however, an efficient hardware implementation is needed. In this thesis, an architecture for efficient implementation of feed-forward multi-layer neural networks is introduced. The interconnection congestion problem is addressed by a multiplexing scheme, which reduces the number of physical interconnections without any loss of generality. The building blocks are mostly in current-mode analog CMOS, and the connection strengths of the network are stored in a digital memory. Also included in this thesis is a performance analysis of the architecture and a study of the effects of quantization and truncation of connection strengths on network performance. Dept. of Electrical and Computer Engineering. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis1991 .N687. Source: Masters Abstracts International, Volume: 31-01, page: 0398. Co-Supervisors: M. Ahmadi; M. Shridhar. Thesis (M.A.Sc.)--University of Windsor (Canada), 1991.
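The abstract above mentions studying how quantization and truncation of digitally stored connection strengths affect network performance. A minimal sketch of that kind of experiment follows. Everything in it is an assumption made for illustration and is not taken from the thesis: the signed fixed-point grid, the `tanh` activation, the tiny 4-3-2 network, and the names `quantize`, `quantize_layers`, and `forward`.

```python
import math
import random


def quantize(w, bits, w_max=1.0, truncate=False):
    """Snap a weight onto a signed fixed-point grid with `bits` bits (assumed format)."""
    levels = 2 ** (bits - 1) - 1          # largest positive code, e.g. 127 for 8 bits
    step = w_max / levels                 # distance between adjacent stored values
    clipped = max(-w_max, min(w_max, w))  # saturate weights outside the range
    scaled = clipped / step
    code = math.trunc(scaled) if truncate else round(scaled)
    return code * step


def quantize_layers(layers, bits, truncate=False):
    """Apply `quantize` to every weight of every layer."""
    return [[[quantize(w, bits, truncate=truncate) for w in row] for row in weights]
            for weights in layers]


def forward(x, layers):
    """Plain feed-forward pass; each layer is a list of weight rows, one row per neuron."""
    for weights in layers:
        x = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]
    return x


# Hypothetical 4-3-2 network with random weights in [-1, 1].
rng = random.Random(0)
layers = [
    [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(3)],
    [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(2)],
]
x = [0.5, -0.25, 0.75, 0.1]
reference = forward(x, layers)

for bits in (4, 8, 12):
    for truncate in (False, True):
        out = forward(x, quantize_layers(layers, bits, truncate=truncate))
        err = max(abs(a - b) for a, b in zip(out, reference))
        mode = "truncate" if truncate else "round"
        print(f"{bits:2d} bits, {mode:8s}: max output error = {err:.6f}")
```

Running the loop prints the worst-case output deviation for each word length, which is one simple way to compare rounding against truncation before committing to a memory width.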

RNA: REUSABLE NEURON ARCHITECTURE FOR ON-CHIP ELECTROCARDIOGRAM CLASSIFICATION AND MACHINE LEARNING

Author: Sun Yuwen
Publication date: 25/06/2010

Artificial neural networks (ANNs) offer tremendous promise in classifying electrocardiogram (ECG) signals for detection and diagnosis of cardiovascular diseases. In this thesis, we propose a reusable neuron architecture (RNA) to enable efficient and cost-effective ANN-based ECG processing by multiplexing the same physical neurons for both feed-forward and back-propagation stages. RNA further conserves the area and resources of the chip and reduces power dissipation by coalescing different layers of the neural network into a single layer. Moreover, the microarchitecture of each RNA neuron has been optimized to maximize the degree of hardware reusability by fusing multiple two-input multipliers and a multi-input adder into one two-input multiplier and one two-input adder. With RNA, we demonstrated a hardware implementation of a three-layer 51-30-12 artificial neural network using only thirty physical RNA neurons. A quantitative design space exploration in area, power dissipation, and speed between the proposed RNA and three other implementations representative of different reusable hardware strategies is presented and discussed. An RNA ASIC was implemented using 45nm CMOS technology and verified on a Xilinx Virtex-5 FPGA board. Compared with an equivalent software implementation in C executed on a mainstream embedded microprocessor, the RNA ASIC improves the training speed and the energy efficiency by three orders of magnitude each. The real-time operation and functional correctness of RNA were verified using real ECG signals from the MIT-BIH arrhythmia database.
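The reuse idea in the abstract above (thirty physical neurons serving a 51-30-12 network, each built around one two-input multiplier and one two-input adder) can be illustrated with a small software model. This is only a sketch under stated assumptions: the serial multiply-accumulate schedule, the sigmoid activation, the cycle counting, and the names `PhysicalNeuron` and `run_network` are hypothetical and not taken from the thesis, and only the feed-forward stage is modelled.

```python
import math
import random


class PhysicalNeuron:
    """One reusable neuron: a single multiplier feeding a single two-input adder."""

    def __init__(self):
        self.acc = 0.0

    def reset(self):
        self.acc = 0.0

    def mac(self, weight, value):
        # One cycle: one two-input multiply, then one two-input add into the accumulator.
        self.acc = self.acc + weight * value

    def fire(self):
        # Assumed sigmoid activation.
        return 1.0 / (1.0 + math.exp(-self.acc))


def run_network(inputs, layer_weights, pool):
    """Evaluate every layer on the same pool of physical neurons, one input per cycle."""
    cycles = 0
    activations = inputs
    for weights in layer_weights:            # weights[j][i]: input i -> neuron j
        assert len(weights) <= len(pool), "layer is wider than the neuron pool"
        active = pool[:len(weights)]
        for neuron in active:
            neuron.reset()
        for i, value in enumerate(activations):
            # All active neurons consume the same broadcast input in this cycle.
            for neuron, row in zip(active, weights):
                neuron.mac(row[i], value)
            cycles += 1
        activations = [neuron.fire() for neuron in active]
    return activations, cycles


rng = random.Random(1)
sizes = [51, 30, 12]                          # network shape quoted in the abstract
layer_weights = [
    [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    for n_in, n_out in zip(sizes, sizes[1:])
]
pool = [PhysicalNeuron() for _ in range(30)]  # thirty physical neurons
ecg_features = [rng.uniform(0.0, 1.0) for _ in range(sizes[0])]

outputs, cycles = run_network(ecg_features, layer_weights, pool)
print(len(outputs), "outputs in", cycles, "multiply-accumulate cycles")
```

In this model the hidden layer occupies all thirty neurons and the output layer reuses twelve of them, so the hardware cost is set by the widest layer rather than by the total neuron count, at the price of serializing the inputs.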

E-PUR: An Energy-Efficient Processing Unit for Recurrent Neural Networks

Author: Arnau Jose-Maria; Dot Gem; Gonzalez Antonio; Silfa Franyell
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 20/11/2017

Recurrent Neural Networks (RNNs) are a key technology for emerging applications such as automatic speech recognition, machine translation, and image description. Long Short-Term Memory (LSTM) networks are the most successful RNN implementation, as they can learn long-term dependencies to achieve high accuracy. Unfortunately, the recurrent nature of LSTM networks significantly constrains the amount of parallelism and, hence, multicore CPUs and many-core GPUs exhibit poor efficiency for RNN inference. In this paper, we present E-PUR, an energy-efficient processing unit tailored to the requirements of LSTM computation. The main goal of E-PUR is to support large recurrent neural networks for low-power mobile devices. E-PUR provides an efficient hardware implementation of LSTM networks that is flexible enough to support diverse applications. One of its main novelties is a technique that we call Maximizing Weight Locality (MWL), which improves the temporal locality of the memory accesses for fetching the synaptic weights, reducing the memory requirements to a large extent. Our experimental results show that E-PUR achieves real-time performance for different LSTM networks, while reducing energy consumption by orders of magnitude with respect to general-purpose processors and GPUs, and it requires a very small chip area. Compared to a modern mobile SoC, an NVIDIA Tegra X1, E-PUR provides an average energy reduction of 92x.
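The point about the recurrent nature of LSTM networks limiting parallelism can be seen in a plain software model of the standard LSTM cell. The sketch below uses the textbook gate equations; it does not model E-PUR or the MWL technique, and the parameter layout and the names `lstm_step` and `run_sequence` are assumptions made for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def matvec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def lstm_step(x, h_prev, c_prev, params):
    """One time step of a standard LSTM cell.

    params maps each gate name ("i", "f", "o", "g") to a (W, b) pair, where W
    multiplies the concatenation of the input x and the previous hidden state.
    """
    z = list(x) + list(h_prev)
    gates = {}
    for name in ("i", "f", "o", "g"):
        W, b = params[name]
        pre = [p + bias for p, bias in zip(matvec(W, z), b)]
        act = math.tanh if name == "g" else sigmoid
        gates[name] = [act(p) for p in pre]
    # New cell state: forget part of the old state, add the gated candidate.
    c = [f * cp + i * g
         for f, cp, i, g in zip(gates["f"], c_prev, gates["i"], gates["g"])]
    # New hidden state: gated view of the squashed cell state.
    h = [o * math.tanh(ci) for o, ci in zip(gates["o"], c)]
    return h, c


def run_sequence(xs, params, hidden_size):
    """Process a sequence; step t cannot start before step t-1 has produced h and c."""
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    for x in xs:
        h, c = lstm_step(x, h, c, params)
    return h, c
```

The loop in `run_sequence` carries `h` and `c` from one step to the next, so time steps cannot be evaluated in parallel. The same four weight matrices are read at every step, which is the kind of repeated weight fetching that a locality-oriented design would aim to exploit.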

arXiv.org e-Print Archive