
111,043 research outputs found

Energy-Efficient Neural Network Architectures

Author: Wu Hsi-Shou
Publication date: 01/01/2018

Emerging systems for artificial intelligence (AI) are expected to rely on deep neural networks (DNNs) to achieve high accuracy for a broad variety of applications, including computer vision, robotics, and speech recognition. Due to the rapid growth of network size and depth, however, DNNs typically result in high computational costs and introduce considerable power and performance overheads. Dedicated chip architectures that implement DNNs with high energy efficiency are essential for adding intelligence to interactive edge devices, enabling them to complete increasingly sophisticated tasks while extending battery life. They are also vital for improving performance in cloud servers that support demanding AI computations. This dissertation focuses on architectures and circuit technologies for designing energy-efficient neural network accelerators. First, a deep-learning processor is presented for achieving ultra-low power operation. Using a heterogeneous architecture that includes a low-power always-on front-end and a selectively-enabled high-performance back-end, the processor dynamically adjusts computational resources at runtime to support conditional execution in neural networks and meet performance targets with increased energy efficiency. Featuring a reconfigurable datapath and a memory architecture optimized for energy efficiency, the processor supports multilevel dynamic activation of neural network segments, performing object detection tasks with 5.3x lower energy consumption in comparison with a static execution baseline. Fabricated in 40 nm CMOS, the processor test-chip dissipates 0.23 mW at 5.3 fps. It demonstrates energy scalability up to 28.6 TOPS/W and can be configured to run a variety of workloads, including severely power-constrained ones such as always-on monitoring in mobile applications.
To further improve the energy efficiency of the proposed heterogeneous architecture, a new charge-recovery logic family, called zero-short-circuit current (ZSCC) logic, is proposed to decrease the power consumption of the always-on front-end. By relying on dedicated circuit topologies and a four-phase clocking scheme, ZSCC operates with significantly reduced short-circuit currents, realizing order-of-magnitude power savings at relatively low clock frequencies (on the order of a few MHz). The efficiency and applicability of ZSCC are demonstrated through an ANSI S1.11 1/3 octave filter bank chip for binaural hearing aids with two microphones per ear. Fabricated in a 65 nm CMOS process, this charge-recovery chip consumes 13.8 µW with a 1.75 MHz clock frequency, achieving 9.7x power reduction per input in comparison with a 40 nm monophonic single-input chip that represents the published state of the art. The ability of ZSCC to further increase the energy efficiency of the heterogeneous neural network architecture is demonstrated through the design and evaluation of a ZSCC-based front-end. Simulation results show 17x power reduction compared with a conventional static CMOS implementation of the same architecture.

PhD, Electrical and Computer Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies
https://deepblue.lib.umich.edu/bitstream/2027.42/147614/1/hsiwu_1.pd
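The power figures quoted in this abstract can be turned into rough per-frame and per-cycle energies. The following is a minimal sketch of that arithmetic, assuming the quoted powers are averages at the stated frame rate and clock frequency; the derived values are not reported in the source, and the function names are illustrative.

```python
# Back-of-the-envelope energies derived from figures quoted in the abstract.
# Assumption: the quoted powers are averages at the stated operating points.

def energy_per_frame_uj(power_mw: float, frames_per_second: float) -> float:
    """Average energy per frame in microjoules (power in mW, rate in fps)."""
    return power_mw * 1e-3 / frames_per_second * 1e6

def energy_per_cycle_pj(power_uw: float, clock_mhz: float) -> float:
    """Average energy per clock cycle in picojoules (power in uW, clock in MHz)."""
    return power_uw * 1e-6 / (clock_mhz * 1e6) * 1e12

# 40 nm deep-learning processor: 0.23 mW at 5.3 fps
frame_uj = energy_per_frame_uj(0.23, 5.3)
# 65 nm ZSCC filter bank chip: 13.8 uW at 1.75 MHz
cycle_pj = energy_per_cycle_pj(13.8, 1.75)

print(f"{frame_uj:.1f} uJ per frame")   # 43.4 uJ per frame
print(f"{cycle_pj:.2f} pJ per cycle")   # 7.89 pJ per cycle
```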

Deep Blue Documents at the University of Michigan

Augmented-LSTM and 1D-CNN-LSTM based DPD models for linearization of wideband power amplifiers

Author: Ambagahawela Rathnayake Mudiyanselage A. (Anusha)
Publication venue: University of Oulu
Publication date: 05/05/2023

Abstract. Artificial Neural Networks (ANNs) have gained popularity in modeling the nonlinear behavior of wideband power amplifiers (PAs). Recently, researchers have used two types of neural network architectures, Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN), to model power amplifier behavior and compensate for power amplifier distortion. Each architecture has its own advantages and limitations. In light of these, this study proposes two digital pre-distortion (DPD) models based on LSTM and CNN. The first proposed model is an augmented LSTM model, which effectively reduces distortion in wideband power amplifiers. The measurement results demonstrate that the proposed augmented LSTM model provides better linearization performance than existing state-of-the-art DPDs designed using ANNs. The second proposed model is a 1D-CNN-LSTM model that simplifies the augmented LSTM model by integrating a CNN layer before the LSTM layer. This integration reduces the number of input features to the LSTM layer, resulting in a low-complexity linearization for wideband PAs. The measurement results show that the 1D-CNN-LSTM model provides comparable results to the augmented LSTM model. In summary, this study proposes two novel DPD models based on LSTM and CNN, which effectively reduce distortion and provide low-complexity linearization for wideband PAs. The measurement results demonstrate that both models offer comparable performance to existing state-of-the-art DPDs designed using ANNs.
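To see why fewer input features lower LSTM complexity, a rough parameter count helps. The sketch below uses the standard single-bias LSTM parameter formula. It also assumes, purely for illustration, that a one-filter unpadded 1D convolution slides along the feature axis. The layer sizes (32 features, 16 hidden units, kernel 4, stride 4) are hypothetical and not taken from the thesis.

```python
def lstm_param_count(input_features: int, hidden_units: int) -> int:
    """Parameters of one standard LSTM layer (single bias vector per gate).

    Four gates, each with an input weight matrix, a recurrent weight matrix,
    and a bias vector.
    """
    return 4 * (hidden_units * (input_features + hidden_units) + hidden_units)

def conv1d_output_length(length: int, kernel: int, stride: int = 1) -> int:
    """Output length of an unpadded 1D convolution."""
    return (length - kernel) // stride + 1

# Hypothetical sizes, chosen only to illustrate the effect.
features, hidden, kernel, stride = 32, 16, 4, 4

direct = lstm_param_count(features, hidden)

# Assumption: a single-filter convolution over the feature axis feeds the LSTM.
reduced_features = conv1d_output_length(features, kernel, stride)
conv_params = kernel + 1  # one filter: kernel weights plus one bias
with_cnn = conv_params + lstm_param_count(reduced_features, hidden)

print(direct)            # 3136
print(reduced_features)  # 8
print(with_cnn)          # 1605
```

Under these assumed sizes the convolution front end roughly halves the trainable parameters, since the input weight matrices of all four LSTM gates shrink with the feature count.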

University of Oulu Repository - Jultika

Multilevel HfO2-based RRAM devices for low-power neuromorphic networks

Author: G. Ossorio O.
Ielmini D.
K. Mahadevaiah M.
Milo V.
Olivo P.
Perez E.
Wenger C.
Zambelli C.
Publication venue: 'AIP Publishing'
Publication date: 01/01/2019

Training and recognition with neural networks generally require high throughput, high energy efficiency, and scalable circuits to enable artificial intelligence tasks to be operated at the edge, i.e., in battery-powered portable devices and other limited-energy environments. In this scenario, resistive memories have been proposed as artificial synapses thanks to their scalability, reconfigurability, and high energy efficiency, and thanks to the ability to perform analog computation by physical laws in hardware. In this work, we study the material, device, and architecture aspects of resistive switching memory (RRAM) devices for implementing a 2-layer neural network for pattern recognition. First, various RRAM processes are screened in view of the device window, analog storage, and reliability. Then, synaptic weights are stored with 5-level precision in a 4 kbit array of RRAM devices to classify the Modified National Institute of Standards and Technology (MNIST) dataset. Finally, classification performance of a 2-layer neural network is tested before and after an annealing experiment by using experimental values of conductance stored in the array, and a simulation-based analysis of inference accuracy for arrays of increasing size is presented. Our work supports material-based development of RRAM synapses for novel neural networks with high accuracy and low power consumption. © 2019 Author(s).
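Storing weights with 5-level precision means each weight is snapped to one of five discrete values. The sketch below shows one simple way to do that, using uniform levels over an assumed range of [-1, 1]. The abstract does not describe the actual weight-to-conductance mapping, so the range, the uniform spacing, and the function name are all hypothetical.

```python
def quantize_to_levels(weights, levels=5, w_min=-1.0, w_max=1.0):
    """Snap each weight to the nearest of `levels` uniformly spaced values.

    Assumption: uniform spacing over [w_min, w_max]; values outside the
    range are clipped first. A real RRAM mapping may differ.
    """
    step = (w_max - w_min) / (levels - 1)
    quantized = []
    for w in weights:
        clipped = min(max(w, w_min), w_max)
        index = round((clipped - w_min) / step)  # nearest level index
        quantized.append(w_min + index * step)
    return quantized

# Hypothetical trained weights
weights = [-0.9, -0.3, 0.1, 0.6, 1.4]
print(quantize_to_levels(weights))  # [-1.0, -0.5, 0.0, 0.5, 1.0]
```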

Archivio istituzionale della ricerca - Politecnico di Milano

Repositorio Documental de la Universidad de Valladolid

Archivio istituzionale della ricerca - Università di Ferrara

Synthetic Low-Field MRI Super-Resolution Via Nested U-Net Architecture

Author: Bhutto Danyal
Kalluvila Aryan
Koonjoo Neha
Rockenbach Marcio
Rosen Matthew S.
Publication date: 27/11/2022

Low-field (LF) MRI scanners have the power to revolutionize medical imaging by providing a portable and cheaper alternative to high-field MRI scanners. However, such scanners usually produce significantly noisier, lower-quality images than their high-field counterparts. The aim of this paper is to improve the SNR and overall image quality of low-field MRI scans to improve diagnostic capability. To address this issue, we propose a Nested U-Net neural network architecture super-resolution algorithm that outperforms previously suggested deep learning methods with an average PSNR of 78.83 and SSIM of 0.9551. We tested our network on artificially noised, downsampled synthetic data from a major T1-weighted MRI image dataset called the T1-mix dataset. One board-certified radiologist scored 25 images on a Likert scale (1-5), assessing overall image quality, anatomical structure, and diagnostic confidence across our architecture and other published works (SR DenseNet, Generator Block, SRCNN, etc.). We also introduce a new type of loss function called natural log mean squared error (NLMSE). In conclusion, we present a more accurate deep learning method for single image super-resolution applied to synthetic low-field MRI via a Nested U-Net architecture.
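For readers unfamiliar with the PSNR metric cited above, here is a minimal sketch of its conventional definition, 10·log10(MAX² / MSE), reported in decibels. The abstract does not say which normalization or peak value the authors used, so `max_value=1.0` and the sample pixel values are assumptions. The NLMSE loss is not sketched because the abstract does not define it.

```python
import math

def mse(reference, estimate):
    """Mean squared error between two equal-length pixel sequences."""
    return sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)

def psnr_db(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB; infinite for identical inputs.

    Assumption: pixel intensities are normalized so the peak is `max_value`.
    """
    error = mse(reference, estimate)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / error)

# Hypothetical flattened image patches with intensities in [0, 1]
reference = [0.0, 0.5, 1.0, 0.5]
estimate = [0.1, 0.5, 0.9, 0.5]
print(f"{psnr_db(reference, estimate):.2f} dB")  # 23.01 dB
```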

arXiv.org e-Print Archive

Multi-time-horizon Solar Forecasting Using Recurrent Neural Network

Author: Mishra Sakshi
Palanisamy Praveen
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 14/07/2018

The non-stationary nature of solar power makes traditional point forecasting methods less useful because of large prediction errors. This increases uncertainty in grid operation, reducing reliability and raising the cost of operation. This research paper proposes a unified architecture for multi-time-horizon predictions for short and long-term solar forecasting using Recurrent Neural Networks (RNN). The paper describes an end-to-end pipeline to implement the architecture along with the methods to test and validate the performance of the prediction model. The results demonstrate that the proposed method based on the unified architecture is effective for multi-horizon solar forecasting and achieves a lower root-mean-squared prediction error compared to the previous best-performing methods which use one model for each time-horizon. The proposed method enables multi-horizon forecasts with real-time inputs, which have a high potential for practical applications in the evolving smart grid.

Comment: Accepted at: IEEE Energy Conversion Congress and Exposition (ECCE 2018), 7 pages, 5 figures, code available: sakshi-mishra.github.i
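The root-mean-squared error used to compare forecasts can be computed separately for each horizon. The sketch below illustrates this on made-up irradiance values; the horizon labels, the numbers, and the dictionary layout are hypothetical and are not taken from the paper or its code.

```python
import math

def rmse(actual, predicted):
    """Root-mean-squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical (actual, predicted) values per forecast horizon
forecasts = {
    "1h": ([100.0, 200.0], [103.0, 196.0]),
    "4h": ([100.0, 200.0], [106.0, 192.0]),
}

rmse_by_horizon = {h: rmse(a, p) for h, (a, p) in forecasts.items()}
for horizon, value in rmse_by_horizon.items():
    print(f"{horizon}: RMSE = {value:.2f}")
# 1h: RMSE = 3.54
# 4h: RMSE = 7.07
```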

arXiv.org e-Print Archive

Crossref