
267 research outputs found

Recommended from our members

Versatile stochastic dot product circuits based on nonvolatile memories for high performance neurocomputing and neurooptimization.

Authors: Mahmoodi MR, Prezioso M, Strukov DB
Publication venue: eScholarship, University of California
Publication date: 01/11/2019

The key operation in stochastic neural networks, which have become the state-of-the-art approach for solving problems in machine learning, information theory, and statistics, is a stochastic dot product. While there have been many demonstrations of dot-product circuits and, separately, of stochastic neurons, an efficient hardware implementation combining both functionalities is still missing. Here we report compact, fast, energy-efficient, and scalable stochastic dot-product circuits based on either passively integrated metal-oxide memristors or embedded floating-gate memories. The circuits' high performance is due to their mixed-signal implementation, while efficient stochastic operation is achieved by utilizing the circuit's noise, intrinsic and/or extrinsic to the memory cell array. The dynamic scaling of weights, enabled by analog memory devices, allows for efficient realization of different annealing approaches to improve functionality. The proposed approach is experimentally verified for two representative applications, namely by implementing a neural network for solving a four-node graph-partitioning problem, and a Boltzmann machine with 10 input and 8 hidden neurons.
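The idea of a stochastic dot product can be illustrated in software. The sketch below is not the authors' circuit: it assumes a binary neuron whose weighted sum is perturbed by zero-mean Gaussian noise, and the names `scaled_dot`, `stochastic_neuron`, `firing_probability`, `noise_sigma`, and `scale` are hypothetical. The `scale` parameter stands in for the dynamic weight scaling mentioned in the abstract.

```python
import math
import random


def scaled_dot(weights, inputs, scale=1.0):
    """Deterministic dot product, multiplied by a global weight scale."""
    return scale * sum(w * x for w, x in zip(weights, inputs))


def stochastic_neuron(weights, inputs, noise_sigma, rng, scale=1.0):
    """Return 1 if the noisy weighted sum is positive, else 0.

    Assumption: the noise is additive, zero-mean Gaussian with standard
    deviation noise_sigma. The abstract does not specify the noise model.
    """
    noisy_sum = scaled_dot(weights, inputs, scale) + rng.gauss(0.0, noise_sigma)
    return 1 if noisy_sum > 0.0 else 0


def firing_probability(weights, inputs, noise_sigma, scale=1.0):
    """Analytic probability that stochastic_neuron returns 1 (Gaussian CDF)."""
    z = scaled_dot(weights, inputs, scale) / noise_sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
```

Under these assumptions, increasing `scale` while the noise level stays fixed makes the neuron more deterministic, which is one way to picture how weight scaling could play the role of a temperature schedule in annealing.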

eScholarship - University of California

An Analog VLSI Deep Machine Learning Implementation

Author: Lu Junjie
Publication venue: TRACE: Tennessee Research and Creative Exchange
Publication date: 01/05/2014

Machine learning systems provide automated data processing and see a wide range of applications. Direct processing of raw high-dimensional data such as images and video by machine learning systems is impractical both due to prohibitive power consumption and the “curse of dimensionality,” which makes learning tasks exponentially more difficult as dimension increases. Deep machine learning (DML) mimics the hierarchical presentation of information in the human brain to achieve robust automated feature extraction, reducing the dimension of such data. However, the computational complexity of DML systems limits large-scale implementations in standard digital computers. Custom analog signal processing (ASP) can yield much higher energy efficiency than digital signal processing (DSP), presenting a means of overcoming these limitations. The purpose of this work is to develop an analog implementation of a DML system. First, an analog memory is proposed as an essential component of the learning systems. It uses the charge trapped on the floating gate to store an analog value in a non-volatile way. The memory is compatible with a standard digital CMOS process and allows random-accessible bi-directional updates without the need for an on-chip charge pump or high-voltage switch. Second, architecture and circuits are developed to realize an online k-means clustering algorithm in analog signal processing. It achieves automatic recognition of underlying data patterns and online extraction of data statistical parameters. This unsupervised learning system constitutes the computation node in the deep machine learning hierarchy. Third, a 3-layer, 7-node analog deep machine learning engine is designed featuring online unsupervised trainability and non-volatile floating-gate analog storage. It utilizes a massively parallel reconfigurable current-mode analog architecture to realize efficient computation. Algorithm-level feedback is leveraged to provide robustness to circuit imperfections in analog signal processing. At a processing speed of 8300 input vectors per second, it achieves a peak energy efficiency of 1×10^12 operations per second per watt. In addition, an ultra-low-power tunable bump circuit is presented to provide similarity measures in analog signal processing. It incorporates a novel wide-input-range tunable pseudo-differential transconductor. The circuit demonstrates tunability of bump center, width and height with a power consumption significantly lower than previous works.
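The online k-means clustering mentioned in this abstract can be sketched in a few lines of software. This is a generic textbook form, not the thesis's analog circuit: the squared Euclidean distance, the fixed `learning_rate`, and the function name `online_kmeans_step` are all assumptions made for illustration.

```python
def online_kmeans_step(centroids, sample, learning_rate):
    """Update the nearest centroid in place and return its index.

    Assumptions: similarity is squared Euclidean distance and the winner
    moves a fixed fraction (learning_rate) of the way toward the sample.
    """
    distances = [
        sum((c_i - s_i) ** 2 for c_i, s_i in zip(centroid, sample))
        for centroid in centroids
    ]
    winner = distances.index(min(distances))
    centroids[winner] = [
        c_i + learning_rate * (s_i - c_i)
        for c_i, s_i in zip(centroids[winner], sample)
    ]
    return winner
```

Each incoming vector updates only one centroid, so the centroids track the running statistics of the data without storing past samples, which is what makes the method suitable for streaming input.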

University of Tennessee, Knoxville: Trace

Investigation of practical issues in translating algorithms based on back-propagation into analogue, VLSI circuits

Author: Woodburn Robin
Publication venue: The University of Edinburgh
Publication date: 01/01/1996

Edinburgh Research Archive

Redesigning Commercial Floating-Gate Memory for Analog Computing Applications

Authors: Bayat F. Merrikh, Do N., Guo X., Likharev K. K., Ommani H. A., Strukov D. B.
Publication date: 15/10/2014

We have modified a commercial NOR flash memory array to enable high-precision tuning of individual floating-gate cells for analog computing applications. The modified array area per cell in a 180 nm process is about 1.5 µm^2. While this area is approximately twice the original cell size, it is still at least an order of magnitude smaller than in the state-of-the-art analog circuit implementations. The new memory cell arrays have been successfully tested, in particular confirming that each cell may be automatically tuned, with ~1% precision, to any desired subthreshold readout current value within an almost three-orders-of-magnitude dynamic range, even using an unoptimized tuning algorithm. Preliminary results for a four-quadrant vector-by-matrix multiplier, implemented with the modified memory array gate-coupled with additional peripheral floating-gate transistors, show highly linear transfer characteristics over a broad range of input currents.

Comment: 4 pages, 6 figures
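The abstract does not describe the tuning algorithm itself. As a rough illustration of what tuning a cell to a target readout current within about 1% can look like, the sketch below uses a generic read-and-adjust loop against a toy cell model. Everything here is hypothetical: the `ToyCell` class, its multiplicative pulse response, and the `tune_cell` function are assumptions, not the authors' method or device behavior.

```python
class ToyCell:
    """Hypothetical cell whose readout current changes multiplicatively per pulse."""

    def __init__(self, current):
        self.current = current

    def read(self):
        return self.current

    def pulse(self, increase, strength):
        # Assumed response: a pulse of given strength changes the current
        # by a factor of up to 1.5, upward or downward.
        factor = 1.0 + 0.5 * strength
        self.current = self.current * factor if increase else self.current / factor


def tune_cell(read_current, apply_pulse, target, tolerance=0.01, max_pulses=1000):
    """Alternate reads and pulses until the relative error is within tolerance.

    Returns the number of pulses applied.
    """
    for pulse_count in range(max_pulses):
        error = (read_current() - target) / target
        if abs(error) <= tolerance:
            return pulse_count
        # Pulse strength shrinks as the cell approaches the target.
        apply_pulse(increase=(error < 0), strength=min(1.0, abs(error)))
    raise RuntimeError("cell did not reach the target within max_pulses")
```

For example, `cell = ToyCell(1e-9)` followed by `tune_cell(cell.read, cell.pulse, target=1e-7)` moves the toy cell across two orders of magnitude. A real device would add variability and nonlinear pulse response that this model ignores.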

arXiv.org e-Print Archive

Crossref

An analog neural network with on-chip learning

Author: Sigvartsen Roy Ludvig
Publication date: 01/01/1994

NORA - Norwegian Open Research Archives

Design of Building Blocks for Trit Algorithm

Author: Parthasarathy Balaji
Publication venue: Oklahoma State University Library
Publication date: 01/05/1993

This thesis attempts to design the building blocks for the TRIT algorithm. PSPICE was used for simulation. The building blocks were laid out in Magic.

Electrical Engineering

SHAREOK repository

Pulse stream VLSI circuits and techniques for the implementation of neural networks

Author: Hamilton Alister
Publication venue: The University of Edinburgh
Publication date: 01/01/1993

Edinburgh Research Archive