Time complexity of in-memory solution of linear systems
In-memory computing with crosspoint resistive memory arrays has been shown to
accelerate data-centric computations such as the training and inference of deep
neural networks, thanks to the high parallelism endowed by physical rules in
the electrical circuits. By connecting crosspoint arrays with negative feedback
amplifiers, it is possible to solve linear algebraic problems such as linear
systems and matrix eigenvectors in just one step. Based on the theory of
feedback circuits, we study the dynamics of the solution of linear systems
within a memory array, showing that the time complexity of the solution is free
of any direct dependence on the problem size N; rather, it is governed by the
minimal eigenvalue of a matrix associated with the coefficient matrix. We show
that, when the linear system is modeled by a covariance matrix, the time
complexity is O(log N) or O(1). In the case of sparse positive-definite linear
systems, the time complexity is solely determined by the minimal eigenvalue of
the coefficient matrix. These results demonstrate the high speed of the circuit
for solving linear systems in a wide range of applications, thus supporting
in-memory computing as a strong candidate for future big data and machine
learning accelerators.
Comment: Accepted by IEEE Trans. Electron Devices. The authors thank Scott Aaronson for helpful discussion about time complexity.
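The claimed size-independence can be illustrated with a minimal numerical sketch. This is not the paper's circuit model: it only assumes the feedback circuit's transient behaves like the continuous dynamics dx/dt = -(Ax - b), whose settling time for positive-definite A is set by the smallest eigenvalue of A, not by N.

```python
import numpy as np

def settle(A, b, dt=0.01, tol=1e-6, max_steps=200_000):
    """Integrate dx/dt = -(A @ x - b) by forward Euler; return the
    solution and the number of steps until the residual drops below tol.
    For positive-definite A the settling time scales as 1/lambda_min(A)."""
    x = np.zeros_like(b)
    for step in range(1, max_steps + 1):
        r = A @ x - b
        if np.linalg.norm(r) < tol:
            return x, step
        x -= dt * r
    return x, max_steps

rng = np.random.default_rng(0)
for n in (8, 32, 128):
    # Random SPD matrix, rescaled so lambda_min is pinned at 1.0; with the
    # minimal eigenvalue fixed, the step count barely changes with n.
    M = rng.standard_normal((n, n))
    A = M @ M.T + n * np.eye(n)
    A /= np.linalg.eigvalsh(A)[0]          # lambda_min(A) = 1
    b = rng.standard_normal(n)
    x, steps = settle(A, b)
```

With lambda_min held constant across the three sizes, the settling step count stays in the same narrow range, mirroring the abstract's claim that the dependence on N is only indirect, through the spectrum.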
In-memory eigenvector computation in time O(1)
In-memory computing with crosspoint resistive memory arrays has gained
enormous attention to accelerate the matrix-vector multiplication in the
computation of data-centric applications. By combining a crosspoint array and
feedback amplifiers, it is possible to compute matrix eigenvectors in one step
without algorithmic iterations. In this work, the time complexity of the
eigenvector computation is investigated, based on the feedback analysis of the
crosspoint circuit. The results show that the computing time of the circuit is
determined by the mismatch degree of the eigenvalues implemented in the
circuit, which controls the rising speed of output voltages. For a dataset of
random matrices, the time for computing the dominant eigenvector in the circuit
is constant for various matrix sizes, namely the time complexity is O(1). The
O(1) time complexity is also supported by simulations of PageRank of real-world
datasets. This work paves the way for fast, energy-efficient accelerators for
eigenvector computation in a wide range of practical applications.
Comment: Accepted by Adv. Intell. Syst.
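Since the abstract cites PageRank as a test case, a small sketch can show why the settling time is size-independent there. This is an idealization, not the paper's circuit: it models the transient as dx/dt = Mx - x for the damped Google matrix M, whose eigenvalue mismatch (1 versus at most the damping factor d) is fixed regardless of how many pages the graph has.

```python
import numpy as np

def pagerank_transient(G, d=0.85, dt=0.05, steps=4000):
    """Idealized transient toward the dominant eigenvector of the damped
    Google matrix M: integrate dx/dt = M x - x. Since lambda_1(M) = 1 and
    |lambda_2| <= d, the settling time is bounded by the fixed eigenvalue
    gap 1 - d, independent of the number of pages."""
    n = G.shape[0]
    out = G.sum(axis=0)
    out[out == 0] = 1                      # guard against dangling pages
    M = d * (G / out) + (1 - d) / n        # column-stochastic Google matrix
    x = np.full(n, 1.0 / n)
    for _ in range(steps):
        x += dt * (M @ x - x)
        x /= x.sum()
    return x

# Tiny 4-page link graph (hypothetical data): G[i, j] = 1 if page j links to page i.
G = np.array([[0, 0, 1, 1],
              [1, 0, 0, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
ranks = pagerank_transient(G)              # page 2, linked by all others, ranks highest
```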
BlockAMC: Scalable In-Memory Analog Matrix Computing for Solving Linear Systems
Recently, in-memory analog matrix computing (AMC) with nonvolatile resistive
memory has been developed for solving matrix problems in one step, e.g., matrix
inversion for solving linear systems. However, the analog nature sets up a
barrier to the scalability of AMC, due to limits on the manufacturability
and yield of resistive memory arrays, device and circuit non-idealities, and the
cost of hardware implementations. Aiming to deliver a scalable AMC approach for
solving linear systems, this work presents BlockAMC, which partitions a large
original matrix into smaller ones on different memory arrays. A macro is
designed to perform matrix inversion and matrix-vector multiplication with the
block matrices, obtaining the partial solutions to recover the original
solution. The size of block matrices can be exponentially reduced by performing
multiple stages of divide-and-conquer, resulting in a two-stage solver design
that enhances the scalability of this approach. BlockAMC is also advantageous
in alleviating the accuracy issue of AMC, especially in the presence of device
and circuit non-idealities, such as conductance variations and interconnect
resistances. Compared to a single AMC circuit solving the same problem,
BlockAMC improves the area and energy efficiency by 48.83% and 40%,
respectively.
Comment: This paper has been accepted to the conference DATE 202
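The divide-and-conquer idea behind BlockAMC can be sketched numerically. The details below are assumptions for illustration, not the paper's exact macro design: one stage partitions the coefficient matrix into 2x2 blocks held on separate arrays and recovers the full solution from block inversions via the Schur complement.

```python
import numpy as np

def block_solve(A, b):
    """One divide-and-conquer stage (illustrative): invert only block-sized
    matrices and combine the partial results into the original solution."""
    n = A.shape[0] // 2
    A11, A12 = A[:n, :n], A[:n, n:]
    A21, A22 = A[n:, :n], A[n:, n:]
    b1, b2 = b[:n], b[n:]
    A11_inv = np.linalg.inv(A11)              # block inversion (one array)
    S = A22 - A21 @ A11_inv @ A12             # Schur complement of A11
    x2 = np.linalg.solve(S, b2 - A21 @ A11_inv @ b1)
    x1 = A11_inv @ (b1 - A12 @ x2)
    return np.concatenate([x1, x2])

rng = np.random.default_rng(2)
A = rng.standard_normal((8, 8)) + 8 * np.eye(8)   # well-conditioned test matrix
b = rng.standard_normal(8)
x = block_solve(A, b)
```

Applying the same split recursively to A11 and S is what yields the exponential reduction in block size that the abstract describes.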
In-memory computing with emerging memory devices: Status and outlook
Supporting data for "In-memory computing with emerging memory devices: status and outlook", submitted to APL Machine Learning
One Step in-Memory Solution of Inverse Algebraic Problems
Abstract: Machine learning requires processing large amounts of irregular data and extracting meaningful information. The von Neumann architecture is challenged by such computation: the physical separation between memory and processing unit limits the maximum speed of analyzing large volumes of data, and the majority of time and energy is spent moving information from memory to the processor and back. In-memory computing executes operations directly within the memory, without any information travelling. In particular, thanks to emerging memory technologies such as memristors, it is possible to program arbitrary real numbers directly into a single memory device in an analog fashion and, at the array level, execute algebraic operations in-memory and in one step. In this chapter, the latest results in accelerating inverse operations, such as the solution of linear systems, in-memory and in a single computational cycle are presented.
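The forward and inverse operations the chapter refers to can be stated in a few lines. This is an idealized numerical analogue (no device non-idealities assumed): Ohm's law plus Kirchhoff's current law make a crosspoint array compute i = Gv in one step, and feedback amplifiers that force the array currents to match an input current vector make the settled voltages satisfy v = G^{-1} i.

```python
import numpy as np

# Idealized crosspoint array: conductance matrix G holds the coefficients.
G = np.array([[3.0, 1.0],
              [1.0, 2.0]])          # hypothetical conductances (e.g., in mS)
v = np.array([0.5, -0.2])           # applied voltage vector

i = G @ v                           # forward: one-step matrix-vector multiply
v_rec = np.linalg.solve(G, i)       # inverse: what the feedback circuit settles to
```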
Applications of memristors in conventional analogue electronics
This dissertation presents the steps employed to activate and utilise analogue memristive devices in conventional analogue circuits and beyond.
TiO2 memristors are mainly utilised in this study, and their large variability in operation in between similar devices is identified.
A specialised memristor characterisation instrument is designed and built to mitigate this issue and to allow access to large numbers of devices at a time.
Its performance is quantified against linear resistors, crossbars of linear resistors, stand-alone memristive elements and crossbars of memristors.
This platform allows for a wide range of different pulsing algorithms to be applied on individual devices, or on crossbars of memristive elements, and is used throughout this dissertation.
Different ways of achieving analogue resistive switching from any device state are presented.
Results of these are used to devise a state-of-art biasing parameter finder which automatically extracts pulsing parameters that induce repeatable analogue resistive switching.
I-V measurements taken during analogue resistive switching are then utilised to model the internal atomic structure of two devices, via fits to the Simmons tunnelling barrier model.
These reveal that voltage pulses modulate a nano-tunnelling gap along a conical shape.
Further retention measurements are performed which reveal that under certain conditions, TiO2 memristors become volatile at short time scales.
This volatile behaviour is then implemented into a novel SPICE volatile memristor model.
These characterisation methods of solid-state devices allowed for inclusion of TiO2 memristors in practical electronic circuits.
Firstly, in the context of large analogue resistive crossbars, a crosspoint reading method is analysed and improved via a 3-step technique.
Its scaling performance is then quantified via SPICE simulations.
Next, the observed volatile dynamics of memristors are exploited in two separate sequence detectors, with applications in neuromorphic engineering.
Finally, the memristor as a programmable resistive weight is exploited to synthesise a memristive programmable gain amplifier and a practical memristive automatic gain control circuit.
Open Access
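The programmable-gain idea in the last step can be sketched with the textbook inverting-amplifier relation. The topology here is an assumption for illustration, not necessarily the dissertation's exact circuit: with a memristor as the input resistor of an ideal inverting op-amp stage, the gain is -R_f/R_m, so programming the memristance sets the gain.

```python
def inverting_gain(r_f, r_m):
    """Ideal inverting op-amp stage: v_out = -(R_f / R_m) * v_in,
    with the memristor's programmed resistance R_m as the input element."""
    return -r_f / r_m

R_F = 100e3                         # fixed feedback resistor, 100 kOhm (assumed)
for r_m in (10e3, 50e3, 100e3):     # memristor programmed to three states
    gain = inverting_gain(R_F, r_m) # gains of -10, -2 and -1
```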
A survey of fault-tolerance algorithms for reconfigurable nano-crossbar arrays
ACM Comput. Surv. Volume 50, issue 6 (November 2017)
Nano-crossbar arrays have emerged as a promising and viable technology to improve the computing performance of electronic circuits beyond the limits of current CMOS. Arrays offer both structural efficiency through reconfiguration and the prospective capability of integration with different technologies. However, certain problems need to be addressed, the most important being the prevailing occurrence of faults. Considering fault rate projections as high as 20%, much higher than those of CMOS, sophisticated fault-tolerance methods are to be expected. The focus of this survey article is the assessment and evaluation of these methods and the related algorithms applied in logic mapping and configuration processes. As a start, we concisely explain reconfigurable nano-crossbar arrays together with their fault characteristics and models. Following that, we demonstrate configuration techniques for the arrays in the presence of permanent faults and elaborate on the two main fault-tolerance methodologies, namely defect-unaware and defect-aware approaches, with a short review of their advantages and disadvantages. For both methodologies, we present detailed experimental results of the related algorithms regarding their strengths and weaknesses, with a comprehensive yield, success rate and runtime analysis. Next, we overview fault-tolerance approaches for transient faults. As a conclusion, we review the proposed algorithms with future directions and upcoming challenges.
This work is supported by the EU-H2020-RISE project NANOxCOMP no 691178 and the TUBITAK-Career project no 113E760.
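The defect-aware methodology the survey discusses amounts to a mapping search. The following toy sketch (names and data are hypothetical, and real mappers use far more scalable matching heuristics than brute force) assigns logic functions to crossbar rows so that every crosspoint a function needs is usable on its assigned row.

```python
from itertools import permutations

def defect_aware_map(required, usable):
    """Toy defect-aware mapping: find a permutation of logic functions onto
    crossbar rows such that every column index a function requires is usable
    on its assigned row. required[f] and usable[r] are sets of columns."""
    n = len(required)
    for perm in permutations(range(n)):
        if all(required[f] <= usable[perm[f]] for f in range(n)):
            return perm            # perm[f] = row assigned to function f
    return None                    # no defect-avoiding assignment exists

# 3 functions, 3 rows; row 1 has a defective crosspoint in column 2 (hypothetical).
required = [{0, 1}, {2}, {0, 2}]
usable   = [{0, 1, 2}, {0, 1}, {0, 1, 2}]
mapping = defect_aware_map(required, usable)
```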
High-Density Solid-State Memory Devices and Technologies
This Special Issue aims to examine high-density solid-state memory devices and technologies from various standpoints, in an attempt to foster their continued success in the future. Considering that a broadening range of applications will likely offer different types of solid-state memories their chance in the spotlight, the Special Issue is not focused on a specific storage solution but rather embraces all the most relevant solid-state memory devices and technologies currently on stage. The subjects dealt with in this Special Issue are accordingly widespread, ranging from process and design issues/innovations to the experimental and theoretical analysis of device operation, and from the performance and reliability of memory devices and arrays to the exploitation of solid-state memories in pursuing new computing paradigms.
Solution of partial differential equations on vector and parallel computers
The present status of numerical methods for partial differential equations on vector and parallel computers was reviewed. The relevant aspects of these computers are discussed and a brief review of their development is included, with particular attention paid to those characteristics that influence algorithm selection. Both direct and iterative methods are given for elliptic equations as well as explicit and implicit methods for initial boundary value problems. The intent is to point out attractive methods as well as areas where this class of computer architecture cannot be fully utilized because of either hardware restrictions or the lack of adequate algorithms. Application areas utilizing these computers are briefly discussed
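A concrete instance of the iterative methods for elliptic equations surveyed above is Jacobi iteration for the 2D Laplace equation, shown here as a minimal sketch: every interior point updates independently from its neighbours, which is exactly the structure that maps well onto vector hardware.

```python
import numpy as np

# Jacobi iteration for the 2D Laplace equation on a square grid.
# Each sweep replaces every interior value with the average of its four
# neighbours; all updates are independent, so the sweep vectorizes.
n = 32
u = np.zeros((n, n))
u[0, :] = 1.0                     # boundary condition: top edge held at 1
for _ in range(2000):
    u[1:-1, 1:-1] = 0.25 * (u[:-2, 1:-1] + u[2:, 1:-1]
                            + u[1:-1, :-2] + u[1:-1, 2:])
```

Explicit sweeps like this favour vector machines, whereas the implicit and direct methods the review also covers trade this regularity for faster convergence.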