299 research outputs found
Block-Wise Mixed-Precision Quantization: Enabling High Efficiency for Practical ReRAM-based DNN Accelerators
Resistive random access memory (ReRAM)-based processing-in-memory (PIM)
architectures have demonstrated great potential to accelerate Deep Neural
Network (DNN) training/inference. However, the computational accuracy of analog
PIM is compromised due to the non-idealities, such as the conductance variation
of ReRAM cells. The impact of these non-idealities worsens as the number of
concurrently activated wordlines and bitlines increases. To guarantee
computational accuracy, only a limited number of wordlines and bitlines of the
crossbar array can be turned on concurrently, significantly reducing the
achievable parallelism of the architecture.
While the constraints on parallelism limit the efficiency of the
accelerators, they also provide a new opportunity for fine-grained
mixed-precision quantization. To enable efficient DNN inference on practical
ReRAM-based accelerators, we propose an algorithm-architecture co-design
framework called Block-Wise mixed-precision
Quantization (BWQ). At the algorithm level, BWQ-A introduces a
mixed-precision quantization scheme at the block level, which achieves a high
weight and activation compression ratio with negligible accuracy degradation.
We also present the hardware architecture design BWQ-H, which leverages the
low-bit-width models achieved by BWQ-A to perform high-efficiency DNN inference
on ReRAM devices. BWQ-H also adopts a novel precision-aware weight mapping
method to increase the ReRAM crossbar's throughput. Our evaluation demonstrates
the effectiveness of BWQ, which achieves a 6.08x speedup and a 17.47x energy
saving on average compared to existing ReRAM-based architectures.
Comment: 12 pages, 13 figures
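The block-level idea described in this abstract can be illustrated with a minimal sketch. It is not BWQ-A itself: the abstract does not say how bit-widths are chosen, so here they are assigned by hand, and the names `quantize_block`, `blockwise_quantize`, and `bits_per_block` are hypothetical. The sketch assumes a block corresponds to the group of crossbar rows and columns that are activated together, and it uses plain symmetric uniform quantization with one scale factor per block.

```python
from typing import List

Matrix = List[List[float]]


def quantize_block(block: Matrix, bits: int) -> Matrix:
    """Symmetric uniform quantization of one block; bits == 0 prunes the block."""
    if bits == 0:
        return [[0.0] * len(row) for row in block]
    if bits < 2:
        raise ValueError("this sketch supports 0 bits (pruned) or >= 2 bits")
    max_abs = max(abs(w) for row in block for w in row)
    if max_abs == 0.0:
        return [row[:] for row in block]
    levels = 2 ** (bits - 1) - 1      # e.g. 127 for 8 bits, 1 for 2 bits
    scale = max_abs / levels          # one scale factor per block
    return [[round(w / scale) * scale for w in row] for row in block]


def blockwise_quantize(weights: Matrix, block_rows: int, block_cols: int,
                       bits_per_block: List[List[int]]) -> Matrix:
    """Quantize each (block_rows x block_cols) tile with its own bit-width."""
    out = [row[:] for row in weights]
    for r0 in range(0, len(weights), block_rows):
        for c0 in range(0, len(weights[0]), block_cols):
            bits = bits_per_block[r0 // block_rows][c0 // block_cols]
            block = [row[c0:c0 + block_cols]
                     for row in weights[r0:r0 + block_rows]]
            for dr, q_row in enumerate(quantize_block(block, bits)):
                out[r0 + dr][c0:c0 + block_cols] = q_row
    return out


# Hand-picked example: a 4x4 weight matrix split into four 2x2 blocks.
weights = [
    [0.50, -0.25, 0.10, 0.02],
    [0.75, 1.00, -0.03, 0.08],
    [0.01, -0.02, 0.40, -0.90],
    [0.00, 0.03, 0.60, 0.20],
]
bits_per_block = [[8, 2], [0, 4]]     # assumed assignment, not learned
quantized = blockwise_quantize(weights, 2, 2, bits_per_block)
```

In this example the lower-left block holds only near-zero weights and is given 0 bits, so it is dropped entirely, while the upper-left block keeps 8 bits. A real scheme would have to derive these bit-widths from the model rather than fix them by hand.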
ReDy: A Novel ReRAM-centric Dynamic Quantization Approach for Energy-efficient CNN Inference
The primary operation in DNNs is the dot product of quantized input
activations and weights. Prior works have proposed the design of memory-centric
architectures based on the Processing-In-Memory (PIM) paradigm. Resistive RAM
(ReRAM) technology is especially appealing for PIM-based DNN accelerators due
to its high density to store weights, low leakage energy, low read latency, and
high performance capabilities to perform the DNN dot-products massively in
parallel within the ReRAM crossbars. However, the main bottleneck of these
architectures is the energy-hungry analog-to-digital conversions (ADCs)
required to perform analog computations in-ReRAM, which penalizes the
efficiency and performance benefits of PIM. To improve the energy efficiency of
in-ReRAM analog dot-product computations, we present ReDy, a hardware
accelerator that implements a ReRAM-centric Dynamic quantization scheme to take
advantage of the bit serial streaming and processing of activations. The energy
consumption of ReRAM-based DNN accelerators is directly proportional to the
numerical precision of the input activations of each DNN layer. In particular,
ReDy exploits the fact that activations of CONV layers in Convolutional Neural
Networks (CNNs), a subset of DNNs, are commonly grouped according to the size
of their filters and the size of the ReRAM crossbars. ReDy then quantizes each
group of activations on the fly with a different numerical precision, based
on a novel heuristic that takes into account the statistical distribution of
each group. Overall, ReDy greatly reduces the activity of the ReRAM crossbars
and the number of A/D conversions compared to a static 8-bit uniform
quantization. We evaluate ReDy on a popular set of modern CNNs. On average,
ReDy provides 13% energy savings over an ISAAC-like accelerator with
negligible accuracy loss and area overhead.
Comment: 13 pages, 16 figures, 4 Tables
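The per-group precision idea in this abstract can be sketched as follows. The abstract does not spell out ReDy's actual heuristic, so this example substitutes a deliberately simple placeholder: each group of unsigned 8-bit activations gets the fewest bits that can represent its largest value. The function names and the sample groups are hypothetical. Because activations are streamed bit-serially, the number of streamed bits per group serves here as a rough proxy for crossbar activity and A/D conversions.

```python
from typing import List, Tuple


def group_precision(group: List[int], full_bits: int = 8) -> int:
    """Placeholder heuristic: fewest bits that hold the group's largest value."""
    peak = max(group)
    return min(full_bits, max(1, peak.bit_length()))


def bit_serial_cycles(groups: List[List[int]],
                      full_bits: int = 8) -> Tuple[int, int]:
    """Return (static, dynamic) bit-serial cycle counts over all groups."""
    static = full_bits * len(groups)                  # uniform 8-bit baseline
    dynamic = sum(group_precision(g, full_bits) for g in groups)
    return static, dynamic


# Four made-up activation groups (non-negative values, as after a ReLU).
groups = [
    [3, 0, 1, 2],
    [17, 40, 5, 0],
    [200, 13, 255, 9],
    [0, 0, 1, 0],
]
precisions = [group_precision(g) for g in groups]
static_cycles, dynamic_cycles = bit_serial_cycles(groups)
```

This placeholder only trims leading zero bits, so it loses no information. A distribution-aware heuristic of the kind the abstract mentions could go further and drop bits that carry little signal, trading a small accuracy loss for fewer cycles.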
Accelerate & Actualize: Can 2D Materials Bridge the Gap Between Neuromorphic Hardware and the Human Brain?
Two-dimensional (2D) materials present an exciting opportunity for devices
and systems beyond the von Neumann computing architecture paradigm due to their
diversity of electronic structure, physical properties, and atomically-thin,
van der Waals structures that enable ease of integration with conventional
electronic materials and silicon-based hardware. All major classes of
non-volatile memory (NVM) devices have been demonstrated using 2D materials,
including their operation as synaptic devices for applications in neuromorphic
computing hardware. Their atomically-thin structure, superior physical
properties, i.e., mechanical strength, electrical and thermal conductivity, as
well as gate-tunable electronic properties provide performance advantages and
novel functionality in NVM devices and systems. However, device performance and
variability as compared to incumbent materials and technology remain major
concerns for real applications. Ultimately, the progress of 2D materials as a
novel class of electronic materials and specifically their application in the
area of neuromorphic electronics will depend on their scalable synthesis in
thin-film form with desired crystal quality, defect density, and phase purity.
Comment: Neuromorphic Computing, 2D Materials, Heterostructures, Emerging
Memory Devices, Resistive, Phase-Change, Ferroelectric, Ferromagnetic,
Crossbar Array, Machine Learning, Deep Learning, Spiking Neural Network
Fabrication and Pseudo-Analog Characteristics of Ta2O5-Based ReRAM Cell
The memristor is one of the fundamental circuit elements, alongside the resistor, capacitor, and inductor. It is a passive component whose theory was postulated by Leon Chua in 1971. It took over 30 years before any known physical examples were discovered. In 2008 Hewlett Packard published an article describing a device that they claimed to be the first working memristor.
The memristor, whose name is a contraction of "memory resistor", is a resistive component whose resistance can be changed. It also retains its resistance value without continuous current or voltage. Typically, a memristor has at least two resistance states, each of which can be selected by applying a voltage or current to the component. This is why memristors are also called resistive switches.
Resistive switches can be used in memory technologies. A memory array built from resistive switches is called ReRAM (resistive random access memory). ReRAM, like Flash memory, is a non-volatile memory that can be programmed or erased electrically. Flash memories are currently used in memory sticks, for example. Compared to Flash, however, ReRAM offers faster operation and lower power consumption, so it could potentially replace current memory standards in the future.
A ReRAM memory cell can also store multiple bits instead of operating in binary ("0" or "1"). Typically, a ReRAM cell has two limiting resistance values, and multiple intermediate resistance states may be programmable between them. Such a memory could be called analog if the number of intermediate states is not limited to discrete levels. Analog memories make it possible, for instance, to build artificial neural networks (ANNs) efficiently. ANNs attempt to model the behaviour of the brain and to perform tasks that are difficult for traditional computer programs, such as speech recognition or artificial intelligence.
This thesis studies the analog behaviour of a Ta2O5-based ReRAM cell with its suitability for neural networks in mind. The manufacturing process and measurement results are presented. The operation of a ReRAM cell is rarely fully analog, because there is a limited number of intermediate resistance states. This is why the operation is called pseudo-analog. Measurement results show that a single ReRAM cell is well suited to binary operation. In some cases a single cell can store multiple resistance values, but there is significant variance in the resistance states between subsequent programming cycles, which complicates interpretation. The fabricated ReRAM cell cannot operate as a pseudo-analog memory on its own, as it needs an external current-limiting component. Improving the manufacturing process should also reduce the variability, so that the operation would more closely resemble a pseudo-analog memory.
Transferred from Doria
MemTorch: An Open-source Simulation Framework for Memristive Deep Learning Systems
Memristive devices have shown great promise to facilitate the acceleration
and improve the power efficiency of Deep Learning (DL) systems. Crossbar
architectures constructed using memristive devices can be used to efficiently
implement various in-memory computing operations, such as Multiply-Accumulate
(MAC) and unrolled-convolutions, which are used extensively in Deep Neural
Networks (DNNs) and Convolutional Neural Networks (CNNs). Currently, there is a
lack of a modernized, open source and general high-level simulation platform
that can fully integrate any behavioral or experimental memristive device model
and its putative non-idealities into crossbar architectures within DL systems.
This paper presents such a framework, entitled MemTorch, which adopts a
modernized software engineering methodology and integrates directly with the
well-known PyTorch Machine Learning (ML) library. We fully detail the public
release of MemTorch and its release management, and use it to perform novel
simulations of memristive DL systems, which are trained and benchmarked using
the CIFAR-10 dataset. Moreover, we present a case study, in which MemTorch is
used to simulate a near-sensor in-memory computing system for seizure detection
using Pt/Hf/Ti Resistive Random Access Memory (ReRAM) devices. Our open source
MemTorch framework can be used and expanded upon by circuit and system
designers to conveniently perform customized large-scale memristive DL
simulations taking into account various unavoidable device non-idealities, as a
preliminary step before circuit-level realization.
Comment: Submitted to IEEE Transactions on Neural Networks and Learning
Systems. Update: Fixed accent \'e character
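To make the crossbar MAC operation and its device non-idealities concrete, here is a minimal standard-library sketch. It does not use MemTorch or reproduce its API: every name (`weights_to_conductances`, `crossbar_mac`, `g_min`, `g_max`, `sigma`) is hypothetical, and the non-ideality model is an assumption, namely multiplicative Gaussian variation on each cell's conductance. Signed weights are mapped to a differential pair of conductances, inputs act as wordline voltages, and each bitline sums the resulting currents.

```python
import random
from typing import List, Tuple

Matrix = List[List[float]]


def weights_to_conductances(weights: Matrix, g_min: float,
                            g_max: float) -> Tuple[Matrix, Matrix, float]:
    """Map signed weights to a differential pair of conductances."""
    w_max = max(abs(w) for row in weights for w in row) or 1.0
    span = g_max - g_min
    g_pos = [[g_min + span * max(w, 0.0) / w_max for w in row]
             for row in weights]
    g_neg = [[g_min + span * max(-w, 0.0) / w_max for w in row]
             for row in weights]
    return g_pos, g_neg, w_max


def crossbar_mac(weights: Matrix, inputs: List[float], g_min: float = 1e-6,
                 g_max: float = 1e-4, sigma: float = 0.0,
                 seed: int = 0) -> List[float]:
    """Matrix-vector product on a simulated crossbar (rows are wordlines)."""
    rng = random.Random(seed)
    g_pos, g_neg, w_max = weights_to_conductances(weights, g_min, g_max)

    def perturb(g: float) -> float:
        # Assumed model: multiplicative Gaussian conductance variation.
        return g if sigma == 0.0 else g * (1.0 + rng.gauss(0.0, sigma))

    span = g_max - g_min
    rows = range(len(weights))
    outputs = []
    for j in range(len(weights[0])):          # one bitline per output
        i_pos = sum(inputs[i] * perturb(g_pos[i][j]) for i in rows)
        i_neg = sum(inputs[i] * perturb(g_neg[i][j]) for i in rows)
        # Rescale the differential current back to weight units.
        outputs.append((i_pos - i_neg) * w_max / span)
    return outputs


weights = [[0.5, -1.0], [0.25, 0.75], [-0.5, 0.1]]
inputs = [1.0, 2.0, -1.0]
exact = [sum(inputs[i] * weights[i][j] for i in range(3)) for j in range(2)]
ideal = crossbar_mac(weights, inputs)               # no variation
noisy = crossbar_mac(weights, inputs, sigma=0.1)    # 10% variation
```

With `sigma=0.0` the simulated crossbar reproduces the exact product up to floating-point error, and raising `sigma` shows how conductance variation perturbs the result. A full simulator would model further effects (for example finite conductance states or line resistance) that this sketch leaves out.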