Search CORE

299 research outputs found

Block-Wise Mixed-Precision Quantization: Enabling High Efficiency for Practical ReRAM-based DNN Accelerators

Author: Chakrabarty Krishnendu
Cheng Feng
Doppa Janardhan Rao
Hanson Edward
Li Hai
Li Shiyu
Pande Partha Pratim
Wang Nansu
Wu Xueying
Yang Huanrui
Yang Xiaoxuan
Zheng Qilin
Publication venue
Publication date: 27/10/2023
Field of study

Resistive random access memory (ReRAM)-based processing-in-memory (PIM) architectures have demonstrated great potential to accelerate Deep Neural Network (DNN) training/inference. However, the computational accuracy of analog PIM is compromised due to the non-idealities, such as the conductance variation of ReRAM cells. The impact of these non-idealities worsens as the number of concurrently activated wordlines and bitlines increases. To guarantee computational accuracy, only a limited number of wordlines and bitlines of the crossbar array can be turned on concurrently, significantly reducing the achievable parallelism of the architecture. While the constraints on parallelism limit the efficiency of the accelerators, they also provide a new opportunity for fine-grained mixed-precision quantization. To enable efficient DNN inference on practical ReRAM-based accelerators, we propose an algorithm-architecture co-design framework called \underline{B}lock-\underline{W}ise mixed-precision \underline{Q}uantization (BWQ). At the algorithm level, BWQ-A introduces a mixed-precision quantization scheme at the block level, which achieves a high weight and activation compression ratio with negligible accuracy degradation. We also present the hardware architecture design BWQ-H, which leverages the low-bit-width models achieved by BWQ-A to perform high-efficiency DNN inference on ReRAM devices. BWQ-H also adopts a novel precision-aware weight mapping method to increase the ReRAM crossbar's throughput. Our evaluation demonstrates the effectiveness of BWQ, which achieves a 6.08x speedup and a 17.47x energy saving on average compared to existing ReRAM-based architectures.Comment: 12 pages, 13 figure

arXiv.org e-Print Archive

ReDy: A Novel ReRAM-centric Dynamic Quantization Approach for Energy-efficient CNN Inference

Author: González Antonio
Riera Marc
Sabri Mohammad
Publication venue
Publication date: 28/06/2023
Field of study

The primary operation in DNNs is the dot product of quantized input activations and weights. Prior works have proposed the design of memory-centric architectures based on the Processing-In-Memory (PIM) paradigm. Resistive RAM (ReRAM) technology is especially appealing for PIM-based DNN accelerators due to its high density to store weights, low leakage energy, low read latency, and high performance capabilities to perform the DNN dot-products massively in parallel within the ReRAM crossbars. However, the main bottleneck of these architectures is the energy-hungry analog-to-digital conversions (ADCs) required to perform analog computations in-ReRAM, which penalizes the efficiency and performance benefits of PIM. To improve energy-efficiency of in-ReRAM analog dot-product computations we present ReDy, a hardware accelerator that implements a ReRAM-centric Dynamic quantization scheme to take advantage of the bit serial streaming and processing of activations. The energy consumption of ReRAM-based DNN accelerators is directly proportional to the numerical precision of the input activations of each DNN layer. In particular, ReDy exploits that activations of CONV layers from Convolutional Neural Networks (CNNs), a subset of DNNs, are commonly grouped according to the size of their filters and the size of the ReRAM crossbars. Then, ReDy quantizes on-the-fly each group of activations with a different numerical precision based on a novel heuristic that takes into account the statistical distribution of each group. Overall, ReDy greatly reduces the activity of the ReRAM crossbars and the number of A/D conversions compared to an static 8-bit uniform quantization. We evaluate ReDy on a popular set of modern CNNs. On average, ReDy provides 13\% energy savings over an ISAAC-like accelerator with negligible accuracy loss and area overhead.Comment: 13 pages, 16 figures, 4 Table

arXiv.org e-Print Archive

Accelerate & Actualize: Can 2D Materials Bridge the Gap Between Neuromorphic Hardware and the Human Brain?

Author: Jariwala Deep
Katti Keshava
Liu Xiwen
Publication venue
Publication date: 24/01/2023
Field of study

Two-dimensional (2D) materials present an exciting opportunity for devices and systems beyond the von Neumann computing architecture paradigm due to their diversity of electronic structure, physical properties, and atomically-thin, van der Waals structures that enable ease of integration with conventional electronic materials and silicon-based hardware. All major classes of non-volatile memory (NVM) devices have been demonstrated using 2D materials, including their operation as synaptic devices for applications in neuromorphic computing hardware. Their atomically-thin structure, superior physical properties, i.e., mechanical strength, electrical and thermal conductivity, as well as gate-tunable electronic properties provide performance advantages and novel functionality in NVM devices and systems. However, device performance and variability as compared to incumbent materials and technology remain major concerns for real applications. Ultimately, the progress of 2D materials as a novel class of electronic materials and specifically their application in the area of neuromorphic electronics will depend on their scalable synthesis in thin-film form with desired crystal quality, defect density, and phase purity.Comment: Neuromorphic Computing, 2D Materials, Heterostructures, Emerging Memory Devices, Resistive, Phase-Change, Ferroelectric, Ferromagnetic, Crossbar Array, Machine Learning, Deep Learning, Spiking Neural Network

arXiv.org e-Print Archive

Fabrication and Pseudo-Analog Characteristics of Ta2O5 -Based ReRAM Cell

Author: Grönroos Mika
Publication venue: fi=Turun yliopisto|en=University of Turku|
Publication date: 15/08/2016
Field of study

Memristori on yksi elektroniikan peruskomponenteista vastuksen, kondensaattorin ja kelan lisäksi. Se on passiivinen komponentti, jonka teorian kehitti Leon Chua vuonna 1971. Kesti kuitenkin yli kolmekymmentä vuotta ennen kuin teoria pystyttiin yhdistämään kokeellisiin tuloksiin. Vuonna 2008 Hewlett Packard julkaisi artikkelin, jossa he väittivät valmistaneensa ensimmäisen toimivan memristorin. Memristori eli muistivastus on resistiivinen komponentti, jonka vastusarvoa pystytään muuttamaan. Nimens mukaisesti memristori kykenee myös säilyttämään vastusarvonsa ilman jatkuvaa virtaa ja jännitettä. Tyypillisesti memristorilla on vähintään kaksi vastusarvoa, joista kumpikin pystytään valitsemaan syöttämällä komponentille jännitettä tai virtaa. Tämän vuoksi memristoreita kutsutaankin usein resistiivisiksi kytkimiksi. Resistiivisiä kytkimiä tutkitaan nykyään paljon erityisesti niiden mahdollistaman muistiteknologian takia. Resistiivisistä kytkimistä rakennettua muistia kutsutaan ReRAM-muistiksi (lyhenne sanoista resistive random access memory). ReRAM-muisti on Flash-muistin tapaan haihtumaton muisti, jota voidaan sähköisesti ohjelmoida tai tyhjentää. Flash-muistia käytetään tällä hetkellä esimerkiksi muistitikuissa. ReRAM-muisti mahdollistaa kuitenkin nopeamman ja vähävirtaiseman toiminnan Flashiin verrattuna, joten se on tulevaisuudessa varteenotettava kilpailija markkinoilla. ReRAM-muisti mahdollistaa myös useammin bitin tallentamisen yhteen muistisoluun binäärisen (”0” tai ”1”) toiminnan sijaan. Tyypillisesti ReRAM-muistisolulla on kaksi rajoittavaa vastusarvoa, mutta näiden kahden tilan välille pystytään mahdollisesti ohjelmoimaan useampia tiloja. Muistisoluja voidaan kutsua analogisiksi, jos tilojen määrää ei ole rajoitettu. Analogisilla muistisoluilla olisi mahdollista rakentaa tehokkaasti esimerkiksi neuroverkkoja. Neuroverkoilla pyritään mallintamaan aivojen toimintaa ja suorittamaan tehtäviä, jotka ovat tyypillisesti vaikeita perinteisille tietokoneohjelmille. Neuroverkkoja käytetään esimerkiksi puheentunnistuksessa tai tekoälytoteutuksissa. Tässä diplomityössä tarkastellaan Ta2O5 -perustuvan ReRAM-muistisolun analogista toimintaa pitäen mielessä soveltuvuus neuroverkkoihin. ReRAM-muistisolun valmistus ja mittaustulokset käydään läpi. Muistisolun toiminta on harvoin täysin analogista, koska kahden rajoittavan vastusarvon välillä on usein rajattu määrä tiloja. Tämän vuoksi toimintaa kutsutaan pseudoanalogiseksi. Mittaustulokset osoittavat, että yksittäinen ReRAM-muistisolu kykenee binääriseen toimintaan hyvin. Joiltain osin yksittäinen solu kykenee tallentamaan useampia tiloja, mutta vastusarvoissa on peräkkäisten ohjelmointisyklien välillä suurta vaihtelevuutta, joka hankaloittaa tulkintaa. Valmistettu ReRAM-muistisolu ei sellaisenaan kykene toimimaan pseudoanalogisena muistina, vaan se vaati rinnalleen virtaa rajoittavan komponentin. Myös valmistusprosessin kehittäminen vähentäisi yksittäisen solun toiminnassa esiintyvää varianssia, jolloin sen toiminta muistuttaisi enemmän pseudoanalogista muistia.The memristor is one of the fundamental circuit elements in addition to a resistor, capacitor and an inductor. It is a passive component whose theory was postulated by Leon Chua in 1971. It took over 30 years before any known physical examples were discovered. In 2008 Hewlett Packard published an article where they manufactured a device which they claimed to be the ﬁrst memristor found. The memristor, which is a concatenation of memory resistor, is a resistive component that has an ability to change its resistance. It can also remember its resistance value without continuous current or voltage. Typically, a memristor has at least two resistance states that can be altered. This is the reason why memristors are also called resistive switches. Resistive switches can be used in memory technologies. A memory array that has been built using resistive switches is called ReRAM (resistive random access memory). ReRAM, like Flash memory, is a non-volatile memory that can be programmed or erased electrically. Flash memories are currently used e.g. in memory sticks. However, compared to Flash, ReRAM has faster operating speed and lower power consumption, for instance. It could potentially replace current memory standards in future. A ReRAM memory cell can also store multiple bits instead of binary operation (”0” or ”1”). Typically there exists multiple intermediate resistance states between ReRAM’s limiting resistances that could be utilized. Such memory could be called analog, if the amount of intermediate states is not limited to discrete levels. Analog memories make it possible to build artiﬁcial neural networks (ANN) eﬃciently, for instance. ANNs try to model the behaviour of brain and to perform tasks that are diﬃcult for traditional computer programs such as speech recognition or artiﬁcial intelligence. This thesis studies the analog behaviour of Ta 2 O 5 -based ReRAM cell. Manufacturing process and measurement results are presented. The operation of ReRAM cell is rarely fully analog as there exists limited amount of intermediate resistance states. This is the reason why operation is called pseudo-analog. Measurement results show that a single ReRAM cell is suitable for binary operation. In some cases, a single cell can store multiple resistance values but there exists signiﬁcant variance in resistance states between subsequent programming cycles. The proposed ReRAM cell cannot operate as pseudo-analog ReRAM cell in itself as it needs an external current limiting component. Improving the manufacturing process should reduce the variability such that the operation would be more like a pseudo-analog memory.Siirretty Doriast

UTUPub

MemTorch: An Open-source Simulation Framework for Memristive Deep Learning Systems

Author: Azghadi Mostafa Rahimi
Lammie Corey
Linares-Barranco Bernabé
Xiang Wei
Publication venue
Publication date: 23/04/2020
Field of study

Memristive devices have shown great promise to facilitate the acceleration and improve the power efficiency of Deep Learning (DL) systems. Crossbar architectures constructed using memristive devices can be used to efficiently implement various in-memory computing operations, such as Multiply-Accumulate (MAC) and unrolled-convolutions, which are used extensively in Deep Neural Networks (DNNs) and Convolutional Neural Networks (CNNs). Currently, there is a lack of a modernized, open source and general high-level simulation platform that can fully integrate any behavioral or experimental memristive device model and its putative non-idealities into crossbar architectures within DL systems. This paper presents such a framework, entitled MemTorch, which adopts a modernized software engineering methodology and integrates directly with the well-known PyTorch Machine Learning (ML) library. We fully detail the public release of MemTorch and its release management, and use it to perform novel simulations of memristive DL systems, which are trained and benchmarked using the CIFAR-10 dataset. Moreover, we present a case study, in which MemTorch is used to simulate a near-sensor in-memory computing system for seizure detection using Pt/Hf/Ti Resistive Random Access Memory (ReRAM) devices. Our open source MemTorch framework can be used and expanded upon by circuit and system designers to conveniently perform customized large-scale memristive DL simulations taking into account various unavoidable device non-idealities, as a preliminary step before circuit-level realization.Comment: Submitted to IEEE Transactions on Neural Networks and Learning Systems. Update: Fixed accent \'e characte

arXiv.org e-Print Archive

ResearchOnline at James Cook University

Digital.CSIC