5 research outputs found

A Phase Change Memory and DRAM Based Framework For Energy-Efficient and High-Speed In-Memory Stochastic Computing

Author: Mysore Supreeth
Publication venue: UKnowledge
Publication date: 01/01/2023

Convolutional Neural Networks (CNNs) have proven to be highly effective in various fields related to Artificial Intelligence (AI) and Machine Learning (ML). However, the significant computational and memory requirements of CNNs make their processing highly compute- and memory-intensive. In particular, the multiply-accumulate (MAC) operation, which is a fundamental building block of CNNs, requires an enormous number of arithmetic operations. As the input dataset size increases, the traditional processor-centric von Neumann computing architecture becomes ill-suited for CNN-based applications. This results in exponentially higher latency and energy costs, making the processing of CNNs highly challenging. To overcome these challenges, researchers have explored the Processing-In-Memory (PIM) technique, which involves placing the processing unit inside or near the memory unit. This approach reduces data migration length and utilizes the internal memory bandwidth at the memory chip level. However, developing a reliable PIM-based system with minimal hardware modifications and design complexity remains a significant challenge. The solution proposed in the report utilizes different memory technologies, such as Dynamic RAM (DRAM) and phase change memory (PCM), with stochastic arithmetic and minimal add-on logic. Stochastic computing is a technique that represents numbers as random bit-streams and performs arithmetic on them, instead of using the traditional binary representation. This technique reduces the hardware requirements of CNN arithmetic operations, making it possible to implement them with minimal add-on logic. The report details the workflow for performing the arithmetic operations used by CNNs, including MAC, activation, and floating-point functions. The proposed solution includes designs for a scalable Stochastic Number Generator (SNG), a DRAM CNN accelerator, a non-volatile memory (NVM) class PCRAM-based CNN accelerator, and DRAM-based stochastic-to-binary conversion (StoB) for in-situ deep learning.
These designs utilize stochastic computing to reduce the hardware requirements of CNN arithmetic operations and enable energy- and time-efficient processing of CNNs. The report also identifies future research directions for the proposed designs, including in-situ PCRAM-based SNG, ODIN (A Bit-Parallel Stochastic Arithmetic Based Accelerator for In-Situ Neural Network Processing in Phase Change RAM), ATRIA (Bit-Parallel Stochastic Arithmetic Based Accelerator for In-DRAM CNN Processing), and AGNI (In-Situ, Iso-Latency Stochastic-to-Binary Number Conversion for In-DRAM Deep Learning), and presents initial findings for these ideas. In summary, the solution proposed in the report offers a comprehensive approach to the challenges of processing CNNs, and the proposed designs have the potential to significantly improve the energy and time efficiency of CNNs. Using stochastic computing and different memory technologies enables the development of reliable PIM-based systems with minimal hardware modifications and design complexity, providing a promising path for the future of CNN-based applications.
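To make the stochastic-computing idea in the abstract above concrete, here is a minimal software sketch, assuming the common unipolar encoding (a value in [0, 1] is the fraction of 1s in a random bit-stream). The function names, stream length, and seed are illustrative assumptions and are not taken from the report; the sketch only shows why a multiplication reduces to a bit-wise AND.

```python
import random

def to_stream(p, n, rng):
    """Encode a value p in [0, 1] as n random bits (unipolar format, assumed)."""
    return [1 if rng.random() < p else 0 for _ in range(n)]

def from_stream(bits):
    """Decode a bit-stream back to a value: the fraction of 1s."""
    return sum(bits) / len(bits)

def sc_multiply(a_bits, b_bits):
    """Multiply two independent unipolar streams with a bit-wise AND."""
    return [x & y for x, y in zip(a_bits, b_bits)]

rng = random.Random(0)   # fixed seed so the run is repeatable
n = 4096                 # longer streams give a more accurate estimate
a = to_stream(0.5, n, rng)
b = to_stream(0.25, n, rng)
product = from_stream(sc_multiply(a, b))
print(product)           # an estimate of 0.5 * 0.25 = 0.125
```

The result is only an estimate: its accuracy depends on the stream length and on the two streams being statistically independent.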

University of Kentucky

AGNI: In-Situ, Iso-Latency Stochastic-to-Binary Number Conversion for In-DRAM Deep Learning

Author: Salehi Sayed Ahmad
Shivanandamurthy Supreeth Mysore
Thakkar Ishan
Vatsavai Sairam Sri
Publication venue
Publication date: 11/02/2023

Recent years have seen a rapid increase in research activity in the field of DRAM-based Processing-In-Memory (PIM) accelerators, where the analog computing capability of DRAM is employed by minimally changing the inherent structure of DRAM peripherals to accelerate various data-centric applications. Several DRAM-based PIM accelerators for Convolutional Neural Networks (CNNs) have also been reported. Among these, the accelerators leveraging in-DRAM stochastic arithmetic have shown manifold improvements in processing latency and throughput, due to the ability of stochastic arithmetic to convert multiplications into simple bit-wise logical AND operations. However, the use of in-DRAM stochastic arithmetic for CNN acceleration requires frequent stochastic-to-binary number conversions. For that, prior works employ full adder-based or serial counter-based in-DRAM circuits. These circuits consume a large area and incur long latency. Their in-DRAM implementations also require heavy modifications in DRAM peripherals, which significantly diminishes the benefits of using stochastic arithmetic in these accelerators. To address these shortcomings, this paper presents a new substrate for in-DRAM stochastic-to-binary number conversion called AGNI. AGNI makes minor modifications in DRAM peripherals using pass transistors, capacitors, encoders, and charge pumps, and re-purposes the sense amplifiers as voltage comparators, to enable in-situ binary conversion of input statistic operands of different sizes with iso latency. Comment: (Preprint) To Appear at ISQED 202
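As a rough functional illustration of the stochastic-to-binary conversion discussed in the abstract above, the sketch below models a serial counter in software: it counts the 1s in a bit-stream one bit at a time and emits the count as a binary number. This is an assumption-laden model of the baseline behaviour only; it does not represent AGNI's analog, sense-amplifier-based circuit, and the function names are hypothetical.

```python
def stob_serial_counter(bits):
    """Count the 1s in a stochastic bit-stream, one bit per step."""
    count = 0
    for b in bits:       # one step per bit, so latency grows with stream length
        count += b
    return count

def to_binary_string(count, n):
    """Format a count in the range 0..n as a fixed-width binary string."""
    width = n.bit_length()          # enough bits to hold the value n itself
    return format(count, f"0{width}b")

stream = [1, 0, 1, 1, 0, 0, 1, 0]   # example 8-bit stochastic operand
ones = stob_serial_counter(stream)
print(ones, to_binary_string(ones, len(stream)))   # -> 4 0100
```

The loop makes the cost of the serial approach visible: conversion time scales with operand size, which is the latency the abstract says prior counter-based circuits incur.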

arXiv.org e-Print Archive

BILP-Q: Quantum Coalition Structure Generation

Author: Klusch Matthias
Macaluso Antonio
Venkatesh Supreeth Mysore
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 28/04/2022

Quantum AI is an emerging field that uses quantum computing to solve typical complex problems in AI. In this work, we propose BILP-Q, the first-ever general quantum approach for solving the Coalition Structure Generation problem (CSGP), which is notably NP-hard. In particular, we reformulate the CSGP in terms of a Quadratic Binary Combinatorial Optimization (QUBO) problem to leverage existing quantum algorithms (e.g., QAOA) to obtain the best coalition structure. Thus, we perform a comparative analysis in terms of time complexity between the proposed quantum approach and the most popular classical baselines. Furthermore, we consider standard benchmark distributions for coalition values to test the BILP-Q on small-scale experiments using the IBM Qiskit environment. Finally, since QUBO problems can be solved operating with quantum annealing, we run BILP-Q on medium-size problems using a real quantum annealer (D-Wave). Comment: 8 pages, 2 figures, 1 tabl
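The reformulation described in the abstract above can be sketched classically under some assumptions. The example below uses one binary variable per coalition, rewards the values of selected coalitions, and adds a quadratic penalty whenever an agent is not covered exactly once. The coalition values and the penalty weight are made up for illustration, and a brute-force search stands in for a quantum solver; this is not claimed to be the paper's exact construction.

```python
from itertools import combinations, product

agents = (0, 1, 2)
# One binary decision variable per non-empty coalition of agents.
coalitions = [frozenset(c) for r in range(1, len(agents) + 1)
              for c in combinations(agents, r)]

# Hypothetical coalition values, chosen only for this example.
values = {
    frozenset({0}): 1.0, frozenset({1}): 1.0, frozenset({2}): 1.0,
    frozenset({0, 1}): 3.0, frozenset({0, 2}): 1.5, frozenset({1, 2}): 1.5,
    frozenset({0, 1, 2}): 3.5,
}

# Assumed penalty weight: larger than any achievable total value, so every
# infeasible assignment has higher energy than every feasible one.
penalty = 1.0 + sum(values.values())

def energy(x):
    """Quadratic objective over binary x: negative value plus constraint penalty."""
    gain = sum(values[c] * xi for c, xi in zip(coalitions, x))
    violation = sum(
        (sum(xi for c, xi in zip(coalitions, x) if a in c) - 1) ** 2
        for a in agents
    )
    return -gain + penalty * violation

# Brute force over all 2^7 assignments (a stand-in for QAOA or annealing).
best = min(product((0, 1), repeat=len(coalitions)), key=energy)
structure = sorted(sorted(c) for c, xi in zip(coalitions, best) if xi)
print(structure, -energy(best))   # -> [[0, 1], [2]] 4.0
```

Exhaustive search is only feasible here because the instance has seven variables; the number of coalitions grows exponentially with the number of agents.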

arXiv.org e-Print Archive

Self‐gated resonant‐clocked flip‐flop optimised for power efficiency and signal integrity

Author: Jennifer Judy Dominic Jawahar
Kanchana Bhaaskaran Vettuvanam Somasundaram
Kong B.S.
Rabaey J.M.
Supreeth Mysore Shivananda Murthy
Voss B.
Publication venue: 'Institution of Engineering and Technology (IET)'
Publication date

Crossref