
2 research outputs found

Probabilistic Compute-in-Memory Design For Efficient Markov Chain Monte Carlo Sampling

Authors: Fan Anjunyi, Fu Yihan, Huang Ru, Shi Daijing, Yan Bonan, Yang Yuchao, Yue Wenshuo
Publication date: 16/07/2023

Markov chain Monte Carlo (MCMC) is a widely used sampling method in modern artificial intelligence and probabilistic computing systems. It involves repetitive random number generation and thus often dominates the latency of probabilistic model computing. Hence, we propose a compute-in-memory (CIM) based MCMC design as a hardware acceleration solution. This work investigates SRAM bitcell stochasticity and proposes a novel "pseudo-read" operation, on which we base a block-wise circuit scheme for fast random number generation. Moreover, this work proposes a novel multi-stage exclusive-OR gate (MSXOR) design method to generate strictly uniformly distributed random numbers; the probability error deviating from a uniform distribution is suppressed below $10^{-5}$. This work also presents a novel in-memory copy circuit scheme to realize data copy inside a CIM sub-array, significantly reducing the use of R/W circuits for power saving. Evaluated in a commercial 28-nm process development kit, this CIM-based MCMC design generates 4-bit to 32-bit samples with an energy efficiency of 0.53 pJ/sample and a high throughput of up to 166.7 M samples/s. Compared to conventional processors, the overall energy efficiency improves by $5.41\times10^{11}$ to $2.33\times10^{12}$ times.
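The abstract does not describe how the MSXOR circuit is built, so the following is only a software illustration of the general idea that XOR-ing several independent, slightly biased bits pushes the result toward a uniform distribution. The function names, the 0.6 bias, and the stage counts are assumptions made for this sketch; it is not the authors' circuit.

```python
import random


def biased_bit(p_one, rng):
    """Return 1 with probability p_one, else 0 (stand-in for a noisy bitcell)."""
    return 1 if rng.random() < p_one else 0


def xor_fold(bits):
    """XOR all bits together, as a chain of XOR gates would."""
    out = 0
    for b in bits:
        out ^= b
    return out


def xor_bias(p_one, stages):
    """Exact |P(1) - 0.5| after XOR-ing `stages` independent bits.

    Follows the piling-up lemma: bias = 2^(n-1) * eps^n, eps = |p_one - 0.5|.
    """
    eps = abs(p_one - 0.5)
    return 2 ** (stages - 1) * eps ** stages


def empirical_bias(p_one, stages, n_samples, seed=0):
    """Monte Carlo estimate of the same bias (hypothetical helper)."""
    rng = random.Random(seed)
    ones = sum(
        xor_fold(biased_bit(p_one, rng) for _ in range(stages))
        for _ in range(n_samples)
    )
    return abs(ones / n_samples - 0.5)


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(n, xor_bias(0.6, n))
```

Under these assumptions, a source bit that is 1 with probability 0.6 needs eight XOR stages before the analytic bias drops below the $10^{-5}$ level quoted in the abstract. Real bitcell noise may be correlated, in which case this independence-based estimate would be optimistic.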

arXiv.org e-Print Archive

Hadamard product-based in-memory computing design for floating point neural network training

Authors: Anjunyi Fan, Bonan Yan, Haiyue Han, Huiyu Liu, Ru Huang, Yaojun Zhang, Yaoyu Tao, Yihan Fu, Yuchao Yang, Zhonghua Jin
Publication venue: IOP Publishing
Publication date: 01/01/2023

Deep neural networks (DNNs) are one of the key fields of machine learning. They require considerable computational resources for cognitive tasks. As a novel technology that performs computing inside or near memory units, in-memory computing (IMC) significantly improves computing efficiency by reducing the need for repetitive data transfer between the processing and memory units. However, prior IMC designs mainly focus on accelerating DNN inference; DNN training with IMC hardware has rarely been proposed. The challenges lie in the requirements of DNN training for high precision (e.g. floating point (FP)) and for various tensor operations (e.g. inner and outer products). These challenges call for an IMC design with new features. This paper proposes a novel Hadamard product-based IMC design for FP DNN training. Our design consists of multiple compartments, which are the basic units for matrix element-wise processing. We also develop BFloat16 post-processing circuits and fused adder trees, laying the foundation for IMC FP processing. Based on the proposed circuit scheme, we reformulate the back-propagation training algorithm for convenient and efficient IMC execution. The proposed design is implemented with commercial 28 nm technology process design kits and benchmarked with widely used neural networks. We model the influence of the circuit structural design parameters and provide an analysis framework for design space exploration. Our simulation validates that MobileNet training with the proposed IMC scheme saves 91.2% in energy and 13.9% in time versus the same task on an NVIDIA GTX 3060 GPU. The proposed IMC design has a data density of 769.2 Kb mm$^{-2}$ with the FP processing circuits included, a $3.5\times$ improvement over prior FP IMC designs.
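The abstract names inner and outer products as operations that training needs, and element-wise (Hadamard) processing as the basic unit of the design. As a rough software illustration only, the sketch below shows one way both products can be expressed as a Hadamard product over broadcast operands, with a row sum standing in for an adder tree. The function names and the broadcasting scheme are assumptions for this example; the paper's actual reformulation of back-propagation is not given in the abstract.

```python
def hadamard(a, b):
    """Element-wise product of two equally shaped matrices (lists of rows)."""
    return [[x * y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def matvec_via_hadamard(w, x):
    """Forward pass y = W x: broadcast x over rows, multiply, sum each row."""
    x_rows = [list(x) for _ in w]          # x repeated once per output row
    return [sum(row) for row in hadamard(w, x_rows)]


def outer_via_hadamard(delta, x):
    """Weight gradient delta x^T as a Hadamard product of two broadcasts."""
    d_cols = [[d] * len(x) for d in delta]  # delta repeated across columns
    x_rows = [list(x) for _ in delta]       # x repeated across rows
    return hadamard(d_cols, x_rows)


if __name__ == "__main__":
    w = [[1.0, 2.0], [3.0, 4.0]]
    x = [5.0, 6.0]
    print(matvec_via_hadamard(w, x))
    print(outer_via_hadamard([1.0, 2.0], [3.0, 4.0, 5.0]))
```

Expressing both products through the same element-wise primitive is what would let a single kind of compute unit serve the forward and backward passes in a scheme like this; whether the hardware broadcasts operands this way is not stated in the source.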

Directory of Open Access Journals