20 research outputs found

    Synaptic array architecture based on NAND flash cell strings

    Get PDF
    ํ•™์œ„๋…ผ๋ฌธ(๋ฐ•์‚ฌ) -- ์„œ์šธ๋Œ€ํ•™๊ต๋Œ€ํ•™์› : ๊ณต๊ณผ๋Œ€ํ•™ ์ „๊ธฐยท์ •๋ณด๊ณตํ•™๋ถ€, 2021.8. ์ด์ข…ํ˜ธ.Neuromorphic computing using synaptic devices has been proposed to efficiently process vector-matrix multiplication (VMM) which is a significant task in DNN. Until now, resistive RAM (RRAM) was mainly used as synaptic devices for neuromorphic computing. However, a number of limitations still exist for RRAMs to implement a large-scale synaptic device array due to device nonideality such as variation, endurance and monolithic integration of RRAMs and CMOS peripheral circuits. Due to these problems, SRAM cells, which are mature silicon memory, have been proposed as synaptic devices. However, SRAM occupies large area (~150 F2 per bitcell) and on-chip SRAM capacity (~a few MB) is insufficient to accommodate a large number of parameters. In this dissertation, synaptic architectures based on NAND flash cell strings are proposed for off-chip learning and on-chip learning. A novel synaptic architecture based on NAND cell strings is proposed as a high-density synapse capable of XNOR operation for binary neural networks (BNNs) in off-chip learning. By changing the threshold voltage of NAND flash cells and input voltages in complementary fashion, the XNOR operation is successfully demonstrated. The large on/off current ratio (~7ร—105) of NAND flash cells can implement high-density and highly reliable BNNs without error correction codes. We propose a novel synaptic architecture based on a NAND flash memory for highly robust and high-density quantized neural networks (QNN) with 4-bit weight. Quantization training can minimize the degradation of the inference accuracy compared to post-training quantization. The proposed operation scheme can implement QNN with higher inference accuracy compared to BNN. 
On-chip learning can significantly reduce the time and energy consumed during training, compensate for the weight variation of synaptic devices, and adapt to changing environments in real time. On-chip learning that exploits the high density of the NAND flash memory structure is therefore of great significance. However, the conventional on-chip learning method used for RRAM arrays cannot be applied when NAND flash cells are used as synaptic devices, because of the cell string structure of NAND flash memory. In this work, a novel synaptic array architecture enabling forward propagation (FP) and backward propagation (BP) in NAND flash memory is proposed for on-chip learning. In the proposed synaptic architecture, positive and negative synaptic weights are separated into different arrays so that the weights can be transposed correctly. In addition, the source lines (SL) are separated, unlike in conventional NAND flash memory, to enable both FP and BP in the NAND flash memory. By applying the input and the error input to the bit lines (BL) and string-select lines (SSL) of the NAND cell array, respectively, accurate vector-matrix multiplication is performed in both FP and BP, eliminating the effect of pass cells. The proposed on-chip learning system is much more robust to weight variation than the off-chip learning system. Finally, the superiority of the proposed on-chip learning architecture is verified by circuit simulation of a neural network.
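The role of the separated positive and negative weight arrays, with FP using the weight matrix and BP its transpose, can be sketched with small NumPy arrays (array names and shapes are illustrative assumptions, not the dissertation's notation):

```python
import numpy as np

def forward(g_pos, g_neg, x):
    # The effective signed weight of each synapse is the difference
    # between its positive-array and negative-array conductances
    return (g_pos - g_neg) @ x

def backward(g_pos, g_neg, delta):
    # Error propagation uses the transpose of the same effective weights
    return (g_pos - g_neg).T @ delta

rng = np.random.default_rng(1)
g_pos = rng.random((4, 3))   # illustrative 4x3 positive-weight array
g_neg = rng.random((4, 3))   # matching negative-weight array
x = rng.random(3)            # input vector (applied to bit lines)
delta = rng.random(4)        # error vector (applied to string-select lines)

W = g_pos - g_neg
assert np.allclose(forward(g_pos, g_neg, x), W @ x)
assert np.allclose(backward(g_pos, g_neg, delta), W.T @ delta)
```

Splitting each signed weight across two non-negative arrays is what lets the same physical cells be read in both directions without rewriting them.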
๊ทธ๋Ÿฌ๋‚˜ RRAM์€ ์†Œ์ž์˜ ์‚ฐํฌ๊ฐ€ ํฌ๊ณ  ์‹ ๋ขฐ์„ฑ์ด ์ข‹์ง€ ์•Š์œผ๋ฉฐ CMOS ์ฃผ๋ณ€ ํšŒ๋กœ์™€ ํ†ตํ•ฉ์ด ์–ด๋ ค์šด ๋ฌธ์ œ๋กœ ์ธํ•ด ๋Œ€๊ทœ๋ชจ ์‹œ๋ƒ…์Šค ์†Œ์ž ์–ด๋ ˆ์ด๋ฅผ ๊ตฌํ˜„ํ•˜๋Š” ๋ฐ๋Š” ์—ฌ์ „ํžˆ ๋งŽ์€ ์ œํ•œ์ด ์žˆ๋‹ค. ์ด๋Ÿฌํ•œ ๋ฌธ์ œ๋กœ ์ธํ•ด ์„ฑ์ˆ™ํ•œ ์‹ค๋ฆฌ์ฝ˜ ๋ฉ”๋ชจ๋ฆฌ์ธ SRAM ์…€์ด ์‹œ๋ƒ…์Šค ์†Œ์ž๋กœ ์ œ์•ˆ๋˜๊ณ  ์žˆ๋‹ค. ๊ทธ๋Ÿฌ๋‚˜ SRAM์€ ์…€ ๋‹น ๋ฉด์  (~150 F2 per bitcell)์ด ํฌ๊ณ  ๋˜ํ•œ ์˜จ์นฉ SRAM ์šฉ๋Ÿ‰ (~a few MB) ์€ ๋งŽ์€ ํŒŒ๋ผ๋ฏธํ„ฐ๋ฅผ ์ˆ˜์šฉํ•˜๊ธฐ์— ์ถฉ๋ถ„ํ•˜์ง€ ์•Š๋‹ค. ๋ณธ ๋…ผ๋ฌธ์—์„œ๋Š” ์˜คํ”„ ์นฉ ํ•™์Šต๊ณผ ์˜จ ์นฉ ํ•™์Šต์„ ์œ„ํ•ด NAND ํ”Œ๋ž˜์‹œ ์…€ ์ŠคํŠธ๋ง์„ ๊ธฐ๋ฐ˜์œผ๋กœ ํ•˜๋Š” ์‹œ๋ƒ…์Šค ์•„ํ‚คํ…์ฒ˜๋ฅผ ์ œ์•ˆํ•œ๋‹ค. NAND ์…€ ์ŠคํŠธ๋ง ๊ธฐ๋ฐ˜์˜ ์ƒˆ๋กœ์šด ์‹œ๋ƒ…์Šค ์•„ํ‚คํ…์ฒ˜๋Š” ์˜คํ”„ ์นฉ ํ•™์Šต์—์„œ ์ด์ง„ ์‹ ๊ฒฝ๋ง (BNN)์„ ์œ„ํ•œ XNOR ์—ฐ์‚ฐ์ด ๊ฐ€๋Šฅํ•œ ๊ณ ๋ฐ€๋„ ์‹œ๋ƒ…์Šค๋กœ ์‚ฌ์šฉ๋œ๋‹ค. ์ƒํ˜ธ ๋ณด์™„์ ์ธ ๋ฐฉ์‹์œผ๋กœ NAND ํ”Œ๋ž˜์‹œ ์…€์˜ ์ž„๊ณ„ ์ „์••๊ณผ ์ž…๋ ฅ ์ „์••์„ ๋ณ€๊ฒฝํ•จ์œผ๋กœ์จ XNOR ์—ฐ์‚ฐ์„ ์„ฑ๊ณต์ ์œผ๋กœ ์ˆ˜ํ–‰ํ•œ๋‹ค. NAND ํ”Œ๋ž˜์‹œ ์…€์˜ ํฐ ์˜จ/์˜คํ”„ ์ „๋ฅ˜ ๋น„์œจ(~ 7x105)์€ ECC ์—†์ด ๊ณ ๋ฐ€๋„ ๋ฐ ๊ณ ์‹ ๋ขฐ์„ฑ์˜ BNN์„ ๊ตฌํ˜„ํ•  ์ˆ˜ ์žˆ๋‹ค. ์šฐ๋ฆฌ๋Š” 4๋น„ํŠธ ๊ฐ€์ค‘์น˜๋ฅผ ๊ฐ–๋Š” ๋งค์šฐ ๊ฒฌ๊ณ ํ•˜๋ฉฐ ๊ณ ์ง‘์ ์˜ ์–‘์žํ™”๋œ ์‹ ๊ฒฝ๋ง(QNN)์„ ์œ„ํ•œ NAND ํ”Œ๋ž˜์‹œ ๋ฉ”๋ชจ๋ฆฌ๋ฅผ ๊ธฐ๋ฐ˜์˜ ์ƒˆ๋กœ์šด ์‹œ๋ƒ…ํ‹ฑ ์•„ํ‚คํ…์ฒ˜๋ฅผ ์ œ์•ˆํ•œ๋‹ค. ์–‘์žํ™” ํ•™์Šต์€ ํ›ˆ๋ จ ํ›„ ์–‘์žํ™”์— ๋น„ํ•ด ์ถ”๋ก  ์ •ํ™•๋„์˜ ์ €ํ•˜๋ฅผ ์ตœ์†Œํ™”ํ•  ์ˆ˜ ์žˆ๋‹ค. ์ œ์•ˆํ•˜๋Š” ๋™์ž‘ ๋ฐฉ์‹์€ BNN์— ๋น„ํ•ด ๋” ๋†’์€ ์ถ”๋ก  ์ •ํ™•๋„๋ฅผ ๊ฐ€์ง€๋Š” QNN์„ ๊ตฌํ˜„ํ•  ์ˆ˜ ์žˆ๋‹ค. ์˜จ ์นฉ ํ•™์Šต์€ ํ›ˆ๋ จ ์ค‘ ์‹œ๊ฐ„๊ณผ ์—๋„ˆ์ง€ ์†Œ๋น„๋ฅผ ํฌ๊ฒŒ ์ค„์ด๊ณ  ์‹œ๋ƒ…์Šค ์†Œ์ž์˜ ์‚ฐํฌ๋ฅผ ๋ณด์ƒํ•˜๋ฉฐ ๋ณ€ํ™”ํ•˜๋Š” ํ™˜๊ฒฝ์— ์‹ค์‹œ๊ฐ„์œผ๋กœ ์ ์‘ํ•  ์ˆ˜ ์žˆ๋‹ค. NAND ํ”Œ๋ž˜์‹œ ๋ฉ”๋ชจ๋ฆฌ ๊ตฌ์กฐ์˜ ๋†’์€ ์ง‘์ ๋„๋ฅผ ์‚ฌ์šฉํ•œ ์˜จ ์นฉ ํ•™์Šต์€ ๋งค์šฐ ์œ ์šฉํ•˜๋‹ค. 
๊ทธ๋Ÿฌ๋‚˜ ๊ธฐ์กด์˜ RRAM ์–ด๋ ˆ์ด์— ์‚ฌ์šฉ๋˜๋Š” ์˜จ ์นฉ ํ•™์Šต ๋ฐฉ๋ฒ•์€ NAND ํ”Œ๋ž˜์‹œ ๋ฉ”๋ชจ๋ฆฌ์˜ ์…€ ์ŠคํŠธ๋ง ๊ตฌ์กฐ๋กœ ์ธํ•ด NAND ํ”Œ๋ž˜์‹œ ์…€์„ ์‹œ๋ƒ…์Šค ์†Œ์ž๋กœ ์‚ฌ์šฉํ•˜๋Š” ๊ฒฝ์šฐ ํ™œ์šฉํ•  ์ˆ˜ ์—†๋‹ค. ์ด ์—ฐ๊ตฌ์—์„œ๋Š” ์˜จ ์นฉ ํ•™์Šต์„ ์œ„ํ•ด NAND ํ”Œ๋ž˜์‹œ ๋ฉ”๋ชจ๋ฆฌ์—์„œ ์ˆœ๋ฐฉํ–ฅ ์ „ํŒŒ (FP) ๋ฐ ์—ญ๋ฐฉํ–ฅ ์ „ํŒŒ (BP)๋ฅผ ๊ฐ€๋Šฅํ•˜๊ฒŒ ํ•˜๋Š” ์ƒˆ๋กœ์šด ์‹œ๋ƒ…์Šค ์–ด๋ ˆ์ด ์•„ํ‚คํ…์ฒ˜๋ฅผ ์ œ์•ˆํ•œ๋‹ค. ์ œ์•ˆ๋œ ์‹œ๋ƒ…์Šค ์•„ํ‚คํ…์ฒ˜์—์„œ๋Š” ๊ฐ€์ค‘์น˜๊ฐ€ ์˜ฌ๋ฐ”๋ฅด๊ฒŒ ์ „์น˜๋  ์ˆ˜ ์žˆ๋„๋ก ์–‘์˜ ์‹œ๋ƒ…์Šค ๊ฐ€์ค‘์น˜์™€ ์Œ์˜ ์‹œ๋ƒ…์Šค ๊ฐ€์ค‘์น˜๊ฐ€ ์„œ๋กœ ๋‹ค๋ฅธ ์–ด๋ ˆ์ด๋กœ ๋ถ„๋ฆฌ๋œ๋‹ค. ๋˜ํ•œ ๊ธฐ์กด NAND ํ”Œ๋ž˜์‹œ ๋ฉ”๋ชจ๋ฆฌ์™€ ๋‹ฌ๋ฆฌ ์†Œ์Šค ๋ผ์ธ (SL)์„ ๋ถ„๋ฆฌํ•˜์—ฌ NAND ํ”Œ๋ž˜์‹œ ๋ฉ”๋ชจ๋ฆฌ์—์„œ ์ˆœ๋ฐฉํ–ฅ ์ „ํŒŒ์™€ ์—ญ๋ฐฉํ–ฅ ์ „ํŒŒ๋ฅผ ๋ชจ๋‘ ์—ฐ์‚ฐํ•  ์ˆ˜ ์žˆ๋‹ค. NAND ์…€ ์–ด๋ ˆ์ด์˜ ๋น„ํŠธ ๋ผ์ธ (BL) ๋ฐ ์ŠคํŠธ๋ง ์„ ํƒ ๋ผ์ธ (SSL)์— ๊ฐ๊ฐ ์ž…๋ ฅ ๋ฐ ์˜ค๋ฅ˜ ์ž…๋ ฅ์„ ์ธ๊ฐ€ํ•จ์œผ๋กœ์จ PASS ์…€์˜ ํšจ๊ณผ๋ฅผ ์ œ๊ฑฐํ•˜์—ฌ ์ˆœ๋ฐฉํ–ฅ ์ „ํŒŒ ๋ฐ ์—ญ๋ฐ•ํ–ฅ ์ „ํŒŒ ๋ชจ๋‘์—์„œ ์ •ํ™•ํ•œ ๋ฒกํ„ฐ ํ–‰๋ ฌ ๊ณฑ์…ˆ์ด ์„ฑ๊ณต์ ์œผ๋กœ ์ˆ˜ํ–‰๋˜๋„๋ก ํ•œ๋‹ค. ์ œ์•ˆ๋œ ์˜จ ์นฉ ํ•™์Šต ์‹œ์Šคํ…œ์€ ์˜คํ”„ ์นฉ ํ•™์Šต ์‹œ์Šคํ…œ์— ๋น„ํ•ด ์†Œ์ž์˜ ์‚ฐํฌ์— ๋Œ€ํ•ด ํ›จ์”ฌ ์˜ํ–ฅ์ด ์ ๋‹ค. 
๋งˆ์ง€๋ง‰์œผ๋กœ, ์ œ์•ˆ๋œ ์˜จ ์นฉ ํ•™์Šต ์•„ํ‚คํ…์ฒ˜์˜ ์šฐ์ˆ˜์„ฑ์„ ์‹ ๊ฒฝ๋ง์˜ ํšŒ๋กœ ์‹œ๋ฎฌ๋ ˆ์ด์…˜์„ ํ†ตํ•ด ๊ฒ€์ฆํ•˜์˜€๋‹ค.Chapter 1 Introduction 1 1.1 Background 1 Chapter 2 Binary neural networks based on NAND flash memory 7 2.1 Synaptic architecture for BNN 7 2.2 Measurement results 13 2.3 Binary neuron circuit 23 2.4 Simulation results 27 2.5 Differential scheme 32 2.5.1 Differential synaptic architecture 32 2.5.2 Simulation results 41 Chapter 3 Quantized neural networks based on NAND flash memory 47 3.1 Synaptic architecture for QNN 47 3.2 Measurement results 55 3.3 Simulation results 66 Chapter 4 On-chip learning based on NAND flash memory 74 4.1 Synaptic architecture for on-chip learning 74 4.2 Measurement results 82 4.3 Neuron circuits 90 4.4 Simulation results 93 Chapter 5 Conclusion 100 Bibliography 104 Abstract in Korean 111๋ฐ•

    ClaPIM: Scalable Sequence CLAssification using Processing-In-Memory

    Full text link
    DNA sequence classification is a fundamental task in computational biology with vast implications for applications such as disease prevention and drug design. Fast, high-quality sequence classifiers are therefore of significant importance. This paper introduces ClaPIM, a scalable DNA sequence classification architecture based on the emerging concept of hybrid in-crossbar and near-crossbar memristive processing-in-memory (PIM). We enable efficient and high-quality classification by uniting the filter and search stages within a single algorithm. Specifically, we propose a custom filtering technique that drastically narrows the search space and a search approach that facilitates approximate string matching through a distance function. ClaPIM is the first PIM architecture for scalable approximate string matching that benefits from the high density of memristive crossbar arrays and the massive computational parallelism of PIM. Compared with Kraken2, a state-of-the-art software classifier, ClaPIM provides significantly higher classification quality (up to 20x improvement in F1 score) and also demonstrates a 1.8x throughput improvement. Compared with EDAM, a recently proposed SRAM-based accelerator that is restricted to small datasets, we observe both a 30.4x improvement in normalized throughput per area and a 7% increase in classification precision.
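A software analogue of the filter-then-search pipeline described above might combine a k-mer prefilter (to narrow the candidate set) with edit-distance ranking (approximate string matching through a distance function). This is a generic sketch, not ClaPIM's actual filter; the parameters `k` and `min_shared` are illustrative assumptions.

```python
def kmers(s, k):
    """All length-k substrings of s, as a set."""
    return {s[i:i + k] for i in range(len(s) - k + 1)}

def edit_distance(a, b):
    """Levenshtein distance via a single-row dynamic program."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,        # delete from a
                                     dp[j - 1] + 1,    # insert into a
                                     prev + (ca != cb))  # substitute
    return dp[-1]

def classify(query, references, k=4, min_shared=2):
    """Filter stage: keep references sharing enough k-mers with the query.
    Search stage: return the surviving reference closest in edit distance."""
    q = kmers(query, k)
    candidates = [r for r in references if len(q & kmers(r, k)) >= min_shared]
    return min(candidates, key=lambda r: edit_distance(query, r), default=None)
```

The filter discards most references cheaply so that the expensive distance computation runs only on a small candidate set, which mirrors the two-stage structure of the hardware pipeline.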

    Validation practices for satellite based earth observation data across communities

    Get PDF
    Assessing the inherent uncertainties in satellite data products is a challenging task. Different technical approaches have been developed in the Earth Observation (EO) communities to address the validation problem, resulting in a large variety of methods as well as terminology. This paper reviews state-of-the-art methods of satellite validation and documents their similarities and differences. First, the overall validation objectives and terminologies are specified, followed by a generic mathematical formulation of the validation problem. Metrics currently used, as well as more advanced EO validation approaches, are introduced thereafter. An outlook on the applicability and requirements of current EO validation approaches and targets is given.
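The basic agreement metrics that recur across such validation exercises, namely the bias, the RMSE, and the standard deviation of the differences, can be computed as follows. This is a generic sketch, not any particular community's prescribed protocol, and it ignores collocation and representativeness errors.

```python
import numpy as np

def validation_metrics(satellite, reference):
    """Simple difference statistics between a satellite product and
    collocated reference measurements (assumed already matched up)."""
    d = np.asarray(satellite, dtype=float) - np.asarray(reference, dtype=float)
    return {
        "bias": d.mean(),                   # systematic difference
        "rmse": np.sqrt((d ** 2).mean()),   # total error magnitude
        "std": d.std(ddof=1),               # random component of the error
    }

m = validation_metrics([1.1, 2.0, 2.9], [1.0, 2.0, 3.0])
```

Note that RMSE² = bias² + (biased) variance of the differences, so reporting bias and standard deviation separately is generally more informative than RMSE alone.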

    Synchronization Controller To Solve The Mismatched Sampling Rates For Acoustic Echo Cancellation

    Get PDF
    Voice over Internet Protocol (VoIP) applications are extensively used for handsfree communication (audio conferencing and video conferencing). Although handsfree communication systems may encounter acoustic echo problems, such problems can be solved using acoustic echo cancellation (AEC).
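A common AEC baseline, before addressing the sampling-rate mismatch this work targets, is a normalized LMS (NLMS) adaptive filter that estimates the echo path from the far-end signal and subtracts the echo estimate from the microphone signal. The sketch below assumes matched sampling rates and illustrative parameter values.

```python
import numpy as np

def nlms_echo_cancel(far_end, mic, taps=64, mu=0.5, eps=1e-8):
    """NLMS adaptive echo canceller.

    far_end: loudspeaker (reference) signal
    mic:     microphone signal containing the echo
    Returns the residual (echo-cancelled) signal and the learned filter.
    """
    w = np.zeros(taps)        # adaptive estimate of the echo path
    x_buf = np.zeros(taps)    # most recent far-end samples, newest first
    out = np.zeros(len(mic))
    for n in range(len(mic)):
        x_buf = np.roll(x_buf, 1)
        x_buf[0] = far_end[n]
        echo_hat = w @ x_buf
        e = mic[n] - echo_hat                     # residual after removal
        w += mu * e * x_buf / (x_buf @ x_buf + eps)  # normalized update
        out[n] = e
    return out, w
```

When the loudspeaker and microphone run at slightly different sampling rates, the echo path seen by this filter drifts over time, which is precisely why the synchronization controller in the title is needed.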

    Advanced modeling of nanoscale devices for analog applications

    Get PDF
    The abstract is in the attachment.

    COMPUTE-IN-MEMORY WITH EMERGING NON-VOLATILE MEMORIES FOR ACCELERATING DEEP NEURAL NETWORKS

    Get PDF
    The objective of this research is to accelerate deep neural networks (DNNs) with an emerging non-volatile memory (eNVM) based compute-in-memory (CIM) architecture. The research first focuses on inference acceleration and proposes a resistive random access memory (RRAM) based CIM architecture. Two generations of RRAM testchips, which monolithically integrate the RRAM memory array and CMOS peripheral circuits, are designed and fabricated using Winbond 90 nm and TSMC 40 nm commercial embedded RRAM processes, respectively. The first-generation testchip, named XNOR-RRAM, is dedicated to binary neural networks (BNNs), and the second generation, named Flex-RRAM, features 1-bit to 8-bit run-time configurable precision and leverages the input sparsity of the DNN model to improve throughput and energy efficiency. However, the non-ideal characteristics of eNVM devices, especially when utilized as multi-level analog synaptic weights, may incur notable accuracy degradation for both training and inference. This research develops a PyTorch-based framework that incorporates the device characteristics into the DNN model to evaluate the impact of eNVM nonidealities on training/inference accuracy. The results suggest that it is challenging to directly use eNVMs for in-situ training, and that resistance drift remains a critical challenge to maintaining high inference accuracy. Furthermore, to overcome the asymmetric conductance tuning behavior of typical eNVMs, found to be the most critical nonideality preventing the model from achieving software-equivalent training accuracy, this research proposes a novel 2-transistor-1-FeFET (ferroelectric field-effect transistor) synaptic weight cell that exploits hybrid precision for in-situ training and inference, achieving near-software classification accuracy on the MNIST and CIFAR-10 datasets.
    Ph.D.
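The idea of incorporating device characteristics into a DNN model can be emulated by quantizing weights to discrete conductance levels and perturbing each stored level. This standalone NumPy sketch is in the spirit of, but not taken from, the PyTorch framework described above; the Gaussian noise model and `sigma` are illustrative assumptions.

```python
import numpy as np

def quantize(w, bits=4):
    """Uniform symmetric quantization to 2^(bits-1)-1 signed levels,
    mimicking multi-level conductance states."""
    n = 2 ** (bits - 1) - 1               # e.g. 7 levels per polarity for 4 bits
    scale = np.max(np.abs(w)) / n         # conductance step size
    q = np.clip(np.round(w / scale), -n, n)
    return q * scale, scale

def apply_device_variation(w_q, scale, sigma=0.1, rng=None):
    """Model device-to-device variation as Gaussian noise on each stored
    level; sigma is expressed in units of one quantization step."""
    if rng is None:
        rng = np.random.default_rng()
    return w_q + rng.normal(0.0, sigma * scale, size=w_q.shape)
```

Running inference with `apply_device_variation` applied to every layer's weights gives a quick estimate of how much accuracy a given variation level would cost, before committing to a hardware design.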

    Sensing ECG signals with variable pulse width finite rate of innovation

    Get PDF
    Mobile health is gradually gaining importance in our society, and the need for power-efficient devices that can acquire biosignals over long periods of time is becoming substantial. In this thesis, we study the power reduction achievable in ECG sensing devices. Emphasis is placed on reducing the number of samples both during the sensing phase and the compression phase. To that end, a new scheme called variable pulse width finite rate of innovation (VPW-FRI) is investigated. This technique builds on classical finite rate of innovation (FRI) theory and models ECG signals as a sum of asymmetric Cauchy-based pulses. Research is done to implement VPW in practice, and its performance is carefully analysed. Among other aspects, we consider the potential instability of the method, study its compression effectiveness, and compare it with compression schemes widespread in the literature. We also evaluate the spectrum extrapolation performance of VPW when fed with signals sampled at sub-Nyquist rates and propose a modification that improves it. Furthermore, we introduce a method based on the similarities between different heart beats that reduces the computational cost of VPW. The parametric nature of VPW finally allows us to use it as a noise-reduction algorithm. In parallel, we review and test a non-uniform sensing technique that adapts the sampling rate to the slope of the signal.
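The pulse model underlying VPW-FRI can be sketched with the symmetric special case of the asymmetric Cauchy pulses it uses: each pulse is a Lorentzian parameterized by location, width, and amplitude. The pulse placements below are illustrative, loosely evoking P, QRS, and T waves, and are not fitted to a real ECG.

```python
import numpy as np

def vpw_pulse(t, t0, width, amp):
    """Symmetric Cauchy (Lorentzian) pulse: amplitude amp at t0,
    half-width controlled by `width`."""
    return amp * width ** 2 / ((t - t0) ** 2 + width ** 2)

def vpw_signal(t, pulses):
    """Model a beat as a sum of (t0, width, amp) Cauchy pulses."""
    return sum(vpw_pulse(t, *p) for p in pulses)

t = np.linspace(0.0, 1.0, 1000)
# Hypothetical placement: broad low P wave, narrow tall QRS, broad T wave
ecg = vpw_signal(t, [(0.25, 0.03, 0.2), (0.5, 0.01, 1.0), (0.75, 0.05, 0.3)])
```

Because each pulse is described by only three (four, with asymmetry) parameters, a beat is summarized by a handful of numbers instead of hundreds of samples, which is the source of the compression gains discussed above.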