139 research outputs found

    Towards Accurate and High-Speed Spiking Neuromorphic Systems with Data Quantization-Aware Deep Networks

    Full text link
    Deep Neural Networks (DNNs) have gained immense success in cognitive applications and greatly pushed today's artificial intelligence forward. The biggest challenge in executing DNNs is their extremely data-intensive computations. Speed and energy efficiency are constrained when traditional computing platforms are employed for such computationally hungry executions. Spiking neuromorphic computing (SNC) has been widely investigated for deep network implementation owing to its high efficiency in computation and communication. However, the weights and signals of DNNs must be quantized when deploying them on SNC, which results in unacceptable accuracy loss. Previous works mainly focus on weight discretization, while inter-layer signals are largely neglected. In this work, we propose to represent DNNs with fixed integer inter-layer signals and fixed-point weights while maintaining good accuracy. We implement the proposed DNNs on a memristor-based SNC system as a deployment example. With 4-bit data representation, our results show that the accuracy loss can be controlled within 0.02% (2.3%) on MNIST (CIFAR-10). Compared with 8-bit dynamic fixed-point DNNs, our system can achieve more than 9.8x speedup, 89.1% energy saving, and 30% area saving. Comment: 6 pages, 4 figures
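
    To make the scheme concrete, here is a minimal sketch of uniform fixed-point weight quantization and integer inter-layer signals; the max-based per-layer scaling, function names, and shapes are illustrative assumptions, not the paper's exact method:

```python
import numpy as np

def quantize_weights(w, bits=4):
    """Uniform signed fixed-point quantization; the scale is chosen from the
    per-layer max magnitude (an assumption, not the paper's scheme)."""
    qmax = 2 ** (bits - 1) - 1                    # 7 for 4-bit signed
    scale = np.max(np.abs(w)) / qmax
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q.astype(np.int8), scale

def quantize_signals(x, bits=4):
    """Quantize non-negative inter-layer signals to unsigned integers."""
    qmax = 2 ** bits - 1                          # 15 for 4-bit unsigned
    scale = np.max(x) / qmax if np.max(x) > 0 else 1.0
    return np.clip(np.round(x / scale), 0, qmax).astype(np.uint8), scale

# Toy usage: a dense layer computed entirely in integers, rescaled at the end.
w = np.random.randn(16, 8).astype(np.float32)
x = np.random.rand(8).astype(np.float32)
q_w, s_w = quantize_weights(w)
q_x, s_x = quantize_signals(x)
y = (q_w.astype(np.int32) @ q_x.astype(np.int32)) * (s_w * s_x)
```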

    EnforceSNN: Enabling Resilient and Energy-Efficient Spiking Neural Network Inference considering Approximate DRAMs for Embedded Systems

    Full text link
    Spiking Neural Networks (SNNs) have shown capabilities of achieving high accuracy under unsupervised settings and low operational power/energy due to their bio-plausible computations. Previous studies identified that DRAM-based off-chip memory accesses dominate the energy consumption of SNN processing. However, state-of-the-art works do not optimize the DRAM energy-per-access, thereby hindering SNN-based systems from achieving further energy-efficiency gains. To substantially reduce the DRAM energy-per-access, an effective solution is to decrease the DRAM supply voltage, but this may lead to errors in DRAM cells (i.e., so-called approximate DRAM). Towards this, we propose EnforceSNN, a novel design framework that provides a solution for resilient and energy-efficient SNN inference using reduced-voltage DRAM for embedded systems. The key mechanisms of our EnforceSNN are: (1) employing quantized weights to reduce the DRAM access energy; (2) devising an efficient DRAM mapping policy to minimize the DRAM energy-per-access; (3) analyzing the SNN error tolerance to understand its accuracy profile considering different bit error rate (BER) values; (4) leveraging this information to develop an efficient fault-aware training (FAT) scheme that considers different BER values and bit error locations in DRAM to improve the SNN error tolerance; and (5) developing an algorithm to select the SNN model that offers good trade-offs among accuracy, memory, and energy consumption. The experimental results show that our EnforceSNN maintains the accuracy (i.e., no accuracy loss for BER ≤ 10^-3) as compared to the baseline SNN with accurate DRAM, while achieving up to 84.9% DRAM energy saving and up to 4.1x speed-up of DRAM data throughput across different network sizes. Comment: Accepted for publication at Frontiers in Neuroscience - Section Neuromorphic Engineering
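
    As a rough illustration of mechanism (4), fault-aware training exposes the network to the bit errors it will see at deployment. The sketch below injects random bit flips into quantized weights at a target BER; the function name and the uniform error model are assumptions (the paper additionally considers bit error locations and the DRAM mapping policy):

```python
import numpy as np

def inject_bit_errors(q_weights, ber, bits=8, rng=None):
    """Flip random bits of 8-bit quantized weights at bit error rate `ber`,
    emulating reduced-voltage (approximate) DRAM."""
    rng = rng if rng is not None else np.random.default_rng()
    flat = q_weights.astype(np.uint8).ravel().copy()
    n_bits = flat.size * bits
    n_flips = rng.binomial(n_bits, ber)           # expected ber * n_bits flips
    idx = rng.integers(0, n_bits, size=n_flips)   # global bit positions
    masks = (1 << (idx % bits)).astype(np.uint8)  # bit within each byte
    np.bitwise_xor.at(flat, idx // bits, masks)   # handles repeated indices
    return flat.reshape(q_weights.shape)
```

    During fault-aware training, such errors would be injected into the weights at every iteration so that the network learns to tolerate them.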

    Simulation and implementation of novel deep learning hardware architectures for resource constrained devices

    Get PDF
    Corey Lammie designed mixed-signal memristive-complementary metal–oxide–semiconductor (CMOS) and field-programmable gate array (FPGA) hardware architectures, which were used to reduce the power and resource requirements of Deep Learning (DL) systems, both during inference and training. Disruptive design methodologies, such as those explored in this thesis, can be used to facilitate the design of next-generation DL systems.

    DART: Distribution Aware Retinal Transform for Event-based Cameras

    Full text link
    We introduce a generic visual descriptor, termed the Distribution Aware Retinal Transform (DART), that encodes structural context using log-polar grids for event cameras. The DART descriptor is applied to four different problems, namely object classification, tracking, detection, and feature matching: (1) The DART features are directly employed as local descriptors in a bag-of-features classification framework, and testing is carried out on four standard event-based object datasets (N-MNIST, MNIST-DVS, CIFAR10-DVS, N-Caltech101). (2) Extending the classification system, tracking is demonstrated using two key novelties: (i) to overcome the low-sample problem for the one-shot learning of a binary classifier, statistical bootstrapping is leveraged with online learning; (ii) to achieve tracker robustness, the scale and rotation equivariance property of the DART descriptors is exploited for the one-shot learning. (3) To solve the long-term object tracking problem, an object detector is designed using the principle of cluster majority voting. The detection scheme is then combined with the tracker to yield a high intersection-over-union score with augmented ground-truth annotations on the publicly available event camera dataset. (4) Finally, the event context encoded by DART greatly simplifies the feature correspondence problem, especially for spatio-temporal slices far apart in time, which has not been explicitly tackled in the event-based vision domain. Comment: 12 pages, revision submitted to TPAMI in Nov 201
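
    To give a feel for the descriptor, the sketch below histograms a neighborhood of events on a log-polar grid; the grid parameters, normalization, and function name are illustrative assumptions and greatly simplify what DART actually computes:

```python
import numpy as np

def log_polar_descriptor(center, events, n_rings=4, n_wedges=8, r_max=31.0):
    """Bin neighbouring events into (ring, wedge) cells, with ring spacing
    logarithmic in radius, and normalize to a distribution."""
    cx, cy = center
    desc = np.zeros((n_rings, n_wedges))
    for (x, y) in events:
        dx, dy = x - cx, y - cy
        r = np.hypot(dx, dy)
        if r < 1.0 or r > r_max:
            continue                                   # outside the grid
        ring = int(np.log(r) / np.log(r_max) * n_rings)
        wedge = int((np.arctan2(dy, dx) + np.pi) / (2 * np.pi) * n_wedges)
        desc[min(ring, n_rings - 1), wedge % n_wedges] += 1
    return desc / max(desc.sum(), 1.0)
```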

    ๋”ฅ ์ŠคํŒŒ์ดํ‚น ๋‰ด๋Ÿด ๋„คํŠธ์›Œํฌ์˜ ๋น ๋ฅด๊ณ  ์ •ํ™•ํ•œ ์ •๋ณด ์ „๋‹ฌ

    Get PDF
    Doctoral dissertation (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Electrical and Computer Engineering, February 2021. Advisor: Sungroh Yoon. One of the primary reasons behind the recent success of deep neural networks (DNNs) lies in the development of high-performance parallel computing systems and the availability of enormous amounts of data for training a complex model. Nonetheless, solving such advanced machine learning problems in real-world applications requires a more sophisticated model with a vast number of parameters and training data, which leads to substantial amounts of computational overhead and power consumption. Given these circumstances, spiking neural networks (SNNs) have attracted growing interest as the third generation of neural networks due to their event-driven and low-powered nature. SNNs were introduced to mimic how information is encoded and processed in the human brain by employing spiking neurons as computation units. SNNs utilize temporal aspects of information transmission as in biological neural systems, thus providing sparse yet powerful computing ability. SNNs have been successfully applied in several applications, but these only include relatively simple tasks such as image classification, and are limited to shallow neural networks and simple datasets. One of the primary reasons for this limited application scope is the lack of scalable training algorithms, a consequence of the non-differentiable nature of spiking neurons: because information is transmitted and processed via spikes, the backpropagation algorithm widely used in deep neural networks cannot be applied directly.
In this dissertation, we investigate deep SNNs in a much more challenging regression problem (i.e., object detection), and propose the first object detection model in deep SNNs that achieves results comparable to those of DNNs on non-trivial datasets. Furthermore, we introduce novel approaches to improve the performance of the object detection model in terms of accuracy, latency, and energy efficiency. This dissertation covers two main topics: (a) an object detection model in deep SNNs, and (b) improving the performance and efficiency of that model. Together, the two approaches enable fast and accurate object detection in deep SNNs. The first contribution is a spike-based object detection model called Spiking-YOLO. To the best of our knowledge, Spiking-YOLO is the first spike-based object detection model able to achieve results comparable to those of DNNs on non-trivial datasets, namely PASCAL VOC and MS COCO. In doing so, we introduce two novel methods: channel-wise weight normalization and a signed neuron with an imbalanced threshold, both of which provide fast and accurate information transmission in deep SNNs. Our experiments show that Spiking-YOLO achieves remarkable results comparable (up to 98%) to those of Tiny YOLO (a DNN) on PASCAL VOC and MS COCO. Furthermore, Spiking-YOLO on a neuromorphic chip consumes approximately 280 times less energy than Tiny YOLO, and converges 2.3 to 4 times faster than previous DNN-to-SNN conversion methods. The second contribution aims to provide a more effective form of computational capability in SNNs. Although SNNs enable sparse yet efficient information transmission through spike trains, leading to exceptional computational and energy efficiency, the critical challenges in SNNs to date are two-fold: (a) latency, the number of time steps required to achieve competitive results, and (b) synaptic operations, the total number of spikes generated during inference. Without addressing these challenges properly, the potential impact of SNNs may be diminished in terms of energy and power efficiency. We present a threshold voltage balancing method for object detection in SNNs, which utilizes Bayesian optimization to find optimal threshold voltages. We specifically design the Bayesian optimization to consider important characteristics of SNNs, such as latency and the number of synaptic operations. Furthermore, we introduce two-phase threshold voltages to provide faster and more accurate object detection while maintaining high energy efficiency. According to the experimental results, the proposed methods achieve state-of-the-art object detection accuracy in SNNs, and converge 2x and 1.85x faster than conventional methods on PASCAL VOC and MS COCO, respectively.
Moreover, the total number of synaptic operations is reduced by 40.33% and 45.31% on PASCAL VOC and MS COCO, respectively.
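
To sketch the first of the two methods, channel-wise weight normalization rescales each output channel by activation statistics gathered on training data, instead of using one scale per layer. The shapes and helper below are assumptions in the spirit of standard data-based DNN-to-SNN normalization, not the dissertation's exact procedure:

```python
import numpy as np

def channel_wise_normalize(w, lam_prev, lam_cur):
    """Normalize conv weights channel-wise for DNN-to-SNN conversion.

    w:        (out_c, in_c, kh, kw) weights of the current layer
    lam_prev: (in_c,)  per-channel max activations of the previous layer
    lam_cur:  (out_c,) per-channel max activations of the current layer
    """
    lam_cur = np.maximum(lam_cur, 1e-8)      # avoid division by zero
    return w * lam_prev[None, :, None, None] / lam_cur[:, None, None, None]
```

Normalizing per channel rather than per layer keeps channels with small activations from firing too rarely, which is what makes information transmission fast as well as accurate.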

    SpikingJelly: An open-source machine learning infrastructure platform for spike-based intelligence

    Full text link
    Spiking neural networks (SNNs) aim to realize brain-inspired intelligence on neuromorphic chips with high energy efficiency by introducing neural dynamics and spike properties. As the emerging spiking deep learning paradigm attracts increasing interest, traditional programming frameworks cannot meet the demands of automatic differentiation, parallel computation acceleration, and high integration of neuromorphic dataset processing and deployment. In this work, we present the SpikingJelly framework to address this dilemma. We contribute a full-stack toolkit for pre-processing neuromorphic datasets, building deep SNNs, optimizing their parameters, and deploying SNNs on neuromorphic chips. Compared to existing methods, the training of deep SNNs can be accelerated 11×, and the superior extensibility and flexibility of SpikingJelly enable users to accelerate custom models at low cost through multilevel inheritance and semiautomatic code generation. SpikingJelly paves the way for synthesizing truly energy-efficient SNN-based machine intelligence systems, which will enrich the ecology of neuromorphic computing. Comment: Accepted in Science Advances (https://www.science.org/doi/10.1126/sciadv.adi1480)
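
    Since SpikingJelly is a PyTorch-based library, a small usage sketch may help; module paths follow recent releases (the activation_based package, which replaced the older clock_driven naming), so details may vary by version:

```python
import torch
import torch.nn as nn
from spikingjelly.activation_based import neuron, functional, layer

# A tiny convolutional SNN built from SpikingJelly layers and LIF neurons.
net = nn.Sequential(
    layer.Conv2d(1, 16, kernel_size=3, padding=1, bias=False),
    neuron.LIFNode(tau=2.0),
    layer.MaxPool2d(2),
    layer.Flatten(),
    layer.Linear(16 * 14 * 14, 10, bias=False),
    neuron.LIFNode(tau=2.0),
)
functional.set_step_mode(net, 'm')   # multi-step mode: inputs are [T, N, ...]

x = torch.rand(4, 1, 1, 28, 28)      # T=4 time steps, batch size 1, 28x28 input
out = net(x).mean(0)                 # rate-decode by averaging over time steps
functional.reset_net(net)            # reset membrane states between samples
```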

    FireFly: A High-Throughput and Reconfigurable Hardware Accelerator for Spiking Neural Networks

    Full text link
    Spiking neural networks (SNNs) have been widely used due to their strong biological interpretability and high energy efficiency. With the introduction of the backpropagation algorithm and surrogate gradients, the structure of spiking neural networks has become more complex, and the performance gap with artificial neural networks has gradually decreased. However, most SNN hardware implementations for field-programmable gate arrays (FPGAs) cannot meet arithmetic or memory efficiency requirements, which significantly restricts the development of SNNs. They either do not delve into the arithmetic operations between binary spikes and synaptic weights, or assume unlimited on-chip RAM resources by using overly expensive devices on small tasks. To improve arithmetic efficiency, we analyze the neural dynamics of spiking neurons, generalize the SNN arithmetic operation to the multiplex-accumulate operation, and propose a high-performance implementation of this operation by utilizing the DSP48E2 hard block in Xilinx UltraScale FPGAs. To improve memory efficiency, we design a memory system that enables efficient access to synaptic weights and membrane voltages with reasonable on-chip RAM consumption. Combining the above two improvements, we propose an FPGA accelerator that can process spikes generated by firing neurons on the fly (FireFly). FireFly is implemented on several FPGA edge devices with limited resources but still guarantees a peak performance of 5.53 TSOP/s at 300 MHz. As a lightweight accelerator, FireFly achieves the highest computational density efficiency compared with existing research using large FPGA devices.
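
    The multiplex-accumulate idea exploits the fact that spikes are binary, so the synaptic "multiply" degenerates to selecting a weight or zero. Below is a plain-Python sketch of the operation (the DSP48E2 mapping itself is hardware-specific and not reproduced here):

```python
def multiplex_accumulate(spikes, weights):
    """Accumulate the weights whose corresponding spike is 1.

    spikes:  iterable of 0/1 spike bits
    weights: matching iterable of integer synaptic weights
    """
    acc = 0
    for s, w in zip(spikes, weights):
        if s:            # multiplexer: pass the weight through, or skip it
            acc += w     # accumulate -- no hardware multiplier needed
    return acc

print(multiplex_accumulate([1, 0, 1, 1], [3, -2, 5, 1]))   # 3 + 5 + 1 = 9
```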

    Hardware and Software Optimizations for Accelerating Deep Neural Networks: Survey of Current Trends, Challenges, and the Road Ahead

    Get PDF
    Currently, Machine Learning (ML) is becoming ubiquitous in everyday life. Deep Learning (DL) is already present in many applications, ranging from computer vision for medicine to autonomous driving of modern cars, as well as other sectors such as security, healthcare, and finance. However, to achieve impressive performance, these algorithms employ very deep networks that require significant computational power, both during training and at inference time. A single inference of a DL model may require billions of multiply-and-accumulate operations, making DL extremely compute- and energy-hungry. In a scenario where several sophisticated algorithms need to be executed with limited energy and low latency, the need arises for cost-effective hardware platforms capable of implementing energy-efficient DL execution. This paper first introduces the key properties of two brain-inspired models, the Deep Neural Network (DNN) and the Spiking Neural Network (SNN), and then analyzes techniques to produce efficient and high-performance designs. This work summarizes and compares works targeting the four leading platforms for the execution of these algorithms, namely CPU, GPU, FPGA, and ASIC, describing the main solutions of the state of the art and giving prominence to the last two, since they offer greater design flexibility and bear the potential of high energy efficiency, especially for the inference process. In addition to hardware solutions, this paper discusses some of the important security issues that these DNN and SNN models may face during execution, and offers a comprehensive section on benchmarking, explaining how to assess the quality of different networks and of the hardware systems designed for them.
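
    To see where the "billions of multiply-and-accumulate operations" come from, here is a quick back-of-the-envelope count for a single convolutional layer (the layer dimensions are chosen purely for illustration):

```python
def conv_macs(out_h, out_w, out_c, in_c, kh, kw):
    """MAC count for one conv layer: each output element needs
    in_c * kh * kw multiply-accumulates."""
    return out_h * out_w * out_c * in_c * kh * kw

# A mid-network layer with a 56x56x64 output and 3x3 kernels over 64 input channels:
print(conv_macs(56, 56, 64, 64, 3, 3))   # 115,605,504 MACs for this layer alone
```

    Summed over the dozens of layers of a modern network, the total easily reaches billions of MACs per inference.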

    Unleashing the Potential of Spiking Neural Networks by Dynamic Confidence

    Full text link
    This paper presents a new methodology to alleviate the fundamental trade-off between accuracy and latency in spiking neural networks (SNNs). The approach involves decoding confidence information over time from the SNN outputs and using it to develop a decision-making agent that can dynamically determine when to terminate each inference. The proposed method, Dynamic Confidence, provides several significant benefits to SNNs. 1. It can effectively optimize latency dynamically at runtime, setting it apart from many existing low-latency SNN algorithms. Our experiments on the CIFAR-10 and ImageNet datasets demonstrate an average 40% speedup across eight different settings after applying Dynamic Confidence. 2. The decision-making agent in Dynamic Confidence is straightforward to construct and highly robust in parameter space, making it extremely easy to implement. 3. The proposed method enables visualizing the potential of any given SNN, which sets a target for current SNNs to approach. For instance, if an SNN could terminate at the most appropriate time point for each input sample, a ResNet-50 SNN could achieve an accuracy as high as 82.47% on ImageNet within just 4.71 time steps on average. Unlocking the potential of SNNs requires constructing a highly reliable decision-making agent and feeding it a high-quality estimation of the ground truth. In this regard, Dynamic Confidence represents a meaningful step toward realizing the potential of SNNs.
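
    A minimal sketch of the runtime early-termination idea: decode a confidence value at each time step and stop as soon as it crosses a threshold. The per-step callable `snn_step` and the fixed confidence threshold are assumptions; the paper's decision-making agent and its calibration are more involved:

```python
import torch

def dynamic_confidence_inference(snn_step, x, t_max, threshold=0.9):
    """Run an SNN step by step and terminate once confidence is high enough.

    snn_step(x, t) is assumed to return the accumulated output logits for a
    single sample after t time steps; t_max >= 1 bounds the latency."""
    for t in range(1, t_max + 1):
        logits = snn_step(x, t)
        probs = torch.softmax(logits, dim=-1)
        conf, pred = probs.max(dim=-1)
        if conf.item() >= threshold:
            return pred.item(), t      # early exit: latency saved
    return pred.item(), t_max          # fall back to the full time budget
```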