Search CORE

47 research outputs found

Improving Performance and Endurance for Crossbar Resistive Memory

Author: Wen Wen
Publication venue
Publication date: 28/09/2020
Field of study

Resistive Memory (ReRAM) has emerged as a promising non-volatile memory technology that may replace a significant portion of DRAM in future computer systems. When adopting crossbar architecture, ReRAM cell can achieve the smallest theoretical size in fabrication, ideally for constructing dense memory with large capacity. However, crossbar cell structure suffers from severe performance and endurance degradations, which come from large voltage drops on long wires. In this dissertation, I first study the correlation between the ReRAM cell switching latency and the number of cells in low resistant state (LRS) along bitlines, and propose to dynamically speed up write operations based on bitline data patterns. By leveraging the intrinsic in-memory processing capability of ReRAM crossbars, a low overhead runtime profiler that effectively tracks the data patterns in different bitlines is proposed. To achieve further write latency reduction, data compression and row address dependent memory data layout are employed to reduce the numbers of LRS cells on bitlines. Moreover, two optimization techniques are presented to mitigate energy overhead brought by bitline data patterns tracking. Second, I propose XWL, a novel table-based wear leveling scheme for ReRAM crossbars and study the correlation between write endurance and voltage stress in ReRAM crossbars. By estimating and tracking the effective write stress to different rows at runtime, XWL chooses the ones that are stressed the most to mitigate. Additionally, two extended scenarios are further examined for the performance and endurance issues in neural network accelerators as well as 3D vertical ReRAM (3D-VRAM) arrays. For the ReRAM crossbar-based accelerators, by exploiting the wearing out mechanism of ReRAM cell, a novel comprehensive framework, ReNEW, is proposed to enhance the lifetime of the ReRAM crossbar-based accelerators, particularly for neural network training. To reduce the write latency in 3D-VRAM arrays, a collection of techniques, including an in-memory data encoding scheme, a data pattern estimator for assessing cell resistance distributions, and a write time reduction scheme that opportunistically reduces RESET latency with runtime data patterns, are devised

D-Scholarship@Pitt

In-Memory Computing for Machine Learning and Deep Learning

Author: Cattaneo L.
Farronato M.
Glukhov A.
Ielmini D.
Lepri N.
Mannocci P.
Publication venue
Publication date: 01/01/2023
Field of study

Archivio istituzionale della ricerca - Politecnico di Milano

A Phase Change Memory and DRAM Based Framework For Energy-Efficient and High-Speed In-Memory Stochastic Computing

Author: Mysore Supreeth
Publication venue: UKnowledge
Publication date: 01/01/2023
Field of study

Convolutional Neural Networks (CNNs) have proven to be highly effective in various fields related to Artificial Intelligence (AI) and Machine Learning (ML). However, the significant computational and memory requirements of CNNs make their processing highly compute and memory-intensive. In particular, the multiply-accumulate (MAC) operation, which is a fundamental building block of CNNs, requires enormous arithmetic operations. As the input dataset size increases, the traditional processor-centric von-Neumann computing architecture becomes ill-suited for CNN-based applications. This results in exponentially higher latency and energy costs, making the processing of CNNs highly challenging. To overcome these challenges, researchers have explored the Processing-In Memory (PIM) technique, which involves placing the processing unit inside or near the memory unit. This approach reduces data migration length and utilizes the internal memory bandwidth at the memory chip level. However, developing a reliable PIM-based system with minimal hardware modifications and design complexity remains a significant challenge. The proposed solution in the report suggests utilizing different memory technologies, such as Dynamic RAM (DRAM) and phase change memory (PCM), with Stochastic arithmetic and minimal add-on logic. Stochastic computing is a technique that uses random numbers to perform arithmetic operations instead of traditional binary representation. This technique reduces hardware requirements for CNN\u27s arithmetic operations, making it possible to implement them with minimal add-on logic. The report details the workflow for performing arithmetical operations used by CNNs, including MAC, activation, and floating-point functions. The proposed solution includes designs for scalable Stochastic Number Generator (SNG), DRAM CNN accelerator, non-volatile memory (NVM) class PCRAM-based CNN accelerator, and DRAM-based stochastic to binary conversion (StoB) for in-situ deep learning. These designs utilize stochastic computing to reduce the hardware requirements for CNN\u27s arithmetic operations and enable energy and time-efficient processing of CNNs. The report also identifies future research directions for the proposed designs, including in-situ PCRAM-based SNG, ODIN (A Bit-Parallel Stochastic Arithmetic Based Accelerator for In-Situ Neural Network Processing in Phase Change RAM), ATRIA (Bit-Parallel Stochastic Arithmetic Based Accelerator for In-DRAM CNN Processing), and AGNI (In-Situ, Iso-Latency Stochastic-to-Binary Number Conversion for In-DRAM Deep Learning), and presents initial findings for these ideas. In summary, the proposed solution in the report offers a comprehensive approach to address the challenges of processing CNNs, and the proposed designs have the potential to improve the energy and time efficiency of CNNs significantly. Using Stochastic Computing and different memory technologies enables the development of reliable PIM-based systems with minimal hardware modifications and design complexity, providing a promising path for the future of CNN-based applications

University of Kentucky

에너지 효율적 인공신경망 설계

Author: 김재현
Publication venue: 서울대학교 대학원
Publication date: 01/02/2019
Field of study

학위논문 (박사)-- 서울대학교 대학원 : 공과대학 전기·정보공학부, 2019. 2. 최기영.최근 심층 학습은 이미지 분류, 음성 인식 및 강화 학습과 같은 영역에서 놀라운 성과를 거두고 있다. 최첨단 심층 인공신경망 중 일부는 이미 인간의 능력을 넘어선 성능을 보여주고 있다. 그러나 인공신경망은 엄청난 수의 고정밀 계산과 수백만개의 매개 변수를 이용하기 위한 빈번한 메모리 액세스를 수반한다. 이는 엄청난 칩 공간과 에너지 소모 문제를 야기하여 임베디드 시스템에서 인공신경망이 사용되는 것을 제한하게 된다. 이 문제를 해결하기 위해 인공신경망을 높은 에너지 효율성을 갖도록 설계하는 방법을 제안한다. 첫번째 파트에서는 가중 스파이크를 이용하여 짧은 추론 시간과 적은 에너지 소모의 장점을 갖는 스파이킹 인공신경망 설계 방법을 다룬다. 스파이킹 인공신경망은 인공신경망의 높은 에너지 소비 문제를 극복하기 위한 유망한 대안 중 하나이다. 기존 연구에서 심층 인공신경망을 정확도 손실없이 스파이킹 인공신경망으로 변환하는 방법이 발표되었다. 그러나 기존의 방법들은 rate coding을 사용하기 때문에 긴 추론 시간을 갖게 되고 이것이 많은 에너지 소모를 야기하게 되는 단점이 있다. 이 파트에서는 페이즈에 따라 다른 스파이크 가중치를 부여하는 방법으로 추론 시간을 크게 줄이는 방법을 제안한다. MNIST, SVHN, CIFAR-10, CIFAR-100 데이터셋에서의 실험 결과는 제안된 방법을 이용한 스파이킹 인공신경망이 기존 방법에 비해 큰 폭으로 추론 시간과 스파이크 발생 빈도를 줄여서 보다 에너지 효율적으로 동작함을 보여준다. 두번째 파트에서는 공정 변이가 있는 상황에서 동작하는 고에너지효율 아날로그 인공신경망 설계 방법을 다루고 있다. 인공신경망을 아날로그 회로를 사용하여 구현하면 높은 병렬성과 에너지 효율성을 얻을 수 있는 장점이 있다. 하지만, 아날로그 시스템은 노이즈에 취약한 중대한 결점을 가지고 있다. 이러한 노이즈 중 하나로 공정 변이를 들 수 있는데, 이는 아날로그 회로의 적정 동작 지점을 변화시켜 심각한 성능 저하 또는 오동작을 유발하는 원인이다. 이 파트에서는 ReRAM에 기반한 고에너지 효율 아날로그 이진 인공신경망을 구현하고, 공정 변이 문제를 해결하기 위해 활성도 일치 방법을 사용한 공정 변이 보상 기법을 제안한다. 제안된 인공신경망은 1T1R 구조의 ReRAM 배열과 차동증폭기를 이용한 뉴런을 이용하여 고밀도 집적과 고에너지 효율 동작이 가능하게 구성되었다. 또한, 아날로그 뉴런 회로의 공정 변이 취약성 문제를 해결하기 위해 이상적인 뉴런의 활성도와 동일한 활성도를 갖도록 뉴런의 바이어스를 조절하는 방법을 소개한다. 제안된 방법을 사용하여 32nm 공정에서 구현된 인공신경망은 3-sigma 지점에서 50% 문턱 전압 변이와 15%의 저항값 변이가 있는 상황에서도 MNIST에서 98.55%, CIFAR-10에서 89.63%의 정확도를 달성하였으며, 970 TOPS/W에 달하는 매우 높은 에너지 효율성을 달성하였다.Recently, deep learning has shown astounding performances on specific tasks such as image classification, speech recognition, and reinforcement learning. Some of the state-of-the-art deep neural networks have already gone over humans ability. However, neural networks involve tremendous number of high precision computations and frequent off-chip memory accesses with millions of parameters. It incurs problems of large area and exploding energy consumption, which hinder neural networks from being exploited in embedded systems. To cope with the problem, techniques for designing energy efficient neural networks are proposed. The first part of this dissertation addresses the design of spiking neural networks with weighted spikes which has advantages of shorter inference latency and smaller energy consumption compared to the conventional spiking neural networks. Spiking neural networks are being regarded as one of the promising alternative techniques to overcome the high energy costs of artificial neural networks. It is supported by many researches showing that a deep convolutional neural network can be converted into a spiking neural network with near zero accuracy loss. However, the advantage on energy consumption of spiking neural networks comes at a cost of long classification latency due to the use of Poisson-distributed spike trains (rate coding), especially in deep networks. We propose to use weighted spikes, which can greatly reduce the latency by assigning a different weight to a spike depending on which time phase it belongs. Experimental results on MNIST, SVHN, CIFAR-10, and CIFAR-100 show that the proposed spiking neural networks with weighted spikes achieve significant reduction in classification latency and number of spikes, which leads to faster and more energy-efficient spiking neural networks than the conventional spiking neural networks with rate coding. We also show that one of the state-of-the-art networks the deep residual network can be converted into spiking neural network without accuracy loss. The second part of this dissertation focuses on the design of highly energy-efficient analog neural networks in the presence of variations. Analog hardware accelerators for deep neural networks have taken center stage in the aspect of high parallelism and energy efficiency. However, a critical weakness of the analog hardware systems is vulnerability to noise. One of the biggest noise sources is a process variation. It is a big obstacle to using analog circuits since the variation shifts various parameters of analog circuits from the correct operating points, which causes severe performance degradation or even malfunction. To achieve high energy efficiency with analog neural networks, we propose resistive random access memory (ReRAM) based analog implementation of binarized neural networks (BNNs) with a novel variation compensation technique through activation matching (VCAM). The proposed architecture consists of 1-transistor-1-resistor (1T1R) structured ReRAM synaptic arrays and differential amplifier based neurons, which leads to high-density integration and energy efficiency. To cope with the vulnerability of analog neurons due to process variation, the biases of all neurons are adjusted in the direction that matches average output activation of ideal neurons without variation. The technique effectively restores the classification accuracy degraded by the variation. Experimental results on 32nm technology show that the proposed architecture achieves the classification accuracy of 98.55% on MNIST and 89.63% on CIFAR-10 in the presence of 50% threshold voltage variation and 15% resistance variation at 3-sigma point. It also achieves 970 TOPS/W energy efficiency with MLP on MNIST.1 Introduction 1 1.1 Deep Neural Networks with Weighted Spikes . . . . . . . . . . . . . 2 1.2 VCAM: Variation Compensation through Activation Matching for Analog Binarized Neural Networks . . . . . . . . . . . . . . . . . . . . . 5 2 Background 8 2.1 Spiking neural network . . . . . . . . . . . . . . . . . . . . . . . . . 9 2.2 Spiking neuron model . . . . . . . . . . . . . . . . . . . . . . . . . . 11 2.3 Rate coding in SNNs . . . . . . . . . . . . . . . . . . . . . . . . . . 12 2.4 Binarized neural networks . . . . . . . . . . . . . . . . . . . . . . . 13 2.5 Resistive random access memory . . . . . . . . . . . . . . . . . . . . 18 3 RelatedWork 22 3.1 Training SNNs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23 3.2 SNNs with various spike coding schemes . . . . . . . . . . . . . . . 25 3.3 BNN implementations . . . . . . . . . . . . . . . . . . . . . . . . . 28 4 Deep Neural Networks withWeighted Spikes 33 4.1 SNN with weighted spikes . . . . . . . . . . . . . . . . . . . . . . . 34 4.1.1 Weighted spikes . . . . . . . . . . . . . . . . . . . . . . . . 34 4.1.2 Spiking neuron model for weighted spikes . . . . . . . . . . . 35 4.1.3 Noise spike . . . . . . . . . . . . . . . . . . . . . . . . . . . 37 4.1.4 Approximation of the ReLU activation . . . . . . . . . . . . 39 4.1.5 ANN-to-SNN conversion . . . . . . . . . . . . . . . . . . . . 41 4.2 Optimization techniques . . . . . . . . . . . . . . . . . . . . . . . . 45 4.2.1 Skipping initial input currents in the output layer . . . . . . . 45 4.2.2 The number of phases in a period . . . . . . . . . . . . . . . 47 4.2.3 Accuracy-energy trade-off by early decision . . . . . . . . . . 50 4.2.4 Consideration on hardware implementation . . . . . . . . . . 52 4.3 Experimental setup . . . . . . . . . . . . . . . . . . . . . . . . . . . 53 4.4 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56 4.4.1 Comparison between SNN-RC and SNN-WS . . . . . . . . . 56 4.4.2 Trade-off by early decision . . . . . . . . . . . . . . . . . . . 64 4.4.3 Comparison with other algorithms . . . . . . . . . . . . . . . 67 4.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70 5 VCAM: Variation Compensation through Activation Matching for Analog Binarized Neural Networks 71 5.1 Modification of Binarized Neural Network . . . . . . . . . . . . . . . 72 5.1.1 Binarized Neural Network . . . . . . . . . . . . . . . . . . . 72 5.1.2 Use of 0 and 1 Activations . . . . . . . . . . . . . . . . . . . 72 5.1.3 Removal of Batch Normalization Layer . . . . . . . . . . . . 73 5.2 Hardware Architecture . . . . . . . . . . . . . . . . . . . . . . . . . 75 5.2.1 ReRAM Synaptic Array . . . . . . . . . . . . . . . . . . . . 75 5.2.2 Neuron Circuit . . . . . . . . . . . . . . . . . . . . . . . . . 79 5.2.3 Issues with Neuron Circuit . . . . . . . . . . . . . . . . . . . 82 5.3 Variation Compensation . . . . . . . . . . . . . . . . . . . . . . . . . 85 5.3.1 Variation Modeling . . . . . . . . . . . . . . . . . . . . . . . 85 5.3.2 Impact of VT Variation . . . . . . . . . . . . . . . . . . . . . 87 5.3.3 Variation Compensation Techniques . . . . . . . . . . . . . . 88 5.4 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . . 93 5.4.1 Experimental Setup . . . . . . . . . . . . . . . . . . . . . . . 93 5.4.2 Accuracy of the Modified BNN Algorithm . . . . . . . . . . 94 5.4.3 Variation Compensation . . . . . . . . . . . . . . . . . . . . 95 5.4.4 Performance Comparison . . . . . . . . . . . . . . . . . . . . 99 5.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101 6 Conclusion 102Docto

SNU Open Repository and Archive

Embracing Low-Power Systems with Improvement in Security and Energy-Efficiency

Author: Pandey Pramesh
Publication venue: DigitalCommons@USU
Publication date: 01/08/2021
Field of study

As the economies around the world are aligning more towards usage of computing systems, the global energy demand for computing is increasing rapidly. Additionally, the boom in AI based applications and services has already invited the pervasion of specialized computing hardware architectures for AI (accelerators). A big chunk of research in the industry and academia is being focused on providing energy efficiency to all kinds of power hungry computing architectures. This dissertation adds to these efforts. Aggressive voltage underscaling of chips is one the effective low power paradigms of providing energy efficiency. This dissertation identifies and deals with the reliability and performance problems associated with this paradigm and innovates novel energy efficient approaches. Specifically, the properties of a low power security primitive have been improved and, higher performance has been unlocked in an AI accelerator (Google TPU) in an aggressively voltage underscaled environment. And, novel power saving opportunities have been unlocked by characterizing the usage pattern of a baseline TPU with rigorous mathematical analysis

DigitalCommons@USU

Fast Fitting of the Dynamic Memdiode Model to the Conduction Characteristics of RRAM Devices Using Convolutional Neural Networks

Author: Aguirre Fernando Leonel
Alff Lambert
Gehrunger Jonas
Hochberger Christian
Kaiser Nico
Miranda Enrique
Oster Timo
Petzold Stephan
Piros Eszter
Suñé Jordi
Vogel Tobias
Publication venue: MDPI
Publication date: 01/01/2022
Field of study

In this paper, the use of Artificial Neural Networks (ANNs) in the form of Convolutional Neural Networks (AlexNET) for the fast and energy-efficient fitting of the Dynamic Memdiode Model (DMM) to the conduction characteristics of bipolar-type resistive switching (RS) devices is investigated. Despite an initial computationally intensive training phase the ANNs allow obtaining a mapping between the experimental Current-Voltage (I-V) curve and the corresponding DMM parameters without incurring a costly iterative process as typically considered in error minimization-based optimization algorithms. In order to demonstrate the fitting capabilities of the proposed approach, a complete set of I-Vs obtained from Y₂O₃-based RRAM devices, fabricated with different oxidation conditions and measured with different current compliances, is considered. In this way, in addition to the intrinsic RS variability, extrinsic variation is achieved by means of external factors (oxygen content and damage control during the set process). We show that the reported method provides a significant reduction of the fitting time (one order of magnitude), especially in the case of large data sets. This issue is crucial when the extraction of the model parameters and their statistical characterization are required

Directory of Open Access Journals

PubMed Central

Diposit Digital de Documents de la UAB

tuprints

Ultra-low Power Circuits and Architectures for Neuromorphic Computing Accelerators with Emerging TFETs and ReRAMs

Author: Lin Jie
Publication venue: 'Information Bulletin on Variable Stars (IBVS)'
Publication date: 01/01/2020
Field of study

Neuromorphic computing using post-CMOS technologies is gaining increasing popularity due to its promising potential to resolve the power constraints in Von-Neumann machine and its similarity to the operation of the real human brain. To design the ultra-low voltage and ultra-low power analog-to-digital converters (ADCs) for the neuromorphic computing systems, we explore advantages of tunnel field effect transistor (TFET) analog-to-digital converters (ADCs) on energy efficiency and temperature stability. A fully-differential SAR ADC is designed using 20 nm TFET technology with doubled input swing and controlled comparator input common-mode voltage. To further increase the resolution of the ADC, we design an energy efficient 12-bit noise shaping (NS) successive-approximation register (SAR) ADC. The 2nd-order noise shaping architecture with multiple feed-forward paths is adopted and analyzed to optimize system design parameters. By utilizing tunnel field effect transistors (TFETs), the Delta-Sigma SAR is realized under an ultra-low supply voltage VDD with high energy efficiency. The stochastic neuron is a key for event-based probabilistic neural networks. We propose a stochastic neuron using a metal-oxide resistive random-access memory (ReRAM). The ReRAM\u27s conducting filament with built-in stochasticity is used to mimic the neuron\u27s membrane capacitor, which temporally integrates input spikes. A capacitor-less neuron circuit is designed, laid out, and simulated. The output spiking train of the neuron obeys the Poisson distribution. Based on the ReRAM based neuron, we propose a scalable and reconfigurable architecture that exploits the ReRAM-based neurons for deep Spiking Neural Networks (SNNs). In prior publications, neurons were implemented using dedicated analog or digital circuits that are not area and energy efficient. In our work, for the first time, we address the scaling and power bottlenecks of neuromorphic architecture by utilizing a single one-transistor-one-ReRAM (1T1R) cell to emulate the neuron. We show that the ReRAM-based neurons can be integrated within the synaptic crossbar to build extremely dense Process Element (PE)–spiking neural network in memory array–with high throughput. We provide microarchitecture and circuit designs to enable the deep spiking neural network computing in memory with an insignificant area overhead

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

Reliability and Security of Compute-In-Memory Based Deep Neural Network Accelerators

Author: Huang Shanshi
Publication venue: Georgia Institute of Technology
Publication date: 10/01/2023
Field of study

Compute-In-Memory (CIM) is a promising solution for accelerating DNNs at edge devices, utilizing mixed-signal computations. However, it requires more cross-layer designs from algorithm levels to hardware implementations as it behaves differently from the pure digital system. On one side, the mixed-signal computations of CIM face unignorable variations, which could hamper the software performance. On the other side, there are potential software/hardware security vulnerabilities with CIM accelerators. This research aims to solve the reliability and security issues in CIM design for accelerating Deep Neural Network (DNN) algorithms as they prevent the real-life use of the CIM-based accelerators. Some non-ideal effects in CIM accelerators are explored, which could cause reliability issues, and solved by the software-hardware co-design methods. In addition, different security vulnerabilities for SRAM-based CIM and eNVM-based CIM inference engines are defined, and corresponding countermeasures are proposed.Ph.D

Scholarly Materials And Research @ Georgia Tech

Designing energy-efficient computing systems using equalization and machine learning

Author: Takhirov Zafar
Publication venue
Publication date: 20/02/2018
Field of study

As technology scaling slows down in the nanometer CMOS regime and mobile computing becomes more ubiquitous, designing energy-efficient hardware for mobile systems is becoming increasingly critical and challenging. Although various approaches like near-threshold computing (NTC), aggressive voltage scaling with shadow latches, etc. have been proposed to get the most out of limited battery life, there is still no “silver bullet” to increasing power-performance demands of the mobile systems. Moreover, given that a mobile system could operate in a variety of environmental conditions, like different temperatures, have varying performance requirements, etc., there is a growing need for designing tunable/reconfigurable systems in order to achieve energy-efficient operation. In this work we propose to address the energy- efficiency problem of mobile systems using two different approaches: circuit tunability and distributed adaptive algorithms. Inspired by the communication systems, we developed feedback equalization based digital logic that changes the threshold of its gates based on the input pattern. We showed that feedback equalization in static complementary CMOS logic enabled up to 20% reduction in energy dissipation while maintaining the performance metrics. We also achieved 30% reduction in energy dissipation for pass-transistor digital logic (PTL) with equalization while maintaining performance. In addition, we proposed a mechanism that leverages feedback equalization techniques to achieve near optimal operation of static complementary CMOS logic blocks over the entire voltage range from near threshold supply voltage to nominal supply voltage. Using energy-delay product (EDP) as a metric we analyzed the use of the feedback equalizer as part of various sequential computational blocks. Our analysis shows that for near-threshold voltage operation, when equalization was used, we can improve the operating frequency by up to 30%, while the energy increase was less than 15%, with an overall EDP reduction of ≈10%. We also observe an EDP reduction of close to 5% across entire above-threshold voltage range. On the distributed adaptive algorithm front, we explored energy-efficient hardware implementation of machine learning algorithms. We proposed an adaptive classifier that leverages the wide variability in data complexity to enable energy-efficient data classification operations for mobile systems. Our approach takes advantage of varying classification hardness across data to dynamically allocate resources and improve energy efficiency. On average, our adaptive classifier is ≈100× more energy efficient but has ≈1% higher error rate than a complex radial basis function classifier and is ≈10× less energy efficient but has ≈40% lower error rate than a simple linear classifier across a wide range of classification data sets. We also developed a field of groves (FoG) implementation of random forests (RF) that achieves an accuracy comparable to Convolutional Neural Networks (CNN) and Support Vector Machines (SVM) under tight energy budgets. The FoG architecture takes advantage of the fact that in random forests a small portion of the weak classifiers (decision trees) might be sufficient to achieve high statistical performance. By dividing the random forest into smaller forests (Groves), and conditionally executing the rest of the forest, FoG is able to achieve much higher energy efficiency levels for comparable error rates. We also take advantage of the distributed nature of the FoG to achieve high level of parallelism. Our evaluation shows that at maximum achievable accuracies FoG consumes ≈1.48×, ≈24×, ≈2.5×, and ≈34.7× lower energy per classification compared to conventional RF, SVM-RBF , Multi-Layer Perceptron Network (MLP), and CNN, respectively. FoG is 6.5× less energy efficient than SVM-LR, but achieves 18% higher accuracy on average across all considered datasets

Boston University Institutional Repository (OpenBU)