Supervised Machine Learning Techniques for Trojan Detection with Ring Oscillator Network
With the globalization of the semiconductor manufacturing process, electronic
devices are vulnerable to malicious modification of hardware in the supply
chain. The ever-increasing threat of hardware Trojan attacks against integrated
circuits has spurred a need for accurate and efficient detection methods. A ring
oscillator network (RON) detects a Trojan by capturing the difference in power
consumption between a Trojan-free circuit and a Trojan-inserted one. However,
process variation and measurement noise are the major obstacles to detecting
hardware Trojans with high accuracy. In this paper, we quantitatively compare
four supervised machine learning algorithms and classifier optimization
strategies for maximizing accuracy and minimizing the false positive rate
(FPR). These supervised learning techniques reduce the false positive rate by
nearly 40% compared to principal component analysis (PCA) and convex hull
classification, while maintaining above 90% binary classification accuracy.
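The abstract describes this pipeline only at a high level. Below is a minimal sketch of that kind of workflow, assuming synthetic ring-oscillator frequency features and an SVM classifier (one plausible supervised choice, not necessarily one of the four the paper evaluates), scored on accuracy and FPR:

```python
# Hypothetical sketch: binary Trojan detection from ring-oscillator
# frequency features. The synthetic data stands in for real RON
# measurements; the paper's actual features and models may differ.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, confusion_matrix

rng = np.random.default_rng(0)

# 200 chips x 8 ring oscillators: a Trojan draws extra power, slightly
# lowering RO frequencies; process variation and noise blur the gap.
n_chips, n_ros = 200, 8
labels = rng.integers(0, 2, n_chips)               # 0 = Trojan-free, 1 = Trojan
base = rng.normal(100.0, 2.0, (n_chips, n_ros))    # MHz, process variation
shift = labels[:, None] * rng.normal(1.5, 0.3, (n_chips, n_ros))
features = base - shift + rng.normal(0.0, 0.5, (n_chips, n_ros))  # meas. noise

X_tr, X_te, y_tr, y_te = train_test_split(
    features, labels, test_size=0.3, random_state=0)
clf = SVC(kernel="rbf", C=1.0).fit(X_tr, y_tr)

pred = clf.predict(X_te)
tn, fp, fn, tp = confusion_matrix(y_te, pred).ravel()
print(f"accuracy = {accuracy_score(y_te, pred):.3f}, "
      f"FPR = {fp / (fp + tn):.3f}")
```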
ULEEN: A Novel Architecture for Ultra Low-Energy Edge Neural Networks
The deployment of AI models on low-power, real-time edge devices requires
accelerators for which energy, latency, and area are all first-order concerns.
There are many approaches to enabling deep neural networks (DNNs) in this
domain, including pruning, quantization, compression, and binary neural
networks (BNNs), but with the emergence of the "extreme edge", there is now a
demand for even more efficient models. In order to meet the constraints of
ultra-low-energy devices, we propose ULEEN, a model architecture based on
weightless neural networks. Weightless neural networks (WNNs) are a class of
neural model which use table lookups, not arithmetic, to perform computation.
The elimination of energy-intensive arithmetic operations makes WNNs
theoretically well suited for edge inference; however, they have historically
suffered from poor accuracy and excessive memory usage. ULEEN incorporates
algorithmic improvements and a novel training strategy inspired by BNNs to make
significant strides in improving accuracy and reducing model size. We compare
FPGA and ASIC implementations of an inference accelerator for ULEEN against
edge-optimized DNN and BNN devices. On a Xilinx Zynq Z-7045 FPGA, we
demonstrate classification on the MNIST dataset at 14.3 million inferences per
second (13 million inferences/Joule) with 0.21 μs latency and 96.2%
accuracy, while Xilinx FINN achieves 12.3 million inferences per second (1.69
million inferences/Joule) with 0.31 μs latency and 95.83% accuracy. In a
45nm ASIC, we achieve 5.1 million inferences/Joule and 38.5 million
inferences/second at 98.46% accuracy, while a quantized Bit Fusion model
achieves 9230 inferences/Joule and 19,100 inferences/second at 99.35% accuracy.
In our search for ever more efficient edge devices, ULEEN shows that WNNs are
deserving of consideration.
Comment: 14 pages, 14 figures. Portions of this article draw heavily from
arXiv:2203.01479, most notably Sections 5E and 5F.
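For background on "table lookups, not arithmetic": a classic WiSARD-style weightless model replaces multiply-accumulate operations with lookups in per-class RAM nodes addressed by tuples of input bits. The sketch below illustrates only that generic lookup mechanism; it is not ULEEN's architecture, which adds BNN-inspired training and other improvements on top of this idea.

```python
# Minimal WiSARD-style weightless classifier: each class has a
# discriminator made of lookup tables ("RAM nodes") indexed by random
# tuples of input bits. No arithmetic is used at inference beyond
# counting RAM hits.
import random

class Discriminator:
    def __init__(self, n_bits, tuple_size, mapping):
        self.mapping = mapping                  # shared random bit order
        self.tuple_size = tuple_size
        self.rams = [set() for _ in range(n_bits // tuple_size)]

    def _addresses(self, bits):
        shuffled = [bits[i] for i in self.mapping]
        for i in range(len(self.rams)):
            chunk = shuffled[i * self.tuple_size:(i + 1) * self.tuple_size]
            yield i, tuple(chunk)

    def train(self, bits):
        for i, addr in self._addresses(bits):
            self.rams[i].add(addr)              # write a 1 at this address

    def score(self, bits):
        return sum(addr in self.rams[i] for i, addr in self._addresses(bits))

class WiSARD:
    def __init__(self, n_bits, tuple_size, n_classes, seed=0):
        rng = random.Random(seed)
        mapping = list(range(n_bits))
        rng.shuffle(mapping)
        self.discs = [Discriminator(n_bits, tuple_size, mapping)
                      for _ in range(n_classes)]

    def train(self, bits, label):
        self.discs[label].train(bits)

    def predict(self, bits):
        scores = [d.score(bits) for d in self.discs]
        return scores.index(max(scores))        # class with most RAM hits

# Toy usage: classify 8-bit patterns as "mostly zeros" vs "mostly ones".
model = WiSARD(n_bits=8, tuple_size=2, n_classes=2)
model.train([0, 0, 0, 0, 0, 1, 0, 0], 0)
model.train([1, 1, 1, 0, 1, 1, 1, 1], 1)
print(model.predict([0, 0, 1, 0, 0, 0, 0, 0]))  # expected: 0
```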
LEAPER: Fast and Accurate FPGA-based System Performance Prediction via Transfer Learning
Machine learning has recently gained traction as a way to overcome the slow
accelerator generation and implementation process on an FPGA. It can be used to
build performance and resource usage models that enable fast early-stage
design-space exploration. However, such ML-based approaches have three main
limitations. First, training requires large amounts of data (features
extracted from design synthesis and implementation tools), which is
cost-inefficient because of the time-consuming accelerator design and
implementation process. Second, a model trained for a specific environment
cannot predict performance or resource usage for a new, unknown environment. In
a cloud system, renting a platform for data collection to build an ML model can
significantly increase the total cost of ownership (TCO) of a system. Third,
ML-based models trained using a limited number of samples are prone to
overfitting. To overcome these limitations, we propose LEAPER, a transfer
learning-based approach for prediction of performance and resource usage in
FPGA-based systems. The key idea of LEAPER is to transfer an ML-based
performance and resource usage model trained for a low-end edge environment to
a new, high-end cloud environment to provide fast and accurate predictions for
accelerator implementation. Experimental results show that LEAPER (1) provides,
on average across six workloads and five FPGAs, 85% accuracy when we use our
transferred model for prediction in a cloud environment with 5-shot learning
and (2) reduces design-space exploration time for accelerator implementation on
an FPGA by 10x, from days to only a few hours.
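The abstract leaves the transfer mechanism at a high level. As a hedged illustration of the few-shot idea, the sketch below trains a base regressor on plentiful "edge" samples and adapts it to a "cloud" target with only 5 labeled samples via a linear output correction; the synthetic data, random-forest base model, and correction scheme are all illustrative assumptions, not LEAPER's actual design.

```python
# Illustrative few-shot transfer for performance prediction: a model
# trained on many low-end ("edge") samples is adapted to a new high-end
# ("cloud") platform with only 5 labeled samples, via a linear
# correction of the base model's output.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
w = rng.uniform(1, 5, 6)                  # hidden design-to-performance map

# Synthetic design-space features (unroll factors, buffer sizes, ...).
X_edge = rng.uniform(0, 1, (500, 6))
y_edge = X_edge @ w + rng.normal(0, 0.1, 500)

# The cloud platform scales/offsets performance relative to the edge one.
X_cloud = rng.uniform(0, 1, (40, 6))
y_cloud = 3.0 * (X_cloud @ w) + 2.0 + rng.normal(0, 0.1, 40)

base = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_edge, y_edge)

# 5-shot adaptation: fit scale+offset from base predictions to cloud labels.
few = rng.choice(len(X_cloud), 5, replace=False)
adapt = LinearRegression().fit(
    base.predict(X_cloud[few]).reshape(-1, 1), y_cloud[few])

test = np.setdiff1d(np.arange(len(X_cloud)), few)
pred = adapt.predict(base.predict(X_cloud[test]).reshape(-1, 1))
err = np.mean(np.abs(pred - y_cloud[test]) / y_cloud[test])
print(f"mean relative error after 5-shot transfer: {err:.2%}")
```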
Real-Time Localization of Epileptogenic Foci EEG Signals: An FPGA-Based Implementation
The epileptogenic focus is a brain area that may be surgically removed to control epileptic seizures. Locating it is an essential and crucial step prior to the surgical treatment. However, given the difficulty of determining the localization of this brain region responsible for the initial seizure discharge, many works have proposed machine learning methods for the automatic classification of focal and non-focal electroencephalographic (EEG) signals. These works use automatic classification as an analysis tool to help neurosurgeons identify focal areas offline, out of surgery, during the processing of the huge amount of information collected over several days of patient monitoring. In turn, this paper proposes an automatic classification procedure capable of assisting neurosurgeons online, during resective epilepsy surgery, to refine the localization of the epileptogenic area to be resected, if they have doubts. This goal requires a real-time implementation with as low a computational cost as possible. For that reason, this work proposes both a feature set and a classifier model that minimize the computational load while preserving the classification accuracy at 95.5%, a level similar to previous works. In addition, the classification procedure has been implemented on an FPGA device to determine its resource needs and throughput. It can thus be concluded that a cost-effective Xilinx Spartan-6 FPGA can embed the whole classification process, from accepting raw signals to the delivery of the classification results. This real-time implementation begins providing results after a 5 s latency and can then deliver floating-point classification results at a 3.5 Hz rate, using overlapped time windows.
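As a hedged sketch of the streaming pattern this abstract describes (a first result once the initial 5 s window fills, then overlapped windows at roughly 3.5 Hz), the following pipeline classifies synthetic EEG with two cheap stand-in time-domain features; the paper's actual feature set, classifier, and sampling rate are not given here, so all of those choices are assumptions.

```python
# Illustrative streaming classification of overlapped 5 s EEG windows.
# Variance and line length are stand-ins for the paper's low-cost
# feature set; the sampling rate is assumed.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

FS = 512                        # sampling rate in Hz (assumed)
WIN = 5 * FS                    # 5 s analysis window
HOP = round(FS / 3.5)           # hop size for ~3.5 decisions per second

def features(window):
    # Two cheap time-domain features, chosen for low computational cost.
    return np.array([np.var(window), np.sum(np.abs(np.diff(window)))])

rng = np.random.default_rng(2)
# Synthetic training data: "focal" windows are given higher variance.
labels = rng.integers(0, 2, 100)            # 0 = non-focal, 1 = focal
X = np.array([features(rng.normal(0, 1.0 + lbl, WIN)) for lbl in labels])
clf = make_pipeline(StandardScaler(), LogisticRegression()).fit(X, labels)

# Stream a 20 s synthetic recording: the first decision arrives after
# the 5 s window fills, then one per hop (~3.5 Hz) over overlapped windows.
signal = rng.normal(0, 1.5, 20 * FS)
for start in range(0, len(signal) - WIN + 1, HOP):
    win = signal[start:start + WIN]
    decision = clf.predict(features(win).reshape(1, -1))[0]
    print(f"t = {(start + WIN) / FS:5.2f} s -> class {decision}")
```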