1,603 research outputs found

    Supervised Machine Learning Techniques for Trojan Detection with Ring Oscillator Network

    Full text link
    With the globalization of the semiconductor manufacturing process, electronic devices are vulnerable to malicious modification of hardware in the supply chain. The ever-increasing threat of hardware Trojan attacks against integrated circuits has spurred a need for accurate and efficient detection methods. A ring oscillator network (RON) detects a Trojan by capturing the difference in power consumption between a Trojan-free circuit and a Trojan-inserted circuit. However, process variation and measurement noise are the major obstacles to detecting hardware Trojans with high accuracy. In this paper, we quantitatively compare four supervised machine learning algorithms and classifier optimization strategies for maximizing accuracy and minimizing the false positive rate (FPR). These supervised learning techniques improve the false positive rate by nearly 40% compared to principal component analysis (PCA) and convex hull classification, while maintaining >90% binary classification accuracy.
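
    The abstract does not name the four algorithms or the feature set, so the sketch below is only illustrative: it assumes four common supervised classifiers (SVM, k-NN, random forest, and naive Bayes) trained on placeholder RON-derived features, and reports the two metrics the paper optimizes, accuracy and false positive rate.

```python
# Illustrative sketch (not the paper's code): compare supervised classifiers on
# ring-oscillator-network (RON) features for Trojan detection, reporting
# accuracy and false positive rate (FPR). Features and labels are placeholders.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score, confusion_matrix

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))        # placeholder RON features per chip
y = rng.integers(0, 2, size=200)     # placeholder labels: 0 = Trojan-free, 1 = Trojan-inserted

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

classifiers = {
    "svm": make_pipeline(StandardScaler(), SVC()),
    "knn": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "random_forest": RandomForestClassifier(random_state=0),
    "naive_bayes": GaussianNB(),
}

for name, clf in classifiers.items():
    clf.fit(X_tr, y_tr)
    pred = clf.predict(X_te)
    tn, fp, fn, tp = confusion_matrix(y_te, pred).ravel()
    fpr = fp / (fp + tn)             # false positive rate
    print(f"{name}: accuracy={accuracy_score(y_te, pred):.3f}, FPR={fpr:.3f}")
```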

    ULEEN: A Novel Architecture for Ultra Low-Energy Edge Neural Networks

    Full text link
    The deployment of AI models on low-power, real-time edge devices requires accelerators for which energy, latency, and area are all first-order concerns. There are many approaches to enabling deep neural networks (DNNs) in this domain, including pruning, quantization, compression, and binary neural networks (BNNs), but with the emergence of the "extreme edge", there is now a demand for even more efficient models. In order to meet the constraints of ultra-low-energy devices, we propose ULEEN, a model architecture based on weightless neural networks. Weightless neural networks (WNNs) are a class of neural models that use table lookups, not arithmetic, to perform computation. The elimination of energy-intensive arithmetic operations makes WNNs theoretically well suited for edge inference; however, they have historically suffered from poor accuracy and excessive memory usage. ULEEN incorporates algorithmic improvements and a novel training strategy inspired by BNNs to make significant strides in improving accuracy and reducing model size. We compare FPGA and ASIC implementations of an inference accelerator for ULEEN against edge-optimized DNN and BNN devices. On a Xilinx Zynq Z-7045 FPGA, we demonstrate classification on the MNIST dataset at 14.3 million inferences per second (13 million inferences/Joule) with 0.21 μs latency and 96.2% accuracy, while Xilinx FINN achieves 12.3 million inferences per second (1.69 million inferences/Joule) with 0.31 μs latency and 95.83% accuracy. In a 45nm ASIC, we achieve 5.1 million inferences/Joule and 38.5 million inferences/second at 98.46% accuracy, while a quantized Bit Fusion model achieves 9230 inferences/Joule and 19,100 inferences/second at 99.35% accuracy. In our search for ever more efficient edge devices, ULEEN shows that WNNs are deserving of consideration. Comment: 14 pages, 14 figures. Portions of this article draw heavily from arXiv:2203.01479, most notably sections 5E and 5F.
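
    For readers unfamiliar with weightless neural networks, the following is a minimal sketch of the table-lookup computation that WNNs build on. It implements a basic WiSARD-style discriminator with an assumed random bit mapping and toy inputs; it does not reproduce ULEEN's architecture, training strategy, or accelerator.

```python
# Minimal sketch of a WiSARD-style weightless neural network (WNN): inference
# is table lookups over tuples of input bits, with no arithmetic in the core
# path. Illustrative only; not ULEEN's specific design.
import numpy as np

class WisardDiscriminator:
    def __init__(self, n_bits, tuple_size, seed=0):
        rng = np.random.default_rng(seed)
        self.mapping = rng.permutation(n_bits)       # random bit-to-tuple mapping
        self.tuple_size = tuple_size
        self.rams = [set() for _ in range(n_bits // tuple_size)]  # lookup tables

    def _addresses(self, bits):
        mapped = bits[self.mapping]
        for i in range(len(self.rams)):
            chunk = mapped[i * self.tuple_size:(i + 1) * self.tuple_size]
            yield i, int("".join(map(str, chunk)), 2)  # bit tuple -> RAM address

    def train(self, bits):
        for i, addr in self._addresses(bits):
            self.rams[i].add(addr)                    # mark this address as seen

    def score(self, bits):
        return sum(addr in self.rams[i] for i, addr in self._addresses(bits))

# One discriminator per class; prediction picks the highest lookup-hit count.
discriminators = {c: WisardDiscriminator(n_bits=64, tuple_size=8) for c in (0, 1)}
x0 = np.array([1, 0] * 32)
x1 = np.array([0, 1] * 32)
discriminators[0].train(x0)
discriminators[1].train(x1)
print(max(discriminators, key=lambda c: discriminators[c].score(x0)))  # -> 0
```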

    LEAPER: Fast and Accurate FPGA-based System Performance Prediction via Transfer Learning

    Full text link
    Machine learning has recently gained traction as a way to overcome the slow accelerator generation and implementation process on an FPGA. It can be used to build performance and resource usage models that enable fast early-stage design space exploration. However, ML-based prediction models face three limitations. First, training requires large amounts of data (features extracted from design synthesis and implementation tools), which is cost-inefficient because of the time-consuming accelerator design and implementation process. Second, a model trained for a specific environment cannot predict performance or resource usage for a new, unknown environment. In a cloud system, renting a platform for data collection to build an ML model can significantly increase the total cost of ownership (TCO) of a system. Third, ML-based models trained using a limited number of samples are prone to overfitting. To overcome these limitations, we propose LEAPER, a transfer learning-based approach for prediction of performance and resource usage in FPGA-based systems. The key idea of LEAPER is to transfer an ML-based performance and resource usage model trained for a low-end edge environment to a new, high-end cloud environment to provide fast and accurate predictions for accelerator implementation. Experimental results show that LEAPER (1) provides, on average across six workloads and five FPGAs, 85% accuracy when we use our transferred model for prediction in a cloud environment with 5-shot learning and (2) reduces design-space exploration time for accelerator implementation on an FPGA by 10x, from days to only a few hours.
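
    The abstract does not describe LEAPER's transfer mechanism in detail, so the sketch below only illustrates the general idea of few-shot model transfer: a base regressor is trained on abundant low-end (edge) samples and then adapted to a high-end (cloud) target with five labelled samples via a simple linear correction. The data, features, and correction scheme are all assumptions, not LEAPER's actual method.

```python
# Illustrative sketch of few-shot ("5-shot") transfer of a performance model
# from a source (edge FPGA) domain to a target (cloud FPGA) domain.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
w = rng.normal(size=10)                               # shared latent relationship

# Source domain: plentiful labelled samples (design features -> performance).
X_src = rng.normal(size=(500, 10))
y_src = X_src @ w + rng.normal(scale=0.1, size=500)

# Target domain: related but shifted behaviour, only 5 labelled samples.
X_tgt = rng.normal(size=(200, 10))
y_tgt = 1.7 * (X_tgt @ w) + 3.0 + rng.normal(scale=0.1, size=200)
few_idx = rng.choice(len(X_tgt), size=5, replace=False)

base = GradientBoostingRegressor(random_state=0).fit(X_src, y_src)

# 5-shot adaptation: fit a linear correction on the base model's predictions
# for the five labelled target samples, then apply it to the whole target set.
correction = LinearRegression().fit(
    base.predict(X_tgt[few_idx]).reshape(-1, 1), y_tgt[few_idx]
)
y_pred = correction.predict(base.predict(X_tgt).reshape(-1, 1))
print("mean absolute error on target domain:", np.abs(y_pred - y_tgt).mean())
```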

    Real-Time Localization of Epileptogenic Foci EEG Signals: An FPGA-Based Implementation

    Get PDF
    The epileptogenic focus is a brain area that may be surgically removed to control epileptic seizures. Locating it is an essential and crucial step prior to the surgical treatment. However, given the difficulty of determining the localization of this brain region responsible for the initial seizure discharge, many works have proposed machine learning methods for the automatic classification of focal and non-focal electroencephalographic (EEG) signals. These works use automatic classification as an analysis tool for helping neurosurgeons to identify focal areas off-line, out of surgery, during the processing of the huge amount of information collected over several days of patient monitoring. In turn, this paper proposes an automatic classification procedure capable of assisting neurosurgeons online, during resective epilepsy surgery, to refine the localization of the epileptogenic area to be resected, if they have doubts. This goal requires a real-time implementation with as low a computational cost as possible. For that reason, this work proposes both a feature set and a classifier model that minimize the computational load while preserving the classification accuracy at 95.5%, a level similar to previous works. In addition, the classification procedure has been implemented on an FPGA device to determine its resource needs and throughput. Thus, the whole classification process, from accepting raw signals to delivering the classification results, can be embedded in a cost-effective Xilinx Spartan-6 FPGA device. This real-time implementation begins providing results after a 5 s latency and can then deliver floating-point classification results at a 3.5 Hz rate, using overlapped time windows.
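
    As a rough illustration of the kind of low-cost, window-based pipeline the paper describes, the sketch below computes cheap time-domain features over overlapping 5 s windows and classifies them with a linear model. The sampling rate, feature set, classifier, and data are all assumptions, not the paper's actual choices.

```python
# Illustrative sketch of focal vs. non-focal EEG classification on overlapping
# 5 s windows with cheap time-domain features and a linear classifier.
import numpy as np
from sklearn.svm import LinearSVC

FS = 512            # assumed sampling rate (Hz)
WIN = 5 * FS        # 5 s analysis window
HOP = FS // 4       # heavily overlapped windows for a streaming output rate

def window_features(x):
    # Cheap features suited to low-cost hardware:
    # mean absolute amplitude, variance, and line length.
    return np.array([np.mean(np.abs(x)), np.var(x), np.sum(np.abs(np.diff(x)))])

rng = np.random.default_rng(0)
# Placeholder training data: 200 labelled 5 s windows (0 = non-focal, 1 = focal).
segments = rng.normal(scale=rng.uniform(0.5, 1.5, size=(200, 1)), size=(200, WIN))
labels = (segments.std(axis=1) > 1.0).astype(int)   # toy labelling rule

X = np.array([window_features(s) for s in segments])
clf = LinearSVC(dual=False).fit(X, labels)

# Streaming inference: slide the 5 s window over a continuous recording.
recording = rng.normal(size=60 * FS)                 # 60 s placeholder signal
focal_windows = sum(
    clf.predict(window_features(recording[s:s + WIN]).reshape(1, -1))[0]
    for s in range(0, len(recording) - WIN + 1, HOP)
)
print("windows flagged as focal:", focal_windows)
```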