37 research outputs found
PowerGAN: A Machine Learning Approach for Power Side-Channel Attack on Compute-in-Memory Accelerators
Analog compute-in-memory (CIM) accelerators are becoming increasingly popular
for deep neural network (DNN) inference due to their energy efficiency and
in-situ vector-matrix multiplication (VMM) capabilities. However, as the use of
DNNs expands, protecting user input privacy has become increasingly important.
In this paper, we identify a security vulnerability wherein an adversary can
reconstruct the user's private input data from a power side-channel attack,
under proper data acquisition and pre-processing, even without knowledge of the
DNN model. We further demonstrate a machine learning-based attack approach
using a generative adversarial network (GAN) to enhance the reconstruction. Our
results show that the attack methodology is effective in reconstructing user
inputs from analog CIM accelerator power leakage, even at high noise levels and with countermeasures applied. Specifically, we demonstrate the
efficacy of our approach on the U-Net for brain tumor detection in magnetic
resonance imaging (MRI) medical images, with a noise-level of 20% standard
deviation of the maximum power signal value. Our study highlights a significant
security vulnerability in analog CIM accelerators and proposes an effective
attack methodology using a GAN to breach user privacy.
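The noise level quoted above (20% standard deviation of the maximum power signal value) can be sketched as a simple additive-Gaussian model. The trace, function, and parameter names below are illustrative assumptions, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def add_power_noise(trace, noise_frac=0.20, rng=rng):
    """Add zero-mean Gaussian noise whose standard deviation is a
    fraction of the trace's maximum value (the paper's 20% setting)."""
    sigma = noise_frac * np.max(np.abs(trace))
    return trace + rng.normal(0.0, sigma, size=trace.shape)

# Toy power trace: a clean sinusoid standing in for CIM power leakage.
t = np.linspace(0.0, 1.0, 1000)
clean = 1.0 + 0.5 * np.sin(2 * np.pi * 5 * t)
noisy = add_power_noise(clean)
```

Under this model the attacker's input to the GAN is `noisy`, from which the private image would be reconstructed.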
FPGA-Based PUF Designs: A Comprehensive Review and Comparative Analysis
Field-programmable gate arrays (FPGAs) have firmly established themselves as dynamic platforms for implementing physical unclonable functions (PUFs). Their intrinsic reconfigurability and implications for hardware security make them an invaluable asset in this realm. This study examines FPGA-based PUF designs in depth and offers a comprehensive overview coupled with a discerning comparative analysis. PUFs are the bedrock of device authentication, key generation, and the fortification of secure cryptographic protocols, and FPGA technology expands the horizons of PUF integration across diverse hardware systems. We set out to understand the fundamental ideas behind PUFs and their crucial importance to current security paradigms. Different FPGA-based PUF solutions, including static, dynamic, and hybrid designs, are closely examined; each design paradigm is analyzed to reveal its special qualities, functional nuances, and weaknesses. We assess a variety of performance metrics, including uniqueness, reliability, and resilience against hostile threats, and we compare FPGA-based PUF systems against one another to expose their respective advantages and disadvantages. This study provides system designers and security professionals with the information they need to choose the best PUF design for their particular applications, offering a comprehensive view of the functionality, security capabilities, and prospective applications of FPGA-based PUF systems. The knowledge gained from this research advances the field of hardware security, enabling practitioners, researchers, and designers to make informed decisions when selecting and implementing FPGA-based PUF solutions.
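The uniqueness and reliability metrics mentioned above are conventionally computed from fractional Hamming distances between PUF responses. A minimal sketch on simulated responses follows; the bit widths, instance count, and flipped-bit noise are illustrative assumptions, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)

def fractional_hd(a, b):
    """Fractional Hamming distance between two equal-length bit arrays."""
    return np.mean(a != b)

def uniqueness(responses):
    """Mean pairwise inter-chip fractional HD; the ideal value is 0.5."""
    n = len(responses)
    dists = [fractional_hd(responses[i], responses[j])
             for i in range(n) for j in range(i + 1, n)]
    return float(np.mean(dists))

def reliability(reference, noisy_reps):
    """1 - mean intra-chip HD over repeated readouts; the ideal is 1.0."""
    return 1.0 - float(np.mean([fractional_hd(reference, r) for r in noisy_reps]))

# Simulated 128-bit responses from 10 hypothetical PUF instances.
chips = [rng.integers(0, 2, 128) for _ in range(10)]
u = uniqueness(chips)

# One noisy re-readout of chip 0 with two flipped bits.
ref = chips[0]
noisy = ref.copy()
noisy[:2] ^= 1
rel = reliability(ref, [noisy])
```

For truly random responses `u` lands near the ideal 0.5, while `rel` equals 1 minus the flipped-bit fraction.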
Secure Instruction and Data-Level Information Flow Tracking Model for RISC-V
Rising device use and third-party IP integration in semiconductors raise
security concerns. Unauthorized access, fault injection, and privacy invasion
are potential threats from untrusted actors. Different security techniques have
been proposed to provide resilience to secure devices from potential
vulnerabilities; however, no one technique can be applied as an overarching
solution. We propose an integrated Information Flow Tracking (IFT) technique to
enable runtime security to protect system integrity by tracking the flow of
data from untrusted communication channels. Existing hardware-based IFT schemes
are either fine-grained, which are resource-intensive, or coarse-grained, which have minimal precision logic, providing either control-flow or data-flow integrity. No current security model provides multi-granularity due to the
difficulty in balancing both the flexibility and hardware overheads at the same
time. This study proposes a multi-level granularity IFT model that integrates a
hardware-based IFT technique with a gate-level-based IFT (GLIFT) technique,
along with flexibility, for better precision and assessments. Translation from
the instruction level to the data level is based on module instantiation with
security-critical data for accurate information flow behaviors without any
false conservative flows. A simulation-based IFT model is demonstrated, which
translates the architecture-specific extensions into a compiler-specific
simulation model with toolchain extensions for Reduced Instruction Set
Architecture (RISC-V) to verify the security extensions. This approach provides
better precision logic by enhancing the tagged mechanism with 1-bit tags and
implementing an optimized shadow logic that eliminates the area overhead by
tracking the data for only security-critical modules.
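As a minimal sketch of the 1-bit tagged mechanism described above, the following Python class threads a taint bit through arithmetic, with the result tagged whenever either operand is. The class and operator set are illustrative, not the paper's RISC-V shadow-logic implementation:

```python
from dataclasses import dataclass

@dataclass
class Tagged:
    """A value carrying a 1-bit security tag (True = untrusted/tainted)."""
    value: int
    tag: bool = False

    def __add__(self, other):
        # Shadow logic: the sum is tainted if either operand is.
        return Tagged(self.value + other.value, self.tag or other.tag)

    def __mul__(self, other):
        # Same propagation rule for multiplication.
        return Tagged(self.value * other.value, self.tag or other.tag)

untrusted = Tagged(7, tag=True)   # data from an untrusted channel
trusted = Tagged(3)
result = untrusted * trusted + Tagged(1)
print(result.value, result.tag)   # 22 True — the taint propagated
```

A hardware IFT scheme applies the same OR-rule in shadow logic next to each tracked module rather than in software.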
RoHNAS: A Neural Architecture Search Framework with Conjoint Optimization for Adversarial Robustness and Hardware Efficiency of Convolutional and Capsule Networks
Neural Architecture Search (NAS) algorithms aim at finding efficient Deep Neural Network (DNN) architectures for a given application under given system constraints. DNNs are computationally complex as well as vulnerable to adversarial attacks. In order to address multiple design objectives, we propose RoHNAS, a novel NAS framework that jointly optimizes for adversarial robustness and hardware efficiency of DNNs executed on specialized hardware accelerators. Besides traditional convolutional DNNs, RoHNAS additionally accounts for complex types of DNNs such as Capsule Networks. To reduce exploration time, RoHNAS analyzes and selects appropriate values of adversarial perturbation for each dataset to employ in the NAS flow. Extensive evaluations on multi-Graphics-Processing-Unit (GPU) High-Performance Computing (HPC) nodes provide a set of Pareto-optimal solutions that expose the tradeoffs between the above design objectives. For example, a Pareto-optimal DNN for the CIFAR-10 dataset exhibits 86.07% accuracy, with an energy of 38.63 mJ, a memory footprint of 11.85 MiB, and a latency of 4.47 ms.
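Pareto-optimality over the listed objectives can be illustrated with a small dominance filter. The candidate values other than the quoted RoHNAS point are made up, and accuracy is converted to error rate so that every objective is minimized:

```python
def pareto_front(points):
    """Return indices of points not dominated by any other point.
    Each point is a tuple of objectives, all to be minimized."""
    front = []
    for i, p in enumerate(points):
        dominated = any(
            all(q[k] <= p[k] for k in range(len(p))) and q != p
            for j, q in enumerate(points) if j != i
        )
        if not dominated:
            front.append(i)
    return front

# (error rate, energy mJ, latency ms) for hypothetical candidate DNNs;
# the quoted RoHNAS solution appears as (1 - 0.8607, 38.63, 4.47).
cands = [(0.1393, 38.63, 4.47), (0.20, 30.0, 5.0), (0.25, 40.0, 6.0)]
print(pareto_front(cands))  # [0, 1] — the third point is dominated
```

The first two candidates trade error against energy, so both survive; the third is worse than the first in every objective and is filtered out.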
AxP: A HW-SW Co-Design Pipeline for Energy-Efficient Approximated ConvNets via Associative Matching
The reduction in energy consumption is key for deep neural networks (DNNs) to ensure usability and reliability, whether they are deployed on low-power end-nodes with limited resources or on high-performance platforms that serve large pools of users. Leveraging the over-parametrization shown by many DNN models, convolutional neural networks (ConvNets) in particular, energy efficiency can be improved substantially while preserving model accuracy. The solution proposed in this work exploits the intrinsic redundancy of ConvNets to maximize the reuse of partial arithmetic results during the inference stages. Specifically, the weight-set of a given ConvNet is discretized through a clustering procedure such that the largest possible number of inner multiplications fall into predefined bins; this allows an off-line computation of the most frequent results, which in turn can be stored locally and retrieved when needed during the forward pass. Such a reuse mechanism leads to remarkable energy savings with the aid of a custom processing element (PE) that integrates an associative memory with a standard floating-point unit (FPU). Moreover, the adoption of an approximate associative rule based on a partial bit-match increases the hit rate over the pre-computed results, maximizing the energy reduction even further. Results collected on a set of ConvNets trained for computer vision and speech processing tasks reveal that the proposed associative-based HW-SW co-design achieves up to 77% energy savings with less than 1% accuracy loss.
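A minimal sketch of the weight-discretization and lookup idea follows, assuming a naive k-means and a uniform activation grid. Both are simplifications: the paper's custom PE with an associative memory and partial bit-match is not modeled here:

```python
import numpy as np

rng = np.random.default_rng(2)

def cluster_weights(w, k=8, iters=20):
    """Naive k-means over the weight set: discretizing weights to k
    centroids makes many multiplications fall into predefined bins."""
    centroids = np.linspace(w.min(), w.max(), k)
    for _ in range(iters):
        assign = np.abs(w[:, None] - centroids[None, :]).argmin(axis=1)
        for c in range(k):
            if np.any(assign == c):
                centroids[c] = w[assign == c].mean()
    return centroids, assign

w = rng.normal(size=256)                 # stand-in ConvNet weight set
centroids, assign = cluster_weights(w)

# Off-line step: products of each centroid with a quantized activation
# grid play the role of the associative memory (a lookup table).
acts = np.linspace(-1.0, 1.0, 16)
lut = np.outer(centroids, acts)

# At inference, a multiply becomes a table lookup:
a = 0.5                                  # incoming activation
approx = lut[assign[0], np.abs(acts - a).argmin()]   # ~ w[0] * a
```

The coarser the clustering, the higher the reuse (and energy savings) at the cost of approximation error, which mirrors the accuracy/energy tradeoff reported above.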
Model-based symbolic design space exploration at the electronic system level: a systematic approach
In this thesis, a novel, fully systematic approach is proposed that addresses automated design space exploration at the electronic system level. The problem is formulated as a multi-objective optimization problem and is encoded symbolically using Answer Set Programming (ASP). Several specialized solvers are tightly coupled as background theories with the foreground ASP solver under the ASP modulo Theories (ASPmT) paradigm. By utilizing the ASPmT paradigm, the search is executed entirely systematically and the disparate synthesis steps can be coupled to explore the search space effectively.
Design Space Exploration and Resource Management of Multi/Many-Core Systems
The increasing demand for processing a higher number of applications and related data on computing platforms has resulted in reliance on multi-/many-core chips, as they facilitate parallel processing. However, these platforms need to be energy-efficient and reliable, and to perform secure computations in the interest of the whole community. This book provides perspectives on the aforementioned aspects from leading researchers in terms of state-of-the-art contributions and upcoming trends.
A Neural Network and Principal Component Analysis Approach to Develop a Real-Time Driving Cycle in an Urban Environment: The Case of Addis Ababa, Ethiopia
This study aimed to develop the Addis Ababa Driving Cycle (DC) using real-time data from passenger vehicles in Addis Ababa based on a neural network (NN) and principal component analysis (PCA) approach. Addis Ababa has no local DC for automobile emissions tests, and standard DCs do not reflect the current scenario. During the DC's development, the researchers determined the DC duration based on their experience and the literature. A k-means clustering method was also applied to cluster the dimensionally reduced data without first identifying the best clustering method. First, a shape-preserving cubic interpolation technique was applied to remove outliers, followed by the Bayes wavelet signal denoising technique to smooth the data. Rules were then set for the extraction of trips and trip indicators before PCA was applied, and machine learning classification was applied to identify the best clustering method. Finally, after training the NN using Bayesian regularization with backpropagation, the velocity for each route section was predicted, with an overall R-value of 0.99. Compared with the target data, the DCs developed by the NN and micro-trip methods have relative differences of 0.056 and 0.111, respectively, and the NN method resolves the issue of the DC duration decision in the micro-trip method.
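The PCA-then-clustering step can be sketched as follows; the trip-indicator features, component count, and cluster count are illustrative assumptions, not the study's actual parameters:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical trip indicators: mean speed, max speed, idle fraction,
# mean acceleration, for 50 extracted trips.
trips = rng.normal(size=(50, 4))

# PCA via SVD: project standardized features onto the top-2 components.
X = (trips - trips.mean(axis=0)) / trips.std(axis=0)
_, _, Vt = np.linalg.svd(X, full_matrices=False)
scores = X @ Vt[:2].T

def kmeans(X, k=3, iters=30, rng=rng):
    """Plain k-means on the PCA scores, grouping similar trips."""
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = np.linalg.norm(X[:, None] - centers[None], axis=2).argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(axis=0)
    return labels, centers

labels, centers = kmeans(scores)
```

Representative segments would then be drawn from each cluster to stitch together the final driving cycle.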