875 research outputs found

    MgX: Near-Zero Overhead Memory Protection with an Application to Secure DNN Acceleration

    Full text link
    In this paper, we propose MgX, a near-zero overhead memory protection scheme for hardware accelerators. MgX minimizes the performance overhead of off-chip memory encryption and integrity verification by exploiting the application-specific aspect of accelerators. Accelerators tend to explicitly manage data movement between on-chip and off-chip memory, typically at an object granularity that is much larger than cache lines. Exploiting these accelerator-specific characteristics, MgX generates version numbers used in memory encryption and integrity verification only using on-chip state without storing them in memory, and also customizes the granularity of the memory protection to match the granularity used by the accelerator. To demonstrate the applicability of MgX, we present an in-depth study of MgX for deep neural network (DNN) and also describe implementations for H.264 video decoding and genome alignment. Experimental results show that applying MgX has less than 1% performance overhead for both DNN inference and training on state-of-the-art DNN architectures

    GuardNN: Secure DNN Accelerator for Privacy-Preserving Deep Learning

    Full text link
    This paper proposes GuardNN, a secure deep neural network (DNN) accelerator, which provides strong hardware-based protection for user data and model parameters even in an untrusted environment. GuardNN shows that the architecture and protection can be customized for a specific application to provide strong confidentiality and integrity protection with negligible overhead. The design of the GuardNN instruction set reduces the TCB to just the accelerator and enables confidentiality protection without the overhead of integrity protection. GuardNN also introduces a new application-specific memory protection scheme to minimize the overhead of memory encryption and integrity verification. The scheme shows that most of the off-chip meta-data in today's state-of-the-art memory protection can be removed by exploiting the known memory access patterns of a DNN accelerator. GuardNN is implemented as an FPGA prototype, which demonstrates effective protection with less than 2% performance overhead for inference over a variety of modern DNN models

    QuMoS: A Framework for Preserving Security of Quantum Machine Learning Model

    Full text link
    Security has always been a critical issue in machine learning (ML) applications. Due to the high cost of model training -- such as collecting relevant samples, labeling data, and consuming computing power -- model-stealing attack is one of the most fundamental but vitally important issues. When it comes to quantum computing, such a quantum machine learning (QML) model-stealing attack also exists and is even more severe because the traditional encryption method, such as homomorphic encryption can hardly be directly applied to quantum computation. On the other hand, due to the limited quantum computing resources, the monetary cost of training QML model can be even higher than classical ones in the near term. Therefore, a well-tuned QML model developed by a third-party company can be delegated to a quantum cloud provider as a service to be used by ordinary users. In this case, the QML model will likely be leaked if the cloud provider is under attack. To address such a problem, we propose a novel framework, namely QuMoS, to preserve model security. We propose to divide the complete QML model into multiple parts and distribute them to multiple physically isolated quantum cloud providers for execution. As such, even if the adversary in a single provider can obtain a partial model, it does not have sufficient information to retrieve the complete model. Although promising, we observed that an arbitrary model design under distributed settings cannot provide model security. We further developed a reinforcement learning-based security engine, which can automatically optimize the model design under the distributed setting, such that a good trade-off between model performance and security can be made. Experimental results on four datasets show that the model design proposed by QuMoS can achieve competitive performance while providing the highest security than the baselines

    Embedding Security into Ferroelectric FET Array via In-Situ Memory Operation

    Full text link
    Non-volatile memories (NVMs) have the potential to reshape next-generation memory systems because of their promising properties of near-zero leakage power consumption, high density and non-volatility. However, NVMs also face critical security threats that exploit the non-volatile property. Compared to volatile memory, the capability of retaining data even after power down makes NVM more vulnerable. Existing solutions to address the security issues of NVMs are mainly based on Advanced Encryption Standard (AES), which incurs significant performance and power overhead. In this paper, we propose a lightweight memory encryption/decryption scheme by exploiting in-situ memory operations with negligible overhead. To validate the feasibility of the encryption/decryption scheme, device-level and array-level experiments are performed using ferroelectric field effect transistor (FeFET) as an example NVM without loss of generality. Besides, a comprehensive evaluation is performed on a 128x128 FeFET AND-type memory array in terms of area, latency, power and throughput. Compared with the AES-based scheme, our scheme shows around 22.6x/14.1x increase in encryption/decryption throughput with negligible power penalty. Furthermore, we evaluate the performance of our scheme over the AES-based scheme when deploying different neural network workloads. Our scheme yields significant latency reduction by 90% on average for encryption and decryption processes

    Edge Intelligence : Empowering Intelligence to the Edge of Network

    Get PDF
    Edge intelligence refers to a set of connected systems and devices for data collection, caching, processing, and analysis proximity to where data are captured based on artificial intelligence. Edge intelligence aims at enhancing data processing and protects the privacy and security of the data and users. Although recently emerged, spanning the period from 2011 to now, this field of research has shown explosive growth over the past five years. In this article, we present a thorough and comprehensive survey of the literature surrounding edge intelligence. We first identify four fundamental components of edge intelligence, i.e., edge caching, edge training, edge inference, and edge offloading based on theoretical and practical results pertaining to proposed and deployed systems. We then aim for a systematic classification of the state of the solutions by examining research results and observations for each of the four components and present a taxonomy that includes practical problems, adopted techniques, and application goals. For each category, we elaborate, compare, and analyze the literature from the perspectives of adopted techniques, objectives, performance, advantages and drawbacks, and so on. This article provides a comprehensive survey of edge intelligence and its application areas. In addition, we summarize the development of the emerging research fields and the current state of the art and discuss the important open issues and possible theoretical and technical directions.Peer reviewe

    Edge Intelligence : Empowering Intelligence to the Edge of Network

    Get PDF
    Edge intelligence refers to a set of connected systems and devices for data collection, caching, processing, and analysis proximity to where data are captured based on artificial intelligence. Edge intelligence aims at enhancing data processing and protects the privacy and security of the data and users. Although recently emerged, spanning the period from 2011 to now, this field of research has shown explosive growth over the past five years. In this article, we present a thorough and comprehensive survey of the literature surrounding edge intelligence. We first identify four fundamental components of edge intelligence, i.e., edge caching, edge training, edge inference, and edge offloading based on theoretical and practical results pertaining to proposed and deployed systems. We then aim for a systematic classification of the state of the solutions by examining research results and observations for each of the four components and present a taxonomy that includes practical problems, adopted techniques, and application goals. For each category, we elaborate, compare, and analyze the literature from the perspectives of adopted techniques, objectives, performance, advantages and drawbacks, and so on. This article provides a comprehensive survey of edge intelligence and its application areas. In addition, we summarize the development of the emerging research fields and the current state of the art and discuss the important open issues and possible theoretical and technical directions.Peer reviewe

    EvoFed: Leveraging Evolutionary Strategies for Communication-Efficient Federated Learning

    Full text link
    Federated Learning (FL) is a decentralized machine learning paradigm that enables collaborative model training across dispersed nodes without having to force individual nodes to share data. However, its broad adoption is hindered by the high communication costs of transmitting a large number of model parameters. This paper presents EvoFed, a novel approach that integrates Evolutionary Strategies (ES) with FL to address these challenges. EvoFed employs a concept of 'fitness-based information sharing', deviating significantly from the conventional model-based FL. Rather than exchanging the actual updated model parameters, each node transmits a distance-based similarity measure between the locally updated model and each member of the noise-perturbed model population. Each node, as well as the server, generates an identical population set of perturbed models in a completely synchronized fashion using the same random seeds. With properly chosen noise variance and population size, perturbed models can be combined to closely reflect the actual model updated using the local dataset, allowing the transmitted similarity measures (or fitness values) to carry nearly the complete information about the model parameters. As the population size is typically much smaller than the number of model parameters, the savings in communication load is large. The server aggregates these fitness values and is able to update the global model. This global fitness vector is then disseminated back to the nodes, each of which applies the same update to be synchronized to the global model. Our analysis shows that EvoFed converges, and our experimental results validate that at the cost of increased local processing loads, EvoFed achieves performance comparable to FedAvg while reducing overall communication requirements drastically in various practical settings
    • …
    corecore