MgX: Near-Zero Overhead Memory Protection with an Application to Secure DNN Acceleration
In this paper, we propose MgX, a near-zero overhead memory protection scheme
for hardware accelerators. MgX minimizes the performance overhead of off-chip
memory encryption and integrity verification by exploiting the
application-specific aspect of accelerators. Accelerators tend to explicitly
manage data movement between on-chip and off-chip memory, typically at an
object granularity that is much larger than cache lines. Exploiting these
accelerator-specific characteristics, MgX generates version numbers used in
memory encryption and integrity verification only using on-chip state without
storing them in memory, and also customizes the granularity of the memory
protection to match the granularity used by the accelerator. To demonstrate the
applicability of MgX, we present an in-depth study of MgX for deep neural
network (DNN) and also describe implementations for H.264 video decoding and
genome alignment. Experimental results show that applying MgX incurs less than
1% performance overhead for both DNN inference and training on
state-of-the-art DNN architectures.
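The core idea of generating version numbers purely from on-chip state can be sketched in a few lines. The sketch below is a hypothetical illustration, not the paper's implementation: `MgXLikeMemory` is an invented name, and a SHA-256-based keystream stands in for the AES counter-mode encryption a real design would use.

```python
import hashlib

class MgXLikeMemory:
    """Hypothetical sketch of version generation from on-chip state only.

    Instead of storing a version (counter) per cache line off-chip, the
    accelerator keeps one small write counter per *object* on-chip and
    derives the encryption counter from (object_id, version). SHA-256
    stands in for the AES-CTR keystream of a real design.
    """

    def __init__(self, key: bytes):
        self.key = key
        self.versions = {}   # on-chip state: object_id -> write counter
        self.offchip = {}    # untrusted off-chip memory: object_id -> ciphertext

    def _keystream(self, object_id: int, version: int, length: int) -> bytes:
        # Derive a keystream from (key, object_id, version, block index);
        # no version number ever needs to be fetched from off-chip memory.
        out = b""
        block = 0
        while len(out) < length:
            out += hashlib.sha256(
                self.key
                + object_id.to_bytes(8, "big")
                + version.to_bytes(8, "big")
                + block.to_bytes(4, "big")
            ).digest()
            block += 1
        return out[:length]

    def write(self, object_id: int, data: bytes) -> None:
        # Bump the on-chip version; the whole object is one protection unit,
        # matching the object granularity the accelerator already uses.
        v = self.versions.get(object_id, 0) + 1
        self.versions[object_id] = v
        ks = self._keystream(object_id, v, len(data))
        self.offchip[object_id] = bytes(a ^ b for a, b in zip(data, ks))

    def read(self, object_id: int) -> bytes:
        v = self.versions[object_id]  # regenerated from on-chip state
        ct = self.offchip[object_id]
        ks = self._keystream(object_id, v, len(ct))
        return bytes(a ^ b for a, b in zip(ct, ks))
```

Because every write bumps the on-chip counter, the same plaintext encrypts differently on each write, while reads need no off-chip metadata lookup at all.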
GuardNN: Secure DNN Accelerator for Privacy-Preserving Deep Learning
This paper proposes GuardNN, a secure deep neural network (DNN) accelerator,
which provides strong hardware-based protection for user data and model
parameters even in an untrusted environment. GuardNN shows that the
architecture and protection can be customized for a specific application to
provide strong confidentiality and integrity protection with negligible
overhead. The design of the GuardNN instruction set reduces the TCB to just the
accelerator and enables confidentiality protection without the overhead of
integrity protection. GuardNN also introduces a new application-specific memory
protection scheme to minimize the overhead of memory encryption and integrity
verification. The scheme shows that most of the off-chip metadata in today's
state-of-the-art memory protection can be removed by exploiting the known
memory access patterns of a DNN accelerator. GuardNN is implemented as an FPGA
prototype, which demonstrates effective protection with less than 2%
performance overhead for inference over a variety of modern DNN models.
QuMoS: A Framework for Preserving Security of Quantum Machine Learning Model
Security has always been a critical issue in machine learning (ML)
applications. Due to the high cost of model training -- such as collecting
relevant samples, labeling data, and consuming computing power --
model-stealing attack is one of the most fundamental but vitally important
issues. When it comes to quantum computing, such a quantum machine learning
(QML) model-stealing attack also exists and is even more severe, because
traditional encryption methods, such as homomorphic encryption, can hardly be
directly applied to quantum computation. On the other hand, due to the limited
quantum computing resources, the monetary cost of training a QML model can be
even higher than that of classical ones in the near term. Therefore, a well-tuned QML
model developed by a third-party company can be delegated to a quantum cloud
provider as a service to be used by ordinary users. In this case, the QML model
will likely be leaked if the cloud provider is under attack. To address such a
problem, we propose a novel framework, namely QuMoS, to preserve model
security. We propose to divide the complete QML model into multiple parts and
distribute them to multiple physically isolated quantum cloud providers for
execution. As such, even if the adversary in a single provider can obtain a
partial model, it does not have sufficient information to retrieve the complete
model. Although promising, we observed that an arbitrary model design under
distributed settings cannot provide model security. We further developed a
reinforcement learning-based security engine, which can automatically optimize
the model design under the distributed setting, such that a good trade-off
between model performance and security can be made. Experimental results on
four datasets show that the model design proposed by QuMoS achieves
competitive performance while providing higher security than the baselines.
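The partitioning idea, minus the reinforcement-learning search, can be illustrated with a simple sharding sketch. `partition_model` and `leaked_fraction` are hypothetical helper names; the paper's RL-based security engine would instead search over such partitions to trade off model performance against security, while this sketch just shards round-robin as a baseline.

```python
def partition_model(layers, providers):
    """Distribute model parts across physically isolated cloud providers.

    `layers` is an ordered list of layer descriptions; each provider
    executes only its shard, so an adversary who compromises a single
    provider obtains only a partial model. Round-robin assignment is a
    hypothetical baseline, not the paper's optimized design.
    """
    assignment = {p: [] for p in providers}
    for i, layer in enumerate(layers):
        assignment[providers[i % len(providers)]].append(layer)
    return assignment

def leaked_fraction(assignment, compromised_provider, total_layers):
    """Fraction of the model revealed if one provider is compromised."""
    return len(assignment[compromised_provider]) / total_layers
```

With four providers and eight layers, a single compromised provider sees only a quarter of the model, which illustrates why an arbitrary partition still needs a search over designs to keep that partial view uninformative.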
Embedding Security into Ferroelectric FET Array via In-Situ Memory Operation
Non-volatile memories (NVMs) have the potential to reshape next-generation
memory systems because of their promising properties of near-zero leakage power
consumption, high density and non-volatility. However, NVMs also face critical
security threats that exploit the non-volatile property. Compared to volatile
memory, the capability of retaining data even after power down makes NVM more
vulnerable. Existing solutions to address the security issues of NVMs are
mainly based on Advanced Encryption Standard (AES), which incurs significant
performance and power overhead. In this paper, we propose a lightweight memory
encryption/decryption scheme by exploiting in-situ memory operations with
negligible overhead. To validate the feasibility of the encryption/decryption
scheme, device-level and array-level experiments are performed using
ferroelectric field effect transistor (FeFET) as an example NVM without loss of
generality. In addition, a comprehensive evaluation is performed on a 128x128 FeFET
AND-type memory array in terms of area, latency, power and throughput. Compared
with the AES-based scheme, our scheme shows around 22.6x/14.1x increase in
encryption/decryption throughput with negligible power penalty. Furthermore, we
evaluate the performance of our scheme over the AES-based scheme when deploying
different neural network workloads. Our scheme reduces encryption and
decryption latency by 90% on average.
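Because XOR is its own inverse, a lightweight XOR-style cipher needs only one operation for both encryption and decryption, which is what makes an in-situ realization attractive. The sketch below is a purely functional stand-in under stated assumptions: it models the array as nested Python lists with one key per row, and abstracts away the FeFET device mechanism and the 128x128 AND-array geometry.

```python
def inplace_xor_encrypt(array, row_keys):
    """Functional stand-in for a lightweight in-situ XOR cipher.

    Each stored word is XORed with a per-row key. In the paper the
    (de)obfuscation happens during the memory operation itself inside
    the FeFET AND array; here we only model the data transformation.
    Applying the function twice with the same keys recovers the data,
    since XOR is an involution.
    """
    return [[word ^ row_keys[r] for word in row]
            for r, row in enumerate(array)]
```

The same function serves as both encryptor and decryptor, which is why such a scheme avoids the heavyweight round structure (and the latency) of AES.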
Edge Intelligence : Empowering Intelligence to the Edge of Network
Edge intelligence refers to a set of connected systems and devices that perform data collection, caching, processing, and analysis in proximity to where the data are captured, based on artificial intelligence. Edge intelligence aims to enhance data processing and to protect the privacy and security of the data and users. Although it emerged only recently, around 2011, this field of research has shown explosive growth over the past five years. In this article, we present a thorough and comprehensive survey of the literature surrounding edge intelligence. We first identify four fundamental components of edge intelligence, i.e., edge caching, edge training, edge inference, and edge offloading, based on theoretical and practical results pertaining to proposed and deployed systems. We then aim for a systematic classification of the state of the solutions by examining research results and observations for each of the four components and present a taxonomy that includes practical problems, adopted techniques, and application goals. For each category, we elaborate on, compare, and analyze the literature from the perspectives of adopted techniques, objectives, performance, advantages and drawbacks, and so on. This article provides a comprehensive survey of edge intelligence and its application areas. In addition, we summarize the development of the emerging research fields and the current state of the art and discuss important open issues and possible theoretical and technical directions.
EvoFed: Leveraging Evolutionary Strategies for Communication-Efficient Federated Learning
Federated Learning (FL) is a decentralized machine learning paradigm that
enables collaborative model training across dispersed nodes without requiring
individual nodes to share data. However, its broad adoption is hindered
by the high communication costs of transmitting a large number of model
parameters. This paper presents EvoFed, a novel approach that integrates
Evolutionary Strategies (ES) with FL to address these challenges. EvoFed
employs a concept of 'fitness-based information sharing', deviating
significantly from the conventional model-based FL. Rather than exchanging the
actual updated model parameters, each node transmits a distance-based
similarity measure between the locally updated model and each member of the
noise-perturbed model population. Each node, as well as the server, generates
an identical population set of perturbed models in a completely synchronized
fashion using the same random seeds. With properly chosen noise variance and
population size, perturbed models can be combined to closely reflect the actual
model updated using the local dataset, allowing the transmitted similarity
measures (or fitness values) to carry nearly the complete information about the
model parameters. As the population size is typically much smaller than the
number of model parameters, the savings in communication load are large.
server aggregates these fitness values and is able to update the global model.
This global fitness vector is then disseminated back to the nodes, each of
which applies the same update to be synchronized to the global model. Our
analysis shows that EvoFed converges, and our experimental results validate
that, at the cost of increased local processing load, EvoFed achieves
performance comparable to FedAvg while drastically reducing overall
communication requirements in various practical settings.
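One round of the fitness-based sharing can be sketched as follows. `evofed_round` is a hypothetical function, not the paper's code: the shared seed lets node and server regenerate identical perturbation populations, so only the fitness values cross the network, and the fitness-weighted reconstruction shown here is a simple estimator that may differ from the paper's exact update rule.

```python
import numpy as np

def evofed_round(local_update, population_size=64, sigma=0.1, seed=42):
    """Hypothetical sketch of EvoFed for one node, one round.

    Node and server regenerate the SAME perturbation population from a
    shared seed, so only `population_size` fitness scalars are
    transmitted instead of the full parameter vector of size
    `local_update.size`.
    """
    d = local_update.size
    rng = np.random.default_rng(seed)          # identical on node and server
    population = rng.normal(0.0, sigma, size=(population_size, d))

    # Node side: compute and transmit distance-based fitness values only.
    fitness = -np.linalg.norm(population - local_update, axis=1)

    # Server side: regenerate the population from the same seed and
    # combine members with softmax weights derived from the fitness.
    w = np.exp(fitness - fitness.max())
    w /= w.sum()
    reconstructed = w @ population             # estimate of local_update
    return fitness, reconstructed
```

With `population_size=64` and a million-parameter model, each node would send 64 scalars per round instead of 10^6, which is the source of the drastic communication savings the abstract reports.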