Horizontal Federated Learning and Secure Distributed Training for Recommendation System with Intel SGX
With the advent of the big data era and the development of artificial
intelligence and other technologies, data security and privacy protection have
become more important. Recommendation systems have many applications in our
society, but the model construction of recommendation systems is often
inseparable from users' data. Especially for deep learning-based recommendation
systems, due to the complexity of the model and the characteristics of deep
learning itself, its training process not only requires long training time and
abundant computational resources but also needs to use a large amount of user
data, which poses a considerable challenge in terms of data security and
privacy protection. How to train a distributed recommendation system while
ensuring data security has become an urgent problem to be solved. In this
paper, we implement two schemes, Horizontal Federated Learning and Secure
Distributed Training, based on Intel SGX (Software Guard Extensions), an
implementation of a trusted execution environment, and the TensorFlow framework, to
achieve secure, distributed learning for recommendation systems in
different scenarios. We experiment on the classical Deep Learning
Recommendation Model (DLRM), which is a neural network-based machine learning
model designed for personalization and recommendation, and the results show
that our implementation introduces almost no loss in model performance.
The training speed is within acceptable limits.
Comment: 5 pages, 8 figures
PDoT: Private DNS-over-TLS with TEE Support
Security and privacy of the Internet Domain Name System (DNS) have been
longstanding concerns. Recently, there has been a trend to protect DNS traffic using
Transport Layer Security (TLS). However, at least two major issues remain: (1)
how do clients authenticate DNS-over-TLS endpoints in a scalable and extensible
manner; and (2) how can clients trust endpoints to behave as expected? In this
paper, we propose a novel Private DNS-over-TLS (PDoT) architecture. PDoT
includes a DNS Recursive Resolver (RecRes) that operates within a Trusted
Execution Environment (TEE). Using Remote Attestation, DNS clients can
authenticate PDoT RecRes and receive strong assurance of its trustworthiness.
We provide an open-source proof-of-concept implementation of PDoT and use it to
experimentally demonstrate that its latency and throughput match those of the
popular Unbound DNS-over-TLS resolver.
Comment: To appear: ACSAC 201
A Generative Framework for Low-Cost Result Validation of Outsourced Machine Learning Tasks
The growing popularity of Machine Learning (ML) has led to its deployment in
various sensitive domains, which has resulted in significant research focused
on ML security and privacy. However, in some applications, such as autonomous
driving, integrity verification of the outsourced ML workload is more
critical--a facet that has not received much attention. Existing solutions,
such as multi-party computation and proof-based systems, impose significant
computation overhead, which makes them unfit for real-time applications. We
propose Fides, a novel framework for real-time validation of outsourced ML
workloads. Fides features a novel and efficient distillation technique--Greedy
Distillation Transfer Learning--that dynamically distills and fine-tunes a
space- and compute-efficient verification model for verifying the corresponding
service model while running inside a trusted execution environment. Fides
features a client-side attack detection model that uses statistical analysis
and divergence measurements to identify, with a high likelihood, whether the service
model is under attack. Fides also offers a re-classification functionality that
predicts the original class whenever an attack is identified. We devised a
generative adversarial network framework for training the attack detection and
re-classification models. The evaluation shows that Fides achieves an accuracy
of up to 98% for attack detection and 94% for re-classification.
Comment: 16 pages, 11 figures
Federated Learning Enables Big Data for Rare Cancer Boundary Detection
Although machine learning (ML) has shown promise across disciplines, out-of-sample generalizability is concerning. This is currently addressed by sharing multi-site data, but such centralization is challenging or infeasible to scale due to various limitations. Federated ML (FL) provides an alternative paradigm for accurate and generalizable ML, by only sharing numerical model updates. Here we present the largest FL study to date, involving data from 71 sites across 6 continents, to generate an automatic tumor boundary detector for the rare disease of glioblastoma, reporting the largest such dataset in the literature (n = 6,314). We demonstrate a 33% delineation improvement for the surgically targetable tumor, and 23% for the complete tumor extent, over a publicly trained model. We anticipate our study to: 1) enable more healthcare studies informed by large diverse data, ensuring meaningful results for rare diseases and underrepresented populations, 2) facilitate further analyses for glioblastoma by releasing our consensus model, and 3) demonstrate the FL effectiveness at such scale and task-complexity as a paradigm shift for multi-site collaborations, alleviating the need for data sharing.
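To make "only sharing numerical model updates" concrete, here is a minimal sketch of one common aggregation rule, sample-weighted federated averaging. The abstract does not say which aggregation rule the study used, so this is an assumption for illustration only. The function name `federated_average` and the flat-list parameter layout are hypothetical.

```python
from typing import List, Sequence


def federated_average(
    site_params: Sequence[Sequence[float]],
    site_sample_counts: Sequence[int],
) -> List[float]:
    """Combine per-site parameter vectors into one consensus vector.

    Each site contributes only its numerical parameters and its local
    sample count; no raw data leaves the site. Sites with more samples
    receive proportionally more weight (an assumed weighting scheme).
    """
    if not site_params or len(site_params) != len(site_sample_counts):
        raise ValueError("need one sample count per site, and at least one site")
    total = sum(site_sample_counts)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(site_params[0])
    if any(len(p) != dim for p in site_params):
        raise ValueError("all sites must send vectors of the same length")

    # Weighted mean, computed coordinate by coordinate.
    return [
        sum(p[i] * n for p, n in zip(site_params, site_sample_counts)) / total
        for i in range(dim)
    ]


# Two hypothetical sites: the second holds three times as much data.
consensus = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

In a full round, a coordinator would send `consensus` back to every site, each site would continue training locally, and the process would repeat.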