891 research outputs found
Heterogeneous Collaborative Learning for Personalized Healthcare Analytics via Messenger Distillation
In this paper, we propose a Similarity-Quality-based Messenger Distillation
(SQMD) framework for heterogeneous asynchronous on-device healthcare analytics.
By introducing a preloaded reference dataset, SQMD enables all participant
devices to distill knowledge from peers via messengers (i.e., the soft labels
of the reference dataset generated by clients) without assuming the same model
architecture. Furthermore, the messengers also carry important auxiliary
information to calculate the similarity between clients and evaluate the
quality of each client model, based on which the central server creates and
maintains a dynamic collaboration graph (communication graph) to improve the
personalization and reliability of SQMD under asynchronous conditions.
Extensive experiments on three real-life datasets show that SQMD achieves
superior performance
k-Nearest Neighbor Classification over Semantically Secure Encrypted Relational Data
Data Mining has wide applications in many areas such as banking, medicine,
scientific research and among government agencies. Classification is one of the
commonly used tasks in data mining applications. For the past decade, due to
the rise of various privacy issues, many theoretical and practical solutions to
the classification problem have been proposed under different security models.
However, with the recent popularity of cloud computing, users now have the
opportunity to outsource their data, in encrypted form, as well as the data
mining tasks to the cloud. Since the data on the cloud is in encrypted form,
existing privacy preserving classification techniques are not applicable. In
this paper, we focus on solving the classification problem over encrypted data.
In particular, we propose a secure k-NN classifier over encrypted data in the
cloud. The proposed k-NN protocol protects the confidentiality of the data,
user's input query, and data access patterns. To the best of our knowledge, our
work is the first to develop a secure k-NN classifier over encrypted data under
the semi-honest model. Also, we empirically analyze the efficiency of our
solution through various experiments.Comment: 29 pages, 2 figures, 3 tables arXiv admin note: substantial text
overlap with arXiv:1307.482
Addressing the clinical unmet needs in primary Sjögren's Syndrome through the sharing, harmonization and federated analysis of 21 European cohorts
For many decades, the clinical unmet needs of primary Sjögren's Syndrome (pSS) have been left unresolved due to the rareness of the disease and the complexity of the underlying pathogenic mechanisms, including the pSS-associated lymphomagenesis process. Here, we present the HarmonicSS cloud-computing exemplar which offers beyond the state-of-the-art data analytics services to address the pSS clinical unmet needs, including the development of lymphoma classification models and the identification of biomarkers for lymphomagenesis. The users of the platform have been able to successfully interlink, curate, and harmonize 21 regional, national, and international European cohorts of 7,551 pSS patients with respect to the ethical and legal issues for data sharing. Federated AI algorithms were trained across the harmonized databases, with reduced execution time complexity, yielding robust lymphoma classification models with 85% accuracy, 81.25% sensitivity, 85.4% specificity along with 5 biomarkers for lymphoma development. To our knowledge, this is the first GDPR compliant platform that provides federated AI services to address the pSS clinical unmet needs. © 2022 The Author(s
Secure k-Nearest Neighbor Query over Encrypted Data in Outsourced Environments
For the past decade, query processing on relational data has been studied
extensively, and many theoretical and practical solutions to query processing
have been proposed under various scenarios. With the recent popularity of cloud
computing, users now have the opportunity to outsource their data as well as
the data management tasks to the cloud. However, due to the rise of various
privacy issues, sensitive data (e.g., medical records) need to be encrypted
before outsourcing to the cloud. In addition, query processing tasks should be
handled by the cloud; otherwise, there would be no point to outsource the data
at the first place. To process queries over encrypted data without the cloud
ever decrypting the data is a very challenging task. In this paper, we focus on
solving the k-nearest neighbor (kNN) query problem over encrypted database
outsourced to a cloud: a user issues an encrypted query record to the cloud,
and the cloud returns the k closest records to the user. We first present a
basic scheme and demonstrate that such a naive solution is not secure. To
provide better security, we propose a secure kNN protocol that protects the
confidentiality of the data, user's input query, and data access patterns.
Also, we empirically analyze the efficiency of our protocols through various
experiments. These results indicate that our secure protocol is very efficient
on the user end, and this lightweight scheme allows a user to use any mobile
device to perform the kNN query.Comment: 23 pages, 8 figures, and 4 table
On Lightweight Privacy-Preserving Collaborative Learning for IoT Objects
The Internet of Things (IoT) will be a main data generation infrastructure
for achieving better system intelligence. This paper considers the design and
implementation of a practical privacy-preserving collaborative learning scheme,
in which a curious learning coordinator trains a better machine learning model
based on the data samples contributed by a number of IoT objects, while the
confidentiality of the raw forms of the training data is protected against the
coordinator. Existing distributed machine learning and data encryption
approaches incur significant computation and communication overhead, rendering
them ill-suited for resource-constrained IoT objects. We study an approach that
applies independent Gaussian random projection at each IoT object to obfuscate
data and trains a deep neural network at the coordinator based on the projected
data from the IoT objects. This approach introduces light computation overhead
to the IoT objects and moves most workload to the coordinator that can have
sufficient computing resources. Although the independent projections performed
by the IoT objects address the potential collusion between the curious
coordinator and some compromised IoT objects, they significantly increase the
complexity of the projected data. In this paper, we leverage the superior
learning capability of deep learning in capturing sophisticated patterns to
maintain good learning performance. Extensive comparative evaluation shows that
this approach outperforms other lightweight approaches that apply additive
noisification for differential privacy and/or support vector machines for
learning in the applications with light data pattern complexities.Comment: 12 pages,IOTDI 201
Exploiting Record Similarity for Practical Vertical Federated Learning
As the privacy of machine learning has drawn increasing attention, federated
learning is introduced to enable collaborative learning without revealing raw
data. Notably, \textit{vertical federated learning} (VFL), where parties share
the same set of samples but only hold partial features, has a wide range of
real-world applications. However, existing studies in VFL rarely study the
``record linkage'' process. They either design algorithms assuming the data
from different parties have been linked or use simple linkage methods like
exact-linkage or top1-linkage. These approaches are unsuitable for many
applications, such as the GPS location and noisy titles requiring fuzzy
matching. In this paper, we design a novel similarity-based VFL framework,
FedSim, which is suitable for more real-world applications and achieves higher
performance on traditional VFL tasks. Moreover, we theoretically analyze the
privacy risk caused by sharing similarities. Our experiments on three synthetic
datasets and five real-world datasets with various similarity metrics show that
FedSim consistently outperforms other state-of-the-art baselines
SoK: Cryptographically Protected Database Search
Protected database search systems cryptographically isolate the roles of
reading from, writing to, and administering the database. This separation
limits unnecessary administrator access and protects data in the case of system
breaches. Since protected search was introduced in 2000, the area has grown
rapidly; systems are offered by academia, start-ups, and established companies.
However, there is no best protected search system or set of techniques.
Design of such systems is a balancing act between security, functionality,
performance, and usability. This challenge is made more difficult by ongoing
database specialization, as some users will want the functionality of SQL,
NoSQL, or NewSQL databases. This database evolution will continue, and the
protected search community should be able to quickly provide functionality
consistent with newly invented databases.
At the same time, the community must accurately and clearly characterize the
tradeoffs between different approaches. To address these challenges, we provide
the following contributions:
1) An identification of the important primitive operations across database
paradigms. We find there are a small number of base operations that can be used
and combined to support a large number of database paradigms.
2) An evaluation of the current state of protected search systems in
implementing these base operations. This evaluation describes the main
approaches and tradeoffs for each base operation. Furthermore, it puts
protected search in the context of unprotected search, identifying key gaps in
functionality.
3) An analysis of attacks against protected search for different base
queries.
4) A roadmap and tools for transforming a protected search system into a
protected database, including an open-source performance evaluation platform
and initial user opinions of protected search.Comment: 20 pages, to appear to IEEE Security and Privac
- …