21,170 research outputs found
Efficient Federated Learning on Knowledge Graphs via Privacy-preserving Relation Embedding Aggregation
Federated Learning (FL) on knowledge graphs (KGs) has yet to be as well
studied as other domains, such as computer vision and natural language
processing. A recent study FedE first proposes an FL framework that shares
entity embeddings of KGs across all clients. However, compared with model
sharing in vanilla FL, entity embedding sharing from FedE would incur severe
privacy leakage. Specifically, the known entity embedding can be used to infer
whether a specific relation between two entities exists in a private client. In
this paper, we first develop a novel attack that aims to recover the original
data based on embedding information, which is further used to evaluate the
vulnerabilities of FedE. Furthermore, we propose a Federated learning paradigm
with privacy-preserving Relation embedding aggregation (FedR) to tackle the
privacy issue in FedE. Compared to entity embedding sharing, relation embedding
sharing policy can significantly reduce the communication cost due to its
smaller size of queries. We conduct extensive experiments to evaluate FedR with
five different embedding learning models and three benchmark KG datasets.
Compared to FedE, FedR achieves similar utility and significant (nearly 2X)
improvements in both privacy and efficiency on link prediction task.Comment: Accepted to ACL 2022 Workshop on Federated Learning for Natural
Language Processin
Adversarial Removal of Demographic Attributes from Text Data
Recent advances in Representation Learning and Adversarial Training seem to
succeed in removing unwanted features from the learned representation. We show
that demographic information of authors is encoded in -- and can be recovered
from -- the intermediate representations learned by text-based neural
classifiers. The implication is that decisions of classifiers trained on
textual data are not agnostic to -- and likely condition on -- demographic
attributes. When attempting to remove such demographic information using
adversarial training, we find that while the adversarial component achieves
chance-level development-set accuracy during training, a post-hoc classifier,
trained on the encoded sentences from the first part, still manages to reach
substantially higher classification accuracies on the same data. This behavior
is consistent across several tasks, demographic properties and datasets. We
explore several techniques to improve the effectiveness of the adversarial
component. Our main conclusion is a cautionary one: do not rely on the
adversarial training to achieve invariant representation to sensitive features
Quantifying the Leakage of Quantum Protocols for Classical Two-Party Cryptography
We study quantum protocols among two distrustful parties. By adopting a
rather strict definition of correctness - guaranteeing that honest players
obtain their correct outcomes only - we can show that every strictly correct
quantum protocol implementing a non-trivial classical primitive necessarily
leaks information to a dishonest player. This extends known impossibility
results to all non-trivial primitives. We provide a framework for quantifying
this leakage and argue that leakage is a good measure for the privacy provided
to the players by a given protocol. Our framework also covers the case where
the two players are helped by a trusted third party. We show that despite the
help of a trusted third party, the players cannot amplify the cryptographic
power of any primitive. All our results hold even against quantum
honest-but-curious adversaries who honestly follow the protocol but purify
their actions and apply a different measurement at the end of the protocol. As
concrete examples, we establish lower bounds on the leakage of standard
universal two-party primitives such as oblivious transfer.Comment: 38 pages, completely supersedes arXiv:0902.403
Comprehensive Privacy Analysis of Deep Learning: Passive and Active White-box Inference Attacks against Centralized and Federated Learning
Deep neural networks are susceptible to various inference attacks as they
remember information about their training data. We design white-box inference
attacks to perform a comprehensive privacy analysis of deep learning models. We
measure the privacy leakage through parameters of fully trained models as well
as the parameter updates of models during training. We design inference
algorithms for both centralized and federated learning, with respect to passive
and active inference attackers, and assuming different adversary prior
knowledge.
We evaluate our novel white-box membership inference attacks against deep
learning algorithms to trace their training data records. We show that a
straightforward extension of the known black-box attacks to the white-box
setting (through analyzing the outputs of activation functions) is ineffective.
We therefore design new algorithms tailored to the white-box setting by
exploiting the privacy vulnerabilities of the stochastic gradient descent
algorithm, which is the algorithm used to train deep neural networks. We
investigate the reasons why deep learning models may leak information about
their training data. We then show that even well-generalized models are
significantly susceptible to white-box membership inference attacks, by
analyzing state-of-the-art pre-trained and publicly available models for the
CIFAR dataset. We also show how adversarial participants, in the federated
learning setting, can successfully run active membership inference attacks
against other participants, even when the global model achieves high prediction
accuracies.Comment: 2019 IEEE Symposium on Security and Privacy (SP
Constructing practical Fuzzy Extractors using QIM
Fuzzy extractors are a powerful tool to extract randomness from noisy data. A fuzzy extractor can extract randomness only if the source data is discrete while in practice source data is continuous. Using quantizers to transform continuous data into discrete data is a commonly used solution. However, as far as we know no study has been made of the effect of the quantization strategy on the performance of fuzzy extractors. We construct the encoding and the decoding function of a fuzzy extractor using quantization index modulation (QIM) and we express properties of this fuzzy extractor in terms of parameters of the used QIM. We present and analyze an optimal (in the sense of embedding rate) two dimensional construction. Our 6-hexagonal tiling construction offers ( log2 6 / 2-1) approx. 3 extra bits per dimension of the space compared to the known square quantization based fuzzy extractor
- …