On the Relationship Between Information-Theoretic Privacy Metrics And Probabilistic Information Privacy
Information-theoretic (IT) measures based on f-divergences have recently
gained interest as measures of privacy leakage, since they allow privacy to be
traded off against utility using only a single-value characterization. However,
their operational interpretations in the privacy context are unclear. In this
paper, we relate the notion of probabilistic information privacy (IP) to
several IT privacy metrics based on f-divergences. We interpret probabilistic
IP under both the detection and estimation frameworks and link it to
differential privacy, thus allowing a precise operational interpretation of
these IT privacy metrics. We show that the χ²-divergence privacy metric
is stronger than those based on the total variation distance and the
Kullback-Leibler divergence. We therefore develop a data-driven empirical risk
framework based on the χ²-divergence privacy metric, realized using
deep neural networks. This framework is agnostic to the adversarial attack
model. Empirical experiments demonstrate the efficacy of our approach.
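The divergences this abstract compares (total variation, Kullback-Leibler, χ²) satisfy well-known ordering inequalities, which is why a bound on one can dominate the others. A hedged numerical sketch, with made-up distributions and function names not taken from the paper:

```python
import numpy as np

def tv(p, q):
    """Total variation distance between discrete distributions."""
    return 0.5 * np.abs(p - q).sum()

def kl(p, q):
    """Kullback-Leibler divergence KL(p || q)."""
    return np.sum(p * np.log(p / q))

def chi2(p, q):
    """Chi-squared divergence chi2(p || q)."""
    return np.sum((p - q) ** 2 / q)

# Hypothetical posterior vs. prior over a sensitive attribute.
p = np.array([0.5, 0.3, 0.2])
q = np.array([0.4, 0.35, 0.25])

# Standard orderings: bounding the chi^2 divergence also bounds
# KL (via KL <= log(1 + chi^2)) and TV (via Pinsker's inequality).
assert kl(p, q) <= np.log1p(chi2(p, q))
assert tv(p, q) <= np.sqrt(kl(p, q) / 2)
print(tv(p, q), kl(p, q), chi2(p, q))
```

These inequalities are the generic reason a χ²-style bound can be the "stronger" single-value characterization; the paper's precise operational claims are of course its own.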
Learning to Generate Image Embeddings with User-level Differential Privacy
Small on-device models have been successfully trained with user-level
differential privacy (DP) for next word prediction and image classification
tasks in the past. However, existing methods can fail when directly applied to
learn embedding models using supervised training data with a large class space.
To achieve user-level DP for large image-to-embedding feature extractors, we
propose DP-FedEmb, a variant of federated learning algorithms with per-user
sensitivity control and noise addition, to train from user-partitioned data
centralized in the datacenter. DP-FedEmb combines virtual clients, partial
aggregation, private local fine-tuning, and public pretraining to achieve
strong privacy utility trade-offs. We apply DP-FedEmb to train image embedding
models for faces, landmarks and natural species, and demonstrate its superior
utility under the same privacy budget on the benchmark datasets DigiFace, EMNIST, GLD
and iNaturalist. We further illustrate that it is possible to achieve strong
user-level DP guarantees while keeping the utility drop
within 5% when millions of users can participate in training.
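The per-user sensitivity control and noise addition described above follow the generic DP federated-averaging recipe: clip each user's contribution, average, and add Gaussian noise calibrated to the clip norm. A minimal sketch of that recipe; the function name, parameters, and clipping rule are illustrative assumptions, not the actual DP-FedEmb implementation:

```python
import numpy as np

def dp_aggregate(user_updates, clip_norm=1.0, noise_mult=1.0, seed=0):
    """Average per-user model updates with user-level DP:
    clip each user's update to bound its L2 sensitivity, average,
    then add Gaussian noise calibrated to the clipping norm."""
    rng = np.random.default_rng(seed)
    clipped = [
        u * min(1.0, clip_norm / max(np.linalg.norm(u), 1e-12))
        for u in user_updates
    ]
    avg = np.mean(clipped, axis=0)
    sigma = noise_mult * clip_norm / len(user_updates)
    return avg + rng.normal(0.0, sigma, size=avg.shape)

# 100 simulated users, each contributing one update vector.
rng = np.random.default_rng(1)
updates = [rng.normal(size=8) for _ in range(100)]
noisy_mean = dp_aggregate(updates, clip_norm=1.0, noise_mult=1.0)
```

The privacy accounting that turns `noise_mult` and the number of rounds into an (ε, δ) guarantee is omitted here; per the abstract, DP-FedEmb layers virtual clients, partial aggregation, private local fine-tuning, and public pretraining on top of this basic mechanism.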
Novel approaches to anonymity and privacy in decentralized, open settings
The Internet has undergone dramatic changes in the last two decades, evolving from a mere communication network into a global multimedia platform on which billions of users actively exchange information. While this transformation has brought tremendous benefits to society, it has also created new threats to online privacy with which existing technology is failing to keep pace. In this dissertation, we present the results of two lines of research that developed two novel approaches to anonymity and privacy in decentralized, open settings. First, we examine the issue of attribute and identity disclosure in open settings and develop the novel notion of (k,d)-anonymity for open settings, which we study extensively and validate experimentally. Furthermore, we investigate the relationship between anonymity and linkability using the notion of (k,d)-anonymity and show that, in contrast to the traditional closed setting, anonymity within one online community does not necessarily imply unlinkability across different online communities in the decentralized, open setting. Secondly, we consider the transitive diffusion of information that is shared in social networks and spread through pairwise interactions of users connected in the social network. We develop the novel approach of exposure minimization to control the diffusion of information within an open network, allowing the owner to minimize its exposure by suitably choosing whom they share their information with. We implement our algorithms and investigate the practical limitations of user-side exposure minimization in large social networks.
At their core, both of these approaches present a departure from the provable privacy guarantees that we can achieve in closed settings and a step towards sound assessments of privacy risks in decentralized, open settings.
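The (k,d)-anonymity notion developed in the dissertation extends classic k-anonymity to open settings. As background, the classic quantity can be computed in a few lines; this is a toy sketch over hypothetical records, not the dissertation's algorithm:

```python
from collections import Counter

def k_anonymity(records, quasi_identifiers):
    """Size of the smallest equivalence class over the quasi-identifier
    attributes; a dataset is k-anonymous iff this value is >= k."""
    groups = Counter(
        tuple(r[a] for a in quasi_identifiers) for r in records
    )
    return min(groups.values())

# Hypothetical records; "zip" and "age" serve as quasi-identifiers.
people = [
    {"zip": "10115", "age": "30-39", "name": "A"},
    {"zip": "10115", "age": "30-39", "name": "B"},
    {"zip": "10117", "age": "40-49", "name": "C"},
]
print(k_anonymity(people, ["zip", "age"]))  # -> 1: record C is unique
```

In the open settings the dissertation targets, the attribute universe is not fixed in advance, which is precisely what the (k,d) refinement is designed to handle.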
Privacy and Robustness in Federated Learning: Attacks and Defenses
As data are increasingly stored in separate silos and society
becomes more aware of data privacy issues, the traditional centralized
training of artificial intelligence (AI) models is facing efficiency and
privacy challenges. Recently, federated learning (FL) has emerged as an
alternative solution and continues to thrive in this new reality. Existing FL
protocol design has been shown to be vulnerable to adversaries within or
outside of the system, compromising data privacy and system robustness. Besides
training powerful global models, it is of paramount importance to design FL
systems that have privacy guarantees and are resistant to different types of
adversaries. In this paper, we conduct the first comprehensive survey on this
topic. Through a concise introduction to the concept of FL, and a unique
taxonomy covering: 1) threat models; 2) poisoning attacks against robustness
and the corresponding defenses; 3) inference attacks against privacy and the
corresponding defenses, we provide an
accessible review of this important topic. We highlight the intuitions, key
techniques as well as fundamental assumptions adopted by various attacks and
defenses. Finally, we discuss promising future research directions towards
robust and privacy-preserving federated learning.
Comment: arXiv admin note: text overlap with arXiv:2003.02133; text overlap
with arXiv:1911.11815 by other authors
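Among the inference attacks such surveys cover, loss-threshold membership inference is the simplest: an overfit model tends to have lower loss on its training members, so merely thresholding the loss beats random guessing. A toy sketch with simulated loss values; the distributions and threshold are invented for illustration and are not from the survey:

```python
import numpy as np

def loss_threshold_attack(losses, threshold):
    """Predict 'member of the training set' when the model's loss
    on an example falls below the threshold."""
    return losses < threshold

rng = np.random.default_rng(0)
# Simulated losses: training-set members tend to score lower.
member_losses = rng.exponential(scale=0.2, size=1000)
nonmember_losses = rng.exponential(scale=1.0, size=1000)

losses = np.concatenate([member_losses, nonmember_losses])
truth = np.concatenate([np.ones(1000), np.zeros(1000)]).astype(bool)
accuracy = (loss_threshold_attack(losses, 0.5) == truth).mean()
print(f"attack accuracy: {accuracy:.2f}")  # well above the 0.5 chance level
```

Defenses surveyed in this space (e.g. differentially private training) work by shrinking exactly this member/non-member loss gap.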