Quantifying and Enhancing the Security of Federated Learning
Federated learning is an emerging distributed learning paradigm that allows multiple users to collaboratively train a joint machine learning model without having to share their private data with any third party. Due to many of its attractive properties, federated learning has received significant attention from academia as well as industry and now powers major applications, e.g., Google's Gboard and Assistant, Apple's Siri, Owkin's health diagnostics, etc. However, federated learning is yet to see widespread adoption due to a number of challenges. One such challenge is its susceptibility to poisoning by malicious users who aim to manipulate the joint machine learning model.
In this work, we take significant steps towards addressing this challenge. We start by providing a systematization of poisoning adversaries in federated learning and use it to build adversaries of varying strengths, and to show that some adversaries common in the prior literature are not practically relevant. For the majority of this thesis, we focus on untargeted poisoning, as it can impact a much larger federated learning population than other types of poisoning, and because most prior poisoning defenses for federated learning aim to defend against untargeted poisoning.
Next, we introduce a general framework for designing strong untargeted poisoning attacks against various federated learning algorithms. Using our framework, we design state-of-the-art poisoning attacks and demonstrate that the theoretical guarantees and empirical claims of prior state-of-the-art federated learning poisoning defenses are brittle under the same strong (albeit theoretical) adversaries that these defenses aim to defend against. We also provide concrete lessons highlighting the shortcomings of prior defenses. Using these lessons, we design two novel defenses with strong theoretical guarantees and demonstrate their state-of-the-art performance in various adversarial settings.
Finally, for the first time, we thoroughly investigate the impact of poisoning in real-world federated learning settings and draw significant, and rather surprising, conclusions about the robustness of federated learning in practice. For instance, we show that, contrary to established belief, federated learning is highly robust in practice even when using simple, low-cost defenses. One of the major implications of our study is that, although interesting from a theoretical perspective, many of the strong adversaries, and hence the strong prior defenses, are of little use in practice.
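The tension between untargeted poisoning and robust aggregation described in this abstract can be illustrated with a minimal sketch. The client updates, the single-attacker scenario, and the coordinate-wise median rule below are illustrative stand-ins, not the thesis's actual attacks or defenses:

```python
def fedavg(updates):
    """Plain federated averaging: coordinate-wise mean of client updates."""
    n = len(updates)
    return [sum(u[i] for u in updates) / n for i in range(len(updates[0]))]

def median_agg(updates):
    """Coordinate-wise median: a classic robust aggregation rule."""
    def median(xs):
        xs = sorted(xs)
        m = len(xs) // 2
        return xs[m] if len(xs) % 2 else (xs[m - 1] + xs[m]) / 2
    return [median([u[i] for u in updates]) for i in range(len(updates[0]))]

# Nine benign clients report updates near the true gradient [1.0, -2.0];
# one malicious client reports a large update in the opposite direction.
benign = [[1.0, -2.0]] * 9
malicious = [[-100.0, 200.0]]
updates = benign + malicious

print(fedavg(updates))      # the mean is dragged far off by the single attacker
print(median_agg(updates))  # the median stays at the benign value [1.0, -2.0]
```

The sketch shows why a plain mean is fragile under even one unconstrained malicious client, and why robust statistics are the usual starting point for poisoning defenses.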
Membership Privacy for Machine Learning Models Through Knowledge Transfer
Large-capacity machine learning (ML) models are prone to membership inference
attacks (MIAs), which aim to infer whether the target sample is a member of the
target model's training dataset. The serious privacy concerns due to the
membership inference have motivated multiple defenses against MIAs, e.g.,
differential privacy and adversarial regularization. Unfortunately, these
defenses produce ML models with unacceptably low classification performance.
Our work proposes a new defense, called distillation for membership privacy
(DMP), against MIAs that preserves the utility of the resulting models
significantly better than prior defenses. DMP leverages knowledge distillation
to train ML models with membership privacy. We provide a novel criterion to
tune the data used for knowledge transfer in order to amplify the membership
privacy of DMP. Our extensive evaluation shows that DMP provides significantly
better tradeoffs between membership privacy and classification accuracies
compared to state-of-the-art MIA defenses. For instance, DMP achieves ~100%
accuracy improvement over adversarial regularization for DenseNet trained on
CIFAR100, for similar membership privacy (measured using MIA risk): when the
MIA risk is 53.7%, adversarially regularized DenseNet is 33.6% accurate, while
DMP-trained DenseNet is 65.3% accurate. (To appear in the 35th AAAI Conference on Artificial Intelligence, 2021.)
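The knowledge-transfer step at the core of distillation-based defenses like DMP can be sketched in a few lines. The logits, the temperature value, and the low-entropy selection rule below are illustrative assumptions for exposition, not DMP's actual criterion:

```python
import math

def softmax(logits, temperature=1.0):
    """Convert teacher logits into a soft probability distribution.
    Higher temperature spreads probability mass more evenly."""
    exps = [math.exp(l / temperature) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy of a distribution (in nats)."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# Teacher logits on three hypothetical unlabeled transfer candidates.
teacher_logits = [
    [6.0, 0.5, 0.5],   # teacher is confident
    [1.0, 0.9, 1.1],   # teacher is uncertain
    [4.0, 3.8, 0.2],   # teacher is torn between two classes
]

# Soft labels that a student model would be trained on instead of the
# private data's hard labels.
soft_labels = [softmax(l, temperature=4.0) for l in teacher_logits]

# One plausible tuning criterion: keep only candidates on which the
# teacher's prediction is low-entropy, i.e. confident.
confident = [l for l in teacher_logits if entropy(softmax(l)) < 0.5]
```

The privacy intuition is that the student never touches the private training set directly; it only sees the teacher's smoothed predictions on a separate transfer set.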
Quantifying Privacy Leakage in Graph Embedding
Graph embeddings have been proposed to map graph data to low dimensional
space for downstream processing (e.g., node classification or link prediction).
With the increasing collection of personal data, graph embeddings can be
trained on private and sensitive data. For the first time, we quantify the
privacy leakage in graph embeddings through three inference attacks targeting
Graph Neural Networks. We propose a membership inference attack to infer
whether a graph node corresponding to an individual user's data was a member of
the model's training set. We consider a blackbox setting where the adversary
exploits the output prediction scores, and a whitebox setting where the
adversary has also access to the released node embeddings. This attack provides
an accuracy of up to 28% (blackbox) and 36% (whitebox) beyond random guessing by
exploiting the distinguishable footprint between train and test data records
left by the graph embedding. We propose a Graph Reconstruction attack where the
adversary aims to reconstruct the target graph given the corresponding graph
embeddings. Here, the adversary can reconstruct the graph with more than 80%
accuracy and infer links between two nodes with around 30% more confidence than
a random guess. We then propose an attribute inference attack where the adversary
aims to infer a sensitive attribute. We show that graph embeddings are strongly
correlated with node attributes, allowing the adversary to infer sensitive
information (e.g., gender or location).
Application of Iron Complexes as Catalysts in C-O and C-C bond forming reactions
‘Green Chemistry’ is a philosophy that encourages chemists in research and industry to reduce toxic waste for cleaner and safer chemical production. Oxidation reactions are commonly used in industry and are traditionally achieved by methods that involve toxic metals (such as chromium) and solvents, generating considerable waste compared to the value of the final product. The Mukaiyama aldol reaction is another organic reaction that utilizes metals as Lewis acid catalysts. The principles of Green Chemistry suggest that chemical reactions should be performed with maximum efficiency and minimal side-product formation, accompanied by minimal use of toxic reagents. The overall purpose of the PhD project was to replace toxic metals in catalyst systems with the environmentally more benign iron. The research focused on synthesizing iron-based organometallic complexes that were fully characterized and then employed as catalysts, in contrast to the conventional in situ synthesis of catalytically active complexes (where the identification and modification of active sites is difficult due to lack of knowledge of the exact structure of the metal complex). Iron complexes of the general formula [BrCpFe(CO)(L)] (1), where L is a monodentate phosphoramidite, [Fe(L)2(OTf)2] (2), where L is a bidentate α-iminopyridine and OTf is triflate, and [Fe2(L)Cl6] (3), where L is a tridentate α-diiminopyridine, were synthesized and characterized by various instrumental methods such as multinuclear NMR, IR, mass spectrometry, UV-Vis spectroscopy, elemental analyses, and single-crystal X-ray diffraction. Complexes 1 and 2 were successfully employed as catalyst precursors in oxidation reactions of activated methylene groups and secondary alcohols to obtain the corresponding ketones in 21-91% yield. Complex 3 was successfully employed as a catalyst, after activation using AgSbF6, in Mukaiyama aldol reactions to give silyl-protected β-hydroxyketones in 43-91% yield.
GECKO: Reconciling Privacy, Accuracy and Efficiency in Embedded Deep Learning
Embedded systems demand on-device processing of data using Neural Networks (NNs) while conforming to memory, power, and computation constraints, leading to an efficiency-accuracy tradeoff. To bring NNs to edge devices, several optimizations, such as model compression through pruning, quantization, and off-the-shelf architectures with efficient designs, have been extensively adopted. When deployed in sensitive real-world applications, these models must resist inference attacks to protect the privacy of users' training data. However, resistance against inference attacks is not accounted for when designing NN models for IoT. In this work, we analyse the three-dimensional privacy-accuracy-efficiency tradeoff in NNs for IoT devices and propose the Gecko training methodology, in which we explicitly add resistance to private inferences as a design objective. We optimize for the inference-time memory, computation, and power constraints of embedded devices as criteria for designing the NN architecture while also preserving privacy. We choose quantization as the design choice for highly efficient and private models. This choice is driven by the observation that compressed models leak more information than baseline models, while off-the-shelf efficient architectures exhibit a poor efficiency-privacy tradeoff. We show that models trained using the Gecko methodology are comparable to prior defences against black-box membership attacks in terms of accuracy and privacy while additionally providing efficiency.
Towards Privacy Aware Deep Learning for Embedded Systems
Memorization of training data by deep neural networks enables an adversary to mount successful membership inference attacks. Here, an adversary with blackbox query access to the model can infer whether an individual's data record was part of the model's sensitive training data using only the output predictions. This violates data confidentiality, by inferring samples from proprietary training data, and the privacy of the individual whose sensitive record was used to train the model. This privacy threat is profound in commercial embedded systems with on-device processing. Addressing this problem requires neural networks to be inherently private by design while conforming to the memory, power, and computation constraints of embedded systems. This is lacking in the literature. We present the first work towards membership privacy by design in neural networks while reconciling privacy-accuracy-efficiency trade-offs for embedded systems. We conduct an extensive privacy-centred neural network design space exploration to understand the membership privacy risks of widely adopted state-of-the-art techniques: model compression via pruning, quantization, and off-the-shelf efficient architectures. We study the impact of model capacity on memorization of training data and show that compressed models (after retraining) leak more membership information than baseline uncompressed models, while off-the-shelf architectures do not satisfy all efficiency requirements. Based on these observations, we identify quantization as a potential design choice to address the three-dimensional trade-off. We propose the Gecko training methodology, in which we explicitly add resistance to membership inference attacks as a design objective along with the memory, computation, and power constraints of embedded devices.
We show that models trained using Gecko are comparable to prior defences against blackbox membership attacks in terms of accuracy and privacy while additionally providing efficiency. This enables Gecko models to be deployed on embedded systems while providing membership privacy.
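The quantization step that both Gecko abstracts single out can be sketched with simple uniform affine quantization. The weight values, bit width, and function names below are illustrative assumptions, not the papers' actual training procedure:

```python
def quantize(weights, bits=8):
    """Uniform affine quantization: map floats in [lo, hi]
    to integers in 0..2**bits - 1."""
    lo, hi = min(weights), max(weights)
    levels = 2 ** bits - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    q = [round((w - lo) / scale) for w in weights]
    return q, scale, lo

def dequantize(q, scale, lo):
    """Recover approximate float weights from the quantized integers."""
    return [x * scale + lo for x in q]

weights = [-0.51, 0.23, 0.07, 0.98, -0.14]   # illustrative float weights
q, scale, lo = quantize(weights)
recovered = dequantize(q, scale, lo)
max_err = max(abs(a - b) for a, b in zip(weights, recovered))
# Each 8-bit weight occupies 1 byte instead of 4, and the rounding
# error per weight is bounded by half a quantization step (scale / 2).
```

This captures the efficiency side of the trade-off (4x smaller weights, cheap integer arithmetic); the papers' contribution is showing how such a choice interacts with membership privacy and accuracy during training.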
Evaluation of the effects of passion fruit peel flour (Passiflora edulis fo. flavicarpa) on metabolic changes in HIV patients with lipodystrophy syndrome secondary to antiretroviral therapy
This study evaluated the effects of using passion fruit peel flour together with diet therapy counseling in 36 patients with HIV lipodystrophy who were in an ambulatory clinic in a university hospital. The patients were divided into two groups. One received 30g of passion fruit peel flour daily for 90 days plus diet therapy counseling. The other group received only diet therapy counseling. The metabolic changes were analyzed before and after the intervention, with a significance level predetermined at p≤0.05. The use of passion fruit peel flour was effective in reducing total cholesterol and triacylglycerides after 30 days. The concentrations of LDL-C decreased, while HDL-C increased, in the blood of lipodystrophy patients after 90 days of passion fruit peel flour treatment. No significant differences in food consumption were seen between groups. The use of 30g of passion fruit peel flour for 90 days together with diet therapy counseling was effective in improving plasma concentrations of total cholesterol, LDL-C, HDL-C, and triacylglycerides.