Generating Artificial Data for Private Deep Learning
In this paper, we propose generating artificial data that retain statistical
properties of real data as the means of providing privacy with respect to the
original dataset. We use a generative adversarial network to draw
privacy-preserving artificial data samples and derive an empirical method to
assess the risk of information disclosure in a differential-privacy-like way.
Our experiments show that we are able to generate artificial data of high
quality and successfully train and validate machine learning models on this
data while limiting potential privacy loss.
Comment: Privacy-Enhancing Artificial Intelligence and Language Technologies, AAAI Spring Symposium Series, 201
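The core mechanism here is standard GAN training with the twist that only generator samples are ever released. Below is a minimal, hypothetical PyTorch sketch of that pipeline on stand-in data; the architecture, dimensions, and hyperparameters are illustrative assumptions, not the paper's configuration, and the paper's disclosure-risk assessment is not reproduced.

```python
# Minimal sketch (not the authors' code): train a vanilla GAN on private
# records, then release only samples drawn from the generator. All sizes,
# architectures, and the stand-in "real" data are illustrative assumptions.
import torch
import torch.nn as nn

torch.manual_seed(0)
DIM, NOISE, BATCH = 8, 16, 64  # hypothetical feature/noise/batch dimensions

G = nn.Sequential(nn.Linear(NOISE, 64), nn.ReLU(), nn.Linear(64, DIM))
D = nn.Sequential(nn.Linear(DIM, 64), nn.ReLU(), nn.Linear(64, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

real_data = torch.randn(1024, DIM) * 2.0 + 1.0  # stand-in for the private set

for step in range(500):
    real = real_data[torch.randint(0, len(real_data), (BATCH,))]
    fake = G(torch.randn(BATCH, NOISE))

    # Discriminator step: separate real records from generated ones.
    loss_d = (bce(D(real), torch.ones(BATCH, 1))
              + bce(D(fake.detach()), torch.zeros(BATCH, 1)))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Generator step: make generated records indistinguishable from real.
    loss_g = bce(D(fake), torch.ones(BATCH, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()

# Only generator output is ever released; real_data stays private.
synthetic = G(torch.randn(1024, NOISE)).detach()
print("synthetic mean:", synthetic.mean(0))  # should approach real statistics
```

The privacy argument rests on downstream consumers only ever touching `synthetic`, never `real_data`; the paper additionally assesses disclosure risk empirically.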
Survey: Leakage and Privacy at Inference Time
Leakage of data from publicly available Machine Learning (ML) models is an
area of growing significance as commercial and government applications of ML
can draw on multiple sources of data, potentially including users' and clients'
sensitive data. We provide a comprehensive survey of contemporary advances on
several fronts, covering involuntary data leakage, which is natural to ML
models; potential malevolent leakage caused by privacy attacks; and
currently available defence mechanisms. We focus on inference-time leakage, as
the most likely scenario for publicly available models. We first discuss what
leakage is in the context of different data, tasks, and model architectures. We
then propose a taxonomy across involuntary and malevolent leakage and the
available defences, followed by currently available assessment metrics and
applications. We conclude with outstanding challenges and open questions,
outlining some promising directions for future research.
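To make the survey's notion of inference-time leakage concrete, the sketch below simulates the textbook confidence-thresholding membership inference attack, one instance of the malevolent leakage such surveys cover. The Dirichlet-sampled "softmax outputs" and the threshold are assumptions for illustration, not results from the survey.

```python
# Simulated confidence-thresholding membership inference attack.
import numpy as np

rng = np.random.default_rng(0)

# Models tend to be over-confident on training members: peaked softmax
# vectors for members, flatter ones for non-members (both simulated here).
members = rng.dirichlet(alpha=[10, 1, 1], size=500)
non_members = rng.dirichlet(alpha=[3, 2, 2], size=500)

threshold = 0.8  # arbitrary attack threshold on max softmax confidence
tpr = (members.max(axis=1) > threshold).mean()      # members correctly flagged
fpr = (non_members.max(axis=1) > threshold).mean()  # non-members wrongly flagged
print(f"attack TPR={tpr:.2f}, FPR={fpr:.2f}")  # any gap above chance = leakage
```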
Local and Central Differential Privacy for Robustness and Privacy in Federated Learning
Federated Learning (FL) allows multiple participants to train machine
learning models collaboratively by keeping their datasets local while only
exchanging model updates. Alas, this is not necessarily free from privacy and
robustness vulnerabilities, e.g., via membership, property, and backdoor
attacks. This paper investigates whether and to what extent one can use
Differential Privacy (DP) to protect both privacy and robustness in FL. To this
end, we present a first-of-its-kind evaluation of Local and Central
Differential Privacy (LDP/CDP) techniques in FL, assessing their feasibility
and effectiveness. Our experiments show that both DP variants do defend against
backdoor attacks, albeit with varying protection-utility trade-offs, and more
effectively than other robustness defenses. DP also mitigates
white-box membership inference attacks in FL, and our work is the first to show
it empirically. Neither LDP nor CDP, however, defend against property
inference. Overall, our work provides a comprehensive, re-usable measurement
methodology to quantify the trade-offs between robustness/privacy and utility
in differentially private FL.
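The LDP/CDP distinction the paper evaluates can be sketched in a few lines of federated averaging: under LDP each client clips and noises its own update before sending it, while under CDP a trusted server noises the clipped aggregate once. The clip norm, noise multiplier, and toy updates below are assumed values, not the paper's experimental setup.

```python
# Sketch of LDP vs. CDP mechanics in federated averaging (toy parameters).
import numpy as np

rng = np.random.default_rng(0)
C, SIGMA, N_CLIENTS, D = 1.0, 0.5, 10, 4  # hypothetical parameters

def clip(update, c=C):
    """Scale the update so its L2 norm is at most c."""
    return update * min(1.0, c / np.linalg.norm(update))

client_updates = [rng.normal(size=D) for _ in range(N_CLIENTS)]

# Local DP: every client perturbs its own clipped update, so not even the
# server observes a raw update.
ldp_agg = np.mean([clip(u) + rng.normal(scale=SIGMA * C, size=D)
                   for u in client_updates], axis=0)

# Central DP: clients send clipped updates; a trusted server noises the
# aggregate once. For the same noise multiplier, the noise reaching the
# averaged model is roughly N_CLIENTS times smaller, hence better utility.
cdp_agg = (np.mean([clip(u) for u in client_updates], axis=0)
           + rng.normal(scale=SIGMA * C / N_CLIENTS, size=D))

print("LDP aggregate:", ldp_agg)
print("CDP aggregate:", cdp_agg)
```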
Privacy in Practice: Private COVID-19 Detection in X-Ray Images
Machine learning (ML) can help fight the COVID-19 pandemic by enabling rapid
screening of large volumes of chest X-ray images. To perform such data analysis
while maintaining patient privacy, we create ML models that satisfy
Differential Privacy (DP). Previous works exploring private COVID-19 ML models
are in part based on small or skewed datasets, provide insufficient privacy
guarantees, and do not investigate practical privacy. In this work, we
therefore suggest several improvements to address these open gaps. We account
for inherent class imbalances in the data and evaluate the utility-privacy
trade-off more extensively and over stricter privacy budgets than in previous
work. Our evaluation is supported by empirically estimating practical privacy
leakage through actual attacks. Based on theory, the introduced DP should help
limit and mitigate information leakage threats posed by black-box Membership
Inference Attacks (MIAs). Our practical privacy analysis is the first to test
this hypothesis on the COVID-19 detection task. In addition, we re-examine
the evaluation on the MNIST database. Our results indicate that, given the
task-dependent threat from MIAs, DP does not always improve practical privacy,
as we show on the COVID-19 task. The results further suggest that with
increasing DP guarantees, empirical privacy leakage reaches an early plateau
and DP therefore appears to have a limited impact on MIA defense. Our findings
identify possibilities for better utility-privacy trade-offs, and we thus
believe that empirical attack-specific privacy estimation can play a vital role
in tuning for practical privacy.
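Empirical, attack-based privacy estimation of this kind is commonly operationalized by converting an attack's true/false positive rates into a lower bound on the effective epsilon, since any membership test against an (eps, delta)-DP mechanism must satisfy TPR <= e^eps * FPR + delta. The sketch below applies that standard auditing bound to fabricated attack numbers purely to illustrate the plateau behaviour described above; it is not the paper's data or exact method.

```python
# Standard DP-auditing bound (not the paper's exact method): observed attack
# rates yield an empirical lower bound on epsilon. All numbers are fabricated.
import math

def empirical_epsilon(tpr: float, fpr: float, delta: float = 0.0) -> float:
    if fpr == 0:
        return float("inf")
    return max(0.0, math.log(max(tpr - delta, 1e-12) / fpr))

# Hypothetical MIA results (TPR, FPR) at decreasing theoretical budgets:
# past some point, tightening eps no longer reduces observed attack success.
results = {10.0: (0.62, 0.40), 1.0: (0.55, 0.42), 0.1: (0.54, 0.42)}
for eps_theory, (tpr, fpr) in results.items():
    print(f"theoretical eps={eps_theory:>5}  "
          f"empirical lower bound={empirical_epsilon(tpr, fpr):.3f}")
```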