Privacy Accounting and Quality Control in the Sage Differentially Private ML Platform
Companies increasingly expose machine learning (ML) models trained over
sensitive user data to untrusted domains, such as end-user devices and
wide-access model stores. We present Sage, a differentially private (DP) ML
platform that bounds the cumulative leakage of training data through models.
Sage builds upon the rich literature on DP ML algorithms and contributes
pragmatic solutions to two of the most pressing systems challenges of global
DP: running out of privacy budget and the privacy-utility tradeoff. To address
the former, we develop block composition, a new privacy loss accounting method
that leverages the growing database regime of ML workloads to keep training
models endlessly on a sensitive data stream while enforcing a global DP
guarantee for the stream. To address the latter, we develop privacy-adaptive
training, a process that trains a model on growing amounts of data and/or with
increasing privacy parameters until, with high probability, the model meets
developer-configured quality criteria. Together, these two techniques illustrate how a systems focus on
characteristics of ML workloads enables pragmatic solutions that are not
apparent when one focuses on individual algorithms, as most DP ML literature
does.
Comment: Extended version of a paper presented at the 27th ACM Symposium on
Operating Systems Principles (SOSP '19).
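As a rough illustration of the privacy-adaptive training loop described in the abstract, the Python sketch below shows one way such a loop could be structured. The helper names (train_fn, eval_fn), the retry schedule, and the acceptance test are assumptions made for illustration, not Sage's actual API; in particular, a real implementation must also account for the privacy cost of the quality evaluation itself.

    # Hypothetical sketch of privacy-adaptive training (not Sage's actual API).
    # train_fn(data, epsilon) is assumed to be an (epsilon, delta)-DP training
    # routine; eval_fn(model) is assumed to return a DP confidence interval
    # (lower, upper) on the model's quality metric.
    def privacy_adaptive_training(data_blocks, train_fn, eval_fn, quality_target,
                                  epsilons=(0.1, 0.5, 1.0)):
        """Retrain on growing data and/or larger privacy budgets until the model
        meets the developer-configured quality target with high probability."""
        data = []
        for block, eps in zip(data_blocks, epsilons):
            data.extend(block)                   # grow the training set by one block
            model = train_fn(data, epsilon=eps)  # one DP training run
            lower, _ = eval_fn(model)            # high-probability lower bound on quality
            if lower >= quality_target:          # accept only if the target holds w.h.p.
                return model
        return None                              # target unattainable at these budgets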
Practical Privacy Filters and Odometers with Rényi Differential Privacy and Applications to Differentially Private Deep Learning
Differential Privacy (DP) is the leading approach to privacy preserving deep
learning. As such, there are multiple efforts to provide drop-in integration of
DP into popular frameworks. These efforts, which add noise to each gradient
computation to make it DP, rely on composition theorems to bound the total
privacy loss incurred over this sequence of DP computations.
However, existing composition theorems present a tension between efficiency
and flexibility. Most theorems require all computations in the sequence to have
a predefined DP parameter, called the privacy budget. This prevents the design
of training algorithms that adapt the privacy budget on the fly, or that
terminate early to reduce the total privacy loss. Alternatively, the few
existing composition results for adaptive privacy budgets provide complex
bounds on the privacy loss, with constants too large to be practical.
In this paper, we study DP composition under adaptive privacy budgets through
the lens of Rényi Differential Privacy, proving a simpler composition theorem
with smaller constants, making it practical enough to use in algorithm design.
We demonstrate two applications of this theorem for DP deep learning: adapting
the noise or batch size online to improve a model's accuracy within a fixed
total privacy loss, and stopping early when fine-tuning a model to reduce total
privacy loss.
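The Python sketch below illustrates the kind of bookkeeping such a result enables: per-step Rényi divergences are accumulated for a noise scale chosen on the fly, and training stops early once the converted (epsilon, delta) bound reaches the target. It uses the textbook Gaussian-mechanism RDP bound alpha / (2 * sigma^2) for a sensitivity-1 release (no subsampling amplification) and the classic RDP-to-(epsilon, delta) conversion; it is not the paper's filter or odometer construction.

    import math

    ORDERS = [1.5, 2, 4, 8, 16, 32, 64]

    def gaussian_rdp(sigma, alpha):
        """RDP of the Gaussian mechanism at order alpha, for sensitivity 1."""
        return alpha / (2 * sigma ** 2)

    def rdp_to_eps(rdp_by_order, delta):
        """Classic conversion from accumulated RDP to an (epsilon, delta) bound."""
        return min(rdp + math.log(1 / delta) / (alpha - 1)
                   for alpha, rdp in rdp_by_order.items())

    def train_with_adaptive_noise(max_steps, eps_budget, delta=1e-5):
        rdp = {alpha: 0.0 for alpha in ORDERS}
        for t in range(max_steps):
            sigma = 1.0 if t < max_steps // 2 else 2.0  # noise scale adapted on the fly
            # ... release a gradient perturbed with Gaussian noise of scale sigma ...
            for alpha in ORDERS:                        # account for this step at all orders
                rdp[alpha] += gaussian_rdp(sigma, alpha)
            if rdp_to_eps(rdp, delta) >= eps_budget:    # stop early once the budget is spent
                return t + 1
        return max_steps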
DP-Sync: Hiding Update Patterns in Secure Outsourced Databases with Differential Privacy
In this paper, we introduce a new type of leakage associated with modern
encrypted databases, called update pattern leakage. We formalize the
definition and security model of encrypted databases with DP update patterns,
and we propose DP-Sync, a framework that extends existing encrypted database
schemes to meet this model. DP-Sync guarantees that the entire data update
history over the outsourced data structure is protected by differential
privacy. This is achieved by imposing differentially private strategies that
dictate the data owner's synchronization of local data.
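To make the idea of a differentially private synchronization strategy concrete, here is a hedged Python sketch of one possible strategy (not necessarily one of DP-Sync's): at each round the owner uploads a Laplace-noised number of records, padding with dummy encrypted records, so the update volume the server observes is differentially private with respect to the true local insertions. Names and parameters are illustrative only, and repeated rounds would consume additional budget under composition.

    import random

    def laplace_noise(scale):
        """Sample Laplace(0, scale) as the difference of two exponentials."""
        return random.expovariate(1 / scale) - random.expovariate(1 / scale)

    def dp_flush_size(true_pending, epsilon, sensitivity=1):
        """Noisy count of records to upload this round (clipped at zero)."""
        return max(0, round(true_pending + laplace_noise(sensitivity / epsilon)))

    def sync_round(local_buffer, epsilon):
        """Return the batch of real + dummy encrypted records to upload."""
        k = dp_flush_size(len(local_buffer), epsilon)
        real = [local_buffer.pop(0) for _ in range(min(k, len(local_buffer)))]
        dummies = [b"DUMMY"] * (k - len(real))  # pad so the server sees exactly k records
        return real + dummies                   # unsent records stay buffered for later rounds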
Encode, Shuffle, Analyze Privacy Revisited: Formalizations and Empirical Evaluation
Recently, a number of approaches and techniques have been introduced for
reporting software statistics with strong privacy guarantees. These range from
abstract algorithms to comprehensive systems with varying assumptions and built
upon local differential privacy mechanisms and anonymity. Based on the
Encode-Shuffle-Analyze (ESA) framework, notable results formally clarified
large improvements in privacy guarantees without loss of utility by making
reports anonymous. However, these results either comprise systems with
seemingly disparate mechanisms and attack models, or formal statements with
little guidance to practitioners. Addressing this, we provide a formal
treatment and offer prescriptive guidelines for privacy-preserving reporting
with anonymity. We revisit the ESA framework with a simple, abstract model of
attackers as well as assumptions covering it and other proposed systems of
anonymity. In light of new formal privacy bounds, we examine the limitations of
sketch-based encodings and ESA mechanisms such as data-dependent crowds. We
also demonstrate how the ESA notion of fragmentation (reporting data aspects in
separate, unlinkable messages) improves privacy/utility tradeoffs both in terms
of local and central differential-privacy guarantees. Finally, to help
practitioners understand the applicability and limitations of
privacy-preserving reporting, we report on a large number of empirical
experiments. We use real-world datasets with heavy-tailed or near-flat
distributions, which pose the greatest difficulty for our techniques; in
particular, we focus on data drawn from images that can be easily visualized in
a way that highlights reconstruction errors. Showing the promise of the
approach, and of independent interest, we also report on experiments using
anonymous, privacy-preserving reporting to train high-accuracy deep neural
networks on standard tasks (MNIST and CIFAR-10).
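As a concrete, if simplified, picture of fragmentation, the Python sketch below splits a client's record into one message per attribute, applies local randomization to each, and submits the messages independently so that the shuffler can break any linkage between them. The binary randomized-response encoding and the attribute names are illustrative assumptions, not the encodings evaluated in the paper.

    import math, random

    def randomized_response(bit, epsilon):
        """Standard epsilon-LDP randomized response for a single bit."""
        p_keep = math.exp(epsilon) / (math.exp(epsilon) + 1)
        return bit if random.random() < p_keep else 1 - bit

    def fragment_report(record, epsilon_per_fragment):
        """Turn {attribute: bit} into separate, unlinkable single-attribute messages."""
        fragments = [(name, randomized_response(bit, epsilon_per_fragment))
                     for name, bit in record.items()]
        random.shuffle(fragments)   # submitted independently; the shuffler strips
        return fragments            # any metadata that could link them back together

    # Example: one client's report fragmented into three independent messages.
    messages = fragment_report({"crashed": 1, "new_user": 0, "beta_channel": 1}, 1.0)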