26 research outputs found

Differentially Private Mixture of Generative Neural Networks

Author: Acs Gergely
Castelluccia Claude
De Cristofaro Emiliano
Melis Luca
Publication date: 18/11/2017

Generative models are used in a wide range of applications building on large amounts of contextually rich information. Due to possible privacy violations of the individuals whose data is used to train these models, however, publishing or sharing generative models is not always viable. In this paper, we present a novel technique for privately releasing generative models and entire high-dimensional datasets produced by these models. We model the generator distribution of the training data with a mixture of k generative neural networks. These are trained together and collectively learn the generator distribution of a dataset. Data is divided into k clusters using a novel differentially private kernel k-means; each cluster is then given to a separate generative neural network, such as a Restricted Boltzmann Machine or a Variational Autoencoder, which is trained only on its own cluster using differentially private gradient descent. We evaluate our approach using the MNIST dataset, as well as call detail records and transit datasets, showing that it produces realistic synthetic samples, which can also be used to accurately compute an arbitrary number of counting queries.

Comment: A shorter version of this paper appeared at the 17th IEEE International Conference on Data Mining (ICDM 2017). This is the full version, published in IEEE Transactions on Knowledge and Data Engineering (TKDE).

arXiv.org e-Print Archive

Crossref

INRIA a CCSD electronic archive server

A Learning-based Declarative Privacy-Preserving Framework for Federated Data Management

Author: Ambrish Rajan Hari
Gautier Summer
Giriyan Dhanush
Guan Hong
Gupta Deepti
Lakamsani Harsha
Maslanka Saajan
Wang Yancheng
Xiao Chaowei
Yang Yingzhen
Zou Jia
Publication date: 22/01/2024

It is challenging to balance privacy and accuracy in federated query processing over multiple private data silos. In this work, we will demonstrate an end-to-end workflow for automating an emerging privacy-preserving technique that uses a deep learning model trained using the Differentially-Private Stochastic Gradient Descent (DP-SGD) algorithm to replace portions of actual data to answer a query. Our proposed novel declarative privacy-preserving workflow allows users to specify "what private information to protect" rather than "how to protect". Under the hood, the system automatically chooses query-model transformation plans as well as hyper-parameters. At the same time, the proposed workflow also allows human experts to review and tune the selected privacy-preserving mechanism for audit, compliance, and optimization purposes.
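
For readers unfamiliar with DP-SGD, the core update can be sketched as follows. This is a minimal illustration of the commonly described recipe (clip each per-example gradient, sum, add Gaussian noise, average), not the system described in the abstract. The function names, the parameters `max_norm` and `noise_multiplier`, and the toy gradients are all assumptions made for this example, and no privacy accounting is performed.

```python
import math
import random


def clip(grad, max_norm):
    """Scale a gradient vector so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def dp_sgd_step(params, per_example_grads, lr, max_norm, noise_multiplier, rng):
    """One illustrative DP-SGD update on a flat parameter list."""
    dim = len(params)
    total = [0.0] * dim
    # Clip every example's gradient before summing, so that a single
    # example can move the sum by at most max_norm in L2 norm.
    for grad in per_example_grads:
        clipped = clip(grad, max_norm)
        for i in range(dim):
            total[i] += clipped[i]
    # Gaussian noise with standard deviation proportional to the clipping bound.
    sigma = noise_multiplier * max_norm
    batch_size = len(per_example_grads)
    noisy_mean = [(total[i] + rng.gauss(0.0, sigma)) / batch_size for i in range(dim)]
    return [params[i] - lr * noisy_mean[i] for i in range(dim)]


# Hypothetical toy data: two per-example gradients for a two-parameter model.
rng = random.Random(0)
params = [0.0, 0.0]
grads = [[3.0, 4.0], [0.3, 0.4]]
new_params = dp_sgd_step(params, grads, lr=0.1, max_norm=1.0,
                         noise_multiplier=1.1, rng=rng)
```

In practice the hyper-parameters shown here (clipping bound, noise multiplier, learning rate) are exactly the kind of settings the abstract says the system chooses automatically; the values above are placeholders.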

arXiv.org e-Print Archive

Private Topic Modeling

Author: Chaudhuri K.
Foulds J.
Park M.
Welling M.
Publication venue: NIPS
Publication date: 01/01/2016

International Migration, Integration and Social Cohesion online publications

Building and evaluating privacy-preserving data processing systems

Author: Melis Luca
Publication venue: UCL (University College London)
Publication date: 28/08/2018

Large-scale data processing prompts a number of important challenges, including guaranteeing that collected or published data is not misused, preventing disclosure of sensitive information, and deploying privacy protection frameworks that support usable and scalable services. In this dissertation, we study and build systems geared for privacy-friendly data processing, enabling computational scenarios and applications where potentially sensitive data can be used to extract useful knowledge, and which would be impossible without such strong privacy guarantees. For instance, we show how to privately and efficiently aggregate data from many sources and large streams, and how to use the aggregates to extract useful statistics and train simple machine learning models. We also present a novel technique for privately releasing generative machine learning models and entire high-dimensional datasets produced by these models. Finally, we demonstrate that the data used by participants in training generative and collaborative learning models may be vulnerable to inference attacks and discuss possible mitigation strategies.

UCL Discovery

Noise-Aware Inference for Differential Privacy

Author: Bernstein Garrett
Publication venue: ScholarWorks@UMass Amherst
Publication date: 24/03/2020

Domains involving sensitive human data, such as health care, human mobility, and online activity, are becoming increasingly dependent upon machine learning algorithms. This leads to scenarios in which data owners wish to protect the privacy of individuals comprising the sensitive data, while at the same time data modelers wish to analyze and draw conclusions from the data. Thus there is a growing demand to develop effective private inference methods that can marry the needs of both parties. For this we turn to differential privacy, which provides a framework for executing algorithms in a private fashion by injecting specifically-designed randomization at various points in the process. The majority of existing work proceeds by ignoring the injected randomization, potentially leading to pathologies in algorithmic performance. There is, however, a small body of existing work that performs inference over the injected randomization in an attempt to design more principled algorithms. This thesis summarizes the subfield of noise-aware differentially private inference and contributes novel algorithms for important problems. Differential privacy literature provides a multitude of privacy mechanisms. We opt for sufficient statistics perturbation (SSP), in which sufficient statistics, quantities that capture all information about the model parameters, are corrupted with random noise and released to the public. This mechanism offers desirable efficiency properties in comparison to alternatives. In this thesis we develop methods in a principled manner that directly account for the injected noise in three settings: maximum likelihood estimation of undirected graphical models, Bayesian inference of exponential family models, and Bayesian inference of conditional regression models.
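
As a rough illustration of sufficient statistics perturbation, the sketch below releases the sufficient statistic of a Bernoulli model (the number of ones) with Laplace noise and then forms a naive plug-in estimate that treats the noisy count as exact. The Bernoulli setting, the function names, and the choice of the Laplace mechanism are assumptions for this example; the thesis concerns richer models and, unlike this naive estimate, performs inference that accounts for the noise.

```python
import random


def laplace(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def release_noisy_count(bits, epsilon, rng):
    """Release the number of ones in `bits` with Laplace noise.

    Replacing one record changes the count by at most 1, so the
    sensitivity is 1 and the noise scale is 1 / epsilon.
    """
    return sum(bits) + laplace(1.0 / epsilon, rng)


def naive_estimate(noisy_count, n):
    """Plug-in estimate that ignores the noise, clamped to [0, 1]."""
    return min(1.0, max(0.0, noisy_count / n))


# Hypothetical toy data set of 0/1 records.
rng = random.Random(0)
bits = [1, 0, 1, 1, 0, 1, 0, 1, 1, 1]
noisy = release_noisy_count(bits, epsilon=1.0, rng=rng)
p_hat = naive_estimate(noisy, len(bits))
```

The clamping in `naive_estimate` hints at the pathologies the abstract mentions: a noisy count can fall below 0 or above n, which a method that ignores the noise has to patch over after the fact.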

ScholarWorks@UMass Amherst

Secure and Private Federated Learning at Large Scale

Author: Stevens Timothy
Publication venue: UVM ScholarWorks
Publication date: 01/01/2022

We present novel techniques to advance the goal of secure and private machine learning. The widespread use of machine learning poses a serious privacy risk to the data used to train models. Data owners are forced to trust that aggregators will keep their data secure, and that released models will maintain their privacy. The works presented in this thesis strive to solve both problems through approaches based on secure multiparty computation and differential privacy. The novel FLDP protocol leverages the learning with errors (LWE) problem to mask model updates and implements an efficient secure aggregation protocol, which easily scales to large models. Continuing in the vein of scalable secure aggregation, the SHARD protocol utilizes a multi-layered secret sharing scheme to perform efficient secure aggregation on very large federations. Together, these protocols allow a federation to train models without requiring data owners to trust an aggregator. In order to ensure the privacy of trained models, we propose immediate sensitivity, a framework for reducing membership inference attack efficacy against neural networks. Immediate sensitivity uses a differential privacy inspired additive noise mechanism to privatize model updates during training. By determining the scale of the noise through the gradient of the gradient, immediate sensitivity trains more accurate models than the differentially private gradient clipping approach. Each of these works is supported by extensive experimental evaluation.
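
The idea behind secure aggregation via secret sharing can be sketched with plain single-layer additive sharing. This is not the FLDP or SHARD protocol from the abstract (which use LWE-based masking and multi-layered sharing respectively); it is a textbook simplification, and `MODULUS`, `share`, and `secure_sum` are names invented for this example. Inputs are assumed to be non-negative integers below the modulus, so real-valued model updates would first need a fixed-point encoding.

```python
import random

MODULUS = 2**61 - 1  # assumed modulus for this illustration


def share(value, n_shares, rng):
    """Split an integer into n_shares additive shares summing to value mod MODULUS."""
    shares = [rng.randrange(MODULUS) for _ in range(n_shares - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def secure_sum(values, rng):
    """Sum the clients' values using only per-client sums of received shares."""
    n = len(values)
    # Row i holds the shares produced by client i; column j is what client j receives.
    all_shares = [share(v, n, rng) for v in values]
    # Each client adds up the shares it received and publishes only that partial sum.
    partial_sums = [sum(all_shares[i][j] for i in range(n)) % MODULUS for j in range(n)]
    # The partial sums reveal the total but not any individual input.
    return sum(partial_sums) % MODULUS


# Hypothetical integer-encoded updates from three clients.
rng = random.Random(0)
total = secure_sum([3, 5, 7], rng)
```

A sketch like this assumes every client stays online and follows the protocol honestly; handling dropouts and scaling to large federations is where the protocols named in the abstract go further.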

UVM ScholarWorks