10,546 research outputs found
Near-Optimal Differentially Private Reinforcement Learning
Motivated by personalized healthcare and other applications involving
sensitive data, we study online exploration in reinforcement learning with
differential privacy (DP) constraints. Existing work on this problem
established that no-regret learning is possible under joint differential
privacy (JDP) and local differential privacy (LDP) but did not provide an
algorithm with optimal regret. We close this gap for the JDP case by designing
an $\varepsilon$-JDP algorithm with a regret of
$\widetilde{O}(\sqrt{SAH^2T}+S^2AH^3/\varepsilon)$, which matches the
information-theoretic lower bound of non-private learning for all choices of
$\varepsilon > S^{1.5}A^{0.5}H^2/\sqrt{T}$. In the above, $S$ and $A$ denote the
number of states and actions, $H$ denotes the planning horizon, and $T$ is the
number of steps. To the best of our knowledge, this is the first private RL
algorithm that achieves \emph{privacy for free} asymptotically as $T \rightarrow \infty$. Our techniques -- which could be of independent interest -- include
privately releasing Bernstein-type exploration bonuses and an improved method
for releasing visitation statistics. The same techniques also imply a slightly
improved regret bound for the LDP case. Comment: 38 pages
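As a rough illustration of one ingredient above, private release of visitation statistics can be sketched with the Laplace mechanism. This is a simplified stand-in, not the paper's actual release method; the function name and the non-negativity clamp are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def private_visitation_counts(counts, epsilon):
    """Release per-(state, action) visitation counts under epsilon-DP
    by adding Laplace noise with scale 1/epsilon (each episode changes
    any single count by at most 1)."""
    noise = rng.laplace(scale=1.0 / epsilon, size=counts.shape)
    # Clamp so the released counts stay non-negative.
    return np.maximum(counts + noise, 0.0)

# Toy example: counts over S=3 states and A=2 actions.
true_counts = np.array([[10.0, 4.0], [0.0, 7.0], [2.0, 1.0]])
noisy = private_visitation_counts(true_counts, epsilon=1.0)
```

The downstream algorithm would then build its (Bernstein-type) exploration bonuses from these noisy counts rather than the raw ones.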
Differentially Private Episodic Reinforcement Learning with Heavy-tailed Rewards
In this paper, we study the problem of (finite horizon tabular) Markov
decision processes (MDPs) with heavy-tailed rewards under the constraint of
differential privacy (DP). Compared with the previous studies for private
reinforcement learning that typically assume rewards are sampled from some
bounded or sub-Gaussian distributions to ensure DP, we consider the setting
where reward distributions have only finite $(1+v)$-th moments for some $v \in (0,1]$. By resorting to robust mean estimators for rewards, we first propose
two frameworks for heavy-tailed MDPs, i.e., one is for value iteration and
another is for policy optimization. Under each framework, we consider both
joint differential privacy (JDP) and local differential privacy (LDP) models.
Based on our frameworks, we provide regret upper bounds for both JDP and LDP
cases and show that the moment of distribution and privacy budget both have
significant impacts on regrets. Finally, we establish a lower bound of regret
minimization for heavy-tailed MDPs in JDP model by reducing it to the
instance-independent lower bound of heavy-tailed multi-armed bandits in DP
model. We also show the lower bound for the problem in LDP by adopting some
private minimax methods. Our results reveal that there are fundamental
differences between the problem of private RL with sub-Gaussian and that with
heavy-tailed rewards. Comment: ICML 2023. arXiv admin note: text overlap with arXiv:2009.09052 by
other authors
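To illustrate the kind of robust mean estimator the abstract refers to, here is a minimal sketch combining sample truncation with the Laplace mechanism. The threshold schedule, constants, and toy distribution are illustrative assumptions, not the paper's algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)

def private_truncated_mean(rewards, epsilon, v, c=1.0):
    """Truncated empirical mean for rewards with finite (1+v)-th
    moments, released under epsilon-DP. After clipping to [-tau, tau],
    changing one sample moves the mean by at most 2*tau/n, so Laplace
    noise with scale 2*tau/(n*epsilon) suffices for epsilon-DP."""
    n = len(rewards)
    tau = c * n ** (1.0 / (1.0 + v))        # hypothetical threshold schedule
    clipped = np.clip(rewards, -tau, tau)   # each sample now bounded
    return clipped.mean() + rng.laplace(scale=2.0 * tau / (n * epsilon))

# Toy heavy-tailed rewards: Pareto(1.4) has finite (1+v)-th moments
# only for v < 0.4, so we use v = 0.3 here.
samples = rng.pareto(1.4, size=1000)
est = private_truncated_mean(samples, epsilon=1.0, v=0.3)
```

Plugging such an estimator into value iteration or policy optimization is what the two frameworks in the abstract organize.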
Decentralized Differentially Private Without-Replacement Stochastic Gradient Descent
While machine learning has achieved remarkable results in a wide variety of
domains, the training of models often requires large datasets that may need to
be collected from different individuals. As sensitive information may be
contained in the individual's dataset, sharing training data may lead to severe
privacy concerns. Therefore, there is a compelling need to develop
privacy-aware machine learning methods, for which one effective approach is to
leverage the generic framework of differential privacy. Considering that
stochastic gradient descent (SGD) is one of the most widely adopted methods for
large-scale machine learning problems, two decentralized differentially private
SGD algorithms are proposed in this work. Particularly, we focus on SGD without
replacement due to its favorable structure for practical implementation. In
addition, both privacy and convergence analysis are provided for the proposed
algorithms. Finally, extensive experiments are performed to verify the
theoretical results and demonstrate the effectiveness of the proposed
algorithms.
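A minimal sketch of the without-replacement DP-SGD pattern described above, for a least-squares objective. The noise scale here is not calibrated to any specific privacy budget, and all names and constants are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def dp_sgd_without_replacement(X, y, epochs=1, lr=0.1, clip=1.0, sigma=0.5):
    """DP-SGD sketch: each epoch visits the data once in a random order
    (without replacement); each per-sample gradient is clipped to norm
    <= clip and perturbed with Gaussian noise before the update."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        for i in rng.permutation(n):                 # without-replacement pass
            g = (X[i] @ w - y[i]) * X[i]             # per-sample gradient
            g = g / max(1.0, np.linalg.norm(g) / clip)  # norm clipping
            g = g + rng.normal(scale=sigma * clip, size=d)
            w = w - lr * g
    return w

# Toy regression problem.
X = rng.normal(size=(40, 3))
y = X @ np.array([1.0, -2.0, 0.5])
w = dp_sgd_without_replacement(X, y)
```

The without-replacement (shuffled) order is what the abstract highlights as practically convenient, compared with sampling each minibatch with replacement.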
Deep Models Under the GAN: Information Leakage from Collaborative Deep Learning
Deep Learning has recently become hugely popular in machine learning,
providing significant improvements in classification accuracy in the presence
of highly-structured and large databases.
Researchers have also considered privacy implications of deep learning.
Models are typically trained in a centralized manner with all the data being
processed by the same training algorithm. If the data is a collection of users'
private data, including habits, personal pictures, geographical positions,
interests, and more, the centralized server will have access to sensitive
information that could potentially be mishandled. To tackle this problem,
collaborative deep learning models have recently been proposed where parties
locally train their deep learning structures and only share a subset of the
parameters in the attempt to keep their respective training sets private.
Parameters can also be obfuscated via differential privacy (DP) to make
information extraction even more challenging, as proposed by Shokri and
Shmatikov at CCS'15.
Unfortunately, we show that any privacy-preserving collaborative deep
learning is susceptible to a powerful attack that we devise in this paper. In
particular, we show that a distributed, federated, or decentralized deep
learning approach is fundamentally broken and does not protect the training
sets of honest participants. The attack we developed exploits the real-time
nature of the learning process that allows the adversary to train a Generative
Adversarial Network (GAN) that generates prototypical samples of the targeted
training set that was meant to be private (the samples generated by the GAN are
intended to come from the same distribution as the training data).
Interestingly, we show that record-level DP applied to the shared parameters of
the model, as suggested in previous work, is ineffective (i.e., record-level DP
is not designed to address our attack). Comment: ACM CCS'17, 16 pages, 18 figures
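The selective parameter sharing the abstract alludes to (in the style of Shokri and Shmatikov) can be sketched as follows; the sharing fraction, clipping bound, and noise scale are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def share_parameters(grad, fraction=0.1, epsilon=1.0, clip=0.001):
    """Selective sharing sketch: upload only the largest `fraction` of
    gradient coordinates, each clipped and perturbed with Laplace noise.
    Record-level noise of this kind still leaks class-level structure
    that an adversarial GAN can exploit over many rounds."""
    k = max(1, int(fraction * grad.size))
    idx = np.argsort(-np.abs(grad))[:k]          # largest-magnitude coords
    vals = np.clip(grad[idx], -clip, clip)       # bound each coordinate
    vals = vals + rng.laplace(scale=clip / epsilon, size=k)
    return idx, vals

# Toy gradient vector of 100 coordinates.
grad = rng.normal(size=100)
idx, vals = share_parameters(grad)
```

The point of the attack is that even these clipped, noised updates, observed round after round, reveal enough about a victim's class distribution for a GAN to reconstruct prototypical samples.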
A novel differentially private advising framework in cloud server environment
Due to the rapid development of the cloud computing environment, it is widely accepted that cloud servers are important for users to improve work efficiency. Users need to know servers' capabilities and make optimal decisions on selecting the best available servers for their tasks. We consider the process by which users learn servers' capabilities as a multiagent reinforcement learning process. The learning speed and efficiency in reinforcement learning can be improved by sharing the learning experience among learning agents, which is defined as advising. However, existing advising frameworks are limited by the requirement that, during advising, all learning agents in a reinforcement learning environment must have exactly the same actions. To address this limitation, this article proposes a novel differentially private advising framework for multiagent reinforcement learning. Our proposed approach significantly broadens the applicability of conventional advising frameworks to settings where agents differ in one action. The approach can also widen the applicable field of advising and speed up reinforcement learning by triggering more potential advising processes among agents with different actions.
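A toy sketch of differentially private advising between agents whose action sets overlap but are not identical, using the Laplace mechanism. All names, parameters, and the Q-value range are hypothetical, not the article's actual framework:

```python
import numpy as np

rng = np.random.default_rng(0)

def advise(q_row, epsilon=1.0, q_range=1.0):
    """The advisor releases its Q-values for one state with Laplace
    noise scaled to the assumed Q-value range, so the release satisfies
    epsilon-DP with respect to the advisor's experience."""
    return q_row + rng.laplace(scale=q_range / epsilon, size=q_row.shape)

# Advisor's Q-values for actions a0, a1, a2 in some state.
advisor_q = np.array([0.2, 0.9, 0.5])
noisy_advice = advise(advisor_q)

# The advisee has actions a0, a1 plus a different third action, so it
# keeps only the entries for the shared actions a0 and a1.
shared = noisy_advice[:2]
```

The noise is what lets an agent accept advice from a neighbor with a slightly different action set without directly exposing that neighbor's learning history.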