17,274 research outputs found

Differentially Private Continual Releases of Streaming Frequency Moment Estimations

Author: Epasto Alessandro
Mao Jieming
Medina Andres Munoz
Mirrokni Vahab
Vassilvitskii Sergei
Zhong Peilin
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 14th Innovations in Theoretical Computer Science Conference (ITCS 2023)
Publication date: 01/01/2023

The streaming model of computation is a popular approach for working with large-scale data. In this setting, there is a stream of items and the goal is to compute the desired quantities (usually data statistics) while making a single pass through the stream and using as little space as possible. Motivated by the importance of data privacy, we develop differentially private streaming algorithms under the continual release setting, where the union of outputs of the algorithm at every timestamp must be differentially private. Specifically, we study the fundamental ℓ_p (p ∈ [0,+∞)) frequency moment estimation problem under this setting, and give an ε-DP algorithm that achieves (1+η)-relative approximation (∀η ∈ (0,1)) with polylog(Tn) additive error and uses polylog(Tn)·max(1, n^{1-2/p}) space, where T is the length of the stream and n is the size of the universe of elements. Our space is near optimal up to poly-logarithmic factors even in the non-private setting. To obtain our results, we first reduce several primitives under the differentially private continual release model, such as counting distinct elements, heavy hitters and counting low frequency elements, to the simpler, counting/summing problems in the same setting. Based on these primitives, we develop a differentially private continual release level set estimation approach to address the ℓ_p frequency moment estimation problem. We also provide a simple extension of our results to the harder sliding window model, where the statistics must be maintained over the past W data items.
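To make the quantity in this abstract concrete, here is a minimal, non-private sketch of the frequency moment, assuming the usual definition F_p = sum over items of f_i^p (with F_0 counting distinct elements). The function names `frequency_moment` and `continual_moments` are illustrative and do not come from the paper; the sketch adds no noise and stores exact counts, so it has neither the privacy nor the space guarantees described above.

```python
from collections import Counter


def frequency_moment(stream, p):
    """Exact p-th frequency moment of a finished stream (no privacy)."""
    counts = Counter(stream)
    if p == 0:
        # By convention F_0 is the number of distinct elements.
        return len(counts)
    return sum(f ** p for f in counts.values())


def continual_moments(stream, p):
    """Exact moment after every prefix, i.e. one output per timestamp."""
    counts = Counter()
    total = 0
    outputs = []
    for item in stream:
        old = counts[item]
        counts[item] = old + 1
        if p == 0:
            # A new distinct element appears only on its first arrival.
            total += 1 if old == 0 else 0
        else:
            # Update the running sum by the change in this item's term.
            total += (old + 1) ** p - old ** p
        outputs.append(total)
    return outputs


if __name__ == "__main__":
    example = ["a", "b", "a", "c", "a"]
    print(frequency_moment(example, 2))
    print(continual_moments(example, 2))
```

In the continual release setting the whole list returned by `continual_moments` is what must be protected, not just its last entry.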

Dagstuhl Research Online Publication Server

Counting Distinct Elements in the Turnstile Model with Differential Privacy under Continual Observation

Author: Jain Palak
Kalemaj Iden
Raskhodnikova Sofya
Sivakumar Satchit
Smith Adam
Publication date: 30/10/2023

Privacy is a central challenge for systems that learn from sensitive data sets, especially when a system's outputs must be continuously updated to reflect changing data. We consider the achievable error for differentially private continual release of a basic statistic -- the number of distinct items -- in a stream where items may be both inserted and deleted (the turnstile model). With only insertions, existing algorithms have additive error just polylogarithmic in the length of the stream T. We uncover a much richer landscape in the turnstile model, even without considering memory restrictions. We show that every differentially private mechanism that handles insertions and deletions has worst-case additive error at least T^{1/4} even under a relatively weak, event-level privacy definition. Then, we identify a parameter of the input stream, its maximum flippancy, that is low for natural data streams and for which we give tight parameterized error guarantees. Specifically, the maximum flippancy is the largest number of times that the contribution of a single item to the distinct elements count changes over the course of the stream. We present an item-level differentially private mechanism that, for all turnstile streams with maximum flippancy w, continually outputs the number of distinct elements with an O(\sqrt{w} \cdot poly\log T) additive error, without requiring prior knowledge of w. We prove that this is the best achievable error bound that depends only on w, for a large range of values of w. When w is small, the error of our mechanism is similar to the polylogarithmic in T error in the insertion-only setting, bypassing the hardness in the turnstile model.
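The definition of maximum flippancy can be illustrated with a short, non-private sketch. It assumes a stream of `(item, delta)` pairs with `delta` equal to +1 (insert) or -1 (delete), and it assumes an item contributes to the distinct count exactly when its net count is positive; both the encoding and the name `max_flippancy` are illustrative choices, not taken from the paper.

```python
from collections import defaultdict


def max_flippancy(stream):
    """Largest number of times any single item's contribution to the
    distinct-elements count changes (0 -> 1 or 1 -> 0) over the stream."""
    net_count = defaultdict(int)
    flips = defaultdict(int)
    for item, delta in stream:
        present_before = net_count[item] > 0
        net_count[item] += delta
        present_after = net_count[item] > 0
        if present_before != present_after:
            flips[item] += 1
    # An empty stream has no flips at all.
    return max(flips.values(), default=0)


if __name__ == "__main__":
    turnstile = [("a", +1), ("a", -1), ("a", +1), ("b", +1)]
    print(max_flippancy(turnstile))
```

An insertion-only stream has maximum flippancy at most 1 under this encoding, which matches the abstract's remark that small w recovers the insertion-only regime.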

arXiv.org e-Print Archive

Improved differential privacy for SGD via optimal private linear operators on adaptive streams

Author: Denisov Sergei
McMahan Brendan
Rush Keith
Smith Adam
Thakurta Abhradeep
Publication date: 08/06/2022

CCF-1763786 - National Science Foundation; Apple, Inc. https://arxiv.org/abs/2202.0831

arXiv.org e-Print Archive

Boston University Institutional Repository (OpenBU)

Private Matchings and Allocations

Author: Hsu Justin
Huang Zhiyi
Roth Aaron
Roughgarden Tim
Wu Zhiwei Steven
Publication venue: Society for Industrial & Applied Mathematics (SIAM)
Publication date: 01/01/2016

We consider a private variant of the classical allocation problem: given k goods and n agents with individual, private valuation functions over bundles of goods, how can we partition the goods amongst the agents to maximize social welfare? An important special case is when each agent desires at most one good, and specifies her (private) value for each good: in this case, the problem is exactly the maximum-weight matching problem in a bipartite graph. Private matching and allocation problems have not been considered in the differential privacy literature, and for good reason: they are plainly impossible to solve under differential privacy. Informally, the allocation must match agents to their preferred goods in order to maximize social welfare, but this preference is exactly what agents wish to hide. Therefore, we consider the problem under the relaxed constraint of joint differential privacy: for any agent i, no coalition of agents excluding i should be able to learn about the valuation function of agent i. In this setting, the full allocation is no longer published; instead, each agent is told what good to get. We first show that with a small number of identical copies of each good, it is possible to efficiently and accurately solve the maximum weight matching problem while guaranteeing joint differential privacy. We then consider the more general allocation problem, when bidder valuations satisfy the gross substitutes condition. Finally, we prove that the allocation problem cannot be solved to non-trivial accuracy under joint differential privacy without requiring multiple copies of each type of good. Comment: Journal version published in SIAM Journal on Computation; an extended abstract appeared in STOC 201
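The special case in which each agent wants at most one good can be illustrated with a brute-force, non-private sketch of maximum-weight bipartite matching. It assumes non-negative values and no more agents than goods, so every agent receives one good; the name `max_welfare_matching` and the input layout `values[agent][good]` are illustrative. This is not the paper's algorithm and offers no privacy guarantee.

```python
from itertools import permutations


def max_welfare_matching(values):
    """Try every assignment of distinct goods to agents and keep the best.

    values[i][g] is agent i's value for good g (assumed non-negative).
    Returns (best_welfare, assignment) where assignment[i] is agent i's good.
    """
    n_agents = len(values)
    n_goods = len(values[0]) if values else 0
    if n_agents > n_goods:
        raise ValueError("this sketch assumes at most as many agents as goods")
    best_welfare, best_assignment = 0, ()
    for assignment in permutations(range(n_goods), n_agents):
        welfare = sum(values[i][g] for i, g in enumerate(assignment))
        if welfare > best_welfare:
            best_welfare, best_assignment = welfare, assignment
    return best_welfare, best_assignment


if __name__ == "__main__":
    print(max_welfare_matching([[5, 4], [6, 1]]))
```

Publishing the full assignment returned here would reveal each agent's preferred good, which is the obstacle the abstract describes; under joint differential privacy each agent would instead learn only her own entry.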

arXiv.org e-Print Archive

HKU Scholars Hub

Continuous Release of Data Streams under both Centralized and Local Differential Privacy

Author: Chen Joann Qiongna
Cheng Yueqiang
Jha Somesh
Li Ninghui
Li Zhou
Su Dong
Wang Tianhao
Zhang Zhikun
Publication date: 24/05/2020

In this paper, we study the problem of publishing a stream of real-valued data satisfying differential privacy (DP). One major challenge is that the maximal possible value can be quite large; thus it is necessary to estimate a threshold so that numbers above it are truncated, reducing the amount of noise that must be added to all the data. The estimation must be done based on the data in a private fashion. We develop such a method that uses the Exponential Mechanism with a quality function that approximates the utility goal well while maintaining a low sensitivity. Given the threshold, we then propose a novel online hierarchical method and several post-processing techniques. Building on these ideas, we formalize the steps into a framework for private publishing of stream data. Our framework consists of three components: a threshold optimizer that privately estimates the threshold, a perturber that adds calibrated noise to the stream, and a smoother that improves the result using post-processing. Within our framework, we design an algorithm satisfying the more stringent setting of DP called local DP (LDP). To our knowledge, this is the first LDP algorithm for publishing streaming data. Using four real-world datasets, we demonstrate that our mechanism outperforms the state-of-the-art by a factor of 6-10 orders of magnitude in terms of utility (measured by the mean squared error of answering a random range query).
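The truncate, perturb, and smooth pipeline can be sketched in a few lines. This is a simplified stand-in under stated assumptions, not the paper's method: the threshold is passed in rather than chosen by the Exponential Mechanism, the perturber adds independent Laplace noise of scale threshold/epsilon to each value (covering a single value of a stream with values in [0, threshold]) instead of using the hierarchical method, and the smoother is a plain trailing moving average. All function names are hypothetical.

```python
import random


def clip_stream(values, threshold):
    """Truncate every value into [0, threshold] to bound sensitivity."""
    return [min(max(v, 0.0), threshold) for v in values]


def laplace_noise(scale, rng):
    """Laplace(0, scale) sample as a difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb(clipped, threshold, epsilon, rng):
    """Add Laplace noise calibrated to the clipped range of one value."""
    scale = threshold / epsilon
    return [v + laplace_noise(scale, rng) for v in clipped]


def smooth(values, window):
    """Trailing moving average; post-processing does not affect privacy."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    raw = [1.0, 5.0, 10.0, 2.0, 3.0]
    clipped = clip_stream(raw, threshold=4.0)
    noisy = perturb(clipped, threshold=4.0, epsilon=1.0, rng=rng)
    print(smooth(noisy, window=3))
```

A lower threshold means less noise per value but more truncation bias, which is the trade-off the threshold optimizer in the abstract is meant to balance.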

arXiv.org e-Print Archive

CISPA – Helmholtz-Zentrum für Informationssicherheit