Empirical Risk Minimization in the Non-interactive Local Model of Differential Privacy
In this paper, we study the Empirical Risk Minimization (ERM) problem in the
non-interactive Local Differential Privacy (LDP) model. Previous research on
this problem \citep{smith2017interaction} indicates that, for general loss
functions, the sample complexity needed to achieve a given error must depend
exponentially on the dimensionality. In this paper, we make two
attempts to resolve this issue by investigating conditions on the loss
functions that allow us to remove such a limit. In our first attempt, we show
that if the loss function is sufficiently smooth, then by using Bernstein
polynomial approximation we can avoid the exponential dependency in the error
term. We then propose player-efficient algorithms with low communication
complexity and low computation cost for each player. The error bound of these
algorithms is asymptotically the same as the original one. With
some additional assumptions, we also give an algorithm which is more efficient
for the server. In our second attempt, we show that for any Lipschitz
generalized linear convex loss function, there is an LDP algorithm whose sample
complexity for achieving a given error is only linear in the dimensionality.
Our results use a polynomial approximation of the inner product as a key
technique. Finally, motivated by the idea of using polynomial
approximation and based on different types of polynomial approximations, we
propose (efficient) non-interactive locally differentially private algorithms
for learning the set of k-way marginal queries and the set of smooth queries.

Comment: Appeared in the Journal of Machine Learning Research; the journal
version of arXiv:1802.04085, fixing a bug in arXiv:1812.0682
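The Bernstein-approximation idea above can be illustrated with a minimal sketch (this is not the paper's private algorithm; `bernstein_approx` is a name chosen here for illustration): a degree-k Bernstein polynomial depends only on the k+1 point evaluations f(i/k), so a smooth loss can be reconstructed from a small number of coefficients, which is what removes the exponential dependency.

```python
import math

def bernstein_approx(f, k):
    """Degree-k Bernstein polynomial approximation of f on [0, 1]:
    B_k(f; x) = sum_i f(i/k) * C(k, i) * x^i * (1-x)^(k-i).
    Only the k+1 values f(i/k) are needed, so in an LDP setting players
    would only need to contribute noisy estimates of these few values."""
    def B(x):
        return sum(
            f(i / k) * math.comb(k, i) * x**i * (1 - x) ** (k - i)
            for i in range(k + 1)
        )
    return B

# For a smooth function the approximation error shrinks as the degree grows.
f = lambda x: math.log(1 + math.exp(x))   # softplus, a smooth surrogate loss
approx = bernstein_approx(f, 50)
max_err = max(abs(f(x / 100) - approx(x / 100)) for x in range(101))
```

For a twice-differentiable function the uniform error of the degree-k Bernstein polynomial decays on the order of 1/k, so a modest degree already gives a close fit here.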
OpBoost: A Vertical Federated Tree Boosting Framework Based on Order-Preserving Desensitization
Vertical Federated Learning (FL) is a new paradigm that enables users with
non-overlapping attributes of the same data samples to jointly train a model
without directly sharing the raw data. Nevertheless, recent works show that
this alone is not sufficient to prevent privacy leakage from the training
process or the trained model. This paper focuses on privacy-preserving tree
boosting algorithms under vertical FL. Existing solutions based on
cryptography involve heavy computation and communication overhead and are
vulnerable to inference attacks. Although the solution based on Local
Differential Privacy (LDP) addresses the above problems, it leads to low
accuracy of the trained model.
This paper explores how to improve the accuracy of widely deployed tree
boosting algorithms while satisfying differential privacy under vertical FL.
Specifically, we introduce a framework called OpBoost. Three order-preserving
desensitization algorithms satisfying a variant of LDP called distance-based
LDP (dLDP) are designed to desensitize the training data. In particular, we
optimize the dLDP definition and study efficient sampling distributions to
further improve the accuracy and efficiency of the proposed algorithms. The
proposed algorithms provide a trade-off between the privacy of pairs with large
distance and the utility of desensitized values. Comprehensive evaluations show
that OpBoost achieves better prediction accuracy than existing LDP approaches
under reasonable settings. Our code is open source.
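A minimal sketch of a distance-based mechanism in the spirit of dLDP (an illustration under assumed definitions, not OpBoost's actual desensitization algorithm): two-sided geometric noise makes inputs at distance d indistinguishable up to a factor exp(eps * d), while values far apart usually keep their relative order, which is exactly the privacy/utility trade-off described above.

```python
import math, random

def desensitize(value, eps):
    """Perturb an integer with two-sided geometric (discrete Laplace) noise:
    P[output = value + k] is proportional to exp(-eps * |k|).
    For inputs x, x' the output distributions differ by at most a factor
    exp(eps * |x - x'|) -- a distance-based (dLDP-style) guarantee: nearby
    values are well hidden, while distant pairs mostly keep their order."""
    q = math.exp(-eps)
    # The difference of two i.i.d. geometric variables is discrete Laplace.
    g = lambda: int(math.log(1 - random.random()) / math.log(q))
    return value + g() - g()
```

The output is unbiased (the noise has mean zero), and the chance of an order flip between two values decays exponentially in their distance, so rank-based learners such as boosted trees see little distortion for well-separated values.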
Intertwining Order Preserving Encryption and Differential Privacy
Ciphertexts of an order-preserving encryption (OPE) scheme preserve the order
of their corresponding plaintexts. However, OPEs are vulnerable to inference
attacks that exploit this preserved order. At the other end, differential
privacy (DP) has become the de facto standard for achieving data privacy. One
of the most attractive properties of DP is that any post-processing (inferential)
computation performed on the noisy output of a DP algorithm does not degrade
its privacy guarantee. In this paper, we intertwine the two approaches and
propose a novel differentially private order preserving encryption scheme,
OP. Under OP, the leakage of order from the ciphertexts is
differentially private. As a result, at the very least, OP ensures a
formal guarantee (specifically, a relaxed DP guarantee) even in the face of
inference attacks. To the best of our knowledge, this is the first work to
intertwine DP with a property-preserving encryption scheme. We demonstrate
OP's practical utility in answering range queries via extensive
empirical evaluation on four real-world datasets. For instance, OP misses only
a small fraction of the correct records on average, even on a large dataset
with a large attribute domain.
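The core intuition, that DP-noised order stays private under post-processing, can be sketched as follows (a hypothetical illustration, not the paper's actual construction; `noisy_rank_encode` and `laplace_noise` are names invented here): if ciphertext order is derived from noised values, then by DP's post-processing property the leaked order inherits the guarantee.

```python
import math, random

def laplace_noise(scale):
    """Laplace(0, scale) sample as the difference of two exponentials."""
    lam = 1.0 / scale
    return random.expovariate(lam) - random.expovariate(lam)

def noisy_rank_encode(values, eps):
    """Replace each record's value by the rank of its Laplace-noised value
    (noise scale 1/eps, calibrated for unit-distance neighbors).  The
    leaked order is a function of the noisy values only, so it carries
    the same (relaxed) DP guarantee by post-processing."""
    noisy = sorted((v + laplace_noise(1.0 / eps), i) for i, v in enumerate(values))
    ranks = [0] * len(values)
    for r, (_, order_i) in enumerate(noisy):
        ranks[order_i] = r
    return ranks
```

Range queries over such ranks occasionally miss records whose noise pushed them across a boundary, which is the source of the small miss rate reported above.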
On the Risks of Collecting Multidimensional Data Under Local Differential Privacy
The private collection of multiple statistics from a population is a
fundamental statistical problem. One possible approach to realize this is to
rely on the local model of differential privacy (LDP). Numerous LDP protocols
have been developed for the task of frequency estimation of single and multiple
attributes. These studies mainly focused on improving the utility of the
algorithms to ensure the server performs the estimations accurately. In this
paper, we investigate privacy threats (re-identification and attribute
inference attacks) against LDP protocols for multidimensional data, focusing on
two state-of-the-art solutions for frequency estimation of multiple attributes.
To broaden the scope of our study, we have also experimentally assessed five
widely used LDP protocols, namely, generalized randomized response, optimal
local hashing, subset selection, RAPPOR, and optimal unary encoding. Finally,
we also propose a countermeasure that improves both utility and robustness
against the identified threats. Our contributions can help practitioners aiming
to collect users' statistics privately to decide which LDP mechanism best fits
their needs.

Comment: Accepted at VLDB 202
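As a concrete reference point, generalized randomized response, the first of the protocols evaluated above, can be sketched in its standard textbook form (assumed here rather than taken from the paper):

```python
import math, random

def grr_perturb(value, domain, eps):
    """Generalized randomized response: report the true value with
    probability p = e^eps / (e^eps + k - 1), otherwise a uniformly random
    *other* value from the domain.  Each report satisfies eps-LDP."""
    k = len(domain)
    p = math.exp(eps) / (math.exp(eps) + k - 1)
    if random.random() < p:
        return value
    return random.choice([v for v in domain if v != value])

def grr_estimate(reports, domain, eps):
    """Debiased frequency estimates from the perturbed reports."""
    k, n = len(domain), len(reports)
    p = math.exp(eps) / (math.exp(eps) + k - 1)
    q = (1 - p) / (k - 1)
    return {v: (reports.count(v) / n - q) / (p - q) for v in domain}
```

The server sees only perturbed reports, yet the debiasing step recovers population frequencies; the re-identification risk studied in the paper stems from the fact that each individual report still correlates with its true value.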