42 research outputs found

    Non-interactive (t, n)-Incidence Counting from Differentially Private Indicator Vectors

    We present a novel non-interactive (t,n)-incidence count estimation for indicator vectors ensuring differential privacy. Given one or two differentially private indicator vectors, estimating the distinct count of elements in each, as well as their intersection cardinality (equivalently, their inner product), has been studied in the literature, along with extensions for estimating the cardinality of the set intersection when the elements are hashed prior to insertion. The core contribution behind all these studies was to address the problem of estimating the Hamming weight (the number of bits set to one) of a bit vector from its differentially private version and, in the case of inner product and set intersection, estimating the number of positions that are jointly set to one in both bit vectors. We develop the most general case: estimating the number of positions that are set to one in exactly t out of n bit vectors (denoted the (t,n)-incidence count), given access only to the differentially private versions of those bit vectors. This means that if each bit vector belongs to a different owner, each owner can locally sanitize their bit vector prior to sharing it, hence the non-interactive nature of our algorithm. Our main contribution is a novel algorithm that simultaneously estimates the (t,n)-incidence counts for all t in {0,...,n}. We provide upper and lower bounds on the estimation error. Our lower bound is achieved by generalizing the limits of two-party differential privacy to n-party differential privacy, a contribution of independent interest. In particular, we prove a lower bound on the additive error that must be incurred by any n-wise inner product of n mutually differentially private bit vectors. Our results are very general and are not limited to differentially private bit vectors: they should apply to a large class of bit-vector sanitization mechanisms that flip each bit with a constant probability. Potential applications of our technique include physical mobility analytics, call-detail-record analysis, and similarity metrics computation.
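As a concrete illustration of the bit-flipping sanitization model this abstract builds on (a minimal sketch, not the paper's algorithm), the following shows randomized response on a bit vector together with the standard unbiased inversion for recovering its Hamming weight; the flip probability `p` is an assumed free parameter here, not a value from the paper.

```python
import random

def sanitize(bits, p):
    """Flip each bit independently with probability p (randomized response)."""
    return [b ^ (random.random() < p) for b in bits]

def estimate_hamming_weight(sanitized, p):
    """Unbiased estimate of the original Hamming weight.

    If w of the m bits were set, the expected observed weight is
    w*(1-p) + (m-w)*p, so invert: w_hat = (obs - m*p) / (1 - 2*p).
    """
    m = len(sanitized)
    obs = sum(sanitized)
    return (obs - m * p) / (1 - 2 * p)
```

The same inversion idea underlies the (t,n)-incidence estimator, which must additionally disentangle the joint flip noise across n independently sanitized vectors.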

    Towards an in-depth understanding of privacy parameters for randomized sanitization mechanisms

    Differential privacy, along with other close notions such as d_χ-privacy, is at the heart of the privacy framework when considering the use of randomization to ensure data privacy. Such a guarantee is always subject to a trade-off between the privacy level and the accuracy of the result. While the privacy parameter of a differentially private algorithm controls this trade-off, choosing a meaningful value for this numerical parameter is often a hard task. Only a few works have tackled this issue, and the present paper's goal is to continue this effort in two ways. First, we propose a generic framework to decide whether a privacy parameter value is sufficient to prevent some pre-determined and well-understood privacy risks. Second, we instantiate our framework on mobility data from real-life datasets, and show some insightful features necessary for practical applications of randomized sanitization mechanisms. In our framework, we model scenarios where an attacker's goal is to de-sanitize data previously sanitized in the sense of d_χ-privacy, a privacy guarantee close to that of differential privacy. To each attack is associated a meaningful risk of data disclosure, and the level of success of the attack suggests a relevant value for the corresponding privacy parameter.
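For mobility data, the canonical d_χ-private mechanism when d_χ is the Euclidean distance is the planar Laplace mechanism of geo-indistinguishability. The sketch below (an assumption-laden illustration, not this paper's framework) samples a sanitized location; it relies on the fact that the planar Laplace density ∝ exp(-ε·d) makes the radius Gamma(shape=2, scale=1/ε) distributed and the angle uniform.

```python
import math
import random

def planar_laplace(x, y, eps):
    """Sample a sanitized location around (x, y) under eps·d_Euclidean privacy.

    In polar coordinates the planar Laplace density factors into a uniform
    angle and a radius with density ~ r * exp(-eps * r), i.e. Gamma(2, 1/eps).
    """
    theta = random.uniform(0.0, 2.0 * math.pi)
    r = random.gammavariate(2, 1.0 / eps)
    return x + r * math.cos(theta), y + r * math.sin(theta)
```

The expected displacement is 2/ε, which is exactly the kind of accuracy-vs-privacy quantity an attacker-based calibration of ε has to weigh.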

    Privacy-Preserving t-Incidence for WiFi-based Mobility Analytics

    Physical mobility analytics have gained attention lately. As people become more equipped with ubiquitous wireless-communication-enabled mobile appliances, they tend to leave signatures of their presence wherever they go. One particular example is Wi-Fi-enabled devices, which continuously send packets (called “probe requests”) to access points around them even if no connection is established. Aggregating a list of such probe requests over a number of geographically distributed monitoring nodes gives rise to a rich set of physical mobility analytics, such as visitor density at rush hours and the most frequently taken routes. However, the privacy of individual users is a grave concern. To address this concern we propose to implement physical mobility analytics using a collection of privacy-preserving primitives of set operations. The sets are the MAC addresses of the devices observed by one monitoring node. There is at least one set per monitoring node; a monitoring node may have more than one set if the MAC addresses are split according to the time of reception. The primitives we propose are the t-incidences of these sets. We present an ε-differentially pan-private algorithm to estimate the t-incidence of n sets, up to multiplicative error O(α), given three (ε/3)-differentially pan-private Bloom filters for each of those sets.
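To make the building block concrete, here is a minimal sketch (assumed parameters, not the paper's construction) of a Bloom filter over observed MAC addresses that is then sanitized by flipping each bit with probability 1/(1 + e^(ε/k)), the BLIP-style rule that yields ε-differential privacy for a filter using k hash functions.

```python
import hashlib
import math
import random

def bloom_insert(filter_bits, item, k):
    """Insert an item (e.g. a MAC address) into a Bloom filter using k hashes."""
    m = len(filter_bits)
    for i in range(k):
        h = hashlib.sha256(f"{i}:{item}".encode()).digest()
        filter_bits[int.from_bytes(h[:8], "big") % m] = 1

def blip_sanitize(filter_bits, eps, k):
    """Flip each bit with p = 1 / (1 + e^(eps/k)) before publishing."""
    p = 1.0 / (1.0 + math.exp(eps / k))
    return [b ^ (random.random() < p) for b in filter_bits]
```

Each monitoring node would publish only the sanitized filter; the t-incidence estimator then works on these noisy filters alone.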

    GLOVE: towards privacy-preserving publishing of record-level-truthful mobile phone trajectories

    Datasets of mobile phone trajectories collected by network operators offer an unprecedented opportunity to discover new knowledge from the activity of large populations of millions of individuals. However, publishing such trajectories also raises significant privacy concerns, as they contain personal data in the form of individual movement patterns. Privacy risks induce network operators to enforce restrictive confidentiality agreements on the rare occasions when they grant access to collected trajectories, whereas a less constrained circulation of these data would fuel research and enable reproducibility in many disciplines. In this work, we contribute a building block toward the design of privacy-preserving datasets of mobile phone trajectories that are truthful at the record level. We present GLOVE, an algorithm that implements k-anonymity, hence solving the crucial unicity problem that affects this type of data while ensuring that the anonymized trajectories correspond to real-life users. GLOVE builds on original insights about the root causes behind the undesirable unicity of mobile phone trajectories, and leverages generalization and suppression to remove them. Proof-of-concept validations with large-scale real-world datasets demonstrate that the approach adopted by GLOVE preserves a substantial level of accuracy in the data, higher than that granted by previous methodologies. This work was supported by the Atracción de Talento Investigador program of the Comunidad de Madrid under Grant No. 2019-T1/TIC-16037 NetSense.
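The generalization-and-suppression idea can be illustrated with a toy sketch (a simplified stand-in with assumed grid and time-slot sizes, not GLOVE's per-trajectory generalization search): coarsen every spatiotemporal sample to a grid cell and time slot, then suppress any generalized trajectory shared by fewer than k users.

```python
from collections import Counter

def generalize(traj, cell, slot):
    """Coarsen each (t, x, y) sample to a (time slot, grid cell) tuple."""
    return tuple((t // slot, x // cell, y // cell) for t, x, y in traj)

def k_anonymize(trajs, k, cell=1000, slot=3600):
    """Generalize all trajectories, then suppress those whose generalized
    form is shared by fewer than k users."""
    gen = [generalize(t, cell, slot) for t in trajs]
    counts = Counter(gen)
    return [g for g in gen if counts[g] >= k]
```

GLOVE instead picks generalization levels per pair of trajectories so as to minimize the accuracy loss needed to reach k-anonymity, rather than applying one fixed grid to everyone.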

    Building and evaluating privacy-preserving data processing systems

    Large-scale data processing prompts a number of important challenges, including guaranteeing that collected or published data is not misused, preventing disclosure of sensitive information, and deploying privacy protection frameworks that support usable and scalable services. In this dissertation, we study and build systems geared toward privacy-friendly data processing, enabling computational scenarios and applications where potentially sensitive data can be used to extract useful knowledge, and which would be impossible without such strong privacy guarantees. For instance, we show how to privately and efficiently aggregate data from many sources and large streams, and how to use the aggregates to extract useful statistics and train simple machine learning models. We also present a novel technique for privately releasing generative machine learning models and entire high-dimensional datasets produced by these models. Finally, we demonstrate that the data used by participants in training generative and collaborative learning models may be vulnerable to inference attacks, and we discuss possible mitigation strategies.
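The simplest instance of private aggregation mentioned here is a differentially private sum: clip each contribution to bound the sensitivity, then add calibrated Laplace noise. The sketch below is a generic illustration with assumed parameter names, not a system from the dissertation.

```python
import random

def laplace(scale):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)

def private_sum(values, eps, clip):
    """eps-DP sum: clip each contribution to [0, clip] so the sensitivity
    is clip, then add Laplace(clip / eps) noise."""
    total = sum(min(max(v, 0.0), clip) for v in values)
    return total + laplace(clip / eps)
```

The same clip-then-noise pattern extends to counters and gradient aggregates, which is what makes it usable for training simple models on the noisy outputs.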

    Privacy in trajectory micro-data publishing : a survey

    We survey the literature on the privacy of trajectory micro-data, i.e., spatiotemporal information about the mobility of individuals, whose collection is becoming increasingly simple and frequent thanks to emerging information and communication technologies. The focus of our review is on privacy-preserving data publishing (PPDP), i.e., the publication of databases of trajectory micro-data that preserve the privacy of the monitored individuals. We classify and present the literature of attacks against trajectory micro-data, as well as solutions proposed to date for protecting databases from such attacks. This paper serves as an introductory reading on a critical subject in an era of growing awareness about privacy risks connected to digital services, and provides insights into open problems and future directions for research. Comment: Accepted for publication in Transactions on Data Privacy

    Differential Privacy in Distributed Settings
