
    Development and Analysis of Deterministic Privacy-Preserving Policies Using Non-Stochastic Information Theory

    A deterministic privacy metric using non-stochastic information theory is developed. In particular, minimax information is used to construct a measure of information leakage, which is inversely proportional to the measure of privacy. Anyone can submit a query to a trusted agent with access to a non-stochastic uncertain private dataset. Optimal deterministic privacy-preserving policies for responding to the submitted query are computed by maximizing the measure of privacy subject to a constraint on the worst-case quality of the response (i.e., the worst-case difference between the response by the agent and the output of the query computed on the private dataset). The optimal privacy-preserving policy is proved to be a piecewise constant function in the form of a quantization operator applied to the output of the submitted query. The measure of privacy is also used to analyze the performance of the k-anonymity methodology (a popular deterministic mechanism for privacy-preserving release of datasets using suppression and generalization techniques), proving that it is in fact not privacy-preserving. Comment: improved introduction and numerical example
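
The abstract above describes the optimal policy as a piecewise constant quantization of the query output under a worst-case error constraint. A minimal sketch of that idea follows, assuming a scalar query output and uniform bins; the function name `quantized_response`, the parameter `max_error`, and the uniform bin layout are illustrative assumptions, not the paper's construction.

```python
import math


def quantized_response(query_output: float, max_error: float) -> float:
    """Return a piecewise constant response to a scalar query output.

    Assumption (not from the paper): bins are uniform with width
    2 * max_error, and the response is the bin midpoint, so the
    worst-case difference from the true output is at most max_error.
    """
    if max_error <= 0:
        raise ValueError("max_error must be positive")
    width = 2.0 * max_error
    # Index of the bin [i * width, (i + 1) * width) containing the output.
    bin_index = math.floor(query_output / width)
    # Every output in the same bin maps to the same midpoint.
    return (bin_index + 0.5) * width
```

With `max_error = 0.5`, the outputs 3.2 and 3.9 fall in the same bin and receive the same response, so an observer of the response cannot tell them apart.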

    Preserving prosumer privacy in a district level smart grid

    This study presents the anonymization of consumer data in a district-level smart grid using the k-anonymity approach. The data used in this study cover the demographic information and associated energy consumption of consumers. The anonymization process is implemented at the prosumer level, considering the importance of prosumers in sharing flexibility and distributed generation at the low-voltage grid, and the fact that they need to interact with each other and the grid while keeping their data private. The proposed approach is tested under three anonymization scenarios: prosecutor, journalist, and marketer. The smart grid data are investigated mostly under the prosecutor scenario with three risk levels: lowest, medium, and highest. The results of the k-anonymity approach are compared to k-map and k-map + k-anonymity. No difference has been found between the three investigated approaches for the selected data set. Since k-anonymity aims to make each individual's record indistinguishable from those of at least k-1 other individuals, the type and number of recorded attributes play a key role in the anonymization process. One risk is that using continuous attributes, such as near real-time energy consumption, may cause information loss in the anonymization process. Hence, we have focused on anonymizing the consumers' demographic information rather than their energy consumption
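
The k-anonymity property discussed in the abstract above can be checked on a table of demographic records as sketched below. This is a generic check, not the study's tooling; the function name `is_k_anonymous` and the attribute names `age_band` and `district` are hypothetical.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Check that every combination of quasi-identifier values
    occurs in at least k records."""
    groups = Counter(
        tuple(record[attr] for attr in quasi_identifiers)
        for record in records
    )
    return all(count >= k for count in groups.values())


# Hypothetical, already generalized demographic records.
records = [
    {"age_band": "30-39", "district": "North"},
    {"age_band": "30-39", "district": "North"},
    {"age_band": "40-49", "district": "South"},
    {"age_band": "40-49", "district": "South"},
]
print(is_k_anonymous(records, ["age_band", "district"], k=2))
```

A continuous attribute such as a meter reading would rarely repeat exactly across records, which is one way to see why it has to be coarsened heavily before a check like this can pass.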

    Apriori-based algorithms for k^m-anonymizing trajectory data

    The proliferation of GPS-enabled devices (e.g., smartphones and tablets) and location-based social networks has resulted in the abundance of trajectory data. The publication of such data opens up new directions in analyzing, studying and understanding human behavior. However, it should be performed in a privacy-preserving way, because the identities of individuals, whose movement is recorded in trajectories, can be disclosed even after removing identifying information. Existing trajectory data anonymization approaches offer privacy but at a high data utility cost, since they either do not produce truthful data (an important requirement of several applications), or are limited in their privacy specification component. In this work, we propose a novel approach that overcomes these shortcomings by adapting k^m-anonymity to trajectory data. To realize our approach, we develop three efficient and effective anonymization algorithms that are based on the apriori principle. These algorithms aim at preserving different data characteristics, including location distance and semantic similarity, as well as user-specified utility requirements, which must be satisfied to ensure that the released data can be meaningfully analyzed. Our extensive experiments using synthetic and real datasets verify that the proposed algorithms are efficient and effective at preserving data utility
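
To make the privacy requirement in the abstract above concrete, the sketch below checks one reading of k^m-anonymity for trajectories: every ordered subsequence of at most m locations that occurs in some trajectory must occur in at least k trajectories. That reading is an assumption here, and the brute-force check is not one of the paper's apriori-based algorithms; the name `is_km_anonymous` is hypothetical.

```python
from itertools import combinations


def is_km_anonymous(trajectories, k, m):
    """Brute-force check of the assumed k^m-anonymity condition.

    Each trajectory is a sequence of location labels. Subsequences
    keep the original order but need not be contiguous.
    """
    support = {}
    for trajectory in trajectories:
        # Collect each distinct subsequence once per trajectory.
        seen = set()
        for size in range(1, m + 1):
            seen.update(combinations(trajectory, size))
        for subsequence in seen:
            support[subsequence] = support.get(subsequence, 0) + 1
    return all(count >= k for count in support.values())


trajectories = [("a", "b", "c"), ("a", "b", "c"), ("a", "b")]
print(is_km_anonymous(trajectories, k=2, m=2))
```

The enumeration grows quickly with m and trajectory length, which suggests why a level-wise, apriori-style search is attractive for real datasets.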

    A look ahead approach to secure multi-party protocols

    Secure multi-party protocols have been proposed to enable non-colluding parties to cooperate without a trusted server. Even though such protocols prevent information disclosure other than the objective function, they are quite costly in computation and communication. Therefore, the high overhead makes it necessary for parties to estimate the utility that can be achieved as a result of the protocol beforehand. In this paper, we propose a look ahead approach, specifically for secure multi-party protocols to achieve distributed k-anonymity, which helps parties to decide if the utility benefit from the protocol is within an acceptable range before initiating the protocol. Look ahead operation is highly localized and its accuracy depends on the amount of information the parties are willing to share. Experimental results show the effectiveness of the proposed methods

    Building K-Anonymous User Cohorts with Consecutive Consistent Weighted Sampling (CCWS)

    To retrieve personalized campaigns and creatives while protecting user privacy, digital advertising is shifting from member-based identity to cohort-based identity. Under such an identity regime, an accurate and efficient cohort building algorithm is desired to group users with similar characteristics. In this paper, we propose a scalable K-anonymous cohort building algorithm called consecutive consistent weighted sampling (CCWS). The proposed method combines the spirit of the (p-powered) consistent weighted sampling and hierarchical clustering, so that K-anonymity is ensured by enforcing a lower bound on the size of cohorts. Evaluations on a LinkedIn dataset consisting of >70M users and ads campaigns demonstrate that CCWS achieves substantial improvements over several hashing-based methods including sign random projections (SignRP), minwise hashing (MinHash), as well as the vanilla CWS

    z-anonymity: Zero-Delay Anonymization for Data Streams

    With the advent of big data and the birth of the data markets that sell personal information, individuals' privacy is of utmost importance. The classical response is anonymization, i.e., sanitizing the information that can directly or indirectly allow users' re-identification. The most popular solution in the literature is k-anonymity. However, it is hard to achieve k-anonymity on a continuous stream of data, as well as when the number of dimensions becomes high. In this paper, we propose a novel anonymization property called z-anonymity. Unlike k-anonymity, it can be achieved with zero delay on data streams and it is well suited for high-dimensional data. The idea at the base of z-anonymity is to release an attribute (an atomic piece of information) about a user only if at least z - 1 other users have presented the same attribute in a past time window. z-anonymity is weaker than k-anonymity since it does not work on the combinations of attributes, but treats them individually. In this paper, we present a probabilistic framework to map z-anonymity into the k-anonymity property. Our results show that a proper choice of the z-anonymity parameters allows the data curator to likely obtain a k-anonymized dataset, with a precisely measurable probability. We also evaluate a real use case, in which we consider the website visits of a population of users and show that z-anonymity can work in practice for obtaining k-anonymity too
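
The release rule stated in the abstract above (publish an attribute only if at least z - 1 other users presented it within a past time window) can be sketched as a small streaming filter. This is an illustrative reading, not the authors' implementation; the class name `ZAnonymizer`, its parameters, and the assumption that timestamps arrive in non-decreasing order are all hypothetical.

```python
from collections import defaultdict


class ZAnonymizer:
    """Decide, per observation, whether an attribute may be released."""

    def __init__(self, z: int, window: float):
        self.z = z
        self.window = window
        # attribute -> {user: timestamp of that user's latest observation}
        self._last_seen = defaultdict(dict)

    def observe(self, timestamp: float, user: str, attribute: str) -> bool:
        """Record the observation and return True if it may be released.

        Assumes timestamps are non-decreasing across calls.
        """
        seen = self._last_seen[attribute]
        seen[user] = timestamp
        # Forget users whose latest observation fell out of the window.
        cutoff = timestamp - self.window
        for stale_user in [u for u, t in seen.items() if t < cutoff]:
            del seen[stale_user]
        # The current user counts as one of the z distinct users.
        return len(seen) >= self.z


anonymizer = ZAnonymizer(z=3, window=10.0)
for t, user in [(0.0, "u1"), (1.0, "u2"), (2.0, "u3")]:
    print(user, anonymizer.observe(t, user, "example.org"))
```

Each decision is made as the observation arrives, which matches the zero-delay property the abstract emphasizes. Attributes are tracked individually, never in combination, in line with the abstract's remark that z-anonymity is weaker than k-anonymity.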

    Routes for breaching and protecting genetic privacy

    We are entering the era of ubiquitous genetic information for research, clinical care, and personal curiosity. Sharing these datasets is vital for rapid progress in understanding the genetic basis of human diseases. However, one growing concern is the ability to protect the genetic privacy of the data originators. Here, we technically map threats to genetic privacy and discuss potential mitigation strategies for privacy-preserving dissemination of genetic data. Comment: Draft for comment