1,116 research outputs found
Development and Analysis of Deterministic Privacy-Preserving Policies Using Non-Stochastic Information Theory
A deterministic privacy metric using non-stochastic information theory is
developed. Particularly, minimax information is used to construct a measure of
information leakage, which is inversely proportional to the measure of privacy.
Anyone can submit a query to a trusted agent with access to a non-stochastic
uncertain private dataset. Optimal deterministic privacy-preserving policies
for responding to the submitted query are computed by maximizing the measure of
privacy subject to a constraint on the worst-case quality of the response
(i.e., the worst-case difference between the response by the agent and the
output of the query computed on the private dataset). The optimal
privacy-preserving policy is proved to be a piecewise constant function in the
form of a quantization operator applied on the output of the submitted query.
The measure of privacy is also used to analyze the performance of k-anonymity
methodology (a popular deterministic mechanism for privacy-preserving release
of datasets using suppression and generalization techniques), proving that it
is in fact not privacy-preserving.
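As a rough illustration of a piecewise constant, quantization-style response policy, the sketch below maps a query output to the midpoint of a fixed-width bin so that the worst-case response error stays within a chosen bound. The uniform bin layout and the names `quantize_response` and `max_error` are assumptions made for this example; they are not taken from the paper, whose optimal quantizer may differ.

```python
import math


def quantize_response(query_output: float, max_error: float) -> float:
    """Return the midpoint of the bin (width 2 * max_error) containing query_output.

    Hypothetical sketch: uniform bins are an assumption, not the paper's policy.
    Every input in the same bin yields the same response (piecewise constant),
    and the response differs from the true output by at most max_error.
    """
    if max_error <= 0:
        raise ValueError("max_error must be positive")
    width = 2.0 * max_error
    bin_index = math.floor(query_output / width)
    return (bin_index + 0.5) * width
```

A wider bin hides more about the true query output but allows a larger worst-case error, which mirrors the privacy versus quality trade-off described in the abstract.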
Preserving prosumer privacy in a district level smart grid
This study presents the anonymization of consumer data in a district-level smart grid using the k-anonymity approach. The data used in this study cover the demographic information and associated energy consumption of consumers. The anonymization process is implemented at the prosumer level, considering the importance of prosumers in sharing flexibility and distributed generation at the low-voltage grid, and the fact that they need to interact with each other and the grid while keeping their data private. The proposed approach is tested under three anonymization scenarios: prosecutor, journalist, and marketer. The smart grid data are investigated mostly under the prosecutor scenario with three risk levels: lowest, medium, and highest. The results of the k-anonymity approach are compared to k-map and k-map + k-anonymity. No difference was found between the three investigated approaches for the selected data set. Since the aim of k-anonymity is to make the information about any individual record indistinguishable from that of at least k-1 other individuals, the type and number of recorded attributes play a key role in the anonymization process. One of the risks is the use of continuous attributes, such as near real-time energy consumption, which may cause information loss during anonymization. Hence, we have focused on the anonymization of the consumers' demographic information rather than their energy consumption.
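The k-anonymity property referred to above can be checked with a few lines of code: a table is k-anonymous with respect to a set of quasi-identifiers if every combination of their values is shared by at least k records. The sketch below is a minimal check only; the function name `is_k_anonymous` and the field names used in the usage note are hypothetical and do not come from the study.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every quasi-identifier combination occurs in at least k records.

    records: list of dicts (one per individual); quasi_identifiers: list of keys.
    Hypothetical helper for illustration, not the tooling used in the study.
    """
    groups = Counter(
        tuple(record[q] for q in quasi_identifiers) for record in records
    )
    return all(count >= k for count in groups.values())
```

For example, calling `is_k_anonymous(records, ["age_band", "district"], 2)` on demographic records (with made-up field names) reports whether each age band and district combination is shared by at least two consumers. A continuous attribute such as a near real-time consumption reading would rarely repeat exactly, which is why it would first need to be generalized into ranges.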
Apriori-based algorithms for k^m-anonymizing trajectory data
The proliferation of GPS-enabled devices (e.g., smartphones and tablets) and location-based social networks has resulted in the abundance of trajectory data. The publication of such data opens up new directions in analyzing, studying and understanding human behavior. However, it should be performed in a privacy-preserving way, because the identities of individuals, whose movement is recorded in trajectories, can be disclosed even after removing identifying information. Existing trajectory data anonymization approaches offer privacy but at a high data utility cost, since they either do not produce truthful data (an important requirement of several applications), or are limited in their privacy specification component. In this work, we propose a novel approach that overcomes these shortcomings by adapting k^m-anonymity to trajectory data. To realize our approach, we develop three efficient and effective anonymization algorithms that are based on the apriori principle. These algorithms aim at preserving different data characteristics, including location distance and semantic similarity, as well as user-specified utility requirements, which must be satisfied to ensure that the released data can be meaningfully analyzed. Our extensive experiments using synthetic and real datasets verify that the proposed algorithms are efficient and effective at preserving data utility.
Patient privacy protection using anonymous access control techniques
Objective: The objective of this study is to develop a solution to preserve security and privacy in a healthcare environment where health-sensitive information will be accessed by many parties and stored in various distributed databases. The solution should maintain anonymous medical records and should be able to link anonymous medical information in distributed databases into a single patient medical record with the patient identity. Methods: In this paper we present a protocol that can be used to authenticate and authorize patients to healthcare services without revealing the patient's identity. A healthcare service can identify the patient using separate temporary identities in each identification session, and medical records are linked to these temporary identities. Temporary identities can be used to enable record linkage and to trace back the real patient identity in critical medical situations. Results: The proposed protocol provides the main security and privacy services, such as user anonymity, message privacy, message confidentiality, user authentication, user authorization, and protection against message replay attacks. The medical environment validates the patient at the healthcare service as a real and registered patient for the medical services. Using the proposed protocol, the patient's anonymous medical records at different healthcare services can be linked into one single report, and it is possible to securely trace an anonymous patient back to the real identity. Conclusion: The protocol protects patient privacy with secure anonymous authentication to healthcare services and medical record registries according to European and UK legislation, where the patient's real identity is not disclosed with the distributed patient medical records.
A look ahead approach to secure multi-party protocols
Secure multi-party protocols have been proposed to enable non-colluding parties to cooperate without a trusted server. Even though such protocols prevent information disclosure other than the objective function, they are quite costly in computation and communication. Therefore, the high overhead makes it necessary for parties to estimate beforehand the utility that can be achieved as a result of the protocol. In this paper, we propose a look-ahead approach, specifically for secure multi-party protocols to achieve distributed k-anonymity, which helps parties to decide if the utility benefit from the protocol is within an acceptable range before initiating the protocol. The look-ahead operation is highly localized and its accuracy depends on the amount of information the parties are willing to share. Experimental results show the effectiveness of the proposed methods.
Building K-Anonymous User Cohorts with Consecutive Consistent Weighted Sampling (CCWS)
To retrieve personalized campaigns and creatives while protecting user
privacy, digital advertising is shifting from member-based identity to
cohort-based identity. Under such an identity regime, an accurate and efficient
cohort building algorithm is desired to group users with similar
characteristics. In this paper, we propose a scalable K-anonymous cohort
building algorithm called consecutive consistent weighted sampling
(CCWS). The proposed method combines the spirit of the (-powered) consistent
weighted sampling and hierarchical clustering, so that K-anonymity is
ensured by enforcing a lower bound on the size of cohorts. Evaluations on a
LinkedIn dataset consisting of M users and ads campaigns demonstrate that
CCWS achieves substantial improvements over several hashing-based methods
including sign random projections (SignRP), minwise hashing (MinHash), as well
as the vanilla CWS.
z-anonymity: Zero-Delay Anonymization for Data Streams
With the advent of big data and the birth of the data markets that sell
personal information, individuals' privacy is of utmost importance. The
classical response is anonymization, i.e., sanitizing the information that can
directly or indirectly allow users' re-identification. The most popular
solution in the literature is the k-anonymity. However, it is hard to achieve
k-anonymity on a continuous stream of data, as well as when the number of
dimensions becomes high. In this paper, we propose a novel anonymization
property called z-anonymity. Unlike k-anonymity, it can be achieved
with zero delay on data streams and it is well suited for high-dimensional
data. The idea at the base of z-anonymity is to release an attribute (an atomic
piece of information) about a user only if at least z - 1 other users have presented the
same attribute in a past time window. z-anonymity is weaker than k-anonymity
since it does not work on the combinations of attributes, but treats them
individually. In this paper, we present a probabilistic framework to map the
z-anonymity into the k-anonymity property. Our results show that a proper
choice of the z-anonymity parameters allows the data curator to likely obtain a
k-anonymized dataset, with a precisely measurable probability. We also evaluate
a real use case, in which we consider the website visits of a population of
users and show that z-anonymity can also work in practice for obtaining
k-anonymity.
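The release rule described in this abstract (publish an attribute for a user only if at least z - 1 other users presented the same attribute within a past time window) can be sketched as a small streaming filter. This is a minimal sketch under stated assumptions: the class name `ZAnonymizer`, the `observe` method, and the choice to track only each user's most recent timestamp per attribute are hypothetical, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import defaultdict


class ZAnonymizer:
    """Zero-delay release filter in the spirit of z-anonymity (illustrative only)."""

    def __init__(self, z, window):
        self.z = z            # minimum number of distinct users, including the current one
        self.window = window  # length of the past time window, same unit as timestamps
        # attribute -> {user: timestamp of that user's latest observation}
        self._last_seen = defaultdict(dict)

    def observe(self, timestamp, user, attribute):
        """Record (user, attribute) at timestamp; return True if it may be released."""
        seen = self._last_seen[attribute]
        cutoff = timestamp - self.window
        # Forget users whose latest observation fell out of the window.
        for stale_user in [u for u, t in seen.items() if t < cutoff]:
            del seen[stale_user]
        seen[user] = timestamp
        other_users = len(seen) - 1
        return other_users >= self.z - 1
```

Each call decides immediately whether the incoming observation is released, so no record is buffered. Attributes are handled one at a time and never in combination, which is the sense in which the abstract calls z-anonymity weaker than k-anonymity.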
Routes for breaching and protecting genetic privacy
We are entering the era of ubiquitous genetic information for research,
clinical care, and personal curiosity. Sharing these datasets is vital for
rapid progress in understanding the genetic basis of human diseases. However,
one growing concern is the ability to protect the genetic privacy of the data
originators. Here, we technically map threats to genetic privacy and discuss
potential mitigation strategies for privacy-preserving dissemination of genetic
data.