163 research outputs found
Preventing Unintended Disclosure of Personally Identifiable Data Following Anonymisation
Errors and anomalies during the capture and processing of health data have the potential to place personally identifiable values into attributes of a dataset that are expected to contain non-identifiable values. Anonymisation focuses on those attributes that have been judged to enable identification of individuals. Attributes that are judged to contain non-identifiable values are not considered, but may be included in datasets that are shared by organisations. Consequently, organisations are at risk of sharing datasets that unintentionally disclose personally identifiable values through these attributes. This would have ethical and legal implications for organisations, and privacy implications for the individuals whose personally identifiable values are disclosed. In this paper, we formulate the problem of unintended disclosure following anonymisation, describe the steps necessary to address this problem, and discuss some key challenges to applying these steps in practice.
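The auditing step this abstract describes can be sketched as a pattern-based scan of attributes that are expected to be non-identifiable. The sketch below is illustrative only: the pattern set and the `scan_attribute` helper are assumptions, not the authors' method, and real pipelines would need far richer detectors.

```python
import re

# Hypothetical patterns for values that should never appear in attributes
# judged to be non-identifiable (illustrative, not exhaustive).
PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "nhs_number": re.compile(r"\b\d{3}[ -]?\d{3}[ -]?\d{4}\b"),
}

def scan_attribute(values):
    """Return (row_index, pattern_name) pairs flagging suspected identifiers."""
    hits = []
    for i, value in enumerate(values):
        for name, pattern in PATTERNS.items():
            if pattern.search(str(value)):
                hits.append((i, name))
    return hits

# A free-text clinical-notes column that anonymisation would normally skip.
notes = ["routine check-up", "contact alice@example.org for results", "bp 120/80"]
print(scan_attribute(notes))  # → [(1, 'email')]
```

Flagged rows would then be reviewed before the dataset is shared, rather than released on the assumption that the attribute is safe.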
Towards privacy protection in a middleware for context-awareness
Privacy is recognized as a fundamental issue for the provision of context-aware services. In this paper we present work in progress on the definition of a comprehensive framework for supporting context-aware services while protecting users' privacy. Our proposal is based on a combination of mechanisms for enforcing context-aware privacy policies and k-anonymity. Moreover, our proposed technique uses stereotypes to generalize precise identity information with the aim of protecting users' privacy.
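As a rough illustration of the k-anonymity component, the sketch below generalizes two hypothetical quasi-identifiers (age into a 10-year band, postcode into a short prefix) and checks that every generalized group contains at least k records. The attribute choices and the `generalize` helper are assumptions for illustration, not the framework's actual mechanism.

```python
from collections import Counter

def generalize(record):
    """Generalize quasi-identifiers: age -> 10-year band, postcode -> prefix."""
    lo = record["age"] // 10 * 10
    return (f"{lo}-{lo + 9}", record["postcode"][:3])

def is_k_anonymous(records, k):
    """True if every generalized equivalence class has at least k members."""
    groups = Counter(generalize(r) for r in records)
    return all(count >= k for count in groups.values())

people = [
    {"age": 34, "postcode": "SW1A 1AA"},
    {"age": 36, "postcode": "SW1B 2BB"},
    {"age": 52, "postcode": "EC1Y 8SY"},
]
# The two thirty-somethings share a class; the 52-year-old is alone.
print(is_k_anonymous(people, 2))  # → False
```

A policy layer, as the paper proposes, would decide which generalization level (or stereotype) to apply per request so that this check passes before data is released.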
Link Prediction by De-anonymization: How We Won the Kaggle Social Network Challenge
This paper describes the winning entry to the IJCNN 2011 Social Network Challenge run by Kaggle.com. The goal of the contest was to promote research on real-world link prediction, and the dataset was a graph obtained by crawling the popular Flickr social photo-sharing website, with user identities scrubbed. By de-anonymizing much of the competition test set using our own Flickr crawl, we were able to effectively game the competition. Our attack represents a new application of de-anonymization to gaming machine learning contests, suggesting changes in how future competitions should be run.

We introduce a new simulated annealing-based weighted graph matching algorithm for the seeding step of de-anonymization. We also show how to combine de-anonymization with link prediction (the latter is required to achieve good performance on the portion of the test set not de-anonymized), for example by training the predictor on the de-anonymized portion of the test set, and by combining probabilistic predictions from de-anonymization and link prediction.
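As a much-simplified illustration of seed-based de-anonymization (the paper's simulated-annealing seeding and weighted matching are not reproduced here), the sketch below propagates a small seed mapping between an anonymized graph and a crawled graph by greedily matching each unmapped node to the candidate with the largest overlap of already-mapped neighbours. The adjacency-dict representation and the overlap threshold of 2 are assumptions for the example.

```python
def propagate(g_anon, g_crawl, seed):
    """Extend a seed node mapping between two graphs via neighbour overlap."""
    mapping = dict(seed)
    changed = True
    while changed:
        changed = False
        for u in g_anon:
            if u in mapping:
                continue
            # Images of u's already-mapped neighbours in the crawled graph.
            mapped_nbrs = {mapping[n] for n in g_anon[u] if n in mapping}
            if not mapped_nbrs:
                continue
            used = set(mapping.values())
            best = max(
                (v for v in g_crawl if v not in used),
                key=lambda v: len(mapped_nbrs & g_crawl[v]),
                default=None,
            )
            # Require at least two shared mapped neighbours before committing.
            if best is not None and len(mapped_nbrs & g_crawl[best]) >= 2:
                mapping[u] = best
                changed = True
    return mapping

# Two isomorphic toy graphs; node 1 is 'a', node 2 is 'b', etc.
g_anon = {1: {2, 3}, 2: {1, 3, 4}, 3: {1, 2, 4}, 4: {2, 3}}
g_crawl = {"a": {"b", "c"}, "b": {"a", "c", "d"}, "c": {"a", "b", "d"}, "d": {"b", "c"}}
print(propagate(g_anon, g_crawl, {1: "a", 2: "b"}))  # maps 3 -> 'c', 4 -> 'd'
```

On the unmatched remainder, one would fall back to an ordinary link predictor, combining its probabilities with those from the de-anonymized portion, as the paper describes.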
Investigations in Privacy Preserving Data Mining
Data mining, data sharing, and privacy preservation are fast emerging as high-profile fields of research. A close review of research on privacy-preserving data mining reveals a twofold problem: first, the protection of private data (data hiding in databases), and second, the protection of sensitive rules (knowledge) ingrained in the data (knowledge hiding in databases). The first problem concerns how to obtain accurate mining results even when private data is concealed. The second concerns how to prevent sensitive association rules contained in a database from being discovered, while non-sensitive association rules can still be mined with traditional data mining techniques. Undoubtedly, performance is a major concern for knowledge hiding techniques. This paper describes approaches to knowledge hiding in databases, and discusses issues and challenges in developing an integrated solution for both data hiding and knowledge hiding. The study also highlights directions for future work so that pragmatic measures can be incorporated into ongoing research on hiding sensitive association rules.
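The core idea behind hiding a sensitive association rule, reducing the support of its itemset below the mining threshold by distorting a few transactions, can be sketched as follows. The victim-item choice and the distortion strategy are illustrative assumptions, not any specific published algorithm.

```python
def support(transactions, itemset):
    """Fraction of transactions containing every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def hide_itemset(transactions, sensitive, min_support):
    """Drop one item from supporting transactions until support < min_support."""
    txns = [set(t) for t in transactions]  # work on a copy
    victim = sorted(sensitive)[0]  # arbitrary deterministic choice of item to remove
    for t in txns:
        if support(txns, sensitive) < min_support:
            break
        if sensitive <= t:
            t.discard(victim)
    return txns

txns = [{"bread", "milk"}, {"bread", "milk", "eggs"}, {"bread"}, {"milk"}]
hidden = hide_itemset(txns, {"bread", "milk"}, min_support=0.3)
print(support(hidden, {"bread", "milk"}))  # → 0.25, below the 0.3 threshold
```

Note the trade-off the abstract highlights: the distortion hides {bread, milk} but also perturbs the supports of non-sensitive itemsets, which is why performance and side effects are central concerns for these techniques.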