163 research outputs found

    Preventing Unintended Disclosure of Personally Identifiable Data Following Anonymisation

    Errors and anomalies during the capture and processing of health data can place personally identifiable values into attributes of a dataset that are expected to contain only non-identifiable values. Anonymisation focuses on attributes that have been judged to enable identification of individuals; attributes judged to contain non-identifiable values are not considered, yet may still be included in datasets that organisations share. Consequently, organisations risk sharing datasets that unintentionally disclose personally identifiable values through these attributes, with ethical and legal implications for the organisations and privacy implications for the individuals whose values are disclosed. In this paper, we formulate the problem of unintended disclosure following anonymisation, describe the steps necessary to address it, and discuss some key challenges to applying these steps in practice.
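    The kind of check the abstract calls for can be sketched as a scan of supposedly non-identifiable attributes for values that match common identifier patterns. This is a minimal illustrative sketch, not the paper's method; the column name, patterns, and `find_unintended_disclosures` helper are all assumptions for the example.

```python
import re

# Illustrative patterns for identifiable values; a real deployment would
# need a far richer set (names, IDs, addresses, ...).
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "phone": re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"),
}

def find_unintended_disclosures(rows, free_text_columns):
    """Return (row_index, column, pattern_name) for each suspect value
    found in columns expected to contain only non-identifiable data."""
    hits = []
    for i, row in enumerate(rows):
        for col in free_text_columns:
            value = str(row.get(col, ""))
            for name, pattern in PII_PATTERNS.items():
                if pattern.search(value):
                    hits.append((i, col, name))
    return hits

# A free-text attribute judged non-identifiable, with a leaked phone number:
rows = [
    {"notes": "patient stable"},
    {"notes": "call back at 555-123-4567"},
]
print(find_unintended_disclosures(rows, ["notes"]))
```

Such a scan would run after anonymisation but before sharing, flagging rows for manual review rather than redacting automatically.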

    Towards privacy protection in a middleware for context-awareness

    Privacy is recognized as a fundamental issue in the provision of context-aware services. In this paper we present work in progress on the definition of a comprehensive framework for supporting context-aware services while protecting users' privacy. Our proposal combines mechanisms for enforcing context-aware privacy policies with k-anonymity. Moreover, the proposed technique uses stereotypes to generalize precise identity information with the aim of protecting users' privacy.
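    The k-anonymity property the abstract relies on can be stated concisely: every combination of quasi-identifier values must be shared by at least k records, which is what generalization via stereotypes aims to achieve. A minimal sketch of the check (the records and column names are illustrative assumptions, not from the paper):

```python
from collections import Counter

def is_k_anonymous(records, quasi_identifiers, k):
    """True if every quasi-identifier combination occurs in >= k records."""
    groups = Counter(
        tuple(r[q] for q in quasi_identifiers) for r in records
    )
    return all(count >= k for count in groups.values())

# Values already generalized (age bucketed, ZIP truncated), so each
# quasi-identifier combination covers two records:
records = [
    {"age": "30-39", "zip": "021**"},
    {"age": "30-39", "zip": "021**"},
    {"age": "40-49", "zip": "022**"},
    {"age": "40-49", "zip": "022**"},
]
print(is_k_anonymous(records, ["age", "zip"], k=2))  # True
```

In a middleware setting, a check like this would gate the release of context information: precise identities are generalized to stereotypes until the property holds.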

    Link Prediction by De-anonymization: How We Won the Kaggle Social Network Challenge

    This paper describes the winning entry to the IJCNN 2011 Social Network Challenge run by Kaggle.com. The goal of the contest was to promote research on real-world link prediction, and the dataset was a graph obtained by crawling the popular Flickr social photo-sharing website, with user identities scrubbed. By de-anonymizing much of the competition test set using our own Flickr crawl, we were able to effectively game the competition. Our attack represents a new application of de-anonymization, to gaming machine learning contests, and suggests changes in how future competitions should be run. We introduce a new simulated-annealing-based weighted graph matching algorithm for the seeding step of de-anonymization. We also show how to combine de-anonymization with link prediction, which is required to achieve good performance on the portion of the test set that was not de-anonymized, for example by training the predictor on the de-anonymized portion of the test set and combining probabilistic predictions from de-anonymization and link prediction. Comment: 11 pages, 13 figures; submitted to IJCNN'2011.
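    The seeding step can be illustrated with a toy version of simulated-annealing graph matching: search for a node mapping between the anonymized graph and the auxiliary crawl that maximizes the number of shared edges, accepting occasional worse swaps with a temperature-dependent probability. This is a sketch in the spirit of the approach, not the authors' algorithm; all function and parameter names are assumptions.

```python
import math
import random

def shared_edges(mapping, edges_a, edges_b):
    """Count edges of graph A whose images under `mapping` appear in B."""
    return sum(
        1 for (u, v) in edges_a
        if (mapping[u], mapping[v]) in edges_b
        or (mapping[v], mapping[u]) in edges_b
    )

def anneal_match(nodes_a, nodes_b, edges_a, edges_b,
                 steps=5000, t0=2.0, seed=0):
    """Simulated annealing over node mappings; returns the best found."""
    rng = random.Random(seed)
    mapping = dict(zip(nodes_a, rng.sample(list(nodes_b), len(nodes_a))))
    score = shared_edges(mapping, edges_a, edges_b)
    best = (dict(mapping), score)
    for step in range(steps):
        t = t0 * (1 - step / steps) + 1e-9  # cool linearly to ~0
        x, y = rng.sample(list(nodes_a), 2)
        mapping[x], mapping[y] = mapping[y], mapping[x]  # propose a swap
        new = shared_edges(mapping, edges_a, edges_b)
        if new >= score or rng.random() < math.exp((new - score) / t):
            score = new
            if score > best[1]:
                best = (dict(mapping), score)
        else:  # reject: revert the swap
            mapping[x], mapping[y] = mapping[y], mapping[x]
    return best

# Toy example: graph B is graph A with nodes relabeled 0->a, 1->b, 2->c, 3->d.
edges_a = {(0, 1), (1, 2), (2, 0), (0, 3)}
edges_b = {("a", "b"), ("b", "c"), ("c", "a"), ("a", "d")}
mapping, score = anneal_match([0, 1, 2, 3], ["a", "b", "c", "d"],
                              edges_a, edges_b)
print(score)  # 4 when the mapping recovers all edges
```

The real attack matched graphs with thousands of nodes and weighted the objective; the toy version only conveys the propose-swap/accept-or-revert structure.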

    Investigations in Privacy Preserving Data Mining

    Data mining, data sharing, and privacy preservation are fast emerging as areas attracting a high level of research attention. A close review of research on privacy-preserving data mining reveals a twofold problem: first, the protection of private data (data hiding in the database), and second, the protection of sensitive rules (knowledge) ingrained in the data (knowledge hiding in the database). The first problem concerns how to obtain accurate results even when private data is concealed. The second concerns how to prevent sensitive association rules contained in the database from being discovered, while non-sensitive association rules can still be mined with traditional data mining techniques. Undoubtedly, performance is a major concern for knowledge-hiding techniques. This paper describes approaches to knowledge hiding in the database and discusses issues and challenges in developing an integrated solution for both data hiding and knowledge hiding. The study also highlights directions for future work so that pragmatic measures can be incorporated into ongoing research on hiding sensitive association rules.
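    One common knowledge-hiding strategy is support reduction: delete an item of a sensitive itemset from just enough transactions that the itemset falls below the miner's minimum-support threshold, so the rule is no longer discoverable. The sketch below is a naive illustration of that idea, not a specific algorithm from the paper; the victim-item choice and example data are assumptions.

```python
def support(transactions, itemset):
    """Fraction of transactions containing every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def hide_itemset(transactions, sensitive, min_support):
    """Remove one item of `sensitive` from supporting transactions until
    its support drops below `min_support`. Mutates and returns the list."""
    victim = min(sensitive)  # naive deterministic choice of item to delete
    for t in transactions:
        if support(transactions, sensitive) < min_support:
            break
        if sensitive <= t:
            t.discard(victim)
    return transactions

# {"bread", "milk"} starts at support 0.5 and gets pushed below threshold:
data = [{"bread", "milk"}, {"bread", "milk"}, {"bread"}, {"milk"}]
hide_itemset(data, {"bread", "milk"}, min_support=0.5)
print(support(data, {"bread", "milk"}))  # 0.25, below the 0.5 threshold
```

The performance concern the abstract raises is visible even here: each deletion also perturbs non-sensitive rules, and practical algorithms choose victim items and transactions to minimize that side effect.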