57 research outputs found

    Utility Promises of Self-Organising Maps in Privacy Preserving Data Mining

    Data mining techniques are highly efficient in sifting through big data to extract hidden knowledge and assist evidence-based decisions. However, they pose severe threats to individuals’ privacy because they can be exploited to allow inferences to be made on sensitive data. Researchers have proposed several privacy-preserving data mining techniques to address this challenge. One unique method is to extend anonymisation privacy models in data mining processes to enhance privacy and utility. Several published works in this area have utilised clustering techniques to enforce anonymisation models on private data; these work by grouping the data into clusters using a quality measure and then generalising the data in each group separately to achieve an anonymisation threshold. Although they are highly efficient and practical, guaranteeing an adequate balance between data utility and privacy protection remains a challenge. In addition, existing approaches do not work well with high-dimensional data, since it is difficult to develop good groupings without incurring excessive information loss. Our work aims to overcome these challenges by proposing a hybrid approach that combines self-organising maps with conventional privacy-based clustering algorithms. The main contribution of this paper is to show that dimensionality reduction techniques can improve the anonymisation process by incurring less information loss, thus producing a more desirable balance between privacy and utility properties.
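
    The cluster-then-generalise idea in this abstract can be illustrated with a minimal sketch. It is not the paper's method: the self-organising map step is omitted, a simple sort-and-chunk grouping stands in for the clustering algorithm, and the record layout (age, zip code), the function names, and the range-width loss measure are all assumptions made for illustration.

```python
from collections import Counter
from typing import List, Tuple

# Hypothetical quasi-identifiers for illustration: (age, zip code).
Record = Tuple[int, int]
Generalised = Tuple[Tuple[int, int], ...]  # one (low, high) range per attribute


def cluster_records(records: List[Record], k: int) -> List[List[Record]]:
    """Stand-in clustering: sort records, then cut them into groups of size >= k."""
    if k < 1 or len(records) < k:
        raise ValueError("need k >= 1 and at least k records")
    ordered = sorted(records)
    groups = [ordered[i:i + k] for i in range(0, len(ordered), k)]
    # Merge an undersized final group into its neighbour so every group has >= k.
    if len(groups) > 1 and len(groups[-1]) < k:
        tail = groups.pop()
        groups[-1].extend(tail)
    return groups


def generalise(group: List[Record]) -> List[Generalised]:
    """Replace every attribute value with the (min, max) range over the group."""
    ranges = tuple((min(column), max(column)) for column in zip(*group))
    return [ranges for _ in group]


def anonymise(records: List[Record], k: int) -> List[Generalised]:
    """Group the records, then generalise each group separately."""
    released: List[Generalised] = []
    for group in cluster_records(records, k):
        released.extend(generalise(group))
    return released


def information_loss(released: List[Generalised]) -> int:
    """Crude utility proxy (an assumption): total width of all released ranges."""
    return sum(high - low for record in released for low, high in record)


def is_k_anonymous(released: List[Generalised], k: int) -> bool:
    """Every released record must be shared by at least k individuals."""
    return all(count >= k for count in Counter(released).values())
```

    In this sketch, tighter groups yield narrower ranges and therefore a lower `information_loss`, which is the trade-off the abstract refers to when it discusses balancing privacy and utility.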

    Privacy enhancing technologies (PETs) for connected vehicles in smart cities

    This is an accepted manuscript of an article published by Wiley in Transactions on Emerging Telecommunications Technologies, available online: https://doi.org/10.1002/ett.4173 The accepted version of the publication may differ from the final published version. Many experts believe that the Internet of Things (IoT) is a new revolution in technology that has brought many benefits for our organizations, businesses, and industries. However, information security and privacy protection are important challenges, particularly for smart vehicles in smart cities, and have attracted the attention of experts in this domain. Privacy Enhancing Technologies (PETs) endeavor to mitigate the risk of privacy invasions, but the literature lacks a thorough review of the approaches and techniques that support individuals' privacy in the connection between smart vehicles and smart cities. This gap has stimulated us to conduct this research with the main goal of reviewing recent privacy-enhancing technologies, approaches, taxonomy, challenges, and solutions on the application of PETs for smart vehicles in smart cities. The significance of this study originates from the inclusion of both data-oriented and process-oriented privacy protection. This research also identifies limitations of existing PETs, complementary technologies, and potential research directions.

    Handbook of Mobile Data Privacy


    Utility-aware anonymization of diagnosis codes

    The growing need for performing large-scale and low-cost biomedical studies has led organizations to promote the reuse of patient data. For instance, the National Institutes of Health in the US requires patient-specific data collected and analyzed in the context of Genome-Wide Association Studies (GWAS) to be deposited into a biorepository and broadly disseminated. While essential to comply with regulations, disseminating such data risks privacy breaches, because patients' genomic sequences can be linked to their identities through diagnosis codes. This work proposes a novel approach that prevents this type of data linkage by modifying diagnosis codes to limit the probability of associating a patient's identity with their genomic sequence. Our approach employs an effective algorithm that uses generalization and suppression of diagnosis codes to preserve privacy and takes into account the intended uses of the disseminated data to guarantee utility. We also present extensive experiments using several datasets derived from the Electronic Medical Record (EMR) system of the Vanderbilt University Medical Center, as well as a large-scale case study using the EMRs of 79K patients, which are linked to DNA contained in the Vanderbilt University biobank. Our results verify that our approach generates anonymized data that permit accurate biomedical analysis in tasks including case count studies and GWAS.

    Introduction to Mobility Data Privacy

    The recent advances in mobile computing and positioning technologies have resulted in a tremendous increase in the amount and accuracy of the human location data that can be collected and processed. Human mobility traces can be used to support a number of real-world applications, spanning from urban planning and traffic engineering to studying the spread of diseases and managing environmental pollution. At the same time, research studies have shown that individual mobility is highly predictable and mostly unique; thus, information about individuals’ movement can be used by adversaries to re-identify them and to learn sensitive information about their whereabouts. To address such privacy concerns, a significant body of research has emerged in the last 15 years, studying privacy issues related to human mobility and location information in a number of contexts and real-world applications. This work has led to the adoption of privacy laws worldwide for location privacy protection, as well as to the proposal of novel privacy models and techniques for technically protecting user privacy while maintaining data utility. This chapter provides an introduction to the field of mobility data privacy and discusses the emerging research directions, along with the real-world systems and applications that have been proposed.

    Privacy in trajectory data

    In this era of significant advances in telecommunications and GPS sensor technology, a person can be tracked down to a proximity of less than 5 meters. This remarkable progress has enabled the offering of services that depend on user location (the so-called location-based services, or LBSs), as well as the existence of applications that analyze movement data for various purposes. However, without strict safeguards, both the deployment of LBSs and the mining of movement data come at a cost of privacy for the users whose movement is recorded. This chapter studies privacy in both online and offline movement data. After introducing the reader to this field of study, we review state-of-the-art work for location and trajectory privacy both in LBSs and in trajectory databases. Then, we present a qualitative evaluation of these works, pointing out their strengths and weaknesses. We conclude the chapter by providing our point of view regarding the future trends in trajectory data privacy. © 2009, IGI Global

    Concealing the position of individuals in location-based services

    The offering of location-based services requires an in-depth knowledge of the subscriber's whereabouts. Thus, without the existence of strict safeguards, the deployment of such services may easily breach user privacy. To address this issue, special algorithms are necessary that anonymize user location information prior to its release to the service provider of the telecom operator. In this paper, we extend existing work in historical K-anonymity (1) by considering an underlying network of user movement and (2) by pushing the core functionality of the anonymizer into a spatiotemporal DBMS. The proposed scheme allows each individual to specify his/her anonymity requirements, involving a series of spatiotemporal regions that are considered unsafe with respect to his/her privacy. When the user requests an LBS from within one of his/her unsafe regions, the anonymizer performs a spatial along with a temporal generalization of the request in order to protect the user's privacy. If the generalization algorithm fails to provide the necessary anonymity, the system dynamically constructs a mix-zone around the requester with the aim of unlinking his/her future requests from the previous ones. As the experimental results indicate, by utilizing the spatiotemporal capabilities of the DBMS used, the performance of the anonymizer improves when compared to existing work in historical K-anonymity.
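
    The spatial generalization step mentioned in this abstract can be illustrated with a much-simplified sketch. It is not the paper's algorithm: it treats a single snapshot rather than historical K-anonymity, ignores the movement network, temporal generalization, and the spatiotemporal DBMS, and every name and parameter (`cloak`, `step`, `max_half_width`) is a hypothetical chosen for illustration. The box around the requester grows until it covers at least K users; returning `None` marks the case where generalization fails and the described system would instead fall back to a mix-zone.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]              # (x, y) position in metres (assumed units)
Box = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def cloak(requester: Point, others: List[Point], k: int,
          step: float = 50.0, max_half_width: float = 1000.0) -> Optional[Box]:
    """Grow a square around the requester until it contains at least k users.

    Returns the cloaking box, or None if no box up to max_half_width suffices.
    """
    if k < 1 or step <= 0:
        raise ValueError("k must be >= 1 and step must be positive")
    x, y = requester
    half = step
    while half <= max_half_width:
        box = (x - half, y - half, x + half, y + half)
        # Count the requester plus every other user located inside the box.
        inside = 1 + sum(
            1 for px, py in others
            if box[0] <= px <= box[2] and box[1] <= py <= box[3]
        )
        if inside >= k:
            return box
        half += step
    return None
```

    The service provider would then receive the box instead of the exact position, so the request cannot be attributed to fewer than K candidate users in this simplified snapshot model.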