4,009 research outputs found

    Semantic privacy-preserving framework for electronic health record linkage

    The combination of digitized health information and web-based technologies offers many possibilities for data analysis and business intelligence. In the healthcare and biomedical research domain, applications that depend on electronic health records (EHRs) identify privacy preservation as a major concern. Because of the potential for patient re-identification, existing solutions cannot always satisfy evolving research demands such as linking patient records across organizational boundaries. In this work, we show how semantic methods can be applied to support the formulation and enforcement of access control policy whilst ensuring that privacy leakage can be detected and prevented. The work is illustrated through a case study associated with the Australasian Diabetes Data Network (ADDN – www.addn.org.au), the national paediatric type-1 diabetes data registry, and the Australian Urban Research Infrastructure Network (AURIN – www.aurin.org.au) platform that supports Australia-wide access to urban and built environment data sets. We demonstrate that, by extending the eXtensible Access Control Markup Language (XACML) with semantic capabilities, finer-grained access control encompassing data risk disclosure mechanisms can be supported. We discuss the contributions that this approach can make to socio-economic development and political management within business systems, especially in situations where secure data access and data linkage are required.

    Semantic-based Privacy-preserving Record Linkage

    Introduction: Sharing aggregated electronic health records (EHRs) for integrated health care and public health studies is increasingly demanded. Patient privacy demands that anonymisation procedures are in place for data sharing. Objective: Traditional methods such as k-anonymity and its derivations often over-generalise, resulting in lower data accuracy. To tackle this issue, we proposed the Semantic Linkage K-Anonymity (SLKA) approach to balance privacy and utility preservation by detecting risky combinations hidden in record linkage releases. Approach: Applying k-anonymity to the quasi-identifiers of data may lead to ‘over generalisation’ when dealing with linkage data sets. As most linkage cases do not include all local patients, not all local data needs to be modified for privacy-preserving purposes. We therefore proposed linkage k-anonymity (LKA), by which only obfuscated individuals in a released linkage set are required to be indistinguishable from at least k-1 other individuals in the local dataset. Considering the inference disclosure issue, we further designed the semantic-based linkage k-anonymity (SLKA) method by adding a semantic rule base for automatic detection of (and ruling out) risky associations from previous linked data releases. Specifically, associations identified from the “previous releases” of the linkage dataset can become the input of semantic reasoning for the “next release”. Results: The approach is evaluated on a linkage scenario where researchers apply to link data from an Australia-wide national type-1 diabetes platform with survey results from 25,000+ Victorians about their health and wellbeing. In comparing the information loss of the three methods, we find that extra cost can be incurred in SLKA for dealing with risky individuals, e.g., 13.7% vs 5.9% (LKA, k=4); however, it performs much better than k-anonymity, which can cause 24% information loss (k=4). In addition, the k value can affect the level of distortion in SLKA, such as 11.5% (k=2) vs 12.9% (k=3). Conclusion: The SLKA framework provides dynamic protection for repeated linkage releases while preserving data utility by avoiding the unnecessary generalisation typified by k-anonymity.
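
    The difference between classic k-anonymity and the linkage k-anonymity (LKA) condition described in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the record layout, and the `id`, `age_band` and `postcode` fields are assumptions made for illustration, and the semantic rule base of SLKA is not modelled.

```python
from collections import Counter


def equivalence_class_sizes(records, quasi_identifiers):
    """Count how many records share each quasi-identifier combination."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def is_k_anonymous(records, quasi_identifiers, k):
    """Classic check: every record must sit in a class of size >= k."""
    sizes = equivalence_class_sizes(records, quasi_identifiers)
    return all(n >= k for n in sizes.values())


def is_linkage_k_anonymous(local_records, released_ids, quasi_identifiers, k,
                           id_field="id"):
    """LKA-style check (illustrative): only individuals in the released
    linkage set must be indistinguishable from at least k-1 other
    individuals in the local dataset."""
    sizes = equivalence_class_sizes(local_records, quasi_identifiers)
    for r in local_records:
        if r[id_field] in released_ids:
            key = tuple(r[q] for q in quasi_identifiers)
            if sizes[key] < k:
                return False
    return True


# Hypothetical local dataset with already-generalised quasi-identifiers.
local = [
    {"id": 1, "age_band": "10-14", "postcode": "30**"},
    {"id": 2, "age_band": "10-14", "postcode": "30**"},
    {"id": 3, "age_band": "10-14", "postcode": "30**"},
    {"id": 4, "age_band": "15-19", "postcode": "31**"},
]
QI = ["age_band", "postcode"]

# Record 4 is unique, so the whole dataset is not 2-anonymous.
whole_dataset_ok = is_k_anonymous(local, QI, 2)
# A release containing only records 1 and 2 still satisfies the LKA check.
release_ok = is_linkage_k_anonymous(local, {1, 2}, QI, 2)
# A release that includes record 4 does not.
release_with_unique = is_linkage_k_anonymous(local, {1, 4}, QI, 2)
```

    In this toy data, classic k-anonymity would force further generalisation of record 4 even when it is never released, whereas the LKA-style check only fails when a released individual lacks k-1 local lookalikes.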

    Privacy-Preserving Access Control in Electronic Health Record Linkage

    Sharing aggregated electronic health records (EHRs) for integrated health care and public health studies is increasingly demanded. Patient privacy demands that anonymisation procedures are in place for data sharing. However, traditional methods such as k-anonymity and its derivations often over-generalize, resulting in lower data accuracy. To tackle this issue, we present the Semantic Linkage K-Anonymity (SLKA) approach, which supports ongoing record linkages. We show how SLKA balances privacy and utility preservation by detecting risky combinations hidden in data releases.

    Semantic-Based Policy Composition for Privacy-Demanding Data Linkage

    Record linkage can be used to support current and future health research across populations; however, such approaches give rise to many challenges related to patient privacy and confidentiality, including inference attacks. To address this, we present a semantic-based policy framework where linkage privacy detects attribute associations that can lead to inference disclosure issues. To illustrate the effectiveness of the approach, we present a case study exploring health data combining spatial, ethnicity and language information from several major ongoing projects across Australia. Compared with classic access control models, the results show that our proposal performs better with regard to effectiveness, reliability and subsequent data utility.

    A Taxonomy of Privacy-Preserving Record Linkage Techniques

    The process of identifying which records in two or more databases correspond to the same entity is an important aspect of data quality activities such as data pre-processing and data integration. Known as record linkage, data matching or entity resolution, this process has attracted interest from researchers in fields such as databases and data warehousing, data mining, information systems, and machine learning. Record linkage has various challenges, including scalability to large databases, accurate matching and classification, and privacy and confidentiality. The latter challenge arises because personal identifying data, such as names, addresses and dates of birth of individuals, are commonly used in the linkage process. When databases are linked across organizations, the issue of how to protect the privacy and confidentiality of such sensitive information is crucial to the successful application of record linkage. In this paper we present an overview of techniques that allow the linking of databases between organizations while at the same time preserving the privacy of these data. Various such techniques, known as 'privacy-preserving record linkage' (PPRL), have been developed. We present a taxonomy that characterizes PPRL techniques along 15 dimensions, and conduct a survey of PPRL techniques. We then highlight shortcomings of current techniques and discuss avenues for future research.

    A Survey on Privacy in Human Mobility

    In recent years we have witnessed a pervasive use of location-aware technologies such as vehicular GPS-enabled devices, RFID-based tools, mobile phones, etc., which has led to the collection and storage of a large amount of human mobility data. The power of this data has been recognized by both the scientific community and industry. Human mobility data can be used for different purposes such as urban traffic management, urban planning, urban pollution estimation, etc. Unfortunately, data describing human mobility is sensitive, because people’s whereabouts may allow re-identification of individuals in a de-identified database, and access to the places visited by individuals may enable the inference of sensitive information such as religious belief, sexual preferences, health conditions, and so on. The literature reports many approaches aimed at overcoming privacy issues in mobility data; in this survey we discuss the advances in privacy-preserving mobility data publishing. We first describe the adversarial attack and privacy models typically considered for mobility data, then we present frameworks for privacy risk assessment, and finally we discuss three main categories of privacy-preserving strategies: methods based on anonymization of mobility data, methods based on differential privacy models, and methods that protect privacy by exploiting generative models for synthetic trajectory generation.