Search CORE

6 research outputs found

Evaluating privacy-preserving record linkage using cryptographic long-term keys and multibit trees on large medical datasets.

Author: A McCallum
Adrian P. Brown
Christian Borgs
CJ Bradley
D Karapiperis
D Rosman
D Vatsalan
DP Jutte
E Durham
EA Durham
EL Brook
F Niedermeyer
G Lawrence
GH Shah
IA Binswanger
J Smith
JH Boyd
JJ Trinckes
JMM Evans
M Kroll
M Kuzu
MA Hernández
MG Maxfield
P Christen
R Schnell
Rainer Schnell
SA McDonald
Sean M. Randall
SM Randall
TG Kristensen
TL Dassanayake
TN Herzog
Z Wan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

Background: Integrating medical data from databases of different sources by record linkage is a powerful technique increasingly used in medical research. In many jurisdictions, the unique personal identifiers needed for linking the records are unavailable. Since sensitive attributes, such as names, have to be used instead, privacy regulations usually demand encrypting these identifiers. The corresponding set of techniques for privacy-preserving record linkage (PPRL) has received widespread attention. One recent method is based on Bloom filters. Due to their superior resilience against cryptographic attacks, composite Bloom filters (cryptographic long-term keys, CLKs) are considered best practice for privacy in PPRL. The real-world performance of these techniques on large-scale data has so far been unknown. Methods: Using a large subset of Australian hospital admission data, we tested the performance of an innovative PPRL technique (CLKs using multibit trees) against a gold standard derived from clear-text probabilistic record linkage. Linkage time and linkage quality (recall, precision and F-measure) were evaluated. Results: Clear-text probabilistic linkage resulted in marginally higher precision and recall than CLKs. PPRL required more computing time, but 5 million records could still be de-duplicated within one day. However, the PPRL approach required fine-tuning of parameters. Conclusions: We argue that the increased privacy of PPRL comes at the price of small losses in precision and recall and a large increase in computational burden and setup time. These costs seem to be acceptable in most applied settings, but they have to be considered in the decision to apply PPRL. Further research on the optimal automatic choice of parameters is needed.
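
The Bloom-filter idea behind CLKs mentioned in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the filter length of 1000 bits, the 10 hash functions, the use of HMAC-SHA256 with a field index, and the Dice coefficient as the similarity measure are all assumptions made for the example, and the multibit-tree search used in the paper is not shown.

```python
import hashlib
import hmac


def bigrams(value):
    """Split a padded, lower-cased string into its set of character bigrams."""
    padded = f" {value.strip().lower()} "
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def clk_encode(fields, secret, num_bits=1000, num_hashes=10):
    """Hash the bigrams of all fields into one shared Bloom filter.

    The filter is returned as the set of bit positions that are set.
    All parameter values here are illustrative assumptions.
    """
    bits = set()
    for field_index, field in enumerate(fields):
        for gram in bigrams(field):
            for k in range(num_hashes):
                message = f"{field_index}:{k}:{gram}".encode("utf-8")
                digest = hmac.new(secret, message, hashlib.sha256).digest()
                bits.add(int.from_bytes(digest[:8], "big") % num_bits)
    return bits


def dice(a, b):
    """Dice coefficient of two filters given as sets of set bit positions."""
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


if __name__ == "__main__":
    key = b"secret shared by the data owners"  # hypothetical key
    rec_a = clk_encode(["smith", "john"], key)
    rec_b = clk_encode(["smyth", "john"], key)
    print(f"similarity of near-identical records: {dice(rec_a, rec_b):.2f}")
```

Because similar strings share most of their bigrams, their filters share most of their set bits, which is what allows approximate matching on encoded identifiers without comparing the names themselves.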

Crossref

Directory of Open Access Journals

espace@Curtin

SFour: A Protocol for Cryptographically Secure Record Linkage at Scale

Author: Khurram Muhammad Basit
Publication venue: 'University of Waterloo'
Publication date: 17/09/2019

The prevalence of various (and increasingly large) datasets presents the challenging problem of discovering common entities dispersed across disparate datasets. Solutions to the private record linkage (PRL) problem aim to enable such explorations of datasets in a secure manner. A two-party PRL protocol allows two parties to determine for which entities they each possess a record (either an exact matching record or a fuzzy matching record) in their respective datasets, without revealing to one another information about any entities for which they do not both possess records. Although several solutions have been proposed to solve the PRL problem, no current solution offers a fully cryptographic security guarantee while maintaining both high accuracy of output and subquadratic runtime efficiency. To this end, we propose the first known efficient PRL protocol that runs in subquadratic time, provides high accuracy, and guarantees cryptographic security.

University of Waterloo's Institutional Repository

A Scalable Blocking Framework for Multidatabase Privacy-preserving Record Linkage

Author: Ranbaduge Thilina

Today many application domains, such as national statistics, healthcare, business analytics, fraud detection, and national security, require data to be integrated from multiple databases. Record linkage (RL) is a process used in data integration which links multiple databases to identify matching records that belong to the same entity. RL enriches the usefulness of data by removing duplicates, errors, and inconsistencies, which improves the effectiveness of decision making in data analytics applications. Often, organisations are not willing or authorised to share the sensitive information in their databases with any other party due to privacy and confidentiality regulations. The linkage of databases of different organisations under these constraints is an emerging research area known as privacy-preserving record linkage (PPRL). PPRL facilitates the linkage of databases while ensuring the privacy of the entities in these databases. In a multidatabase (MD) context, PPRL is significantly challenged by the intrinsic exponential growth in the number of potential record pair comparisons. Such linkage often requires significant time and computational resources to produce the resulting matching sets of records. Due to the increased risk of collusion, preserving the privacy of the data becomes more problematic as the number of parties involved in the linkage process increases. Blocking is commonly used to scale the linkage of large databases. The aim of blocking is to remove those record pairs that correspond to non-matches (that is, pairs that refer to different entities). Many techniques have been proposed for RL and PPRL for blocking two databases. However, many of these techniques are not suitable for blocking multiple databases. This creates a need to develop blocking techniques for the multidatabase linkage context, as real-world applications increasingly require more than two databases. This thesis is the first to conduct extensive research on blocking for multidatabase privacy-preserving record linkage (MD-PPRL).
We consider several research problems in blocking for MD-PPRL. First, we start with a broad review of the background literature on PPRL. This allows us to identify the main research gaps that need to be investigated in MD-PPRL. Second, we introduce a blocking framework for MD-PPRL which provides more flexibility and control to database owners in the block generation process. Third, we propose different techniques that are used in our framework for (1) blocking of multiple databases, (2) identifying blocks that need to be compared across subgroups of these databases, and (3) filtering redundant record pair comparisons by the efficient scheduling of block comparisons to improve the scalability of MD-PPRL. Each of these techniques covers an important aspect of blocking in real-world MD-PPRL applications. Finally, this thesis reports on an extensive evaluation of the combined application of these methods with real datasets, which illustrates that they outperform existing approaches in terms of scalability, accuracy, and privacy.
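
The general idea of blocking across several databases, as described in this abstract, can be sketched as below. This is only a generic illustration and not the framework proposed in the thesis: the blocking key (first letter of the surname plus birth year), the record layout, and all function names are hypothetical, and the key is computed on clear text purely for readability, whereas a PPRL setting would use encoded values.

```python
from collections import defaultdict
from itertools import combinations


def blocking_key(record):
    """Hypothetical blocking key: first letter of surname plus birth year."""
    return f"{record['surname'][:1].lower()}{record['birth_year']}"


def build_blocks(databases):
    """Group (database name, record id) pairs from all databases by blocking key."""
    blocks = defaultdict(list)
    for db_name, records in databases.items():
        for record in records:
            blocks[blocking_key(record)].append((db_name, record["id"]))
    return blocks


def candidate_pairs(blocks):
    """Yield only pairs that share a block and come from different databases."""
    for members in blocks.values():
        for left, right in combinations(members, 2):
            if left[0] != right[0]:
                yield left, right


if __name__ == "__main__":
    databases = {
        "A": [{"id": 1, "surname": "Smith", "birth_year": 1980},
              {"id": 2, "surname": "Jones", "birth_year": 1975}],
        "B": [{"id": 10, "surname": "Smyth", "birth_year": 1980}],
        "C": [{"id": 20, "surname": "Stone", "birth_year": 1980},
              {"id": 21, "surname": "Brown", "birth_year": 1990}],
    }
    for pair in candidate_pairs(build_blocks(databases)):
        print(pair)
```

In this toy input, comparing every record with every record from another database would need 8 comparisons, while the blocks leave 3 candidate pairs. The price is that true matches whose blocking keys differ are never compared, which is why the choice of key matters.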

The Australian National University

A Comparison of Statistical Linkage Keys with Bloom Filter-based Encryptions for Privacy-preserving Record Linkage using Real-world Mammography Data

Author: Borgs Christian
Richter A.
Schnell R.
Publication venue: 'Scitepress'
Publication date: 01/01/2017

espace@Curtin

Performance Analysis For Wireless G (IEEE 802.11 G) And Wireless N (IEEE 802.11 N) In Outdoor Environment

Author: Suzi Iryanti Fadilah
Abal Abas Zuraida
Abd Wahab Mohd Helmy
Hashim Wan Nur Wahidah
Shibghatullah Abdul Samad
Publication date: 01/01/2014

This paper describes an analysis of the different capabilities and limitations of two IEEE technologies that have been used for data transmission to mobile devices. In this work, we compare IEEE 802.11g and IEEE 802.11n in an outdoor environment to determine which technology performs better. The comparison considers coverage area (mobility), throughput, and measured interference. The work presented here is intended to help researchers select the best technology for their deployment scenario and to investigate the best variant for outdoor use. The tool used is the Iperf software, which measures the data transmission performance of IEEE 802.11n and IEEE 802.11g.

Universiti Teknikal Malaysia Melaka (UTeM) Repository

Performance analysis for wireless G (IEEE 802.11G) and wireless N (IEEE 802.11N) in outdoor environment

Author: Abal Abas Zuraida
Abd Wahab Mohd Helmy
Fadilah Suri Iryani
Hashim Wan Nur Wahidah
Shibghatullah Abdul Samad
Publication date: 24/06/2014

This paper describes an analysis of the different capabilities and limitations of two IEEE technologies that have been used for data transmission to mobile devices. In this work, we compare IEEE 802.11g and IEEE 802.11n in an outdoor environment to determine which technology performs better. The comparison considers coverage area (mobility), throughput, and measured interference. The work presented here is intended to help researchers select the best technology for their deployment scenario and to investigate the best variant for outdoor use. The tool used is the Iperf software, which measures the data transmission performance of IEEE 802.11n and IEEE 802.11g.

Universiti Teknikal Malaysia Melaka (UTeM) Repository