
    Routes for breaching and protecting genetic privacy

    We are entering the era of ubiquitous genetic information for research, clinical care, and personal curiosity. Sharing these datasets is vital for rapid progress in understanding the genetic basis of human diseases. However, one growing concern is the ability to protect the genetic privacy of the data originators. Here, we technically map threats to genetic privacy and discuss potential mitigation strategies for privacy-preserving dissemination of genetic data.
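    A recurring threat in this literature is membership inference against published aggregate allele frequencies (a Homer-style attack). The sketch below runs on simulated genotypes; the log-likelihood-ratio statistic and all parameters are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

rng = np.random.default_rng(0)
n_snps, pool_size = 1000, 500

# Reference population allele frequencies, and a "study pool" drawn from them.
pop_freq = rng.uniform(0.05, 0.95, n_snps)
pool = rng.binomial(2, pop_freq, size=(pool_size, n_snps))
pool_freq = pool.mean(axis=0) / 2  # the published aggregate statistic

def llr_membership(genotype, pool_freq, pop_freq, eps=1e-6):
    """Log-likelihood ratio: is this genotype more consistent with the pool's
    frequencies than with the reference population's? Binomial(2, f) model;
    the combinatorial terms cancel in the ratio."""
    pf = np.clip(pool_freq, eps, 1 - eps)
    qf = np.clip(pop_freq, eps, 1 - eps)
    ll_pool = genotype * np.log(pf) + (2 - genotype) * np.log(1 - pf)
    ll_pop = genotype * np.log(qf) + (2 - genotype) * np.log(1 - qf)
    return (ll_pool - ll_pop).sum()

member = pool[0]                                # genotype inside the study
non_member = rng.binomial(2, pop_freq, n_snps)  # genotype outside the study
print(llr_membership(member, pool_freq, pop_freq))      # typically clearly > 0
print(llr_membership(non_member, pool_freq, pop_freq))  # typically near or below 0
```

    The gap between the two scores is what lets an adversary decide whether a known genotype contributed to the published frequencies, which is why raw aggregate statistics are no longer treated as safely anonymous.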

    Creation of public use files: lessons learned from the comparative effectiveness research public use files data pilot project

    In this paper we describe lessons learned from the creation of Basic Stand Alone (BSA) Public Use Files (PUFs) for the Comparative Effectiveness Research Public Use Files Data Pilot Project (CER-PUF). CER-PUF is aimed at increasing access to the Centers for Medicare and Medicaid Services (CMS) Medicare claims datasets through PUFs that: do not require user fees and data use agreements, have been de-identified to assure the confidentiality of beneficiaries and providers, and still provide substantial analytic utility to researchers. For this paper we define PUFs as datasets characterized by free and unrestricted access to any user. We derive lessons learned from five major project activities: (i) a review of the statistical and computer science literature on best practices in PUF creation, (ii) interviews with comparative effectiveness researchers to assess their data needs, (iii) case studies of PUF initiatives in the United States, (iv) interviews with stakeholders to identify the most salient issues regarding making microdata publicly available, and (v) the actual process of creating the Medicare claims data BSA PUFs.
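    The de-identification/utility trade-off at the heart of PUF creation can be illustrated with a toy generalize-and-suppress pass over claims-like records. The column names, the 5-year age bands, and the group-size threshold K are illustrative assumptions, not the actual CER-PUF rules (real releases use larger thresholds and many more quasi-identifiers).

```python
import pandas as pd

K = 3  # minimum group size before a record is released (toy value)

claims = pd.DataFrame({
    "age":   [67, 68, 71, 84, 92, 66, 70, 73],
    "sex":   ["F", "F", "M", "F", "M", "F", "M", "M"],
    "state": ["OH", "OH", "OH", "WY", "WY", "OH", "OH", "OH"],
    "cost":  [1200, 900, 15000, 300, 45000, 700, 2100, 980],
})

# Generalize: replace exact age with a 5-year band.
claims["age_band"] = (claims["age"] // 5) * 5

# Suppress: release a record only if its quasi-identifier combination
# (age_band, sex, state) is shared by at least K records, so no released
# row is consistent with fewer than K beneficiaries.
group_sizes = claims.groupby(["age_band", "sex", "state"])["cost"].transform("size")
puf = claims.loc[group_sizes >= K, ["age_band", "sex", "state", "cost"]]
print(puf)  # the two rare (age_band, sex, state) combinations are dropped
```

    Every generalization or suppression step lowers re-identification risk but also removes detail a researcher might need, which is exactly the balance the BSA PUF design had to strike.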

    New threats to health data privacy

    Background: Along with the rapid digitalization of health data (e.g., electronic health records), there is increasing concern about maintaining data privacy while garnering the benefits, especially when the data must be published for secondary use. Most current research on protecting health data privacy centers on data de-identification and data anonymization, which remove identifiable information from published health data to prevent an adversary from reasoning about the privacy of patients. However, published health data are not the only source adversaries can draw on: with the large amount of information that people voluntarily share on the Web, sophisticated attacks that join disparate pieces of information from multiple sources against health data privacy become practical. To date, limited effort has been devoted to studying these attacks.

    Results: We study how patient privacy could be compromised with the help of today's information technologies. In particular, we show that private healthcare information can be collected by aggregating and associating disparate pieces of information from multiple online data sources, including online social networks, public records, and search engine results. We present a real-world case study showing that user identity and privacy are highly vulnerable to attribution, inference, and aggregation attacks. Through analysis of real data, we also show that people are highly identifiable to adversaries even when the adversary holds only inaccurate information about the target.

    Conclusion: So much information has been made electronically available online that people are highly vulnerable without effective privacy protection.
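    At its simplest, the aggregation attack the authors describe reduces to a join on quasi-identifiers shared between a "de-identified" health release and publicly scraped profiles. A minimal sketch with invented data:

```python
import pandas as pd

# "De-identified" health records: names removed, quasi-identifiers kept.
health = pd.DataFrame({
    "zip": ["02139", "02139", "94305"],
    "birth_year": [1978, 1991, 1985],
    "sex": ["M", "F", "F"],
    "diagnosis": ["depression", "asthma", "diabetes"],
})

# Publicly available profiles with names attached (voluntarily shared online).
profiles = pd.DataFrame({
    "name": ["A. Smith", "B. Jones", "C. Lee"],
    "zip": ["02139", "02139", "94107"],
    "birth_year": [1978, 1991, 1985],
    "sex": ["M", "F", "F"],
})

# The attack: an equi-join on the shared quasi-identifiers re-attaches
# names to diagnoses wherever the (zip, birth_year, sex) triple is unique.
reidentified = health.merge(profiles, on=["zip", "birth_year", "sex"])
print(reidentified[["name", "diagnosis"]])
```

    Real attacks replace the exact join with fuzzy matching and add inference steps, but the failure mode is the same: de-identification judged against one dataset in isolation says little about safety once auxiliary sources are joined in.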

    Challenges of web-based personal genomic data sharing

    In order to study the relationship between genes and diseases, the increasing availability and sharing of phenotypic and genotypic data have been promoted as an imperative within the scientific community. In parallel with data sharing by clinicians and researchers, recent initiatives have emerged in which individuals share their own personal genomic data. Individuals' involvement in such initiatives is facilitated by the increased accessibility of personal genomic data offered by private test providers, along with the availability of online networks. Personal webpages and online data-sharing platforms such as Consent to Research (Portable Legal Consent), Free the Data, and Genomes Unzipped are being used to host and share genotypes, electronic health records, and family histories uploaded by individuals. Although personal genomic data sharing initiatives vary in nature, an emphasis on individuals' control over their data, in order to benefit research and ultimately health care, has been a key theme across these initiatives. In line with the growing practice of personal genomic data sharing, this paper aims to shed light on the potential challenges surrounding these initiatives. Because these initiatives ask individuals to balance the risks and benefits of sharing their genomic data on their own, their awareness of the implications of personal genomic data sharing for themselves and their family members is a necessity. Furthermore, given the sensitivity of genomic data and the controversies around their complete de-identifiability, potential privacy risks and harms arising from unintended uses of the data have to be taken into consideration.
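    One way to see why complete de-identifiability of genomic data is contested: even a few dozen common SNPs form a near-unique fingerprint. The simulation below sketches this; independent SNPs and a uniform cohort are simplifying assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
cohort, n_snps = 100_000, 40

# Simulated genotypes (0/1/2 copies of the minor allele) at independent
# common SNPs; common variants are the least identifying per site, yet a
# short prefix of them already singles out one person in a large cohort.
freqs = rng.uniform(0.2, 0.8, n_snps)
genotypes = rng.binomial(2, freqs, size=(cohort, n_snps))

target = genotypes[0]
for k in (10, 20, 30, 40):
    matches = (genotypes[:, :k] == target[:k]).all(axis=1).sum()
    print(f"first {k} SNPs: {matches} of {cohort} cohort members match the target")
```

    With roughly 20 common SNPs the expected number of coincidental matches drops below one, so a shared genotype file effectively identifies its donor even when the name field is stripped.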

    DRM and Privacy

    Interrogating the relationship between copyright enforcement and privacy raises deeper questions about the nature of privacy and what counts, or ought to count, as privacy invasion in the age of networked digital technologies. This Article begins, in Part II, by identifying the privacy interests that individuals enjoy in their intellectual activities and exploring the different ways in which certain implementations of DRM technologies may threaten those interests. Part III considers the appropriate scope of legal protection for privacy in the context of DRM, and argues that both the common law of privacy and an expanded conception of consumer protection law have roles to play in protecting the privacy of information users. As Parts II and III demonstrate, consideration of how the theory and law of privacy should respond to the development and implementation of DRM technologies also raises the reverse question: How should the development and implementation of DRM technologies respond to privacy theory and law? As artifacts designed to regulate user behavior, DRM technologies already embody value choices. Might privacy itself become one of the values embodied in DRM design? Part IV argues that with some conceptual and procedural adjustments, DRM technologies and related standard-setting processes could be harnessed to preserve and protect privacy.

    Identification, data combination and the risk of disclosure

    It is commonplace that the data needed for econometric inference are not contained in a single source. In this paper we analyze the problem of parametric inference from combined individual-level data when data combination is based on personal and demographic identifiers such as name, age, or address. Our main question is the identification of the econometric model based on the combined data when the data do not contain exact individual identifiers and no parametric assumptions are imposed on the joint distribution of information that is common across the combined datasets. We demonstrate the conditions on the observable marginal distributions of data in the individual datasets that can and cannot guarantee identification of the parameters of interest. We also note that the data combination procedure is essential in a semiparametric setting such as ours. Because the (non-parametric) data combination procedure can only be defined in finite samples, we introduce a new notion of identification based on the concept of limits of statistical experiments. Our results apply to settings where the individual data used for inference are sensitive and their combination may substantially increase that sensitivity or de-anonymize previously anonymized information. We demonstrate that point identification of an econometric model from combined data is incompatible with restrictions on the risk of individual disclosure. If the data combination procedure guarantees a bound on the risk of individual disclosure, then the information available from the combined dataset allows one to identify the parameter of interest only partially, and the size of the identification region is inversely related to the upper bound on the disclosure risk. This result is new in the context of data combination: we observe that the quality of the links needed in the combined data to assure point identification may be much higher than the average link quality in the entire dataset, so point inference requires the use of the most sensitive subset of the data. Our results provide important insights into the ongoing discourse on the empirical analysis of merged administrative records, as well as into discussions of the disclosive nature of policies implemented by data-driven companies (such as Internet services companies, and medical companies using individual patient records for policy decisions).
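    The trade-off the paper formalizes can be caricatured numerically: raising the minimum acceptable link quality (playing the role of a disclosure-risk bound here) shrinks the linked sample available for inference. The Beta-distributed match probabilities below are an illustrative assumption, not the paper's model.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10_000

# Simulated posterior probabilities that a candidate link between two
# datasets is correct (e.g., scores from a probabilistic linkage model).
link_quality = rng.beta(2, 1, n)

for threshold in (0.5, 0.8, 0.95, 0.99):
    kept = link_quality >= threshold
    print(f"threshold {threshold:.2f}: "
          f"{kept.mean():6.1%} of candidate links kept, "
          f"mean quality of kept links {link_quality[kept].mean():.3f}")
# Tightening the bound (raising the threshold) keeps only the highest-quality,
# and therefore most disclosive, links -- and the shrinking linked sample is
# the intuition behind partial rather than point identification.
```

    This is only the intuition; the paper's result is sharper, tying the size of the identification region directly to the upper bound on individual disclosure risk.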