1,240 research outputs found
Routes for breaching and protecting genetic privacy
We are entering the era of ubiquitous genetic information for research,
clinical care, and personal curiosity. Sharing these datasets is vital for
rapid progress in understanding the genetic basis of human diseases. However,
one growing concern is the ability to protect the genetic privacy of the data
originators. Here, we technically map threats to genetic privacy and discuss
potential mitigation strategies for privacy-preserving dissemination of genetic
data.Comment: Draft for comment
Supporting Regularized Logistic Regression Privately and Efficiently
As one of the most popular statistical and machine learning models, logistic
regression with regularization has found wide adoption in biomedicine, social
sciences, information technology, and so on. These domains often involve data
of human subjects that are contingent upon strict privacy regulations.
Increasing concerns over data privacy make it more and more difficult to
coordinate and conduct large-scale collaborative studies, which typically rely
on cross-institution data sharing and joint analysis. Our work here focuses on
safeguarding regularized logistic regression, a widely-used machine learning
model in various disciplines while at the same time has not been investigated
from a data security and privacy perspective. We consider a common use scenario
of multi-institution collaborative studies, such as in the form of research
consortia or networks as widely seen in genetics, epidemiology, social
sciences, etc. To make our privacy-enhancing solution practical, we demonstrate
a non-conventional and computationally efficient method leveraging distributing
computing and strong cryptography to provide comprehensive protection over
individual-level and summary data. Extensive empirical evaluation on several
studies validated the privacy guarantees, efficiency and scalability of our
proposal. We also discuss the practical implications of our solution for
large-scale studies and applications from various disciplines, including
genetic and biomedical studies, smart grid, network analysis, etc
Enabling Privacy-Preserving GWAS in Heterogeneous Human Populations
The projected increase of genotyping in the clinic and the rise of large
genomic databases has led to the possibility of using patient medical data to
perform genomewide association studies (GWAS) on a larger scale and at a lower
cost than ever before. Due to privacy concerns, however, access to this data is
limited to a few trusted individuals, greatly reducing its impact on biomedical
research. Privacy preserving methods have been suggested as a way of allowing
more people access to this precious data while protecting patients. In
particular, there has been growing interest in applying the concept of
differential privacy to GWAS results. Unfortunately, previous approaches for
performing differentially private GWAS are based on rather simple statistics
that have some major limitations. In particular, they do not correct for
population stratification, a major issue when dealing with the genetically
diverse populations present in modern GWAS. To address this concern we
introduce a novel computational framework for performing GWAS that tailors
ideas from differential privacy to protect private phenotype information, while
at the same time correcting for population stratification. This framework
allows us to produce privacy preserving GWAS results based on two of the most
commonly used GWAS statistics: EIGENSTRAT and linear mixed model (LMM) based
statistics. We test our differentially private statistics, PrivSTRAT and
PrivLMM, on both simulated and real GWAS datasets and find that they are able
to protect privacy while returning meaningful GWAS results.Comment: To be presented at RECOMB 201
The Crypto-democracy and the Trustworthy
In the current architecture of the Internet, there is a strong asymmetry in
terms of power between the entities that gather and process personal data
(e.g., major Internet companies, telecom operators, cloud providers, ...) and
the individuals from which this personal data is issued. In particular,
individuals have no choice but to blindly trust that these entities will
respect their privacy and protect their personal data. In this position paper,
we address this issue by proposing an utopian crypto-democracy model based on
existing scientific achievements from the field of cryptography. More
precisely, our main objective is to show that cryptographic primitives,
including in particular secure multiparty computation, offer a practical
solution to protect privacy while minimizing the trust assumptions. In the
crypto-democracy envisioned, individuals do not have to trust a single physical
entity with their personal data but rather their data is distributed among
several institutions. Together these institutions form a virtual entity called
the Trustworthy that is responsible for the storage of this data but which can
also compute on it (provided first that all the institutions agree on this).
Finally, we also propose a realistic proof-of-concept of the Trustworthy, in
which the roles of institutions are played by universities. This
proof-of-concept would have an important impact in demonstrating the
possibilities offered by the crypto-democracy paradigm.Comment: DPM 201
Systematizing Genome Privacy Research: A Privacy-Enhancing Technologies Perspective
Rapid advances in human genomics are enabling researchers to gain a better
understanding of the role of the genome in our health and well-being,
stimulating hope for more effective and cost efficient healthcare. However,
this also prompts a number of security and privacy concerns stemming from the
distinctive characteristics of genomic data. To address them, a new research
community has emerged and produced a large number of publications and
initiatives.
In this paper, we rely on a structured methodology to contextualize and
provide a critical analysis of the current knowledge on privacy-enhancing
technologies used for testing, storing, and sharing genomic data, using a
representative sample of the work published in the past decade. We identify and
discuss limitations, technical challenges, and issues faced by the community,
focusing in particular on those that are inherently tied to the nature of the
problem and are harder for the community alone to address. Finally, we report
on the importance and difficulty of the identified challenges based on an
online survey of genome data privacy expertsComment: To appear in the Proceedings on Privacy Enhancing Technologies
(PoPETs), Vol. 2019, Issue
- âŠ