729 research outputs found

Systematizing Genome Privacy Research: A Privacy-Enhancing Technologies Perspective

Authors: De Cristofaro Emiliano; Malin Bradley; Mittos Alexandros
Publication date: 17/08/2018

Rapid advances in human genomics are enabling researchers to gain a better understanding of the role of the genome in our health and well-being, stimulating hope for more effective and cost-efficient healthcare. However, this also prompts a number of security and privacy concerns stemming from the distinctive characteristics of genomic data. To address them, a new research community has emerged and produced a large number of publications and initiatives. In this paper, we rely on a structured methodology to contextualize and provide a critical analysis of the current knowledge on privacy-enhancing technologies used for testing, storing, and sharing genomic data, using a representative sample of the work published in the past decade. We identify and discuss limitations, technical challenges, and issues faced by the community, focusing in particular on those that are inherently tied to the nature of the problem and are harder for the community alone to address. Finally, we report on the importance and difficulty of the identified challenges based on an online survey of genome data privacy experts.

Comment: To appear in the Proceedings on Privacy Enhancing Technologies (PoPETs), Vol. 2019, Issue

arXiv.org e-Print Archive

UCL Discovery

Privacy-Preserving Genetic Relatedness Test

Authors: De Cristofaro Emiliano; Liang Kaitai; Zhang Yuruo
Publication date: 09/11/2016

An increasing number of individuals are turning to Direct-To-Consumer (DTC) genetic testing to learn about their predisposition to diseases, traits, and/or ancestry. DTC companies like 23andMe and Ancestry.com have started to offer popular and affordable ancestry and genealogy tests, with services allowing users to find unknown relatives and distant cousins. Naturally, access to, and possible dissemination of, genetic data prompt serious privacy concerns, thus motivating the need to design efficient primitives supporting private genetic tests. In this paper, we present an effective protocol for privacy-preserving genetic relatedness test (PP-GRT), enabling a cloud server to run relatedness tests that take as input an encrypted genetic database and a test facility's encrypted genetic sample. We reduce the test to a data matching problem and perform it, privately, using searchable encryption. Finally, a performance evaluation of Hamming-distance-based PP-GRT attests to the practicality of our proposals.

Comment: A preliminary version of this paper appears in the Proceedings of the 3rd International Workshop on Genome Privacy and Security (GenoPri'16)
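The abstract above evaluates a Hamming-distance-based variant of the relatedness test. As a plaintext illustration only, the comparison being performed could be sketched as follows. The sketch involves no encryption, whereas the paper runs the test over encrypted data using searchable encryption. The string encoding of samples, the function names, and the mismatch threshold are all assumptions made for this example, not details taken from the paper.

```python
def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def is_related(a: str, b: str, max_mismatches: int) -> bool:
    """Hypothetical decision rule: related if the samples differ in few positions."""
    return hamming_distance(a, b) <= max_mismatches


# Toy samples (assumed encoding: one character per genetic marker).
sample = "AGCTTAGGCA"
candidate = "AGCTTCGGCA"
distance = hamming_distance(sample, candidate)
print(distance, is_related(sample, candidate, max_mismatches=2))
```

A real relatedness test would operate on far longer marker vectors and a calibrated threshold; the point here is only the shape of the matching problem.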

arXiv.org e-Print Archive

UCL Discovery

Efficient Privacy-preserving Whole-Genome Variant Queries

Authors: Agkün M.; Kohlbacher O.; Pfeifer N.
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2022

MOTIVATION: Diagnosis and treatment decisions based on genomic data have become widespread as the cost of genome sequencing gradually decreases. In this context, disease–gene association studies are of great importance. However, genomic data are very sensitive compared to other data types and contain information about individuals and their relatives. Many studies have shown that this information can be obtained from the query-response pairs on genomic databases. In this work, we propose a method that uses secure multi-party computation to query genomic databases in a privacy-protected manner. The proposed solution privately outsources genomic data from arbitrarily many sources to two non-colluding proxies and allows genomic databases to be safely stored in semi-honest cloud environments. It provides data privacy, query privacy and output privacy by using XOR-based sharing and, unlike previous solutions, it allows queries to run efficiently on hundreds of thousands of genomic data. RESULTS: We measure the performance of our solution with parameters similar to real-world applications. It is possible to query a genomic database with 3 000 000 variants with five genomic query predicates in under 400 ms. Querying 1 048 576 genomes, each containing 1 000 000 variants, for the presence of five different query variants can be achieved in approximately 6 min with a small amount of dedicated hardware and connectivity. These execution times are in the right range to enable real-world applications in medical research and healthcare. Unlike previous studies, it is possible to query multiple databases with response times fast enough for practical application. To the best of our knowledge, this is the first solution that provides this performance for querying large-scale genomic data. AVAILABILITY AND IMPLEMENTATION: https://gitlab.com/DIFUTURE/privacy-preserving-variant-queries. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
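The abstract above mentions XOR-based sharing between two non-colluding proxies. A minimal sketch of generic two-party XOR secret sharing is given below to illustrate the idea. It is not the authors' implementation (see the linked repository for that); the function names and the bit-vector encoding of variants are assumptions for this example.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def share(data: bytes):
    """Split data into two shares; each share alone is uniformly random."""
    mask = secrets.token_bytes(len(data))  # share for the first proxy
    return mask, xor_bytes(data, mask)     # second share = data XOR mask


def reconstruct(share_a: bytes, share_b: bytes) -> bytes:
    """Recombine the two shares by XOR."""
    return xor_bytes(share_a, share_b)


# Assumed encoding: one bit per variant (1 = present, 0 = absent).
genome_1 = bytes([0b10110010, 0b00001111])
genome_2 = bytes([0b10010000, 0b11110000])
a1, b1 = share(genome_1)
a2, b2 = share(genome_2)

# XOR is linear, so each proxy can XOR its own shares locally,
# without communication and without seeing either genome.
combined = reconstruct(xor_bytes(a1, a2), xor_bytes(b1, b2))
assert combined == xor_bytes(genome_1, genome_2)
```

Operations other than XOR (for example, the AND needed to test whether a queried variant is present) require interaction between the parties in a secure multi-party computation and are outside the scope of this sketch.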

PubMed Central

Publikationsserver der Universität Tübingen

MPG.PuRe

Secure Similar Sequence Query on Outsourced Genomic Data

Authors: Asharov Gilad; Atlas Cancer Genome; Demmler Daniel; Elmehdwi Yousef; Emiliano De Cristofaro; Huang Yan; Ishai Yuval; Liu An; Mohassel P.
Publication venue: 'IUScholarWorks'
Publication date: 01/01/2018

The growing availability of genomic data is unlocking research potential in genomic-data analysis. It is of great importance to outsource genomic-analysis tasks to clouds to leverage their powerful computational resources over large-scale genomic sequences. However, the remote placement of the data raises personal-privacy concerns, and it is challenging to evaluate data-analysis functions on outsourced genomic data securely and efficiently. In this work, we study the secure similar-sequence-query (SSQ) problem over outsourced genomic data, which has not been fully investigated. To address the challenges of security and efficiency, we propose two mixed-form protocols, which combine two-party secure secret sharing, garbled circuits, and partially homomorphic encryption and use them to jointly fulfill the secure SSQ function. In addition, our protocols support multi-user queries over a joint genomic data set collected from multiple data owners, making our solution scalable. We formally prove the security of the protocols under the semi-honest adversary model, and theoretically analyze their performance. We use extensive experiments over a real-world dataset on a commercial cloud platform to validate the efficacy of our proposed solution, and demonstrate the performance improvements compared with state-of-the-art works.

Crossref

Boise State University - ScholarWorks

Privacy in the Genomic Era

Authors: Ayday Erman; Clayton Ellen W.; Fellay Jacques; Gunter Carl A.; Hubaux Jean-Pierre; Malin Bradley A.; Naveed Muhammad; Wang XiaoFeng
Publication date: 01/01/2015

Genome sequencing technology has advanced at a rapid pace and it is now possible to generate highly detailed genotypes inexpensively. The collection and analysis of such data has the potential to support various applications, including personalized medical services. While the benefits of the genomics revolution are trumpeted by the biomedical community, the increased availability of such data has major implications for personal privacy, notably because the genome has certain essential features, which include (but are not limited to) (i) an association with traits and certain diseases, (ii) identification capability (e.g., forensics), and (iii) revelation of family relationships. Moreover, direct-to-consumer DNA testing increases the likelihood that genome data will be made available in less regulated environments, such as the Internet and for-profit companies. The problem of genome data privacy thus resides at the crossroads of computer science, medicine, and public policy. While computer scientists have addressed data privacy for various data types, there has been less attention dedicated to genomic data. Thus, the goal of this paper is to provide a systematization of knowledge for the computer science community. In doing so, we address some of the (sometimes erroneous) beliefs of this field and we report on a survey we conducted about genome data privacy with biomedical specialists. Then, after characterizing the genome privacy problem, we review the state of the art regarding privacy attacks on genomic data and strategies for mitigating such attacks, and contextualize these attacks from the perspective of medicine and public policy. This paper concludes with an enumeration of the challenges for genome data privacy and presents a framework to systematize the analysis of threats and the design of countermeasures as the field moves forward.

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Crossref

Bilkent University Institutional Repository

PubMed Central

Cryptology ePrint Archive

Protecting genomic data analytics in the cloud: state of the art and opportunities

Publication venue: BioMed Central

Springer - Publisher Connector

EPISODE: Efficient Privacy-PreservIng Similar Sequence Queries on Outsourced Genomic DatabasEs

Authors: Oleksandr Tkachenko; Thomas Schneider
Publication venue: International Association for Cryptologic Research (IACR)
Publication date: 12/01/2021

Nowadays, genomic sequencing has become much more affordable and, thus, many people own their genomic data in a digital format. Having paid for genomic sequencing, they want to make use of their data for different tasks that are possible only using genomics, and they share their data with third parties to achieve these tasks, e.g., to find their relatives in a genomic database. As a consequence, more genomic data get collected worldwide. The upside of the data collection is that unique analyses on these data become possible. However, this raises privacy concerns because the genomic data uniquely identify their owner, contain sensitive data about his/her risk of getting particular diseases, and even sensitive information about his/her family members. In this paper, we introduce EPISODE, a highly efficient privacy-preserving protocol for Similar Sequence Queries (SSQs), which can be used for finding genetically similar individuals in an outsourced genomic database, i.e., one securely aggregated from data of multiple institutions. Our SSQ protocol is based on the edit distance approximation by Asharov et al. (PETS'18), which we further optimize and extend to the outsourcing scenario. We improve their protocol by using more efficient building blocks and achieve a 5-6x run-time improvement compared to their work in the same two-party scenario. Recently, Cheng et al. (ASIACCS'18) introduced protocols for outsourced SSQs that rely on homomorphic encryption. Our new protocol outperforms theirs by a factor of more than 24,000 in terms of run-time in the same setting and guarantees the same level of security. In addition, we show that our algorithm scales for practical database sizes by querying a database that contains up to a million short sequences within a few minutes, and a database with hundreds of whole-genome sequences containing 75 million alleles each within a few hours.
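The abstract above builds on an approximation of edit distance. For orientation, the exact quantity being approximated can be computed in plaintext with the textbook dynamic-programming algorithm sketched below. This is neither the approximation by Asharov et al. nor the EPISODE protocol, and it provides no privacy; the function name and the toy sequences are assumptions for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character insertions,
    deletions, and substitutions needed to turn a into b."""
    # prev[j] holds the distance between the current prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca with cb (or match)
            ))
        prev = curr
    return prev[-1]


# Toy similar-sequence query: return the database entry closest to the query.
database = ["ACGTTGCA", "ACGATGCA", "TTTTCCCC"]
query = "ACGTGCA"
closest = min(database, key=lambda seq: edit_distance(query, seq))
print(closest)
```

The exact algorithm takes time proportional to the product of the two sequence lengths, which is one reason approximations are attractive for secure computation at genome scale.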

Cryptology ePrint Archive