No major flaws in "Identification of individuals by trait prediction using whole-genome sequencing data"
In a recently published PNAS article, we studied the identifiability of genomic samples using machine learning methods [Lippert et al., 2017]. In a response, Erlich [2017]
argued that our work contained major flaws. The main technical critique of Erlich [2017] builds on a simulation experiment that shows that our proposed algorithm, which
uses only a genomic sample for identification, performed no better than a strategy that uses demographic variables. Below, we show why this comparison is misleading and
provide a detailed discussion of the key criticisms of our analyses raised in Erlich [2017] and in the media. Further, faces are not the only thing that may be derived
from DNA; a wide range of phenotypes and demographic variables may be as well. In this light, the main contribution of Lippert et al. [2017] is an algorithm that identifies genomes of
individuals by combining multiple DNA-based predictive models for a myriad of traits.
Identification of individuals by trait prediction using whole-genome sequencing data
Prediction of human physical traits and demographic information from genomic data challenges privacy and data deidentification in personalized medicine. To explore the current capabilities of phenotype-based genomic identification, we applied whole-genome sequencing, detailed phenotyping, and statistical modeling to predict biometric traits in a cohort of 1,061 participants of diverse ancestry. For a large fraction of the traits, the predictive accuracy of each individual trait beyond ancestry and demographic information is limited. However, we have developed a maximum entropy algorithm that integrates multiple predictions to determine which genomic samples and phenotype measurements originate from the same person. Using this algorithm, we have reidentified an average of >8 of 10 held-out individuals in an ethnically mixed cohort and an average of 5 of either 10 African Americans or 10 Europeans. This work challenges current conceptions of personal privacy and may have far-reaching ethical and legal implications.