8 research outputs found

    Phased whole-genome genetic risk in a family quartet using a major allele reference sequence

    Whole-genome sequencing harbors unprecedented potential for characterization of individual and family genetic variation. Here, we develop a novel synthetic human reference sequence that is ethnically concordant and use it for the analysis of genomes from a nuclear family with history of familial thrombophilia. We demonstrate that the use of the major allele reference sequence results in improved genotype accuracy for disease-associated variant loci. We infer recombination sites to the lowest median resolution demonstrated to date (<1,000 base pairs). We use family inheritance state analysis to control sequencing error and inform family-wide haplotype phasing, allowing quantification of genome-wide compound heterozygosity. We develop a sequence-based methodology for Human Leukocyte Antigen typing that contributes to disease risk prediction. Finally, we advance methods for analysis of disease and pharmacogenomic risk across the coding and non-coding genome that incorporate phased variant data. We show these methods are capable of identifying multigenic risk for inherited thrombophilia and informing the appropriate pharmacological therapy. These ethnicity-specific, family-based approaches to interpretation of genetic variation are emblematic of the next generation of genetic risk assessment using whole-genome sequencing.
    Funding: FED was supported by NIH/NHLBI training grant T32 HL094274-01A2 and the Stanford University School of Medicine Dean's Postdoctoral Fellowship. MTW was supported by NIH National Research Service Award fellowship F32 HL097462. JKB, OEC, and CDB were supported by NHGRI grant U01HG005715. CFT, JMH, KS, LG, MW-C, MW, and RBA were supported by grants from the NIH/NIGMS U01 GM61374. KEO was supported by NIH/NHGRI 5 P50 HG003389-05. AJB was supported by the Lucile Packard Foundation for Children's Health, Hewlett Packard Foundation, and NIH/NIGMS R01 GM079719. JTD and KJK were supported by NIH/NLM T15 LM007033. EAA was supported by NIH/NHLBI KO8 HL083914, NIH New Investigator DP2 Award OD004613, and a grant from the Breetwor Family Foundation. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
    Competing Interests: JVT and AWZ are founders, consultants, and equity holders in Clinical Future; GMC has advisory roles in and research sponsorships from several companies involved in genome sequencing technology and personal genomics (see http://arep.med.harvard.edu/gmc/tech.html); MS is on the scientific advisory board of DNA Nexus and holds stock in Personalis; RBA has received consultancy fees from Novartis and 23andMe and holds stock in Personalis; AJB is a scientific advisory board membe

    Ethical, legal and social issues in diversifying genomic data: literature review and synthesis

    Advances in technology have resulted in the ability to sequence entire human genomes as a routine, relatively inexpensive investigation in healthcare. This offers many promises of personalising, stratifying, and targeting healthcare with an understanding of genetic susceptibility to particular diseases or conditions. However, the research collections (databases, biobanks, etc.) that underpin these developments are significantly skewed towards populations of European ancestry, meaning that our understanding of genetic susceptibility (or indeed of genetic protection against disease) is poorer for many other populations in the world. Just as a dermatology textbook skewed towards skin problems on white skin may be less useful to black populations, so genomic knowledge derived from one particular ancestry may be less useful to people with different ancestries. The need to diversify genomic data, to improve the evidence base for genomic medicine for all ancestries, is well recognised, but is more complex than simply increasing the collection of data from people from a range of ancestries. We reviewed the literature to understand the challenges of diversifying genomic data and to identify key ethical, legal and social issues. Our findings were: 1. Many research practices are exclusionary and need to change. Examples include approaches to recruitment or data collection that do not consider the cultural setting in which potential participants are situated. Research also often lacks reflexivity about diversity on the part of researchers and research institutions. 2. Co-design is key to identifying and avoiding potential problems around data diversification. This requires an understanding of the concerns of underserved individuals and communities regarding exploitation and stigmatisation, as well as issues of data ownership and sovereignty. 
Without attention to group as well as individual concerns, participant engagement may become tokenistic, which in turn risks exacerbating existing inequalities as well as creating new ones. 3. There are wider structural issues that influence researchers’ and participants’ attempts to generate diverse data. For example: (a) some researchers view data as neutral, but this ignores the social construction of data and technologies, and their tendencies to reflect societal inequalities; (b) efforts to diversify data should be contextualised within the historical trajectory of structural racism and legacies of colonialism; (c) classification and categorisation of populations have political consequences and need to be closely interrogated. These findings show that deliberation between researchers and participants, during all stages of research from planning and recruitment through to analysis, interpretation and dissemination, is key to successful diversification.

    ????????? ??????????????????

    Department of Biomedical Engineering. Human genomes are routinely compared against a universal human reference. However, this strategy could miss population-specific and personal genomic variations, which may be detected more efficiently using an ethnically-relevant or personal reference. Here I describe principles and methods for constructing a hybrid assembly of the first Korean reference genome (KOREF) by combining all the major contemporary sequencing and mapping technologies: short and long paired-end sequences, synthetic and single molecule long reads, and optical and nanochannel genome maps. This low-cost hybrid approach shows the feasibility of routine reference-quality de novo assembled genomes to precisely analyze many personal and ethnic genomes in the future. I also introduce the concept of the consensus variome reference, providing information on millions of variants incorporated directly from 40 additional ethnically homogeneous genomes from the Korean Personal Genome Project. KOREF is the first de novo assembled consensus variome reference. KOREF has been constructed according to standardized production and evaluation procedures, and registered as standard reference data for ethnic Korean genomes by evaluating its traceability, uncertainty, and consistency. By comparing KOREF against other ethnic references, I find that the ethnically-relevant consensus reference can be beneficial for efficient variant detection and possibly other purposes in the future. Therefore, I propose that, despite the limited level of divergence within our species, the level of genomic scale variation is sufficiently high to warrant the use of ethnically-relevant references for large-scale personal and disease genome projects. Systematic comparison of human assemblies also shows the importance of assembly quality, suggesting the necessity of new technologies to comprehensively map ethnic and personal genomic structural variations. 
In the era of large-scale population genome projects, leveraging ethnicity-specific genome assemblies as well as the human reference genome will accelerate the mapping of all human genome diversity on Earth.
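
As an illustration of the general idea behind a consensus or major-allele reference (and not the KOREF pipeline or any published method), the following minimal Python sketch replaces a reference base wherever a different allele is a strict majority in a sample panel. The function name `major_allele_reference`, the input format (0-based positions mapped to lists of observed single-base alleles), and the toy data are all assumptions made for this example; real pipelines work with VCF files, indels, and allele-frequency estimates rather than raw lists.

```python
from collections import Counter


def major_allele_reference(reference, alleles_by_position):
    """Return `reference` with each listed position set to the panel's majority allele.

    reference: string of bases (toy stand-in for a reference sequence).
    alleles_by_position: hypothetical input mapping a 0-based position to the
        list of single-base alleles observed across the sample panel.
    """
    bases = list(reference)
    for pos, alleles in alleles_by_position.items():
        major, major_count = Counter(alleles).most_common(1)[0]
        # Substitute only when a non-reference allele is a strict majority;
        # ties leave the original reference base in place.
        if major != bases[pos] and major_count * 2 > len(alleles):
            bases[pos] = major
    return "".join(bases)


# Toy data, invented for illustration only.
reference = "ACGTACGT"
panel = {
    1: ["T", "T", "T", "C"],  # non-reference T is the majority
    4: ["A", "A", "G", "G"],  # tie, so the reference base is kept
    6: ["G", "A", "A", "A"],  # non-reference A is the majority
}
consensus = major_allele_reference(reference, panel)
print(consensus)
```

Keeping the reference base on ties is a design choice of this sketch; it keeps the output as close as possible to the original coordinates and sequence when the panel gives no clear majority.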

    A molecular and genetic analysis of otosclerosis

    Otosclerosis is a common form of conductive hearing loss. It is characterised by abnormal bone remodelling within the otic capsule, leading to formation of sclerotic lesions of the temporal bone. Encroachment of these lesions onto the footplate of the stapes in the middle ear leads to stapes fixation and subsequent conductive hearing loss. The hereditary nature of otosclerosis has long been recognised due to its recurrence within families, but its genetic aetiology is yet to be characterised. Although many familial linkage studies and candidate gene association studies investigating the genetic nature of otosclerosis have been performed in recent years, progress in identifying disease-causing genes has been slow. This is largely due to the highly heterogeneous nature of this condition. The research presented in this thesis examines the molecular and genetic basis of otosclerosis using two next generation sequencing technologies: RNA-sequencing and Whole Exome Sequencing. RNA-sequencing has provided human stapes transcriptomes for healthy and diseased stapes, and in combination with pathway analysis has helped identify genes and molecular processes dysregulated in otosclerotic tissue. Whole Exome Sequencing has been employed to investigate rare variants that segregate with otosclerosis in affected families, and has been followed by a variant filtering strategy that prioritised genes found to be dysregulated in the RNA-sequencing data. This has identified multiple variants predicted to be involved in splicing within genes involved in the bone disorder Osteogenesis Imperfecta, indicating a shared genetic aetiology for this condition and otosclerosis and a possible disease mechanism involving alternative splicing in the stapes. Whilst the heritability of otosclerosis remains elusive, the identification of new candidate genes will make a significant contribution to the current literature. 
It is hoped that, in the long term, this research will help reveal disease mechanisms and thereby improve treatment options for otosclerosis patients.