199 research outputs found

    Integrating Sequencing Technologies in Personal Genomics: Optimal Low Cost Reconstruction of Structural Variants

    The goal of human genome re-sequencing is obtaining an accurate assembly of an individual's genome. Recently, there has been great excitement in the development of many technologies for this (e.g. medium and short read sequencing from companies such as 454 and SOLiD, and high-density oligo-arrays from Affymetrix and NimbleGen), with even more expected to appear. The costs and sensitivities of these technologies differ considerably from each other. As an important goal of personal genomics is to reduce the cost of re-sequencing to an affordable point, it is worthwhile to consider optimally integrating technologies. Here, we build a simulation toolbox that will help us optimally combine different technologies for genome re-sequencing, especially in reconstructing large structural variants (SVs). SV reconstruction is considered the most challenging step in human genome re-sequencing. (It is sometimes even harder than de novo assembly of small genomes because of the duplications and repetitive sequences in the human genome.) To this end, we formulate canonical problems that are representative of issues in reconstruction and are of small enough scale to be computationally tractable and simulatable. Using semi-realistic simulations, we show how we can combine different technologies to optimally solve the assembly at low cost. With mapability maps, our simulations efficiently handle the inhomogeneous repeat-containing structure of the human genome and the computational complexity of practical assembly algorithms. They quantitatively show how combining different read lengths is more cost-effective than using one length, how an optimal mixed sequencing strategy for reconstructing large novel SVs usually also gives accurate detection of SNPs/indels, how paired-end reads can improve reconstruction efficiency, and how adding in arrays is more efficient than just sequencing for disentangling some complex SVs. Our strategy should facilitate the sequencing of human genomes at maximum accuracy and low cost.

    Diagnostic exome sequencing in 266 Dutch patients with visual impairment

    Inherited eye disorders have a large clinical and genetic heterogeneity, which makes genetic diagnosis cumbersome. An exome-sequencing approach was developed in which data analysis was divided into two steps: the vision gene panel and exome analysis. In the vision gene panel analysis, variants in genes known to cause inherited eye disorders were assessed for pathogenicity. If no causative variants were detected and when the patient consented, the entire exome data was analyzed. A total of 266 Dutch patients with different types of inherited eye disorders, including inherited retinal dystrophies, cataract, developmental eye disorders and optic atrophy, were investigated. In the vision gene panel analysis, (likely) causative variants were detected in 49% of the patients, and in the exome analysis in an additional 2%. The highest detection rate of (likely) causative variants was in patients with inherited retinal dystrophies, for instance a yield of 63% in patients with retinitis pigmentosa. In patients with developmental eye defects, cataract and optic atrophy, the detection rate was 50, 33 and 17%, respectively. An exome-sequencing approach enables a genetic diagnosis in patients with different types of inherited eye disorders using one test. The exome approach has the same detection rate as targeted panel sequencing tests, but offers a number of advantages. For instance, the vision gene panel can be frequently and easily updated with additional (novel) eye disorder genes. Determination of the genetic diagnosis improved the clinical diagnosis, regarding the assessment of the inheritance pattern as well as future disease perspective.

    Mapping and phasing of structural variation in patient genomes using nanopore sequencing

    Despite improvements in genomics technology, the detection of structural variants (SVs) from short-read sequencing still poses challenges, particularly for complex variation. Here we analyse the genomes of two patients with congenital abnormalities using the MinION nanopore sequencer and a novel computational pipeline, NanoSV. We demonstrate that nanopore long reads are superior to short reads with regard to detection of de novo chromothripsis rearrangements. The long reads also enable efficient phasing of genetic variations, which we leveraged to determine the parental origin of all de novo chromothripsis breakpoints and to resolve the structure of these complex rearrangements. Additionally, genome-wide surveillance of inherited SVs reveals novel variants, missed in short-read data sets, a large proportion of which are retrotransposon insertions. We provide a first exploration of patient genome sequencing with a nanopore sequencer and demonstrate the value of long-read sequencing in mapping and phasing of SVs for both clinical and research applications.

    Enabling global clinical collaborations on identifiable patient data: The Minerva Initiative

    The clinical utility of computational phenotyping for both genetic and rare diseases is increasingly appreciated; however, its true potential is yet to be fully realized. Alongside the growing clinical and research availability of sequencing technologies, precise deep and scalable phenotyping is required to serve unmet need in genetic and rare diseases. To improve the lives of individuals affected with rare diseases through deep phenotyping, global big data interrogation is necessary to aid our understanding of disease biology, assist diagnosis, and develop targeted treatment strategies. This includes the application of cutting-edge machine learning methods to image data. As with most digital tools employed in health care, there are ethical and data governance challenges associated with using identifiable personal image data. There are also risks with failing to deliver on the patient benefits of these new technologies, the biggest of which is posed by data siloing. The Minerva Initiative has been designed to enable the public good of deep phenotyping while mitigating these ethical risks. Its open structure, enabling collaboration and data sharing between individuals, clinicians, researchers and private enterprise, is key for delivering precision public health.

    Opposite Modulation of RAC1 by Mutations in TRIO Is Associated with Distinct, Domain-Specific Neurodevelopmental Disorders

    The Rho-guanine nucleotide exchange factor (RhoGEF) TRIO acts as a key regulator of neuronal migration, axonal outgrowth, axon guidance, and synaptogenesis by activating the GTPase RAC1 and modulating actin cytoskeleton remodeling. Pathogenic variants in TRIO are associated with neurodevelopmental diseases, including intellectual disability (ID) and autism spectrum disorders (ASD). Here, we report the largest international cohort of 24 individuals with confirmed pathogenic missense or nonsense variants in TRIO. The nonsense mutations are spread along the TRIO sequence, and affected individuals show variable neurodevelopmental phenotypes. In contrast, missense variants cluster into two mutational hotspots in the TRIO sequence, one in the seventh spectrin repeat and one in the RAC1-activating GEFD1. Although all individuals in this cohort present with developmental delay and a neuro-behavioral phenotype, individuals with a pathogenic variant in the seventh spectrin repeat have a more severe ID associated with macrocephaly than do most individuals with GEFD1 variants, who display milder ID and microcephaly. Functional studies show that the spectrin and GEFD1 variants cause a TRIO-mediated hyper- or hypo-activation of RAC1, respectively, and we observe a striking correlation between RAC1 activation levels and the head size of the affected individuals. In addition, truncations in TRIO GEFD1 in the vertebrate model X. tropicalis induce defects that are concordant with the human phenotype. This work demonstrates distinct clinical and molecular disorders clustering in the GEFD1 and seventh spectrin repeat domains and highlights the importance of tight control of TRIO-RAC1 signaling in neuronal development.

    Accurate Distinction of Pathogenic from Benign CNVs in Mental Retardation

    Copy number variants (CNVs) have recently been recognized as a common form of genomic variation in humans. Hundreds of CNVs can be detected in any individual genome using genomic microarrays or whole genome sequencing technology, but their phenotypic consequences are still poorly understood. Rare CNVs have been reported as a frequent cause of neurological disorders such as mental retardation (MR), schizophrenia and autism, prompting widespread implementation of CNV screening in diagnostics. In previous studies we have shown that, in contrast to benign CNVs, MR-associated CNVs are significantly enriched in genes whose mouse orthologues, when disrupted, result in a nervous system phenotype. In this study we developed and validated a novel computational method for differentiating between benign and MR-associated CNVs using structural and functional genomic features to annotate each CNV. In total 13 genomic features were included in the final version of a Naïve Bayesian Tree classifier, with LINE density and mouse knock-out phenotypes contributing most to the classifier's accuracy. After demonstrating that our method (called GECCO) perfectly classifies CNVs causing known MR-associated syndromes, we show that it achieves high accuracy (94%) and negative predictive value (99%) on a blinded test set of more than 1,200 CNVs from a large cohort of individuals with MR. These results indicate that this classification method will be of value for objectively prioritizing CNVs in clinical research and diagnostics.
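
The two metrics quoted in this abstract, accuracy and negative predictive value (NPV), have standard definitions in terms of a confusion matrix. The following is a minimal sketch of those definitions only, not the GECCO method itself. The function names and the counts are hypothetical and chosen purely for illustration; they are not the study's data and are not meant to reproduce its 94% and 99% figures.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all calls (pathogenic or benign) that are correct."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (tp + tn) / total


def negative_predictive_value(tn: int, fn: int) -> float:
    """Fraction of CNVs called benign that are truly benign."""
    called_negative = tn + fn
    if called_negative == 0:
        raise ValueError("no negative calls were made")
    return tn / called_negative


# Hypothetical counts for illustration only (NOT taken from the study):
# tp = pathogenic called pathogenic, tn = benign called benign,
# fp = benign called pathogenic,     fn = pathogenic called benign.
tp, tn, fp, fn = 40, 150, 8, 2

print(f"accuracy = {accuracy(tp, tn, fp, fn):.3f}")
print(f"NPV      = {negative_predictive_value(tn, fn):.3f}")
```

A high NPV matters for prioritization because it indicates that CNVs set aside as benign are rarely pathogenic ones that were missed.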

    The Genome of the Netherlands: Design, and project goals

    Within the Netherlands a national network of biobanks has been established (Biobanking and Biomolecular Research Infrastructure-Netherlands (BBMRI-NL)) as a national node of the European BBMRI. One of the aims of BBMRI-NL is to enrich biobanks with different types of molecular and phenotype data. Here, we describe the Genome of the Netherlands (GoNL), one of the projects within BBMRI-NL. GoNL is a whole-genome-sequencing project in a representative sample consisting of 250 trio-families from all provinces in the Netherlands, which aims to characterize DNA sequence variation in the Dutch population. The parent-offspring trios include adult individuals ranging in age from 19 to 87 years (mean=53 years; SD=16 years) from birth cohorts 1910-1994. Sequencing was done on blood-derived DNA from uncultured cells and accomplished coverage was 14-15x. The family-based design represents a unique resource to assess the frequency of regional variants, accurately reconstruct haplotypes by family-based phasing, characterize short indels and complex structural variants, and establish the rate of de novo mutational events. GoNL will also serve as a reference panel for imputation in the available genome-wide association studies in Dutch and other cohorts to refine association signals and uncover population-specific variants. GoNL will create a catalog of human genetic variation in this sample that is uniquely characterized with respect to micro-geographic location and a wide range of phenotypes. The resource will be made available to the research and medical community to guide the interpretation of sequencing projects. The present paper summarizes the global characteristics of the project.

    Fitness Consequences of Advanced Ancestral Age over Three Generations in Humans

    A rapid rise in age at parenthood in contemporary societies has increased interest in reports of higher prevalence of de novo mutations and health problems in individuals with older fathers, but the fitness consequences of such age effects over several generations remain untested. Here, we use extensive pedigree data on seven pre-industrial Finnish populations to show how the ages of ancestors for up to three generations are associated with fitness traits. Individuals whose fathers, grandfathers and great-grandfathers fathered their lineage on average under age 30 were ~13% more likely to survive to adulthood than those whose ancestors fathered their lineage at over 40 years. In addition, females had a lower probability of marriage if their male ancestors were older. These findings are consistent with an increase of the number of accumulated de novo mutations with male age, suggesting that deleterious mutations acquired from recent ancestors may be a substantial burden to fitness in humans. However, possible non-mutational explanations for the observed associations are also discussed.