30 research outputs found

    Exome sequencing of case-unaffected-parents trios reveals recessive and de novo genetic variants in sporadic ALS

    The contribution of genetic variants to sporadic amyotrophic lateral sclerosis (ALS) remains largely unknown. Either recessive or de novo variants could result in an apparently sporadic occurrence of ALS. In an attempt to find such variants, we sequenced the exomes of 44 ALS-unaffected-parents trios. Rare and potentially damaging compound heterozygous variants were found in 27% of ALS patients, homozygous recessive variants in 14%, and coding de novo variants in 27%. In 20% of patients, more than one of the above variant types was present. Genes with recessive variants were enriched in nucleotide binding capacity, ATPase activity, and the dynein heavy chain. Genes with de novo variants were enriched in transcription regulation and cell cycle processes. This trio study indicates that rare private recessive variants could be a mechanism underlying some cases of sporadic ALS, and that de novo mutations are also likely to play a part in the disease.
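The three variant classes named in this abstract can be told apart from trio genotypes alone. The following is a minimal sketch of that logic, not the authors' pipeline: the function name, the dictionary keys, and the encoding of genotypes as alternate-allele counts (0, 1, or 2) are all assumptions made for illustration, and the input is assumed to be already filtered to rare, potentially damaging variants.

```python
from collections import defaultdict


def classify_trio_variants(variants):
    """Group a child's variants by inheritance pattern in a trio.

    Each variant is a dict with the hypothetical keys 'gene', 'child',
    'mother', and 'father'; genotypes are alternate-allele counts (0, 1, 2).
    """
    de_novo = []
    homozygous_recessive = []
    # Heterozygous variants per gene, split by the parent they came from.
    het_by_gene = defaultdict(lambda: {"maternal": [], "paternal": []})

    for v in variants:
        child, mother, father = v["child"], v["mother"], v["father"]
        if child >= 1 and mother == 0 and father == 0:
            # Present in the child, absent from both parents.
            de_novo.append(v)
        elif child == 2 and mother == 1 and father == 1:
            # Both unaffected parents are carriers; child has two copies.
            homozygous_recessive.append(v)
        elif child == 1 and mother == 1 and father == 0:
            het_by_gene[v["gene"]]["maternal"].append(v)
        elif child == 1 and mother == 0 and father == 1:
            het_by_gene[v["gene"]]["paternal"].append(v)
        # Other combinations (e.g. all three heterozygous) are skipped
        # because the parent of origin cannot be assigned.

    # Compound heterozygous: at least one maternal and one paternal
    # heterozygous variant in the same gene.
    compound_heterozygous = {
        gene: hits
        for gene, hits in het_by_gene.items()
        if hits["maternal"] and hits["paternal"]
    }
    return {
        "de_novo": de_novo,
        "homozygous_recessive": homozygous_recessive,
        "compound_heterozygous": compound_heterozygous,
    }
```

Requiring one maternally and one paternally inherited variant per gene is what separates a compound heterozygote from a child who merely carries two variants on the same parental haplotype.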

    SeqAnt: A web service to rapidly identify and annotate DNA sequence variations

    Background: The enormous throughput and low cost of second-generation sequencing platforms now allow research and clinical geneticists to routinely perform single experiments that identify tens of thousands to millions of variant sites. Existing methods to annotate variant sites using information from publicly available databases via web browsers are too slow to be useful for the large sequencing datasets being routinely generated by geneticists. Because sequence annotation of variant sites is required before functional characterization can proceed, the lack of a high-throughput pipeline to efficiently annotate variant sites can act as a significant bottleneck in genetics research.
    Results: SeqAnt (Sequence Annotator) is an open source web service and software package that rapidly annotates DNA sequence variants and identifies recessive or compound heterozygous loci in human, mouse, fly, and worm genome sequencing experiments. Variants are characterized with respect to their functional type, frequency, and evolutionary conservation. Annotated variants can be viewed in a web browser, downloaded as a tab-delimited text file, or directly uploaded in BED format to the UCSC genome browser. To demonstrate the speed of SeqAnt, we annotated a series of publicly available datasets that ranged in size from 37 to 3,439,107 variant sites. The total time to annotate these data completely ranged from 0.17 seconds to 28 minutes 49.8 seconds.
    Conclusion: SeqAnt is an open source web service and software package that overcomes a critical bottleneck facing research and clinical geneticists using second-generation sequencing platforms. SeqAnt will prove especially useful for those investigators who lack dedicated bioinformatics personnel or infrastructure in their laboratories.
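The BED export mentioned in this abstract involves a coordinate conversion that is easy to get wrong: variant positions are usually reported 1-based, while BED intervals use a 0-based start and an exclusive end. The sketch below illustrates only that conversion. It is not SeqAnt's code, and the function name and parameters are hypothetical.

```python
def variant_to_bed_line(chrom, pos, ref, name="."):
    """Convert a 1-based variant position to a 4-column BED line.

    chrom: chromosome name, e.g. "chr1"
    pos:   1-based position of the first reference base
    ref:   reference allele string (its length sets the interval size)
    name:  optional label for the BED name column
    """
    start = pos - 1          # BED start is 0-based
    end = start + len(ref)   # BED end is exclusive
    return f"{chrom}\t{start}\t{end}\t{name}"
```

For example, a single-base variant at 1-based position 100 on chr1 becomes the interval 99 to 100, and a three-base reference allele at the same position becomes 99 to 102.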

    Single haplotype assembly of the human genome from a hydatidiform mole

    A complete reference assembly is essential for accurately interpreting individual genomes and associating variation with phenotypes. While the current human reference genome sequence is of very high quality, gaps and misassemblies remain due to biological and technical complexities. Large repetitive sequences and complex allelic diversity are the two main drivers of assembly error. Although increasing the length of sequence reads and library fragments can improve assembly, even the longest available reads do not resolve all regions. In order to overcome the issue of allelic diversity, we used genomic DNA from an essentially haploid hydatidiform mole, CHM1. We utilized several resources from this DNA, including a set of end-sequenced and indexed BAC clones and 100× Illumina whole-genome shotgun (WGS) sequence coverage. We used the WGS sequence and the GRCh37 reference assembly to create an assembly of the CHM1 genome. We subsequently incorporated 382 finished BAC clone sequences to generate a draft assembly, CHM1_1.1 (NCBI AssemblyDB GCA_000306695.2). Analysis of gene, repetitive element, and segmental duplication content shows this assembly to be of excellent quality and contiguity. However, comparison to assembly-independent resources, such as BAC clone end sequences and PacBio long reads, indicates misassembled regions. Most of these regions are enriched for structural variation and segmental duplication, and can be resolved in the future. This publicly available assembly will be integrated into the Genome Reference Consortium curation framework for further improvement, with the ultimate goal being a completely finished gap-free assembly.

    CHM1 Single Haplotype Assembly Supplementary Material

    This folder contains all of the supplementary material for Meltz Steinberg et al., "Single haplotype assembly of the human genome from a hydatidiform mole".

    #BoG14 poster

    A single haplotype platinum genome assembly
    Karyn Meltz Steinberg (1), Tina Graves-Lindsay (1), Robert S. Fulton (1), Deanna M. Church (2), Valerie A. Schneider (3), Richa Agarwala (3), Sergey Shiryev (3), Aleksandr Morgulis (3), John Huddleston (4), Urvashi Surti (5), Wesley C. Warren (1), Evan E. Eichler (4), Richard K. Wilson (1)
    (1) The Genome Institute at Washington University, St. Louis, MO; (2) Personalis, Inc., Menlo Park, CA; (3) NCBI, Bethesda, MD; (4) Department of Genome Sciences, University of Washington, Seattle, WA; (5) University of Pittsburgh, Pittsburgh, PA
    An accurate and complete reference genome sequence is essential to interpret re-sequencing and genotyping results. Distinguishing allelic from paralogous sequences is a central challenge and regions enhanced for large repetitive sequences are often recalcitrant to traditional techniques for closure in the human reference sequence. The complex structural diversity of these regions complicates the ability to produce representative genomic sequence at these loci from a single, diploid donor. To overcome the issues associated with multiple alleles and achieve a single allelic representation of a human genome, we have leveraged genomic DNA and a BAC library from the essentially haploid hydatidiform mole, CHM1. We have generated ~100X whole genome shotgun sequence as Illumina paired-end data and ~500 BAC sequences from the CHM1 BAC library, CH17. We produced a reference-guided assembly based on alignments using SRPRISM. We subsequently incorporated finished BAC sequences to generate the complete assembly. The contig N50 is >140kbp, which is larger than any other human whole genome assembly submitted to GenBank. The assembly was annotated with genes, transcripts, and proteins and masked with RepeatMasker and segmental duplications.
    We assessed the quality of our assembly by mapping the Illumina reads and BAC end sequences to the assembly and identifying regions of SNV density above Illumina error rate as well as discordant BAC end mappings. Thirty-four regions totaling 49Mb have SNV density 2 standard deviations greater than the mean SNV density for the whole genome, and only 5% of BAC ends mapped discordantly. These SNV dense regions and BAC ends are significantly enriched for repetitive elements. Additionally, we utilized BioNano genomic maps to identify assembly issues, size gaps and confirm structural variants. BioNano aligned 98.2% of maps with a map N50 of ~1Mb resulting in 89.6% genome coverage that increased scaffold length. Finally, the recently released 54X PacBio long read data from CHM1 were used to identify misassemblies and resolve gaps to improve our assembly. Our ultimate goal is a single allelic representation of the human genome of the same quality as the human reference.
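The contig N50 and map N50 figures quoted in this abstract follow the standard definition: the length L such that contigs of length L or longer together cover at least half of the total assembled length. A minimal sketch of that calculation follows; the function name is hypothetical and the code is not taken from the poster.

```python
def contig_n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length of the contig at which the running total, taken
    from longest to shortest, first reaches half of the summed length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare doubled running total to avoid fractional halves.
        if 2 * running >= total:
            return length
```

For contigs of lengths 2 through 10 (total 54), the three longest contigs (10, 9, 8) sum to 27, exactly half the total, so the N50 is 8.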