23 research outputs found

    A family-based probabilistic method for capturing de novo mutations from high-throughput short-read sequencing data

    Get PDF
    Recent advances in high-throughput DNA sequencing technologies and associated statistical analyses have enabled in-depth analysis of whole-genome sequences. As this technology is applied to a growing number of individual human genomes, entire families are now being sequenced. Information contained within the pedigree of a sequenced family can be leveraged when inferring the donors' genotypes. The presence of a de novo mutation within the pedigree is indicated by a violation of Mendelian inheritance laws. Here, we present a method for probabilistically inferring genotypes across a pedigree using high-throughput sequencing data and producing the posterior probability of de novo mutation at each genomic site examined. This framework can be used to disentangle the effects of germline and somatic mutational processes and to simultaneously estimate the effect of sequencing error and the initial genetic variation in the population from which the founders of the pedigree arise. This approach is examined in detail through simulations and areas for method improvement are noted. By applying this method to data from members of a well-defined nuclear family with accurate pedigree information, the stage is set to make the most direct estimates of the human mutation rate to date

    The Next PAGE in Understanding Complex Traits: Design for the Analysis of Population Architecture Using Genetics and Epidemiology (PAGE) Study

    Get PDF
    Genetic studies have identified thousands of variants associated with complex traits. However, most association studies are limited to populations of European descent and a single phenotype. The Population Architecture using Genomics and Epidemiology (PAGE) Study was initiated in 2008 by the National Human Genome Research Institute to investigate the epidemiologic architecture of well-replicated genetic variants associated with complex diseases in several large, ethnically diverse population-based studies. Combining DNA samples and hundreds of phenotypes from multiple cohorts, PAGE is well-suited to address generalization of associations and variability of effects in diverse populations; identify genetic and environmental modifiers; evaluate disease subtypes, intermediate phenotypes, and biomarkers; and investigate associations with novel phenotypes. PAGE investigators harmonize phenotypes across studies where possible and perform coordinated cohort-specific analyses and meta-analyses. PAGE researchers are genotyping thousands of genetic variants in up to 121,000 DNA samples from African-American, white, Hispanic/Latino, Asian/Pacific Islander, and American Indian participants. Initial analyses will focus on single nucleotide polymorphisms (SNPs) associated with obesity, lipids, cardiovascular disease, type 2 diabetes, inflammation, various cancers, and related biomarkers. PAGE SNPs are also assessed for pleiotropy using the “phenome-wide association study” approach, testing each SNP for associations with hundreds of phenotypes. PAGE data will be deposited into the National Center for Biotechnology Information's Database of Genotypes and Phenotypes and made available via a custom browser

    Genome-wide Analyses Identify KIF5A as a Novel ALS Gene

    Get PDF
    To identify novel genes associated with ALS, we undertook two lines of investigation. We carried out a genome-wide association study comparing 20,806 ALS cases and 59,804 controls. Independently, we performed a rare variant burden analysis comparing 1,138 index familial ALS cases and 19,494 controls. Through both approaches, we identified kinesin family member 5A (KIF5A) as a novel gene associated with ALS. Interestingly, mutations predominantly in the N-terminal motor domain of KIF5A are causative for two neurodegenerative diseases: hereditary spastic paraplegia (SPG10) and Charcot-Marie-Tooth type 2 (CMT2). In contrast, ALS-associated mutations are primarily located at the C-terminal cargo-binding tail domain and patients harboring loss-of-function mutations displayed an extended survival relative to typical ALS cases. Taken together, these results broaden the phenotype spectrum resulting from mutations in KIF5A and strengthen the role of cytoskeletal defects in the pathogenesis of ALS.Peer reviewe

    A Family-Based Probabilistic Method for Capturing De Novo Mutations from High-Throughput Short-Read Sequencing Data

    No full text
    Recent advances in high-throughput DNA sequencing technologies and associated statistical analyses have enabled in-depth analysis of whole-genome sequences. As this technology is applied to a growing number of individual human genomes, entire families are now being sequenced. Information contained within the pedigree of a sequenced family can be leveraged when inferring the donors' genotypes. The presence of a de novo mutation within the pedigree is indicated by a violation of Mendelian inheritance laws. Here, we present a method for probabilistically inferring genotypes across a pedigree using high-throughput sequencing data and producing the posterior probability of de novo mutation at each genomic site examined. This framework can be used to disentangle the effects of germline and somatic mutational processes and to simultaneously estimate the effect of sequencing error and the initial genetic variation in the population from which the founders of the pedigree arise. This approach is examined in detail through simulations and areas for method improvement are noted. By applying this method to data from members of a well-defined nuclear family with accurate pedigree information, the stage is set to make the most direct estimates of the human mutation rate to date.

    <i>De novo</i> Sequencing, Characterization, and Comparison of Inflorescence Transcriptomes of <i>Cornus canadensis</i> and <i>C. florida</i> (Cornaceae)

    Get PDF
    <div><p>Background</p><p>Transcriptome sequencing analysis is a powerful tool in molecular genetics and evolutionary biology. Here we report the results of <i>de novo</i> 454 sequencing, characterization, and comparison of inflorescence transcriptomes of two closely related dogwood species, <i>Cornus canadensis</i> and <i>C. florida</i> (Cornaceae). Our goals were to build a preliminary source of genome sequence data, and to identify genes potentially expressed differentially between the inflorescence transcriptomes for these important horticultural species.</p><p>Results</p><p>The sequencing of cDNAs from inflorescence buds of <i>C. canadensis</i> (cc) and <i>C. florida</i> (cf), and normalized cDNAs from leaves of <i>C. canadensis</i> resulted in 251799 (ccBud), 96245 (ccLeaf) and 114648 (cfBud) raw reads, respectively. The <i>de novo</i> assembly of the high quality (HQ) reads resulted in 36088, 17802 and 21210 unigenes for ccBud, ccLeaf and cfBud. A reference transcriptome for <i>C. canadensis</i> was built by assembling HQ reads of ccBud and ccLeaf, containing 40884 unigenes. Reference mapping and comparative analyses found 10926 sequences were putatively specific to ccBud, and 6979 putatively specific to cfBud. Putative differentially expressed genes between ccBud and cfBud that are related to flower development and/or stress response were identified among 7718 shared sequences by ccBud and cfBud. Bi-directional BLAST found 87 (41.83% of 208) of <i>Arabidopsis</i> genes related to inflorescence development had putative orthologs in the dogwood transcriptomes. Comparisons of the shared sequences by ccBud and cfBud yielded 65931 high quality SNPs between two species. The twenty unigenes with the most SNPs are listed as potential genetic markers for evolutionary studies.</p><p>Conclusions</p><p>The data provide an important, although preliminary, information platform for functional genomics and evolutionary developmental biology in <i>Cornus</i>. The study identified putative candidates potentially involved in the genetic regulation of inflorescence evolution and/or disease resistance in dogwoods for future analyses. <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0082674#s2" target="_blank">Results</a> of the study also provide markers useful for dogwood phylogenomic studies.</p></div

    Assembly characteristics of <i>Cornus canadensis</i> inflorescence transcriptome (ccBud) and <i>C. florida</i> inflorescence transcriptome (cfBud).

    No full text
    <p>(A–D) ccBud and (E–H) cfBud. (A,E) Length frequency distribution of assembled contigs. (B,F) Length frequency distribution of singletons. (C,G) Average coverage frequency distribution of assembled contigs. (D,H) Length frequency distribution of unigenes.</p

    Assembly characteristics of <i>Cornus canadensis</i> transcriptome.

    No full text
    <p>(A) Length frequency distribution of assembled contigs. (B) Length frequency distribution of singletons. (C) Average coverage frequency distribution of assembled contigs. (D) Length frequency distribution of unigenes.</p
    corecore