655 research outputs found

The distribution and mutagenesis of short coding INDELs from 1,128 whole exomes

Author: Antunes Lilian
Banks Eric
Challis Danny
Evani Uday S
Garrison Erik
Gibbs Richard A
Marth Gabor
Muzny Donna
Poplin Ryan
Yu Fuli
Publication venue: Digital Commons@Becker
Publication date: 01/01/2015

BACKGROUND: Identifying insertion/deletion polymorphisms (INDELs) with high confidence has been intrinsically challenging in short-read sequencing data. Here we report our approach for improving INDEL calling accuracy by using a machine learning algorithm to combine call sets generated with three independent methods, and by leveraging the strengths of each individual pipeline. Utilizing this approach, we generated a consensus exome INDEL call set from a large dataset generated by the 1000 Genomes Project (1000G), maximizing both the sensitivity and the specificity of the calls. RESULTS: This consensus exome INDEL call set features 7,210 INDELs, from 1,128 individuals across 13 populations included in the 1000 Genomes Phase 1 dataset, with a false discovery rate (FDR) of about 7.0%. CONCLUSIONS: In our study we further characterize the patterns and distributions of these exonic INDELs with respect to density, allele length, and site frequency spectrum, as well as the potential mutagenic mechanisms of coding INDELs in humans. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-015-1333-7) contains supplementary material, which is available to authorized users

Crossref

Springer - Publisher Connector

Digital Commons@Becker

PubMed Central

Whole genome sequence of Treponema pallidum ssp. pallidum, strain Mexico A, suggests recombination between yaws and syphilis strains

Author: Cejkova Darina
Chen Lei
Mikalova Lenka
Muzny Donna M
Petrosova Helena
Pospisilova Petra
Qin Xiang
Smajs David
Strouhal Michal
Weinstock George M
Zobanikova Maria
Publication venue: Digital Commons@Becker
Publication date: 01/01/2012

Treponema pallidum ssp. pallidum (TPA), the causative agent of syphilis, and Treponema pallidum ssp. pertenue (TPE), the causative agent of yaws, are closely related spirochetes causing diseases with distinct clinical manifestations. The TPA Mexico A strain was isolated in 1953 from a male with primary syphilis living in Mexico. Attempts to cultivate the TPA Mexico A strain under in vitro conditions have revealed lower growth potential compared to other tested TPA strains. The complete genome sequence of the TPA Mexico A strain was determined using the Illumina sequencing technique. The genome sequence assembly was verified using the whole genome fingerprinting technique and the final sequence was annotated. The genome size of the Mexico A strain was determined to be 1,140,038 bp with 1,035 predicted ORFs. The Mexico A genome sequence was compared to the whole genome sequences of three TPA (Nichols, SS14 and Chicago) and three TPE (CDC-2, Samoa D and Gauthier) strains. No large rearrangements in the Mexico A genome were found and the identified nucleotide changes occurred most frequently in genes encoding putative virulence factors. Nevertheless, the genome of the Mexico A strain revealed two genes (TPAMA_0326 (tp92) and TPAMA_0488 (mcp2-1)) which combine TPA- and TPE-specific nucleotide sequences. Both genes were found to be under positive selection within TPA strains and also between TPA and TPE strains. The observed mosaic character of the TPAMA_0326 and TPAMA_0488 loci is likely a result of inter-strain recombination between TPA and TPE strains during simultaneous infection of a single host, suggesting horizontal gene transfer between treponemal subspecies.

Crossref

Directory of Open Access Journals

Digital Commons@Becker

PubMed Central

The Francis Crick Institute

Loss of the Polyketide Synthase StlB Results in Stalk Cell Overproduction in Polysphondylium violaceum

Author: Gibbs Richard A.
Kawabe Yoshinori
Kin Koryu
Kuspa Adam
Muzny Donna
Narita Takaaki B.
Richards Stephen
Schaap Pauline
Strassmann Joan E.
Sucgang Richard
Worley Kim C.
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/05/2020

University of Dundee Online Publications

Associations of NINJ2 sequence variants with incident ischemic stroke in the Cohorts for Heart and Aging in Genomic Epidemiology (CHARGE) consortium

Author: Bis Joshua C.
Boerwinkle Eric
Brody Jennifer A
Butler Kenneth R.
Choi Seung Hoan
Debette Stéphanie
DeStefano Anita
Fornage Myriam
Gibbs Richard A.
Gottesman Rebecca F.
Gupta Mayetri
Hofman Albert
Ikram M. Arfan
Kovar Christie L.
Liu Xiaoming
Longstreth W. T.
Lumley Thomas
Mosley Thomas H.
Muzny Donna
Psaty Bruce M.
Seshadri Sudha
Shahar Eyal
van Duijn Cornelia
Verhaaren Benjamin F. J.
Wolf Philip A.
Publication venue: Public Library of Science
Publication date: 01/01/2014

Background: Stroke, the leading neurologic cause of death and disability, has a substantial genetic component. We previously conducted a genome-wide association study (GWAS) in four prospective studies from the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) consortium and demonstrated that sequence variants near the NINJ2 gene are associated with incident ischemic stroke. Here, we sought to fine-map functional variants in the region and evaluate the contribution of rare variants to ischemic stroke risk. Methods and Results: We sequenced 196 kb around NINJ2 on chromosome 12p13 among 3,986 European ancestry participants, including 475 ischemic stroke cases, from the Atherosclerosis Risk in Communities Study, Cardiovascular Health Study, and Framingham Heart Study. Meta-analyses of single-variant tests for 425 common variants (minor allele frequency [MAF] ≥ 1%) confirmed the original GWAS results and identified an independent intronic variant, rs34166160 (MAF = 0.012), most significantly associated with incident ischemic stroke (HR = 1.80, p = 0.0003). Aggregating 278 putatively functional variants with MAF ≤ 1% using count statistics, we observed a nominally statistically significant association, with the burden of rare NINJ2 variants contributing to decreased ischemic stroke incidence (HR = 0.81; p = 0.026). Conclusion: Common and rare variants in the NINJ2 region were nominally associated with incident ischemic stroke among a subset of CHARGE participants. Allelic heterogeneity at this locus, caused by multiple rare, low frequency, and common variants with disparate effects on risk, may explain the difficulties in replicating the original GWAS results. Additional studies that take into account the complex allelic architecture at this locus are needed to confirm these findings.

Crossref

Directory of Open Access Journals

PubMed Central

Enlighten

Whole exome capture in solution with 3 Gbp of data

Author: Albert Thomas J
Bainbridge Matthew N
Burgess Daniel L
D'Ascenzo Mark
Gibbs Richard A
Jeddeloh Jeffrey A
Kitzman Jacob
Kovar Christie
Muzny Donna
Newsham Irene
Richmond Todd A
Rodesch Matthew J
Wang Min
Wu Yuan-Qing
Publication venue: BioMed Central
Publication date: 01/01/2010

We have developed a solution-based method for targeted DNA capture-sequencing that is directed to the complete human exome. This approach allows the discovery of greater than 95% of all expected heterozygous single-base variants, requires as little as 3 Gbp of raw sequence data, and constitutes an effective tool for identifying rare coding alleles in large-scale genomic studies.

Crossref

PubMed Central

The gut mycobiome of the Human Microbiome Project healthy cohort

Author: Ajami Nadim J.
Auchtung Thomas A.
Gesell Jonathan R.
Gibbs Richard A.
Metcalf Ginger A.
Muzny Donna M.
Nash Andrea K.
Petrosino Joseph F.
Ross Matthew C.
Smith Daniel P.
Stewart Christopher J.
Wong Matthew C.
Publication venue: DigitalCommons@University of Nebraska - Lincoln
Publication date: 01/01/2017

Background: Most studies describing the human gut microbiome in healthy and diseased states have emphasized the bacterial component, but the fungal microbiome (i.e., the mycobiome) is beginning to gain recognition as a fundamental part of our microbiome. To date, human gut mycobiome studies have primarily been disease centric or in small cohorts of healthy individuals. To contribute to existing knowledge of the human mycobiome, we investigated the gut mycobiome of the Human Microbiome Project (HMP) cohort by sequencing the Internal Transcribed Spacer 2 (ITS2) region as well as the 18S rRNA gene. Results: Three hundred seventeen HMP stool samples were analyzed by ITS2 sequencing. Fecal fungal diversity was significantly lower in comparison to bacterial diversity. Yeast dominated the samples, comprising eight of the top 15 most abundant genera. Specifically, fungal communities were characterized by a high prevalence of Saccharomyces, Malassezia, and Candida, with S. cerevisiae, M. restricta, and C. albicans operational taxonomic units (OTUs) present in 96.8, 88.3, and 80.8% of samples, respectively. There was a high degree of inter- and intra-volunteer variability in fungal communities. However, S. cerevisiae, M. restricta, and C. albicans OTUs were found in 92.2, 78.3, and 63.6% of volunteers, respectively, in all samples donated over an approximately 1-year period. Metagenomic and 18S rRNA gene sequencing data agreed with ITS2 results; however, ITS2 sequencing provided greater resolution of the relatively low abundance mycobiome constituents. Conclusions: Compared to bacterial communities, the human gut mycobiome is low in diversity and dominated by yeast including Saccharomyces, Malassezia, and Candida. Both inter- and intra-volunteer variability in the HMP cohort were high, revealing that unlike bacterial communities, an individual’s mycobiome is no more similar to itself over time than to another person’s. Nonetheless, several fungal species persisted across a majority of samples, evidence that a core gut mycobiome may exist. ITS2 sequencing data provided greater resolution of the mycobiome membership compared to metagenomic and 18S rRNA gene sequencing data, suggesting that it is a more sensitive method for studying the mycobiome of stool samples.

Characterization of single-nucleotide variation in Indian-origin rhesus macaques (Macaca mulatta)

Author: Chen David
Deiros David Rio
Fawcett Gloria L
Gibbs Richard
Harris Ronald Alan
Kalin Ned H
Milosavljevic Aleksandar
Muzny Donna M
Raveendran Muthuswamy
Reid Jeffrey G
Ren Yanru
Rogers Jeffrey
Shelton Steven E
Wheeler David A
Worley Kimberly C
Yu Fuli
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Rhesus macaques are the most widely utilized nonhuman primate model in biomedical research. Previous efforts have validated fewer than 900 single nucleotide polymorphisms (SNPs) in this species, which limits opportunities for genetic studies related to health and disease. Extensive information about SNPs and other genetic variation in rhesus macaques would facilitate valuable genetic analyses, as well as provide markers for genome-wide linkage analysis and the genetic management of captive breeding colonies. Results: We used the available rhesus macaque draft genome sequence, new sequence data from unrelated individuals and existing published sequence data to create a genome-wide SNP resource for Indian-origin rhesus monkeys. The original reference animal and two additional Indian-origin individuals were resequenced to low coverage using SOLiD™ sequencing. We then used three strategies to validate SNPs: comparison of potential SNPs found in the same individual using two different sequencing chemistries, and comparison of potential SNPs in different individuals identified with either the same or different sequencing chemistries. Our approach validated approximately 3 million SNPs distributed across the genome. Preliminary analysis of SNP annotations suggests that a substantial number of these macaque SNPs may have functional effects. More than 700 non-synonymous SNPs were scored by Polyphen-2 as either possibly or probably damaging to protein function and these variants now constitute potential models for studying functional genetic variation relevant to human physiology and disease. Conclusions: Resequencing of a small number of animals identified greater than 3 million SNPs. This provides a significant new information resource for rhesus macaques, an important research animal. The data also suggest that overall genetic variation is high in this species. We identified many potentially damaging non-synonymous coding SNPs, providing new opportunities to identify rhesus models for human disease.

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central