65 research outputs found

    Assessment of the influence of intrinsic environmental and geographical factors on the bacterial ecology of pit latrines

    Funding Information: This research received financial support from the Bill and Melinda Gates Foundation (grant number OPP52641). AWW and JP were supported by the Wellcome Trust [grant number 098051]. AWW and the Rowett Institute of Nutrition and Health, University of Aberdeen, receive core funding support from the Scottish Government Rural and Environmental Science and Analysis Service (RESAS). UZ is funded by a Natural Environment Research Council (NERC) Independent Research Fellowship (NE/L011956/1). CQ is funded through a Medical Research Council fellowship (MR/M50161X/1) as part of the MRC Cloud Infrastructure for Microbial Bioinformatics consortium (MR/L015080/1). Peer reviewed.

    Analysis of pit latrine microbiota reveals depth-related variation in composition, and key parameters and taxa associated with latrine fill-up rate

    Funding statement: This research received financial support from the Bill and Melinda Gates Foundation (grant number OPP52641 to the London School of Hygiene and Tropical Medicine). AWW and JP were supported by the Wellcome Trust [grant number 098051]. AWW and the Rowett Institute, University of Aberdeen, receive core funding support from the Scottish Government Rural and Environmental Science and Analysis Service (RESAS). UZI is funded by a NERC Independent Research Fellowship (NE/L011956/1) and further supported by EPSRC (EP/P029329/1 and EP/V030515/1). CQ is funded through an MRC fellowship (MR/M50161X/1) as part of the MRC Cloud Infrastructure for Microbial Bioinformatics consortium (MR/L015080/1). Acknowledgements: We would like to thank all the field and laboratory teams and all the pit latrine owners who participated in this study. We also thank Paul Scott, Richard Rance and members of the Wellcome Sanger Institute's sequencing team for generating 16S rRNA gene data. Peer reviewed.

    HMM-FRAME: accurate protein domain classification for metagenomic sequences containing frameshift errors

    Background: Protein domain classification is an important step in metagenomic annotation. The state-of-the-art method for protein domain classification is profile HMM-based alignment. However, the relatively high rates of insertions and deletions in homopolymer regions of pyrosequencing reads create frameshifts, causing conventional profile HMM alignment tools to generate alignments with marginal scores. This makes error-containing gene fragments unclassifiable with conventional tools. Thus, there is a need for an accurate domain classification tool that can detect and correct sequencing errors. Results: We introduce HMM-FRAME, a protein domain classification tool based on an augmented Viterbi algorithm that can incorporate error models from different sequencing platforms. HMM-FRAME corrects sequencing errors and classifies putative gene fragments into domain families. It achieved high error detection sensitivity and specificity in a data set with annotated errors. We applied HMM-FRAME in Targeted Metagenomics and a published metagenomic data set. The results showed that our tool can correct frameshifts in error-containing sequences, generate much longer alignments with significantly smaller E-values, and classify more sequences into their native families. Conclusions: HMM-FRAME provides a complementary protein domain classification tool to conventional profile HMM-based methods for data sets containing frameshifts. Its current implementation is best used for small-scale metagenomic data sets. The source code of HMM-FRAME can be downloaded at http://www.cse.msu.edu/~zhangy72/hmmframe/ and at https://sourceforge.net/projects/hmm-frame/.

    Hundreds of variants clustered in genomic loci and biological pathways affect human height

    Most common human traits and diseases have a polygenic pattern of inheritance: DNA sequence variants at many genetic loci influence the phenotype. Genome-wide association (GWA) studies have identified more than 600 variants associated with human traits, but these typically explain small fractions of phenotypic variation, raising questions about the use of further studies. Here, using 183,727 individuals, we show that hundreds of genetic variants, in at least 180 loci, influence adult height, a highly heritable and classic polygenic trait. The large number of loci reveals patterns with important implications for genetic studies of common human diseases and traits. First, the 180 loci are not random, but instead are enriched for genes that are connected in biological pathways (P = 0.016) and that underlie skeletal growth defects (P < 0.001). Second, the likely causal gene is often located near the most strongly associated variant: in 13 of 21 loci containing a known skeletal growth gene, that gene was closest to the associated variant. Third, at least 19 loci have multiple independently associated variants, suggesting that allelic heterogeneity is a frequent feature of polygenic traits, that comprehensive explorations of already-discovered loci should discover additional variants and that an appreciable fraction of associated loci may have been identified. Fourth, associated variants are enriched for likely functional effects on genes, being over-represented among variants that alter amino-acid structure of proteins and expression levels of nearby genes. Our data explain approximately 10% of the phenotypic variation in height, and we estimate that unidentified common variants of similar effect sizes would increase this figure to approximately 16% of phenotypic variation (approximately 20% of heritable variation). 
Although additional approaches are needed to dissect the genetic architecture of polygenic human traits fully, our findings indicate that GWA studies can identify large numbers of loci that implicate biologically relevant genes and pathways.

    Large-Scale Gene-Centric Meta-Analysis across 39 Studies Identifies Type 2 Diabetes Loci

    To identify genetic factors contributing to type 2 diabetes (T2D), we performed large-scale meta-analyses by using a custom ~50,000 SNP genotyping array (the ITMAT-Broad-CARe array) with ~2,000 candidate genes in 39 multiethnic population-based studies, case-control studies, and clinical trials totaling 17,418 cases and 70,298 controls. First, meta-analysis of 25 studies comprising 14,073 cases and 57,489 controls of European descent confirmed eight established T2D loci at genome-wide significance. In silico follow-up analysis of putative association signals found in independent genome-wide association studies (including 8,130 cases and 38,987 controls) performed by the DIAGRAM consortium identified a T2D locus at genome-wide significance (GATAD2A/CILP2/PBX4; p = 5.7 × 10^-9) and two loci exceeding study-wide significance (SREBF1 and TH/INS; p < 2.4 × 10^-6). Second, meta-analyses of 1,986 cases and 7,695 controls from eight African-American studies identified study-wide-significant (p = 2.4 × 10^-7) variants in HMGA2 and replicated variants in TCF7L2 (p = 5.1 × 10^-15). Third, conditional analysis revealed multiple known and novel independent signals within five T2D-associated genes in samples of European ancestry and within HMGA2 in African-American samples. Fourth, a multiethnic meta-analysis of all 39 studies identified T2D-associated variants in BCL2 (p = 2.1 × 10^-8). Finally, a composite genetic score of SNPs from new and established T2D signals was significantly associated with increased risk of diabetes in African-American, Hispanic, and Asian populations. In summary, large-scale meta-analysis involving a dense gene-centric approach has uncovered additional loci and variants that contribute to T2D risk and suggests substantial overlap of T2D association signals across multiple ethnic groups.

    A communal catalogue reveals Earth's multiscale microbial diversity

    Our growing awareness of the microbial world's importance and diversity contrasts starkly with our limited understanding of its fundamental structure. Despite recent advances in DNA sequencing, a lack of standardized protocols and common analytical frameworks impedes comparisons among studies, hindering the development of global inferences about microbial life on Earth. Here we present a meta-analysis of microbial community samples collected by hundreds of researchers for the Earth Microbiome Project. Coordinated protocols and new analytical methods, particularly the use of exact sequences instead of clustered operational taxonomic units, enable bacterial and archaeal ribosomal RNA gene sequences to be followed across multiple studies and allow us to explore patterns of diversity at an unprecedented scale. The result is both a reference database giving global context to DNA sequence data and a framework for incorporating data from future studies, fostering increasingly complete characterization of Earth's microbial diversity. Peer reviewed.

    Finishing the euchromatic sequence of the human genome

    The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ∼99% of the euchromatic genome and is accurate to an error rate of ∼1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead.

    Association Between Chromosome 9p21 Variants and the Ankle-Brachial Index Identified by a Meta-Analysis of 21 Genome-Wide Association Studies

    Genetic determinants of peripheral arterial disease (PAD) remain largely unknown. To identify genetic variants associated with the ankle-brachial index (ABI), a noninvasive measure of PAD, we conducted a meta-analysis of genome-wide association study data from 21 population-based cohorts