168 research outputs found

    Automated group assignment in large phylogenetic trees using GRUNT: GRouping, Ungrouping, Naming Tool

    Background: Accurate taxonomy is best maintained if species are arranged as hierarchical groups in phylogenetic trees. This is especially important as trees grow larger as a consequence of a rapidly expanding sequence database. Hierarchical group names are typically manually assigned in trees, an approach that becomes unfeasible for very large topologies. Results: We have developed an automated iterative procedure for delineating stable (monophyletic) hierarchical groups to large (or small) trees and naming those groups according to a set of sequentially applied rules. In addition, we have created an associated ungrouping tool for removing existing groups that do not meet user-defined criteria (such as monophyly). The procedure is implemented in a program called GRUNT (GRouping, Ungrouping, Naming Tool) and has been applied to the current release of the Greengenes (Hugenholtz) 16S rRNA gene taxonomy comprising more than 130,000 taxa. Conclusion: GRUNT will facilitate researchers requiring comprehensive hierarchical grouping of large tree topologies in, for example, database curation, microarray design and pangenome assignments. The application is available at the greengenes website [1].
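
    The monophyly criterion mentioned in this abstract can be illustrated with a minimal sketch. This is not GRUNT's implementation: the nested-tuple tree encoding and the function names `leaves` and `is_monophyletic` are assumptions made only for illustration. A group is treated as monophyletic when some clade of a rooted tree contains exactly the taxa in the group.

```python
def leaves(node):
    """Return the set of leaf names under a node.

    Assumed encoding (hypothetical): a leaf is a str, an internal node is a
    tuple of child nodes.
    """
    if isinstance(node, str):
        return {node}
    result = set()
    for child in node:
        result |= leaves(child)
    return result


def is_monophyletic(tree, group):
    """True if some clade of the rooted tree contains exactly the taxa in group."""
    group = set(group)
    if leaves(tree) == group:
        return True
    if isinstance(tree, str):
        return False
    # Recurse into each child clade; simple but not optimized for large trees.
    return any(is_monophyletic(child, group) for child in tree)


# Toy rooted tree with five made-up taxa.
tree = ((("A", "B"), "C"), ("D", "E"))
print(is_monophyletic(tree, {"A", "B"}))  # A and B form a clade
print(is_monophyletic(tree, {"A", "C"}))  # A and C do not form a clade without B
```

    The recursion recomputes leaf sets at every level, which is acceptable for a toy tree; a tree with more than 100,000 taxa would need a single bottom-up pass instead.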

    An improved Greengenes taxonomy with explicit ranks for ecological and evolutionary analyses of bacteria and archaea

    Reference phylogenies are crucial for providing a taxonomic framework for interpretation of marker gene and metagenomic surveys, which continue to reveal novel species at a remarkable rate. Greengenes is a dedicated full-length 16S rRNA gene database that provides users with a curated taxonomy based on de novo tree inference. We developed a ‘taxonomy to tree’ approach for transferring group names from an existing taxonomy to a tree topology, and used it to apply the Greengenes, National Center for Biotechnology Information (NCBI) and cyanoDB (Cyanobacteria only) taxonomies to a de novo tree comprising 408 315 sequences. We also incorporated explicit rank information provided by the NCBI taxonomy to group names (by prefixing rank designations) for better user orientation and classification consistency. The resulting merged taxonomy improved the classification of 75% of the sequences by one or more ranks relative to the original NCBI taxonomy with the most pronounced improvements occurring in under-classified environmental sequences. We also assessed candidate phyla (divisions) currently defined by NCBI and present recommendations for consolidation of 34 redundantly named groups. All intermediate results from the pipeline, which includes tree inference, jackknifing and transfer of a donor taxonomy to a recipient tree (tax2tree) are available for download. The improved Greengenes taxonomy should provide important infrastructure for a wide range of megasequencing projects studying ecosystems on scales ranging from our own bodies (the Human Microbiome Project) to the entire planet (the Earth Microbiome Project). The implementation of the software can be obtained from http://sourceforge.net/projects/tax2tree/

    Simrank: Rapid and sensitive general-purpose k-mer search tool

    Terabyte-scale collections of string-encoded data are expected from consortia efforts such as the Human Microbiome Project (http://nihroadmap.nih.gov/hmp). Intra- and inter-project data similarity searches are enabled by rapid k-mer matching strategies. Software applications for sequence database partitioning, guide tree estimation, molecular classification and alignment acceleration have benefited from embedded k-mer searches as sub-routines. However, a rapid, general-purpose, open-source, flexible, stand-alone k-mer tool has not been available. Here we present a stand-alone utility, Simrank, which allows users to rapidly identify database strings the most similar to query strings. Performance testing of Simrank and related tools against DNA, RNA, protein and human-languages found Simrank 10X to 928X faster depending on the dataset. Simrank provides molecular ecologists with a high-throughput, open source choice for comparing large sequence sets to find similarity
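
    The general idea of ranking database strings by k-mer matches to a query can be sketched as follows. This is not Simrank's algorithm or scoring scheme, which the abstract does not describe; the function names, the default `k`, and the score (fraction of the query's distinct k-mers found in each database string) are assumptions chosen for illustration.

```python
def kmers(s, k):
    """Return the set of distinct substrings of length k in s."""
    return {s[i:i + k] for i in range(len(s) - k + 1)}


def rank_by_shared_kmers(query, database, k=7, top=3):
    """Rank database entries by the fraction of query k-mers they contain.

    database maps a name to a string. Ties are broken by name so the
    ordering is deterministic.
    """
    q = kmers(query, k)
    if not q:
        return []  # query shorter than k: nothing to compare
    scored = []
    for name, seq in database.items():
        shared = len(q & kmers(seq, k))
        scored.append((name, shared / len(q)))
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top]


# Toy database of made-up strings.
db = {"a": "ACGTACGTAC", "b": "TTTTTTTTTT", "c": "ACGTTTTTTT"}
for name, score in rank_by_shared_kmers("ACGTACGT", db, k=4):
    print(name, score)
```

    Because the comparison works on plain strings, the same sketch applies to DNA, protein, or natural-language text, which mirrors the general-purpose use described in the abstract.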

    Three-Dimensional Reconstructions of Tadpole Chondrocrania from Histological Sections

    Reconstructing three dimensional structures (3DR) from histological sections has always been difficult but is becoming more accessible with the assistance of digital imaging. We sought to assemble a low cost system using readily available hardware and software to generate 3DR for a study of tadpole chondrocrania. We found that a combination of RGB camera, stereomicroscope, and Apple Macintosh PowerPC computers running NIH Image, Object Image, Rotater, and SURFdriver software provided acceptable reconstructions. These are limited in quality primarily by the distortions arising from histological protocols rather than hardware or software

    Expansion of Urease- and Uricase-Containing, Indole- and p-Cresol-Forming and Contraction of Short-Chain Fatty Acid-Producing Intestinal Microbiota in ESRD

    BACKGROUND: Intestinal microbiome constitutes a symbiotic ecosystem that is essential for health, and changes in its composition/function cause various illnesses. Biochemical milieu shapes the structure and function of the microbiome. Recently we found marked differences in the abundance of numerous bacterial taxa between ESRD and healthy individuals. Influx of urea and uric acid and dietary restriction of fruits and vegetables to prevent hyperkalemia alter ESRD patients’ intestinal milieu. We hypothesized that relative abundances of bacteria possessing urease, uricase, and p-cresol- and indole-producing enzymes are increased, while abundance of bacteria containing enzymes converting dietary fiber to short chain fatty acids (SCFA) is reduced in ESRD. METHODS: Reference sets of bacteria containing genes of interest were compiled to family, and sets of intestinal bacterial families showing differential abundances between 12 healthy and 24 ESRD individuals enrolled in our original study were compiled. Overlap between sets was assessed using hypergeometric distribution tests. RESULTS: Among 19 microbial families that were dominant in ESRD patients, 12 possessed urease, 5 possessed uricase, and 4 possessed indole and p-cresol forming enzymes. Among 4 microbial families that were diminished in ESRD patients, 2 possessed butyrate-forming enzymes. Probabilities of these overlapping distributions were <0.05. CONCLUSIONS: ESRD patients exhibited significant expansion of bacterial families possessing urease, uricase, and indole and p-cresol forming enzymes, and contraction of families possessing butyrate-forming enzymes. Given the deleterious effects of indoxyl sulfate, p-cresol sulfate, and urea-derived ammonia, and beneficial actions of SCFA, these changes in intestinal microbial metabolism contribute to uremic toxicity and inflammation
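
    A hypergeometric overlap test of the kind named in the METHODS can be sketched with the standard library alone. The function name `hypergeom_sf` is an assumption, and the abstract does not report the total number of families considered or the size of each reference set, so the values of `N` and `K` in the usage lines are hypothetical placeholders; only 19 and 12 come from the abstract.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """Return P(X >= k) for a hypergeometric variable X.

    N: total number of items (e.g. all families considered)
    K: items carrying the trait (e.g. families in a reference set)
    n: items drawn (e.g. families enriched in patients)
    k: observed overlap between the two sets
    """
    upper = min(K, n)
    total = comb(N, n)
    # comb() returns 0 for impossible draws, so no lower bound check is needed.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# n=19 and k=12 are taken from the abstract; N=100 and K=30 are hypothetical.
p = hypergeom_sf(k=12, N=100, K=30, n=19)
print(f"P(overlap >= 12) = {p:.4g}")
```

    The one-sided upper tail is used here because the question is whether the overlap is larger than chance would give; the abstract does not state which tail the authors used.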

    Three-Dimensional Reconstructions of Tadpole Chondrocrania from Histological Sections

    Reconstructing three dimensional structures (3DR) from histological sections has always been difficult but is becoming more accessible with the assistance of digital imaging. We sought to assemble a low cost system using readily available hardware and software to generate 3DR for a study of tadpole chondrocrania. We found that a combination of RGB camera, stereomicroscope, and Apple Macintosh PowerPC computers running NIH Image, Object Image, Rotater, and SURFdriver software provided acceptable reconstructions. These are limited in quality primarily by the distortions arising from histological protocols rather than hardware or software

    Foregut microbiome in development of esophageal adenocarcinoma

    Esophageal adenocarcinoma (EA), the type of cancer linked to heartburn due to gastroesophageal reflux diseases (GERD), has increased six fold in the past 30 years. This cannot currently be explained by the usual environmental or by host genetic factors. EA is the end result of a sequence of GERD-related diseases, preceded by reflux esophagitis (RE) and Barrett’s esophagus (BE). Preliminary studies by Pei and colleagues at NYU on elderly male veterans identified two types of microbiotas in the esophagus. Patients who carry the type II microbiota are >15 fold more likely to have esophagitis and BE than those harboring the type I microbiota. In a small scale study, we also found that 3 of 3 cases of EA harbored the type II biota. The findings have opened a new approach to understanding the recent surge in the incidence of EA.
    Our long-term goal is to identify the cause of GERD sequence. The hypothesis to be tested is that changes in the foregut microbiome are associated with EA and its precursors, RE and BE in GERD sequence. We will conduct a case control study to demonstrate the microbiome disease association in every stage of GERD sequence, as well as analyze the trend in changes in the microbiome along disease progression toward EA, by two specific aims. Aim 1 is to conduct a comprehensive population survey of the foregut microbiome and demonstrate its association with GERD sequence. Furthermore, spatial relationship between the esophageal microbiota and upstream (mouth) and downstream (stomach) foregut microbiotas as well as temporal stability of the microbiome-disease association will also be examined. Aim 2 is to define the distal esophageal metagenome and demonstrate its association with GERD sequence. Detailed analyses will include pathway-disease and gene-disease associations. Archaea, fungi and viruses, if identified, also will be correlated with the diseases.
    A significant association between the foregut microbiome and GERD sequence, if demonstrated, will be the first step for eventually testing whether an abnormal microbiome is required for the development of the sequence of phenotypic changes toward EA. If EA and its precursors represent a microecological disease, treating the cause of GERD might become possible, for example, by normalizing the microbiota through use of antibiotics, probiotics, or prebiotics. Causative therapy of GERD could prevent its progression and reverse the current trend of increasing incidence of EA

    Piphillin predicts metagenomic composition and dynamics from DADA2-corrected 16S rDNA sequences

    Get PDF
    Shotgun metagenomic sequencing reveals the potential in microbial communities. However, lower-cost 16S ribosomal RNA (rRNA) gene sequencing provides taxonomic, not functional, observations. To remedy this, we previously introduced Piphillin, a software package that predicts functional metagenomic content based on the frequency of detected 16S rRNA gene sequences corresponding to genomes in regularly updated, functionally annotated genome databases. Piphillin (and similar tools) have previously been evaluated on 16S rRNA data processed by the clustering of sequences into operational taxonomic units (OTUs). New techniques such as amplicon sequence variant error correction are in increased use, but it is unknown if these techniques perform better in metagenomic content prediction pipelines, or if they should be treated the same as OTU data in respect to optimal pipeline parameters