15 research outputs found

    Ghost-tree: creating hybrid-gene phylogenetic trees for diversity analyses.

    Background: Fungi play critical roles in many ecosystems, cause serious diseases in plants and animals, and threaten both human health and the structural integrity of built environments. While most fungal diversity remains unknown, the development of PCR primers for the internal transcribed spacer (ITS) combined with next-generation sequencing has substantially improved our ability to profile fungal microbial diversity. Although the high sequence variability in the ITS region facilitates more accurate species identification, it also makes multiple sequence alignment and phylogenetic analysis unreliable across evolutionarily distant fungi because the sequences are hard to align accurately. To address this issue, we created ghost-tree, a bioinformatics tool that integrates sequence data from two genetic markers into a single phylogenetic tree that can be used for diversity analyses. Our approach starts with a "foundation" phylogeny based on one genetic marker whose sequences can be aligned across organisms spanning divergent taxonomic groups (e.g., fungal families). Then, "extension" phylogenies are built for more closely related organisms (e.g., fungal species or strains) using a second, more rapidly evolving genetic marker. These smaller phylogenies are then grafted onto the foundation tree by mapping taxonomic names, such that each corresponding foundation-tree tip branches into its new "extension tree" child.
    Results: We applied ghost-tree to graft fungal extension phylogenies derived from ITS sequences onto a foundation phylogeny derived from fungal 18S sequences. Our analysis of simulated and real fungal ITS data sets found that phylogenetic distances between fungal communities computed using ghost-tree phylogenies explained significantly more variance than non-phylogenetic distances. The phylogenetic metrics also improved our ability to distinguish small differences (effect sizes) between microbial communities, though results were similar to non-phylogenetic methods for larger effect sizes.
    Conclusions: The Silva/UNITE-based ghost tree presented here can be easily integrated into existing fungal analysis pipelines to enhance the resolution of fungal community differences and improve understanding of these communities in built environments. The ghost-tree software package can also be used to develop phylogenetic trees for other marker gene sets that afford different taxonomic resolution, or for bridging genome trees with amplicon trees.
    Availability: ghost-tree is pip-installable. All source code, documentation, and test code are available under the BSD license at https://github.com/JTFouquier/ghost-tree
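
The grafting step described in the Background can be illustrated with a small sketch. This is not the ghost-tree package's API: the `Node` class, the `graft` function, and the taxon and sequence names below are all hypothetical, chosen only to show how an extension tree might replace a foundation tip that carries the same taxonomic name.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """Minimal rooted-tree node (hypothetical type, not from ghost-tree)."""
    name: Optional[str] = None
    length: float = 0.0
    children: List["Node"] = field(default_factory=list)

    def tips(self) -> List["Node"]:
        """Return the leaf nodes below (or at) this node, left to right."""
        if not self.children:
            return [self]
        return [tip for child in self.children for tip in child.tips()]


def graft(foundation: Node, extensions: Dict[str, Node]) -> Node:
    """Graft extension trees onto foundation tips with matching names.

    Assumption: tips and extensions are matched by exact taxon name.
    The foundation tree is modified in place and returned.
    """
    # tips() builds its list up front, so editing children here is safe.
    for tip in foundation.tips():
        extension = extensions.get(tip.name)
        if extension is not None:
            # The former tip becomes an internal node; a single-tip
            # extension is attached as its only child.
            tip.children = extension.children or [extension]
    return foundation


# Foundation tree from a slowly evolving marker (e.g., 18S), genus-level tips.
foundation = Node(children=[Node("Aspergillus", 0.3), Node("Candida", 0.5)])

# Extension tree from a faster marker (e.g., ITS) for one genus only.
extensions = {
    "Aspergillus": Node(children=[Node("ITS_seq_1", 0.01),
                                  Node("ITS_seq_2", 0.02)]),
}

hybrid = graft(foundation, extensions)
print([tip.name for tip in hybrid.tips()])
```

In this sketch a foundation tip with no matching extension (here "Candida") is simply left as a tip; how the real tool treats such cases is not stated in the abstract.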

    Geography and Location Are the Primary Drivers of Office Microbiome Composition.

    In the United States, humans spend the majority of their time indoors, where they are exposed to the microbiome of the built environment (BE) they inhabit. Despite the ubiquity of microbes in BEs and their potential impacts on health and building materials, basic questions about the microbiology of these environments remain unanswered. We present a study on the impacts of geography, material type, human interaction, location in a room, seasonal variation, and indoor and microenvironmental parameters on bacterial communities in offices. Our data elucidate several important features of microbial communities in BEs. First, under normal office environmental conditions, bacterial communities do not differ on the basis of surface material (e.g., ceiling tile or carpet) but do differ on the basis of the location in a room (e.g., ceiling or floor), two features that are often conflated but that we are able to separate here. We suspect that previous work showing differences in bacterial composition with surface material was likely detecting differences based on different usage patterns. Next, we find that offices have city-specific bacterial communities, such that we can accurately predict which city an office microbiome sample is derived from, but office-specific bacterial communities are less apparent. This differs from previous work, which has suggested office-specific compositions of bacterial communities. We again suspect that the difference from prior work arises from different usage patterns. As has been previously shown, we observe that human skin contributes heavily to the composition of BE surfaces.
    Importance: Our study highlights several points that should impact the design of future studies of the microbiology of BEs. First, projects tracking changes in BE bacterial communities should focus sampling efforts on surveying different locations in offices and in different cities but not necessarily different materials or different offices in the same city. Next, disturbance due to repeated sampling, though detectable, is small compared to that due to other variables, opening up a range of longitudinal study designs in the BE. Next, studies requiring more samples than can be sequenced on a single sequencing run (which is increasingly common) must control for run effects by including some of the same samples in all of the sequencing runs as technical replicates. Finally, detailed tracking of indoor and material environment covariates is likely not essential for BE microbiome studies, as the normal range of indoor environmental conditions is likely not large enough to impact bacterial communities.

    Ecological succession and viability of human-associated microbiota on restroom surfaces

    Author Posting. © The Author(s), 2014. This is the author's version of the work. It is posted here by permission of American Society for Microbiology for personal use, not for redistribution. The definitive version was published in Applied and Environmental Microbiology (2014), doi:10.1128/AEM.03117-14.
    Human-associated bacteria dominate the built environment (BE). Following decontamination of floors, toilet seats, and soap dispensers in 4 public restrooms, in situ bacterial communities were characterized hourly, daily, and weekly to determine their successional ecology. The viability of cultivable bacteria, following the removal of dispersal agents (humans), was also assessed hourly. A late successional community developed within 5-8 hours on restroom floors, and showed remarkable stability over weeks to months. Despite late successional dominance by skin- and outdoor-associated bacteria, the most ubiquitous organisms were predominantly gut-associated taxa, which persisted following exclusion of humans. Staphylococcus represented the majority of the cultivable community, even after several hours of human exclusion. MRSA-associated virulence genes were found on floors, but were not present in assembled Staphylococcus pan-genomes. Viral abundances, which were predominantly enterophage, human papilloma and herpes viruses, were significantly correlated with bacteria abundances, and showed an unexpectedly low virus-to-bacteria ratio in surface-associated samples, suggesting that bacterial hosts are mostly dormant on BE surfaces.
    S.M.G. was supported by an EPA STAR Graduate Fellowship and the National Institutes of Health Training Grant 5T-32EB-009412. We acknowledge funding from the Alfred P Sloan Foundation’s Microbiology of the Built Environment Program.
    2015-05-1

    Citizen Science for Mining the Biomedical Literature

    Biomedical literature represents one of the largest and fastest growing collections of unstructured biomedical knowledge. Finding critical information buried in the literature can be challenging. To extract information from free-flowing text, researchers need to (1) identify the entities in the text (named entity recognition), (2) apply a standardized vocabulary to these entities (normalization), and (3) identify how entities in the text are related to one another (relationship extraction). Researchers have primarily approached these information extraction tasks through manual expert curation and computational methods. We have previously demonstrated that named entity recognition (NER) tasks can be crowdsourced to a group of non-experts via the paid microtask platform Amazon Mechanical Turk (AMT), which can dramatically reduce the cost and increase the throughput of biocuration efforts. However, given the size of the biomedical literature, even information extraction via paid microtask platforms is not scalable. With our web-based application Mark2Cure (http://mark2cure.org), we demonstrate that NER tasks can also be performed by volunteer citizen scientists with high accuracy. We apply metrics from the Zooniverse Matrices of Citizen Science Success and provide the results here to serve as a basis of comparison for other citizen science projects. Further, we discuss design considerations, issues, and the application of analytics for successfully moving a crowdsourcing workflow from a paid microtask platform to a citizen science platform. To our knowledge, this study is the first application of citizen science to a natural language processing task.

    Additional file 1: Figure S1. of ghost-tree: creating hybrid-gene phylogenetic trees for diversity analyses

    Principal Coordinates comparing unsimulated (real) samples based on (a) unweighted UniFrac distances where trees are computed using ghost-tree, (b) weighted UniFrac distances where trees are computed using ghost-tree, (c) unweighted UniFrac distances where trees are computed using ghost-tree, 0-branch-length foundation, (d) weighted UniFrac distances where trees are computed using ghost-tree, 0-branch-length foundation, (e) unweighted UniFrac distances where trees are computed using ghost-tree, 0-branch-length extensions, (f) weighted UniFrac distances where trees are computed using ghost-tree, 0-branch-length extensions. Blue points are simulated and real human saliva samples, and red points are simulated and real restroom surface samples. Plots were made using EMPeror software [25]. (PDF 522 kb)

    Reproducible, Interactive, Scalable and Extensible Microbiome Data Science Using QIIME 2


    QIIME 2: Reproducible, interactive, scalable, and extensible microbiome data science

    Bolyen E, Rideout JR, Dillon MR, et al. QIIME 2: Reproducible, interactive, scalable, and extensible microbiome data science. PeerJ. 2018