
    Predicting Prokaryotic Ecological Niches Using Genome Sequence Analysis

    Automated DNA sequencing technology is so rapid that analysis has become the rate-limiting step. Hundreds of prokaryotic genome sequences are publicly available, with new genomes uploaded at the rate of approximately 20 per month. As a result, this growing body of genome sequences will include microorganisms not previously identified, isolated, or observed. We hypothesize that evolutionary pressure exerted by an ecological niche selects for a similar genetic repertoire in those prokaryotes that occupy the same niche, and that this is due to both vertical and horizontal transmission. To test this, we have developed a novel method to classify prokaryotes, by calculating their Pfam protein domain distributions and clustering them with all other sequenced prokaryotic species. Clusters of organisms are visualized in two dimensions as ‘mountains’ on a topological map. When compared to a phylogenetic map constructed using 16S rRNA, this map more accurately clusters prokaryotes according to functional and environmental attributes. We demonstrate the ability of this map, which we term a “niche map”, to cluster according to ecological niche both quantitatively and qualitatively, and propose that this method be used to associate uncharacterized prokaryotes with their ecological niche as a means of predicting their functional role directly from their genome sequence
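The classification idea in this abstract (describe each genome by its protein domain distribution, then group genomes with similar distributions) can be illustrated with a minimal sketch. The abstract does not specify the distance measure or clustering algorithm, so cosine similarity and single-linkage grouping are assumptions here, and the genome names and domain labels (`dom_a`, ...) are hypothetical toy data rather than real Pfam accessions.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def domain_profile(domains):
    """Turn a list of domain labels into relative frequencies."""
    counts = Counter(domains)
    total = sum(counts.values())
    return {d: n / total for d, n in counts.items()}


def cosine_similarity(p, q):
    """Cosine similarity between two sparse frequency profiles."""
    dot = sum(v * q.get(d, 0.0) for d, v in p.items())
    norm_p = sqrt(sum(v * v for v in p.values()))
    norm_q = sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q)


def cluster(profiles, threshold):
    """Single-linkage grouping: link genomes whose similarity >= threshold."""
    parent = {name: name for name in profiles}

    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x

    for a, b in combinations(profiles, 2):
        if cosine_similarity(profiles[a], profiles[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in profiles:
        groups.setdefault(find(name), set()).add(name)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical toy genomes, each listed as the domains found in it.
genomes = {
    "soil_1": ["dom_a", "dom_a", "dom_b", "dom_c"],
    "soil_2": ["dom_a", "dom_b", "dom_b", "dom_c"],
    "gut_1": ["dom_d", "dom_d", "dom_e", "dom_a"],
    "gut_2": ["dom_d", "dom_e", "dom_e"],
}
profiles = {name: domain_profile(d) for name, d in genomes.items()}
clusters = cluster(profiles, threshold=0.7)
print(clusters)
```

With this toy data the two "soil" genomes and the two "gut" genomes fall into separate groups. The threshold of 0.7 is arbitrary and would need tuning on real domain profiles.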

    A statistical toolbox for metagenomics: assessing functional diversity in microbial communities

    Background: The 99% of bacteria in the environment that are recalcitrant to culturing have spurred the development of metagenomics, a culture-independent approach to sample and characterize microbial genomes. Massive datasets of metagenomic sequences have been accumulated, but analysis of these sequences has focused primarily on the descriptive comparison of the relative abundance of proteins that belong to specific functional categories. More robust statistical methods are needed to make inferences from metagenomic data. In this study, we developed and applied a suite of tools to describe and compare the richness, membership, and structure of microbial communities using peptide fragment sequences extracted from metagenomic sequence data. Results: Application of these tools to acid mine drainage, soil, and whale fall metagenomic sequence collections revealed groups of peptide fragments with a relatively high abundance and no known function. When combined with analysis of 16S rRNA gene fragments from the same communities these tools enabled us to demonstrate that although there was no overlap in the types of 16S rRNA gene sequence observed, there was a core collection of operational protein families that was shared among the three environments. Conclusion: The results of comparisons between the three habitats were surprising considering the relatively low overlap of membership and the distinctively different characteristics of the three habitats. These tools will facilitate the use of metagenomics to pursue statistically sound genome-based ecological analyses.
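To make "richness" and "membership" concrete, the sketch below computes two common community statistics from per-family counts. The abstract does not name the estimators its toolbox uses, so the bias-corrected Chao1 richness estimator and the Jaccard index are assumptions chosen for illustration. The family names and counts are hypothetical.

```python
def chao1(counts):
    """Bias-corrected Chao1 richness estimate from abundance counts."""
    observed = sum(1 for n in counts.values() if n > 0)
    singletons = sum(1 for n in counts.values() if n == 1)
    doubletons = sum(1 for n in counts.values() if n == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


def jaccard(counts_a, counts_b):
    """Shared membership: families present in both / families present in either."""
    members_a = {k for k, n in counts_a.items() if n > 0}
    members_b = {k for k, n in counts_b.items() if n > 0}
    return len(members_a & members_b) / len(members_a | members_b)


# Hypothetical counts of operational protein families in two communities.
mine_drainage = {"fam1": 5, "fam2": 1, "fam3": 1, "fam4": 2}
soil = {"fam1": 3, "fam5": 1, "fam6": 4}

richness = chao1(mine_drainage)
shared = jaccard(mine_drainage, soil)
print(richness, shared)
```

Community structure, which also takes relative abundances into account, is not covered by this sketch.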

    Achievements and new knowledge unraveled by metagenomic approaches

    Metagenomics has paved the way for cultivation-independent assessment and exploitation of microbial communities present in complex ecosystems. In recent years, significant progress has been made in this research area. A major breakthrough was the improvement and development of high-throughput next-generation sequencing technologies. The application of these technologies resulted in the generation of large datasets derived from various environments such as soil and ocean water. The analyses of these datasets opened a window into the enormous phylogenetic and metabolic diversity of microbial communities living in a variety of ecosystems. In this way, structure, functions, and interactions of microbial communities were elucidated. Metagenomics has proven to be a powerful tool for the recovery of novel biomolecules. In most cases, functional metagenomics comprising construction and screening of complex metagenomic DNA libraries has been applied to isolate new enzymes and drugs of industrial importance. For this purpose, several novel and improved screening strategies that allow efficient screening of large collections of clones harboring metagenomes have been introduced

    A metagenomic assessment of winter and summer bacterioplankton from Antarctica Peninsula coastal surface waters

    © The Author(s), 2012. This article is distributed under the terms of the Creative Commons Attribution License. The definitive version was published in The ISME Journal 6 (2012): 1901-1915, doi:10.1038/ismej.2012.31. Antarctic surface oceans are well-studied during summer when irradiance levels are high, sea ice is melting and primary productivity is at a maximum. Coincident with this timing, the bacterioplankton respond with significant increases in secondary productivity. Little is known about bacterioplankton in winter when darkness and sea-ice cover inhibit photoautotrophic primary production. We report here an environmental genomic and small subunit ribosomal RNA (SSU rRNA) analysis of winter and summer Antarctic Peninsula coastal seawater bacterioplankton. Intense inter-seasonal differences were reflected through shifts in community composition and functional capacities encoded in winter and summer environmental genomes with significantly higher phylogenetic and functional diversity in winter. In general, inferred metabolisms of summer bacterioplankton were characterized by chemoheterotrophy, photoheterotrophy and aerobic anoxygenic photosynthesis while the winter community included the capacity for bacterial and archaeal chemolithoautotrophy. Chemolithoautotrophic pathways were dominant in winter and were similar to those recently reported in global ‘dark ocean’ mesopelagic waters. If chemolithoautotrophy is widespread in the Southern Ocean in winter, this process may be a previously unaccounted carbon sink and may help account for the unexplained anomalies in surface inorganic nitrogen content. CSR was supported by an NSF Postdoctoral Fellowship in Biological Informatics (DBI-0532893). The research was supported by National Science Foundation awards: ANT 0632389 (to AEM and JJG), and ANT 0632278 and 0217282 (to HWD), all from the Antarctic Organisms and Ecosystems Program.

    Gene prediction in metagenomic fragments: A large scale machine learning approach

    Background: Metagenomics is an approach to the characterization of microbial genomes via the direct isolation of genomic sequences from the environment without prior cultivation. The amount of metagenomic sequence data is growing fast while computational methods for metagenome analysis are still in their infancy. In contrast to genomic sequences of single species, which can usually be assembled and analyzed by many available methods, a large proportion of metagenome data remains as unassembled anonymous sequencing reads. One of the aims of all metagenomic sequencing projects is the identification of novel genes. The short length of the fragments (Sanger sequencing, for example, yields fragments of 700 bp on average) and the unknown phylogenetic origin of most fragments require approaches to gene prediction that differ from the currently available methods for genomes of single species. In particular, the large size of metagenomic samples requires fast and accurate methods with small numbers of false positive predictions. Results: We introduce a novel gene prediction algorithm for metagenomic fragments based on a two-stage machine learning approach. In the first stage, we use linear discriminants for monocodon usage, dicodon usage and translation initiation sites to extract features from DNA sequences. In the second stage, an artificial neural network combines these features with open reading frame length and fragment GC-content to compute the probability that this open reading frame encodes a protein. This probability is used for the classification and scoring of gene candidates. With large scale training, our method provides fast single fragment predictions with good sensitivity and specificity on artificially fragmented genomic DNA. Additionally, this method is able to predict translation initiation sites accurately and distinguishes complete from incomplete genes with high reliability. Conclusion: Large scale machine learning methods are well-suited for gene prediction in metagenomic DNA fragments. In particular, the combination of linear discriminants and neural networks is promising and should be considered for integration into metagenomic analysis pipelines. The data sets can be downloaded from the URL provided (see Availability and requirements section).
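The two-stage design described above can be sketched as a forward pass. This is only an illustration of the data flow, not the authors' trained model: all weights are hypothetical placeholders, only monocodon usage is scored in stage one (dicodon usage and translation initiation sites are omitted), and dividing the ORF length by 700 is an assumed normalization based on the average Sanger fragment length quoted in the abstract.

```python
from collections import Counter
from math import exp, tanh


def gc_content(seq):
    """Fraction of G and C bases in a sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


def codon_frequencies(orf):
    """Relative frequency of each codon in an in-frame sequence."""
    usable = len(orf) - len(orf) % 3
    codons = [orf[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(codons)
    return {c: n / len(codons) for c, n in counts.items()}


def linear_score(freqs, weights, bias=0.0):
    """Stage 1: a linear discriminant reduces codon usage to one score."""
    return bias + sum(weights.get(c, 0.0) * f for c, f in freqs.items())


def coding_probability(features, hidden_weights, output_weights):
    """Stage 2: one tanh hidden layer followed by a sigmoid output unit."""
    hidden = [
        tanh(sum(w * x for w, x in zip(row, features)))
        for row in hidden_weights
    ]
    activation = sum(w * h for w, h in zip(output_weights, hidden))
    return 1.0 / (1.0 + exp(-activation))


# Hypothetical, untrained weights used only to show the data flow.
CODON_WEIGHTS = {"ATG": 1.0, "GCC": 0.5, "TAA": -1.0}
HIDDEN_WEIGHTS = [[1.0, 0.5, 0.2], [-0.5, 1.0, 0.3]]
OUTPUT_WEIGHTS = [1.2, 0.8]

orf = "ATGGCCGCCAAATAA"  # toy open reading frame
freqs = codon_frequencies(orf)
features = [
    linear_score(freqs, CODON_WEIGHTS),  # stage-1 codon usage score
    len(orf) / 700,                      # ORF length, assumed normalization
    gc_content(orf),                     # GC content of the toy fragment
]
probability = coding_probability(features, HIDDEN_WEIGHTS, OUTPUT_WEIGHTS)
print(features, probability)
```

In a real pipeline both stages would be trained on annotated genomes, and the output probability would be thresholded to classify and rank gene candidates.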

    Multiple Data Analyses and Statistical Approaches for Analyzing Data from Metagenomic Studies and Clinical Trials

    Metagenomics, also known as environmental genomics, is the study of the genomic content of a sample of organisms (microbes) obtained from a common habitat. Metagenomics and other “omics” disciplines have captured the attention of researchers for several decades. The effect of microbes in our body is a relevant concern for health studies. Many metagenomic studies examine microorganisms that inhabit niches in the human body, sometimes causing disease, and that are often correlated with multiple treatment conditions. Whatever environment a sample comes from, the analyses often aim either to determine the presence or absence of specific species of interest in a given metagenome or to compare the biological diversity and functional activity of a wider range of microorganisms within their communities. Such comparisons become more important across different settings, such as multiple patients with different conditions, multiple drugs, and multiple time points of the same treatment or the same patient. Thus, no matter how many hypotheses we have, genomics, bioinformatics, and statistics must work together to analyze and interpret these datasets in a meaningful way. This chapter provides an overview of different data analyses and statistical approaches (with example scenarios) for analyzing metagenomics samples from different medical projects or clinical trials.

    Clinical Trials in Head Injury

    Traumatic brain injury (TBI) remains a major public health problem globally. In the United States the incidence of closed head injuries admitted to hospitals is conservatively estimated to be 200 per 100,000 population, and the incidence of penetrating head injury is estimated to be 12 per 100,000, the highest of any developed country in the world. This yields an approximate number of 500,000 new cases each year, a sizeable proportion of which demonstrate significant long-term disabilities. Unfortunately, there is a paucity of proven therapies for this disease. For a variety of reasons, clinical trials for this condition have been difficult to design and perform. Despite promising pre-clinical data, most of the trials that have been performed in recent years have failed to demonstrate any significant improvement in outcomes. The reasons for these failures have not always been apparent and any insights gained were not always shared. It was therefore feared that we were running the risk of repeating our mistakes. Recognizing the importance of TBI, the National Institute of Neurological Disorders and Stroke (NINDS) sponsored a workshop that brought together experts from clinical, research, and pharmaceutical backgrounds. This workshop proved to be very informative and yielded many insights into previous and future TBI trials. This paper is an attempt to summarize the key points made at the workshop. It is hoped that these lessons will enhance the planning and design of future efforts in this important field of research. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/63185/1/089771502753754037.pd

    Pyrosequencing of Antibiotic-Contaminated River Sediments Reveals High Levels of Resistance and Gene Transfer Elements

    The high and sometimes inappropriate use of antibiotics has accelerated the development of antibiotic resistance, creating a major challenge for the sustainable treatment of infections world-wide. Bacterial communities often respond to antibiotic selection pressure by acquiring resistance genes, i.e. mobile genetic elements that can be shared horizontally between species. Environmental microbial communities maintain diverse collections of resistance genes, which can be mobilized into pathogenic bacteria. Recently, exceptional environmental releases of antibiotics have been documented, but the effects on the promotion of resistance genes and the potential for horizontal gene transfer have so far received limited attention. In this study, we have used culture-independent shotgun metagenomics to investigate microbial communities in river sediments exposed to waste water from the production of antibiotics in India. Our analysis identified very high levels of several classes of resistance genes as well as elements for horizontal gene transfer, including integrons, transposons and plasmids. In addition, two abundant previously uncharacterized resistance plasmids were identified. The results suggest that antibiotic contamination plays a role in the promotion of resistance genes and their mobilization from environmental microbes to other species and eventually to human pathogens. The entire life-cycle of antibiotic substances, before, during and after usage, should therefore be considered to fully evaluate their role in the promotion of resistance.

    Finding the Needles in the Metagenome Haystack

    In the collective genomes (the metagenome) of the microorganisms inhabiting the Earth’s diverse environments is written the history of life on this planet. New molecular tools developed and used for the past 15 years by microbial ecologists are facilitating the extraction, cloning, screening, and sequencing of these genomes. This approach allows microbial ecologists to access and study the full range of microbial diversity, regardless of our ability to culture organisms, and provides an unprecedented access to the breadth of natural products that these genomes encode. However, there is no way that the mere collection of sequences, no matter how expansive, can provide full coverage of the complex world of microbial metagenomes within the foreseeable future. Furthermore, although it is possible to fish out highly informative and useful genes from the sea of gene diversity in the environment, this can be a highly tedious and inefficient procedure. Microbial ecologists must be clever in their pursuit of ecologically relevant, valuable, and niche-defining genomic information within the vast haystack of microbial diversity. In this report, we seek to describe advances and prospects that will help microbial ecologists glean more knowledge from investigations into metagenomes. These include technological advances in sequencing and cloning methodologies, as well as improvements in annotation and comparative sequence analysis. More significant, however, will be ways to focus in on various subsets of the metagenome that may be of particular relevance, either by limiting the target community under study or improving the focus or speed of screening procedures. 
Lastly, given the cost and infrastructure necessary for large metagenome projects, and the almost inexhaustible amount of data they can produce, trends toward broader use of metagenome data across the research community coupled with the needed investment in bioinformatics infrastructure devoted to metagenomics will no doubt further increase the value of metagenomic studies in various environments

    Computational Biology Methods and Their Application to the Comparative Genomics of Endocellular Symbiotic Bacteria of Insects

    Comparative genomics has become a truly tantalizing challenge in the postgenomic era. This has been magnified by the plethora of new genomes becoming available on a daily basis. The overwhelming list of new genomes to compare has pushed the field of bioinformatics and computational biology toward the design and development of methods capable of identifying patterns in a sea of noisy data. Despite many advances in this endeavor, persistent exceptions to the general patterns continue to pose difficulties in generalizing methods for comparative genomics. In this review, we discuss the different tools devised to undertake the challenge of comparative genomics and some of the exceptions that compromise the generality of such methods. We focus on endosymbiotic bacteria of insects because of their peculiar genomic dynamics when compared to free-living organisms.