1,113 research outputs found

    Phylogeny of Prokaryotes and Chloroplasts Revealed by a Simple Composition Approach on All Protein Sequences from Complete Genomes Without Sequence Alignment

    The complete genomes of living organisms have provided much information on their phylogenetic relationships. Similarly, the complete genomes of chloroplasts have helped to resolve the evolution of this organelle in photosynthetic eukaryotes. In this paper we propose an alternative method of phylogenetic analysis using compositional statistics for all protein sequences from complete genomes. This new method is conceptually simpler than, and computationally as fast as, the one proposed by Qi et al. (2004b) and Chu et al. (2004). The same data sets used in Qi et al. (2004b) and Chu et al. (2004) are analyzed using the new method. Our distance-based phylogenetic tree of the 109 prokaryotes and eukaryotes agrees with the biologists' tree of life based on 16S rRNA comparison in the great majority of basic branchings and most lower taxa. Our phylogenetic analysis also shows that the chloroplast genomes are separated into two major clades corresponding to chlorophytes s.l. and rhodophytes s.l. The interrelationships among the chloroplasts are largely in agreement with the current understanding of chloroplast evolution.
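The abstract above does not say which compositional statistics are used, so the following is only a minimal sketch of the general alignment-free idea, not the authors' method. It assumes that composition means relative frequencies of overlapping k-strings pooled over a proteome, and that organisms are compared with a cosine-based distance. The names `composition_vector` and `cosine_distance`, the choice `k=3`, and the toy sequences are all hypothetical.

```python
from collections import Counter
from math import sqrt


def composition_vector(proteins, k=3):
    """Relative frequencies of overlapping k-strings pooled over all proteins.

    Assumption: this stands in for the paper's 'compositional statistics',
    which the abstract does not define.
    """
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kmer: n / total for kmer, n in counts.items()}


def cosine_distance(u, v):
    """1 - cosine similarity of two sparse frequency vectors.

    For non-negative vectors the result lies in [0, 1]; an empty vector
    is treated as maximally distant.
    """
    dot = sum(x * v.get(kmer, 0.0) for kmer, x in u.items())
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


# Toy proteomes (made-up sequences, for illustration only).
proteomes = {
    "A": ["MKVLAAGIV", "MKVLSAGIV"],
    "B": ["MKVLAAGIV"],
    "C": ["WWPHHCCDE"],
}
vecs = {name: composition_vector(seqs) for name, seqs in proteomes.items()}

# Pairwise distance matrix; a tree-building method such as neighbor joining
# could take this as input.
names = sorted(vecs)
dist = {(a, b): cosine_distance(vecs[a], vecs[b]) for a in names for b in names}
```

No sequence alignment is needed here: each proteome is reduced to a single vector, so the cost of comparing two organisms depends on the number of distinct k-strings rather than on an alignment.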

    Origin of symbol-using systems: speech, but not sign, without the semantic urge

    Natural language—spoken and signed—is a multichannel phenomenon, involving facial and body expression, and voice and visual intonation that is often used in the service of a social urge to communicate meaning. Given that iconicity seems easier and less abstract than making arbitrary connections between sound and meaning, iconicity and gesture have often been invoked in the origin of language alongside the urge to convey meaning. To get a fresh perspective, we critically distinguish the origin of a system capable of evolution from the subsequent evolution that system becomes capable of. Human language arose on a substrate of a system already capable of Darwinian evolution; the genetically supported uniquely human ability to learn a language reflects a key contact point between Darwinian evolution and language. Though implemented in brains generated by DNA symbols coding for protein meaning, the second higher-level symbol-using system of language now operates in a world mostly decoupled from Darwinian evolutionary constraints. Examination of Darwinian evolution of vocal learning in other animals suggests that the initial fixation of a key prerequisite to language into the human genome may actually have required initially side-stepping not only iconicity, but the urge to mean itself. If sign languages came later, they would not have faced this constraint

    Four small puzzles that Rosetta doesn't solve

    A complete macromolecule modeling package must be able to solve the simplest structure prediction problems. Despite recent successes in high resolution structure modeling and design, the Rosetta software suite fares poorly on deceptively small protein and RNA puzzles, some as small as four residues. To illustrate these problems, this manuscript presents extensive Rosetta results for four well-defined test cases: the 20-residue mini-protein Trp cage, an even smaller disulfide-stabilized conotoxin, the reactive loop of a serine protease inhibitor, and a UUCG RNA tetraloop. In contrast to previous Rosetta studies, several lines of evidence indicate that conformational sampling is not the major bottleneck in modeling these small systems. Instead, approximations and omissions in the Rosetta all-atom energy function currently preclude discriminating experimentally observed conformations from de novo models at atomic resolution. These molecular "puzzles" should serve as useful model systems for developers wishing to make foundational improvements to this powerful modeling suite. Comment: Published in PLoS One as a manuscript for the RosettaCon 2010 Special Collection.

    Missing lithotroph identified as new planctomycete

    With the increased use of chemical fertilizers in agriculture, many densely populated countries face environmental problems associated with high ammonia emissions. The process of anaerobic ammonia oxidation ('anammox') is one of the most innovative technological advances in the removal of ammonia nitrogen from waste water. This new process combines ammonia and nitrite directly into dinitrogen gas. Until now, bacteria capable of anaerobically oxidizing ammonia had never been found and were known as "lithotrophs missing from nature". Here we report the discovery of this missing lithotroph and its identification as a new, autotrophic member of the order Planctomycetales, one of the major distinct divisions of the Bacteria. The new planctomycete grows extremely slowly, dividing only once every two weeks. At present, it cannot be cultivated by conventional microbiological techniques. The identification of this bacterium as the one responsible for anaerobic oxidation of ammonia makes an important contribution to the problem of unculturability

    Ancient horizontal gene transfer and the last common ancestors

    Background The genomic history of prokaryotic organismal lineages is marked by extensive horizontal gene transfer (HGT) between groups of organisms at all taxonomic levels. These HGT events have played an essential role in the origin and distribution of biological innovations. Analyses of ancient gene families show that HGT existed in the distant past, even at the time of the organismal last universal common ancestor (LUCA). Most gene transfers originated in lineages that have since gone extinct. Therefore, one cannot assume that the last common ancestors of each gene were all present in the same cell representing the cellular ancestor of all extant life. Results Organisms existing as part of a diverse ecosystem at the time of LUCA likely shared genetic material between lineages. If these other lineages persisted for some time, HGT with the descendants of LUCA could have continued into the bacterial and archaeal lineages. Phylogenetic analyses of aminoacyl-tRNA synthetase protein families support the hypothesis that the molecular common ancestors of the most ancient gene families did not all coincide in space and time. This is most apparent in the evolutionary histories of seryl-tRNA synthetase and threonyl-tRNA synthetase protein families, each containing highly divergent “rare” forms, as well as the sparse phylogenetic distributions of pyrrolysyl-tRNA synthetase, and the bacterial heterodimeric form of glycyl-tRNA synthetase. These topologies and phyletic distributions are consistent with horizontal transfers from ancient, likely extinct branches of the tree of life. Conclusions Of all the organisms that may have existed at the time of LUCA, by definition only one lineage is survived by known progeny; however, this lineage retains a genomic record of heterogeneous genetic origins. 
The evolutionary histories of aminoacyl-tRNA synthetases (aaRS) are especially informative in detecting this signal, as they perform primordial biological functions, have undergone several ancient HGT events, and contain many sites with low substitution rates allowing deep phylogenetic reconstruction. We conclude that some aaRS families contain groups that diverge before LUCA. We propose that these ancient gene variants be described by the term “hypnologs”, reflecting their ancient, reticulate origin from a time in life history that has been all but erased. National Science Foundation (U.S.) (Grant DEB 0830024); Exobiology Program (U.S.) (Grant NNX10AR85G); United States. National Aeronautics and Space Administration (Postdoctoral Program

    Molecular Evolution of Aminoacyl tRNA Synthetase Proteins in the Early History of Life

    Aminoacyl-tRNA synthetases (aaRS) consist of several families of functionally conserved proteins essential for translation and protein synthesis. Like nearly all components of the translation machinery, most aaRS families are universally distributed across cellular life, being inherited from the time of the Last Universal Common Ancestor (LUCA). However, unlike the rest of the translation machinery, aaRS have undergone numerous ancient horizontal gene transfers, with several independent events detected between domains, and some possibly involving lineages diverging before the time of LUCA. These transfers reveal the complexity of molecular evolution at this early time, and the chimeric nature of genomes within cells that gave rise to the major domains. Additionally, given the role of these protein families in defining the amino acids used for protein synthesis, sequence reconstruction of their pre-LUCA ancestors can reveal the evolutionary processes at work in the origin of the genetic code. In particular, sequence reconstructions of the paralog ancestors of isoleucyl- and valyl-RS provide strong empirical evidence that at least for this divergence, the genetic code did not co-evolve with the aaRSs; rather, both amino acids were already part of the genetic code before their cognate aaRSs diverged from their common ancestor. The implications of this observation for the early evolution of RNA-directed protein biosynthesis are discussed. National Science Foundation (U.S.) (Grant DEB 0830024); National Science Foundation (U.S.) (Grant DEB 0936234); United States. National Aeronautics and Space Administration (NASA Postdoctoral Fellowship

    Tensor Decomposition Reveals Concurrent Evolutionary Convergences and Divergences and Correlations with Structural Motifs in Ribosomal RNA

    Evolutionary relationships among organisms are commonly described by using a hierarchy derived from comparisons of ribosomal RNA (rRNA) sequences. We propose that even on the level of a single rRNA molecule, an organism's evolution is composed of multiple pathways due to concurrent forces that act independently upon different rRNA degrees of freedom. Relationships among organisms are then compositions of coexisting pathway-dependent similarities and dissimilarities, which cannot be described by a single hierarchy. We computationally test this hypothesis in comparative analyses of 16S and 23S rRNA sequence alignments by using a tensor decomposition, i.e., a framework for modeling composite data. Each alignment is encoded in a cuboid, i.e., a third-order tensor, where nucleotides, positions and organisms, each represent a degree of freedom. A tensor mode-1 higher-order singular value decomposition (HOSVD) is formulated such that it separates each cuboid into combinations of patterns of nucleotide frequency variation across organisms and positions, i.e., “eigenpositions” and corresponding nucleotide-specific segments of “eigenorganisms,” respectively, independent of a-priori knowledge of the taxonomic groups or rRNA structures. We find, in support of our hypothesis that, first, the significant eigenpositions reveal multiple similarities and dissimilarities among the taxonomic groups. Second, the corresponding eigenorganisms identify insertions or deletions of nucleotides exclusively conserved within the corresponding groups, that map out entire substructures and are enriched in adenosines, unpaired in the rRNA secondary structure, that participate in tertiary structure interactions. This demonstrates that structural motifs involved in rRNA folding and function are evolutionary degrees of freedom. 
Third, two previously unknown coexisting subgenic relationships between Microsporidia and Archaea are revealed in both the 16S and 23S rRNA alignments, a convergence and a divergence, conferred by insertions and deletions of these motifs, which cannot be described by a single hierarchy. This shows that mode-1 HOSVD modeling of rRNA alignments might be used to computationally predict evolutionary mechanisms

    Varieties of living things: Life at the intersection of lineage and metabolism

    publication-status: Published; types: Article

    Bioinformatics for the human microbiome project

    Microbes inhabit virtually all sites of the human body, yet we know very little about the role they play in our health. In recent years, there has been increasing interest in studying human-associated microbial communities, particularly since microbial dysbioses have now been implicated in a number of human diseases [1]–[3]. Dysbiosis, the disruption of the normal microbial community structure, however, is impossible to define without first establishing what “normal microbial community structure” means within the healthy human microbiome. Recent advances in sequencing technologies have made it feasible to perform large-scale studies of microbial communities, providing the tools necessary to begin to address this question [4], [5]. This led to the implementation of the Human Microbiome Project (HMP) in 2007, an initiative funded by the National Institutes of Health Roadmap for Biomedical Research and constructed as a large, genome-scale community research project [6]. Any such project must plan for data analysis, computational methods development, and the public availability of tools and data; here, we provide an overview of the corresponding bioinformatics organization, history, and results from the HMP (Figure 1). National Institutes of Health (U.S.) (NIH U54HG004969); National Institutes of Health (U.S.) (grant R01HG004885); National Institutes of Health (U.S.) (grant R01HG005975); National Institutes of Health (U.S.) (grant R01HG005969