96 research outputs found

    A Parsimony Approach to Biological Pathway Reconstruction/Inference for Genomes and Metagenomes

    A common biological pathway reconstruction approach, as implemented by many automatic biological pathway services (such as the KAAS and RAST servers) and in the functional annotation of metagenomic sequences, starts with the identification of protein functions or families (e.g., KO families for the KEGG database and FIG families for the SEED database) in the query sequences, followed by a direct mapping of the identified protein families onto pathways. Given a predicted patchwork of individual biochemical steps, some metric must be applied to decide which pathways actually exist in the genome or metagenome represented by the sequences. Commonly, and straightforwardly, a complete biological pathway is considered present in a dataset if at least one of the steps associated with the pathway is found. We report, however, that this naïve mapping approach leads to an inflated estimate of biological pathways, and thus overestimates the functional diversity of the sample from which the DNA sequences are derived. We developed a parsimony approach, called MinPath (Minimal set of Pathways), for biological pathway reconstruction using protein family predictions, which yields a more conservative, yet more faithful, estimation of the biological pathways for a query dataset. MinPath identified far fewer pathways for the genomes collected in the KEGG database than the naïve mapping approach, eliminating some obviously spurious pathway annotations. Results from applying MinPath to several metagenomes indicate that the common methods used for metagenome annotation may significantly overestimate the biological pathways encoded by microbial communities.
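    The contrast between naïve mapping and a parsimony criterion can be illustrated with a minimal sketch. It is not the MinPath implementation: the abstract does not describe the solver, so this example uses an exhaustive search that is only practical for toy inputs. The pathway names, family identifiers, and function names are all hypothetical.

```python
from itertools import combinations


def naive_pathways(families, pathway_to_families):
    """Naive mapping: report every pathway with at least one observed family."""
    observed = set(families)
    return {p for p, fams in pathway_to_families.items() if observed & set(fams)}


def minimal_pathways(families, pathway_to_families):
    """Return a smallest set of pathways that explains every mappable family.

    Exhaustive search over subsets of candidate pathways (toy inputs only).
    This is an illustrative stand-in, not the solver used by MinPath.
    """
    observed = set(families)
    candidates = sorted(naive_pathways(observed, pathway_to_families))
    # Only families that belong to at least one pathway can be explained.
    mappable = {f for p in candidates for f in pathway_to_families[p]} & observed
    for size in range(len(candidates) + 1):
        for subset in combinations(candidates, size):
            covered = set()
            for p in subset:
                covered |= set(pathway_to_families[p])
            if mappable <= covered:
                return set(subset)
    return set(candidates)


# Hypothetical toy data: three pathways that share some protein families.
PATHWAYS = {
    "pathway_A": {"F1", "F2", "F3"},
    "pathway_B": {"F3", "F4"},
    "pathway_C": {"F2", "F5"},
}
OBSERVED = ["F1", "F2", "F3"]

NAIVE = naive_pathways(OBSERVED, PATHWAYS)
MINIMAL = minimal_pathways(OBSERVED, PATHWAYS)
print("naive:  ", sorted(NAIVE))
print("minimal:", sorted(MINIMAL))
```

    In this toy case the naïve mapping reports all three pathways, because each contains at least one observed family, while the parsimony criterion keeps only the single pathway that accounts for every observed family.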

    Ser/Thr/Tyr Protein Phosphorylation in the Archaeon Halobacterium salinarum—A Representative of the Third Domain of Life

    In the quest for the origin and evolution of protein phosphorylation, the major regulatory post-translational modification in eukaryotes, the members of archaea, the “third domain of life”, play a leading role. A plethora of studies have demonstrated that archaeal proteins are subject to post-translational modification by covalent phosphorylation, but little is known concerning the identities of the proteins affected, the impact on their functionality, the physiological roles of archaeal protein phosphorylation/dephosphorylation, and the protein kinases/phosphatases involved. These limited studies led to the initial hypothesis that archaea, similarly to other prokaryotes, mainly use histidine/aspartate phosphorylation in their two-component systems, which represent a paradigm of prokaryotic signal transduction, while eukaryotes mostly use Ser/Thr/Tyr phosphorylation to create highly sophisticated regulatory networks. Contrary to this hypothesis, several studies showed that Ser/Thr/Tyr phosphorylation is also common in the bacterial cell, and here we present the first genome-wide phosphoproteomic analysis of the model organism of archaea, Halobacterium salinarum, proving the existence/conservation of Ser/Thr/Tyr phosphorylation in the “third domain” of life and allowing a better understanding of the origin and evolution of the so-called “Nature's premier” mechanism for regulating the functional properties of proteins.

    Early Evolution of Ionotropic GABA Receptors and Selective Regimes Acting on the Mammalian-Specific Theta and Epsilon Subunits

    BACKGROUND: The amino acid neurotransmitter GABA is abundant in the central nervous system (CNS) of both invertebrates and vertebrates. Receptors of this neurotransmitter play a key role in important processes such as learning and memory. Yet, little is known about the mode and tempo of evolution of these receptors. Here, we investigate the phylogenetic relationships of GABA receptor subunits across the chordates and detail their mode of evolution among mammals. PRINCIPAL FINDINGS: Our analyses support two major monophyletic clades: one clade containing GABA(A) receptor alpha, gamma, and epsilon subunits, and another one containing GABA(A) receptor rho, beta, delta, theta, and pi subunits. The presence of GABA receptor subunits from each of the major clades in the Ciona intestinalis genome suggests that these ancestral duplication events occurred before the divergence of urochordates. However, while gene divergence proceeded at similar rates for most receptor subunits, we show that the mammalian-specific subunits theta and epsilon experienced an episode of positive selection and of relaxed constraints, respectively, after the duplication event. Sites putatively under positive selection are placed on a three-dimensional model obtained by homology modeling. CONCLUSIONS: Our results suggest an early divergence of the GABA receptor subunits, before the split from urochordates. We show that functional changes occurred in the lineages leading to the mammalian-specific subunit theta, and we identify the amino acid sites putatively responsible for the functional divergence. We discuss potential consequences for the evolution of mammals and of their CNS.

    Protein Function Assignment through Mining Cross-Species Protein-Protein Interactions

    Background: As we move into the post-genome-sequencing era, an immediate challenge is how to make best use of the large amount of high-throughput experimental data to assign functions to currently uncharacterized proteins. We here describe CSIDOP, a new method for protein function assignment based on shared interacting domain patterns extracted from cross-species protein-protein interaction data. Methodology/Principal Findings: The proposed method is assessed both biologically and statistically over the genome of H. sapiens. The CSIDOP method is capable of predicting protein function with an accuracy of 95.42% using 2,972 gene ontology (GO) functional categories. In addition, we are able to assign novel functional annotations for 181 previously uncharacterized proteins in H. sapiens. Furthermore, we demonstrate that for proteins that are already characterized by GO, CSIDOP may predict additional functions. This is attractive because a protein normally executes a variety of functions in different processes and its current GO annotation may be incomplete. Conclusions/Significance: Experimental results show that the CSIDOP method is reliable and practical in use. The method will continue to improve as more high-quality interaction data become available and is readily scalable t

    Genome Sequence of a Mesophilic Hydrogenotrophic Methanogen Methanocella paludicola, the First Cultivated Representative of the Order Methanocellales

    We report the complete genome sequence of a mesophilic hydrogenotrophic methanogen, Methanocella paludicola, the first cultured representative of the order Methanocellales, once recognized as an uncultured key archaeal group for methane emission in rice fields. The genome sequence of M. paludicola consists of a single circular chromosome of 2,957,635 bp containing 3004 protein-coding sequences (CDS). Genes for most of the functions known in the methanogenic archaea were identified, e.g. a full complement of hydrogenases and methanogenesis enzymes. The mixotrophic growth of M. paludicola was clarified by the genomic characterization and re-examined by subsequent growth experiments. Comparative genome analysis with the previously reported genome sequence of RC-IMRE50, which was metagenomically reconstructed, demonstrated that about 70% of M. paludicola CDSs were genetically related to RC-IMRE50 CDSs. These CDSs included the genes involved in hydrogenotrophic methane production, an incomplete TCA cycle, assimilatory sulfate reduction, and other functions. However, the genetic components for carbon and nitrogen fixation and the antioxidant system differed between the two Methanocellales genomes. The difference is likely associated with the physiological variability between M. paludicola and RC-IMRE50, further suggesting the genomic and physiological diversity of the Methanocellales methanogens. Comparative genome analysis among the previously determined methanogen genomes points to the genome-wide relatedness of the Methanocellales methanogens to the methanogens of the orders Methanosarcinales and Methanomicrobiales in terms of genetic repertoire. Meanwhile, the unique evolutionary history of the Methanocellales methanogens is also partly traced by the comparative genome analysis among the methanogens.

    Mining Predicted Essential Genes of Brugia malayi for Nematode Drug Targets

    We report results from the first genome-wide application of a rational drug target selection methodology to a metazoan pathogen genome, the completed draft sequence of Brugia malayi, a parasitic nematode responsible for human lymphatic filariasis. More than 1.5 billion people worldwide are at risk of contracting lymphatic filariasis and onchocerciasis, a related filarial disease. Drug treatments for filariasis have not changed significantly in over 20 years, and with the risk of resistance rising, there is an urgent need for the development of new anti-filarial drug therapies. The recent publication of the draft genomic sequence for B. malayi enables a genome-wide search for new drug targets. However, there are no functional genomics data in B. malayi to guide the selection of potential drug targets. To circumvent this problem, we have used the free-living model nematode Caenorhabditis elegans as a surrogate for B. malayi. Sequence comparisons between the two genomes allow us to map C. elegans orthologs to B. malayi genes. Using these orthology mappings and incorporating the extensive genomic and functional genomic data, including genome-wide RNAi screens, that already exist for C. elegans, we identify potentially essential genes in B. malayi. Further incorporation of human host genome sequence data and a custom algorithm for prioritization enables us to collect and rank nearly 600 drug target candidates. Previously identified potential drug targets cluster near the top of our prioritized list, lending credibility to our methodology. Over-represented Gene Ontology terms, predicted InterPro domains, and RNAi phenotypes of C. elegans orthologs associated with the potential target pool are identified. By virtue of the selection procedure, the potential B. malayi drug targets highlight components of key processes in nematode biology such as central metabolism, molting and regulation of gene expression.