
    Global Functional Atlas of Escherichia coli Encompassing Previously Uncharacterized Proteins

    One-third of the 4,225 protein-coding genes of Escherichia coli K-12 remain functionally unannotated (orphans). Many map to distant clades such as Archaea, suggesting involvement in basic prokaryotic traits, whereas others appear restricted to E. coli, including pathogenic strains. To elucidate the orphans’ biological roles, we performed an extensive proteomic survey using affinity-tagged E. coli strains and generated comprehensive genomic context inferences to derive a high-confidence compendium for virtually the entire proteome consisting of 5,993 putative physical interactions and 74,776 putative functional associations, most of which are novel. Clustering of the respective probabilistic networks revealed putative orphan membership in discrete multiprotein complexes and functional modules together with annotated gene products, whereas a machine-learning strategy based on network integration implicated the orphans in specific biological processes. We provide additional experimental evidence supporting orphan participation in protein synthesis, amino acid metabolism, biofilm formation, motility, and assembly of the bacterial cell envelope. This resource provides a “systems-wide” functional blueprint of a model microbe, with insights into the biological and evolutionary significance of previously uncharacterized proteins.

    The genetics of symbiotic nitrogen fixation: comparative genomics of 14 Rhizobia Strains by resolution of protein clusters.

    The symbiotic relationship between legumes and nitrogen-fixing bacteria is critical for agriculture, as it may have profound impacts on lowering costs for farmers, on land sustainability, on soil quality, and on mitigation of greenhouse gas emissions. However, despite the importance of the symbioses to the global nitrogen cycling balance, very few rhizobial genomes have been sequenced so far, although there are some ongoing efforts in sequencing elite strains. In this study, the genomes of fourteen selected strains of the order Rhizobiales, all previously fully sequenced and annotated, were compared to assess differences between the strains and to investigate the feasibility of defining a core "symbiome": the essential genes required by all rhizobia for nodulation and nitrogen fixation. Comparison of these whole genomes has revealed valuable information, such as several events of lateral gene transfer, particularly in the symbiotic plasmids and genomic islands, that have contributed to a better understanding of the evolution of contrasting symbioses. Unique genes were also identified, as well as omissions of symbiotic genes that were expected to be found. Protein comparisons have also allowed the identification of a variety of similarities and differences in several groups of genes, including those involved in nodulation, nitrogen fixation, production of exopolysaccharides, and Type I to Type VI secretion systems, among others, and of some key genes that could be related to host specificity and/or a better saprophytic ability. However, while several significant differences in the type and number of proteins were observed, the evidence presented suggests no simple core symbiome exists. A more abstract systems biology concept of nitrogen-fixing symbiosis may be required. The results have also highlighted that comparative genomics represents a valuable tool for capturing specificities and generalities of each genome.

    Genomic and experimental evidence for multiple metabolic functions in the RidA/YjgF/YER057c/UK114 (Rid) protein family.

    Background: It is now recognized that enzymatic or chemical side-reactions can convert normal metabolites to useless or toxic ones and that a suite of enzymes exists to mitigate such metabolite damage. Examples are the reactive imine/enamine intermediates produced by threonine dehydratase, which damage the pyridoxal 5'-phosphate cofactor of various enzymes causing inactivation. This damage is pre-empted by RidA proteins, which hydrolyze the imines before they do harm. RidA proteins belong to the YjgF/YER057c/UK114 family (here renamed the Rid family). Most other members of this diverse and ubiquitous family lack defined functions. Results: Phylogenetic analysis divided the Rid family into a widely distributed, apparently archetypal RidA subfamily and seven other subfamilies (Rid1 to Rid7) that are largely confined to bacteria and often co-occur in the same organism with RidA and each other. The Rid1 to Rid3 subfamilies, but not the Rid4 to Rid7 subfamilies, have a conserved arginine residue that, in RidA proteins, is essential for imine-hydrolyzing activity. Analysis of the chromosomal context of bacterial RidA genes revealed clustering with genes for threonine dehydratase and other pyridoxal 5'-phosphate-dependent enzymes, which fits with the known RidA imine hydrolase activity. Clustering was also evident between Rid family genes and genes specifying FAD-dependent amine oxidases or enzymes of carbamoyl phosphate metabolism. Biochemical assays showed that Salmonella enterica RidA and Rid2, but not Rid7, can hydrolyze imines generated by amino acid oxidase. Genetic tests indicated that carbamoyl phosphate overproduction is toxic to S. enterica cells lacking RidA, and metabolomic profiling of Rid knockout strains showed ten-fold accumulation of the carbamoyl phosphate-related metabolite dihydroorotate. Conclusions: Like the archetypal RidA subfamily, the Rid2, and probably the Rid1 and Rid3 subfamilies, have imine-hydrolyzing activity and can pre-empt damage from imines formed by amine oxidases as well as by pyridoxal 5'-phosphate enzymes. The RidA subfamily has an additional damage pre-emption role in carbamoyl phosphate metabolism that has yet to be biochemically defined. Finally, the Rid4 to Rid7 subfamilies appear not to hydrolyze imines and thus remain mysterious.

    Integration and mining of malaria molecular, functional and pharmacological data: how far are we from a chemogenomic knowledge space?

    The organization and mining of malaria genomic and post-genomic data is strongly motivated by the need to predict and characterize new biological targets and new drugs. Biological targets are sought in a biological space built from the genomic data of Plasmodium falciparum, but also from the millions of genomic data points available for other species. Drug candidates are sought in a chemical space containing the millions of small molecules stored in public and private chemolibraries. Data management should therefore be as reliable and versatile as possible. In this context, we examined five aspects of the organization and mining of malaria genomic and post-genomic data: 1) the comparison of protein sequences, including compositionally atypical malaria sequences, 2) the high-throughput reconstruction of molecular phylogenies, 3) the representation of biological processes, particularly metabolic pathways, 4) the versatile methods to integrate genomic data, biological representations and functional profiling obtained from X-omic experiments after drug treatments, and 5) the determination and prediction of protein structures and their molecular docking with drug candidate structures. Progress toward a grid-enabled chemogenomic knowledge space is discussed. Comment: 43 pages, 4 figures, to appear in Malaria Journal.

    MorphDB: prioritizing genes for specialized metabolism pathways and gene ontology categories in plants

    Recent times have seen an enormous growth of "omics" data, of which high-throughput gene expression data are arguably the most important from a functional perspective. Despite huge improvements in computational techniques for the functional classification of gene sequences, common similarity-based methods often fall short of providing full and reliable functional information. Recently, the combination of comparative genomics with approaches in functional genomics has received considerable interest for gene function analysis, leveraging both gene-expression-based guilt-by-association methods and annotation efforts in closely related model organisms. Besides the identification of missing genes in pathways, these methods also typically enable the discovery of biological regulators (i.e., transcription factors or signaling genes). A previously built guilt-by-association method is MORPH, which was proven to be an efficient algorithm that performs particularly well in identifying and prioritizing missing genes in plant metabolic pathways. Here, we present MorphDB, a resource where MORPH-based candidate genes for large-scale functional annotations (Gene Ontology, MapMan bins) are integrated across multiple plant species. Besides a gene-centric query utility, we present a comparative network approach that enables researchers to efficiently browse MORPH predictions across functional gene sets and species, facilitating efficient gene discovery and candidate gene prioritization. MorphDB is available at http://bioinformatics.psb.ugent.be/webtools/morphdb/morphDB/index/. We also provide a toolkit, named "MORPH bulk" (https://github.com/arzwa/morph-bulk), for running MORPH in bulk mode on novel data sets, enabling researchers to apply MORPH to their own species of interest.