    PHYLOGENOMICS-GUIDED VALIDATION OF FUNCTION FOR CONSERVED UNKNOWN GENES

    Identifying functions for all gene products in all sequenced organisms is a central challenge of the post-genomic era. However, at least 30-50% of the proteins encoded by any given genome are of unknown function, or wrongly or vaguely annotated. Many of these 'unknown' proteins are common to prokaryotes and plants. We accordingly set out to predict and experimentally test the functions of such proteins. Our approach to functional prediction is integrative, coupling the extensive post-genomic resources available for plants with comparative genomics based on hundreds of microbial genomes, and functional genomic datasets from model microorganisms. The early phase is computer-assisted; later phases incorporate intellectual input from expert plant and microbial biochemists. The approach thus bridges the gap between automated homology-based annotations and the classical gene discovery efforts of experimentalists, and is much more powerful than purely computational approaches to identifying gene-function associations. Among Arabidopsis genes, we focused on those (2,325 in total) that (i) are unique or belong to families with no more than three members, (ii) are conserved between plants and prokaryotes, and (iii) have unknown or poorly known functions. Computer-assisted selection of promising targets for deeper analysis was based on homology-independent characteristics associated in the SEED database with the prokaryotic members of each family, specifically gene clustering and phyletic spread, as well as availability of functional genomics data, and publications that could link candidate families to general metabolic areas, or to specific functions. In-depth comparative genomic analysis was then performed for about 500 top candidate families, which connected ~55 of them to general areas of metabolism and led to specific functional predictions for a subset of ~25 more.
Twenty predicted functions were experimentally tested in at least one prokaryotic organism via reverse genetics, metabolic profiling, functional complementation, and recombinant protein biochemistry. Our approach predicted and validated functions for 10 formerly uncharacterized protein families common to plants and prokaryotes; none of these functions had previously been correctly predicted by computational methods. The functions of five more are currently being validated. Experimental testing of diverse representatives of these families combined with in silico analysis allowed accurate projection of the annotations to hundreds more sequenced genomes.

    Identification of a conserved N-terminal domain in the first module of ACV synthetases

    The l‐δ‐(α‐aminoadipoyl)‐l‐cysteinyl‐d‐valine synthetase (ACVS) is a trimodular nonribosomal peptide synthetase (NRPS) that provides the peptide precursor for the synthesis of β‐lactams. The enzyme has been extensively characterized in terms of tripeptide formation and substrate specificity. The first module is highly specific and is the only NRPS unit known to recruit and activate the substrate l‐α‐aminoadipic acid, which is coupled to the α‐amino group of l‐cysteine through an unusual peptide bond, involving its δ‐carboxyl group. Here we carried out an in‐depth investigation on the architecture of the first module of the ACVS enzymes from the fungus Penicillium rubens and the bacterium Nocardia lactamdurans. Bioinformatic analyses revealed the presence of a previously unidentified domain at the N‐terminus which is structurally related to condensation domains, but smaller in size. Deletion variants of both enzymes were generated to investigate the potential impact on penicillin biosynthesis in vivo and in vitro. The data indicate that the N‐terminal domain is important for catalysis.

    Main Results of Phase IV BEMUSE Project: Simulation of LBLOCA in an NPP

    Phase IV of the BEMUSE Program is a necessary step for a subsequent uncertainty analysis. It includes the simulation of the reference scenario and a sensitivity study. The scenario is a LBLOCA and the reference plant is Zion 1 NPP, a 4 loop PWR unit. Thirteen participants coming from ten different countries have taken part in the exercise. The BEMUSE (Best Estimate Methods plus Uncertainty and Sensitivity Evaluation) Program has been promoted by the Working Group on Accident Management and Analysis (WGAMA) and endorsed by the Committee on the Safety of Nuclear Installations (CSNI). The paper presents the results of the calculations performed by participants and emphasizes their usefulness for future uncertainty evaluation, to be performed in the next phase. The objectives of the activity are basically to simulate the LBLOCA reproducing the phenomena associated with the scenario and also to build a common, well-known basis for the future comparison of uncertainty evaluation results among different methodologies and codes. The sensitivity calculations performed by participants are also presented. They allow studying the influence of different parameters, such as material properties or initial and boundary conditions, upon the behaviour of the most relevant parameters related to the scenario.

    High-throughput comparison, functional annotation, and metabolic modeling of plant genomes using the PlantSEED resource

    The increasing number of sequenced plant genomes is placing new demands on the methods applied to analyze, annotate, and model these genomes. Today's annotation pipelines result in inconsistent gene assignments that complicate comparative analyses and prevent efficient construction of metabolic models. To overcome these problems, we have developed the PlantSEED, an integrated, metabolism-centric database to support subsystems-based annotation and metabolic model reconstruction for plant genomes. PlantSEED combines SEED subsystems technology, first developed for microbial genomes, with refined protein families and biochemical data to assign fully consistent functional annotations to orthologous genes, particularly those encoding primary metabolic pathways. Seamless integration with its parent, the prokaryotic SEED database, makes PlantSEED a unique environment for cross-kingdom comparative analysis of plant and bacterial genomes. The consistent annotations imposed by PlantSEED permit rapid reconstruction and modeling of primary metabolism for all plant genomes in the database. This feature opens the unique possibility of model-based assessment of the completeness and accuracy of gene annotation and thus allows computational identification of genes and pathways that are restricted to certain genomes or need better curation. We demonstrate the PlantSEED system by producing consistent annotations for 10 reference genomes. We also produce a functioning metabolic model for each genome, gapfilling to identify missing annotations and proposing gene candidates for missing annotations. Models are built around an extended biomass composition representing the most comprehensive published to date. To our knowledge, our models are the first to be published for seven of the genomes analyzed

    GOBLET: the Global Organisation for Bioinformatics Learning, Education and Training

    In recent years, high-throughput technologies have brought big data to the life sciences. The march of progress has been rapid, leaving in its wake a demand for courses in data analysis, data stewardship, computing fundamentals, etc., a need that universities have not yet been able to satisfy--paradoxically, many are actually closing "niche" bioinformatics courses at a time of critical need. The impact of this is being felt across continents, as many students and early-stage researchers are being left without appropriate skills to manage, analyse, and interpret their data with confidence. This situation has galvanised a group of scientists to address the problems on an international scale. For the first time, bioinformatics educators and trainers across the globe have come together to address common needs, rising above institutional and international boundaries to cooperate in sharing bioinformatics training expertise, experience, and resources, aiming to put ad hoc training practices on a more professional footing for the benefit of all

    Uncovering Genes with Divergent mRNA-Protein Dynamics in Streptomyces coelicolor

    Many biological processes are intrinsically dynamic, incurring profound changes at both molecular and physiological levels. Systems analyses of such processes incorporating large-scale transcriptome or proteome profiling can be quite revealing. Although consistency between mRNA and proteins is often implicitly assumed in many studies, examples of divergent trends are frequently observed. Here, we present a comparative transcriptome and proteome analysis of growth and stationary phase adaptation in Streptomyces coelicolor, taking the time-dynamics of process into consideration. These processes are of immense interest in microbiology as they pertain to the physiological transformations eliciting biosynthesis of many naturally occurring therapeutic agents. A shotgun proteomics approach based on mass spectrometric analysis of isobaric stable isotope labeled peptides (iTRAQ™) enabled identification and rapid quantification of approximately 14% of the theoretical proteome of S. coelicolor. Independent principal component analyses of this and DNA microarray-derived transcriptome data revealed that the prominent patterns in both protein and mRNA domains are surprisingly well correlated. Despite this overall correlation, by employing a systematic concordance analysis, we estimated that over 30% of the analyzed genes likely exhibited significantly divergent patterns, of which nearly one-third displayed even opposing trends. Integrating this data with biological information, we discovered that certain groups of functionally related genes exhibit mRNA-protein discordance in a similar fashion. Our observations suggest that differences between mRNA and protein synthesis/degradation mechanisms are prominent in microbes while reaffirming the plausibility of such mechanisms acting in a concerted fashion at a protein complex or sub-pathway level

    Computing with bacterial constituents, cells and populations: from bioputing to bactoputing

    The relevance of biological materials and processes to computing, alias bioputing, has been explored for decades. These materials include DNA, RNA and proteins, while the processes include transcription, translation, signal transduction and regulation. Recently, the use of bacteria themselves as living computers has been explored but this use generally falls within the classical paradigm of computing. Computer scientists, however, have a variety of problems to which they seek solutions, while microbiologists are having new insights into the problems bacteria are solving and how they are solving them. Here, we envisage that bacteria might be used for new sorts of computing. These could be based on the capacity of bacteria to grow, move and adapt to a myriad different fickle environments both as individuals and as populations of bacteria plus bacteriophage. New principles might be based on the way that bacteria explore phenotype space via hyperstructure dynamics and the fundamental nature of the cell cycle. This computing might even extend to developing a high level language appropriate to using populations of bacteria and bacteriophage. Here, we offer a speculative tour of what we term bactoputing, namely the use of the natural behaviour of bacteria for calculating.