14 research outputs found

    M-ORBIS: Mapping of mOleculaR Binding sItes and Surfaces

    M-ORBIS is a Molecular Cartography approach that performs integrative high-throughput analysis of structural data to localize all types of binding sites and associated partners by homology and to characterize their properties and behaviors in a systemic way. The robustness of our binding site inferences was compared to four curated datasets corresponding to protein heterodimers and homodimers and protein–DNA/RNA assemblies. The Molecular Cartographies of structurally well-detailed proteins show that 44% of their surfaces interact with non-solvent partners. Residue contact frequencies with water suggest that ~86% of their surfaces are transiently solvated, whereas only 15% are specifically solvated. Our analysis also reveals the existence of two major binding site families: specific binding sites, which can only bind one type of molecule (protein, DNA, RNA, etc.), and polyvalent binding sites, which can bind several distinct types of molecule. Specific homodimer binding sites are, for instance, nearly twice as hydrophobic as previously described and more closely resemble the protein core, while polyvalent binding sites able to form homo- and heterodimers more closely resemble the surfaces involved in crystal packing. Similarly, the regions able to bind DNA and to alternatively form homodimers are more hydrophobic and less polar than previously described DNA binding sites.
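    The distinction between specific and polyvalent binding sites can be illustrated with a minimal sketch. This is not the M-ORBIS implementation: the function name, the site names, and the representation of a site as a list of observed partner types are all assumptions made for illustration.

```python
def classify_binding_site(partner_types):
    """Label a site 'specific' if it binds one type of molecule, 'polyvalent' otherwise.

    partner_types: iterable of partner type labels observed at the site
    (hypothetical representation, e.g. "protein", "DNA", "RNA").
    """
    kinds = set(partner_types)
    if not kinds:
        raise ValueError("a binding site needs at least one partner type")
    return "specific" if len(kinds) == 1 else "polyvalent"


# Hypothetical example sites, not taken from the paper.
sites = {
    "site_A": ["protein", "protein"],  # only ever seen binding proteins
    "site_B": ["DNA", "protein"],      # seen binding two types of molecule
}
labels = {name: classify_binding_site(types) for name, types in sites.items()}
```

    Repeated observations of the same partner type still count as one type, which is why the sketch reduces the list to a set before counting.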

    Alliance of Genome Resources Portal: unified model organism research platform

    The Alliance of Genome Resources (Alliance) is a consortium of the major model organism databases and the Gene Ontology. It is guided by the vision of facilitating exploration of related genes in human and well-studied model organisms by providing a highly integrated and comprehensive platform that enables researchers to leverage the extensive body of genetic and genomic studies in these organisms. Initiated in 2016, the Alliance is building a central portal (www.alliancegenome.org) for access to data for the primary model organisms along with gene ontology data and human data. All data types represented in the Alliance portal (e.g. genomic data and phenotype descriptions) have common data models and workflows for curation. All data are open and freely available via a variety of mechanisms. Long-term plans for the Alliance project include a focus on coverage of additional model organisms, including those without dedicated curation communities, and the inclusion of new data types, with a particular focus on providing data and tools for the non-model-organism researcher that support enhanced discovery about human health and disease. Here we review current progress and present immediate plans for this new bioinformatics resource.

    Integrative analysis of structural data and pattern recognition, application to the regulation of eukaryotic transcription

    In 5 years, international Structural Biology and Structural Genomics projects have doubled the number of molecular structures available in the Protein Data Bank. During this thesis, I developed Structural Bioinformatics approaches for the integrative analysis of these data, to better describe the molecular mechanisms of interactions. We have shown that, on average, 44% of the protein surface is involved in interactions with molecules other than solvents and ions, and that, while nearly 86% of the protein surface can be transiently hydrated, only 15% is specifically hydrated. By differentiating every type of binding site (protein, DNA, RNA, ligand) of each protein, we have shown the existence of overlaps between these regions. This observation led us to define two major families of binding sites: specific sites, which can bind only one type of molecule, and polyvalent sites, which can bind at least two different types of molecule. Specific binding sites differ greatly from polyvalent ones, in particular in terms of hydrophobicity. Specific binding sites may indicate strong or even permanent interactions. The fast and systematic analysis of molecular surfaces also required the development of advanced geometrical approaches, based on alpha shapes, to build contiguous regions and define local curvatures. The screening of these contiguous regions, like BLAST but for comparing local 3D regions, opens the way to numerous biological and pharmaceutical applications.

    Ancestral Genomes: a resource for reconstructed ancestral genes and genomes across the tree of life

    For each ancestral gene, we assign a stable identifier and provide additional information designed to facilitate analysis: an inferred name (based on its descendants in extant genomes), a reconstructed protein sequence, a set of inferred Gene Ontology (GO) annotations, and a “proxy gene”, defined as the least-diverged descendant of the ancestral gene in a given extant genome.
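    The "proxy gene" definition above can be read as a minimum search over the descendants found in one extant genome. The sketch below only illustrates that reading and is not the resource's actual method: the tuple layout, the numeric divergence score, and the gene and genome names are hypothetical.

```python
def proxy_gene(descendants, genome):
    """Return the id of the least-diverged descendant in the given genome.

    descendants: iterable of (gene_id, genome, divergence) tuples, where
    divergence is an assumed numeric distance from the ancestral gene.
    Returns None if the ancestral gene has no descendant in that genome.
    """
    candidates = [d for d in descendants if d[1] == genome]
    if not candidates:
        return None
    return min(candidates, key=lambda d: d[2])[0]


# Hypothetical descendants of one ancestral gene.
descendants = [
    ("geneA1", "human", 0.42),
    ("geneA2", "human", 0.17),
    ("geneB1", "mouse", 0.30),
]
human_proxy = proxy_gene(descendants, "human")
```

    How divergence is actually measured is not stated in the abstract, so the score here is a placeholder for whatever distance the resource uses.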

    Reactome and the Gene Ontology: Digital convergence of data resources.

    MOTIVATION: GO Causal Activity Models (GO-CAMs) assemble individual associations of gene products with cellular components, molecular functions, and biological processes into causally linked activity flow models. Pathway databases such as the Reactome Knowledgebase create detailed molecular process descriptions of reactions and assemble them into pathway descriptions, based on the sharing of entities between individual reactions. RESULTS: To convert the rich content of Reactome into GO-CAMs, we have developed a software tool, Pathways2GO, which converts the entire set of normal human Reactome pathways into GO-CAMs. This conversion yields standard GO annotations from Reactome content and supports enhanced quality control for both Reactome and GO, yielding a nearly seamless conversion between these two resources for the bioinformatics community. SUPPLEMENTARY INFORMATION: Supplementary material is available at Bioinformatics online.

    Gene Ontology Causal Activity Modeling (GO-CAM) moves beyond GO annotations to structured descriptions of biological functions and systems

    To increase the utility of Gene Ontology (GO) annotations for interpretation of genome-wide experimental data, we have developed GO-CAM, a structured framework for linking multiple GO annotations into an integrated model of a biological system. We expect that GO-CAM will enable new applications in pathway and network analysis, as well as improve standard GO annotations for traditional GO-based applications.