3,751 research outputs found

    Structural Prediction of Protein–Protein Interactions by Docking: Application to Biomedical Problems

    Get PDF
    A huge amount of genetic information is available thanks to the recent advances in sequencing technologies and the larger computational capabilities, but the interpretation of such genetic data at phenotypic level remains elusive. One of the reasons is that proteins are not acting alone, but are specifically interacting with other proteins and biomolecules, forming intricate interaction networks that are essential for the majority of cell processes and pathological conditions. Thus, characterizing such interaction networks is an important step in understanding how information flows from gene to phenotype. Indeed, structural characterization of protein–protein interactions at atomic resolution has many applications in biomedicine, from diagnosis and vaccine design, to drug discovery. However, despite the advances of experimental structural determination, the number of interactions for which there is available structural data is still very small. In this context, a complementary approach is computational modeling of protein interactions by docking, which is usually composed of two major phases: (i) sampling of the possible binding modes between the interacting molecules and (ii) scoring for the identification of the correct orientations. In addition, prediction of interface and hot-spot residues is very useful in order to guide and interpret mutagenesis experiments, as well as to understand functional and mechanistic aspects of the interaction. Computational docking is already being applied to specific biomedical problems within the context of personalized medicine, for instance, helping to interpret pathological mutations involved in protein–protein interactions, or providing modeled structural data for drug discovery targeting protein–protein interactions.Spanish Ministry of Economy grant number BIO2016-79960-R; D.B.B. is supported by a predoctoral fellowship from CONACyT; M.R. is supported by an FPI fellowship from the Severo Ochoa program. We are grateful to the Joint BSC-CRG-IRB Programme in Computational Biology.Peer ReviewedPostprint (author's final draft

    Tandem machine learning for the identification of genes regulated by transcription factors

    Get PDF
    BACKGROUND: The identification of promoter regions that are regulated by a given transcription factor has traditionally relied upon the identification and distributions of binding sites recognized by the factor. In this study, we have developed a tandem machine learning approach for the identification of regulatory target genes based on these parameters and on the corresponding binding site information contents that measure the affinities of the factor for these cognate elements. RESULTS: This method has been validated using models of DNA binding sites recognized by the xenobiotic-sensitive nuclear receptor, PXR/RXRα, for target genes within the human genome. An information theory-based weight matrix was first derived and refined from known PXR/RXRα binding sites. The promoter region of candidate genes was scanned with the weight matrix. A novel information density-based clustering algorithm was then used to identify clusters of information rich sites. Finally, transformed data representing metrics of location, strength and clustering of binding sites were used for classification of promoter regions using an ensemble approach involving neural networks, decision trees and Naïve Bayesian classification. The method was evaluated on a set of 24 known target genes and 288 genes known not to be regulated by PXR/RXRα. We report an average accuracy (proportion of correctly classified promoter regions) of 71%, sensitivity of 73%, and specificity of 70%, based on multiple cross-validation and the leave-one-out strategy. The performance on a test set of 13 genes showed that 10 were correctly classified. CONCLUSION: We have developed a machine learning approach for the successful detection of gene targets for transcription factors with high accuracy. The method has been validated for the transcription factor PXR/RXRα and has the potential to be extended to other transcription factors

    11th German Conference on Chemoinformatics (GCC 2015) : Fulda, Germany. 8-10 November 2015.

    Get PDF

    Machine learning and structural analysis of Mycobacterium tuberculosis pan-genome identifies genetic signatures of antibiotic resistance.

    Get PDF
    Mycobacterium tuberculosis is a serious human pathogen threat exhibiting complex evolution of antimicrobial resistance (AMR). Accordingly, the many publicly available datasets describing its AMR characteristics demand disparate data-type analyses. Here, we develop a reference strain-agnostic computational platform that uses machine learning approaches, complemented by both genetic interaction analysis and 3D structural mutation-mapping, to identify signatures of AMR evolution to 13 antibiotics. This platform is applied to 1595 sequenced strains to yield four key results. First, a pan-genome analysis shows that M. tuberculosis is highly conserved with sequenced variation concentrated in PE/PPE/PGRS genes. Second, the platform corroborates 33 genes known to confer resistance and identifies 24 new genetic signatures of AMR. Third, 97 epistatic interactions across 10 resistance classes are revealed. Fourth, detailed structural analysis of these genes yields mechanistic bases for their selection. The platform can be used to study other human pathogens

    Serverification of Molecular Modeling Applications: the Rosetta Online Server that Includes Everyone (ROSIE)

    Get PDF
    The Rosetta molecular modeling software package provides experimentally tested and rapidly evolving tools for the 3D structure prediction and high-resolution design of proteins, nucleic acids, and a growing number of non-natural polymers. Despite its free availability to academic users and improving documentation, use of Rosetta has largely remained confined to developers and their immediate collaborators due to the code's difficulty of use, the requirement for large computational resources, and the unavailability of servers for most of the Rosetta applications. Here, we present a unified web framework for Rosetta applications called ROSIE (Rosetta Online Server that Includes Everyone). ROSIE provides (a) a common user interface for Rosetta protocols, (b) a stable application programming interface for developers to add additional protocols, (c) a flexible back-end to allow leveraging of computer cluster resources shared by RosettaCommons member institutions, and (d) centralized administration by the RosettaCommons to ensure continuous maintenance. This paper describes the ROSIE server infrastructure, a step-by-step 'serverification' protocol for use by Rosetta developers, and the deployment of the first nine ROSIE applications by six separate developer teams: Docking, RNA de novo, ERRASER, Antibody, Sequence Tolerance, Supercharge, Beta peptide design, NCBB design, and VIP redesign. As illustrated by the number and diversity of these applications, ROSIE offers a general and speedy paradigm for serverification of Rosetta applications that incurs negligible cost to developers and lowers barriers to Rosetta use for the broader biological community. ROSIE is available at http://rosie.rosettacommons.org
    • …
    corecore