    STRUCTURAL MODELING OF PROTEIN-PROTEIN INTERACTIONS USING MULTIPLE-CHAIN THREADING AND FRAGMENT ASSEMBLY

    Since its birth, the study of protein structures has progressed by leaps and bounds. However, owing to the expense and difficulty involved, the number of solved protein structures has not kept pace with the number of protein sequences and has in fact steadily lost ground. This has necessitated the development of high-throughput yet accurate computational algorithms capable of predicting the three-dimensional structure of a protein from its amino acid sequence. While progress has been made in protein tertiary structure prediction, the advancement of protein quaternary structure prediction has been limited because protein complexes have even more degrees of freedom and even fewer protein complex structures are present in the PDB library. In fact, protein complex structure prediction has to date largely remained a docking problem, in which automated algorithms aim to predict the complex structure starting from the unbound crystal structures of its component subunits, and has therefore remained limited in scope. Moreover, since docking essentially treats the unbound subunits as "rigid bodies", it has limited accuracy when conformational change accompanies protein-protein interaction. In one of the first efforts of its kind, this study aims to develop protein complex structure prediction algorithms that require only the amino acid sequences of the interacting subunits as input. The study adapts the best features of protein tertiary structure prediction, including template detection and ab initio loop modeling, and extends them to protein-protein complexes, which requires simultaneously modeling the three-dimensional structure of the component subunits and ensuring the correct orientation of the chains at the protein-protein interface. Essentially, the algorithms depend on knowledge-based statistical potentials for both fold recognition and structure modeling.
First, as a way to compare known structures of protein-protein complexes, a complex structure alignment program, MM-align, was developed. MM-align joins the chains of the complex structures to be aligned to form artificial monomers in every possible order. It then aligns them using a heuristic dynamic-programming-based approach with TM-score as the objective function. The traditional Needleman-Wunsch (NW) dynamic programming was, however, redesigned to prevent the cross-alignment of chains during the structure alignment process. Driven by the knowledge obtained from MM-align that protein complex structures share evolutionary relationships and that the current protein complex structure library already contains homologous or structurally analogous protein quaternary structure families, a dimeric threading approach, COTH, was designed. The new threading-recombination approach boosts the protein complex structure library by combining tertiary structure templates with complex alignments. The query sequences are first aligned to complex templates using the modified dynamic programming algorithm, guided by a number of predicted structural features including ab initio binding-site predictions. Finally, a template-based complex structure prediction approach, TACOS, was designed to build full-length protein complex structures starting from the initial templates identified by COTH. TACOS fragments the threading-aligned regions of the templates and reassembles them, while building the structure of the threading-unaligned regions ab initio, using a replica-exchange Monte Carlo simulation procedure. Simultaneously, TACOS searches for the best orientation match of the component structures, driven by a number of knowledge-based potential terms. Overall, TACOS presents one of the first approaches capable of predicting full-length protein complex structures from sequence alone and introduces a new paradigm in the field of protein complex structure modeling.
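To make the MM-align idea above concrete, here is a minimal Python sketch of the two ingredients the abstract names: joining chains in every possible order and scoring with TM-score. It is not the MM-align implementation. It assumes the two complexes are already superposed and that residues are paired by index, so the dynamic-programming alignment and superposition search are left out. The function names and the 0.5 floor on d0 for short chains are assumptions of this sketch.

```python
import math
from itertools import permutations


def tm_d0(l_target):
    # Length-dependent distance scale of the standard TM-score definition.
    # The 0.5 floor for short chains is an assumption of this sketch.
    if l_target <= 15:
        return 0.5
    return max(0.5, 1.24 * (l_target - 15) ** (1.0 / 3.0) - 1.8)


def tm_score(aligned_pairs, l_target):
    # aligned_pairs: list of ((x, y, z), (x, y, z)) for residues already
    # aligned and superposed; l_target normalizes the score.
    d0 = tm_d0(l_target)
    total = 0.0
    for a, b in aligned_pairs:
        d = math.dist(a, b)
        total += 1.0 / (1.0 + (d / d0) ** 2)
    return total / l_target


def best_chain_order(chains_a, chains_b):
    # Join the chains of complex A in their given order, then try every
    # order of the chains of complex B and keep the best-scoring one.
    ref = [xyz for chain in chains_a for xyz in chain]
    best = None
    for order in permutations(range(len(chains_b))):
        mob = [xyz for idx in order for xyz in chains_b[idx]]
        pairs = list(zip(ref, mob))  # toy index-based pairing (assumption)
        score = tm_score(pairs, len(ref))
        if best is None or score > best[0]:
            best = (score, order)
    return best


if __name__ == "__main__":
    chain_x = [(i * 3.8, 0.0, 0.0) for i in range(20)]
    chain_y = [(i * 3.8, 10.0, 0.0) for i in range(20)]
    print(best_chain_order([chain_x, chain_y], [chain_y, chain_x]))
```

In this toy case the second complex lists the same two chains in swapped order, so only the swapped join reproduces the reference coordinates.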

    Evolutionary Multi-Objective Design of SARS-CoV-2 Protease Inhibitor Candidates

    Computational drug design based on artificial intelligence is an emerging research area. At the time of writing this paper, the world suffers from an outbreak of the coronavirus SARS-CoV-2. A promising way to stop the virus replication is via protease inhibition. We propose an evolutionary multi-objective algorithm (EMOA) to design potential protease inhibitors for SARS-CoV-2's main protease. Based on the SELFIES representation, the EMOA maximizes the binding of candidate ligands to the protein using the docking tool QuickVina 2, while at the same time taking into account further objectives such as drug-likeness or the fulfillment of filter constraints. The experimental part analyzes the evolutionary process and discusses the inhibitor candidates. Comment: 15 pages, 7 figures, submitted to PPSN 202
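The core of any multi-objective selection step is Pareto dominance, which can be sketched in a few lines of Python. This is a generic illustration, not the paper's EMOA. The candidate names and objective values are hypothetical. The sketch assumes every objective is written so that smaller is better, for example a docking energy and a negated drug-likeness score.

```python
def dominates(a, b):
    # a dominates b if it is no worse in every objective and strictly
    # better in at least one (all objectives are minimized).
    no_worse = all(x <= y for x, y in zip(a, b))
    better = any(x < y for x, y in zip(a, b))
    return no_worse and better


def pareto_front(candidates):
    # candidates: dict mapping a candidate name to its objective tuple.
    front = {}
    for name, objs in candidates.items():
        if not any(dominates(other, objs) for other in candidates.values()):
            front[name] = objs
    return front


if __name__ == "__main__":
    # Hypothetical values: (docking energy, negated drug-likeness).
    population = {
        "ligand_a": (-8.0, -0.7),
        "ligand_b": (-7.0, -0.9),
        "ligand_c": (-6.0, -0.5),
    }
    print(sorted(pareto_front(population)))
```

Here `ligand_c` is worse than both other candidates in both objectives, so it is filtered out, while `ligand_a` and `ligand_b` trade off against each other and both stay on the front.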

    The EM Algorithm and the Rise of Computational Biology

    In the past decade computational biology has grown from a cottage industry with a handful of researchers to an attractive interdisciplinary field, catching the attention and imagination of many quantitatively minded scientists. Of interest to us is the key role played by the EM algorithm during this transformation. We survey the use of the EM algorithm in a few important computational biology problems surrounding the "central dogma" of molecular biology: from DNA to RNA and then to proteins. Topics of this article include sequence motif discovery, protein sequence alignment, population genetics, evolutionary models and mRNA expression microarray data analysis. Comment: Published in Statistical Science (http://www.imstat.org/sts/) at http://dx.doi.org/10.1214/09-STS312 by the Institute of Mathematical Statistics (http://www.imstat.org
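For readers unfamiliar with the EM algorithm, the following Python sketch shows its E-step and M-step on a toy two-coin mixture. It is not one of the models surveyed in the article. The data, starting values and function name are assumptions chosen for illustration, and the sketch assumes the estimated probabilities stay strictly between 0 and 1.

```python
import math


def em_two_coins(heads, n, theta_a=0.6, theta_b=0.4, pi_a=0.5, iters=50):
    # heads: number of heads observed in each experiment of n flips.
    # Each experiment used coin A (probability pi_a) or coin B; which coin
    # was used is the hidden variable.
    history = []
    for _ in range(iters):
        # E-step: posterior probability that each experiment used coin A.
        resp = []
        loglik = 0.0
        for h in heads:
            la = pi_a * theta_a ** h * (1 - theta_a) ** (n - h)
            lb = (1 - pi_a) * theta_b ** h * (1 - theta_b) ** (n - h)
            resp.append(la / (la + lb))
            loglik += math.log(la + lb)  # binomial coefficient omitted
        history.append(loglik)
        # M-step: re-estimate parameters from the weighted counts.
        wa = sum(resp)
        wb = len(heads) - wa
        theta_a = sum(r * h for r, h in zip(resp, heads)) / (wa * n)
        theta_b = sum((1 - r) * h for r, h in zip(resp, heads)) / (wb * n)
        pi_a = wa / len(heads)
    return theta_a, theta_b, pi_a, history


if __name__ == "__main__":
    ta, tb, pa, hist = em_two_coins([9, 8, 9, 2, 1, 2, 8, 1], n=10)
    print(round(ta, 3), round(tb, 3), round(pa, 3))
```

The recorded log-likelihood should never decrease from one iteration to the next, which is the defining property of EM and a useful sanity check on any implementation.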

    Network Archaeology: Uncovering Ancient Networks from Present-day Interactions

    Often questions arise about old or extinct networks. What proteins interacted in a long-extinct ancestor species of yeast? Who were the central players in the Last.fm social network 3 years ago? Our ability to answer such questions has been limited by the unavailability of past versions of networks. To overcome these limitations, we propose several algorithms for reconstructing a network's history of growth given only the network as it exists today and a generative model by which the network is believed to have evolved. Our likelihood-based method finds a probable previous state of the network by reversing the forward growth model. This approach retains node identities so that the history of individual nodes can be tracked. We apply these algorithms to uncover older, non-extant biological and social networks believed to have grown via several models, including duplication-mutation with complementarity, forest fire, and preferential attachment. Through experiments on both synthetic and real-world data, we find that our algorithms can estimate node arrival times, identify anchor nodes from which new nodes copy links, and reveal significant features of networks that have long since disappeared. Comment: 16 pages, 10 figure
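As an illustration of what a forward growth model looks like, the Python sketch below grows a network by preferential attachment, one of the generative models named in the abstract. It covers only the forward direction. The likelihood-based reversal that the paper proposes is not reproduced here. The function name, the single edge per new node and the two-node seed are assumptions of this sketch.

```python
import random


def grow_preferential_attachment(n_nodes, seed=0):
    # Start from a single edge between nodes 0 and 1, then add nodes one
    # at a time; node labels therefore record the true arrival order.
    rng = random.Random(seed)
    edges = [(0, 1)]
    # Each node appears in this list once per incident edge, so a uniform
    # draw from it picks a node with probability proportional to its degree.
    endpoints = [0, 1]
    for new in range(2, n_nodes):
        target = rng.choice(endpoints)
        edges.append((target, new))
        endpoints.extend([target, new])
    return edges


if __name__ == "__main__":
    edges = grow_preferential_attachment(50)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    print(len(edges), max(degree.values()))
```

Because the labels encode arrival order, a synthetic network like this gives ground truth against which any history-reconstruction method could be checked.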

    Structural Prediction of Protein–Protein Interactions by Docking: Application to Biomedical Problems

    A huge amount of genetic information is available thanks to recent advances in sequencing technologies and greater computational capabilities, but the interpretation of such genetic data at the phenotypic level remains elusive. One of the reasons is that proteins do not act alone but interact specifically with other proteins and biomolecules, forming intricate interaction networks that are essential for the majority of cell processes and pathological conditions. Thus, characterizing such interaction networks is an important step in understanding how information flows from gene to phenotype. Indeed, structural characterization of protein–protein interactions at atomic resolution has many applications in biomedicine, from diagnosis and vaccine design to drug discovery. However, despite the advances in experimental structure determination, the number of interactions for which structural data are available is still very small. In this context, a complementary approach is computational modeling of protein interactions by docking, which usually consists of two major phases: (i) sampling of the possible binding modes between the interacting molecules and (ii) scoring for the identification of the correct orientations. In addition, prediction of interface and hot-spot residues is very useful for guiding and interpreting mutagenesis experiments, as well as for understanding functional and mechanistic aspects of the interaction. Computational docking is already being applied to specific biomedical problems within the context of personalized medicine, for instance, helping to interpret pathological mutations involved in protein–protein interactions, or providing modeled structural data for drug discovery targeting protein–protein interactions. Spanish Ministry of Economy grant number BIO2016-79960-R; D.B.B. is supported by a predoctoral fellowship from CONACyT; M.R. is supported by an FPI fellowship from the Severo Ochoa program.
We are grateful to the Joint BSC-CRG-IRB Programme in Computational Biology. Peer Reviewed. Postprint (author's final draft
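The two docking phases described in the abstract above, sampling and scoring, can be illustrated with a deliberately small Python sketch. It is a toy, not a real docking protocol. It samples rigid translations only, with no rotations, and the scoring function, distance thresholds and penalty value are assumptions made for illustration.

```python
import math
from itertools import product


def score_pose(receptor, ligand, contact=(3.0, 5.0), clash_penalty=10.0):
    # Scoring phase: reward atom pairs within the contact range and
    # penalize pairs that are closer than the lower bound (clashes).
    score = 0.0
    for r in receptor:
        for l in ligand:
            d = math.dist(r, l)
            if d < contact[0]:
                score -= clash_penalty
            elif d <= contact[1]:
                score += 1.0
    return score


def dock_translations(receptor, ligand, grid):
    # Sampling phase: try every translation on a cubic grid and keep the
    # first pose with the highest score.
    best_score, best_shift = None, None
    for shift in product(grid, repeat=3):
        moved = [tuple(c + s for c, s in zip(atom, shift)) for atom in ligand]
        s = score_pose(receptor, moved)
        if best_score is None or s > best_score:
            best_score, best_shift = s, shift
    return best_score, best_shift


if __name__ == "__main__":
    receptor = [(0.0, 0.0, 0.0)]
    ligand = [(0.0, 0.0, 0.0)]
    print(dock_translations(receptor, ligand, range(-6, 7)))
```

Practical docking programs also sample rotations and, in some cases, conformational flexibility, and they use far richer energy or knowledge-based scoring terms than this contact count.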