29 research outputs found

    Evolution of interface binding strengths in simplified model of protein quaternary structure.

    The self-assembly of proteins into protein quaternary structures is of fundamental importance to many biological processes, and protein misassembly is responsible for a wide range of proteopathic diseases. In recent years, abstract lattice models of protein self-assembly have been used to simulate the evolution and assembly of protein quaternary structure, and to provide a tractable way to study the genotype-phenotype map of such systems. Here we generalize these models by representing the interfaces as mutable binary strings. This simple change enables us to model the evolution of interface strengths, interface symmetry, and deterministic assembly pathways. Using the generalized model we are able to reproduce two important results established for real protein complexes: The first is that protein assembly pathways are under evolutionary selection to minimize misassembly. The second is that the assembly pathway of a complex mirrors its evolutionary history, and that both can be derived from the relative strengths of interfaces. These results demonstrate that the generalized lattice model offers a powerful new idealized framework to facilitate the study of protein self-assembly processes and their evolution
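The idea of interfaces as mutable binary strings can be illustrated with a minimal sketch. The abstract does not give the scoring rule, so everything below is an assumption made for illustration: binding strength is taken as the fraction of positions at which one string is complementary to the reverse of the other, and the function names, the bit-flip mutation rate, and the 0.75 binding threshold are hypothetical.

```python
import random


def binding_strength(a: str, b: str) -> float:
    """Fraction of positions where `a` is complementary to reversed `b`.

    Assumed scoring rule for illustration only; not taken from the paper.
    """
    if len(a) != len(b):
        raise ValueError("interfaces must have equal length")
    reversed_b = b[::-1]
    complementary = sum(1 for x, y in zip(a, reversed_b) if x != y)
    return complementary / len(a)


def binds(a: str, b: str, threshold: float = 0.75) -> bool:
    """Two interfaces bind if their strength reaches a (hypothetical) threshold."""
    return binding_strength(a, b) >= threshold


def mutate(interface: str, rate: float, rng: random.Random) -> str:
    """Flip each bit independently with probability `rate`."""
    flipped = {"0": "1", "1": "0"}
    return "".join(
        flipped[bit] if rng.random() < rate else bit for bit in interface
    )


# A string that is complementary to its own reverse binds to itself,
# which gives one way to picture a symmetric (self-interacting) interface.
self_binding = "11110000"
strength = binding_strength(self_binding, self_binding)
```

Under this assumed rule the strength is symmetric in its two arguments, and a single bit flip changes it by at most one step of 1/length, so interface strengths can drift gradually under mutation.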

    Application of computational methods for predicting protein interactions

    Protein interactions with other proteins or small molecules are critical to most physiological processes. These interactions may be characterized experimentally, but this can be time consuming and expensive; computational methods for predicting how two proteins interact, or which regions of a protein are most favorable for binding, are thus valuable tools for understanding how proteins of interest function, and have applications in drug discovery and identifying proteins of therapeutic interest. The ClusPro and FTMap algorithms for docking or solvent mapping, respectively, model protein-protein and protein-small molecule interactions, and can be used to identify the most likely orientations of a protein complex or the regions on a protein surface with the greatest propensity for binding. Here we describe three applications of ClusPro and FTMap. ClusPro was used to develop a method for determining whether a protein-protein interface is biologically relevant, by docking the proteins and comparing the results to the given interface; a larger number of near-native structures--which have interfaces similar to that of the given complex--was found to correspond to a greater probability that an interface is biological. In another project, ClusPro was used to predict whether a mutation in a multimeric complex would trigger the formation of a supramolecular assembly, based on how often that mutated residue appeared in the interfaces of the docking results; if a mutation caused such a residue to be present in the docked interfaces more often, in comparison to those of the wild-type structure, then it was likely to induce self-assembly. FTMap was used to detect and analyze the druggability of potential allosteric sites in kinases, with mapping performed on all available kinase structures to identify and determine the potential binding affinity of binding hot spots located outside of the active site. 
Discrimination of proteins as dimers or monomers was implemented as an addition to the ClusPro server, ClusPro-DC, and the results of the druggability analysis of kinases were organized into an online resource, the Kinase Atlas.
2019-02-20

    Gradient descent optimization and deep reinforcement learning for protein-protein interaction

    Reconstruction of the 3D structure of protein dimers is a crucial and challenging task. Although inter-protein contacts have been found useful in modeling protein complexes, few methods have been introduced that tackle the challenging quaternary structure prediction problem using inter-chain contacts. We propose an optimization method based on the gradient descent algorithm, called GD, to reconstruct the quaternary structures of protein complexes from inter-protein contacts. We test the performance of the GD method on both homodimers and heterodimers using both true and predicted inter-protein contacts. GD outperforms a Markov Chain Monte Carlo (MC) method and a method based on the Crystallography and NMR System (CNS). When native inter-chain contacts are provided as inputs, GD builds high-quality models with TM-scores of more than 0.92 and interface RMSDs (I_RMSDs) of less than 1.64 Å for both homodimers and heterodimers. Given predicted inter-chain contacts as restraints, GD generates models with a mean TM-score of 0.76 for 115 homodimers. Moreover, for nearly half of the homodimers, GD reconstructs high-quality models with TM-scores above 0.9 using only the predicted inter-chain contacts to guide the modeling process. We also develop a self-learning algorithm based on reinforcement learning, named DRLComplex, to reconstruct protein dimers from true or predicted inter-protein contacts. We evaluate DRLComplex on two standard datasets, the CASP-CAPRI dataset (28 homodimers) and Std32 (32 heterodimers). If native inter-chain contacts are provided, DRLComplex generates models with a mean TM-score of 0.9895 and a mean I_RMSD of 0.2197 for the CASP-CAPRI dataset, and models with a mean TM-score of 0.9881 and a mean I_RMSD of 0.92 for Std32. Using predicted inter-chain contacts as restraints, DRLComplex builds models with overall average TM-scores of 0.73 and 0.76 for CASP-CAPRI and Std32, respectively. 
Moreover, using predicted contacts, DRLComplex improves the mean I_RMSD of the reconstructed models for the Std32 dataset by 0.29 percent, 1.01 percent, 13.47 percent, and 8.69 percent over GD, MC, CNS, and Equidock (an end-to-end quaternary structure prediction method), respectively. In addition, the mean I_RMSD of the models predicted by DRLComplex for the CASP-CAPRI dataset using predicted contacts is 0.04, 3.94, and 4.07 lower than that of MC, CNS, and Equidock, respectively. Includes bibliographical references.
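As a rough illustration of how gradient descent can use inter-chain contacts as restraints, the sketch below moves one chain toward another until every listed contact pair lies within a distance cutoff. It is not the authors' GD method: it optimizes only a rigid translation (no rotation), uses a simple squared-excess penalty, and the 8 Å cutoff, learning rate, step count, and all function names are assumptions chosen for the example.

```python
import math


def restraint_loss(chain_a, chain_b, contacts, shift, cutoff=8.0):
    """Squared-excess penalty over contact pairs, plus its gradient w.r.t. shift.

    chain_a, chain_b: lists of (x, y, z) coordinates, one per residue.
    contacts: list of (i, j) pairs, residue i of chain A with residue j of chain B.
    shift: translation currently applied to chain B.
    """
    loss = 0.0
    grad = [0.0, 0.0, 0.0]
    for i, j in contacts:
        diff = [chain_b[j][k] + shift[k] - chain_a[i][k] for k in range(3)]
        dist = math.sqrt(sum(d * d for d in diff))
        excess = dist - cutoff
        if excess > 0.0:  # only violated restraints contribute
            loss += excess ** 2
            for k in range(3):
                grad[k] += 2.0 * excess * diff[k] / dist
    return loss, grad


def fit_translation(chain_a, chain_b, contacts, steps=500, lr=0.05, cutoff=8.0):
    """Plain gradient descent on the translation of chain B."""
    shift = [0.0, 0.0, 0.0]
    for _ in range(steps):
        _, grad = restraint_loss(chain_a, chain_b, contacts, shift, cutoff)
        shift = [s - lr * g for s, g in zip(shift, grad)]
    final_loss, _ = restraint_loss(chain_a, chain_b, contacts, shift, cutoff)
    return shift, final_loss


# Toy input: two 2-residue chains placed 30 Å apart, with two contacts.
chain_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
chain_b = [(30.0, 0.0, 0.0), (33.8, 0.0, 0.0)]
contacts = [(0, 0), (1, 1)]
initial_loss, _ = restraint_loss(chain_a, chain_b, contacts, [0.0, 0.0, 0.0])
shift, final_loss = fit_translation(chain_a, chain_b, contacts)
```

A realistic reconstruction would also optimize the rotation of the second chain and penalize steric clashes; the fixed learning rate used here is only stable for a small number of contacts.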

    New computational methods for structural modeling protein-protein and protein-nucleic acid interactions

    Programa de Doctorat en Biomedicina. The study of the 3D structural details of protein-protein and protein-DNA interactions is essential to understand biomolecular functions at the molecular level. Given the difficulty of determining the structures of these complexes by experimental techniques, computational tools are becoming a powerful means to increase the current structural coverage of protein-protein and protein-DNA interactions. pyDock is one of these tools; it uses its scoring function to assess the quality of models generated by other tools, and is usually combined with the model sampling methods FTDOCK or ZDOCK. This combination has shown consistently good predictive performance in community-wide blind assessment experiments such as CAPRI and CASP, and has provided biological insights and aided the interpretation of experiments by modeling many biomolecular interactions of biomedical and biotechnological interest. Here, we describe a pyDock software update, which includes its adaptation to the newest Python code, the capability of including cofactors and other small molecules, and internal parallelization to use computational resources more efficiently. A strategy was designed to integrate the template-based and ab initio docking approaches by creating a new scoring function based on the pyDock scoring energy basis function and the TM-score measure of structural similarity between protein structures. This strategy was partially used for our participation in the 7th CAPRI, the 3rd CASP-CAPRI and the 4th CASP-CAPRI joint experiments. 
These experiments were challenging, as we needed to model protein-protein complexes, multimeric oligomerization proteins, protein-peptide, and protein-oligosaccharide interactions. Many proposed targets required the efficient integration of rigid-body docking, template-based modeling, flexible optimization, multi-parametric scoring, and experimental restraints. This was especially relevant for the multi-molecular assemblies proposed in the 3rd and 4th CASP-CAPRI joint experiments. In addition, a case study was carried out in which electron transfer protein complexes were modelled to test the new capabilities of the software. Good results were achieved, as the structural models obtained help explain the differences in photosynthetic efficiency between red and green algae.

    Regulatory mechanisms and biological implications of protein complex assembly

    Every living organism possesses a genome that contains within it a unique set of genes, a substantial number of which encode proteins. Over the last 20 years, it has become apparent that organismal complexity arises not from the specific complement of genes per se, but rather from interactions between the gene products - in particular, interactions between proteins. As an inevitable consequence of the crowded cellular interior, most protein-protein interactions are fleeting. However, many are significantly more long-lived and result in stable protein complexes, in which the constituent subunits are obligately dependent on their binding partners. Despite the abundance of protein complexes and their critical importance to the cell, we currently have an incomplete understanding of the mechanisms by which the cell ensures their correct assembly. In the chapters that follow, I have attempted to improve our understanding of the regulatory systems underlying assembly of protein complexes, and the way in which assembly as a whole affects the behaviour of the cell. The thesis opens with an extended literature review covering the currently available methods for characterising protein complexes. After this introduction, chapters 2-4 are concerned with regulatory mechanisms and biological implications common to the assembly of all protein complexes. Chapter 5 diverges from this work, and describes a family of evolutionarily related proteins that regulate the behaviour of condensins and cohesins. Bacterial and archaeal genomes contain far less non-coding DNA than eukaryotes, and coding genes are often packaged into discrete units known as operons. The proteins encoded within operons are usually functionally related, either through participation in metabolic pathways or as subunits of heteromeric protein complexes. 
Since protein complexes assemble via ordered pathways, we reasoned that there might be a signature of assembly order present in operons, the genes of which are translated in sequential order. By comparing computationally predicted assembly pathways with gene order in operons, we demonstrated this to be the case for the large majority of operon-encoded complexes. Within operons, gene order follows assembly order, and adjacent genes are substantially more likely to share a physical interface than those further apart. This work demonstrates that efficient assembly of complexes is of sufficient importance as to have placed major constraints on the evolution of operon gene order. Following this study of bacterial operons, I present results from research investigating how patterns of protein degradation in eukaryotes are influenced by the formation of protein complexes. This showed that, whilst most proteins display exponential degradation kinetics, a sizeable minority deviate considerably from this pattern, instead being more consistent with a two-step degradation process. These proteins are predominantly members of heteromeric complexes, and their two-step decay profiles can be explained using a model under which bound and unbound subunits are degraded at different rates. Within individual complexes, we find that non-exponentially decaying proteins tend to form larger interfaces, assemble earlier, and show a higher degree of coexpression, consistent with the idea that bound subunits are degraded at a slower rate than unbound or peripheral subunits. This model also explains the behaviour of proteins in aneuploid cells where one or more chromosomes have been duplicated. In general, protein abundance scales with gene copy number, so that the immediate effect of duplicating a chromosome is to double the abundance of the proteins encoded on it. 
However, previous analyses of mass spectrometry data, as well as my own, have shown that the abundance of many proteins on duplicated chromosomes is significantly attenuated compared to what one would expect. These proteins, like those with non-exponential degradation patterns, are very often members of larger complexes. Since the overall concentration of a protein complex is constrained by that of its least abundant members, duplicating a single subunit will predominantly increase the unbound, unstable fraction of that subunit. The results from this work strongly suggest that the apparent attenuation of many proteins observed in aneuploid cells is indeed a consequence of the failure of these proteins to assemble into complexes. Finally, I present a study concerning an important, universally conserved family of protein complexes, namely the SMC-kleisins. Two members of this family, condensin and cohesin, are responsible for two hallmarks of eukaryotic chromatin organisation: the formation of condensed, linear chromosomes, and sister chromatid cohesion during cell division. Unlike other SMC-kleisins, condensin and cohesin possess a number of regulators containing HEAT repeats. By developing a computational pipeline for searching and clustering paralogous repeat proteins, I was able to demonstrate that these regulators form a distinct sub-family within the larger class of HEAT repeat proteins. Furthermore, these regulators arose very early in eukaryotic history, hinting at a possible role in the origin of modern condensins and cohesins
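The two quantitative ideas in this abstract, that a complex is limited by its least abundant subunit and that bound and unbound subunits decay at different rates, can be sketched as follows. This is a deliberately simplified illustration, not the model from the thesis: it assumes 1:1 stoichiometry, treats the bound and unbound pools as fixed (no exchange between them), and the function names and rate constants are hypothetical.

```python
import math


def assemble(subunit_abundance):
    """Split subunits into complex-bound and unbound pools (1:1 stoichiometry assumed).

    The complex level is set by the least abundant subunit; any surplus of
    the other subunits stays unbound.
    """
    complex_level = min(subunit_abundance.values())
    unbound = {
        name: level - complex_level for name, level in subunit_abundance.items()
    }
    return complex_level, unbound


def remaining_fraction(t, bound_fraction, k_bound, k_unbound):
    """Fraction of a protein left at time t when two pools decay independently.

    A mixture of two exponentials is non-exponential overall unless the
    two rates are equal or one pool is empty.
    """
    return (
        bound_fraction * math.exp(-k_bound * t)
        + (1.0 - bound_fraction) * math.exp(-k_unbound * t)
    )


# Balanced expression: everything is incorporated into the complex.
balanced = assemble({"A": 100.0, "B": 100.0})

# Doubling the gene dosage of A only adds to the unbound pool of A.
duplicated = assemble({"A": 200.0, "B": 100.0})

# With a fast-decaying unbound pool (hypothetical rates), the surplus of A
# is lost sooner than the complex-bound fraction.
a_left = remaining_fraction(5.0, bound_fraction=0.5, k_bound=0.05, k_unbound=1.0)
```

In this picture, duplicating one subunit leaves the complex level unchanged and enlarges only the short-lived unbound pool, which is one way to read the attenuation described in the abstract.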

    Circadian oscillator proteins across the kingdoms of life: Structural aspects

    Circadian oscillators are networks of biochemical feedback loops that generate 24-hour rhythms and control numerous biological processes in a range of organisms. These periodic rhythms are the result of a complex interplay of interactions among clock components. These components are specific to the organism but share molecular mechanisms that are similar across kingdoms. The elucidation of clock mechanisms in different kingdoms has recently started to attain the level of structural interpretation. A full understanding of these molecular processes requires detailed knowledge, not only of the biochemical and biophysical properties of clock proteins and their interactions, but also of the three-dimensional structure of clockwork components. Posttranslational modifications (such as phosphorylation) and protein-protein interactions have become a central focus of recent research, in particular the complex interactions mediated by the phosphorylation of clock proteins and the formation of multimeric protein complexes that regulate clock genes at transcriptional and translational levels. The three-dimensional structures of the cyanobacterial clock components are well understood, and work is underway to comprehend the mechanistic details. However, structural characterization of the eukaryotic clock has just begun. This review serves as a primer as the clock communities move towards the exciting realm of structural biology.

    Assembly and Mechanism of Action of Sulfolobus solfataricus DNA Replication Complexes

    DNA replication enzymes are essential for the maintenance and propagation of genetic information, which precisely governs the growth and development of our cells. Aberrant DNA replication processes have been implicated in a wide variety of human diseases, most notably cancer, and therefore a mechanistic understanding of DNA replication processes is paramount for the development of human therapeutic agents. The study of the eukaryotic replication system, however, is difficult, as the system contains a large number of enzymes and regulatory factors, making assembly of these systems for in vitro study complicated. Thus, in order to gain insight into the workings of the eukaryotic replication system, several model systems are used in which the complexity of the replication pathways is not as great. The DNA replication system from the thermophilic archaeon Sulfolobus solfataricus is a recently identified model with components sharing high levels of sequence homology with their eukaryotic counterparts. This system is ideal for gaining insight into the mechanistic workings of DNA replication that can be translated to the eukaryotic system. A key advantage of studying thermophilic enzymes is the ability to use reaction temperatures far lower than the physiological conditions of the organisms. This results in slower kinetics with no significant change in overall function, making the enzyme’s mechanistic details easier to discern. I have contributed to the development of Sulfolobus solfataricus as a model system primarily through characterization of nucleotide transferase enzymes, including DNA polymerases and primases. Firstly, I have determined that the DNA polymerase SsoPolB3 possesses a low rate of synthesis and fidelity more similar to those of polymerases involved in lesion bypass. 
Secondly, I characterized the assembly and mechanism of action of the SsoPolB1 replication holoenzyme, which replicates in a distributive fashion similar to the eukaryotic Pol δ holoenzyme and maintains stimulated replication rates through rapid re-recruitment of the polymerase to the processivity clamp. Finally, I discovered and characterized the interactions of a unique primosome complex formed between the bacterial-like DnaG primase and the eukaryotic-like MCM helicase. In all, my thesis provides a more thorough understanding of the interactions, kinetics, and dynamics occurring at the replication fork.