7,623 research outputs found

    Serverification of Molecular Modeling Applications: the Rosetta Online Server that Includes Everyone (ROSIE)

    The Rosetta molecular modeling software package provides experimentally tested and rapidly evolving tools for the 3D structure prediction and high-resolution design of proteins, nucleic acids, and a growing number of non-natural polymers. Despite its free availability to academic users and improving documentation, use of Rosetta has largely remained confined to developers and their immediate collaborators due to the code's difficulty of use, the requirement for large computational resources, and the unavailability of servers for most of the Rosetta applications. Here, we present a unified web framework for Rosetta applications called ROSIE (Rosetta Online Server that Includes Everyone). ROSIE provides (a) a common user interface for Rosetta protocols, (b) a stable application programming interface for developers to add additional protocols, (c) a flexible back-end to allow leveraging of computer cluster resources shared by RosettaCommons member institutions, and (d) centralized administration by the RosettaCommons to ensure continuous maintenance. This paper describes the ROSIE server infrastructure, a step-by-step 'serverification' protocol for use by Rosetta developers, and the deployment of the first nine ROSIE applications by six separate developer teams: Docking, RNA de novo, ERRASER, Antibody, Sequence Tolerance, Supercharge, Beta peptide design, NCBB design, and VIP redesign. As illustrated by the number and diversity of these applications, ROSIE offers a general and speedy paradigm for serverification of Rosetta applications that incurs negligible cost to developers and lowers barriers to Rosetta use for the broader biological community. ROSIE is available at http://rosie.rosettacommons.org

    Computational protein design with backbone plasticity

    The computational algorithms used in the design of artificial proteins have become increasingly sophisticated in recent years, producing a series of remarkable successes. The most dramatic of these is the de novo design of artificial enzymes. The majority of these designs have reused naturally occurring protein structures as “scaffolds” onto which novel functionality can be grafted without having to redesign the backbone structure. The incorporation of backbone flexibility into protein design is a much more computationally challenging problem due to the greatly increased search space, but it promises to remove the limitations of reusing natural protein scaffolds. In this review, we outline the principles of computational protein design methods and discuss recent efforts to consider backbone plasticity in the design process

    Methods for Molecular Modelling of Protein Complexes.

    Biological processes are often mediated by complexes formed between proteins and various biomolecules. The 3D structures of such protein-biomolecule complexes provide insights into the molecular mechanism of their action. The structure of these complexes can be predicted by various computational methods. Choosing an appropriate method for modelling depends on the category of biomolecule that a protein interacts with and the availability of structural information about the protein and its interacting partner. We intend for the contents of this chapter to serve as a guide as to what software would be the most appropriate for the type of data at hand and the kind of 3D complex structure required. In particular, we have dealt with protein-small molecule ligand, protein-peptide, protein-protein, and protein-nucleic acid interactions. Most, if not all, model building protocols perform some sampling and scoring. Typically, several alternate conformations and configurations of the interactors are sampled. Each such sample is then scored for optimization. To boost confidence in these predicted models, it is helpful to assess them with other independent scoring schemes besides the inbuilt/default ones. This chapter also lists such software and serves as a guide to gauge the fidelity of modelled structures of biomolecular complexes
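    The sample-then-score pattern described in this abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy pose representation (one translation and one rotation), both scoring functions, and the top-10 cutoff are hypothetical and do not correspond to any software named in the chapter.

```python
import random

def sample_poses(n, rng):
    """Draw n toy poses: (translation in angstroms, rotation in degrees)."""
    return [(rng.uniform(-10.0, 10.0), rng.uniform(0.0, 360.0)) for _ in range(n)]

def primary_score(pose):
    """Hypothetical built-in score; lower is better."""
    shift, angle = pose
    return abs(shift - 2.0) + abs(angle - 90.0) / 36.0

def independent_score(pose):
    """Hypothetical second score with a different functional form."""
    shift, angle = pose
    return (shift - 2.5) ** 2 + ((angle - 80.0) / 30.0) ** 2

rng = random.Random(0)                      # fixed seed for reproducibility
poses = sample_poses(200, rng)              # sampling step
ranked = sorted(poses, key=primary_score)   # scoring step with the default score
top10 = ranked[:10]

# Independent assessment: keep models that both scoring schemes rank highly.
recheck = set(sorted(poses, key=independent_score)[:10])
consensus = [p for p in top10 if p in recheck]
print(f"{len(consensus)} of the top 10 models are supported by both scores")
```

    Models that rank well under both schemes are the ones the independent check supports; how much weight to give each score is a choice left to the modeller.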

    Human PrimPol is a highly error-prone polymerase regulated by single-stranded DNA binding proteins

    PrimPol is a recently identified polymerase involved in eukaryotic DNA damage tolerance, employed in both re-priming and translesion synthesis mechanisms to bypass nuclear and mitochondrial DNA lesions. In this report, we investigate how the enzymatic activities of human PrimPol are regulated. We show that, unlike other TLS polymerases, PrimPol is not stimulated by PCNA and does not interact with it in vivo. We identify that PrimPol interacts with both of the major single-strand binding proteins, RPA and mtSSB in vivo. Using NMR spectroscopy, we characterize the domains responsible for the PrimPol-RPA interaction, revealing that PrimPol binds directly to the N-terminal domain of RPA70. In contrast to the established role of SSBs in stimulating replicative polymerases, we find that SSBs significantly limit the primase and polymerase activities of PrimPol. To identify the requirement for this regulation, we employed two forward mutation assays to characterize PrimPol's replication fidelity. We find that PrimPol is a mutagenic polymerase, with a unique error specificity that is highly biased towards insertion-deletion errors. Given the error-prone disposition of PrimPol, we propose a mechanism whereby SSBs greatly restrict the contribution of this enzyme to DNA replication at stalled forks, thus reducing the mutagenic potential of PrimPol during genome replication

    Modular assembly of a protein nanotriangle using orthogonally interacting coiled coils

    Synthetic protein assemblies that adopt programmed shapes would support many applications in nanotechnology. We used a rational design approach that exploits the modularity of orthogonally interacting coiled coils to create a self-assembled protein nanotriangle. Coiled coils have frequently been used to construct nanoassemblies and materials, but rarely with successful prior specification of the resulting structure. We designed a heterotrimer from three pairs of heterodimeric coiled coils that mediate specific interactions while avoiding undesired crosstalk. Non-associating pairs of coiled-coil units were strategically fused to generate three chains that were predicted to preferentially form the heterotrimer, and a rational annealing process led to the desired oligomer. Extensive biophysical characterization and modeling support the formation of a molecular triangle, which is a shape distinct from naturally occurring supramolecular nanostructures. Our approach can be extended to design more complex nanostructures using additional coiled-coil modules, other protein parts, or templated surfaces

    Computational redesign of thioredoxin is hypersensitive towards minor conformational changes in the backbone template

    Despite the development of powerful computational tools, the full-sequence design of proteins remains a challenging task. To investigate the limits and capabilities of computational tools, we conducted a study of the ability of the program Rosetta to predict sequences that recreate the authentic fold of thioredoxin. Focusing on the influence of conformational details in the template structures, we based our study on 8 experimentally determined template structures and generated 120 designs from each. For experimental evaluation, we chose six sequences from each of the eight templates by objective criteria. The 48 selected sequences were evaluated based on their progressive ability to (1) produce soluble protein in Escherichia coli and (2) yield stable monomeric protein, and (3) on the ability of the stable, soluble proteins to adopt the target fold. Of the 48 designs, we were able to synthesize 32, 20 of which resulted in soluble protein. Of these, only two were sufficiently stable to be purified. An X-ray crystal structure was solved for one of the designs, revealing a close resemblance to the target structure. We found significant differences among the eight template structures in how well their designs met the above three criteria, despite their high structural similarity. Thus, in order to improve the success rate of computational full-sequence design methods, we recommend that multiple template structures be used. Furthermore, this study shows that special care should be taken when optimizing the geometry of a structure prior to computational design when using a method that is based on rigid conformations
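    The counts reported in this abstract form a screening funnel, and the attrition at each stage is easy to work out. The short sketch below uses only the numbers stated in the abstract; the variable names and the choice to report each stage as a fraction of the 48 selected sequences are illustrative assumptions.

```python
# Counts taken from the abstract above.
templates = 8
designs_per_template = 120
selected_per_template = 6

total_designs = templates * designs_per_template   # all computed designs
selected = templates * selected_per_template       # sequences chosen for experiments
synthesized, soluble, purified = 32, 20, 2

funnel = [
    ("selected", selected),
    ("synthesized", synthesized),
    ("soluble", soluble),
    ("stable enough to purify", purified),
]

print(f"total designs generated: {total_designs}")
for name, count in funnel:
    # Fraction of the experimentally tested set that survives to this stage.
    print(f"{name:>24}: {count:3d} ({count / selected:.1%} of selected)")
```

    Read this way, 2 of the 48 selected sequences (about 4%) reached the purification stage.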

    Development of methods for protein complex structure prediction and design

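    Thesis (Ph.D.) -- Seoul National University Graduate School: College of Natural Sciences, Department of Chemistry, 2023. 8. Chaok Seok.

    Protein-protein interactions play important roles in a wide range of metabolic and signaling processes in living organisms. Because of these roles, they are involved in disease pathogenesis and are regarded as important therapeutic targets. Understanding these interactions at the level of atomic structure enables a deeper understanding of protein function and properties, and can decisively aid the development and improvement of small-molecule and protein drugs. In this context, computational prediction of protein complex structures and the study of their interactions have attracted attention. With the recent emergence of deep learning-based structure prediction programs such as AlphaFold2 and RoseTTAFold, the performance of protein structure prediction has improved considerably. However, many areas still need substantial progress; in particular, prediction performance drops sharply without a meaningful multiple sequence alignment or an informative sequence embedding. Protein drug development is also quite complex, because it requires predicting and optimizing protein sequence space and structure space simultaneously. This thesis gives a comprehensive account of GALAXY, a high-performing software package for protein complex structure prediction, and introduces two new methodologies that address problems in protein structure prediction and protein complex design. First, a new deep learning model, inspired by AlphaFold2 and conformational space annealing (CSA), is introduced for predicting complementarity-determining region (CDR) H3 loops. This model introduces a new kind of architecture for structure prediction and shows potential for extension to protein-protein binding structure prediction and general protein structure prediction. Second, we introduce a new program called 'H-map', which reports the types of amino acids that can bind through strong interactions with a local surface of the target protein.

    Protein-protein interactions play a vital role in numerous biological processes and often serve as therapeutic targets due to their involvement in disease pathogenesis. Comprehending the atomistic intricacies of these interactions can lead to the discovery of regulatory molecules for disease-related biological processes and the rational design of proteins for therapeutic applications. The emergence of deep learning-based techniques, such as AlphaFold2, RoseTTAFold, and RFdiffusion, has substantially advanced our capabilities in protein structure prediction and design. However, several challenges persist in these domains. Deep learning tools, while transformative, still exhibit limitations, particularly in the absence of strong guiding information for overall conformations, such as that contained in multiple sequence alignments or sequence embeddings.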
Moreover, the protein design problem is inherently complex because it requires concurrent optimization in sequence space and conformation space. This thesis first provides a comprehensive review of the GALAXY protein modeling package, a highly effective software package for protein oligomer structure prediction, and points the way towards new advances in protein structure prediction and protein binder design. Two new methods are then proposed to address the persistent challenges in these areas. First, a novel deep learning model, inspired by the AlphaFold2 structure module and conformational space annealing (CSA) global optimization, is introduced as a technique for predicting the structures of antibody complementarity determining region (CDR) H3 loops. This deep neural network model introduces a novel framework for structure prediction, suggesting potential applicability to other prediction domains of great molecular complexity, such as protein-protein docking and ab initio protein structure prediction. Second, we present 'H-map', a new deep neural network that generates amino acids on the surface of a target protein by considering only the local environment of the target protein, unlike other methods that require backbone structures of a potential binder.

    TABLE OF CONTENTS
    ABSTRACT
    TABLE OF CONTENTS
    LIST OF FIGURES
    LIST OF TABLES
    1. Introduction
    2. Protein Oligomer Structure Prediction with GALAXY Software
      2.1. Introduction
      2.2. Brief Introduction of Galaxy Software for Predicting Protein-Protein Complex Structure
        2.2.1. Overall pipeline for predicting protein-protein complex structure with GALAXY Package
        2.2.2. Protein monomer structure modeling
        2.2.3. Protein-protein complex structure modeling
        2.2.4. Protein structure refinement
      2.3. Applications I: SARS-CoV2-Spike protein structure prediction
        2.3.1. Introduction
        2.3.2. Full-length SARS-CoV-2 S protein model building
        2.3.3. Predicting characteristic stalk movement of the S protein consists of two highly flexible linkers
      2.4. Applications II: participating in CASP and CAPRI blind prediction experiments
      2.5. Applications III: prediction of GPCR-peptide complexes
      2.6. Conclusion
    3. Deep-Learning based Antibody H3 Loop Structure Prediction Inspired by Alphafold2 and Genetic Algorithm
      3.1. Introduction
      3.2. Methods
        3.2.1. Brief introduction of the overall method
        3.2.2. Dataset preparation for method training and testing
          3.2.2.1. Preparation of antibody structure set
          3.2.2.2. Preparation of general dimer loop set
        3.2.3. Benchmark set and training set
          3.2.3.1. IgFold benchmark set
          3.2.3.2. In-house test set
          3.2.3.3. Training set and validation set
        3.2.4. Loop structure prediction neural network architecture
          3.2.4.1. PerturbInitialStructure: initial loop structure generation module for further evolution
          3.2.4.2. SingleFeatureEmbedder: feature embedding module
          3.2.4.3. RecycleSingleFeature module
          3.2.4.4. PairwiseFeatureEmbedder module
          3.2.4.5. IPAEncoder module
          3.2.4.6. Cross-over module
          3.2.4.7. TriangularPairwiseFeature module
          3.2.4.8. IPAModule
          3.2.4.9. TorsionAnglePredictior module
          3.2.4.10. LDDTPredictior module
        3.2.5. Loss function
        3.2.6. Training procedure
          3.2.6.1. Preparation of input
          3.2.6.2. Data augmentation
          3.2.6.3. Fine-tuning
      3.3. Results and discussion
        3.3.1. Results of CDR H3 loop structure prediction on the benchmark set
        3.3.2. Evaluate the effect of multi-seed strategy
        3.3.3. Evolving predicted structures through iterative optimization
      3.4. Conclusion
    4. H-map: Amino Acid Generator for Designing and Scoring Protein Binders without Backbone Structure Information
      4.1. Introduction
      4.2. Method
        4.2.1. Overall workflow of the Hmap method
        4.2.2. Dataset preparation for training and testing
          4.2.2.1. Amino acid type reconstruction
          4.2.2.2. Protein-protein docking reranking set
          4.2.2.3. Mutation effect prediction set: SKEMPI2
        4.2.3. Algorithm architecture of Hmap
          4.2.3.1. Input preparation
          4.2.3.2. SE(3)-Transformer
          4.2.3.3. Final node-embedding processing
          4.2.3.4. Loss function
        4.2.4. Training procedure
        4.2.5. Performance comparison with our methods
      4.3. Results and Discussion
        4.3.1. Performance of amino acid type reconstruction
        4.3.2. Functional group center position prediction task
        4.3.3. Protein-protein docking decoy reranking task
        4.3.4. Mutation effect prediction task
      4.4. Conclusion
    5. Conclusion
    BIBLIOGRAPHY
    ABSTRACT IN KOREAN

    The methyltransferase domain of dengue virus protein NS5 ensures efficient RNA synthesis initiation and elongation by the polymerase domain

    Viral RNA-dependent RNA polymerases (RdRps) responsible for the replication of single-strand RNA virus genomes exert their function in the context of complex replication machineries. Within these replication complexes the polymerase activity is often highly regulated by RNA elements, proteins or other domains of multi-domain polymerases. Here, we present data on the influence of the methyltransferase domain (NS5-MTase) of dengue virus (DENV) protein NS5 on the RdRp activity of the polymerase domain (NS5-Pol). The steady-state polymerase activities of DENV-2 recombinant NS5 and NS5-Pol are compared using different biochemical assays allowing the dissection of the de novo initiation, transition and elongation steps of RNA synthesis. We show that NS5-MTase ensures efficient RdRp activity by stimulating the de novo initiation and the elongation phase. This stimulation is related to a higher affinity of NS5 toward the single-strand RNA template, indicating that NS5-MTase either completes a high-affinity RNA binding site and/or promotes the correct formation of the template tunnel. Furthermore, the NS5-MTase increases the affinity of the priming nucleotide ATP upon de novo initiation and causes a higher catalytic efficiency of the polymerase upon elongation. The complex stimulation pattern is discussed under the perspective that NS5 adopts several conformations during RNA synthesis