8,686 research outputs found

    IN SILICO MODELING THE EFFECT OF SINGLE POINT MUTATIONS AND RESCUING THE EFFECT BY SMALL MOLECULES BINDING

    A single-point mutation in the genome, for example a single-nucleotide polymorphism (SNP) or a rare genetic mutation, is the change of a single nucleotide for another in the genome sequence. Some of these mutations result in an amino acid substitution in the corresponding protein sequence (missense mutations); others do not. This investigation focuses on genetic mutations resulting in a change in the amino acid sequence of the corresponding protein. This choice is motivated by the fact that missense mutations are frequently found to affect the native function of proteins by altering their structure, interactions and other properties, and thereby cause disease. One such disease is Snyder-Robinson syndrome (SRS), an X-linked mental retardation found to be caused by missense mutations in human spermine synthase (SMS). In this thesis, a rational approach to predicting the effects of missense mutations on SMS wild-type characteristics was carried out. Following this work, structure-based virtual screening of small molecules was applied to rescue the disease-causing effect by searching for small molecules that stabilize the malfunctioning SMS mutant dimer.

    Statistical modeling of RNA structure profiling experiments enables parsimonious reconstruction of structure landscapes.

    RNA plays key regulatory roles in diverse cellular processes, where its functionality often derives from folding into and converting between structures. Many RNAs further rely on co-existence of alternative structures, which govern their response to cellular signals. However, characterizing heterogeneous landscapes is difficult, both experimentally and computationally. Recently, structure profiling experiments have emerged as powerful and affordable structure characterization methods, which improve computational structure prediction. To date, efforts have centered on predicting one optimal structure, with much less progress made on multiple-structure prediction. Here, we report a probabilistic modeling approach that predicts a parsimonious set of co-existing structures and estimates their abundances from structure profiling data. We demonstrate robust landscape reconstruction and quantitative insights into structural dynamics by analyzing numerous data sets. This work establishes a framework for data-directed characterization of structure landscapes to aid experimentalists in performing structure-function studies
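To make the idea of estimating abundances from profiling data concrete, here is a minimal sketch for the simplest possible case: two co-existing structures whose per-nucleotide profiles are assumed known. It is not the probabilistic model described in the abstract; the function name `estimate_abundance`, the two-structure restriction, and the toy profiles are all illustrative assumptions. The observed profile is modeled as an abundance-weighted average of the two structure profiles, and the abundance is recovered by least squares.

```python
from typing import Sequence


def estimate_abundance(
    observed: Sequence[float],
    profile_a: Sequence[float],
    profile_b: Sequence[float],
) -> float:
    """Least-squares abundance of structure A in a two-structure mixture.

    Assumes observed[i] ~= w * profile_a[i] + (1 - w) * profile_b[i]
    and returns w clipped to the interval [0, 1].
    """
    diff = [a - b for a, b in zip(profile_a, profile_b)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("profiles are identical; abundance is not identifiable")
    raw = sum((y - b) * d for y, b, d in zip(observed, profile_b, diff)) / denom
    return min(1.0, max(0.0, raw))


# Toy data (hypothetical): 1 = reactive (unpaired), 0 = unreactive (paired).
profile_a = [1.0, 0.0, 1.0, 0.0]
profile_b = [0.0, 1.0, 1.0, 0.0]
observed = [0.3 * a + 0.7 * b for a, b in zip(profile_a, profile_b)]

print(round(estimate_abundance(observed, profile_a, profile_b), 3))
```

Positions where the two profiles agree carry no information about the mixture, which is why only the differing positions contribute to the estimate.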

    Modelling the Effects of Disease-Associated Single Amino Acid Variants and Rescuing the Effects by Small Molecules

    A single nucleotide polymorphism (SNP) is a variation of a single nucleotide in the genome. Some of these variations cause a change of a single amino acid in the corresponding protein, resulting in a single amino acid variation (SAV). SAVs can lead to profound alterations of the corresponding biological processes and can thus be associated with many human diseases. This dissertation focuses on the integration of existing computational approaches and the development of new ones to model the effects of SAVs, with the goal of revealing the molecular mechanisms of human diseases. Since proton transfer and pKa shifts are frequently implicated in disease causality, proton transfers in protein-nucleic acid interactions are investigated, along with the development of a new computational approach to predict the effect of SAVs on protein-DNA binding affinity. The SAVs in four proteins, lysine-specific demethylase 5C (KDM5C), spermine synthase (SpmSyn), 7-dehydrocholesterol reductase (DHCR7) and methyl CpG binding protein 2 (MeCP2), are extensively studied using numerous computational approaches to reveal molecular details of disease-associated effects. In the case of the MeCP2 protein, the effect of the most commonly occurring disease-causing mutation, R133C, was targeted by structure-based virtual screening to identify small molecules with the potential to rescue the malfunctioning R133C mutant.

    PREDICTIVE BIOINFORMATIC METHODS FOR ANALYZING GENES AND PROTEINS

    Since large amounts of biological data are generated using various high-throughput technologies, efficient computational methods are important for understanding the biological meanings behind the complex data. Machine learning is particularly appealing for biological knowledge discovery. Tissue-specific gene expression and protein sumoylation play essential roles in the cell and are implicated in many human diseases. Protein destabilization is a common mechanism by which mutations cause human diseases. In this study, machine learning approaches were developed for predicting human tissue-specific genes, protein sumoylation sites and protein stability changes upon single amino acid substitutions. Relevant biological features were selected for input vector encoding, and machine learning algorithms, including Random Forests and Support Vector Machines, were used for classifier construction. The results suggest that the approaches give rise to more accurate predictions than previous studies and can provide valuable information for further experimental studies. Moreover, seeSUMO and MuStab web servers were developed to make the classifiers accessible to the biological research community. Structure-based methods can be used to predict the effects of amino acid substitutions on protein function and stability. The nonsynonymous Single Nucleotide Polymorphisms (nsSNPs) located at the protein binding interface have dramatic effects on protein-protein interactions. To model the effects, the nsSNPs at the interfaces of 264 protein-protein complexes were mapped on the protein structures using homology-based methods. The results suggest that disease-causing nsSNPs tend to destabilize the electrostatic component of the binding energy and nsSNPs at conserved positions have significant effects on binding energy changes. The structure-based approach was developed to quantitatively assess the effects of amino acid substitutions on protein stability and protein-protein interaction. It was shown that the structure-based analysis could help elucidate the mechanisms by which mutations cause human genetic disorders. These new bioinformatic methods can be used to analyze some interesting genes and proteins for human genetic research and improve our understanding of their molecular mechanisms underlying human diseases.

    Mechanistic behaviour and molecular interactions of heat shock protein 47 (HSP47)

    This project involves the study of heat shock protein 47 (HSP47), a molecular chaperone crucial for collagen biosynthesis. It exhibits a high degree of sequence homology with members of the serine protease inhibitor (serpin) superfamily, though HSP47 does not possess inhibitory activity. It is a single-substrate chaperone and binds only to collagen. ‘Knock-out’ of the hsp47 gene impairs the secretion of correctly folded collagen triple helix molecules, leading to embryonic lethality in mice. The aim of this project was therefore to elucidate the specific mechanism that governs binding to and release from collagen at the molecular level, known as the ‘pH-switch mechanism’. Emphasis is placed on histidine (His) residues, as the HSP47-collagen dissociation pH is similar to the pKa of the imidazole side chain of His residues. Site-directed mutagenesis was used to mutate surface His residues, based on a mouse HSP47 homology model. The effects of the mutations on the behaviour of HSP47 were then assessed by collagen binding assays and structural analyses with circular dichroism (CD). All mutants were found to have good solubility and to retain their ability to bind collagen like wild-type HSP47 in batch assays, but perturbed behaviour was seen in column experiments. Mutation of the His residue at position 191 (H191) causes a shift in the collagen dissociation pH, while mutation of H197 and/or H198 disrupts the specific HSP47-collagen interaction. H191, H197 and H198 are predicted to be located in the region near the C-terminus of strand 3 of β-sheet A (s3A) in the homology model, a region specifically known as the ‘breach cluster’ in serpin nomenclature. The extent of conformational rearrangement of this region was further investigated by means of intrinsic tryptophan fluorescence spectroscopy using a series of single tryptophan (Trp) mutants. Results from analyses performed on the mutants did not contradict the observations from the His mutational work, as Trp residues in the ‘breach cluster’ are likely to be located in the dynamic region of the HSP47 pH-triggered conformational change. In conclusion, this study establishes the importance of His residues in the ‘breach cluster’ to HSP47 pH-switch behaviour. Finally, a model for the HSP47 pH-switch mechanism was proposed from data obtained via mutagenesis experiments. It is hoped that the model will assist future research into HSP47 cellular behaviour and will also be of use in therapeutic applications involving the molecular chaperone.
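The link between a dissociation pH and the pKa of the His imidazole side chain can be illustrated with the Henderson-Hasselbalch relation. The sketch below computes the fraction of a titratable group that is protonated at a given pH. The default pKa of 6.0 is an assumed textbook-style value for a histidine side chain, not a number from this study, and the pKa of a specific residue in a folded protein can differ substantially; the function name is illustrative.

```python
def protonated_fraction(ph: float, pka: float = 6.0) -> float:
    """Fraction of a titratable group in its protonated form.

    Uses the Henderson-Hasselbalch relation:
        [HA] / ([HA] + [A-]) = 1 / (1 + 10 ** (pH - pKa))
    The default pKa is an assumed, illustrative value.
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Scan a few pH values around the assumed pKa.
for ph in (5.0, 6.0, 7.0, 7.4):
    print(f"pH {ph:.1f}: {protonated_fraction(ph):.3f} protonated")
```

The protonated fraction changes most steeply near pH = pKa, which is why a residue whose pKa lies close to the observed dissociation pH is a natural candidate for a pH-dependent switch.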

    Impact of non synonymous single nucleotide variants on protein fitness: experimental analysis for a comparative study

    Proteins are large biological molecules that control most vital cellular functions. They consist of one or more chains of amino acids in an order determined by the base sequence of nucleotides in the DNA coding for the protein. Thanks to the information from the genetic code and according to the energy landscape, proteins fold into their correct three-dimensional structures and exert their specific function. The correct fold of a large portion of the structure is generally related to specific protein functions, and when even small alterations occur, it is possible to observe a decrease, an increase or a drastic change in the protein function. In several cases alterations at the amino acid level can influence the conformational rearrangement, the function or the binding properties of a given protein. On this premise, knowledge of protein structure-function relationships can be crucial in finding the molecular basis for hereditary diseases and in predicting protein function from structure and vice versa. Therefore, the study of structure-function relationships is important for better understanding several diseases at the molecular level. In particular, this kind of approach seems to be relevant in cancer research, considering that several somatic variants resulting from alterations at the amino acid level have been detected in cancer genomes for several proteins. The analysis of these kinds of alterations is key to understanding the genetic bases of disease progression, patient survival and response to therapy. Since knowledge of protein function in health and disease is essential to identify new and more specific cures for different diseases and to design pharmacologically active and more selective drugs, the information resulting from the analysis of somatic mutations found in cancer tissues can improve the available therapies and create new and more specific ones, suggesting that precision and personalized medicine is no longer a daydream.

    Predicting a Protein's Stability under a Million Mutations

    Stabilizing proteins is a foundational step in protein engineering. However, the evolutionary pressure of all extant proteins makes identifying the scarce number of mutations that will improve thermodynamic stability challenging. Deep learning has recently emerged as a powerful tool for identifying promising mutations. Existing approaches, however, are computationally expensive, as the number of model inferences scales with the number of mutations queried. Our main contribution is a simple, parallel decoding algorithm. Our method, Mutate Everything, is capable of predicting the effect of all single and double mutations in one forward pass. It is even versatile enough to predict higher-order mutations with minimal computational overhead. We build Mutate Everything on top of ESM2 and AlphaFold, neither of which was trained to predict thermodynamic stability. We trained on the Mega-Scale cDNA proteolysis dataset and achieved state-of-the-art performance on single and higher-order mutations on the S669, ProTherm, and ProteinGym datasets. Code is available at https://github.com/jozhang97/MutateEverything. Comment: NeurIPS 2023.
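The scaling problem mentioned in the abstract can be made concrete by counting mutants. The sketch below is plain combinatorics, not code from the Mutate Everything repository: it counts the distinct substitution variants of a given order for a protein of a given length, assuming the 20 standard amino acids (19 alternatives per position). The function name and the 100-residue example are illustrative assumptions.

```python
from math import comb


def count_mutants(length: int, order: int, alphabet: int = 20) -> int:
    """Number of distinct variants with exactly `order` substituted positions.

    Choose which positions mutate (comb), then pick one of the
    (alphabet - 1) non-wild-type residues at each chosen position.
    """
    return comb(length, order) * (alphabet - 1) ** order


# Hypothetical 100-residue protein.
singles = count_mutants(100, 1)
doubles = count_mutants(100, 2)
print(singles)  # 1900
print(doubles)  # 1786950
```

A method that needs one model inference per queried mutant would therefore face well over a million evaluations for the double mutants of even a small protein, which is the cost a single-forward-pass design aims to avoid.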

    Effect of BET missense mutations on bromodomain function, inhibitor binding and stability

    Lysine acetylation is an important epigenetic mark regulating gene transcription and chromatin structure. Acetylated lysine residues are specifically recognized by bromodomains, small protein interaction modules that read these modifications in a sequence- and acetylation-dependent way, regulating the recruitment of transcriptional regulators and chromatin remodelling enzymes to acetylated sites in chromatin. Recent studies revealed that bromodomains are highly druggable protein interaction domains, resulting in the development of a large number of bromodomain inhibitors. BET bromodomain inhibitors received a lot of attention in the oncology field, resulting in the rapid translation of early BET bromodomain inhibitors into clinical studies. Here we investigated the effects of mutations present as polymorphisms or found in cancer on BET bromodomain function and stability, and the influence of these mutants on inhibitor binding. We found that most BET missense mutations localize to peripheral residues in the two terminal helices. Crystal structures showed that the three-dimensional structure is not compromised by these mutations, but mutations located in close proximity to the acetyl-lysine binding site modulate acetyl-lysine and inhibitor binding. Most mutations significantly affect protein stability and tertiary structure in solution, suggesting new interactions and an alternative network of protein-protein interconnection as a consequence of single amino acid substitution. To our knowledge this is the first report studying the effect of mutations on bromodomain function and inhibitor binding.

    Computational Design of Stable and Soluble Biocatalysts

    Natural enzymes are delicate biomolecules possessing only marginal thermodynamic stability. Poorly stable, misfolded, and aggregated proteins lead to huge economic losses in the biotechnology and biopharmaceutical industries. Consequently, there is a need to design optimized protein sequences that maximize stability, solubility, and activity over a wide range of temperatures and pH values in buffers of different composition and in the presence of organic cosolvents. This has created great interest in using computational methods to enhance biocatalysts' robustness and solubility. Suitable methods include (i) energy calculations, (ii) machine learning, (iii) phylogenetic analyses, and (iv) combinations of these approaches. We have witnessed impressive progress in the design of stable enzymes over the last two decades, but predictions of protein solubility and expressibility are scarce. Stabilizing mutations can be predicted accurately using available force fields, and the number of sequences available for phylogenetic analyses is growing. In addition, complex computational workflows are being implemented in intuitive web tools, enhancing the quality of protein stability predictions. Conversely, solubility predictors are limited by the lack of robust and balanced experimental data, an inadequate understanding of fundamental principles of protein aggregation, and a dearth of structural information on folding intermediates. Here we summarize recent progress in the development of computational tools for predicting protein stability and solubility, critically assess their strengths and weaknesses, and identify apparent gaps in data and knowledge. We also present perspectives on the computational design of stable and soluble biocatalysts

    Analysis and interpretation of the impact of missense variants in cancer

    Funding: This work was supported by the PRIN project, “Integrative tools for defining the molecular basis of the diseases: Computational and Experimental methods for Protein Variant Interpretation” of the Ministero Istruzione, Università e Ricerca [201744NR8S]. Large scale genome sequencing allowed the identification of a massive number of genetic variations, whose impact on human health is still unknown. In this review we analyze, by an in silico-based strategy, the impact of missense variants on cancer-related genes, whose effect on protein stability and function was experimentally determined. We collected a set of 164 variants from 11 proteins to analyze the impact of missense mutations at structural and functional levels, and to assess the performance of state-of-the-art methods (FoldX and Meta-SNP) for predicting protein stability change and pathogenicity. The result of our analysis shows that a combination of experimental data on protein stability and in silico pathogenicity predictions allowed the identification of a subset of variants with a high probability of having a deleterious phenotypic effect, as confirmed by the significant enrichment of the subset in variants annotated in the COSMIC database as putative cancer-driving variants. Our analysis suggests that the integration of experimental and computational approaches may contribute to evaluate the risk for complex disorders and develop more effective treatment strategies. Authors: Petrosino M.; Novak L.; Pasquo A.; Chiaraluce R.; Turina P.; Capriotti E.; Consalvi V.