168 research outputs found

    Thermodynamic driving forces in protein regulation studied by molecular dynamics simulations.


    Collective Langevin Dynamics of Conformational Motions in Proteins


    Machine Learning Applications for Drug Repurposing

    The cost of bringing a drug to market is astounding and the failure rate is intimidating. Drug discovery has had limited success under the conventional reductionist one-drug-one-gene-one-disease paradigm, where a single disease-associated gene is identified and a molecular binder to the specific target is subsequently designed. Under this simplistic paradigm of drug discovery, a drug molecule is assumed to interact only with the intended on-target. However, small-molecule drugs often interact with multiple targets, and those off-target interactions are not considered under the conventional paradigm. As a result, drug-induced side effects and adverse reactions are often neglected until a very late stage of drug discovery, where the discovery of drug-induced side effects and potential drug resistance can decrease the value of the drug and even completely invalidate its use. Thus, a new paradigm in drug discovery is needed. Structural systems pharmacology is a new paradigm in drug discovery in which drug activities are studied by data-driven, large-scale models that take structures and drugs into account. Structural systems pharmacology will model, on a genome scale, the energetic and dynamic modifications of protein targets by drug molecules as well as the subsequent collective effects of drug-target interactions on the phenotypic drug responses. To date, however, few experimental and computational methods can determine genome-wide protein-ligand interaction networks and the clinical outcomes mediated by them. As a result, the majority of proteins have not been charted for their small-molecule ligands, and we have a limited understanding of drug actions. To address the challenge, this dissertation seeks to develop and experimentally validate innovative computational methods to infer genome-wide protein-ligand interactions and multi-scale drug-phenotype associations, including drug-induced side effects.
The hypothesis is that the integration of data-driven bioinformatics tools with structure-and-mechanism-based molecular modeling methods will lead to an optimal tool for accurately predicting drug actions and drug-associated phenotypic responses, such as side effects. This dissertation starts by reviewing the current status of computational drug discovery for complex diseases in Chapter 1. In Chapter 2, we present REMAP, a one-class collaborative filtering method to predict off-target interactions from a protein-ligand interaction network. In our later work, REMAP was integrated with structural genomics and statistical machine learning methods to design a dual-indication polypharmacological anticancer therapy. In Chapter 3, we extend REMAP, the core method in Chapter 2, into a multi-ranked collaborative filtering algorithm, WINTF, and present relevant mathematical justifications. Chapter 4 is an application of WINTF to repurpose the FDA-approved drug diazoxide as a potential treatment for triple-negative breast cancer, a deadly subtype of breast cancer. In Chapter 5, we present a multilayer extension of REMAP, applied to predict drug-induced side effects and the associated biological pathways. In Chapter 6, we close this dissertation by presenting a deep learning application to learn biochemical features from protein sequence representation using a natural language processing method.

    An in-silico study: Investigating small molecule modulators of bio-molecular interactions

    Small molecule inhibitors are commonly used to target protein targets that assist in the spread of diseases such as AIDS, cancer and deadly forms of influenza. Despite drug companies spending millions on R&D, the number of drugs that pass clinical trials is limited due to difficulties in engineering optimal non-covalent interactions. As many protein targets have the ability to rapidly evolve resistance, there is an urgent need for methods that rapidly identify effective new compounds. The thermodynamic driving force behind most biochemical reactions is known as the Gibbs free energy and it contains opposing dynamic and structural components that are known as the entropy (ΔS°) and enthalpy (ΔH°) respectively. ΔG° = ΔH° - TΔS°. Traditionally, drug design focussed on complementing the shape of an inhibitor to the binding cavity to optimise ΔG° favourability. However, this approach neglects the entropic contribution and phenomena such as Entropy-Enthalpy Compensation (EEC) often result in favourable bonding interactions not improving ΔG°, due to entropic unfavorability. Similarly, attempts to optimise inhibitor entropy can also have unpredictable results. Experimental methods such as ITC report on global thermodynamics, but have difficulties identifying the underlying molecular rationale for measured values. However, computational techniques do not suffer from the same limitations. MUP-I can promiscuously bind panels of hydrophobic ligands that possess incremental structural differences. Thus, small perturbations to the system can be studied through various in silico approaches. This work analyses the trends exhibited across these panels by examining the dynamic component via the calculation of per-unit entropies of protein, ligand and solvent. Two new methods were developed to assess the translational and rotational contributions to TΔS°, and a protocol created to study ligand internalisation. 
Synthesising this information with structural data obtained from spatial data on the binding cavity, intermolecular contacts and H-bond analysis allowed detailed molecular rationale for the global thermodynamic signatures to be derived.
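
The relation ΔG° = ΔH° - TΔS° quoted in the abstract above, and the entropy-enthalpy compensation it mentions, can be illustrated with a minimal Python sketch. The function name, the temperature of 300 K and all ligand values are hypothetical numbers chosen for illustration; they are not taken from the thesis.

```python
def gibbs_free_energy(delta_h, temperature, delta_s):
    """Return dG = dH - T*dS.

    delta_h is in kJ/mol, temperature in K, delta_s in kJ/(mol*K),
    so the result is in kJ/mol.
    """
    return delta_h - temperature * delta_s


# Hypothetical values, for illustration only.
TEMPERATURE_K = 300.0

# Reference ligand: moderate enthalpy gain, small entropic penalty.
dg_reference = gibbs_free_energy(-40.0, TEMPERATURE_K, -0.020)

# Modified ligand: 10 kJ/mol more favourable enthalpy (e.g. an extra
# bonding interaction), but a larger entropic penalty.
dg_modified = gibbs_free_energy(-50.0, TEMPERATURE_K, -0.050)

# Net change in binding free energy between the two ligands.
free_energy_gain = dg_modified - dg_reference

print(f"dG reference: {dg_reference:.1f} kJ/mol")
print(f"dG modified:  {dg_modified:.1f} kJ/mol")
print(f"net change:   {free_energy_gain:.1f} kJ/mol")
```

In this made-up case a 10 kJ/mol improvement in ΔH° yields only about 1 kJ/mol improvement in ΔG°, because the entropic term absorbs most of the gain.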

    Deep Learning for Genomics: A Concise Overview

    Advancements in genomic research such as high-throughput sequencing techniques have driven modern genomic studies into "big data" disciplines. This data explosion is constantly challenging conventional methods used in genomics. In parallel with the urgent demand for robust algorithms, deep learning has succeeded in a variety of fields such as vision, speech, and text processing. Yet genomics entails unique challenges to deep learning since we are expecting from deep learning a superhuman intelligence that explores beyond our knowledge to interpret the genome. A powerful deep learning model should rely on insightful utilization of task-specific knowledge. In this paper, we briefly discuss the strengths of different deep learning models from a genomic perspective so as to fit each particular task with a proper deep architecture, and remark on practical considerations of developing modern deep learning architectures for genomics. We also provide a concise review of deep learning applications in various aspects of genomic research, as well as pointing out potential opportunities and obstacles for future genomics applications. Comment: Invited chapter for Springer Book: Handbook of Deep Learning Application

    Development of Integrated Machine Learning and Data Science Approaches for the Prediction of Cancer Mutation and Autonomous Drug Discovery of Anti-Cancer Therapeutic Agents

    Few technological ideas have captivated the minds of biochemical researchers to the degree that machine learning (ML) and artificial intelligence (AI) have. Over the last few years, advances in the ML field have driven the design of new computational systems that improve with experience and are able to model increasingly complex chemical and biological phenomena. In this dissertation, we capitalize on these achievements and use machine learning to study drug receptor sites and design drugs to target these sites. First, we analyze the significance of various single nucleotide variations and assess their rate of contribution to cancer. Following that, we use a portfolio of machine learning and data science approaches to design new drugs to target protein kinase inhibitors. We show that these techniques exhibit strong promise in aiding cancer research and drug discovery.

    DEEP LEARNING METHODS FOR PREDICTION OF AND ESCAPE FROM PROTEIN RECOGNITION

    Protein interactions drive diverse processes essential to living organisms, and thus numerous biomedical applications center on understanding, predicting, and designing how proteins recognize their partners. While unfortunately the number of interactions of interest still vastly exceeds the capabilities of experimental determination methods, computational methods promise to fill the gap. My thesis pursues the development and application of computational methods for several protein interaction prediction and design tasks. First, to improve protein-glycan interaction specificity prediction, I developed GlyBERT, which learns biologically relevant glycan representations encapsulating the components most important for glycan recognition within their structures. GlyBERT encodes glycans with a branched biochemical language and employs an attention-based deep language model to embed the correlation between local and global structural contexts. This approach enables the development of predictive models from limited data, supporting applications such as lectin binding prediction. Second, to improve protein-protein interaction prediction, I developed a unified geometric deep neural network, ‘PInet’ (Protein Interface Network), which leverages the best properties of both data- and physics-driven methods, learning and utilizing models capturing both geometrical and physicochemical molecular surface complementarity. In addition to obtaining state-of-the-art performance in predicting protein-protein interactions, PInet can serve as the backbone for other protein-protein interaction modeling tasks such as binding affinity prediction. Finally, I turned from prediction to design, addressing two important tasks in the context of antibody-antigen recognition. The first problem is to redesign a given antigen to evade antibody recognition, e.g., to help biotherapeutics avoid pre-existing immunity or to focus vaccine responses on key portions of an antigen.
The second problem is to design a panel of variants of a given antigen to use as “bait” in experimental identification of antibodies that recognize different parts of the antigen, e.g., to support classification of immune responses or to help select among different antibody candidates. I developed a geometry-based algorithm to generate variants to address these design problems, seeking to maximize utility subject to experimental constraints. During the design process, the algorithm accounts for and balances the effects of candidate mutations on antibody recognition and on antigen stability. In retrospective case studies, the algorithm demonstrated promising precision, recall, and robustness of finding good designs. This work represents the first algorithm to systematically design antigen variants for characterization and evasion of polyclonal antibody responses

    Quantitative modeling and statistical analysis of protein-DNA binding sites


    Protein-protein interactions: impact of solvent and effects of fluorination

    Proteins have an indispensable role in the cell. They carry out a wide variety of structural, catalytic and signaling functions in all known biological systems. To perform their biological functions, proteins establish interactions with other bioorganic molecules including other proteins. Therefore, protein-protein interactions are one of the central topics in molecular biology. My thesis is devoted to three different topics in the field of protein-protein interactions. The first one focuses on solvent contribution to protein interfaces as it is an important component of protein complexes. The second topic discloses the structural and functional potential of fluorine's unique properties, which are attractive for protein design and engineering not feasible within the scope of canonical amino acids. The last part of this thesis is a study of the impact of charged amino acid residues within the hydrophobic interface of a coiled-coil system, which is one of the well-established model systems for protein-protein interactions studies. I. The majority of proteins interact in vivo in solution, thus studies of solvent impact on protein-protein interactions could be crucial for understanding many processes in the cell. However, though solvent is known to be very important for protein-protein interactions in terms of structure, dynamics and energetics, its effects are often disregarded in computational studies because a detailed solvent description requires complex and computationally demanding approaches. As a consequence, many protein residues that establish water-mediated interactions are not considered in an interface definition. In previous work carried out in our group, the protein interfaces database SCOWLP was developed.
This database takes interfacial solvent into account and, on that basis, classifies all interfacial protein residues of the PDB into three classes according to their interacting properties: dry (direct interaction), dual (direct and water-mediated interactions), and wet spots (residues interacting only through one water molecule). To define an interaction SCOWLP considers a donor–acceptor distance for hydrogen bonds of 3.2 Å, for salt bridges of 4 Å, and for van der Waals contacts the sum of the van der Waals radii of the interacting atoms. In previous studies of the group, statistical analysis of a non-redundant protein structure dataset showed that 40.1% of the interfacial residues participate in water-mediated interactions, and that 14.5% of the total residues in interfaces are wet spots. Moreover, wet spots have been shown to display similar characteristics to residues contacting water molecules in cores or cavities of proteins. The goals of this part of the thesis were: 1. to characterize the impact of solvent in protein-protein interactions; 2. to elucidate possible effects of solvent inclusion into the correlated mutations approach for protein contacts prediction. To study solvent impact on protein interfaces, a molecular dynamics (MD) approach has been used. This part of the work is elaborated in section 2.1 of this thesis. We have characterized properties of water-mediated protein interactions at residue and solvent level. For this purpose, an MD analysis of 17 representative complexes from SH3 and immunoglobulin protein families has been performed. We have shown that the interfacial residues interacting through a single water molecule (wet spots) are energetically and dynamically very similar to other interfacial residues. At the same time, water molecules mediating protein interactions have been found to be significantly less mobile than surface solvent in terms of residence time.
Calculated free energies indicate that these water molecules should significantly affect formation and stability of a protein-protein complex. The results obtained in this part of the work also suggest that water molecules in protein interfaces contribute to the conservation of protein interactions by allowing more sequence variability in the interacting partners, which has important implications for the use of the correlated mutations concept in protein interactions studies. This concept is based on the assumption that interacting protein residues co-evolve, so that a mutation in one of the interacting counterparts is compensated by a mutation in the other. The study presented in section 2.2 has been carried out to prove that an explicit introduction of solvent into the correlated mutations concept indeed yields qualitative improvement of existing approaches. For this, we have used the data on interfacial solvent obtained from the SCOWLP database (the whole PDB) to construct a “wet” similarity matrix. This matrix has been used for prediction of protein contacts together with a well-established “dry” matrix. We have analyzed two datasets containing 50 domains and 10 domain pairs, and have compared the results obtained by using several combinations of both “dry” and “wet” matrices. We have found that for predictions for both intra- and interdomain contacts the introduction of a combination of a “dry” and a “wet” similarity matrix improves the predictions in comparison to the “dry” one alone. Our analysis opens up the idea that the consideration of water may have an impact on the improvement of the contact predictions obtained by correlated mutations approaches. 
There are two principally novel aspects in this study in the context of the used correlated mutations methodology : i) the first introduction of solvent explicitly into the correlated mutations approach; ii) the use of the definition of protein-protein interfaces, which is essentially different from many other works in the field because of taking into account physico-chemical properties of amino acids and not being exclusively based on distance cut-offs. II. The second part of the thesis is focused on properties of fluorinated amino acids in protein environments. In general, non-canonical amino acids with newly designed side-chain functionalities are powerful tools that can be used to improve structural, catalytic, kinetic and thermodynamic properties of peptides and proteins, which otherwise are not feasible within the use of canonical amino acids. In this context fluorinated amino acids have increasingly gained in importance in protein chemistry because of fluorine's unique properties: high electronegativity and a small atomic size. Despite the wide use of fluorine in drug design, properties of fluorine in protein environments have not been yet extensively studied. The aims of this part of the dissertation were: 1. to analyze the basic properties of fluorinated amino acids such as electrostatic and geometric characteristics, hydrogen bonding abilities, hydration properties and conformational preferences (section 3.1) 2. to describe the behavior of fluorinated amino acids in systems emulating protein environments (section 3.2, section 3.3) First, to characterize fluorinated amino acids side chains we have used fluorinated ethane derivatives as their simplified models and applied a quantum mechanics approach. Properties such as charge distribution, dipole moments, volumes and size of the fluoromethylated groups within the model have been characterized. 
Hydrogen bonding properties of these groups have been compared with the groups typically present in natural protein environments. We have shown that hydrogen and fluorine atoms within these fluoromethylated groups are weak hydrogen bond donors and acceptors. Nevertheless, they should not be disregarded for applications in protein engineering. Then, we have implemented four fluorinated L-amino acids for the AMBER force field and characterized their conformational and hydration properties at the MD level. We have found that hydrophobicity of fluorinated side chains grows with the number of fluorine atoms and could be explained in terms of high electronegativity of fluorine atoms and spatial demand of fluorinated side chains. These data on hydration agree with the results obtained in the experimental work performed by our collaborators. We have rationally engineered systems that allow us to study fluorine properties and extract results that could be extrapolated to proteins. For this, we have emulated protein environments by introducing fluorinated amino acids into a parallel coiled-coil and enzyme-ligand chymotrypsin systems. The results on the effect of fluorination on coiled-coil dimerization and substrate affinities in the chymotrypsin active site obtained by MD, molecular docking and free energy calculations are in strong agreement with experimental data obtained by our collaborators. In particular, we have shown that fluorine content and position of fluorination can considerably change the polarity and steric properties of an amino acid side chain and, thus, can influence the properties that a fluorinated amino acid reveals within a native protein environment. III. Coiled-coils typically consist of two to five right-handed α-helices that wrap around each other to form a left-handed superhelix. The interface of two α-helices is usually represented by hydrophobic residues.
However, the analysis of protein databases revealed that in naturally occurring proteins up to 20% of these positions are populated by polar and charged residues. The impact of these residues on the stability of the coiled-coil system is not clear. MD simulations together with free energy calculations have been utilized to estimate favourable interaction partners for uncommon amino acids within the hydrophobic core of coiled-coils (Chapter 4). Based on these data, the best hits among binding partners for one strand of a coiled-coil bearing a charged amino acid in a central hydrophobic core position have been selected. Computational data have been in agreement with the results obtained by our collaborators, who applied phage display technology and CD spectroscopy. This combination of theoretical and experimental approaches allowed us to gain deeper insight into the stability of the coiled-coil system. To conclude, this thesis widens existing concepts of protein structural biology in three areas of current importance. We expand on the role of solvent in protein interfaces, which contributes to the knowledge of physico-chemical properties underlying protein-protein interactions. We develop a deeper understanding of fluorine's impact upon its introduction into protein environments, which may assist in exploiting the full potential of fluorine's unique properties for applications in the field of protein engineering and drug design. Finally, we investigate the mechanisms underlying coiled-coil system folding. The results presented in the thesis are of definite importance for possible applications (e.g. introduction of solvent explicitly into the scoring function) in protein folding, docking and rational design methods. The dissertation consists of four chapters: ● Chapter 1 contains an introduction to the topic of protein-protein interactions including basic concepts and an overview of the present state of research in the field.
● Chapter 2 focuses on the studies of the role of solvent in protein interfaces. ● Chapter 3 is devoted to the work on fluorinated amino acids in protein environments. ● Chapter 4 describes the study of coiled-coils folding properties. The experimental parts presented in Chapters 3 and 4 of this thesis have been performed by our collaborators at FU Berlin. Sections 2.1, 2.2, 3.1, 3.2 and Chapter 4 have been submitted/published in peer-reviewed international journals. Their organization follows a standard research article structure: Abstract, Introduction, Methodology, Results and discussion, and Conclusions. Section 3.3, though not published yet, is also organized in the same way. The literature references are summed up together at the end of the thesis to avoid redundancy within different chapters
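
The interface classification attributed to SCOWLP in the abstract above (dry, dual and wet spots, with cutoffs of 3.2 Å for hydrogen bonds, 4 Å for salt bridges and the sum of van der Waals radii for van der Waals contacts) can be sketched in Python as follows. This is a minimal illustration, not the SCOWLP implementation: the function names, the string labels and the choice of inclusive cutoffs are assumptions made here.

```python
# Distance cutoffs in angstroms, as quoted in the abstract.
HBOND_CUTOFF = 3.2        # donor-acceptor distance for hydrogen bonds
SALT_BRIDGE_CUTOFF = 4.0  # distance for salt bridges


def is_interaction(kind, distance, vdw_radii_sum=None):
    """Decide whether an atom pair counts as interacting.

    Hypothetical helper: 'kind' is one of "hbond", "salt_bridge", "vdw".
    Treating the cutoffs as inclusive (<=) is an assumption.
    """
    if kind == "hbond":
        return distance <= HBOND_CUTOFF
    if kind == "salt_bridge":
        return distance <= SALT_BRIDGE_CUTOFF
    if kind == "vdw":
        if vdw_radii_sum is None:
            raise ValueError("vdw contacts need the sum of the two radii")
        return distance <= vdw_radii_sum
    raise ValueError(f"unknown interaction kind: {kind}")


def classify_residue(has_direct, has_water_mediated):
    """Assign an interfacial residue to one of the three classes.

    dry  -> direct interaction only
    dual -> direct and water-mediated interactions
    wet  -> water-mediated interaction only (a "wet spot")
    Returns None for residues that are not part of the interface.
    """
    if has_direct and has_water_mediated:
        return "dual"
    if has_direct:
        return "dry"
    if has_water_mediated:
        return "wet"
    return None


# Example: a residue whose only contact is a 3.0 A hydrogen bond to an
# interfacial water molecule that also binds the partner protein.
water_bridge = is_interaction("hbond", 3.0)
print(classify_residue(has_direct=False, has_water_mediated=water_bridge))
```

A purely distance-based interface definition would miss the residue in the example, which is the point the abstract makes about water-mediated contacts.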