1,416 research outputs found

    Exploration of Reaction Pathways and Chemical Transformation Networks

    For the investigation of chemical reaction networks, the identification of all relevant intermediates and elementary reactions is mandatory. Many algorithmic approaches exist that perform such explorations efficiently and in an automated fashion. These approaches differ in their application range, the level of completeness of the exploration, and the amount of heuristics and human intervention required. Here, we describe and compare the different approaches based on these criteria. Future directions leveraging the strengths of chemical heuristics, human interaction, and physical rigor are discussed. Comment: 48 pages, 4 figures

    Rational Design of Small-Molecule Inhibitors of Protein-Protein Interactions: Application to the Oncogenic c-Myc/Max Interaction

    Protein-protein interactions (PPIs) constitute an emerging class of targets for pharmaceutical intervention pursued by both industry and academia. Despite their fundamental role in many biological processes and diseases such as cancer, PPIs are still largely underrepresented in today's drug discovery. This dissertation describes novel computational approaches developed to facilitate the discovery and design of small-molecule inhibitors of PPIs, using the oncogenic c-Myc/Max interaction as a case study. First, we critically review current approaches and limitations to the discovery of small-molecule inhibitors of PPIs and we provide examples from the literature. Second, we examine the role of protein flexibility in molecular recognition and binding, and we review recent advances in the application of Elastic Network Models (ENMs) to modeling the global conformational changes of proteins observed upon ligand binding. The agreement between predicted soft modes of motion and structural changes experimentally observed upon ligand binding supports the view that ligand binding is facilitated, if not enabled, by the intrinsic (pre-existing) motions thermally accessible to the protein in the unliganded form. Third, we develop a new method for generating models of the bioactive conformations of molecules in the absence of protein structure, by identifying a set of conformations (from different molecules) that are most mutually similar in terms of both their shape and chemical features. We show how to solve the problem using an Integer Linear Programming formulation of the maximum edge-weight clique problem. In addition, we present the application of the method to known c-Myc/Max inhibitors. Fourth, we propose an innovative methodology for molecular mimicry design. 
We show how the structure of the c-Myc/Max complex was exploited to design compounds that mimic the binding interactions that Max makes with the leucine zipper domain of c-Myc. In summary, the approaches described in this dissertation constitute important contributions to the fields of computational biology and computer-aided drug discovery, which combine biophysical insights and computational methods to expedite the discovery of novel inhibitors of PPIs.
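    The conformer-selection problem described in the abstract above (one conformation per molecule, chosen so that the set is most mutually similar) can be illustrated with a small sketch. This is not the dissertation's Integer Linear Programming formulation: it swaps in plain exhaustive enumeration, which is only practical for toy inputs. The function name, the conformer ids, and all similarity scores are hypothetical.

```python
from itertools import combinations, product


def best_mutual_conformers(conformers, similarity):
    """Pick one conformer per molecule maximizing the summed pairwise similarity.

    conformers: dict mapping molecule name -> list of conformer ids
    similarity: dict mapping frozenset({id_a, id_b}) -> similarity score
    Exhaustive search stands in for an ILP solver here (assumption for illustration).
    """
    molecules = sorted(conformers)
    best_choice, best_score = None, float("-inf")
    # Enumerate every way of taking exactly one conformer from each molecule.
    for choice in product(*(conformers[m] for m in molecules)):
        # The clique weight is the sum over all pairs of chosen conformers.
        score = sum(similarity[frozenset(pair)] for pair in combinations(choice, 2))
        if score > best_score:
            best_choice, best_score = choice, score
    return dict(zip(molecules, best_choice)), best_score


# Made-up similarity scores between conformers of three molecules A, B, C.
conformers = {"A": ["a1", "a2"], "B": ["b1", "b2"], "C": ["c1"]}
similarity = {
    frozenset({"a1", "b1"}): 0.9, frozenset({"a1", "b2"}): 0.2,
    frozenset({"a2", "b1"}): 0.4, frozenset({"a2", "b2"}): 0.8,
    frozenset({"a1", "c1"}): 0.3, frozenset({"a2", "c1"}): 0.7,
    frozenset({"b1", "c1"}): 0.5, frozenset({"b2", "c1"}): 0.6,
}
selection, total = best_mutual_conformers(conformers, similarity)
print(selection, round(total, 2))
```

    Note that the highest single pair score (a1 with b1) is not part of the best set; the selection depends on the total weight of all pairs, which is what makes this a clique problem rather than a pairwise matching.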

    Computational structure‐based drug design: Predicting target flexibility

    The role of molecular modeling in drug design has experienced a significant revamp in the last decade. The increase in computational resources and molecular models, along with software developments, is finally introducing a competitive advantage in early phases of drug discovery. Small and medium-sized companies with a strong focus on computational chemistry are being created, some of them having introduced important leads in drug design pipelines. An important source of this success is the extraordinary development of faster and more efficient techniques for describing flexibility in three-dimensional structural molecular modeling. At different levels, from docking techniques to atomistic molecular dynamics, conformational sampling between receptor and drug results in improved predictions, such as screening enrichment and the discovery of transient cavities. In this review article we perform an extensive analysis of these modeling techniques, dividing them into high and low throughput, and emphasizing their application to drug design studies. We conclude the review with a section describing our Monte Carlo method, PELE, recently highlighted as an outstanding advance in an international blind competition and industrial benchmarks. We acknowledge the BSC-CRG-IRB Joint Research Program in Computational Biology. This work was supported by a grant from the Spanish Government CTQ2016-79138-R. J.I. acknowledges support from SVP-2014-068797, awarded by the Spanish Government. Peer Reviewed. Postprint (author's final draft)

    Theoretical-experimental study on protein-ligand interactions based on thermodynamics methods, molecular docking and perturbation models

    This doctoral thesis focuses on understanding the thermodynamic events of protein-ligand interactions, which have been of paramount importance in fields from traditional Medicinal Chemistry to Nanobiotechnology. Particular attention has been paid to the application of state-of-the-art methodologies to thermodynamic studies of protein-ligand interactions by integrating structure-based molecular docking techniques, classical fractal approaches to solve protein-ligand complementarity problems, perturbation models to study allosteric signal propagation, and predictive nano-quantitative structure-toxicity relationship models coupled with powerful experimental validation techniques. The contributions provided by this work could open new horizons in the fields of Drug Discovery, Materials Sciences, Molecular Diagnosis, and Environmental Health Sciences.

    Development of a normal mode-based geometric simulation approach for investigating the intrinsic mobility of proteins

    Specific functions of biological systems often require conformational transitions of macromolecules. Thus, being able to describe and predict conformational changes of biological macromolecules is not only important for understanding their impact on biological function, but will also have implications for the modelling of (macro)molecular complex formation and for structure-based drug design approaches. The “conformational selection model” provides the foundation for computational investigations of conformational fluctuations of the unbound protein state. These fluctuations may reveal conformational states adopted by the bound proteins. The aim of this work is to incorporate directional information in a geometry-based approach, in order to sample biologically relevant conformational space extensively. Coarse-grained normal mode (CGNM) approaches, e.g., the elastic network model (ENM) and rigid cluster normal mode analysis (RCNMA), have emerged recently and provide directions of intrinsic motions in terms of harmonic modes (also called normal modes). In my previous work and in other studies it has been shown that conformational changes upon ligand binding occur along a few low-energy modes of unbound proteins and can be efficiently calculated by CGNM approaches. In order to explore the validity and the applicability of CGNM approaches, a large-scale comparison of essential dynamics (ED) modes from molecular dynamics (MD) simulations and normal modes from CGNM was performed over a dataset of 335 proteins. Despite high coarse-graining, low-frequency normal modes from CGNM correlate very well with ED modes in terms of directions of motions (average maximal overlap is 0.65) and relative amplitudes of motions (average maximal overlap is 0.73). In order to exploit the potential of CGNM approaches, I have developed a three-step approach for efficient exploration of intrinsic motions of proteins. 
The first two steps are based on recent developments in rigidity and elastic network theory. Initially, static properties of the protein are determined by decomposing the protein into rigid clusters using the graph-theoretical approach FIRST at an all-atom representation of the protein. In a second step, dynamic properties of the molecule are revealed by the rotations-translations of blocks approach (RTB) using an elastic network model representation of the coarse-grained protein. In the final step, the recently introduced idea of constrained geometric simulations of diffusive motions in proteins is extended for efficient sampling of conformational space. Here, the low-energy (low-frequency) normal modes provided by the RCNMA approach are used to guide the backbone motions. The NMSim approach was validated on hen egg white lysozyme by comparing it to the previously mentioned simulation methods in terms of residue fluctuations, conformational space exploration, essential dynamics, sampling of side-chain rotamers, and structural quality. Residue fluctuations in the NMSim-generated ensemble are found to be in good agreement with MD fluctuations, with a correlation coefficient of around 0.79. A comparison of different geometry-based simulation approaches shows that FRODA is restricted in sampling the backbone conformational space, whereas CONCOORD is restricted in sampling the side-chain conformational space. NMSim sufficiently samples both the backbone and the side-chain conformations, taking experimental structures and conformations from state-of-the-art MD simulations as reference. The NMSim approach is also applied to a dataset of proteins where conformational changes have been observed experimentally, either in domain or functionally important loop regions. 
The NMSim simulations starting from the unbound structures are able to reach conformations similar to ligand-bound conformations (RMSD [...] 0.7) between the RMS fluctuations derived from NMSim-generated structures and two experimental structures are observed. Furthermore, intrinsic fluctuations in the NMSim simulation correlate with the region of loop conformational changes observed upon ligand binding in 2 out of 3 cases. The NMSim-generated pathway of conformational change from the unbound structure to the ligand-bound structure of adenylate kinase is validated by a comparison to experimental structures reflecting different states of the pathway as proposed by previous studies. Interestingly, the generated pathway confirms that the LID domain closure precedes the closing of the NMPbind domain, even if no target conformation is provided in NMSim. Hence, the results in this study show that incorporating directional information in the geometry-based approach NMSim improves the sampling of biologically relevant conformational space and provides a computationally efficient alternative to state-of-the-art MD simulations.
    Conformational changes of proteins are often a fundamental prerequisite for their biological function. Accurately characterizing and predicting these conformational changes is necessary for understanding their influence on function. One of the most frequently used and most accurate computational methods for this purpose is molecular dynamics (MD) simulation. However, MD simulations remain computationally very demanding and sample conformational space only to a limited extent. Efforts have therefore been made to develop alternative geometry-based methods (such as CONCOORD or FRODA) that rely on a reduced representation of proteins. 
The aim of this work is to integrate directional information into a geometry-based approach and thereby sample the biologically relevant conformational space exhaustively. This idea recently led to the development of coarse-grained normal mode (CGNM) methods, such as the elastic network model (ENM) and the rigid cluster normal mode analysis (RCNMA) that I developed in earlier work. Both methods provide the desired directional information on the intrinsic motions of a protein in the form of harmonic modes (also called normal modes). To examine the validity, robustness, and broad applicability of such CGNM methods, an extensive comparison between essential dynamics (ED) modes from MD simulations and normal modes from CGNM calculations was carried out as part of this dissertation. The underlying dataset contained 335 proteins. Although the CGNM methods use a highly simplified representation of proteins, their low-frequency modes correlate very well with ED modes in terms of direction of motion (average maximal overlap: 0.65) and amplitude of motion (average maximal overlap: 0.73). On average, the first quarter of the normal modes describes 85% of the space spanned by the first five ED modes. To determine the performance of CGNM methods more precisely, a three-step method for investigating the intrinsic dynamics of proteins was developed in the present study. The first two steps are based on the latest developments in rigidity theory and in the description of elastic networks. These are implemented in the RCNMA approach and allow the normal modes to be determined. In the last step, the motions of the protein backbone are directed along the low-energy normal modes generated by RCNMA. 
The side-chain conformations are generated by diffusive motions toward energetically favorable rotamers. This is an iterative process consisting of several smaller steps, in each of which intermediate conformations are generated. To validate the NMSim approach, it was compared with the other previously mentioned simulation methods using lysozyme as an example. The residue fluctuations from the NMSim-generated ensemble agree well with fluctuations computed from the MD simulation (correlation coefficient R = 0.79). A comparison of the different geometry-based simulation approaches shows that FRODA samples the conformational space of the protein backbone insufficiently. CONCOORD, in contrast, samples the conformational space of the side chains insufficiently. NMSim samples the conformational space of both the backbone and the side chains adequately when the conformations obtained experimentally and from MD simulations are used as reference. The NMSim approach was also applied to a dataset of proteins for which conformational changes in domains or in functionally important loop regions have been observed experimentally. In agreement with the conformational selection model, for four out of five proteins that show a domain motion the NMSim approach is able, starting from the unbound structure, to generate new conformations that correspond to the ligand-bound conformation (RMSD [...] 0.7) between the RMS fluctuations of the NMSim-generated conformations and two experimental structures in each case is reached. In addition, the intrinsic fluctuations of the NMSim simulation correlate in two out of three cases with the region of ligand-induced conformational change in the loops. 
The NMSim-generated pathway for the conformational change from the unbound structure to the ligand-bound structure of adenylate kinase was validated by comparison with experimental structures that reflect different states of the pathway. The various crystal structures that lie along the conformational change from the unbound to the ligand-bound structure are sampled on the pathway generated by NMSim. Interestingly, the generated pathway confirms that the closing motion of the LID domain precedes that of the NMPbind domain, even when no target conformation was used for the NMSim simulation.
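    The "overlap" figures quoted in the abstract above compare the direction of a normal mode with that of an essential dynamics mode. A minimal sketch of one common definition (the absolute cosine between two displacement vectors) follows; the thesis may use a different normalization, and the function names and the toy vectors are hypothetical.

```python
import math


def overlap(mode_a, mode_b):
    """Absolute cosine between two displacement vectors.

    Returns 0 for orthogonal directions and 1 for collinear ones.
    This definition is an assumption; the source does not spell out its formula.
    """
    dot = sum(a * b for a, b in zip(mode_a, mode_b))
    norm_a = math.sqrt(sum(a * a for a in mode_a))
    norm_b = math.sqrt(sum(b * b for b in mode_b))
    return abs(dot) / (norm_a * norm_b)


def maximal_overlap(reference_mode, candidate_modes):
    """Best overlap between one reference mode and any candidate mode."""
    return max(overlap(reference_mode, m) for m in candidate_modes)


# Toy system with two beads; each vector is [x1, y1, z1, x2, y2, z2].
ed_mode = [1.0, 0.0, 0.0, 1.0, 0.0, 0.0]
normal_modes = [
    [0.0, 1.0, 0.0, 0.0, -1.0, 0.0],  # orthogonal to the ED mode
    [1.0, 0.0, 0.0, 0.8, 0.6, 0.0],   # mostly along the ED mode
]
best = maximal_overlap(ed_mode, normal_modes)
print(round(best, 3))
```

    Averaging such maximal overlaps over many ED modes and many proteins would give a summary statistic of the kind reported in the abstract.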

    Open Boundary Simulations of Proteins and Their Hydration Shells by Hamiltonian Adaptive Resolution Scheme

    The recently proposed Hamiltonian Adaptive Resolution Scheme (H-AdResS) allows molecular simulations to be performed in an open boundary framework. It makes it possible to change on the fly the resolution of a specific subset of molecules (usually the solvent), which are free to diffuse between the atomistic region and the coarse-grained reservoir. So far, the method has been successfully applied to pure liquids. Coupling the H-AdResS methodology to hybrid models of proteins, such as the Molecular Mechanics/Coarse-Grained (MM/CG) scheme, is a promising approach for rigorous calculations of ligand binding free energies in low-resolution protein models. Towards this goal, here we apply H-AdResS for the first time to two atomistic proteins in dual-resolution solvent, showing that it reproduces the structural and dynamic properties of both the proteins and the solvent, as obtained from atomistic simulations. Comment: This document is the Accepted Manuscript version of a Published Work that appeared in final form in Journal of Chemical Theory and Computation, copyright © American Chemical Society after peer review and technical editing by the publisher

    Predicting biomolecular function from 3D dynamics : sequence-sensitive coarse-grained elastic network model coupled to machine learning

    The structural dynamics of biomolecules are intimately linked to their function but very costly to study experimentally. For this reason, many computational methodologies have been developed to simulate biomolecular structural dynamics. However, when modelling the effects of thousands of mutations, classical simulation methods such as molecular dynamics, whether at the atomic or the coarse-grained scale, are too costly for most applications. On the other hand, normal mode analysis of coarse-grained elastic network models (ENMs) is very fast and provides analytical solutions spanning all timescales. However, most ENMs consider only the geometry of the biomolecular backbone, which makes them poor choices for studying the effects of mutations that do not change this geometry. The Elastic Network Contact Model (ENCoM) is the first ENM sensitive to the sequence of the biomolecule under study, which makes it usable for the efficient exploration of complete conformational spaces of sequence variants. The present thesis introduces the ENCoM-DynaSig-ML computational pipeline, which reduces the conformational spaces predicted by ENCoM to Dynamical Signatures that are then used to train simple machine learning models. ENCoM-DynaSig-ML can predict the function of sequence variants with significant accuracy, is complementary to all existing methods, and can generate new hypotheses about the elements of structural dynamics that matter for a given molecular function. We present three examples of studying sequence-dynamics-function relationships: microRNA maturation, the activation potential of mu-opioid receptor ligands, and the enzymatic efficiency of the VIM-2 lactamase enzyme. 
This novel application of normal mode analysis is fast, requiring only a few seconds of computing time per sequence variant, and is generalizable to any biomolecule for which experimental mutagenesis data are available.
    The dynamics of biomolecules are intimately tied to their functions but experimentally elusive, making their computational study attractive. When modelling the effects of thousands of mutations, time-stepping methods such as classical or enhanced sampling molecular dynamics are too costly for most applications. On the other hand, normal mode analysis of coarse-grained elastic network models (ENMs) provides fast analytical dynamics spanning all timescales. However, the vast majority of ENMs consider backbone geometry alone, making them a poor choice to study point mutations which do not affect the equilibrium structure. The Elastic Network Contact Model (ENCoM) is the first sequence-sensitive ENM, enabling its use for the efficient exploration of full conformational spaces from sequence variants. The present work introduces the ENCoM-DynaSig-ML computational pipeline, in which the ENCoM conformational spaces are reduced to Dynamical Signatures and coupled to simple machine learning algorithms. ENCoM-DynaSig-ML predicts the function of sequence variants with significant accuracy, is complementary to all existing methods, and can generate new hypotheses about which dynamical features are important for the studied biomolecule's function. Examples given are the maturation efficiency of microRNA variants, the activation potential of mu-opioid receptor ligands and the effect of point mutations on VIM-2 lactamase's enzymatic efficiency. This novel application of normal mode analysis is very fast, taking a few seconds of CPU time per variant, and is generalizable to any biomolecule for which experimental mutagenesis data exist.
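    The idea of reducing a set of normal modes to a per-position signature, as described in the abstract above, can be sketched as follows. This assumes the signature is a profile of predicted fluctuations in which each mode contributes its squared displacement divided by its eigenvalue; the abstract does not give the exact ENCoM definition, and the function name and toy numbers are hypothetical.

```python
def fluctuation_signature(eigenvalues, eigenvectors):
    """Per-bead fluctuation profile computed from nonzero normal modes.

    eigenvalues: list of positive mode eigenvalues
    eigenvectors: list of modes, each a flat list [x1, y1, z1, x2, y2, z2, ...]
    Each mode contributes |displacement|^2 / eigenvalue to every bead, so soft
    (low-eigenvalue) modes dominate. This weighting is an assumed, commonly
    used choice, not necessarily the one used by ENCoM.
    """
    n_beads = len(eigenvectors[0]) // 3
    signature = [0.0] * n_beads
    for lam, vec in zip(eigenvalues, eigenvectors):
        for i in range(n_beads):
            x, y, z = vec[3 * i: 3 * i + 3]
            signature[i] += (x * x + y * y + z * z) / lam
    return signature


# Toy input: two beads and two made-up modes (not derived from a real structure).
eigenvalues = [1.0, 4.0]
eigenvectors = [
    [1.0, 0.0, 0.0, 0.0, 0.0, 0.0],  # soft mode moving bead 1 only
    [0.0, 0.0, 0.0, 0.0, 1.0, 0.0],  # stiffer mode moving bead 2 only
]
signature = fluctuation_signature(eigenvalues, eigenvectors)
print(signature)
```

    One such vector per sequence variant could then serve as the feature row for a simple regression model trained on experimental mutagenesis data.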

    Computational development of rubromycin-based lead compounds for HIV-1 reverse transcriptase inhibition

    The binding of several rubromycin-based ligands to HIV-1 reverse transcriptase was analyzed using molecular docking and molecular dynamics simulations. MM-PBSA analysis and examination of the trajectories allowed the identification of several promising compounds with predicted high affinity towards reverse transcriptase mutants that have proven resistant to current drugs. Important insights into the complex interplay of factors determining the ability of ligands to selectively target each mutant have been obtained.