176 research outputs found

    Doctor of Philosophy

    Over 40 years ago, the first computer simulation of a protein was reported: the atomic motions of a 58 amino acid protein were simulated for a few picoseconds. With today's supercomputers, simulations of large biomolecular systems with hundreds of thousands of atoms can reach biologically significant timescales. Through dynamics information, biomolecular simulations can provide new insights into molecular structure and function to support the development of new drugs or therapies. While the recent advances in high-performance computing hardware and computational methods have enabled scientists to run longer simulations, they also created new challenges for data management. Investigators need to use local and national resources to run these simulations and store their output, which can reach terabytes of data on disk. Because of the wide variety of computational methods and software packages available to the community, no standard data representation has been established to describe the computational protocol and the output of these simulations, preventing data sharing and collaboration. Data exchange is also limited due to the lack of repositories and tools to summarize, index, and search biomolecular simulation datasets. In this dissertation a common data model for biomolecular simulations is proposed to guide the design of future databases and APIs. The data model was then extended to a controlled vocabulary that can be used in the context of the semantic web. Two different approaches to data management are also proposed. The iBIOMES repository offers a distributed environment where input and output files are indexed via common data elements. The repository includes a dynamic web interface to summarize, visualize, search, and download published data. A simpler tool, iBIOMES Lite, was developed to generate summaries of datasets hosted at remote sites where user privileges and/or IT resources might be limited. These two informatics-based approaches to data management offer new means for the community to keep track of distributed and heterogeneous biomolecular simulation data and create collaborative networks.

    Investigation of Structural Dynamics of Enzymes and Protonation States of Substrates Using Computational Tools.

    This review discusses the use of molecular modeling tools, together with existing experimental findings, to provide a complete atomic-level description of enzyme dynamics and function. We focus on functionally relevant conformational dynamics of enzymes and the protonation states of substrates. The conformational fluctuations of enzymes usually play a crucial role in substrate recognition and catalysis. Protein dynamics can be altered by a tiny change in a molecular system, such as different protonation states of various intermediates, or by a significant perturbation, such as a ligand association. Here we review recent advances in applying atomistic molecular dynamics (MD) simulations to investigate allosteric and network regulation of tryptophan synthase (TRPS) and protonation states of its intermediates and catalysis. In addition, we review studies using quantum mechanics/molecular mechanics (QM/MM) methods to investigate the protonation states of catalytic residues of β-Ketoacyl ACP synthase I (KasA). We also discuss modeling of large-scale protein motions for HIV-1 protease with coarse-grained Brownian dynamics (BD) simulations.

    DFFR: A New Method for High-Throughput Recalibration of Automatic Force-Fields for Drugs

    We present drug force-field recalibration (DFFR), a new method for refining automatic force-fields used to represent small drugs in docking and molecular dynamics simulations. The method is based on fine-tuning of torsional terms to obtain ensembles that reproduce observables derived from reference data. DFFR is fast and flexible and can be easily automated for a high-throughput regime, making it useful in drug-design projects. We tested the performance of the method on a few model systems and also on a variety of druglike molecules using reference data derived from: (i) density functional theory coupled to self-consistent reaction field (DFT/SCRF) calculations on highly populated conformers and (ii) enhanced sampling quantum mechanical/molecular mechanics (QM/MM) where the drug is reproduced at the QM level, while the solvent is represented by classical force-fields. Extension of the method to include other sources of reference data is discussed.

    A simplified charge projection scheme for long-range electrostatics in ab initio QM/MM calculations

    In a previous work [Pan et al., Molecules 23, 2500 (2018)], a charge projection scheme was reported, where outer molecular mechanical (MM) charges [>10 Å from the quantum mechanical (QM) region] were projected onto the electrostatic potential (ESP) grid of the QM region to accurately and efficiently capture long-range electrostatics in ab initio QM/MM calculations. Here, a further simplification to the model is proposed, where the outer MM charges are projected onto inner MM atom positions (instead of ESP grid positions). This enables a representation of the long-range MM electrostatic potential via augmentary charges (AC) on inner MM atoms. Combined with the long-range electrostatic correction function from Cisneros et al. [J. Chem. Phys. 143, 044103 (2015)] to smoothly switch between inner and outer MM regions, this new QM/MM-AC electrostatic model yields accurate and continuous ab initio QM/MM electrostatic energies with a 10 Å cutoff between inner and outer MM regions. This model enables efficient QM/MM cluster calculations with a large number of MM atoms as well as QM/MM calculations with periodic boundary conditions.

    Revealing quantum mechanical effects in enzyme catalysis with large-scale electronic structure simulation

    Enzymes have evolved to facilitate challenging reactions at ambient conditions with specificity seldom matched by other catalysts. Computational modeling provides valuable insight into catalytic mechanism, and the large size of enzymes mandates multi-scale, quantum mechanical-molecular mechanical (QM/MM) simulations. Although QM/MM plays an essential role in balancing simulation cost to enable sampling with the full QM treatment needed to understand electronic structure in enzyme active sites, the relative importance of these two strategies for understanding enzyme mechanism is not well known. We explore challenges in QM/MM for studying the reactivity and stability of three diverse enzymes: i) Mg²⁺-dependent catechol O-methyltransferase (COMT), ii) radical enzyme choline trimethylamine lyase (CutC), and iii) DNA methyltransferase (DNMT1), which has structural Zn²⁺ binding sites. In COMT, strong non-covalent interactions lead to long range coupling of electronic structure properties across the active site, but the more isolated nature of the metallocofactor in DNMT1 leads to faster convergence of some properties. We quantify these effects in COMT by computing covariance matrices of by-residue electronic structure properties during dynamics and along the reaction coordinate. In CutC, we observe spontaneous bond cleavage following initiation events, highlighting the importance of sampling and dynamics. We use electronic structure analysis to quantify the relative importance of CH···O and OH···O non-covalent interactions in imparting reactivity. These three diverse cases enable us to provide some general recommendations regarding QM/MM simulation of enzymes. Funding: NEC Corporation; National Institute of Environmental Health Sciences (Grant P30-ES002109); Burroughs Wellcome Fund (Career Award at the Scientific Interface); United States Department of Energy (Computational Science Graduate Fellowship).

    2,6-diaminopurine promotes repair of DNA lesions under prebiotic conditions

    High-yielding and selective prebiotic syntheses of RNA and DNA nucleotides involve UV irradiation to promote the key reaction steps and eradicate biologically irrelevant isomers. While these syntheses were likely enabled by a UV-rich prebiotic environment, UV-induced formation of photodamage in polymeric nucleic acids, such as cyclobutane pyrimidine dimers (CPDs), remains the key unresolved issue for the origins of RNA and DNA on Earth. Here, we demonstrate that substitution of adenine with 2,6-diaminopurine enables repair of CPDs with yields reaching 92%. This substantial self-repairing activity originates from the excellent electron-donating properties of 2,6-diaminopurine in nucleic acid strands. We also show that the deoxyribonucleosides of 2,6-diaminopurine and adenine can be formed under the same prebiotic conditions. Considering that 2,6-diaminopurine was previously shown to increase the rate of nonenzymatic RNA replication, this nucleobase could have played critical roles in the formation of functional and photostable RNA/DNA oligomers in UV-rich prebiotic environments.

    A new generation of user-friendly and machine learning-accelerated methods for protein pKa calculations

    The ability to sense and react to external and internal pH changes is a survival requirement for any cell. pH homeostasis is tightly regulated, and even minor disruptions can severely impact cell metabolism, function, and survival. The pH dependence of proteins can be attributed to only 7 out of the 20 canonical amino acids, the titratable amino acids that can exchange protons with water in the usual 0-14 pH range. These amino acids make up approximately 31% of all amino acids in the human proteome, meaning that, on average, roughly one-third of each protein is sensitive not only to the medium pH but also to alterations in the electrostatics of its surroundings. Unsurprisingly, protonation switches have been associated with a wide array of protein behaviors, including modulating the binding affinity in protein-protein, protein-ligand, or protein-lipid systems, modifying enzymatic activity and function, and even altering their stability and subcellular location. Despite its importance, our molecular understanding of pH-dependent effects in proteins and other biomolecules is still very limited, particularly in big macromolecular complexes such as protein-protein or membrane protein systems. Over the years, several classes of methods have been developed to provide molecular insights into the protonation preference and dependence of biomolecules. Empirical methods offer cheap and competitive predictions for time- or resource-constrained situations. Albeit more computationally expensive, continuum electrostatics (CE)-based methods are a cost-effective solution for estimating microscopic equilibrium constants, pKhalf and macroscopic pKa. To study pH-dependent conformational transitions, constant-pH molecular dynamics (CpHMD) is the appropriate methodology. Unfortunately, given the computational cost and, in many cases, the difficulty associated with using CE-based and CpHMD methods, most researchers overuse empirical methods or neglect the effect of pH in their studies.
Here, we address these issues by proposing multiple pKa predictor methods and tools with different levels of theory designed to be faster and accessible to more users. First, we introduced PypKa, a flexible tool to predict Poisson–Boltzmann/Monte Carlo-based (PB/MC) pKa values of titratable sites in proteins. It was validated with a large set of experimental values exhibiting a competitive performance. PypKa supports CPU parallel computing and can be used directly on proteins obtained from the Protein Data Bank (PDB) repository or molecular dynamics (MD) simulations. A simple, reusable, and extensible Python API is provided, allowing pKa calculations to be easily added to existing protocols with a few extra lines of code. This capability was exploited in the development of PypKa-MD, an easy-to-use implementation of the stochastic titration CpHMD method. PypKa-MD supports GROMOS and CHARMM force fields, as well as modern versions of GROMACS. Using PypKa’s API and consequent abstraction of PB/MC contributed to its greatly simplified modular architecture that will serve as the foundation for future developments. The new implementation was validated on alanine-based tetrapeptides with closely interacting titratable residues and four commonly used benchmark proteins, displaying highly similar and correlated pKa predictions compared to a previously validated implementation. Like most structural-based computational studies, the majority of pKa calculations are performed on experimental structures deposited in the PDB. Furthermore, there is an ever-growing imbalance between scarce experimental pKa values and the increasingly higher number of resolved structures. To save countless hours and resources that would be spent on repeated calculations, we have released pKPDB, a database of over 12M theoretical pKa values obtained by running PypKa over 120k protein structures from the PDB. 
The precomputed pKa estimations can be retrieved instantaneously via our web application, the PypKa Server. In case the protein of interest is not in the pKPDB, the user may easily run PypKa in the cloud either by uploading a custom structure or submitting an identifier code from the PDB or UniProtKB. It is also possible to use the server to get structures with representative pH-dependent protonation states to be used in other computational methods such as molecular dynamics. The advent of artificial intelligence in biological sciences presented an opportunity to drastically accelerate pKa predictors using our previously generated database of pKa values. With pKAI, we introduced the first deep learning-based predictor of pKa shifts in proteins trained on continuum electrostatics data. By combining a reasonable understanding of the underlying physics, an accuracy comparable to that of physics-based methods, and inference time speedups of more than 1000×, pKAI provided a game-changing solution for fast estimations of macroscopic pKa from ensembles of microscopic values. However, several limitations needed to be addressed before its integration within the CpHMD framework as a replacement for PypKa. Hence, we proposed a new graph neural network for protein pKa predictions suitable for CpHMD, pKAI-MD. This model estimates pH-independent energies to be used in a Monte Carlo routine to sample representative microscopic protonation states. While developing the new model, we explored different graph representations of proteins using multiple electrostatics-driven properties. While there are certainly many new features to be introduced and many developments to be pursued, the selection of methods and tools presented in this work poses a significant improvement over the alternatives and effectively constitutes a new generation of user-friendly and machine learning-accelerated methods for pKa calculations.
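
As a rough illustration of the idea of sampling protonation states with a Monte Carlo routine, the following minimal sketch runs Metropolis sampling on a single, isolated two-state titratable site and compares the result with the Henderson-Hasselbalch fraction. It is not taken from PypKa, PypKa-MD, or pKAI-MD: the function names, the single-site model, and the parameters are all illustrative assumptions, and real PB/MC calculations also include site-site interaction energies.

```python
import math
import random

LN10 = math.log(10.0)


def protonated_fraction(pka, ph):
    """Henderson-Hasselbalch fraction of an isolated site that is protonated."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


def sample_protonation(pka, ph, n_steps=200_000, seed=0):
    """Metropolis Monte Carlo over one two-state site (hypothetical helper).

    The reduced energy (in units of kT) of the protonated state relative to
    the deprotonated state is ln(10) * (pH - pKa).
    """
    rng = random.Random(seed)
    delta_u = LN10 * (ph - pka)  # U(protonated) - U(deprotonated)
    protonated = False
    count = 0
    for _ in range(n_steps):
        # Energy change of the proposed flip from the current state.
        du = -delta_u if protonated else delta_u
        # Metropolis criterion: always accept downhill moves,
        # accept uphill moves with probability exp(-du).
        if du <= 0.0 or rng.random() < math.exp(-du):
            protonated = not protonated
        count += protonated
    return count / n_steps


if __name__ == "__main__":
    # Hypothetical site with pKa 4.0 titrated at pH 5.0.
    print("analytic:", protonated_fraction(4.0, 5.0))
    print("sampled: ", sample_protonation(4.0, 5.0))
```

For a single site the sampled fraction should approach the analytic value; the sampling approach becomes useful once many coupled sites make direct enumeration of protonation states impractical.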

    Ultrafast Backbone Protonation in Channelrhodopsin-1 Captured by Polarization Resolved Fs Vis-pump - IR-Probe Spectroscopy and Computational Methods

    Channelrhodopsins (ChR) are light-gated ion channels heavily used in optogenetics. Upon light excitation, an ultrafast all-trans to 13-cis isomerization of the retinal chromophore takes place. It is still uncertain by what means this reaction leads to further protein changes and channel conductivity. Channelrhodopsin-1 in Chlamydomonas augustae exhibits a 100 fs photoisomerization and a protonated counterion complex. By polarization resolved ultrafast spectroscopy in the mid-IR we show that the initial reaction of the retinal is accompanied by changes in the protein backbone and ultrafast protonation changes at the counterion complex comprising Asp299 and Glu169. In combination with homology modelling and quantum mechanics/molecular mechanics (QM/MM) geometry optimization we assign the protonation dynamics to ultrafast deprotonation of Glu169, and transient protonation of the Glu169 backbone, followed by a proton transfer from the backbone to the carboxylate group of Asp299 on a timescale of tens of picoseconds. The second proton transfer is not related to retinal dynamics and reflects pure protein changes in the first photoproduct. We assume these protein dynamics to be the first steps in a cascade of protein-wide changes resulting in channel conductivity.

    Multiscale QM/MM modelling of catalytic systems with ChemShell

    Hybrid quantum mechanical/molecular mechanical (QM/MM) methods are a powerful computational tool for the investigation of all forms of catalysis, as they allow for an accurate description of reactions occurring at catalytic sites in the context of a complicated electrostatic environment. The scriptable computational chemistry environment ChemShell is a leading software package for QM/MM calculations, providing a flexible, high performance framework for modelling both biomolecular and materials catalysis. We present an overview of recent applications of ChemShell to problems in catalysis and review new functionality introduced into the redeveloped Python-based version of ChemShell to support catalytic modelling. These include a fully guided workflow for biomolecular QM/MM modelling, starting from an experimental structure, a periodic QM/MM embedding scheme to support modelling of metallic materials, and a comprehensive set of tutorials for biomolecular and materials modelling.