90 research outputs found

    Constrained optimization applied to multiscale integrative modeling

    Multiscale integrative modeling stands at the intersection of experimental and computational techniques to predict the atomistic structures of important macromolecules. In the integrative modeling process, experimental information is often integrated with energy potentials and macromolecular substructures to derive realistic structural models. This heterogeneous information is typically combined into a global objective function that quantifies the quality of the structural models and is minimized through optimization. To balance the contributions of the terms that make up the global function, weight constants are assigned to each term through a computationally demanding process. To alleviate this common issue, we propose switching from the traditional paradigm of a single unconstrained global objective function to a constrained optimization scheme. The work presented in this thesis describes the applications and methods associated with the development of a general constrained optimization protocol for multiscale integrative modeling. The initial implementation concerned the prediction of symmetric macromolecular assemblies through the incorporation of a recent, efficient constrained optimizer nicknamed mViE (memetic Viability Evolution) into our integrative modeling protocol power (parallel optimization workbench to enhance resolution). We tested this new approach through rigorous comparisons against other state-of-the-art integrative modeling methods on a benchmark set of solved symmetric macromolecular assemblies. In this process, we validated the robustness of the constrained optimization method by obtaining native-like structural models. This constrained optimization protocol was then applied to predict the structure of the elusive human Huntingtin protein.
Because little structural information was available when the project was initiated, we integrated information from secondary structure prediction and low-resolution experiments, in the form of cryo-electron microscopy maps and crosslinking mass spectrometry data, to derive a structural model of Huntingtin. The structure resulting from this integrative modeling approach was used to derive dynamic information about the Huntingtin protein. At a finer level of resolution, the constrained optimization protocol was then applied to dock small molecules inside the binding sites of protein targets. We converted the classical molecular docking problem from an unconstrained single-objective optimization to a constrained one by extracting local and global constraints from pre-computed energy grids. The new approach was tested and validated on standard ligand-receptor benchmark sets widely used by the molecular docking community, and showed results comparable to state-of-the-art molecular docking programs. Altogether, the work presented in this thesis proposes improvements in the field of multiscale integrative modeling, reflected both in the quality of the models returned by the new constrained optimization protocol and in the simpler treatment of the uncorrelated terms that make up the global scoring scheme used to estimate model quality.
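
The contrast this abstract draws between a weighted global objective and a constrained formulation can be sketched in a few lines. This is a minimal illustration only: the term names, limits, weights, and the feasibility-first comparison rule (a common constraint-handling heuristic) are assumptions made for the example, not the mViE optimizer or the power protocol.

```python
from typing import Dict

# Hypothetical per-model terms: an energy to minimize plus two restraint terms.
Model = Dict[str, float]

def weighted_score(model: Model, weights: Dict[str, float]) -> float:
    """Traditional scheme: one global objective, each term scaled by a tuned weight."""
    return sum(weights[name] * model[name] for name in weights)

def total_violation(model: Model, limits: Dict[str, float]) -> float:
    """Sum of how far each constrained term exceeds its allowed limit."""
    return sum(max(0.0, model[name] - limit) for name, limit in limits.items())

def constrained_better(a: Model, b: Model, limits: Dict[str, float],
                       objective: str = "energy") -> bool:
    """Feasibility-first rule (assumed): feasible beats infeasible, infeasible
    models are ranked by violation, feasible models by the objective."""
    va, vb = total_violation(a, limits), total_violation(b, limits)
    if va == 0.0 and vb == 0.0:
        return a[objective] < b[objective]
    return va < vb

# Two made-up candidate models.
a = {"energy": -120.0, "map_misfit": 0.30, "crosslink_violation": 0.0}
b = {"energy": -150.0, "map_misfit": 0.90, "crosslink_violation": 2.0}

weights = {"energy": 1.0, "map_misfit": 10.0, "crosslink_violation": 1.0}
limits = {"map_misfit": 0.5, "crosslink_violation": 0.0}

prefers_b_weighted = weighted_score(b, weights) < weighted_score(a, weights)
prefers_a_constrained = constrained_better(a, b, limits)
```

With these made-up numbers the weighted sum prefers model `b` because its low energy outweighs its restraint violations, whereas the constrained comparison prefers `a`, the only model that satisfies both limits. No weights need tuning in the second case, which is the simplification the abstract argues for.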

    Network Models for Materials and Biological Systems

    The properties of materials depend heavily on the spatial distribution and connectivity of their constituent parts. This applies equally to materials such as diamond and glasses as it does to biomolecules that are the product of billions of years of evolution. In science, insight is often gained through simple models whose characteristics result from the few features that have purposely been retained. Common to all research within this thesis is the use of network-based models to describe the properties of materials. This work begins with the description of a technique for decoupling boundary effects from intrinsic properties of nanomaterials that maps the atomic distribution of nanomaterials of diverse shape and size but common atomic geometry onto a universal curve. This is followed by an investigation of correlated density fluctuations in the large length scale limit in amorphous materials through the analysis of large continuous random network models. The difficulty of estimating this limit from finite models is overcome by the development of a technique that uses the variance in the number of atoms in finite subregions to perform the extrapolation to large length scales. The technique is applied to models of amorphous silicon and vitreous silica and compared with results from recent experiments. The latter part of this work applies network-based models to biological systems. The first application models force-induced protein unfolding as crack propagation on a constraint network consisting of interactions such as hydrogen bonds that cross-link and stabilize a folded polypeptide chain. Unfolding pathways generated by the model are compared with molecular dynamics simulation and experiment for a diverse set of proteins, demonstrating that the model is able to capture not only native state behavior but also partially unfolded intermediates far from the native state.
This study concludes with the extension of the latter model in the development of an efficient algorithm for predicting protein structure through the flexible fitting of atomic models to low-resolution cryo-electron microscopy data. By optimizing the fit to synthetic data through directed sampling and context-dependent constraint removal, predictions are made with accuracies within the expected variability of the native state. Dissertation/Thesis, Ph.D., Physics 201
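
The quantity this abstract mentions, the variance in the number of atoms found in finite subregions, can be sketched as follows. The sketch assumes a periodic cubic box, spherical subregions with radius at most half the box length, and uniformly random points standing in for a network model. It shows only the counting step, not the extrapolation to large length scales described in the thesis.

```python
import random
from statistics import mean, pvariance

def count_in_sphere(points, center, radius, box):
    """Count points within `radius` of `center`, using the minimum-image
    convention in a periodic cubic box of side `box` (radius <= box / 2)."""
    r2 = radius * radius
    n = 0
    for p in points:
        d2 = 0.0
        for x, c in zip(p, center):
            d = abs(x - c)
            d = min(d, box - d)  # wrap across the periodic boundary
            d2 += d * d
        if d2 <= r2:
            n += 1
    return n

def number_variance(points, radius, box, samples=200, seed=0):
    """Mean and variance of the atom count over randomly placed spheres."""
    rng = random.Random(seed)
    counts = []
    for _ in range(samples):
        center = [rng.uniform(0.0, box) for _ in range(3)]
        counts.append(count_in_sphere(points, center, radius, box))
    return mean(counts), pvariance(counts)

# Stand-in configuration: 500 uniformly random points in a box of side 10.
rng = random.Random(1)
points = [[rng.uniform(0.0, 10.0) for _ in range(3)] for _ in range(500)]
avg_count, var_count = number_variance(points, radius=2.0, box=10.0)
```

Repeating the measurement over a range of radii gives the variance as a function of subregion size, which is the kind of data an extrapolation procedure would start from.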

    Open-Source Workflows for Reproducible Molecular Simulation

    We apply molecular simulation to predict the equilibrium structure of organic molecular aggregates and how these structures determine material properties, with a focus on software engineering practices for ensuring correctness. Because simulations are implemented in software, there is potential for authentic scientific reproducibility in such work: an entire experimental apparatus (codebase) can be given to another investigator, who should be able to use the same processes to find the same answers. Yet in practice, many barriers stand in the way of reproducible molecular simulations, which we address through automation, generalization, and software packaging. Collaboration on and application of the Molecular Simulation and Design Framework (MoSDeF) features prominently. We present structural investigations of organic molecule aggregates and the development of infrastructure and workflows that help manage, initialize, and analyze molecular simulation results through the following scientific applications: (1) a screening study in which we validate that self-assembled poly-3-hexylthiophene (P3HT) morphologies show the same state dependency as in prior work, and (2) a multi-university collaborative reproducibility study in which we examine modeling choices that give rise to differences between simulation engines. In aggregate, we reinforce the need for pipelines and practices emphasizing transferability, reproducibility, usability, and extensibility in molecular simulation.

    Characterization of Coenzyme Q Biosynthesis Proteins through Integrative Modeling at the Protein-Membrane Interface

    Integral and peripheral membrane proteins account for one-third of the human proteome, and they are estimated to represent the target for over 50% of modern medicinal drugs. Despite their central role in medicine, the complex, heterogeneous and dynamic nature of biological membranes complicates the investigation of their mechanism of action by both experimental and computational techniques. Among the different membrane bound compartments in eukaryotic cells, mitochondria are highly complex in form and function, and they harbor a unique proteome that remains largely unexplored. A growing number of inherited metabolic diseases are associated with mitochondrial dysfunction, which necessitates the structural and functional elucidation of mitochondrial proteins. In this thesis, we combine experimental and computational methods to explore the activity of COQ8 and COQ9, two functionally elusive proteins of the biosynthetic complex that produces coenzyme Q, a redox-active lipid component of the mitochondrial electron transport chain. (i) Conserved Lipid Modulation of Ancient Kinase-Like UbiB Family Member COQ8. We demonstrate that COQ8 has an ATPase function that is activated when it specifically associates with cardiolipin-containing membranes. We identify its interaction surface with the inner mitochondrial membrane, which gives hints about the possible interaction surfaces with other members of the coenzyme Q synthesis machinery and has implications on how it mediates functional interactions with lipids. Collectively, this work reveals how the positioning of COQ8 on the inner mitochondrial membrane is key to its activation, and therefore advances our understanding of the COQ8 function. (ii) Membrane, Lipid, and Protein Interactions of Coenzyme Q Biosynthesis Protein COQ9. 
We explore the lipid binding activity of COQ9, and we reveal that COQ9 repurposes an ancient bacterial fold to selectively bind aromatic isoprenes, including CoQ intermediates that reside within the bilayer. We elucidate the mechanistic details of its membrane binding process, by which COQ9 warps the membrane surface and creates a tightly sealed hydrophobic region to access its lipid cargo. Finally, we establish a potential molecular interface between COQ9 and COQ7, the enzyme that catalyzes the penultimate step in CoQ biosynthesis, suggesting a model whereby COQ9 presents intermediates to CoQ enzymes to overcome the hydrophobic barrier of the membrane. Collectively, our results provide a mechanism for how a lipid binding protein might access, select, and extract specific cargo from a membrane and present it to a peripheral membrane enzyme. In conclusion, our work illustrates the interplay between experiment and modeling in protein research, specifically in understanding how proteins act in direct synergy with membrane environments. We anticipate that our integrative methodologies and mechanistic findings will prove relevant to other membrane proteins, whose fine functional modulation at the membrane-water interface has been historically challenging to characterize.

    Edwards statistical mechanics for jammed granular matter

    International audience

    Connectable Components for Protein Design

    Protein design requires reusable, trustworthy, and connectable parts in order to scale to complex challenges. The recent explosion of protein structures stored within the Protein Data Bank provides a wealth of small motifs we can harvest, but we still lack tools to combine them into larger proteins. Here I explore two approaches for connecting reusable protein components on two different length scales. On the atomic scale, I build an interactive search engine for connecting chemical fragments together. Protein fragments built using this search engine recapitulate native-like protein assemblies that can be integrated into existing protein scaffolds using backbone search engines such as MaDCaT. On the protein domain scale, I quantitatively dissect structural variations in two-component systems in order to extract general principles for engineering interfacial flexibility between modular four-helix bundles. These bundles exhibit large scissoring motions in which helices move towards or away from the bundle axis, and these motions propagate across domain boundaries. Together, these two approaches form the beginnings of a multiscale methodology for connecting reusable protein fragments, with constant interplay and feedback between the design of atomic structure, secondary structure, and tertiary structure. Rapid iteration, visualization, and search glue these diverse length scales together into a cohesive whole.

    Computational Approaches to Simulation and Analysis of Large Conformational Transitions in Proteins

    In a typical living cell, millions to billions of proteins (nanomachines that fluctuate and cycle among many conformational states) convert available free energy into mechanochemical work. A fundamental goal of biophysics is to ascertain how 3D protein structures encode specific functions, such as catalyzing chemical reactions or transporting nutrients into a cell. Protein dynamics span femtosecond timescales (i.e., covalent bond oscillations) to large conformational transition timescales in, and beyond, the millisecond regime (e.g., glucose transport across a phospholipid bilayer). Actual transition events are fast but rare, occurring orders of magnitude faster than typical metastable equilibrium waiting times. Equilibrium molecular dynamics (EqMD) can capture atomistic detail and solute-solvent interactions, but even the microseconds of sampling attainable today fall orders of magnitude short of transition timescales, especially for large systems, rendering observations of such "rare events" difficult or effectively impossible. Advanced path-sampling methods exploit reduced physical models or biasing to produce plausible transitions while balancing accuracy and efficiency, but quantifying their accuracy relative to other numerical and experimental data has been challenging. Indeed, new horizons in elucidating protein function necessitate that present methodologies be revised to more seamlessly and quantitatively integrate a spectrum of methods, both numerical and experimental. In this dissertation, experimental and computational methods are put into perspective using the enzyme adenylate kinase (AdK) as an illustrative example. We introduce Path Similarity Analysis (PSA), an integrative computational framework developed to quantify transition path similarity.
PSA not only reliably distinguished AdK transitions by the originating method, but also traced pathway differences between two methods back to charge-charge interactions (neglected by the stereochemical model, but not the all-atom force field) in several conserved salt bridges. Cryo-electron microscopy maps of the transporter Bor1p are directly incorporated into EqMD simulations using MD flexible fitting to produce viable structural models and infer a plausible transport mechanism. Conforming to the theme of integration, a short compendium of an exploratory project on developing a hybrid atomistic-continuum method is presented, including initial results and a novel fluctuating hydrodynamics model and corresponding numerical code. Dissertation/Thesis, Doctoral Dissertation, Physics 201
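
As an illustration of what quantifying transition path similarity can look like, the sketch below computes a Hausdorff distance between two paths, each given as a sequence of frames. Treating the Hausdorff distance as the path metric and using plain Euclidean distance between frames (rather than a structural measure such as RMSD) are assumptions made for this example. The abstract does not specify the metric, and the function names are hypothetical.

```python
import math

def frame_distance(a, b):
    """Euclidean distance between two frames (flattened coordinate vectors)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def directed_hausdorff(path_p, path_q):
    """Largest distance from any frame in P to its nearest frame in Q."""
    return max(min(frame_distance(p, q) for q in path_q) for p in path_p)

def hausdorff(path_p, path_q):
    """Symmetric Hausdorff distance between two paths."""
    return max(directed_hausdorff(path_p, path_q),
               directed_hausdorff(path_q, path_p))

# Toy 2D "paths" with three frames each.
P = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
Q = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]  # P shifted by one unit
R = [(0.0, 0.0), (1.0, 3.0), (2.0, 0.0)]  # same endpoints, large detour

d_pq = hausdorff(P, Q)
d_pr = hausdorff(P, R)
```

Here `R` shares its endpoints with `P` but takes a detour, so it scores as less similar to `P` than the uniformly shifted path `Q`. Applying such a metric to every pair of paths yields a distance matrix that can be clustered, for example to group paths by the method that produced them.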

    High resolution structural models of Ribosome nascent chain complexes restrained by experimental NMR data

    As understanding of the ways in which the complex cellular environment affects the in vivo folding of proteins improves, better methods for their study are required. It is possible to produce limited quantities of ribosome-nascent chain complexes (RNCs), and techniques for gathering data about them are improving, but no single technique provides all the information required to understand the folding of nascent proteins on the ribosome, and there are still significant data that cannot be obtained experimentally. In particular, while NMR chemical shifts and residual dipolar couplings may be recorded, the samples are of too low a concentration and stability to conduct the most informative NOESY experiments that are traditionally used for revealing atomic-resolution structure. Recently, the ability to use chemical shifts to reveal structural details and dynamic properties of small proteins has been developed. By simulating multiple molecules and predicting the average chemical shift of the ensemble, the simulation may be restrained to conform to the experimentally measured data, making testable predictions about the atomic-resolution dynamic properties of the molecule. By adapting these methods to the macromolecular RNC structures, it is theorized that the limited chemical shift data available may be used to provide structural details of the protein as it emerges from a ribosome. This approach, however, faces many challenges, including simulating such a large number of atoms on a suitable timescale and applying the restraints to the nascent chain alone. The thesis describes the development of computational techniques to characterize RNCs, including the concepts and challenges faced, the chemical-shift restrained simulation of nascent chains alone, the development of techniques to perform chemical-shift restrained molecular dynamics simulations of the RNCs, and the application of these techniques to a model system.
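
The ensemble-averaged restraint idea in this abstract can be sketched as a penalty on the difference between the replica-averaged predicted shifts and the measured ones. The harmonic form, the force constant `k`, and the numbers are assumptions for illustration only. In practice the per-replica shifts would come from a chemical shift predictor applied to each simulated structure, which is omitted here.

```python
def ensemble_average(shifts_per_replica):
    """Average each nucleus's predicted shift over all replicas."""
    n = len(shifts_per_replica)
    return [sum(column) / n for column in zip(*shifts_per_replica)]

def restraint_energy(shifts_per_replica, experimental, k=1.0):
    """Assumed harmonic penalty on (ensemble average - experiment) per nucleus."""
    averaged = ensemble_average(shifts_per_replica)
    return k * sum((a - e) ** 2 for a, e in zip(averaged, experimental))

# Made-up shifts (ppm) for two nuclei, predicted for two replicas.
replicas = [[8.0, 120.0],
            [8.5, 118.0]]
experimental = [8.25, 119.0]

e_ensemble = restraint_energy(replicas, experimental)      # both replicas together
e_single = restraint_energy([replicas[0]], experimental)   # first replica alone
```

With these numbers neither replica matches the experimental values on its own, yet their average does, so the ensemble penalty vanishes while the single-replica penalty does not. This is why the restraint is applied to the ensemble average rather than to each molecule individually.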