Formulation of Hybrid Knowledge-Based/Molecular Mechanics Potentials for Protein Structure Refinement and a Novel Graph Theoretical Protein Structure Comparison and Analysis Technique
Proteins are the fundamental machinery that enables the functions of life. It is critical to understand them not just for basic biology, but also to enable medical advances. The field of protein structure prediction is concerned with developing computational techniques to predict protein structure and function solely from a protein’s amino acid sequence, which is encoded directly in DNA. Despite much progress since the first computational models in the late 1960s, techniques for the prediction of protein structure still cannot reliably produce structures accurate enough to enable desired applications such as rational drug design. Protein structure refinement is the process of modifying a predicted model of a protein to bring it closer to its native state. In this dissertation, one protein structure refinement technique, potential energy minimization using hybrid molecular mechanics/knowledge-based potential energy functions, is examined in detail. The generation of the knowledge-based component is critically analyzed, and a potential that is a modest improvement over the original is presented.
This dissertation also examines the task of protein structure comparison. In evaluating various protein structure prediction techniques, it is crucial to be able to compare produced models against known structures to understand how well the technique performs. A novel technique is proposed that allows an in-depth yet intuitive evaluation of the local similarities between protein structures. Based on a graph analysis of pairwise atomic distance similarities, multiple regions of structural similarity can be identified between structures independently of relative orientation. Multidomain structures can be evaluated, and this technique can be combined with global measures of similarity such as the global distance test. This method of comparison is expected to have broad applications in rational drug design, the evolutionary study of protein structures, and the analysis of the protein structure prediction effort.
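The graph analysis of pairwise atomic distances described above can be illustrated with a minimal sketch. It is not the dissertation's algorithm: the function names, the 0.5 tolerance, the toy coordinates, and the use of connected components to delimit regions are all assumptions made for illustration. Two atoms are linked when their separation is (nearly) the same in both structures. Because only internal distances are compared, the result does not depend on how the structures are oriented.

```python
import math
from itertools import combinations


def similarity_graph(coords_a, coords_b, tol=0.5):
    """Link atoms i and j when their separation matches in both structures.

    coords_a and coords_b are equal-length lists of (x, y, z) tuples for
    corresponding atoms; tol is an assumed distance tolerance.
    """
    graph = {i: set() for i in range(len(coords_a))}
    for i, j in combinations(range(len(coords_a)), 2):
        d_a = math.dist(coords_a[i], coords_a[j])
        d_b = math.dist(coords_b[i], coords_b[j])
        if abs(d_a - d_b) <= tol:
            graph[i].add(j)
            graph[j].add(i)
    return graph


def connected_regions(graph):
    """Return connected components, a crude stand-in for similar regions."""
    seen, regions = set(), []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack, region = [start], []
        while stack:
            node = stack.pop()
            region.append(node)
            for neighbour in graph[node] - seen:
                seen.add(neighbour)
                stack.append(neighbour)
        regions.append(sorted(region))
    return regions


# Toy example (hypothetical data): two rigid three-atom groups; the second
# group is displaced along z in the model, so only within-group distances
# are preserved.
native = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (10, 0, 0), (11, 0, 0), (10, 1, 0)]
model = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (10, 0, 20), (11, 0, 20), (10, 1, 20)]
regions = connected_regions(similarity_graph(native, model))
print(regions)  # [[0, 1, 2], [3, 4, 5]]
```

Connected components are the simplest possible grouping; a stricter analysis might require every pair within a region to agree, which this sketch does not enforce.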
Experimental library screening demonstrates the successful application of computational protein design to large structural ensembles
The stability, activity, and solubility of a protein sequence are determined by a delicate balance of molecular interactions in a variety of conformational states. Even so, most computational protein design methods model sequences in the context of a single native conformation. Simulations that model the native state as an ensemble have been mostly neglected due to the lack of sufficiently powerful optimization algorithms for multistate design. Here, we have applied our multistate design algorithm to study the potential utility of various forms of input structural data for design. To facilitate a more thorough analysis, we developed new methods for the design and high-throughput stability determination of combinatorial mutation libraries based on protein design calculations. The application of these methods to the core design of a small model system produced many variants with improved thermodynamic stability and showed that multistate design methods can be readily applied to large structural ensembles. We found that exhaustive screening of our designed libraries helped to clarify several sources of simulation error that would have otherwise been difficult to ascertain. Interestingly, the lack of correlation between our simulated and experimentally measured stability values shows clearly that a design procedure need not reproduce experimental data exactly to achieve success. This surprising result suggests potentially fruitful directions for the improvement of computational protein design technology.
Protein Docking by the Underestimation of Free Energy Funnels in the Space of Encounter Complexes
Similarly to protein folding, the association of two proteins is driven
by a free energy funnel, determined by favorable interactions in some neighborhood of the
native state. We describe a docking method based on stochastic global minimization of
funnel-shaped energy functions in the space of rigid body motions (SE(3)) while accounting
for flexibility of the interface side chains. The method, called semi-definite
programming-based underestimation (SDU), employs a general quadratic function to
underestimate a set of local energy minima and uses the resulting underestimator to bias
further sampling. While SDU effectively minimizes functions with funnel-shaped basins, its
application to docking in the rotational and translational space SE(3) is not
straightforward due to the geometry of that space. We introduce a strategy that uses
separate independent variables for side-chain optimization, center-to-center distance of the
two proteins, and five angular descriptors of the relative orientations of the molecules.
The removal of the center-to-center distance turns out to vastly improve the efficiency of
the search, because the five-dimensional space now exhibits a well-behaved energy surface
suitable for underestimation. This algorithm explores the free energy surface spanned by
encounter complexes that correspond to local free energy minima and shows similarity to the
model of macromolecular association that proceeds through a series of collisions. Results
for standard protein docking benchmarks establish that in this space the free energy
landscape is a funnel in a reasonably broad neighborhood of the native state and that the
SDU strategy can generate docking predictions with less than 5 Å ligand interface Cα
root-mean-square deviation while achieving an approximately 20-fold efficiency gain compared
to Monte Carlo methods.
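The idea of underestimating a set of local energy minima with a convex quadratic can be illustrated in one dimension. The sketch below is a deliberate simplification and not the SDU method itself. It replaces the semidefinite program with brute-force enumeration of candidate curves, works on a single coordinate rather than the angular search space, and uses made-up energies. In this 1-D case, a quadratic that minimizes the summed gap to the data can be found among curves through three of the points (or lines through two). The function names and data are hypothetical, and the x values are assumed to be distinct.

```python
from itertools import combinations


def quadratic_through(p1, p2, p3=None):
    """Coefficients (a, b, c) of a*x**2 + b*x + c through 2 or 3 points."""
    (x1, f1), (x2, f2) = p1, p2
    slope12 = (f2 - f1) / (x2 - x1)
    if p3 is None:  # degenerate case: a straight line (a = 0)
        return 0.0, slope12, f1 - slope12 * x1
    x3, f3 = p3
    slope23 = (f3 - f2) / (x3 - x2)
    a = (slope23 - slope12) / (x3 - x1)  # second divided difference
    return a, slope12 - a * (x1 + x2), f1 - slope12 * x1 + a * x1 * x2


def underestimate(minima, eps=1e-9):
    """Convex quadratic on or below every (x, energy) pair, smallest total gap."""
    best, best_gap = None, float("inf")
    candidates = [quadratic_through(*t) for t in combinations(minima, 3)]
    candidates += [quadratic_through(*t) for t in combinations(minima, 2)]
    for a, b, c in candidates:
        if a < 0:
            continue  # not convex
        gaps = [f - (a * x * x + b * x + c) for x, f in minima]
        if min(gaps) < -eps:
            continue  # curve rises above at least one minimum
        if sum(gaps) < best_gap:
            best, best_gap = (a, b, c), sum(gaps)
    return best, best_gap


# Hypothetical local minima scattered on and above a funnel centred near x = 1.
minima = [(-2.0, 6.0), (-1.0, 2.5), (0.0, -2.0), (1.0, -2.2),
          (2.0, -2.0), (3.0, 3.0), (4.0, 6.0)]
(a, b, c), gap = underestimate(minima)
# The vertex of the underestimator suggests where to concentrate new samples.
center = -b / (2 * a) if a > 0 else None
print((a, b, c), gap, center)
```

In a sampling loop, new starting points would be drawn preferentially near `center`, their local minima added to `minima`, and the fit repeated.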
LightDock: a new multi-scale approach to protein–protein docking
Computational prediction of protein–protein complex structure by docking can provide structural and mechanistic insights for protein interactions of biomedical interest. However, current methods struggle with difficult cases, such as those involving flexible proteins, low-affinity complexes or transient interactions. A major challenge is how to efficiently sample the structural and energetic landscape of the association at different resolution levels, given that each scoring function is often highly coupled to a specific type of search method. Thus, new methodologies capable of accommodating multi-scale conformational flexibility and scoring are strongly needed.
We describe here a new multi-scale protein–protein docking methodology, LightDock, capable of accommodating conformational flexibility and a variety of scoring functions at different resolution levels. Implicit use of normal modes during the search and atomic/coarse-grained combined scoring functions yielded improved predictive results with respect to state-of-the-art rigid-body docking, especially in flexible cases.
B.J-G was supported by an FPI fellowship from the Spanish Ministry of Economy and Competitiveness. This work was supported by I+D+I Research Project grants BIO2013-48213-R and BIO2016-79930-R from the Spanish Ministry of Economy and Competitiveness. This work is partially supported by the European Union H2020 program through HiPEAC (GA 687698), by the Spanish Government through Programa Severo Ochoa (SEV-2015-0493), by the Spanish Ministry of Science and Technology (TIN2015-65316-P) and the Departament d’Innovació, Universitats i Empresa de la Generalitat de Catalunya, under project MPEXPAR: Models de Programació i Entorns d’Execució Paral·lels (2014-SGR-1051).
Peer Reviewed. Postprint (author's final draft).
ProtNN: Fast and Accurate Nearest Neighbor Protein Function Prediction based on Graph Embedding in Structural and Topological Space
Studying the function of proteins is important for understanding the
molecular mechanisms of life. The number of publicly available protein
structures has grown extremely large. Still, the determination of
the function of a protein structure remains a difficult, costly, and
time-consuming task. The difficulties are often due to the essential role of spatial
and topological structures in the determination of protein functions in living
cells. In this paper, we propose ProtNN, a novel approach for protein function
prediction. Given an unannotated protein structure and a set of annotated
proteins, ProtNN finds the nearest neighbor annotated structures based on
protein-graph pairwise similarities. Given a query protein, ProtNN finds the
nearest neighbor reference proteins based on a graph representation model and a
pairwise similarity between vector embeddings of both query and reference
protein-graphs in structural and topological spaces. ProtNN assigns to the
query protein the function with the highest number of votes across the set of k
nearest neighbor reference proteins, where k is a user-defined parameter.
Experimental evaluation demonstrates that ProtNN is able to accurately classify
several datasets in an extremely fast runtime compared to state-of-the-art
approaches. We further show that ProtNN is able to scale up to a whole PDB
dataset in a single-process mode with no parallelization, with a runtime
gain on the order of thousands of times compared to state-of-the-art
approaches.
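The voting step described in the abstract above (assign the query the function with the most votes among its k nearest reference proteins) can be sketched as follows. This is an illustration only, not ProtNN's implementation. The two-dimensional vectors stand in for whatever graph embedding is used, Euclidean distance is an assumed similarity measure, and the function name, labels, and data are hypothetical.

```python
import math
from collections import Counter


def knn_predict(query, references, k=3):
    """Majority vote over the k reference embeddings closest to the query.

    references is a list of (embedding, function_label) pairs; embeddings
    are equal-length tuples of floats.
    """
    nearest = sorted(references, key=lambda ref: math.dist(query, ref[0]))[:k]
    votes = Counter(label for _, label in nearest)
    # On a tie, Counter.most_common keeps first-seen order, so the label of
    # the closer neighbour wins.
    return votes.most_common(1)[0][0]


# Hypothetical annotated proteins embedded in a 2-D descriptor space.
references = [
    ((0.10, 0.20), "kinase"),
    ((0.00, 0.30), "kinase"),
    ((0.20, 0.10), "kinase"),
    ((0.90, 0.80), "protease"),
    ((1.00, 1.00), "protease"),
]
prediction = knn_predict((0.15, 0.15), references, k=3)
print(prediction)  # kinase
```

Here k plays the role of the user-defined parameter mentioned in the abstract; larger values smooth the vote at the cost of including more distant neighbours.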