BioPhysConnectoR: Connecting Sequence Information and Biophysical Models
Background: One of the most challenging aspects of biomolecular systems is understanding coevolution within and among molecules. A complete theoretical picture of the selective advantage, and thus a functional annotation, of (co-)mutations is still lacking. Using sequence-based and information-theoretically inspired methods, we can identify coevolving residues in proteins without understanding the underlying biophysical properties that give rise to such coevolutionary dynamics. Detailed (atomistic) simulations are prohibitively expensive. At the same time, reduced molecular models are an efficient way to determine the reduced dynamics around the native state. Combining sequence-based approaches with such reduced models is therefore a promising way to annotate evolutionary sequence changes. Results: With the R package BioPhysConnectoR we provide a framework to connect the information-theoretical domain of biomolecular sequences to biophysical properties of the encoded molecules, derived from reduced molecular models. To this end we have integrated several fragmented ideas into one single package ready to be used in connection with additional statistical routines in R. Additionally, the package leverages the power of modern multi-core architectures to reduce turn-around times in evolutionary and biomolecular design studies. Our package is a first step towards the above-mentioned annotation of coevolution by reduced dynamics around the native state of proteins. Conclusions: BioPhysConnectoR is implemented as an R package and distributed under the GPL 2 license. It allows for efficient and perfectly parallelized functional annotation of coevolution found at the sequence level.
Biomolecular Correlation in Physical and Sequence Space
Investigating correlations is key to understanding the nature of biological systems. In general, correlations describe the relationship between data sets or specific characteristics of data. To investigate correlations among and within biomolecules, we discussed two complementary approaches to advance the understanding of evolution. Mutational dynamics are mainly seen in sequence space, whereas the altered phenotype is selected in the biophysical realm. Using mutual information, an information-theoretical measure, we can identify potentially coevolving nucleotide or amino acid positions from a set of sequences combined into a multiple sequence alignment. In the biophysical realm, the mechanics of a biomolecule, which are important for its structure and function, are examined by various methods. Since molecular dynamics simulations and normal mode analysis are computationally expensive, coarse-grained protein representations such as elastic network models have been developed. We used such protein models, particularly the Gaussian and the anisotropic network model, to judge the importance of single residues or amino acid contacts for the dynamics of the biomolecule or distinct portions of it.
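As a minimal sketch of the sequence-space side, the mutual information between two alignment columns can be computed from symbol frequencies. The function name and the toy alignments are illustrative assumptions and are not taken from the thesis; gaps are treated as ordinary symbols and no finite-size correction is applied.

```python
from collections import Counter
from math import log2


def column_mutual_information(alignment, i, j):
    """Mutual information (in bits) between columns i and j.

    `alignment` is a list of equal-length sequences. Gaps are counted as an
    ordinary symbol, which is a simplifying assumption of this sketch.
    """
    n = len(alignment)
    pairs = Counter((seq[i], seq[j]) for seq in alignment)
    left = Counter(seq[i] for seq in alignment)
    right = Counter(seq[j] for seq in alignment)
    mi = 0.0
    for (a, b), count in pairs.items():
        p_ab = count / n
        # p_ab / (p_a * p_b), written with raw counts
        mi += p_ab * log2(count * n / (left[a] * right[b]))
    return mi


# Toy alignments (hypothetical): columns that always change together,
# and columns that vary independently of each other.
coupled = column_mutual_information(["AC", "AC", "GU", "GU"], 0, 1)
independent = column_mutual_information(["AC", "AU", "GC", "GU"], 0, 1)
```

In the coupled toy alignment, knowing one column fully determines the other, so the mutual information equals the one bit of entropy in each column; in the independent one it vanishes.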
In this thesis, we applied this analysis to distinct sets of hammerhead ribozyme sequences of type I and III to reveal coevolutionary hot spots shared among the different sequences. We observed weaker coevolution in ribozymes originating from prokaryotes and eukaryotes than in viroid sequences. Additionally, we obtained signals between helical stems I and II, which are well known from experiments. However, we noticed a coevolutionary connection between stems I and III throughout all sets of sequences that has not been reported yet.
We applied an established protocol to a structural model of the small viral potassium channel Kcv, in which we deleted single contacts and measured the resulting change in dynamics using the Frobenius norm. Here, we observed a mechanical connection between N- and C-terminal residues, whereas the selectivity filter seems almost mechanically uncoupled from the rest of the channel. A similar study was performed for acetylcholinesterase, where we additionally correlated mechanical changes with coevolutionary information. By means of coarse-grained protein models, we proposed a protocol for Kcv to identify the transition from a functional to a non-functional channel upon N-terminal deletions.
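The contact-deletion idea can be sketched with a Gaussian network model. This is an illustration under stated assumptions, not the protocol used for Kcv: the cutoff, the toy coordinates, the choice of the Gaussian over the anisotropic network model, and the use of the Kirchhoff pseudo-inverse as the matrix describing the dynamics are all assumed here, and every function name is hypothetical.

```python
from math import dist, sqrt


def kirchhoff(coords, cutoff):
    """Kirchhoff (connectivity) matrix: -1 per contact, degree on the diagonal."""
    n = len(coords)
    gamma = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if dist(coords[i], coords[j]) <= cutoff:
                gamma[i][j] = gamma[j][i] = -1.0
                gamma[i][i] += 1.0
                gamma[j][j] += 1.0
    return gamma


def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting."""
    n = len(matrix)
    aug = [row[:] + [float(r == c) for c in range(n)] for r, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular; is the network connected?")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def pseudo_inverse(gamma):
    """Pseudo-inverse of a connected network's Kirchhoff matrix.

    Uses the identity G^+ = (G + J/n)^-1 - J/n, with J the all-ones matrix.
    """
    n = len(gamma)
    shifted = [[v + 1.0 / n for v in row] for row in gamma]
    return [[v - 1.0 / n for v in row] for row in invert(shifted)]


def contact_deletion_effect(gamma, i, j):
    """Frobenius norm of the change in the pseudo-inverse after removing contact i-j."""
    if gamma[i][j] == 0.0:
        raise ValueError("residues i and j are not in contact")
    perturbed = [row[:] for row in gamma]
    perturbed[i][j] = perturbed[j][i] = 0.0
    perturbed[i][i] -= 1.0
    perturbed[j][j] -= 1.0
    before, after = pseudo_inverse(gamma), pseudo_inverse(perturbed)
    return sqrt(sum((a - b) ** 2 for ra, rb in zip(before, after) for a, b in zip(ra, rb)))


# Toy system (hypothetical): four pseudo-atoms on a unit square, forming a ring of contacts.
ring = kirchhoff([(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)], cutoff=1.1)
effect_01 = contact_deletion_effect(ring, 0, 1)
effect_12 = contact_deletion_effect(ring, 1, 2)
```

Ranking all contacts by this value gives one way to ask which contacts matter most mechanically. Deleting a contact that disconnects the network raises an error in this sketch rather than returning a norm.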
Furthermore, we utilized reduced molecular models to derive amino acid specific interaction constants directly from a set of protein structures obtained, e.g., from molecular dynamics simulations. To this end, we examined the performance of three approaches to retrieve the input parameters from an artificially constructed system. Semidefinite programming turned out to be an efficient method for this task and was employed for a realistic application as well.
Distance-dependent classification of amino acids by information theory
Reduced amino acid alphabets are useful for understanding molecular evolution, as they reveal basal, shared properties of amino acids on which the structures and functions of proteins rely. Several previous studies derived such reduced alphabets and linked them to the origin of life and to biotechnological applications. However, all this previous work presupposes that only direct contacts of amino acids in native protein structures are relevant. We show in this work, using information-theoretical measures, that an appropriate alphabet reduction scheme is in fact a function of the maximum distance at which amino acids interact. Although our results agree with previous ones for small distances, we show how long-range interactions change the overall picture and call for a revised understanding of the protein design process.
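To illustrate why the interaction distance matters, the sketch below recodes residues with a reduced alphabet and collects all residue pairs within a chosen maximum distance; the mutual information of the resulting pair statistics then changes with that distance. The three-class grouping, the toy coordinates, and the function names are assumptions made for illustration only and do not reproduce the procedure of the work summarized above.

```python
from collections import Counter
from math import dist, log2

# Hypothetical three-class grouping, chosen only for illustration.
GROUPS = {"h": "AVLIMFWC", "p": "STNQYGPH", "c": "DEKR"}
REDUCE = {aa: label for label, members in GROUPS.items() for aa in members}


def reduced_pairs(residues, max_distance):
    """Reduced-alphabet labels of all residue pairs within `max_distance`.

    `residues` is a list of (one_letter_code, (x, y, z)) tuples.
    """
    pairs = []
    for a in range(len(residues)):
        for b in range(a + 1, len(residues)):
            (aa_a, xyz_a), (aa_b, xyz_b) = residues[a], residues[b]
            if dist(xyz_a, xyz_b) <= max_distance:
                pairs.append((REDUCE[aa_a], REDUCE[aa_b]))
    return pairs


def pair_mutual_information(pairs):
    """Mutual information (bits) between the two partners of a pair.

    Pairs are symmetrised so that the order of the partners does not matter.
    """
    both = Counter(pairs) + Counter((b, a) for a, b in pairs)
    total = sum(both.values())
    marginal = Counter()
    for (a, _), count in both.items():
        marginal[a] += count
    return sum(
        (count / total) * log2(count * total / (marginal[a] * marginal[b]))
        for (a, b), count in both.items()
    )


# Toy chain (hypothetical): four residues spaced 4 angstrom apart on a line.
toy = [("L", (0, 0, 0)), ("V", (4, 0, 0)), ("D", (8, 0, 0)), ("K", (12, 0, 0))]
short_range = reduced_pairs(toy, max_distance=5.0)
long_range = reduced_pairs(toy, max_distance=9.0)
mi_short = pair_mutual_information(short_range)
mi_long = pair_mutual_information(long_range)
```

Raising the maximum distance adds pairs to the statistics, so a grouping that looks informative for direct contacts need not stay so at longer range.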
Structure-based, biophysical annotation of molecular coevolution of acetylcholinesterase.
Acetylcholinesterase (AChE) is an important enzyme in the nervous system. It terminates signal transmission at chemical synapses by degrading the neurotransmitter acetylcholine and was found to play a role in plaque formation in Alzheimer's disease. Several functional parts of its structure have been identified in the past. Here, we use a coarse-grained anisotropic network model approach based on structure data to analyze protein mechanics of AChE. Single contacts in the protein are "switched off" and the change in the intrinsic dynamics is measured. We correlate the gained insight with information about coevolution within the molecule derived from multiple sequence alignments. More than 300 AChE sequences were aligned and the mutual information of the positions was calculated. From these structural, biophysical, and evolutionary data we could reveal sites of coevolutionary signatures in AChE, annotate them by the selective pressure induced for biophysical reasons, and further pave the way for a more detailed understanding of evolutionary boundary conditions for AChE. Proteins 2011; © 2011 Wiley-Liss, Inc
Distance dependency and minimum amino acid alphabets for decoy scoring potentials.
The validity and accuracy of a proposed tertiary structure of a protein can be assessed in several ways. Scoring such a structure by a knowledge-based potential is a well-known approach in molecular biophysics, an important task in structure prediction and refinement, and a key step in several experiments on protein structures. Although several parameterizations for such models have been derived over the course of time, improvements in accuracy by explicitly using continuous distance information have not been suggested yet. We close this methodological gap by formulating the parameterization of a protein structure model as a linear program. Optimization of the parameters was performed using amino acid distances calculated for the residues in 2830 topology-rich protein structures. We show the capability of our derived model to discriminate between native structures and decoys for a diverse set of proteins. In addition, we discuss the effect of reduced amino acid alphabets on the model. In contrast to studies focusing on binary contact schemes (which do not consider distance dependencies and propose five symbols as the optimal alphabet size), we find that an accurate protein alphabet must contain at least five symbols, preferably more, to assure a satisfactory fold recognition capability. © 2012 Wiley Periodicals, Inc
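A scoring function that is linear in its parameters is what makes a linear-programming formulation possible: each native/decoy pair contributes one linear inequality on the parameters. The sketch below shows only this ingredient, not the optimization itself. The distance bins (standing in for the continuous distance dependence, which is a simplification), the weights, the toy structures, and all names are hypothetical and not taken from the paper.

```python
from collections import Counter
from math import dist

BIN_EDGES = (4.0, 8.0, 12.0)  # hypothetical upper bin edges in angstrom


def features(residues):
    """Count residue pairs per (type, type, distance bin).

    `residues` is a list of (one_letter_code, (x, y, z)) tuples; pairs
    beyond the last bin edge are ignored.
    """
    counts = Counter()
    for a in range(len(residues)):
        for b in range(a + 1, len(residues)):
            d = dist(residues[a][1], residues[b][1])
            for k, edge in enumerate(BIN_EDGES):
                if d <= edge:
                    key = tuple(sorted((residues[a][0], residues[b][0]))) + (k,)
                    counts[key] += 1
                    break
    return counts


def score(weights, residues):
    """Linear score: sum of weight times count over all observed features."""
    return sum(weights.get(key, 0.0) * n for key, n in features(residues).items())


def constraint_row(native, decoy):
    """Coefficients c such that c . w = score(decoy) - score(native)."""
    row = Counter(features(decoy))
    row.subtract(features(native))
    return {key: value for key, value in row.items() if value != 0}


# Toy structures and weights (hypothetical).
native = [("L", (0, 0, 0)), ("V", (3, 0, 0))]
decoy = [("L", (0, 0, 0)), ("V", (10, 0, 0))]
weights = {("L", "V", 0): -1.0, ("L", "V", 2): 0.5}

row = constraint_row(native, decoy)
gap = sum(coef * weights.get(key, 0.0) for key, coef in row.items())
```

A linear program of this kind would search for weights that keep every such gap above a margin, so that native structures score lower than their decoys; any LP solver could take rows like these as input.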
Structural Model of the Gas Vesicle Protein GvpA and Analysis of GvpA Mutants in vivo.
Gas vesicles are gas-filled protein structures that increase the buoyancy of cells. The gas vesicle envelope is mainly constituted by the 8-kDa protein GvpA, which forms a wall with a water-excluding inner surface. A structure of GvpA is not available; recent solid-state NMR results suggest a coil-α-β-β-α-coil fold. We obtained a first structural model of GvpA by high-performance de novo modeling. ATR-FTIR spectroscopy supported this structure. A dimer of GvpA was derived that could explain the formation of the protein monolayer in the gas vesicle wall. The hydrophobic inner surface is mainly constituted by anti-parallel β-strands. The proposed structure allows the pinpointing of contact sites, which were mutated and tested for the ability to form gas vesicles in haloarchaea. Mutations in α-helix I and α-helix II, but also in the β-turn, affected gas vesicle formation, whereas other alterations had no effect. All mutants supported the structural features deduced from the model. The proposed GvpA dimers allow the formation of a monolayer protein wall, which is also consistent with protease treatments of isolated gas vesicles.