Protein beta-turn assignments
A classical way to analyze protein 3D structures or models is to investigate their secondary structures. Secondary structure predictions are also widely used as an aid to building new 3D models, and hundreds of prediction methods have been proposed. Before any prediction, however, secondary structures must first be assigned, which is not a trivial task. Numerous assignment methods have therefore been developed, and their results diverge. β-turns constitute the third most important secondary structure. However, the β-turn distributions obtained with different secondary structure assignment methods have never been compared. In this paper, we analyze and evaluate the results of such a comparison. We highlight some important divergences that could have important consequences for the analysis and prediction of β-turns.
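As an illustration of what comparing two assignment methods can involve, the sketch below measures how far two per-residue assignment strings agree on β-turn positions. It is not the analysis used in the paper: the function name `turn_agreement`, the one-letter codes and the toy strings are all assumptions made for this example.

```python
def turn_agreement(assign_a, assign_b, turn_label="T"):
    """Return (Jaccard overlap of turn residues, overall per-residue agreement)."""
    if not assign_a or len(assign_a) != len(assign_b):
        raise ValueError("assignments must be non-empty and cover the same residues")
    turns_a = {i for i, state in enumerate(assign_a) if state == turn_label}
    turns_b = {i for i, state in enumerate(assign_b) if state == turn_label}
    union = turns_a | turns_b
    # If neither method assigns any turn, the two turn sets agree trivially.
    jaccard = len(turns_a & turns_b) / len(union) if union else 1.0
    same = sum(a == b for a, b in zip(assign_a, assign_b))
    return jaccard, same / len(assign_a)


# Toy assignments invented for illustration: E = strand, T = turn, C = coil.
method_1 = "CCTTTTCCEEEE"
method_2 = "CCCTTTTCEEEC"
jaccard, agreement = turn_agreement(method_1, method_2)
print(f"turn overlap = {jaccard:.2f}, overall agreement = {agreement:.2f}")
```

Here the two toy methods place the same four-residue turn one position apart, so the overall agreement stays fairly high while the turn overlap drops noticeably.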
: peel it
Three-dimensional structures of proteins are the support of their biological functions. Their folds are maintained by inter-residue interactions, which are one of the main focuses in understanding the mechanisms of protein folding and stability. Furthermore, protein structures can be composed of single or multiple functional domains that can fold and function independently. Hence, dividing a protein into domains is useful for accurately determining its structure and function. In previous studies, we characterized protein contact properties according to different definitions and developed a novel methodology named Protein Peeling. Within protein structures, Protein Peeling identifies small successive compact units along the sequence, called protein units (PUs). The cutting done by Protein Peeling maximizes the number of contacts within the PUs and minimizes the number of contacts between them. This method is thus a relevant tool for protein folding research, particularly with regard to the hierarchical model proposed by George Rose. Here, we analyze the PUs in detail at different levels of cutting, using a non-redundant protein databank. The distribution of PU sizes, the number of PUs and their accessibility are examined to determine their common and distinct features. Moreover, we highlight the preferential amino acid interactions inside and between PUs. Our results show that PUs are clearly an intermediate level between secondary structures and protein structural domains.
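The idea of cutting a chain so that contacts stay inside units rather than between them can be sketched as follows. This is a deliberately simplified, single-cut illustration and not the Protein Peeling criterion itself: the function `best_cut`, the `min_size` parameter, the crossing-fraction score and the toy contact map are assumptions made for this example.

```python
def best_cut(contacts, min_size=1):
    """Find the cut position that minimizes the fraction of contacts crossing it.

    contacts: symmetric n x n matrix of 0/1 values.
    A cut at position k splits residues [0, k) from residues [k, n).
    """
    n = len(contacts)
    total = sum(contacts[i][j] for i in range(n) for j in range(i + 1, n))
    if total == 0:
        raise ValueError("contact map contains no contacts")
    best = None
    for cut in range(min_size, n - min_size + 1):
        crossing = sum(contacts[i][j] for i in range(cut) for j in range(cut, n))
        fraction = crossing / total
        if best is None or fraction < best[1]:
            best = (cut, fraction)
    if best is None:
        raise ValueError("chain too short for the requested min_size")
    return best


# Toy contact map: two compact triplets joined by a single contact (2, 3).
n = 6
pairs = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
contacts = [[0] * n for _ in range(n)]
for i, j in pairs:
    contacts[i][j] = contacts[j][i] = 1

cut, fraction = best_cut(contacts, min_size=2)
print(cut, round(fraction, 3))
```

Applying such a cut recursively to each resulting segment would give successive levels of cutting, which is the kind of hierarchy the abstract describes.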
A short survey on protein blocks.
Protein structures are classically described in terms of secondary structures. Even if the regular secondary structures have relevant physical meaning, their recognition from atomic coordinates has some important limitations, such as uncertainties in the assignment of the boundaries of helical and β-strand regions. Further, on average about 50% of all residues are assigned to an irregular state, i.e., the coil. Different research teams have therefore focused on abstracting the conformation of the protein backbone in short local stretches. Using different geometric measures, local stretches in protein structures are clustered into a chosen number of states. A prototype representative of the local structures in each cluster is generally defined. These libraries of local structure prototypes are called "structural alphabets". We have developed a structural alphabet, named Protein Blocks, not only to approximate protein structures, but also to predict them from sequence. Since its development, we and other teams have explored numerous new research fields using this structural alphabet. We review here some of the most interesting applications.
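Encoding a backbone with a structural alphabet amounts to assigning each local fragment to its closest prototype. The sketch below illustrates this with an angular root mean square deviation over dihedral angles. The two-letter alphabet, its angle values and the function names are hypothetical and chosen only for illustration; they are not the Protein Blocks prototypes.

```python
import math


def angular_diff(a, b):
    """Smallest signed difference between two angles in degrees, in [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0


def rmsda(fragment, prototype):
    """Root mean square deviation on angular values between two angle tuples."""
    squared = sum(angular_diff(x, y) ** 2 for x, y in zip(fragment, prototype))
    return math.sqrt(squared / len(prototype))


def encode(fragments, prototypes):
    """Assign each fragment the letter of its closest prototype."""
    return "".join(
        min(prototypes, key=lambda letter: rmsda(frag, prototypes[letter]))
        for frag in fragments
    )


# Hypothetical toy alphabet: one (phi, psi) pair per prototype.
TOY_PROTOTYPES = {"h": (-60.0, -45.0), "e": (-120.0, 130.0)}
fragments = [(-65.0, -40.0), (-110.0, 140.0), (-130.0, 175.0)]
encoded = encode(fragments, TOY_PROTOTYPES)
print(encoded)
```

The wrap-around in `angular_diff` matters because dihedral angles are periodic: 170° and -170° are 20° apart, not 340°.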
Analysis and prediction of the sequence - local structure relationship and flexibility in globular proteins
In silico prediction of protein structure from sequence is a major scientific challenge. It is now accepted that native 3D protein structures can be described by a limited set of recurring local structures. This observation led to the development of protein structure prediction techniques based on fragment assembly, which are now among the most effective. Protein local structure prediction is thus the first step toward the generation of global protein models. This thesis work mainly focuses on two major questions: (i) protein local structure prediction from sequence and (ii) the analysis of local structure predictability according to protein structure flexibility features. These analyses were based on a library, previously developed in the laboratory, of 120 structural prototypes (overlapping fragments 11 residues long) encompassing all known local protein structures. An associated method for predicting local structure from sequence had also been created and yielded a correct prediction rate of 51%. Here, we achieved a balanced improvement of the prediction rate by coupling evolutionary information with Support Vector Machines, reaching a correct prediction rate of 63%. Moreover, to estimate the quality of the prediction directly, we developed a confidence index that makes it possible to identify regions that are more difficult to predict. In addition, protein structures are not rigid macromolecules. Hence, we extended our analysis to the structural predictability of a sequence with regard to its structural flexibility within protein structures. We analyzed local structure flexibility features in proteins by relying on: (i) B-factors from X-ray experiments and (ii) backbone fluctuations observed in molecular dynamics simulations. Finally, an original method for predicting flexibility from sequence was developed. Our different analyses are the first step toward the prediction of 3D global protein models.
Protein contacts, inter-residue interactions and side-chain modelling: protein contacts
Three-dimensional structures of proteins are the support of their biological functions. Their folds are stabilized by contacts between residues. Inner protein contacts are generally described through direct atomic contacts, i.e. interactions between side-chain atoms, while contact prediction methods mainly use inter-Calpha distances. In this paper, we have analyzed protein contacts in a recent high-quality non-redundant databank using different criteria. First, we studied the average number of contacts as a function of the distance threshold used to define a contact. Preferential contacts between types of amino acids have been highlighted. Detailed analyses have been carried out concerning the proximity of contacts in the sequence, the size of the proteins and fold classes. The strongest differences have been extracted, highlighting important residues. Then, we studied the influence of five different side-chain conformation prediction methods (SCWRL, IRECS, SCAP, SCATD and SCCOMP) on the distribution of contacts. The prediction rates of these methods are quite similar. However, using a distance criterion between side chains, the results are quite different, e.g. SCAP predicts 50% more contacts than observed, unlike the other methods, which predict fewer contacts than observed. The deduced contacts differ markedly from one method to another, with at most 75% of contacts in common. Moreover, the distributions of amino acid preferential contacts show unexpected behaviours distinct from those previously observed in the X-ray structures, especially at the surface of proteins. For instance, the interactions involving tryptophan greatly decrease.
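How the number of contacts depends on the distance threshold can be illustrated with a small sketch that counts residue pairs closer than a cutoff. The function `count_contacts`, the `min_separation` parameter, the thresholds and the idealized coordinates are assumptions made for this example; the paper's own contact definitions are not reproduced here.

```python
import math


def count_contacts(coords, threshold, min_separation=1):
    """Count residue pairs whose distance is at most `threshold` (in angstroms).

    coords: one (x, y, z) point per residue, e.g. its Calpha position.
    min_separation: minimum distance along the sequence for a pair to count.
    """
    n = len(coords)
    count = 0
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(coords[i], coords[j]) <= threshold:
                count += 1
    return count


# Idealized toy chain: four points on a line, 3.8 angstroms apart.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
counts = {t: count_contacts(coords, t) for t in (4.0, 8.0, 12.0)}
print(counts)
```

Raising `min_separation` excludes neighbours along the sequence, which is one simple way to separate local contacts from long-range ones.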
Predicting protein flexibility through the prediction of local structures.
Protein structures are valuable tools for understanding protein function. However, protein dynamics is also considered a key element in protein function. Therefore, in addition to structural analysis, fully understanding protein function at the molecular level now requires accounting for flexibility. However, experimental techniques that produce both types of information simultaneously are still limited. Prediction approaches are useful alternative tools for obtaining otherwise unavailable data. It has been shown that protein structure can be described by a limited set of recurring local structures. In this context, we previously established a library composed of 120 overlapping long structural prototypes (LSPs) representing fragments of 11 residues in length and covering all known local protein structures. On the basis of the close sequence-structure relationship observed in LSPs, we developed a novel prediction method that proposes structural candidates in terms of LSPs along a given sequence. The prediction accuracy rate was high given the number of structural classes. In this study, we use this methodology to predict protein flexibility. We first examine flexibility according to two different descriptors, the B-factor and root mean square fluctuations from molecular dynamics simulations. We then show the relevance of using both descriptors together. We define three flexibility classes and propose a method based on the LSP prediction method for predicting flexibility along the sequence. The prediction rate reaches 49.6%. This method competes rather efficiently with the most recent, cutting-edge methods, which learn from true flexibility data with sophisticated algorithms. Accordingly, flexibility information should be taken into account in structural prediction assessments. Proteins 2011. © 2010 Wiley-Liss, Inc.
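To make the notion of flexibility classes concrete, the sketch below normalizes B-factors within a chain and bins each residue into one of three classes. The z-score normalization, the ±0.5 thresholds, the class names and the toy values are assumptions made for this example. The study itself combines B-factors with fluctuations from molecular dynamics, which this sketch does not attempt.

```python
import statistics


def normalize(values):
    """Z-score normalization so that chains with different scales are comparable."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        raise ValueError("all values are identical; cannot normalize")
    return [(v - mean) / sd for v in values]


def flexibility_classes(bfactors, low=-0.5, high=0.5):
    """Label each residue as rigid, intermediate or flexible (hypothetical cutoffs)."""
    classes = []
    for z in normalize(bfactors):
        if z < low:
            classes.append("rigid")
        elif z > high:
            classes.append("flexible")
        else:
            classes.append("intermediate")
    return classes


# Toy B-factors invented for illustration.
bfactors = [10.0, 12.0, 20.0, 28.0, 30.0]
classes = flexibility_classes(bfactors)
print(classes)
```

Normalizing per chain is a design choice: raw B-factors depend on resolution and refinement, so absolute values are hard to compare across structures.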
: Protein flexibility
Protein structures and protein structural models are valuable tools for understanding protein function and provide highly relevant information for drug design. Nevertheless, protein structures are not rigid entities. Cutting-edge bioinformatics methods increasingly take the flexibility of these macromolecules into account. We present new approaches used to define protein structure flexibility.
Local Structure Alphabets
Protein structures are classically described in terms of secondary structures, i.e., two regular states, the alpha-helices and the beta-strands, and one default state, the coil. Even if the regular secondary structures have relevant physical meaning, the definition of secondary structures has some important (and often forgotten) limitations: (i) the rules for secondary structure assignment are not simple, (ii) they are not unique, and (iii) the 50% of all residues that occur in the coil are not described. Hence, different research groups have described local protein structures with the aim of analyzing them and approximating every part of the protein backbone. These libraries of local structures consist of sets of small prototypes named "structural alphabets". They have also been used to predict the protein backbone conformation. In this chapter, we first present the secondary structures, i.e., the most classical approach to describing protein structures, followed by the different structural alphabets designed to date. We focus on the different prediction schemes developed with these structural alphabets.