45 research outputs found

EvDTree: structure-dependent substitution profiles based on decision tree classification of 3D environments

Author: Chiche Laurent
Gelly Jean-Christophe
Gracy Jérôme
Publication venue: BioMed Central
Publication date: 01/01/2005

BACKGROUND: Structure-dependent substitution matrices increase the accuracy of sequence alignments when the 3D structure of one sequence is known, and are successful, e.g., in fold recognition. We propose a new automated method, EvDTree, based on a decision tree algorithm, for automatic derivation of amino acid substitution probabilities from a set of sequence-structure alignments. The main advantage over other approaches is an unbiased automatic selection of the most informative structural descriptors and associated values or thresholds. This feature allows automatic derivation of structure-dependent substitution scores for any specific set of structures, without the need to empirically determine the best descriptors and parameters.
RESULTS: Decision trees for residue substitutions were constructed for each residue type from sequence-structure alignments extracted from the HOMSTRAD database. For each tree cluster, environment-dependent substitution profiles were derived. The resulting structure-dependent substitution scores were assessed using a criterion based on the mean ranking of observed substitutions among all possible substitutions and in sequence-structure alignments. The automatically built EvDTree substitution scores provide significantly better results than conventional matrices and similar or slightly better results than other structure-dependent matrices. EvDTree has been applied to small disulfide-rich proteins as a test case to automatically derive specific substitution scores, which provide better results than non-specific substitution scores. Analyses of the decision tree classifications provide useful information on the relative importance of different structural descriptors.
CONCLUSIONS: We propose a fully automatic method for the classification of structural environments and inference of structure-dependent substitution profiles. We show that this approach is more accurate than existing methods for various applications. The easy adaptation of EvDTree to any specific data set opens the way for class-specific structure-dependent substitution scores, which can be used in threading-based remote homology searches.
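
The mean-ranking criterion mentioned in the abstract can be illustrated with a minimal sketch. This is not the EvDTree code: the function names, the toy profile, and the tie-breaking rule (alphabetical order of one-letter codes) are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def rank_of_observed(scores: Dict[str, float], observed: str) -> int:
    """Rank (1 = best) of the observed residue when all 20 are sorted by score.

    Ties are broken by the order of AMINO_ACIDS (an arbitrary assumption).
    """
    ordered = sorted(AMINO_ACIDS, key=lambda aa: scores[aa], reverse=True)
    return ordered.index(observed) + 1


def mean_rank(cases: List[Tuple[Dict[str, float], str]]) -> float:
    """Average rank of the observed substitutions; lower is better."""
    return sum(rank_of_observed(s, o) for s, o in cases) / len(cases)


# Toy substitution profile for one structural environment (hypothetical values).
toy_profile = {aa: 0.0 for aa in AMINO_ACIDS}
toy_profile["L"] = 2.0
toy_profile["I"] = 1.0
```

A scoring scheme that ranks the residues actually observed in the alignments near the top of each profile yields a low mean rank, which is the sense in which one set of substitution scores can be compared with another.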

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Detection and Architecture of Small Heat Shock Protein Monomers

Author: Flatters Delphine
Gelly Jean-Christophe
Poulain Pierre
Publication venue: Public Library of Science
Publication date: 01/01/2010

BACKGROUND: Small Heat Shock Proteins (sHSPs) are chaperone-like proteins involved in the prevention of the irreversible aggregation of misfolded proteins. Although many studies have already been conducted on sHSPs, the molecular mechanisms and structural properties of these proteins remain unclear. Here, we aim at a better understanding of the architecture, organization and properties of the sHSP family through structural and functional annotations. We focused on the Alpha Crystallin Domain (ACD), a sandwich fold that is the hallmark of the sHSP family.
METHODOLOGY/PRINCIPAL FINDINGS: We developed a new approach for detecting sHSPs and delineating ACDs based on an iterative Hidden Markov Model algorithm using a multiple alignment profile generated from structural data on ACD. Using this procedure on the UniProt databank, we found 4478 sequences identified as sHSPs, showing very good coverage with the corresponding PROSITE and Pfam profiles. ACD was then delimited and structurally annotated. We showed that taxonomic-based groups of sHSPs (animals, plants, bacteria) have unique features regarding the length of their ACD and, more specifically, the length of a large loop within ACD. We detailed highly conserved residues and patterns specific to the whole family or to some groups of sHSPs. For 96% of studied sHSPs, we identified in the C-terminal region a conserved I/V/L-X-I/V/L motif that acts as an anchor in the oligomerization process. The fragment defined from the end of ACD to the end of this motif has a mean length of 14 residues and was named the C-terminal Anchoring Module (CAM).
CONCLUSIONS/SIGNIFICANCE: This work annotates structural components of ACD and quantifies properties of several thousand sHSPs. It gives a more accurate overview of the architecture of sHSP monomers.
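
As an illustration of the I/V/L-X-I/V/L motif and the CAM definition given above, here is a minimal sketch using a regular expression. The function name `find_cam`, the choice of the first motif occurrence after the ACD, and the toy sequence are assumptions; the study itself relied on HMM-based delineation of the ACD, which is not reproduced here.

```python
import re
from typing import Optional, Tuple

# I/V/L, then any residue, then I/V/L.
MOTIF = re.compile(r"[IVL].[IVL]")


def find_cam(sequence: str, acd_end: int) -> Optional[Tuple[int, int]]:
    """Return the (start, end) span, 0-based and half-open, running from the
    end of the ACD to the end of the first motif found in the C-terminal
    region, or None when no motif is present.

    `acd_end` is the index of the first residue after the ACD and is assumed
    to be known from a separate domain-delineation step.
    """
    match = MOTIF.search(sequence, acd_end)
    if match is None:
        return None
    return acd_end, match.end()


# Toy sequence: ten placeholder residues standing in for the ACD, then a tail.
toy_sequence = "AAAAAAAAAA" + "GSEPKRIPIQ"
cam = find_cam(toy_sequence, acd_end=10)
```

With this toy input the motif `IPI` ends at index 19, so the candidate CAM spans residues 10 to 18 inclusive (9 residues).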

Public Library of Science (PLOS)

CiteSeerX

HAL-Inserm

Directory of Open Access Journals

PubMed Central

Hal-Diderot

Combiner connaissances expertes, hors-ligne, transientes et en ligne pour l'exploration Monte-Carlo

Author: Chaslot Guillaume
Chatriot Louis
Fiter Christophe
Gelly Sylvain
Hoock Jean-Baptiste
Perez J.
Rimmel Arpad
Teytaud Olivier
Publication venue: 'Lavoisier'
Publication date: 01/01/2008

We combine, for Monte-Carlo tree exploration, machine learning at four different time scales: (1) online regret, through the use of bandit algorithms and Monte-Carlo estimates; (2) transient learning, through the use of rapid action value estimates (RAVE), which are learnt online and used to accelerate the exploration but are then gradually set aside as finer information becomes available; (3) offline learning, by data mining of datasets of games; (4) use of expert knowledge coming from the old ages as prior information. The resulting algorithm is stronger than each element taken separately. We also highlight an exploration-exploitation dilemma in Monte-Carlo tree exploration and obtain a very strong improvement by tuning the corresponding parameters.
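
The interplay between bandit-style selection and a transient RAVE estimate can be sketched as follows. This is a minimal illustration, not the authors' implementation: the weighting schedule `sqrt(k / (3n + k))`, the constants `k` and `c`, and all function names are assumptions chosen for the example.

```python
import math


def rave_weight(visits: int, k: float = 1000.0) -> float:
    """Weight of the RAVE estimate: 1 with no visits, decaying toward 0.

    The schedule and the value of k are assumptions, not taken from the abstract.
    """
    return math.sqrt(k / (3.0 * visits + k))


def blended_value(mc_mean: float, rave_mean: float, visits: int,
                  k: float = 1000.0) -> float:
    """Mix the fast but biased RAVE estimate with the Monte-Carlo mean."""
    beta = rave_weight(visits, k)
    return beta * rave_mean + (1.0 - beta) * mc_mean


def ucb_score(mc_mean: float, rave_mean: float, visits: int,
              parent_visits: int, c: float = 1.0, k: float = 1000.0) -> float:
    """Bandit-style score: blended value plus an exploration bonus.

    Unvisited moves get an infinite score so that each is tried once.
    """
    if visits == 0:
        return math.inf
    exploration = c * math.sqrt(math.log(parent_visits) / visits)
    return blended_value(mc_mean, rave_mean, visits, k) + exploration
```

The constant `c` controls the exploration-exploitation balance; the abstract reports that tuning the corresponding parameters brought a strong improvement, without stating the values used.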

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

HAL-Polytechnique

HAL-Rennes 1

A short survey on protein blocks.

Author: Agarwal Garima
Bornot Aurélie
Cadet Frédéric
de Brevern Alexandre
Etchebest Catherine
Gelly Jean-Christophe
Joseph Agnel
Mahajan Swapnil
Offmann Bernard
Schneider Bohdan
Srinivasan Narayanaswamy
Swapna Lakshmipuram
Tyagi Manoj
Valadié Hélène
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/08/2010

Protein structures are classically described in terms of secondary structures. Even if the regular secondary structures have relevant physical meaning, their recognition from atomic coordinates has some important limitations, such as uncertainties in the assignment of boundaries of helical and β-strand regions. Further, on average about 50% of all residues are assigned to an irregular state, i.e., the coil. Thus different research teams have focused on abstracting the conformation of the protein backbone in localized short stretches. Using different geometric measures, local stretches in protein structures are clustered into a chosen number of states. A prototype representative of the local structures in each cluster is generally defined. These libraries of local structure prototypes are called "structural alphabets". We have developed a structural alphabet, named Protein Blocks, not only to approximate protein structures, but also to predict them from sequence. Since its development, we and other teams have explored numerous new research fields using this structural alphabet. We review here some of the most interesting applications.
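
The idea of assigning each local stretch to the closest prototype of a structural alphabet can be sketched as below. The two prototypes, their angle values, and the use of a root-mean-square deviation on dihedral angles are illustrative assumptions; they are not the actual Protein Blocks definitions, which the abstract does not give.

```python
import math
from typing import Dict, Sequence


def angular_diff(a: float, b: float) -> float:
    """Smallest absolute difference between two angles, in degrees."""
    d = (a - b) % 360.0
    return min(d, 360.0 - d)


def rmsda(x: Sequence[float], y: Sequence[float]) -> float:
    """Root-mean-square deviation between two vectors of dihedral angles."""
    return math.sqrt(sum(angular_diff(a, b) ** 2 for a, b in zip(x, y)) / len(x))


def assign_state(fragment: Sequence[float],
                 prototypes: Dict[str, Sequence[float]]) -> str:
    """Return the name of the prototype closest to the fragment."""
    return min(prototypes, key=lambda name: rmsda(fragment, prototypes[name]))


# Hypothetical two-letter alphabet described by one (phi, psi) pair each.
TOY_PROTOTYPES = {
    "helix_like": (-60.0, -45.0),
    "strand_like": (-120.0, 130.0),
}
```

Sliding such an assignment along the backbone turns a 3D structure into a string over the alphabet, which is what makes sequence-like analyses of structures possible.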

Hal - Université Grenoble Alpes

Assignment of PolyProline II Conformation and Analysis of Sequence – Structure Relationship

Author: A Bornot
A Kentsis
A Rath
AA Adzhubei
AG de Brevern
Agnel Praveen Joseph
AK Jha
Alexandre G. de Brevern
AP Joseph
AW Chan
B Hess
B Offmann
B Zagrovic
BJ Stapley
BK Kay
BW Chellgren
C Etchebest
CM Venkatachalam
CY Wu
D Eisenberg
D Frishman
D van der Spoel
DA Beck
E Lindahl
E Polverini
EJ Thompson
EW Blanch
F Avbelj
F Eker
FC Bernstein
FC Peterson
FM Richards
G Darnell
G Faure
G Labesse
G Wang
GB Banks
GD Rose
HJC Berendsen
HM Berman
J Esque
J Makowska
J Martin
JC Horng
JC Kendrew
Jean-Christophe Gelly
JM Hicks
JS Richardson
K Chen
L Fourrier
L Pauling
LL Perskie
LL Porter
LR Rabiner
M Bansal
M Dudev
M Kuemin
M Mezei
M Tyagi
MA Kelly
Markus Buehler
MB Swindells
ML Tiffany
MV Cubellis
N Colloc'h
N Sreerama
NC Fitzkee
PK Vlasov
PL Obuchowski
PM Cowan
R Berisio
R Srinivasan
RV Pappu
S Arnott
S Jun
S Kutter
SA Hollingsworth
SJ Whittington
SM King
T Kameda
T Kohonen
TP Creamer
V Sasisekharan
W Kabsch
WL Jorgensen
Y Watanabe
Yohann Mansiaux
Z Liu
Z Shi
Publication venue: Public Library of Science
Publication date: 01/01/2011

BACKGROUND: Secondary structures are elements of great importance in structural biology, biochemistry and bioinformatics. They mainly comprise two repetitive structures, α-helices and β-sheets, besides turns; the rest is assigned to coil. These repetitive secondary structures have specific and conserved biophysical and geometric properties. The PolyProline II (PPII) helix is yet another interesting repetitive structure, which is less frequent and not usually associated with stabilizing interactions. Recent studies have shown that PPII frequency is higher than expected, and that PPII helices could have an important role in protein-protein interactions.
METHODOLOGY/PRINCIPAL FINDINGS: A major factor that limits the study of PPII is that its assignment cannot be carried out with the most commonly used secondary structure assignment methods (SSAMs). The purpose of this work is to propose a PPII assignment methodology that can be defined in the frame of DSSP secondary structure assignment. Considering the ambiguity in PPII assignments by different methods, a consensus assignment strategy was used. To define the most consensual rule of PPII assignment, three SSAMs that can assign PPII were compared and analyzed. The assignment rule was defined to have maximum coverage of all assignments made by these SSAMs. Few constraints were added to the assignment, and only PPII helices at least 2 residues long are defined.
CONCLUSIONS/SIGNIFICANCE: The simple rules designed in this study for characterizing PPII conformation lead to the assignment of 5% of all amino acids as PPII. Sequence-structure relationships associated with PPII, as defined by the different SSAMs, highlight a few striking differences. A specific study of amino acid preferences in the N- and C-cap regions was carried out, as well as of their solvent accessibility and contact patterns. The assignment of PPII can thus be coupled with DSSP, which opens a simple way for further analysis in this field.
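
A rule of the kind described above, layered on top of a DSSP-style assignment, might look like the following sketch. The angular window (φ near -75°, ψ near 145°, with a ±29° tolerance), the restriction to residues labelled as coil (written `-` here), and the output letter `P` are all assumptions for illustration; the abstract only states that PPII helices must be at least 2 residues long.

```python
from typing import List, Tuple

# Hypothetical PPII window in (phi, psi) space, in degrees.
PHI_CENTER, PSI_CENTER, TOLERANCE = -75.0, 145.0, 29.0


def in_ppii_region(phi: float, psi: float) -> bool:
    """True when both dihedral angles fall inside the assumed PPII window."""
    return (abs(phi - PHI_CENTER) <= TOLERANCE
            and abs(psi - PSI_CENTER) <= TOLERANCE)


def assign_ppii(dssp: str, angles: List[Tuple[float, float]],
                min_len: int = 2) -> str:
    """Relabel coil residues ('-') as 'P' in runs of at least `min_len`
    consecutive residues lying in the PPII window; other states are kept."""
    candidate = [state == "-" and in_ppii_region(phi, psi)
                 for state, (phi, psi) in zip(dssp, angles)]
    out = list(dssp)
    i = 0
    while i < len(candidate):
        if not candidate[i]:
            i += 1
            continue
        j = i
        while j < len(candidate) and candidate[j]:
            j += 1  # extend the run of consecutive candidates
        if j - i >= min_len:
            out[i:j] = "P" * (j - i)
        i = j
    return "".join(out)


toy_dssp = "H----E"
toy_angles = [(-60.0, -45.0), (-75.0, 145.0), (-70.0, 150.0),
              (-60.0, -40.0), (-75.0, 145.0), (-120.0, 130.0)]
toy_result = assign_ppii(toy_dssp, toy_angles)
```

In the toy input, residues 1 and 2 form a run of two and are relabelled, while residue 4 is an isolated candidate and stays as coil.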

Public Library of Science (PLOS)

Crossref

HAL-Inserm

Directory of Open Access Journals

PubMed Central

HAL Descartes

Hal-Diderot

EvDTree : structure-dependent substitution matrices based on decision tree classification of 3D environments

Author: Jean-Christophe Gelly
Jérôme Gracy
Laurent Chiche

Introduction: Substitution matrices are commonly used in sequence alignment or homology searches. They are the essential component in the detection of structure, function and evolutionary relationships between protein sequences. Substitution matrices derived from structural superposition of homologous pairs of proteins provide the best performance, and it has been shown that amino acid substitutions are indeed constrained by the structural environment, each environment displaying a distinct substitution pattern. One of the most reliable and popular tools for sequence-structure homology recognition, FUGUE, is based on environment-dependent matrices [1]. The FUGUE substitution matrices are deduced from a classification into 64 empirically selected 3D environments. Here we use hierarchical clustering and decision tree algorithms to determine optimal classifications of 3D environments leading to improved environment-dependent substitution matrices. Decision tree classifications appear rob

CiteSeerX

Système d'information et outils de prédiction structurale spécifiques de classes de protéines (Base de données KNOTTIN et matrices de substitution EvDTree dépendantes de la structure)

Author: CHICHE Laurent
GELLY Jean-Christophe
Publication date: 01/01/2004

MONTPELLIER-BU Sciences (341722106) / Sudoc, France

OpenGrey Repository

Protein Peeling 3D: new tools for analyzing protein structures.

Author: de Brevern Alexandre
Gelly Jean-Christophe
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2011

We present an improved version of our Protein Peeling web server dedicated to the analysis of protein structure architecture through the identification of protein units produced by an iterative splitting algorithm. New features include identification of structural domains, detection of unstructured terminal elements and evaluation of the stability of protein unit structures. AVAILABILITY: The website is free and open to all users with no login requirements at http://www.dsimb.inserm.fr/dsimb-tools/peeling3

HAL-Inserm

HAL Descartes

Hal-Diderot

A bioinformatic web server to cut protein structures in terms of Protein Units.

Author: de Brevern Alexandre
Gelly Jean-Christophe
Publication venue: Nova Book Press
Publication date: 01/11/2011

Analysis of the architecture and organization of protein structures is a major challenge for better understanding protein flexibility, folding, functions and interactions with their partners, and for designing new drugs. Protein structures are often described as series of alpha-helices and beta-sheets, or at a higher level as an arrangement of protein domains. Because an intermediate view giving a good understanding and description of protein structure architecture was lacking, we have proposed one: Protein Units (PUs), a novel level of protein structure description between secondary structures and domains. A PU is defined as a compact sub-region of the 3D structure corresponding to one sequence fragment, defined by a high number of intra-PU contacts and a low number of inter-PU contacts. The methodology to obtain PUs from protein structures is named Protein Peeling (PP). For the algorithm, protein structures are described as a succession of Cα atoms. The distances between Cα atoms are translated into contact probabilities using a logistic function. Protein Peeling only uses this contact probability matrix. An optimization procedure, based on the Matthews correlation coefficient (MCC) between contact probability sub-matrices, defines optimal cutting points that separate the region examined into two or three PUs. The process is iterated until the compactness of the resulting PUs reaches a given limit. An index assesses the compactness quality and relative independence of each PU. Protein Peeling is a tool to better understand and analyze the organization of protein structures. We have developed a dedicated bioinformatic web server: Protein Peeling 2 (PP2). Given the 3D coordinates of a protein, it proposes an automatic identification of protein units (PUs). The interface component consists of a web page (HTML) and a common gateway interface (CGI). The user can set many parameters and upload a given structure in PDB file format to a Perl core instance. This last component is a module that embeds all the information necessary for two other programs (mainly coded in C, to perform most of the computation tasks, and in R, for the analysis). Results are given both textually and graphically using the Jmol applet and PyMOL software. The server can be accessed from http://www.dsimb.inserm.fr/dsimb_tools/peeling/. Only one equivalent online methodology is available
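
The first steps described above (Cα distances turned into contact probabilities by a logistic function, then a search for a cutting point) can be sketched as follows. This is not the Protein Peeling code: the logistic parameters `d0` and `delta`, the MCC-like `partition_index` formula, and the restriction to a single cut into two units are assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def contact_probability(distance: float, d0: float = 8.0,
                        delta: float = 1.5) -> float:
    """Logistic mapping from a Calpha-Calpha distance (in angstroms) to a
    contact probability; d0 and delta are hypothetical parameter values."""
    return 1.0 / (1.0 + math.exp((distance - d0) / delta))


def contact_matrix(coords: Sequence[Coord]) -> List[List[float]]:
    """Pairwise contact probabilities for a chain of Calpha coordinates."""
    return [[contact_probability(math.dist(a, b)) for b in coords]
            for a in coords]


def partition_index(p: List[List[float]], cut: int) -> float:
    """MCC-like score for splitting residues [0, cut) from [cut, n):
    high when contacts are mostly within, not between, the two parts."""
    n = len(p)
    a = sum(p[i][j] for i in range(cut) for j in range(cut))
    b = sum(p[i][j] for i in range(cut, n) for j in range(cut, n))
    c = sum(p[i][j] for i in range(cut) for j in range(cut, n))
    return (a * b - c * c) / ((a + c) * (b + c))


def best_cut(p: List[List[float]]) -> int:
    """Cut position maximizing the partition index."""
    return max(range(1, len(p)), key=lambda cut: partition_index(p, cut))


# Toy chain: two compact groups of three residues, 30 angstroms apart.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (0.0, 3.8, 0.0),
              (30.0, 0.0, 0.0), (33.8, 0.0, 0.0), (30.0, 3.8, 0.0)]
toy_cut = best_cut(contact_matrix(toy_coords))
```

On the toy chain the best cut falls between the two groups (position 3). An iterative procedure would then apply the same search inside each resulting unit until a compactness limit is reached, as the abstract describes.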

HAL-Inserm

Hal-Diderot