908 research outputs found

Automatic prediction of catalytic residues by modeling residue structural neighborhood

Author: A Ceroni
A Humm
A Yamaguchi
AC Wallace
AE Todd
Andrea Passerini
CT Porter
E Chea
E Webb
E Youn
EF Pettersen
Elisa Cilia
G Amitai
G Bartlett
J Bernardes
J Davis
J Ebert
J Mistry
JA Capra
JC Nebel
JD Fischer
KM Borgwardt
L Xie
M Babor
M Lippi
M Ondrechen
MM Benning
N Cristianini
N Nagano
N Shu
NV Petrova
P Gherardini
RD Finn
S Kawashima
SF Altschul
T Joachims
T Zhang
W Tong
WS Valdar
Y Tang
Y Wei
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Prediction of catalytic residues is a major step in characterizing the function of enzymes. In its simpler formulation, the problem can be cast into a binary classification task at the residue level, by predicting whether the residue is directly involved in the catalytic process. The task is quite hard even when structural information is available, due to the rather wide range of roles a functional residue can play and to the large imbalance between the number of catalytic and non-catalytic residues. Results: We developed an effective representation of structural information by modeling spherical regions around candidate residues, and extracting statistics on the properties of their content such as physico-chemical properties, atomic density, flexibility, presence of water molecules. We trained an SVM classifier combining our features with sequence-based information and previously developed 3D features, and compared its performance with the most recent state-of-the-art approaches on different benchmark datasets. We further analyzed the discriminant power of the information provided by the presence of heterogens in the residue neighborhood. Conclusions: Our structure-based method achieves consistent improvements on all tested datasets over both sequence-based and structure-based state-of-the-art approaches. Structural neighborhood information is shown to be responsible for such results, and predicting the presence of nearby heterogens seems to be a promising direction for further improvements.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

DI-fusion

Active site prediction using evolutionary and structural information

Author: Aloy
Alterovitz
Altschul
Apweiler
Bagley
Baker
Bartlett
Bate
Berna
Brady
Capra
Casari
Chandonia
Davis
Edgar
Elcock
Fei Sha
Felsenstein
Fetrow
Fischer
Frey
George
Greenshtein
Gutteridge
Hastie
Hedstrom
Hedstrom
Henikoff
Hoggart
Hosmer
Huang
Hubbard
Innis
Jack F. Kirsch
Kabsch
Kimmen Sjölander
Koh
Kraut
Krem
Landau
Landgraf
Laurie
Lichtarge
Lin
Mayrose
McGrath
Michael I. Jordan
Mihalek
Mooney
Murzin
Ondrechen
Ota
Panchenko
Pazos
Peters
Petrova
Polgar
Porter
Richardson
Sankararaman
Segal
Shevade
Sriram Sankararaman
Tibshirani
Tong
van de Geer
Vàrallyay
Youn
Zhao
Publication venue: Oxford University Press
Publication date: 01/01/2010

Motivation: The identification of catalytic residues is a key step in understanding the function of enzymes. While a variety of computational methods have been developed for this task, accuracies have remained fairly low. The best existing method exploits information from sequence and structure to achieve a precision (the fraction of predicted catalytic residues that are catalytic) of 18.5% at a corresponding recall (the fraction of catalytic residues identified) of 57% on a standard benchmark. Here we present a new method, Discern, which provides a significant improvement over the state-of-the-art through the use of statistical techniques to derive a model with a small set of features that are jointly predictive of enzyme active sites
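The two metrics defined in this abstract can be illustrated with a minimal sketch. The function name `precision_recall` and the residue counts below are hypothetical, chosen only so that the ratios come out near the 18.5% precision and 57% recall quoted above; they are not taken from the paper or its benchmark.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) from residue-level prediction counts.

    precision = fraction of predicted catalytic residues that are catalytic
    recall    = fraction of catalytic residues that were identified
    """
    predicted = true_positives + false_positives   # all residues predicted catalytic
    actual = true_positives + false_negatives      # all truly catalytic residues
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


# Hypothetical counts (an assumption for illustration, not data from the paper):
# 100 catalytic residues, 57 of them recovered, 251 false alarms.
precision, recall = precision_recall(57, 251, 43)
print(f"precision={precision:.3f} recall={recall:.2f}")
```

With these assumed counts, 57 of 308 predictions are correct, which shows how a method can recover more than half of the catalytic residues while most of its predictions are still false positives, a consequence of the strong class imbalance at the residue level.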

CiteSeerX

Crossref

PubMed Central

eScholarship - University of California

Identification of a new family of putative PD-(D/E)XK nucleases with unusual phylogenomic distribution and a new type of the active site

Author: Bujnicki Janusz M
Feder Marcin
Publication venue: BioMed Central
Publication date: 01/01/2005

BACKGROUND: Prediction of structure and function for uncharacterized protein families by identification of evolutionary links to characterized families and known structures is one of the cornerstones of genomics. Theoretical assignment of three-dimensional folds and prediction of protein function even at a very general level can facilitate the experimental determination of the molecular mechanism of action and the role that members of a given protein family fulfill in the cell. Here, we predict the three-dimensional fold and study the phylogenomic distribution of members of a large family of uncharacterized proteins classified in the Clusters of Orthologous Groups database as COG4636. RESULTS: Using protein fold-recognition we found that members of COG4636 are remotely related to Holliday junction resolvases and other nucleases from the PD-(D/E)XK superfamily. Structure modeling and sequence analyses suggest that most members of COG4636 exhibit a new, unusual variant of the putative active site, in which the catalytic Lys residue migrated in the sequence, but retained similar spatial position with respect to other functionally important residues. Sequence analyses revealed that members of COG4636 and their homologs are found mainly in Cyanobacteria, but also in other bacterial phyla. They undergo horizontal transfer and extensive proliferation in the colonized genomes; for instance in Gloeobacter violaceus PCC 7421 they comprise over 2% of all protein-encoding genes. Thus, members of COG4636 appear to be a new type of selfish genetic elements, which may fulfill an important role in the genome dynamics of Cyanobacteria and other species they invaded. Our analyses provide a platform for experimental determination of the molecular and cellular function of members of this large protein family. 
CONCLUSION: After submission of this manuscript, a crystal structure of one of the COG4636 members was released in the Protein Data Bank (code 1wdj; Idaka, M., Wada, T., Murayama, K., Terada, T., Kuramitsu, S., Shirouzu, M., Yokoyama, S.: Crystal structure of Tt1808 from Thermus thermophilus Hb8, to be published). Our analysis of the Tt1808 structure reveals that we correctly predicted all functionally important features of the COG4636 family, including the membership in the PD-(D/E)xK superfamily of nucleases, the three-dimensional fold, the putative catalytic residues, and the unusual configuration of the active site

Springer - Publisher Connector

PubMed Central

Type II restriction endonuclease R.Hpy188I belongs to the GIY-YIG nuclease superfamily, but exhibits an unusual active site

Author: Boniecki Michal
Bujnicki Janusz M
Kaminska Katarzyna H
Kawai Mikihiko
Kobayashi Ichizo
Publication venue: BioMed Central
Publication date: 01/01/2008

Crossref

Springer - Publisher Connector

PubMed Central

The interplay of descriptor-based computational analysis with pharmacophore modeling builds the basis for a novel classification scheme for feruloyl esterases

Author: Akin
Altschul
Andersen
Andreasen
Aurilia
Barnum
Bartolomé
Bendtsen
Benner
Benoit
Benoit
Bhasin
Bhasin
Blum
Cai
Cai
Castanares
Chang
Choi
Crepin
D.B.R.K. Gupta Udatha
Dodd
Donaghy
Donaghy
Dudoit
Dysvik
Ewing
Faulds
Ferguson
Fillingham
Finn
Garcia-Conesa
García-Conesa
Garrigues
Gasteiger
Gasteiger
Gianni Panagiotou
Giuliani
Goldstone
Hall
Han
Hatzakis
Henikoff
Hermoso
Hsu
Humberstone
Huson
Irene Kouskoumvekaki
Kaiser
Karchin
Keerthi
Kheder
Kikuzaki
Kim
Kohavi
Kohonen
Koseki
Koseki
Kroon
Kroon
Kumar
Lao
Larkin
Laszlo
Latha
Lee
Lesage-Meessen
Levasseur
Levasseur
Li
Lima
Lisbeth Olsson
MacKay
Marcotte
McAuley
Meinicke
Morris
Mukherjee
Nielsen
Noble
Nsereko
Oili
Ong
Platt
Prates
Pérez-Bercoff
Rashamuse
Record
Rost
Sancho
Sankararaman
Sankararaman
Schrödinger Suite 2009
Schubot
Slavin
Tarbouriech
Teodoro
Thompson
Tomoko
Topakas
Topakas
Topakas
Topakas
Topakas
Tsuchiyama
Tsuchiyama
Uestuen
Vafiadi
Vafiadi
Vafiadi
Vafiadi
Vafiadi
Vafiadi
Wang
Wang
Wang
Wilkinson
Publication date: 11/08/2010

One of the most intriguing groups of enzymes, the feruloyl esterases (FAEs), is ubiquitous in both simple and complex organisms. FAEs have gained importance in biofuel, medicine and food industries due to their capability of acting on a large range of substrates for cleaving ester bonds and synthesizing high added-value molecules through esterification and transesterification reactions. During the past two decades extensive studies have been carried out on the production and partial characterization of FAEs from fungi, while much less is known about FAEs of bacterial or plant origin. Initial classification studies on FAEs were restricted to sequence similarity and substrate specificity on just four model substrates and considered only a handful of FAEs belonging to the fungal kingdom. This study centers on the descriptor-based classification and structural analysis of experimentally verified and putative FAEs; nevertheless, the framework presented here is applicable to every poorly characterized enzyme family. 365 FAE-related sequences of fungal, bacterial and plant origin were collected and clustered using Self Organizing Maps followed by k-means clustering into distinct groups based on amino acid composition and physico-chemical composition descriptors derived from the respective amino acid sequence. A Support Vector Machine model was subsequently constructed for the classification of new FAEs into the pre-assigned clusters. The model successfully recognized 98.2% of the training sequences and all the sequences of the blind test. The underlying functionality of the 12 proposed FAE families was validated against a combination of prediction tools and published experimental data. Another important aspect of the present work involves the development of pharmacophore models for the new FAE families, for which sufficient information on known substrates existed.
Knowing the pharmacophoric features of a small molecule that are essential for binding to the members of a certain family opens a window of opportunities for tailored applications of FAEs

Crossref

Chalmers Research

Nature Precedings

Online Research Database In Technology

Chalmers Publication Library

HKU Scholars Hub

L1pred: A Sequence-Based Prediction Tool for Catalytic Residues in Enzymes with the L1-logreg Classifier

Author: A Armon
A del Sol Mesa
A Gutteridge
AR Panchenko
B Sterner
C Berezin
C Marino Buslje
C Porter
CA Innis
Chi Zhang
D La
DR Caffrey
E Chea
E Cilia
E Greenshtein
E Youn
F Glaser
G Lopez
GJ Bartlett
HM Berman
I Mayrose
I Mihalek
IA Vergara
Iddo Friedberg
J Capra
J Pei
JD Fischer
Jialiang Yang
Jun Wang
K Koh
K Wang
K Ye
KC Bahadur Dukka
L Mirny
LJ McGuffin
M Brylinski
M Landau
N Petrova
P Zhao
R Alterovitz
RM Sweet
RM Williamson
S Ahmad
S Gong
S Pande
S Sankararaman
S Sankararaman
SA van de Geer
SF Altschul
SW Zhang
T Kato
T Zhang
W Taylor
W Tong
W Valdar
XS Liu
YC Dou
YC Dou
YC Dou
Yongchao Dou
YR Tang
ZP Liu
Publication venue: Public Library of Science
Publication date: 01/01/2012

To understand enzyme functions, identifying the catalytic residues is a usual first step. Moreover, knowledge about catalytic residues is also useful for protein engineering and drug-design. However, to experimentally identify catalytic residues remains challenging for reasons of time and cost. Therefore, computational methods have been explored to predict catalytic residues. Here, we developed a new algorithm, L1pred, for catalytic residue prediction, by using the L1-logreg classifier to integrate eight sequence-based scoring functions. We tested L1pred and compared it against several existing sequence-based methods on carefully designed datasets Data604 and Data63. With ten-fold cross-validation, L1pred showed the area under precision-recall curve (AUPR) and the area under ROC curve (AUC) of 0.2198 and 0.9494 on the training dataset, Data604, respectively. In addition, on the independent test dataset, Data63, it showed the AUPR and AUC values of 0.2636 and 0.9375, respectively. Compared with other sequence-based methods, L1pred showed the best performance on both datasets. We also analyzed the importance of each attribute in the algorithm, and found that all the scores contributed more or less equally to the L1pred performance

CiteSeerX

Public Library of Science (PLOS)

Crossref

DigitalCommons@University of Nebraska

Directory of Open Access Journals

PubMed Central

Prediction of functionally important residues in globular proteins from unusual central distances of amino acids

Author: Kochańczyk Marek
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Well-performing automated protein function recognition approaches usually comprise several complementary techniques. Besides constructing better consensus, their predictive power can be improved by either adding or refining independent modules that explore orthogonal features of proteins. In this work, we demonstrated how the exploration of global atomic distributions can be used to indicate functionally important residues. Results: Using a set of carefully selected globular proteins, we parametrized continuous probability density functions describing preferred central distances of individual protein atoms. Relative preferred burials were estimated using mixture models of radial density functions dependent on the amino acid composition of a protein under consideration. The unexpectedness of extraordinary locations of atoms was evaluated in an information-theoretic manner and used directly for the identification of key amino acids. In the validation study, we tested capabilities of a tool built upon our approach, called SurpResi, by searching for binding sites interacting with ligands. The tool indicated multiple candidate sites achieving success rates comparable to several geometric methods. We also showed that the unexpectedness is a property of regions involved in protein-protein interactions, and thus can be used for the ranking of protein docking predictions. The computational approach implemented in this work is freely available via a Web interface at http://www.bioinformatics.org/surpresi. Conclusions: Probabilistic analysis of atomic central distances in globular proteins is capable of capturing distinct orientational preferences of amino acids as resulting from different sizes, charges and hydrophobic characters of their side chains. When idealized spatial preferences can be inferred from the sole amino acid composition of a protein, residues located in hydrophobically unfavorable environments can be easily detected.
Such residues turn out to be often directly involved in binding ligands or interfacing with other proteins.

CiteSeerX

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

On the Structural Context and Identification of Enzyme Catalytic Residues

Publication venue: Hindawi Limited
Publication date: 01/01/2013

Crossref

Computational approaches to predict protein functional families and functional sites.

Author: Abbasian M
Orengo CA
Rauer C
Sen N
Waman VP
Publication date: 01/10/2021

Understanding the mechanisms of protein function is indispensable for many biological applications, such as protein engineering and drug design. However, experimental annotations are sparse, and therefore, theoretical strategies are needed to fill the gap. Here, we present the latest developments in building functional subclassifications of protein superfamilies and using evolutionary conservation to detect functional determinants, for example, catalytic-, binding- and specificity-determining residues important for delineating the functional families. We also briefly review other features exploited for functional site detection and new machine learning strategies for combining multiple features

UCL Discovery

Insights into the structure and dynamics of lysyl oxidase propeptide, a flexible protein with numerous partners

Author: Duclos Bertrand
Liwo Adam
Miele Adriana E.
Ricard-Blum Sylvie
Samsonov Sergey A.
Uciechowska-Kaczmarzyk Urszula
Vallet Sylvain D.
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2018

Lysyl oxidase (LOX) catalyzes the oxidative deamination of lysine and hydroxylysine residues in collagens and elastin, which is the first step of the cross-linking of these extracellular matrix proteins. It is secreted as a proenzyme activated by bone morphogenetic protein-1, which releases the LOX catalytic domain and its bioactive N-terminal propeptide. We characterized the recombinant human propeptide by circular dichroism, dynamic light scattering, and small-angle X-ray scattering (SAXS), and showed that it is elongated, monomeric, disordered and flexible (Dmax: 11.7 nm, Rg: 3.7 nm). We generated 3D models of the propeptide by coarse-grained molecular dynamics simulations restrained by SAXS data, which were used for docking experiments. Furthermore, we have identified 17 new binding partners of the propeptide by label-free assays. They include four glycosaminoglycans (hyaluronan, chondroitin, dermatan and heparan sulfate), collagen I, cross-linking and proteolytic enzymes (lysyl oxidase-like 2, transglutaminase-2, matrix metalloproteinase-2), a proteoglycan (fibromodulin), one growth factor (Epidermal Growth Factor, EGF), and one membrane protein (tumor endothelial marker-8). This suggests new roles for the propeptide in EGF signaling pathway

Archivio della ricerca- Università di Roma La Sapienza

Hal-Diderot