9 research outputs found

    Predicting zinc binding at the proteome level

    BACKGROUND: Metalloproteins are proteins capable of binding one or more metal ions, which may be required for their biological function, for the regulation of their activities, or for structural purposes. Metal-binding properties remain difficult to predict and to investigate experimentally at the whole-proteome level. Consequently, current knowledge about metalloproteins is only partial.
    RESULTS: The present work reports on the development of a machine learning method, based on support vector machines, for predicting the zinc-binding state of pairs of nearby amino acids. The predictor was trained on chains containing zinc-binding sites (positive examples) and on non-metalloproteins (negative examples). Results based on strong non-redundancy tests show that (1) zinc-binding residues can be predicted and (2) modelling the correlation between the binding states of nearby residues significantly improves performance. The trained predictor was then applied to the human proteome. The results were in good agreement with the outcomes of previous, highly manually curated efforts to identify human zinc-binding proteins. Some previously unreported zinc-binding sites were identified and further validated through structural modelling. The software implementing the predictor is freely available at:
    CONCLUSION: The proposed approach constitutes a highly automated tool for the identification of metalloproteins that provides results of quality comparable to highly manually refined predictions. Modelling the correlation between the binding states of residue pairs yields a significant improvement over standard 1D approaches. In addition, the method identifies previously unreported metal sites, providing important hints for experimentalists.
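The pairwise classification setup described above can be illustrated with a minimal sketch. This is not the authors' predictor or data: the features (whether each candidate residue is Cys or His, plus the sequence gap between them) and the toy training pairs are hypothetical stand-ins, with the SVM provided by scikit-learn.

```python
# Minimal sketch of pairwise zinc-binding classification (hypothetical
# features, toy data; not the paper's predictor or data set).
from sklearn.svm import SVC

CHELATORS = {"C", "H"}  # Cys and His are the most common zinc ligands

def pair_features(res1, res2, gap):
    """Encode a pair of candidate residues and the sequence gap between them."""
    return [int(res1 in CHELATORS), int(res2 in CHELATORS), gap]

# Hypothetical labeled pairs: (residue 1, residue 2, gap, jointly binding?)
train = [("C", "C", 2, 1), ("C", "H", 3, 1), ("H", "H", 1, 1),
         ("A", "L", 2, 0), ("S", "T", 4, 0), ("C", "A", 3, 0)]
X = [pair_features(r1, r2, g) for r1, r2, g, _ in train]
y = [label for *_, label in train]

# Train an RBF-kernel SVM on the pair examples
clf = SVC(kernel="rbf").fit(X, y)
```

A new Cys/His pair can then be scored with `clf.predict([pair_features("C", "H", 2)])`; the real predictor would use far richer sequence-derived features.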

    Weighted decomposition kernels

    We introduce a family of kernels on discrete data structures within the general class of decomposition kernels. A weighted decomposition kernel (WDK) is computed by dividing objects into substructures indexed by a selector. Two substructures match if their selectors satisfy an equality predicate, and the importance of the match is determined by a probability kernel on local distributions fitted to the substructures. Under reasonable assumptions, a WDK can be computed efficiently and avoids combinatorial explosion of the feature space. We report experimental evidence that the proposed kernel is highly competitive with more complex state-of-the-art methods on a set of problems in bioinformatics.
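The WDK computation described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: substructures are represented as lists of discrete labels, the selector predicate is plain equality, and the probability kernel is a Bhattacharyya-style kernel on empirical label distributions (all of these choices are assumptions).

```python
from collections import Counter
import math

def local_distribution(substructure):
    """Empirical label distribution over a substructure (a list of labels)."""
    counts = Counter(substructure)
    total = sum(counts.values())
    return {label: c / total for label, c in counts.items()}

def probability_kernel(p, q):
    """Bhattacharyya-style kernel between two discrete distributions."""
    return sum(math.sqrt(p[l] * q[l]) for l in set(p) & set(q))

def weighted_decomposition_kernel(x, y):
    """WDK sketch: x and y are lists of (selector, substructure) pairs.

    Substructures match when their selectors are equal; each match is
    weighted by a probability kernel on the local label distributions
    fitted on the two substructures.
    """
    k = 0.0
    for sel_x, sub_x in x:
        for sel_y, sub_y in y:
            if sel_x == sel_y:  # equality predicate on selectors
                k += probability_kernel(local_distribution(sub_x),
                                        local_distribution(sub_y))
    return k
```

Because the sum runs only over selector-matched substructure pairs, the cost is quadratic in the number of substructures rather than exponential in the feature space.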

    Wide coverage natural language processing using kernel methods and neural networks for structured data

    Convolution kernels and recursive neural networks are both suitable approaches to supervised learning when the input is a discrete structure such as a labeled tree or graph. We compare these techniques on two natural language problems. In both, the learning task consists of choosing the best alternative tree from a set of candidates. We report on an empirical comparison of the two methods on a large corpus of parsed sentences and speculate on the roles played by the representation and the loss function.

    Comparing Convolution Kernels and Recursive Neural Networks for Learning Preferences on Structured Data

    Convolution kernels and recursive neural networks (RNNs) are both suitable approaches to supervised learning when the input portion of an instance is a discrete structure such as a tree or a graph. We report on an empirical comparison of the two architectures in a large-scale preference learning problem in natural language processing, where instances are candidate incremental parse trees. We found that kernels never outperform RNNs, even when only a limited number of examples is available for learning. We argue that convolution kernels may lead to feature-space representations that are too sparse and too general because they are not focused on the specific learning task. The adaptive encoding mechanism of RNNs, in this case, yields better prediction accuracy at smaller computational cost.

    Improving prediction of zinc binding sites by modeling the linkage between residues close in sequence

    We describe and empirically evaluate machine learning methods for the prediction of zinc-binding sites from protein sequences. We start by observing that a data set consisting of single residues as examples is affected by autocorrelation, and we propose an ad-hoc remedy in which sequentially close pairs of candidate residues are classified as being jointly involved in the coordination of a zinc ion. We develop a kernel for this particular type of data that can handle variable-length gaps between candidate coordinating residues. Our empirical evaluation on a data set of non-redundant protein chains shows that explicitly modeling the correlation between residues close in sequence yields a significant improvement in prediction performance.
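A toy illustration of handling variable-length gaps, under stated assumptions and not the paper's actual kernel: each candidate pair is represented by the sequence contexts of its two residues plus the gap length between them; contexts are compared position-wise, and a gap-length mismatch is downweighted exponentially so that differing gaps degrade similarity smoothly rather than blocking a match.

```python
import math

def pair_kernel(a, b, lam=0.5):
    """Toy kernel on candidate coordinating-residue pairs (illustrative only).

    Each instance is (context1, gap, context2): two short sequence windows
    around the candidate residues and the number of residues between them.
    Contexts are compared position-wise; differing gap lengths are
    downweighted exponentially, so variable-length gaps are handled
    smoothly instead of requiring a hard alignment.
    """
    c1a, gap_a, c2a = a
    c1b, gap_b, c2b = b
    matches = sum(x == y for x, y in zip(c1a, c1b))
    matches += sum(x == y for x, y in zip(c2a, c2b))
    return matches * math.exp(-lam * abs(gap_a - gap_b))
```

For example, two pairs with identical contexts but gaps of 2 and 5 residues still match, just with a smaller weight than two pairs with equal gaps.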

    Decomposition Kernels for Natural Language Processing

    We propose a simple solution to the sequence labeling problem based on an extension of weighted decomposition kernels. We additionally introduce a multi-instance kernel approach for representing lexical word-sense information. These ideas have been preliminarily tested on named entity recognition and PP-attachment disambiguation. We finally suggest how these techniques could be merged using a declarative formalism that may provide a basis for integrating multiple sources of information in kernel-based learning for NLP.