13,300 research outputs found
Identification of functionally related enzymes by learning-to-rank methods
Enzyme sequences and structures are routinely used in the biological sciences
as queries to search for functionally related enzymes in online databases. To
this end, one usually departs from some notion of similarity, comparing two
enzymes by looking for correspondences in their sequences, structures or
surfaces. For a given query, the search operation results in a ranking of the
enzymes in the database, from very similar to dissimilar enzymes, while
information about the biological function of annotated database enzymes is
ignored.
In this work we show that rankings of that kind can be substantially improved
by applying kernel-based learning algorithms. This approach enables the
detection of statistical dependencies between similarities of the active cleft
and the biological function of annotated enzymes. This is in contrast to
search-based approaches, which do not take annotated training data into
account. Similarity measures based on the active cleft are known to outperform
sequence-based or structure-based measures under certain conditions. We
consider the Enzyme Commission (EC) classification hierarchy for obtaining
annotated enzymes during the training phase. The results of a set of sizeable
experiments indicate a consistent and significant improvement for a set of
similarity measures that exploit information about small cavities in the
surface of enzymes
Efficient Regularized Least-Squares Algorithms for Conditional Ranking on Relational Data
In domains like bioinformatics, information retrieval and social network
analysis, one can find learning tasks where the goal consists of inferring a
ranking of objects, conditioned on a particular target object. We present a
general kernel framework for learning conditional rankings from various types
of relational data, where rankings can be conditioned on unseen data objects.
We propose efficient algorithms for conditional ranking by optimizing squared
regression and ranking loss functions. We show theoretically, that learning
with the ranking loss is likely to generalize better than with the regression
loss. Further, we prove that symmetry or reciprocity properties of relations
can be efficiently enforced in the learned models. Experiments on synthetic and
real-world data illustrate that the proposed methods deliver state-of-the-art
performance in terms of predictive power and computational efficiency.
Moreover, we also show empirically that incorporating symmetry or reciprocity
properties can improve the generalization performance
Advances in Hyperspectral Image Classification: Earth monitoring with statistical learning methods
Hyperspectral images show similar statistical properties to natural grayscale
or color photographic images. However, the classification of hyperspectral
images is more challenging because of the very high dimensionality of the
pixels and the small number of labeled examples typically available for
learning. These peculiarities lead to particular signal processing problems,
mainly characterized by indetermination and complex manifolds. The framework of
statistical learning has gained popularity in the last decade. New methods have
been presented to account for the spatial homogeneity of images, to include
user's interaction via active learning, to take advantage of the manifold
structure with semisupervised learning, to extract and encode invariances, or
to adapt classifiers and image representations to unseen yet similar scenes.
This tutuorial reviews the main advances for hyperspectral remote sensing image
classification through illustrative examples.Comment: IEEE Signal Processing Magazine, 201
- …