14,519 research outputs found
Graph Kernels
We present a unified framework to study graph kernels, special cases of which include the random
walk (Gärtner et al., 2003; Borgwardt et al., 2005) and marginalized (Kashima et al., 2003, 2004;
Mahé et al., 2004) graph kernels. Through reduction to a Sylvester equation we improve the time
complexity of kernel computation between unlabeled graphs with n vertices from O(n^6) to O(n^3).
We find a spectral decomposition approach even more efficient when computing entire kernel matrices.
For labeled graphs we develop conjugate gradient and fixed-point methods that take O(dn^3)
time per iteration, where d is the size of the label set. By extending the necessary linear algebra to
Reproducing Kernel Hilbert Spaces (RKHS) we obtain the same result for d-dimensional edge kernels,
and O(n^4) in the infinite-dimensional case; on sparse graphs these algorithms only take O(n^2)
time per iteration in all cases. Experiments on graphs from bioinformatics and other application
domains show that these techniques can speed up computation of the kernel by an order of magnitude
or more. We also show that certain rational kernels (Cortes et al., 2002, 2003, 2004) when
specialized to graphs reduce to our random walk graph kernel. Finally, we relate our framework to
R-convolution kernels (Haussler, 1999) and provide a kernel that is close to the optimal assignment
kernel of Fröhlich et al. (2006) yet provably positive semi-definite
On multi-view learning with additive models
In many scientific settings data can be naturally partitioned into variable
groupings called views. Common examples include environmental (1st view) and
genetic information (2nd view) in ecological applications, chemical (1st view)
and biological (2nd view) data in drug discovery. Multi-view data also occur in
text analysis and proteomics applications where one view consists of a graph
with observations as the vertices and a weighted measure of pairwise similarity
between observations as the edges. Further, in several of these applications
the observations can be partitioned into two sets, one where the response is
observed (labeled) and the other where the response is not (unlabeled). The
problem for simultaneously addressing viewed data and incorporating unlabeled
observations in training is referred to as multi-view transductive learning. In
this work we introduce and study a comprehensive generalized fixed point
additive modeling framework for multi-view transductive learning, where any
view is represented by a linear smoother. The problem of view selection is
discussed using a generalized Akaike Information Criterion, which provides an
approach for testing the contribution of each view. An efficient implementation
is provided for fitting these models with both backfitting and local-scoring
type algorithms adjusted to semi-supervised graph-based learning. The proposed
technique is assessed on both synthetic and real data sets and is shown to be
competitive to state-of-the-art co-training and graph-based techniques.Comment: Published in at http://dx.doi.org/10.1214/08-AOAS202 the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org
Inductive queries for a drug designing robot scientist
It is increasingly clear that machine learning algorithms need to be integrated in an iterative scientific discovery loop, in which data is queried repeatedly by means of inductive queries and where the computer provides guidance to the experiments that are being performed. In this chapter, we summarise several key challenges in achieving this integration of machine learning and data mining algorithms in methods for the discovery of Quantitative Structure Activity Relationships (QSARs). We introduce the concept of a robot scientist, in which all steps of the discovery process are automated; we discuss the representation of molecular data such that knowledge discovery tools can analyse it, and we discuss the adaptation of machine learning and data mining algorithms to guide QSAR experiments
Sparse Learning over Infinite Subgraph Features
We present a supervised-learning algorithm from graph data (a set of graphs)
for arbitrary twice-differentiable loss functions and sparse linear models over
all possible subgraph features. To date, it has been shown that under all
possible subgraph features, several types of sparse learning, such as Adaboost,
LPBoost, LARS/LASSO, and sparse PLS regression, can be performed. Particularly
emphasis is placed on simultaneous learning of relevant features from an
infinite set of candidates. We first generalize techniques used in all these
preceding studies to derive an unifying bounding technique for arbitrary
separable functions. We then carefully use this bounding to make block
coordinate gradient descent feasible over infinite subgraph features, resulting
in a fast converging algorithm that can solve a wider class of sparse learning
problems over graph data. We also empirically study the differences from the
existing approaches in convergence property, selected subgraph features, and
search-space sizes. We further discuss several unnoticed issues in sparse
learning over all possible subgraph features.Comment: 42 pages, 24 figures, 4 table
Context-aware visual exploration of molecular databases
Facilitating the visual exploration of scientific data has
received increasing attention in the past decade or so. Especially
in life science related application areas the amount
of available data has grown at a breath taking pace. In this
paper we describe an approach that allows for visual inspection
of large collections of molecular compounds. In
contrast to classical visualizations of such spaces we incorporate
a specific focus of analysis, for example the outcome
of a biological experiment such as high throughout
screening results. The presented method uses this experimental
data to select molecular fragments of the underlying
molecules that have interesting properties and uses the
resulting space to generate a two dimensional map based
on a singular value decomposition algorithm and a self organizing
map. Experiments on real datasets show that
the resulting visual landscape groups molecules of similar
chemical properties in densely connected regions
kLog: A Language for Logical and Relational Learning with Kernels
We introduce kLog, a novel approach to statistical relational learning.
Unlike standard approaches, kLog does not represent a probability distribution
directly. It is rather a language to perform kernel-based learning on
expressive logical and relational representations. kLog allows users to specify
learning problems declaratively. It builds on simple but powerful concepts:
learning from interpretations, entity/relationship data modeling, logic
programming, and deductive databases. Access by the kernel to the rich
representation is mediated by a technique we call graphicalization: the
relational representation is first transformed into a graph --- in particular,
a grounded entity/relationship diagram. Subsequently, a choice of graph kernel
defines the feature space. kLog supports mixed numerical and symbolic data, as
well as background knowledge in the form of Prolog or Datalog programs as in
inductive logic programming systems. The kLog framework can be applied to
tackle the same range of tasks that has made statistical relational learning so
popular, including classification, regression, multitask learning, and
collective classification. We also report about empirical comparisons, showing
that kLog can be either more accurate, or much faster at the same level of
accuracy, than Tilde and Alchemy. kLog is GPLv3 licensed and is available at
http://klog.dinfo.unifi.it along with tutorials
A Survey on Graph Kernels
Graph kernels have become an established and widely-used technique for
solving classification tasks on graphs. This survey gives a comprehensive
overview of techniques for kernel-based graph classification developed in the
past 15 years. We describe and categorize graph kernels based on properties
inherent to their design, such as the nature of their extracted graph features,
their method of computation and their applicability to problems in practice. In
an extensive experimental evaluation, we study the classification accuracy of a
large suite of graph kernels on established benchmarks as well as new datasets.
We compare the performance of popular kernels with several baseline methods and
study the effect of applying a Gaussian RBF kernel to the metric induced by a
graph kernel. In doing so, we find that simple baselines become competitive
after this transformation on some datasets. Moreover, we study the extent to
which existing graph kernels agree in their predictions (and prediction errors)
and obtain a data-driven categorization of kernels as result. Finally, based on
our experimental results, we derive a practitioner's guide to kernel-based
graph classification
- …