A tree-based kernel for graphs with continuous attributes
The availability of graph data with node attributes that can be either
discrete or real-valued is constantly increasing. While existing kernel methods
are effective techniques for dealing with graphs having discrete node labels,
their adaptation to non-discrete or continuous node attributes has been
limited, mainly due to computational issues. Recently, a few kernels especially
tailored for this domain, and that trade predictive performance for
computational efficiency, have been proposed. In this paper, we propose a graph
kernel for complex and continuous node attributes, whose features are tree
structures extracted from specific graph visits. The kernel manages to keep the
same complexity as state-of-the-art kernels while implicitly using a larger
feature space. We further present an approximated variant of the kernel which
reduces its complexity significantly. Experimental results obtained on six
real-world datasets show that the kernel is the best performing one on most of
them. Moreover, in most cases the approximated version achieves classification
accuracy comparable to current state-of-the-art kernels while greatly
shortening the running times.
Comment: This work has been submitted to the IEEE Transactions on Neural
Networks and Learning Systems for possible publication. Copyright may be
transferred without notice, after which this version may no longer be
accessibl
A Survey on Graph Kernels
Graph kernels have become an established and widely-used technique for
solving classification tasks on graphs. This survey gives a comprehensive
overview of techniques for kernel-based graph classification developed in the
past 15 years. We describe and categorize graph kernels based on properties
inherent to their design, such as the nature of their extracted graph features,
their method of computation and their applicability to problems in practice. In
an extensive experimental evaluation, we study the classification accuracy of a
large suite of graph kernels on established benchmarks as well as new datasets.
We compare the performance of popular kernels with several baseline methods and
study the effect of applying a Gaussian RBF kernel to the metric induced by a
graph kernel. In doing so, we find that simple baselines become competitive
after this transformation on some datasets. Moreover, we study the extent to
which existing graph kernels agree in their predictions (and prediction errors)
and obtain a data-driven categorization of kernels as a result. Finally, based on
our experimental results, we derive a practitioner's guide to kernel-based
graph classification.
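The transformation mentioned in this abstract (a Gaussian RBF kernel applied to the metric induced by a graph kernel) can be illustrated with a minimal sketch. It assumes a precomputed graph-kernel matrix K and uses the standard kernel-induced squared distance d(G, H)^2 = k(G, G) + k(H, H) - 2 k(G, H). The function name `rbf_from_gram`, the parameter `gamma`, and the toy matrix are hypothetical choices for illustration, not taken from the survey.

```python
import math

def rbf_from_gram(K, gamma=1.0):
    """Apply a Gaussian RBF to the metric induced by a kernel matrix K.

    K is a square list-of-lists holding precomputed kernel values.
    gamma is an assumed bandwidth parameter (hypothetical name).
    """
    n = len(K)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Squared distance in the feature space of the base kernel.
            d2 = K[i][i] + K[j][j] - 2.0 * K[i][j]
            # Guard against tiny negative values from rounding.
            d2 = max(d2, 0.0)
            out[i][j] = math.exp(-gamma * d2)
    return out

# Toy 2x2 kernel matrix (illustrative values only).
K = [[2.0, 1.0], [1.0, 3.0]]
K_rbf = rbf_from_gram(K, gamma=0.5)
```

Here the off-diagonal squared distance is 2 + 3 - 2 = 3, so the transformed entry is exp(-1.5), and every diagonal entry is 1.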
Subgraph Matching Kernels for Attributed Graphs
We propose graph kernels based on subgraph matchings, i.e.
structure-preserving bijections between subgraphs. While recently proposed
kernels based on common subgraphs (Wale et al., 2008; Shervashidze et al.,
2009) cannot in general be applied to attributed graphs, our approach allows
mappings of subgraphs to be rated by a flexible scoring scheme comparing vertex and
edge attributes by kernels. We show that subgraph matching kernels generalize
several known kernels. To compute the kernel we propose a graph-theoretical
algorithm inspired by a classical relation between common subgraphs of two
graphs and cliques in their product graph observed by Levi (1973). Encouraging
experimental results on a classification task of real-world graphs are
presented.
Comment: Appears in Proceedings of the 29th International Conference on
Machine Learning (ICML 2012
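The relation between common subgraphs and cliques in a product graph, which this abstract attributes to Levi (1973), can be sketched as follows. This is a simplified illustration and not the authors' algorithm: it assumes discrete vertex labels compared by exact equality (the paper instead scores attributes with kernels), uses the modular product, and counts cliques by brute-force enumeration. The names `modular_product` and `count_cliques` and the two toy graphs are hypothetical.

```python
from itertools import combinations

def modular_product(g1, g2):
    """Build the modular product of two labelled graphs.

    Each graph is (labels, edges): labels maps vertex -> label,
    edges is a set of frozensets {u, v}. Returns an adjacency dict
    over vertex pairs (u, v) whose labels agree.
    """
    labels1, edges1 = g1
    labels2, edges2 = g2
    nodes = [(u, v) for u in labels1 for v in labels2
             if labels1[u] == labels2[v]]
    adj = {p: set() for p in nodes}
    for (u1, v1), (u2, v2) in combinations(nodes, 2):
        if u1 == u2 or v1 == v2:
            continue  # a bijection cannot reuse a vertex
        in1 = frozenset((u1, u2)) in edges1
        in2 = frozenset((v1, v2)) in edges2
        if in1 == in2:  # both adjacent or both non-adjacent
            adj[(u1, v1)].add((u2, v2))
            adj[(u2, v2)].add((u1, v1))
    return adj

def count_cliques(adj):
    """Count all non-empty cliques; each one is a subgraph matching."""
    def extend(candidates):
        total = 0
        for i, p in enumerate(candidates):
            rest = [q for q in candidates[i + 1:] if q in adj[p]]
            total += 1 + extend(rest)
        return total
    return extend(sorted(adj))

# Toy graphs: an edge C-O and a path C-O-C (illustrative only).
g1 = ({0: "C", 1: "O"}, {frozenset((0, 1))})
g2 = ({0: "C", 1: "O", 2: "C"}, {frozenset((0, 1)), frozenset((1, 2))})
print(count_cliques(modular_product(g1, g2)))  # 5
```

The five cliques are three single-vertex matchings and two edge matchings. Exhaustive enumeration is exponential in general, so this is only practical for very small graphs.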
Propagation Kernels
We introduce propagation kernels, a general graph-kernel framework for
efficiently measuring the similarity of structured data. Propagation kernels
are based on monitoring how information spreads through a set of given graphs.
They leverage early-stage distributions from propagation schemes such as random
walks to capture structural information encoded in node labels, attributes, and
edge information. This has two benefits. First, off-the-shelf propagation
schemes can be used to naturally construct kernels for many graph types,
including labeled, partially labeled, unlabeled, directed, and attributed
graphs. Second, by leveraging existing efficient and informative propagation
schemes, propagation kernels can be considerably faster than state-of-the-art
approaches without sacrificing predictive performance. We will also show that
if the graphs at hand have a regular structure, for instance when modeling
image or video data, one can exploit this regularity to scale the kernel
computation to large databases of graphs with thousands of nodes. We support
our contributions by exhaustive experiments on a number of real-world graphs
from a variety of application domains.
Graph Classification with 2D Convolutional Neural Networks
Graph learning is currently dominated by graph kernels, which, while
powerful, suffer from some significant limitations. Convolutional Neural Networks
(CNNs) offer a very appealing alternative, but processing graphs with CNNs is
not trivial. To address this challenge, many sophisticated extensions of CNNs
have recently been introduced. In this paper, we reverse the problem: rather
than proposing yet another graph CNN model, we introduce a novel way to
represent graphs as multi-channel image-like structures that allows them to be
handled by vanilla 2D CNNs. Experiments reveal that our method is more accurate
than state-of-the-art graph kernels and graph CNNs on 4 out of 6 real-world
datasets (with and without continuous node attributes), and close elsewhere.
Our approach is also preferable to graph kernels in terms of time complexity.
Code and data are publicly available.
Comment: Published at ICANN 201
Inductive queries for a drug designing robot scientist
It is increasingly clear that machine learning algorithms need to be integrated in an iterative scientific discovery loop, in which data is queried repeatedly by means of inductive queries and where the computer provides guidance to the experiments that are being performed. In this chapter, we summarise several key challenges in achieving this integration of machine learning and data mining algorithms in methods for the discovery of Quantitative Structure Activity Relationships (QSARs). We introduce the concept of a robot scientist, in which all steps of the discovery process are automated; we discuss the representation of molecular data such that knowledge discovery tools can analyse it, and we discuss the adaptation of machine learning and data mining algorithms to guide QSAR experiments