A simple yet effective baseline for non-attributed graph classification
Graphs are complex objects that do not lend themselves easily to typical
learning tasks. Recently, a range of approaches based on graph kernels or graph
neural networks have been developed for graph classification and for
representation learning on graphs in general. As the developed methodologies
become more sophisticated, it is important to understand which components of
the increasingly complex methods are necessary or most effective.
As a first step, we develop a simple yet meaningful graph representation, and
explore its effectiveness in graph classification. We test our baseline
representation for the graph classification task on a range of graph datasets.
Interestingly, this simple representation achieves performance comparable to
that of state-of-the-art graph kernels and graph neural networks for non-attributed
graph classification. Its performance on classifying attributed graphs is
slightly weaker as it does not incorporate attributes. However, given its
simplicity and efficiency, we believe that it still serves as an effective
baseline for attributed graph classification. Our graph representation is
efficient (linear-time) to compute. We also provide a simple connection with
the graph neural networks.
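The abstract does not spell out the representation itself; one plausible linear-time summary of this kind is a normalized histogram of node degrees. The sketch below, with a hypothetical `degree_histogram` helper and hand-picked bin settings, is an illustration of such a baseline, not the paper's exact method:

```python
import numpy as np

def degree_histogram(adj, num_bins=10, max_degree=20):
    """Summarize a graph by a normalized histogram of its node degrees.

    Runs in time linear in the number of edges (degree computation)
    plus the number of nodes (binning).
    """
    degrees = adj.sum(axis=1)
    hist, _ = np.histogram(degrees, bins=num_bins, range=(0, max_degree))
    return hist / max(len(degrees), 1)

# A 4-node path graph: node degrees are [1, 2, 2, 1].
path = np.array([[0, 1, 0, 0],
                 [1, 0, 1, 0],
                 [0, 1, 0, 1],
                 [0, 0, 1, 0]])
features = degree_histogram(path)
```

The resulting fixed-length vector can be fed to any standard classifier (e.g., a linear SVM), which is what makes such a summary usable as a graph-classification baseline.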
Note that these observations are only for the task of graph classification
while existing methods are often designed for a broader scope including node
embedding and link prediction. The results are also likely biased due to the
limited amount of benchmark datasets available. Nevertheless, the good
performance of our simple baseline calls for the development of new, more
comprehensive benchmark datasets so as to better evaluate and analyze different
graph learning methods. Furthermore, given the computational efficiency of our
graph summary, we believe that it is a good candidate as a baseline method for
future graph classification (or even other graph learning) studies.
Comment: 13 pages. Shorter version appears at 2019 ICLR Workshop:
Representation Learning on Graphs and Manifolds. arXiv admin note: text
overlap with arXiv:1810.00826 by other authors
A Survey on Graph Kernels
Graph kernels have become an established and widely-used technique for
solving classification tasks on graphs. This survey gives a comprehensive
overview of techniques for kernel-based graph classification developed in the
past 15 years. We describe and categorize graph kernels based on properties
inherent to their design, such as the nature of their extracted graph features,
their method of computation and their applicability to problems in practice. In
an extensive experimental evaluation, we study the classification accuracy of a
large suite of graph kernels on established benchmarks as well as new datasets.
We compare the performance of popular kernels with several baseline methods and
study the effect of applying a Gaussian RBF kernel to the metric induced by a
graph kernel. In doing so, we find that simple baselines become competitive
after this transformation on some datasets. Moreover, we study the extent to
which existing graph kernels agree in their predictions (and prediction errors)
and obtain a data-driven categorization of kernels as a result. Finally, based on
our experimental results, we derive a practitioner's guide to kernel-based
graph classification.
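The RBF-on-induced-metric transformation described above can be sketched concretely: a (positive semi-definite) graph kernel matrix K induces squared distances d(i,j)² = K_ii + K_jj − 2·K_ij, to which a Gaussian RBF is then applied. The `rbf_from_kernel` helper and the toy kernel matrix below are illustrative assumptions, not code from the survey:

```python
import numpy as np

def rbf_from_kernel(K, gamma=1.0):
    """Apply a Gaussian RBF to the metric induced by a kernel matrix.

    The kernel-induced squared distance is d(i,j)^2 = K_ii + K_jj - 2*K_ij;
    the result is exp(-gamma * d(i,j)^2).
    """
    diag = np.diag(K)
    d2 = diag[:, None] + diag[None, :] - 2.0 * K
    d2 = np.maximum(d2, 0.0)  # guard against tiny negative values from rounding
    return np.exp(-gamma * d2)

# Toy positive semi-definite kernel matrix for three graphs.
K = np.array([[2.0, 1.0, 0.5],
              [1.0, 2.0, 0.8],
              [0.5, 0.8, 2.0]])
K_rbf = rbf_from_kernel(K, gamma=0.5)
```

The transformed matrix is again a valid kernel, so it can be plugged directly into a kernel SVM; its diagonal is identically 1 regardless of the original kernel's scale.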
Reconstructing Kernel-based Machine Learning Force Fields with Super-linear Convergence
Kernel machines have sustained continuous progress in the field of quantum
chemistry. In particular, they have proven to be successful in the low-data
regime of force field reconstruction. This is because many physical invariances
and symmetries can be incorporated into the kernel function to compensate for
much larger datasets. So far, the scalability of this approach has, however, been
hindered by its cubic runtime in the number of training points. While it is
known that iterative Krylov subspace solvers can overcome these burdens, they
crucially rely on effective preconditioners, which are elusive in practice.
Practical preconditioners need to be computationally efficient and numerically
robust at the same time. Here, we consider the broad class of Nystr\"om-type
methods to construct preconditioners based on successively more sophisticated
low-rank approximations of the original kernel matrix, each of which provides a
different set of computational trade-offs. All considered methods estimate the
relevant subspace spanned by the kernel matrix columns using different
strategies to identify a representative set of inducing points. Our
comprehensive study covers the full spectrum of approaches, starting from naive
random sampling to leverage score estimates and incomplete Cholesky
factorizations, up to exact SVD decompositions.
Comment: 18 pages, 12 figures, preprint
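A minimal sketch of the Nyström construction using naive random sampling of inducing points (the simplest strategy in the spectrum above): given a kernel matrix K and a set of inducing indices, the low-rank approximation is K ≈ C · W⁺ · Cᵀ, where C collects the selected columns and W the corresponding principal submatrix. The helper name and toy data are assumptions for illustration:

```python
import numpy as np

def nystrom_approx(K, inducing_idx):
    """Nystrom low-rank approximation K ~ C @ pinv(W) @ C.T,
    with C = K[:, idx] and W = K[idx][:, idx]."""
    C = K[:, inducing_idx]
    W = K[np.ix_(inducing_idx, inducing_idx)]
    return C @ np.linalg.pinv(W) @ C.T

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
# Dense RBF kernel matrix over 50 toy points.
K = np.exp(-0.5 * np.sum((X[:, None] - X[None, :]) ** 2, axis=-1))
idx = rng.choice(50, size=15, replace=False)  # naive random inducing points
K_hat = nystrom_approx(K, idx)
rel_err = np.linalg.norm(K - K_hat) / np.linalg.norm(K)
```

Because K is positive semi-definite, the approximation reproduces the inducing block exactly, and the rank-15 factorization (rather than the full 50x50 matrix) is what makes such a matrix cheap to apply as a preconditioner inside a Krylov solver.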
The prospects of quantum computing in computational molecular biology
Quantum computers can in principle solve certain problems exponentially more
quickly than their classical counterparts. We have not yet reached the advent
of useful quantum computation, but when we do, it will affect nearly all
scientific disciplines. In this review, we examine how current quantum
algorithms could revolutionize computational biology and bioinformatics. There
are potential benefits across the entire field, from the ability to process
vast amounts of information and run machine learning algorithms far more
efficiently, to algorithms for quantum simulation that are poised to improve
computational calculations in drug discovery, to quantum algorithms for
optimization that may advance fields from protein structure prediction to
network analysis. However, these exciting prospects are susceptible to "hype",
and it is also important to recognize the caveats and challenges in this new
technology. Our aim is to introduce the promise and limitations of emerging
quantum computing technologies in the areas of computational molecular biology
and bioinformatics.
Comment: 23 pages, 3 figures
Accelerating Science: A Computing Research Agenda
The emergence of "big data" offers unprecedented opportunities for not only
accelerating scientific advances but also enabling new modes of discovery.
Scientific progress in many disciplines is increasingly enabled by our ability
to examine natural phenomena through the computational lens, i.e., using
algorithmic or information processing abstractions of the underlying processes;
and our ability to acquire, share, integrate and analyze disparate types of
data. However, there is a huge gap between our ability to acquire, store, and
process data and our ability to make effective use of the data to advance
discovery. Despite successful automation of routine aspects of data management
and analytics, most elements of the scientific process currently require
considerable human expertise and effort. Accelerating science to keep pace with
the rate of data acquisition and data processing calls for the development of
algorithmic or information processing abstractions, coupled with formal methods
and tools for modeling and simulation of natural processes as well as major
innovations in cognitive tools for scientists, i.e., computational tools that
leverage and extend the reach of human intellect, and partner with humans on a
broad range of tasks in scientific discovery (e.g., identifying, prioritizing,
and formulating questions; designing, prioritizing, and executing experiments
designed to answer a chosen question; drawing inferences and evaluating the
results; and formulating new questions, in a closed-loop fashion). This calls
for a concerted research agenda aimed at: development, analysis, integration,
sharing, and simulation of algorithmic or information processing abstractions
of natural processes, coupled with formal methods and tools for their analyses
and simulation; Innovations in cognitive tools that augment and extend human
intellect and partner with humans in all aspects of science.
Comment: Computing Community Consortium (CCC) white paper, 17 pages