A Survey on Graph Kernels
Graph kernels have become an established and widely-used technique for
solving classification tasks on graphs. This survey gives a comprehensive
overview of techniques for kernel-based graph classification developed in the
past 15 years. We describe and categorize graph kernels based on properties
inherent to their design, such as the nature of their extracted graph features,
their method of computation and their applicability to problems in practice. In
an extensive experimental evaluation, we study the classification accuracy of a
large suite of graph kernels on established benchmarks as well as new datasets.
We compare the performance of popular kernels with several baseline methods and
study the effect of applying a Gaussian RBF kernel to the metric induced by a
graph kernel. In doing so, we find that simple baselines become competitive
after this transformation on some datasets. Moreover, we study the extent to
which existing graph kernels agree in their predictions (and prediction errors)
and obtain a data-driven categorization of kernels as a result. Finally, based on
our experimental results, we derive a practitioner's guide to kernel-based
graph classification.
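The RBF-on-metric transformation studied in this survey can be sketched in a few lines: a precomputed graph kernel matrix K induces squared distances d(i,j)^2 = K(i,i) + K(j,j) - 2K(i,j), and a Gaussian RBF kernel is then taken over those distances. The snippet below is a minimal illustration of that idea, assuming a precomputed kernel matrix and scikit-learn's precomputed-kernel SVM; it is not the survey's own code.

```python
import numpy as np
from sklearn.svm import SVC

def rbf_on_graph_kernel(K, gamma=1.0):
    """Apply a Gaussian RBF kernel to the metric induced by a graph kernel.

    K is an n x n precomputed (positive semi-definite) graph kernel matrix.
    The induced squared distance is d(i,j)^2 = K(i,i) + K(j,j) - 2 K(i,j).
    """
    diag = np.diag(K)
    d2 = diag[:, None] + diag[None, :] - 2.0 * K  # squared kernel-induced distances
    return np.exp(-gamma * d2)

# Usage with a precomputed training kernel K_train and labels y_train (illustrative):
# K_rbf = rbf_on_graph_kernel(K_train, gamma=0.1)
# clf = SVC(kernel="precomputed").fit(K_rbf, y_train)
```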
Kernel methods in genomics and computational biology
Support vector machines and kernel methods are increasingly popular in
genomics and computational biology, due to their good performance in real-world
applications and strong modularity that makes them suitable to a wide range of
problems, from the classification of tumors to the automatic annotation of
proteins. Their ability to work in high dimensions, to process non-vectorial
data, and the natural framework they provide to integrate heterogeneous data
are particularly relevant to various problems arising in computational biology.
In this chapter we survey some of the most prominent applications published so
far, highlighting the particular developments in kernel methods triggered by
problems in biology, and mention a few promising research directions likely to
expand in the future.
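One simple form of the heterogeneous-data integration mentioned above is to compute one kernel per data source and combine them by a weighted sum, which again yields a valid kernel that a standard SVM can use. The sketch below illustrates that generic pattern; the data-source names and kernel choices are hypothetical and do not correspond to any specific method surveyed in the chapter.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics.pairwise import rbf_kernel, linear_kernel

def combined_kernel(X_expr, X_seq, weights=(0.5, 0.5)):
    """Weighted sum of per-source kernels (hypothetical data sources).

    X_expr : expression-profile features, shape (n_samples, p1)
    X_seq  : sequence-derived features,   shape (n_samples, p2)
    A non-negative combination of valid kernels is again a valid kernel.
    """
    K1 = rbf_kernel(X_expr)     # kernel on expression data
    K2 = linear_kernel(X_seq)   # kernel on sequence features
    return weights[0] * K1 + weights[1] * K2

# Illustrative usage with labels y:
# clf = SVC(kernel="precomputed").fit(combined_kernel(X_expr, X_seq), y)
```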
Applying Deep Learning to Fast Radio Burst Classification
Upcoming Fast Radio Burst (FRB) surveys will search ~10^3 beams on
sky with very high duty cycle, generating large numbers of single-pulse
candidates. The abundance of false positives presents an intractable problem if
candidates are to be inspected by eye, making it a good application for
artificial intelligence (AI). We apply deep learning to single pulse
classification and develop a hierarchical framework for ranking events by their
probability of being true astrophysical transients. We construct a tree-like
deep neural network (DNN) that takes multiple or individual data products as
input (e.g. dynamic spectra and multi-beam detection information) and trains on
them simultaneously. We have built training and test sets using false-positive
triggers from real telescopes, along with simulated FRBs, and single pulses
from pulsars. Training of the DNN was independently done for two radio
telescopes: the CHIME Pathfinder, and Apertif on Westerbork. High accuracy and
recall can be achieved with a labelled training set of a few thousand events.
Even with high triggering rates, classification can be done very quickly on
Graphical Processing Units (GPUs). That speed is essential for selective
voltage dumps or issuing real-time VOEvents. Next, we investigate whether
dedispersion back-ends could be completely replaced by a real-time DNN
classifier. It is shown that a single forward propagation through a moderate
convolutional network could be faster than brute-force dedispersion; but the
low signal-to-noise per pixel makes such a classifier sub-optimal for this
problem. Real-time automated classification may prove useful for bright,
unexpected signals, both now and in the coming era of radio astronomy, when data
volumes and searchable parameter spaces will further outgrow our ability to
manually inspect the data, as in the cases of the SKA and ngVLA.
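As a rough illustration of a tree-like multi-input network of the kind described, the sketch below merges a convolutional branch over a dynamic spectrum with a dense branch over a multi-beam detection vector and trains them jointly toward a single probability. The layer sizes, input shapes, and use of PyTorch are assumptions for illustration, not the published architecture.

```python
import torch
import torch.nn as nn

class MultiInputClassifier(nn.Module):
    """Tree-like DNN: one branch per data product, merged into a single head.

    Input shapes are illustrative assumptions: a 1 x 64 x 64 dynamic spectrum
    and an n_beams-length multi-beam detection vector.
    """
    def __init__(self, n_beams=32):
        super().__init__()
        self.spec_branch = nn.Sequential(            # CNN branch for dynamic spectra
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Flatten(),
            nn.Linear(32 * 16 * 16, 64), nn.ReLU(),
        )
        self.beam_branch = nn.Sequential(            # dense branch for multi-beam info
            nn.Linear(n_beams, 32), nn.ReLU(),
        )
        self.head = nn.Sequential(                   # merged head -> P(astrophysical)
            nn.Linear(64 + 32, 32), nn.ReLU(),
            nn.Linear(32, 1), nn.Sigmoid(),
        )

    def forward(self, spectrum, beams):
        z = torch.cat([self.spec_branch(spectrum), self.beam_branch(beams)], dim=1)
        return self.head(z)

# model = MultiInputClassifier()
# p = model(torch.randn(8, 1, 64, 64), torch.randn(8, 32))  # batch of 8 candidates
```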
Kernel methods in machine learning
We review machine learning methods employing positive definite kernels. These
methods formulate learning and estimation problems in a reproducing kernel
Hilbert space (RKHS) of functions defined on the data domain, expanded in terms
of a kernel. Working in linear spaces of functions has the benefit of
facilitating the construction and analysis of learning algorithms while at the
same time allowing large classes of functions. The latter include nonlinear
functions as well as functions defined on nonvectorial data. We cover a wide
range of methods, ranging from binary classifiers to sophisticated methods for
estimation with structured data.
Comment: Published in the Annals of Statistics (http://www.imstat.org/aos/) by the
Institute of Mathematical Statistics (http://www.imstat.org),
DOI: http://dx.doi.org/10.1214/009053607000000677.
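A minimal concrete instance of the RKHS formulation described above is kernel ridge regression: by the representer theorem the estimator takes the form f(x) = sum_i alpha_i k(x_i, x), with alpha = (K + lambda I)^{-1} y and K the Gram matrix of a positive definite kernel. The sketch below is our illustration of that recipe with a Gaussian kernel, not code from the review.

```python
import numpy as np

def gaussian_kernel(X, Y, gamma=1.0):
    """Positive definite Gaussian kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    d2 = np.sum(X**2, axis=1)[:, None] + np.sum(Y**2, axis=1)[None, :] - 2 * X @ Y.T
    return np.exp(-gamma * d2)

def kernel_ridge_fit(X, y, lam=1e-2, gamma=1.0):
    """Representer theorem: f(x) = sum_i alpha_i k(x_i, x), alpha = (K + lam I)^{-1} y."""
    K = gaussian_kernel(X, X, gamma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def kernel_ridge_predict(X_train, X_test, alpha, gamma=1.0):
    return gaussian_kernel(X_test, X_train, gamma) @ alpha

# Illustrative usage:
# X = np.random.randn(100, 3); y = np.sin(X[:, 0])
# alpha = kernel_ridge_fit(X, y)
# y_hat = kernel_ridge_predict(X, X[:5], alpha)
```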