Graph Kernels
We present a unified framework to study graph kernels, special cases of which include the random
walk (Gärtner et al., 2003; Borgwardt et al., 2005) and marginalized (Kashima et al., 2003, 2004;
Mahé et al., 2004) graph kernels. Through reduction to a Sylvester equation we improve the time
complexity of kernel computation between unlabeled graphs with n vertices from O(n^6) to O(n^3).
We find a spectral decomposition approach even more efficient when computing entire kernel matrices.
For labeled graphs we develop conjugate gradient and fixed-point methods that take O(dn^3)
time per iteration, where d is the size of the label set. By extending the necessary linear algebra to
Reproducing Kernel Hilbert Spaces (RKHS) we obtain the same result for d-dimensional edge kernels,
and O(n^4) in the infinite-dimensional case; on sparse graphs these algorithms only take O(n^2)
time per iteration in all cases. Experiments on graphs from bioinformatics and other application
domains show that these techniques can speed up computation of the kernel by an order of magnitude
or more. We also show that certain rational kernels (Cortes et al., 2002, 2003, 2004) when
specialized to graphs reduce to our random walk graph kernel. Finally, we relate our framework to
R-convolution kernels (Haussler, 1999) and provide a kernel that is close to the optimal assignment
kernel of Fröhlich et al. (2006) yet provably positive semi-definite.
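To make the fixed-point idea from the abstract above concrete, here is a minimal sketch for the unlabeled case. It is not the authors' implementation. It assumes a geometric random walk kernel of the form k = q^T (I - lam * W)^{-1} p, where W is the Kronecker product of the two adjacency matrices, and it assumes uniform starting and stopping distributions p and q. The function names and the parameters `lam`, `tol`, and `max_iter` are illustrative choices. The point of the sketch is that W is never built: applying W to a vector is the same as computing A1 X A2^T on an n1 x n2 matrix X, which costs O(n^3) per iteration.

```python
def matmul(X, Y):
    """Dense matrix product of two list-of-lists matrices."""
    cols = list(zip(*Y))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in X]


def transpose(X):
    """Transpose of a list-of-lists matrix."""
    return [list(col) for col in zip(*X)]


def random_walk_kernel(A1, A2, lam=0.1, tol=1e-12, max_iter=1000):
    """Geometric random walk kernel between two unlabeled graphs.

    Solves x = p + lam * W x by fixed-point iteration, where W is the
    Kronecker product of A1 and A2. Assumptions (not from the source):
    uniform start/stop distributions, and lam small enough that
    lam * (spectral radius of W) < 1, otherwise the iteration diverges.
    """
    n1, n2 = len(A1), len(A2)
    p = 1.0 / (n1 * n2)  # uniform starting probability (assumption)
    q = p                # uniform stopping probability (assumption)
    A2T = transpose(A2)

    # X holds the vector x reshaped row-major into an n1 x n2 matrix.
    X = [[p] * n2 for _ in range(n1)]
    for _ in range(max_iter):
        # (A1 kron A2) applied to vec(X) equals vec(A1 X A2^T).
        WX = matmul(matmul(A1, X), A2T)
        X_new = [[p + lam * WX[i][j] for j in range(n2)] for i in range(n1)]
        delta = max(abs(X_new[i][j] - X[i][j])
                    for i in range(n1) for j in range(n2))
        X = X_new
        if delta < tol:
            break
    # Kernel value is q^T x.
    return q * sum(sum(row) for row in X)


# Toy example: a triangle versus a single edge.
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
edge = [[0, 1], [1, 0]]
k = random_walk_kernel(triangle, edge, lam=0.1)
```

For this toy pair every vertex of the product graph has degree 2, so under the stated assumptions the kernel has the closed form (1/6) / (1 - 2 * lam), which is a convenient sanity check on the iteration.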
Mining Images in Biomedical Publications: Detection and Analysis of Gel Diagrams
Authors of biomedical publications use gel images to report experimental
results such as protein-protein interactions or protein expressions under
different conditions. Gel images offer a concise way to communicate such
findings, not all of which need to be explicitly discussed in the article text.
This fact together with the abundance of gel images and their shared common
patterns makes them prime candidates for automated image mining and parsing. We
introduce an approach for the detection of gel images, and present a workflow
to analyze them. We are able to detect gel segments and panels at high
accuracy, and present preliminary results for the identification of gene names
in these images. While we cannot provide a complete solution at this point, we
present evidence that this kind of image mining is feasible.
Comment: arXiv admin note: substantial text overlap with arXiv:1209.148
Attributed Graph Classification via Deep Graph Convolutional Neural Networks
From social networks to biological networks, graphs are a natural way to represent a diverse set of real-world data. This research presents the attributed graph convolutional neural network with a pooling layer (AGCP for short), a novel end-to-end deep neural network model that captures the higher-order latent attributes of weighted, labeled, undirected, attributed graphs of arbitrary size. The architecture of AGCP is an efficient variant of the convolutional neural network (CNN) and has a linear filter function that convolves over the fixed topological structure of a graph to learn local and global attributes of the graph. Convolution is followed by a pooling layer that coarsens the graph while preserving the global structure of the original input graph using information gain. Meanwhile, advances in high-throughput technologies for next-generation sequencing have enabled machine learning research to acquire and extract knowledge from biological networks. We apply AGCP to three bioinformatics networks: ENZYMES, D&D, and GINA, a graph dataset of gene interaction networks with genomic mutation attributes as the attributes of the vertices. In several experiments on these datasets, we demonstrate that AGCP yields considerably better classification accuracy than previously proposed models.
Graph Representation Learning in Biomedicine
Biomedical networks are universal descriptors of systems of interacting
elements, from protein interactions to disease networks, all the way to
healthcare systems and scientific knowledge. With the remarkable success of
representation learning in providing powerful predictions and insights, we have
witnessed a rapid expansion of representation learning techniques into
modeling, analyzing, and learning with such networks. In this review, we put
forward an observation that long-standing principles of networks in biology and
medicine -- while often unspoken in machine learning research -- can provide
the conceptual grounding for representation learning, explain its current
successes and limitations, and inform future advances. We synthesize a spectrum
of algorithmic approaches that, at their core, leverage graph topology to embed
networks into compact vector spaces, and capture the breadth of ways in which
representation learning is proving useful. Areas of profound impact include
identifying variants underlying complex traits, disentangling behaviors of
single cells and their effects on health, assisting in diagnosis and treatment
of patients, and developing safe and effective medicines.