Self-Supervised Graph Representation Learning via Global Context Prediction
To take full advantage of fast-growing unlabeled networked data, this paper
introduces a novel self-supervised strategy for graph representation learning
by exploiting natural supervision provided by the data itself. Inspired by
human social behavior, we assume that the global context of each node is
composed of all nodes in the graph, since two arbitrary entities in a connected
network could interact with each other via paths of varying length. Based on
this, we investigate whether the global context can be a source of free and
effective supervisory signals for learning useful node representations.
Specifically, we randomly select pairs of nodes in a graph and train a
well-designed neural net to predict the contextual position of one node
relative to the other. Our underlying hypothesis is that the representations
learned from such within-graph context would capture the global topology of the
graph and finely characterize the similarity and differentiation between nodes,
which is conducive to various downstream learning tasks. Extensive benchmark
experiments including node classification, clustering, and link prediction
demonstrate that our approach outperforms many state-of-the-art unsupervised
methods and sometimes even exceeds the performance of supervised counterparts.
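To make the pair-sampling objective concrete, here is a minimal Python sketch, assuming shortest-path-distance buckets stand in for the paper's notion of "contextual position"; the graph, features, and bucket edges are illustrative placeholders, not the authors' code.

```python
# Sample node pairs, label each pair by a coarse bucket of its shortest-path
# distance (a proxy for contextual position), and train a small classifier
# on concatenated node features. All names here are illustrative.
import random
import networkx as nx
import torch
import torch.nn as nn

G = nx.karate_club_graph()                      # toy connected graph
feats = torch.eye(G.number_of_nodes())          # one-hot node features

def distance_bucket(d, edges=(1, 2, 4)):
    """Map a shortest-path distance to a coarse context class."""
    for c, e in enumerate(edges):
        if d <= e:
            return c
    return len(edges)

pairs, labels = [], []
nodes = list(G.nodes())
for _ in range(2000):
    u, v = random.sample(nodes, 2)
    pairs.append((u, v))
    labels.append(distance_bucket(nx.shortest_path_length(G, u, v)))

X = torch.stack([torch.cat([feats[u], feats[v]]) for u, v in pairs])
y = torch.tensor(labels)

model = nn.Sequential(nn.Linear(X.shape[1], 64), nn.ReLU(), nn.Linear(64, 4))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for _ in range(100):
    opt.zero_grad()
    loss = nn.functional.cross_entropy(model(X), y)
    loss.backward()
    opt.step()
```

The first linear layer of such a network can then serve as the learned node representation for downstream tasks.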
Self-Supervised Learning of Contextual Embeddings for Link Prediction in Heterogeneous Networks
Representation learning methods for heterogeneous networks produce a
low-dimensional vector embedding for each node that is typically fixed for all
tasks involving the node. Many of the existing methods focus on obtaining a
static vector representation for a node in a way that is agnostic to the
downstream application where it is being used. In practice, however, downstream
tasks such as link prediction require specific contextual information that can
be extracted from the subgraphs related to the nodes provided as input to the
task. To tackle this challenge, we develop SLiCE, a framework bridging static
representation learning methods using global information from the entire graph
with localized attention-driven mechanisms to learn contextual node
representations. We first pre-train our model in a self-supervised manner by
introducing higher-order semantic associations and masking nodes, and then
fine-tune our model for a specific link prediction task. Instead of training
node representations by aggregating information from all semantic neighbors
connected via metapaths, we automatically learn the composition of different
metapaths that characterize the context for a specific task without the need
for any pre-defined metapaths. SLiCE significantly outperforms both static and
contextual embedding learning methods on several publicly available benchmark
network datasets. We also interpret the semantic association matrix and demonstrate its utility and relevance in making successful link predictions between heterogeneous nodes in the network.
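The masked-node pre-training step can be sketched in a few lines, assuming random node sequences stand in for SLiCE's metapath-derived context subgraphs; the architecture and all hyperparameters below are simplifications for illustration only.

```python
# Take short context sequences, mask one node in each, and train an
# attention encoder to recover it. SLiCE's semantic-association machinery
# is richer; names below are illustrative.
import torch
import torch.nn as nn

NUM_NODES, DIM, WALK_LEN, MASK_ID = 100, 64, 6, 100   # extra id for [MASK]
embed = nn.Embedding(NUM_NODES + 1, DIM)
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=DIM, nhead=4, batch_first=True),
    num_layers=2,
)
head = nn.Linear(DIM, NUM_NODES)
params = list(embed.parameters()) + list(encoder.parameters()) + list(head.parameters())
opt = torch.optim.Adam(params, lr=1e-3)

for _ in range(10):
    walks = torch.randint(0, NUM_NODES, (32, WALK_LEN))   # stand-in contexts
    rows = torch.arange(walks.shape[0])
    pos = torch.randint(0, WALK_LEN, (walks.shape[0],))
    targets = walks[rows, pos]
    masked = walks.clone()
    masked[rows, pos] = MASK_ID                           # mask one node per context
    h = encoder(embed(masked))                            # (B, L, DIM)
    loss = nn.functional.cross_entropy(head(h[rows, pos]), targets)
    opt.zero_grad(); loss.backward(); opt.step()
```

After pre-training, the same encoder is fine-tuned on the link prediction task, conditioning each node's representation on the context subgraph of the candidate edge.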
Machine Learning on Graphs: A Model and Comprehensive Taxonomy
There has been a surge of recent interest in learning representations for
graph-structured data. Graph representation learning methods have generally
fallen into three main categories, based on the availability of labeled data.
The first, network embedding (such as shallow graph embedding or graph
auto-encoders), focuses on learning unsupervised representations of relational
structure. The second, graph regularized neural networks, leverages graphs to
augment neural network losses with a regularization objective for
semi-supervised learning. The third, graph neural networks, aims to learn
differentiable functions over discrete topologies with arbitrary structure.
However, despite the popularity of these areas, there has been surprisingly
little work on unifying the three paradigms. Here, we aim to bridge the gap
between graph neural networks, network embedding and graph regularization
models. We propose a comprehensive taxonomy of representation learning methods
for graph-structured data, aiming to unify several disparate bodies of work.
Specifically, we propose a Graph Encoder Decoder Model (GRAPHEDM), which
generalizes popular algorithms for semi-supervised learning on graphs (e.g.
GraphSAGE, Graph Convolutional Networks, Graph Attention Networks), and
unsupervised learning of graph representations (e.g. DeepWalk, node2vec, etc.)
into a single consistent approach. To illustrate the generality of this
approach, we fit over thirty existing methods into this framework. We believe
that this unifying view both provides a solid foundation for understanding the
intuition behind these methods and enables future research in the area.
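A rough sketch of the encoder-decoder view the taxonomy builds on: an encoder maps the graph and features to embeddings, decoders produce task outputs and reconstructed similarities, and the training loss mixes a supervised term with graph and weight regularizers. The weighting names (alpha, beta, gamma) follow the paper's high-level description; the concrete modules are placeholders, not the paper's implementation.

```python
import torch
import torch.nn as nn

class GraphEDM(nn.Module):
    def __init__(self, in_dim, emb_dim, num_classes):
        super().__init__()
        self.encoder = nn.Linear(in_dim, emb_dim)       # placeholder encoder
        self.classifier = nn.Linear(emb_dim, num_classes)

    def forward(self, X):
        Z = self.encoder(X)
        return Z, self.classifier(Z)

def graphedm_loss(model, X, A, y, alpha=1.0, beta=0.1, gamma=1e-4):
    Z, logits = model(X)
    sup = nn.functional.cross_entropy(logits, y)        # supervised term
    recon = ((Z @ Z.t() - A) ** 2).mean()               # graph regularization
    weights = sum(p.pow(2).sum() for p in model.parameters())
    return alpha * sup + beta * recon + gamma * weights

# toy usage with random data
X, A = torch.randn(10, 8), (torch.rand(10, 10) > 0.7).float()
y = torch.randint(0, 3, (10,))
total = graphedm_loss(GraphEDM(8, 16, 3), X, A, y)
```

Setting individual weights to zero recovers the different families: beta-only training resembles unsupervised network embedding, while alpha-dominated training resembles supervised GNNs.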
Strategies for Pre-training Graph Neural Networks
Many applications of machine learning require a model to make accurate
predictions on test examples that are distributionally different from training
ones, while task-specific labels are scarce during training. An effective
approach to this challenge is to pre-train a model on related tasks where data
is abundant, and then fine-tune it on a downstream task of interest. While
pre-training has been effective in many language and vision domains, it remains
an open question how to effectively use pre-training on graph datasets. In this
paper, we develop a new strategy and self-supervised methods for pre-training
Graph Neural Networks (GNNs). The key to the success of our strategy is to
pre-train an expressive GNN at the level of individual nodes as well as entire
graphs so that the GNN can learn useful local and global representations
simultaneously. We systematically study pre-training on multiple graph
classification datasets. We find that naive strategies, which pre-train GNNs at
the level of either entire graphs or individual nodes, give limited improvement
and can even lead to negative transfer on many downstream tasks. In contrast,
our strategy avoids negative transfer and improves generalization significantly
across downstream tasks, leading to up to 9.4% absolute improvements in ROC-AUC over non-pre-trained models and achieving state-of-the-art performance for molecular property prediction and protein function prediction. Comment: Accepted as a spotlight at ICLR 2020.
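A condensed sketch of the two-level idea: a node-level self-supervised loss (here, masked-attribute reconstruction) is combined with a graph-level supervised pre-training loss so that one shared GNN learns both local and global structure. The single mean-aggregation layer below stands in for the expressive GNNs the paper uses; all dimensions and heads are illustrative.

```python
import torch
import torch.nn as nn

class TinyGNN(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.lin = nn.Linear(dim, dim)

    def forward(self, X, A):
        deg = A.sum(1, keepdim=True).clamp(min=1)
        return torch.relu(self.lin(A @ X / deg))        # mean aggregation

dim, n_tasks = 16, 5
gnn = TinyGNN(dim)
node_head = nn.Linear(dim, dim)         # reconstruct masked attributes
graph_head = nn.Linear(dim, n_tasks)    # graph-level pre-training labels

def pretrain_loss(X, A, graph_labels, mask_frac=0.15):
    mask = torch.rand(X.shape[0]) < mask_frac
    mask[0] = True                                      # ensure one masked node
    X_masked = X.clone()
    X_masked[mask] = 0.0
    H = gnn(X_masked, A)
    node_loss = ((node_head(H[mask]) - X[mask]) ** 2).mean()
    g = H.mean(0)                                       # graph readout
    graph_loss = nn.functional.binary_cross_entropy_with_logits(
        graph_head(g), graph_labels)
    return node_loss + graph_loss

X = torch.randn(30, dim)
A = (torch.rand(30, 30) > 0.8).float()
labels = torch.randint(0, 2, (n_tasks,)).float()
loss = pretrain_loss(X, A, labels)
```

Pre-training only one of the two terms corresponds to the naive node-level or graph-level strategies the paper finds prone to negative transfer.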
Mix-and-Match Tuning for Self-Supervised Semantic Segmentation
Deep convolutional networks for semantic image segmentation typically require
large-scale labeled data, e.g. ImageNet and MS COCO, for network pre-training.
To reduce annotation efforts, self-supervised semantic segmentation has recently been proposed to pre-train a network without any human-provided labels. The key to
this new form of learning is to design a proxy task (e.g. image colorization),
from which a discriminative loss can be formulated on unlabeled data. Many
proxy tasks, however, lack the critical supervision signals that could induce
discriminative representations for the target image segmentation task. As a result, the performance of self-supervised pre-training still falls far short of supervised pre-training. In this study, we overcome this limitation by incorporating a
"mix-and-match" (M&M) tuning stage in the self-supervision pipeline. The
proposed approach is readily pluggable to many self-supervision methods and
does not use more annotated samples than the original process. Yet, it is capable of boosting the performance of the target image segmentation task to surpass its fully-supervised pre-trained counterpart. The improvement is made
possible by better harnessing the limited pixel-wise annotations in the target
dataset. Specifically, we first introduce the "mix" stage, which sparsely
samples and mixes patches from the target set to reflect rich and diverse local
patch statistics of target images. A "match" stage then forms a class-wise
connected graph, which can be used to derive a strong triplet-based
discriminative loss for fine-tuning the network. Our paradigm follows the
standard practice in existing self-supervised studies and no extra data or
label is required. With the proposed M&M approach, for the first time, a
self-supervision method can achieve comparable or even better performance
compared to its ImageNet pre-trained counterpart on both the PASCAL VOC2012 and Cityscapes datasets. Comment: To appear in AAAI 2018 as a spotlight paper. More details at the project page: http://mmlab.ie.cuhk.edu.hk/projects/M%26M
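The "match" stage's triplet objective can be illustrated as follows, assuming patch embeddings and sparse class labels are already available; the class-wise graph construction and patch sampling of M&M are simplified away, and all tensor shapes are placeholders.

```python
# Group patches by class label and apply a margin-based triplet loss that
# pulls same-class patch features together and pushes different-class
# features apart.
import torch
import torch.nn as nn

def sample_triplets(features, labels, n=128):
    """features: (N, D) patch embeddings; labels: (N,) class ids."""
    anchors, positives, negatives = [], [], []
    for _ in range(n):
        i = torch.randint(len(labels), (1,)).item()
        same = (labels == labels[i]).nonzero().flatten()
        same = same[same != i]                          # exclude the anchor
        diff = (labels != labels[i]).nonzero().flatten()
        if len(same) == 0 or len(diff) == 0:
            continue
        j = same[torch.randint(len(same), (1,))].item()
        k = diff[torch.randint(len(diff), (1,))].item()
        anchors.append(features[i])
        positives.append(features[j])
        negatives.append(features[k])
    return torch.stack(anchors), torch.stack(positives), torch.stack(negatives)

feats = torch.randn(512, 64)                 # stand-in patch features
labels = torch.randint(0, 21, (512,))        # e.g. 21 VOC classes
a, p, n = sample_triplets(feats, labels)
loss = nn.functional.triplet_margin_loss(a, p, n, margin=1.0)
```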
Semi-Supervised Learning on Graphs Based on Local Label Distributions
Most approaches that tackle the problem of node classification consider nodes to be similar if they have shared neighbors or are close to each other in the
graph. Recent methods for attributed graphs additionally take attributes of
neighboring nodes into account. We argue that the class labels of the neighbors
bear important information and considering them helps to improve classification
quality. Two nodes which are similar based on class labels in their
neighborhood do not need to be close-by in the graph and may even belong to
different connected components. In this work, we propose a novel approach for semi-supervised node classification. More precisely, we propose a new node
embedding which is based on the class labels in the local neighborhood of a
node. We show that this is a different setting from attribute-based embeddings
and thus, we propose a new method to learn label-based node embeddings which
can mirror a variety of relations between the class labels of neighboring
nodes. Our experimental evaluation demonstrates that our new methods can significantly improve the prediction quality on real-world datasets.
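A small sketch of the label-based embedding idea: represent each node by the distribution of known class labels in its k-hop neighborhood, independently of attributes. The distance weighting below is an illustrative choice; the paper's learned embeddings are more sophisticated than this raw histogram.

```python
import networkx as nx
import numpy as np

def label_distribution_embedding(G, labels, num_classes, k=2):
    """labels: dict node -> class id for the labeled subset only."""
    emb = {}
    for v in G.nodes():
        hist = np.zeros(num_classes)
        reachable = nx.single_source_shortest_path_length(G, v, cutoff=k)
        for u, d in reachable.items():
            if u != v and u in labels:
                hist[labels[u]] += 1.0 / d       # closer neighbors weigh more
        total = hist.sum()
        emb[v] = hist / total if total > 0 else hist
    return emb

G = nx.karate_club_graph()
labels = {0: 0, 33: 1, 5: 0, 30: 1}              # sparse known labels
emb = label_distribution_embedding(G, labels, num_classes=2)
```

Note that two nodes in different connected components can still receive similar embeddings if their neighborhoods carry similar label distributions, which is exactly the property the abstract argues for.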
Semi-Supervised Graph Classification: A Hierarchical Graph Perspective
Node classification and graph classification are two graph learning problems
that predict the class label of a node and the class label of a graph
respectively. A node of a graph usually represents a real-world entity, e.g., a
user in a social network, or a protein in a protein-protein interaction
network. In this work, we consider a more challenging but practically useful
setting, in which a node itself is a graph instance. This leads to a
hierarchical graph perspective which arises in many domains such as social
network, biological network and document collection. For example, in a social
network, a group of people with shared interests forms a user group, whereas a
number of user groups are interconnected via interactions or common members. We
study the node classification problem in the hierarchical graph, where a "node" is a graph instance, e.g., a user group in the above example. As labels are
usually limited in real-world data, we design two novel semi-supervised
solutions named SEmi-supervised grAph cLassification via Cautious/Active Iteration (SEAL-C/AI for short). SEAL-C/AI adopt an iterative
framework that takes turns to build or update two classifiers, one working at
the graph instance level and the other at the hierarchical graph level. To
simplify the representation of the hierarchical graph, we propose a novel
supervised, self-attentive graph embedding method called SAGE, which embeds
graph instances of arbitrary size into fixed-length vectors. Through
experiments on synthetic data and Tencent QQ group data, we demonstrate that
SEAL-C/AI not only outperform competing methods by a significant margin in
terms of accuracy/Macro-F1, but also generate meaningful interpretations of the
learned representations. Comment: 12 pages, WWW 2019.
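A schematic of the cautious-iteration loop: two learners take turns, and each round only the most confident predictions are promoted into the labeled pool before the other retrains. The classifiers below are generic placeholders; in SEAL-C the two roles are filled by a graph-instance model over SAGE embeddings and a hierarchical-graph model, which is abstracted away here.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def cautious_iteration(X, seed_idx, y, rounds=5, promote=10):
    """y holds true labels at seed_idx; other entries get pseudo-labels."""
    labeled = set(seed_idx)
    clf_a, clf_b = LogisticRegression(max_iter=200), LogisticRegression(max_iter=200)
    for _ in range(rounds):
        for clf in (clf_a, clf_b):
            idx = sorted(labeled)
            clf.fit(X[idx], y[idx])
            unlabeled = [i for i in range(len(X)) if i not in labeled]
            if not unlabeled:
                break
            proba = clf.predict_proba(X[unlabeled])
            conf = proba.max(axis=1)
            # cautiously promote only the most confident predictions
            for t in np.argsort(-conf)[:promote]:
                i = unlabeled[t]
                labeled.add(i)
                y[i] = proba[t].argmax()
    return clf_a, clf_b

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
y = (X[:, 0] > 0).astype(int)                # toy labels
y[:2] = [0, 1]                               # seed set covers both classes
clf_a, clf_b = cautious_iteration(X, [0, 1], y)
```

The "active" variant replaces confidence-based promotion with querying a human annotator for the least certain instances.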
Semi-supervised Learning with Contrastive Predicative Coding
Semi-supervised learning (SSL) provides a powerful framework for leveraging
unlabeled data when labels are limited or expensive to obtain. SSL algorithms
based on deep neural networks have recently proven successful on standard
benchmark tasks. However, many of them have thus far been either inflexible,
inefficient, or non-scalable. This paper explores the recently developed contrastive predictive coding technique to improve the discriminative power of deep learning models when a large portion of labels is absent. Two models, cpc-SSL and a class-conditional variant (ccpc-SSL), are presented. They effectively exploit
the unlabeled data by extracting shared information between different parts of
the (high-dimensional) data. The proposed approaches are inductive, and scale
well to very large datasets like ImageNet, making them good candidates in
real-world large-scale applications. Comment: 6 pages, 4 figures, conference.
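A compact sketch of the contrastive objective underlying CPC-style semi-supervised learning: given encoded "context" and "target" views of the same example, an InfoNCE loss scores the matching pair against in-batch negatives. The encoders and the patch-extraction scheme of cpc-SSL are abstracted away; the temperature and dimensions are illustrative.

```python
import torch
import torch.nn as nn

def info_nce(context, targets, temperature=0.1):
    """context, targets: (B, D) paired embeddings; row i matches row i."""
    c = nn.functional.normalize(context, dim=1)
    t = nn.functional.normalize(targets, dim=1)
    logits = c @ t.t() / temperature             # (B, B) similarity matrix
    labels = torch.arange(c.shape[0])            # positives on the diagonal
    return nn.functional.cross_entropy(logits, labels)

loss = info_nce(torch.randn(32, 128), torch.randn(32, 128))
```

In the semi-supervised setting, this unsupervised term is optimized jointly with a standard classification loss on the labeled subset.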
Machine Learning with World Knowledge: The Position and Survey
Machine learning has become pervasive in multiple domains, impacting a wide
variety of applications, such as knowledge discovery and data mining, natural
language processing, information retrieval, computer vision, social and health
informatics, ubiquitous computing, etc. Two essential problems of machine
learning are how to generate features and how to acquire labels for machines to
learn. In particular, labeling large amounts of data for each domain-specific problem can be very time-consuming and costly. This has become a key obstacle to
making learning protocols realistic in applications. In this paper, we will
discuss how to use the existing general-purpose world knowledge to enhance
machine learning processes, by enriching the features or reducing the labeling
work. We start from the comparison of world knowledge with domain-specific
knowledge, and then introduce three key problems in using world knowledge in
learning processes, i.e., explicit and implicit feature representation,
inference for knowledge linking and disambiguation, and learning with direct or
indirect supervision. Finally, we discuss future directions for this research topic.
Not-so-supervised: a survey of semi-supervised, multi-instance, and transfer learning in medical image analysis
Machine learning (ML) algorithms have made a tremendous impact in the field
of medical imaging. While medical imaging datasets have been growing in size, a
challenge for supervised ML algorithms that is frequently mentioned is the lack
of annotated data. As a result, various methods that can learn with less or other types of supervision have been proposed. We review semi-supervised, multiple-instance, and transfer learning in medical imaging, in both diagnosis/detection and segmentation tasks. We also discuss connections between these learning
scenarios, and opportunities for future research. Comment: Submitted to Medical Image Analysis.