A Flexible Generative Framework for Graph-based Semi-supervised Learning
We consider a family of problems concerned with making predictions for the
majority of unlabeled, graph-structured data samples based on a small
proportion of labeled samples. Relational information among the data samples,
often encoded in the graph/network structure, is shown to be helpful for these
semi-supervised learning tasks. However, conventional graph-based
regularization methods and recent graph neural networks do not fully leverage
the interrelations between the features, the graph, and the labels. In this
work, we propose a flexible generative framework for graph-based
semi-supervised learning, which models the joint distribution of the node
features, labels, and the graph structure. Borrowing insights from random graph
models in network science literature, this joint distribution can be
instantiated using various distribution families. For the inference of missing
labels, we exploit recent advances in scalable variational inference techniques
to approximate the Bayesian posterior. We conduct thorough experiments on
benchmark datasets for graph-based semi-supervised learning. Results show that
the proposed methods outperform the state-of-the-art models in most settings.
Comment: NeurIPS 2019
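To make the generative view above concrete, here is a minimal, hypothetical sketch (not the paper's exact model): labels are treated as latent variables, edges follow a simple label-dependent Bernoulli model p(A|Y) in which same-class nodes connect more often, and an approximate posterior q(Y|X) is fit by maximizing an ELBO-style objective. All module names, sizes, and the edge probabilities below are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

N, C, D = 100, 3, 16                          # nodes, classes, feature dim
x = torch.randn(N, D)                         # toy node features
adj = (torch.rand(N, N) < 0.05).float()       # toy graph
y_obs = torch.randint(0, C, (10,))            # labels observed for 10 nodes

encoder = torch.nn.Linear(D, C)               # q(y|x); a GNN in practice
p_in, p_out = 0.10, 0.01                      # edge prob: same / different class
opt = torch.optim.Adam(encoder.parameters(), lr=1e-2)

for step in range(200):
    logits = encoder(x)
    q = F.softmax(logits, dim=-1)             # approximate posterior over labels
    same = q @ q.t()                          # P(y_i == y_j) under q
    p_edge = same * p_in + (1 - same) * p_out # label-dependent edge model p(A|Y)
    # expected Bernoulli log-likelihood of the observed adjacency
    ll = adj * p_edge.clamp_min(1e-9).log() + \
         (1 - adj) * (1 - p_edge).clamp_min(1e-9).log()
    kl = (q * (q.clamp_min(1e-9) * C).log()).sum()   # KL to a uniform prior
    sup = F.cross_entropy(logits[:10], y_obs)        # fit the observed labels
    loss = -ll.sum() + kl + 10.0 * sup               # negative ELBO estimate
    opt.zero_grad(); loss.backward(); opt.step()
```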
Learning Discrete Structures for Graph Neural Networks
Graph neural networks (GNNs) are a popular class of machine learning models
whose major advantage is their ability to incorporate a sparse and discrete
dependency structure between data points. Unfortunately, GNNs can only be used
when such a graph-structure is available. In practice, however, real-world
graphs are often noisy and incomplete or might not be available at all. With
this work, we propose to jointly learn the graph structure and the parameters
of graph convolutional networks (GCNs) by approximately solving a bilevel
program that learns a discrete probability distribution on the edges of the
graph. This allows one to apply GCNs not only in scenarios where the given
graph is incomplete or corrupted but also in those where a graph is not
available. We conduct a series of experiments that analyze the behavior of the
proposed method and demonstrate that it outperforms related methods by a
significant margin.
Comment: ICML 2019; code at https://github.com/lucfra/LDS
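As a rough illustration of the bilevel idea (a simplified alternating relaxation, not the LDS algorithm itself): each candidate edge gets a Bernoulli parameter, a relaxed adjacency matrix is sampled with the concrete/Gumbel trick, and the GCN weights are fit on a training loss while the edge distribution is adapted on a held-out validation loss. All names, sizes, and the one-layer "GCN" are assumptions.

```python
import torch
import torch.nn.functional as F

N, D, C = 50, 8, 2
x = torch.randn(N, D)
y = torch.randint(0, C, (N,))
train, val = torch.arange(0, 30), torch.arange(30, 50)

edge_logits = torch.zeros(N, N, requires_grad=True)   # Bernoulli edge params
w = torch.nn.Linear(D, C)                             # stands in for a GCN
opt_w = torch.optim.SGD(w.parameters(), lr=0.1)
opt_g = torch.optim.SGD([edge_logits], lr=0.1)

def forward():
    # relaxed Bernoulli (concrete) sample of the adjacency matrix
    u = torch.rand(N, N).clamp(1e-6, 1 - 1e-6)
    a = torch.sigmoid((edge_logits + torch.logit(u)) / 0.5)
    a = a / a.sum(dim=1, keepdim=True).clamp_min(1e-9)  # row-normalize
    return w(a @ x)                                     # one "graph convolution"

for step in range(100):
    # inner step: fit the GCN weights on the training loss
    opt_w.zero_grad()
    F.cross_entropy(forward()[train], y[train]).backward()
    opt_w.step()
    # outer step: adapt the edge distribution on the validation loss
    opt_g.zero_grad()
    F.cross_entropy(forward()[val], y[val]).backward()
    opt_g.step()
```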
Graph Adversarial Training: Dynamically Regularizing Based on Graph Structure
Recent efforts show that neural networks are vulnerable to small but
intentional perturbations on input features in visual classification tasks. Due
to the additional consideration of connections between examples (e.g., articles
connected by a citation link tend to be in the same class), graph neural networks could
be more sensitive to the perturbations, since the perturbations from connected
examples exacerbate the impact on a target example. Adversarial Training (AT),
a dynamic regularization technique, can resist the worst-case perturbations on
input features and is a promising choice to improve model robustness and
generalization. However, existing AT methods focus on standard classification
and are less effective when training models on graphs, since they do not model
the impact from connected examples.
In this work, we explore adversarial training on graphs, aiming to improve the
robustness and generalization of models learned on graphs. We propose Graph
Adversarial Training (GraphAT), which takes the impact from connected examples
into account when learning to construct and resist perturbations. We give a
general formulation of GraphAT, which can be seen as a dynamic regularization
scheme based on the graph structure. To demonstrate the utility of GraphAT, we
employ it on a state-of-the-art graph neural network model, the Graph
Convolutional Network (GCN). We conduct experiments on two citation graphs
(Citeseer and Cora) and a knowledge graph (NELL), verifying the effectiveness
of GraphAT which outperforms normal training on GCN by 4.51% in node
classification accuracy. Code is available at
https://github.com/fulifeng/GraphAT.
Comment: Accepted by TKDE
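Below is a minimal sketch of a graph adversarial regularizer in the spirit of the abstract's description (an approximation, not the authors' exact GraphAT recipe): find the feature perturbation that most increases the divergence between a node's prediction and its neighbors' average prediction, then train the model to resist that perturbation. The model, shapes, and the single gradient step used to find the perturbation are assumptions.

```python
import torch
import torch.nn.functional as F

def graph_adv_reg(model, x, adj, eps=0.05):
    """Worst-case divergence between a node and its neighbors' predictions."""
    with torch.no_grad():
        p = F.softmax(model(x), dim=-1)
        p_neigh = (adj @ p) / adj.sum(1, keepdim=True).clamp_min(1)  # neighbor avg
    d = torch.randn_like(x).requires_grad_(True)       # trial perturbation
    kl = F.kl_div(F.log_softmax(model(x + d), dim=-1), p_neigh,
                  reduction="batchmean")
    grad, = torch.autograd.grad(kl, d)
    r_adv = eps * F.normalize(grad, dim=-1)            # worst-case direction
    return F.kl_div(F.log_softmax(model(x + r_adv), dim=-1), p_neigh,
                    reduction="batchmean")

model = torch.nn.Linear(8, 3)                          # toy node classifier
x = torch.randn(20, 8)
adj = (torch.rand(20, 20) < 0.2).float()
y = torch.randint(0, 3, (20,))
loss = F.cross_entropy(model(x), y) + 0.5 * graph_adv_reg(model, x, adj)
loss.backward()
```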
Learning Graph Embedding with Adversarial Training Methods
Graph embedding aims to map a graph into vectors to facilitate
subsequent graph analytics tasks such as link prediction and graph clustering.
Most graph embedding approaches focus on preserving the graph structure or
minimizing the reconstruction errors for graph data. They have mostly
overlooked the embedding distribution of the latent codes, which unfortunately
may lead to inferior representations in many cases. In this paper, we present a
novel adversarially regularized framework for graph embedding. By employing the
graph convolutional network as an encoder, our framework embeds the topological
information and node content into a vector representation, from which a graph
decoder is further built to reconstruct the input graph. The adversarial
training principle is applied to enforce our latent codes to match a prior
Gaussian or Uniform distribution. Based on this framework, we derive two
variants of adversarial models, the adversarially regularized graph autoencoder
(ARGA) and its variational version, adversarially regularized variational graph
autoencoder (ARVGA), to learn the graph embedding effectively. We also explore
other potential variations of ARGA and ARVGA to gain a deeper understanding of
our designs. Experimental comparisons against twelve algorithms for link
prediction and twenty algorithms for graph clustering validate our solutions.
Comment: To appear in IEEE Transactions on Cybernetics. arXiv admin note:
substantial text overlap with arXiv:1802.04407
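A condensed sketch of the adversarial regularization loop described above (simplified from the ARGA design; a plain linear layer stands in for the GCN encoder, and all sizes are illustrative): the encoder produces latent codes, a dot-product decoder reconstructs the adjacency, and a discriminator pushes the code distribution toward a Gaussian prior.

```python
import torch
import torch.nn.functional as F

N, D, H = 100, 16, 8
x = torch.randn(N, D)
adj = (torch.rand(N, N) < 0.05).float()              # toy graph
a_norm = adj / adj.sum(1, keepdim=True).clamp_min(1)

enc = torch.nn.Linear(D, H)                          # stands in for a GCN encoder
disc = torch.nn.Sequential(torch.nn.Linear(H, 16), torch.nn.ReLU(),
                           torch.nn.Linear(16, 1))
opt_e = torch.optim.Adam(enc.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-3)
bce = F.binary_cross_entropy_with_logits

for step in range(200):
    z = enc(a_norm @ x)                              # latent codes
    # discriminator: tell Gaussian prior samples apart from latent codes
    d_loss = bce(disc(torch.randn(N, H)), torch.ones(N, 1)) + \
             bce(disc(z.detach()), torch.zeros(N, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # encoder: reconstruct the adjacency and fool the discriminator
    recon = bce(z @ z.t(), adj)                      # dot-product edge decoder
    fool = bce(disc(z), torch.ones(N, 1))
    opt_e.zero_grad(); (recon + fool).backward(); opt_e.step()
```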
Manifold regularization in structured output space for semi-supervised structured output prediction
Structured output prediction aims to learn a predictor that maps an input data
vector to a structured output, such as a vector, a tree, or a sequence. We
usually assume that a training set of input-output pairs is available to train
the predictor. However, in many real-world applications it is difficult to
obtain the output for an input, so the structured outputs are missing for many
training input data points. In this paper, we discuss how to learn from a
training set composed of some input-output pairs and some input data points
without outputs. This problem is called semi-supervised structured output
prediction. We propose a novel method for this problem that constructs a
nearest neighbor graph over the input space to represent the manifold
structure, and uses it to regularize the structured output space directly. We
define a slack structured output for each training data point and propose to
predict it by learning a structured output predictor. The learning of the slack
structured outputs and of the predictor is unified within one single
minimization problem, in which we minimize the structured loss between the
slack structured outputs of neighboring data points, together with the
prediction error measured by the structured loss. The problem is optimized by
an iterative algorithm. Experimental results over three benchmark data sets
show its advantages.
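To spell out the alternating optimization, here is an illustrative instantiation with plain squared error standing in for the structured loss and vector outputs (the paper's structured setting is more general, and all names below are assumptions): a ridge-regression predictor step alternates with a Laplacian-smoothed update of the slack outputs.

```python
import numpy as np

def semi_sup_structured(X, Y_lab, lab_idx, W, lam=1.0, iters=50):
    """X: [n,d] inputs; Y_lab: outputs of labeled rows; W: [n,n] kNN weights."""
    n, d = X.shape
    k = Y_lab.shape[1]
    Ystar = np.zeros((n, k))
    Ystar[lab_idx] = Y_lab                     # slack structured outputs
    L = np.diag(W.sum(axis=1)) - W             # graph Laplacian of the kNN graph
    for _ in range(iters):
        # predictor step: ridge regression onto the current slack outputs
        beta = np.linalg.solve(X.T @ X + 1e-3 * np.eye(d), X.T @ Ystar)
        pred = X @ beta
        # slack step: trade off manifold smoothness against prediction error,
        # i.e. minimize lam * tr(Y'LY) + ||Y - pred||^2 => (lam*L + I) Y = pred
        Ystar = np.linalg.solve(lam * L + np.eye(n), pred)
        Ystar[lab_idx] = Y_lab                 # clamp the labeled points
    return beta, Ystar

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5))
W = (rng.random((30, 30)) < 0.1).astype(float)
W = np.triu(W, 1); W = W + W.T                 # symmetric toy neighbor graph
beta, Ystar = semi_sup_structured(X, rng.normal(size=(8, 4)), np.arange(8), W)
```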
Collective Semi-Supervised Learning for User Profiling in Social Media
The abundance of user-generated data in social media has incentivized the
development of methods to infer the latent attributes of users, which are
crucially useful for personalization, advertising and recommendation. However,
the current user profiling approaches have limited success, due to the lack of
a principled way to integrate different types of social relationships of a
user, and the reliance on scarcely-available labeled data in building a
prediction model. In this paper, we present a novel solution termed Collective
Semi-Supervised Learning (CSL), which provides a principled means to integrate
different types of social relationship and unlabeled data under a unified
computational framework. The joint learning from multiple relationships and
unlabeled data yields a computationally sound and accurate approach to model
user attributes in social media. Extensive experiments using Twitter data have
demonstrated the efficacy of our CSL approach in inferring user attributes such
as account type and marital status. We also show how CSL can be used to
determine important user features and to make inferences about a larger user
population.
Generalizable Adversarial Attacks with Latent Variable Perturbation Modelling
Adversarial attacks on deep neural networks traditionally rely on a
constrained optimization paradigm, where an optimization procedure is used to
obtain a single adversarial perturbation for a given input example. In this
work we frame the problem as learning a distribution of adversarial
perturbations, enabling us to generate diverse adversarial distributions given
an unperturbed input. We show that this framework is domain-agnostic in that
the same framework can be employed to attack different input domains with
minimal modification. Across three diverse domains (images, text, and
graphs), our approach generates white-box attacks with success rates that are
competitive with or superior to existing approaches, achieving a new
state of the art in the graph domain. Finally, we demonstrate that our
framework can efficiently generate a diverse set of attacks for a single given
input, and is even capable of attacking unseen test instances in a
zero-shot manner, exhibiting attack generalization.
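A schematic sketch of learning a distribution over perturbations rather than a single one (not the paper's architecture; the victim model, generator, sizes, and the tanh bound are all assumptions): a conditional generator maps an input plus a noise code to a bounded perturbation trained to raise the victim's loss.

```python
import torch
import torch.nn.functional as F

D, C, Z = 20, 5, 8
victim = torch.nn.Linear(D, C)                     # stands in for a pretrained model
for p in victim.parameters():
    p.requires_grad_(False)                        # frozen victim
gen = torch.nn.Sequential(torch.nn.Linear(D + Z, 64), torch.nn.ReLU(),
                          torch.nn.Linear(64, D))  # perturbation generator
opt = torch.optim.Adam(gen.parameters(), lr=1e-3)
eps = 0.1

for step in range(200):
    x = torch.randn(32, D)                         # minibatch of inputs
    y = victim(x).argmax(dim=-1)                   # victim's clean predictions
    z = torch.randn(32, Z)                         # latent attack code
    delta = eps * torch.tanh(gen(torch.cat([x, z], dim=-1)))  # bounded perturbation
    loss = -F.cross_entropy(victim(x + delta), y)  # push predictions off y
    opt.zero_grad(); loss.backward(); opt.step()
# Sampling several z for one x yields a diverse set of attacks, and a trained
# generator can perturb unseen inputs in a single forward pass (zero-shot).
```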
Label Prediction Framework for Semi-Supervised Cross-Modal Retrieval
Cross-modal data matching refers to retrieving data from one modality given a
query from another modality. In general, supervised algorithms achieve better
retrieval performance than their unsupervised counterparts, as they can learn
more representative features by leveraging the available label information.
However, this comes at the cost of requiring a huge amount of labeled
examples, which may not always be available. In this work, we propose a
novel framework in a semi-supervised setting, which can predict the labels of
the unlabeled data using complementary information from different modalities.
The proposed framework can be used as an add-on to any baseline cross-modal
algorithm to give significant performance improvements, even in the case of limited
labeled data. Finally, we analyze the challenging scenario where the unlabeled
examples can even come from classes not present in the training data, and
evaluate the performance of our algorithm in this setting. Extensive evaluation using
several baseline algorithms across three different datasets shows the
effectiveness of our label prediction framework.
Comment: 12 pages, 3 tables, 2 figures, 1 algorithm flowchart
Neural Graph Machines: Learning Neural Networks Using Graphs
Label propagation is a powerful and flexible semi-supervised learning
technique on graphs. Neural networks, on the other hand, have proven track
records in many supervised learning tasks. In this work, we propose a training
framework with a graph-regularised objective, namely "Neural Graph Machines",
that can combine the power of neural networks and label propagation. This work
generalises previous literature on graph-augmented training of neural networks,
enabling it to be applied to multiple neural architectures (Feed-forward NNs,
CNNs and LSTM RNNs) and a wide range of graphs. The new objective allows the
neural networks to harness both labeled and unlabeled data by: (a) allowing the
network to train using labeled data as in the supervised setting, and (b) biasing
the network to learn similar hidden representations for neighboring nodes on a
graph, in the same vein as label propagation. Such architectures with the
proposed objective can be trained efficiently using stochastic gradient descent
and scaled to large graphs, with a runtime that is linear in the number of
edges. The proposed joint training approach convincingly outperforms many
existing methods on a wide range of tasks (multi-label classification on social
graphs, news categorization, document classification and semantic intent
classification), with multiple forms of graph inputs (including graphs with and
without node-level features) and using different types of neural networks.
Comment: 9 pages
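The graph-regularised objective can be written down compactly. Below is a minimal sketch with a single edge type and squared distance as the neighbor penalty (the paper distinguishes labeled-labeled, labeled-unlabeled, and unlabeled-unlabeled edges with separate weights); all names and sizes are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def ngm_loss(model, x, y, lab_idx, edges, weights, alpha=0.1):
    """edges: [E,2] node index pairs; weights: [E] nonnegative edge weights."""
    h = model(x)                                    # hidden representations
    sup = F.cross_entropy(h[lab_idx], y)            # supervised term (labeled)
    d = (h[edges[:, 0]] - h[edges[:, 1]]).pow(2).sum(dim=-1)
    return sup + alpha * (weights * d).sum()        # neighbor-agreement term

model = torch.nn.Linear(16, 4)                      # toy network
x = torch.randn(30, 16)
edges = torch.randint(0, 30, (100, 2))              # toy edge list
loss = ngm_loss(model, x, torch.randint(0, 4, (10,)), torch.arange(10),
                edges, torch.ones(100))
loss.backward()                                     # cost is linear in |edges|
```

Because the regularizer decomposes over edges, it can be estimated on minibatches of edges with SGD, which is what makes the runtime linear in the number of edges.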
Machine Learning with World Knowledge: The Position and Survey
Machine learning has become pervasive in multiple domains, impacting a wide
variety of applications, such as knowledge discovery and data mining, natural
language processing, information retrieval, computer vision, social and health
informatics, ubiquitous computing, etc. Two essential problems of machine
learning are how to generate features and how to acquire labels for machines to
learn. In particular, labeling a large amount of data for each domain-specific
problem can be very time-consuming and costly. It has become a key obstacle to
making learning protocols realistic in applications. In this paper, we will
discuss how to use the existing general-purpose world knowledge to enhance
machine learning processes, by enriching the features or reducing the labeling
work. We start from the comparison of world knowledge with domain-specific
knowledge, and then introduce three key problems in using world knowledge in
learning processes, i.e., explicit and implicit feature representation,
inference for knowledge linking and disambiguation, and learning with direct or
indirect supervision. Finally, we discuss future directions for this research
topic.