Multi-Stage Self-Supervised Learning for Graph Convolutional Networks on Graphs with Few Labels
Graph Convolutional Networks (GCNs) play a crucial role in graph learning
tasks; however, learning graph embeddings with few supervised signals remains a
difficult problem. In this paper, we propose a novel training algorithm for
Graph Convolutional Networks, called the Multi-Stage Self-Supervised (M3S) Training
Algorithm, combined with a self-supervised learning approach, focusing on
improving the generalization performance of GCNs on graphs with few labeled
nodes. First, a Multi-Stage Training Framework is provided as the basis of the
M3S training method. Then we leverage the DeepCluster technique, a popular form of
self-supervised learning, and design a corresponding aligning mechanism on the
embedding space to refine the Multi-Stage Training Framework, resulting in the M3S
Training Algorithm. Finally, extensive experimental results verify the superior
performance of our algorithm on graphs with few labeled nodes under different
label rates compared with other state-of-the-art approaches.
Comment: AAAI Conference on Artificial Intelligence (AAAI 2020)
Attention-based Graph Neural Network for Semi-supervised Learning
Recently popularized graph neural networks achieve the state-of-the-art
accuracy on a number of standard benchmark datasets for graph-based
semi-supervised learning, improving significantly over existing approaches.
These architectures alternate between a propagation layer that aggregates the
hidden states of the local neighborhood and a fully-connected layer. Perhaps
surprisingly, we show that a linear model that removes all the intermediate
fully-connected layers is still able to achieve performance comparable to
that of the state-of-the-art models. This significantly reduces the number of
parameters, which is critical for semi-supervised learning, where the number of
labeled examples is small. This in turn leaves room for designing more
innovative propagation layers. Based on this insight, we propose a novel graph
neural network that removes all the intermediate fully-connected layers, and
replaces the propagation layers with attention mechanisms that respect the
structure of the graph. The attention mechanism allows us to learn a dynamic
and adaptive local summary of the neighborhood to achieve more accurate
predictions. In a number of experiments on benchmark citation networks
datasets, we demonstrate that our approach outperforms competing methods. By
examining the attention weights among neighbors, we show that our model
provides some interesting insights into how neighbors influence each other.
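The structure-respecting attention aggregation described above can be sketched as a single propagation step: each node attends only to itself and its graph neighbors, with softmax weights. This is a generic sketch with a shared score vector `a`, not the paper's exact layer; all names are illustrative.

```python
import numpy as np

def attention_propagate(H, adj, a):
    # One attention-weighted propagation step over a graph: for each node i,
    # score its neighbors (and itself) with vector `a` on the concatenated
    # hidden states, softmax the scores, and aggregate.
    n = H.shape[0]
    out = np.zeros_like(H)
    for i in range(n):
        nbrs = [j for j in range(n) if j == i or adj[i][j]]
        scores = np.array([a @ np.concatenate([H[i], H[j]]) for j in nbrs])
        w = np.exp(scores - scores.max())   # stable softmax
        w = w / w.sum()
        out[i] = sum(wk * H[j] for wk, j in zip(w, nbrs))
    return out
```

Because the weights are a softmax restricted to the graph neighborhood, each output state is a convex combination of neighboring states, i.e., an adaptive local summary.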
GESF: A Universal Discriminative Mapping Mechanism for Graph Representation Learning
Graph embedding is a central problem in social network analysis and many
other applications, aiming to learn the vector representation for each node.
While most existing approaches need to specify the neighborhood and the
form of dependence on the neighborhood, which may significantly degrade the
flexibility of representation, we propose a novel graph node embedding method
(namely GESF) via the set function technique. Our method can 1) learn an
arbitrary form of representation function from neighborhood, 2) automatically
decide the significance of neighbors at different distances, and 3) be applied
to heterogeneous graph embedding, which may contain multiple types of nodes.
A theoretical guarantee for the representation capability of our method has been
proved for general homogeneous and heterogeneous graphs, and evaluation results
on benchmark data sets show that the proposed GESF outperforms the
state-of-the-art approaches in producing node vectors for classification tasks.
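The abstract does not spell out GESF's architecture, but the underlying set-function idea can be sketched with a DeepSets-style permutation-invariant aggregator, which learns an arbitrary function of a neighborhood without fixing its order or size; `W_psi` and `W_phi` are illustrative stand-ins for learned parameters.

```python
import numpy as np

def set_embed(neighbor_feats, W_psi, W_phi):
    # DeepSets-style set function: transform each neighbor independently,
    # pool with a sum (order-independent), then transform the pooled vector.
    # The output is invariant to any permutation of the neighbors.
    psi = np.maximum(neighbor_feats @ W_psi, 0.0)  # per-neighbor transform + ReLU
    pooled = psi.sum(axis=0)                       # permutation-invariant pooling
    return np.maximum(pooled @ W_phi, 0.0)
```

Sum pooling also accepts neighborhoods of any size, which is what allows a single learned function to handle arbitrary neighborhood forms.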
A Survey on Data Collection for Machine Learning: A Big Data – AI Integration Perspective
Data collection is a major bottleneck in machine learning and an active
research topic in multiple communities. There are largely two reasons data
collection has recently become a critical issue. First, as machine learning is
becoming more widely used, we are seeing new applications that do not
necessarily have enough labeled data. Second, unlike traditional machine
learning, deep learning techniques automatically generate features, which saves
feature engineering costs, but in return may require larger amounts of labeled
data. Interestingly, recent research in data collection comes not only from the
machine learning, natural language processing, and computer vision communities, but also
from the data management community due to the importance of handling large
amounts of data. In this survey, we perform a comprehensive study of data
collection from a data management point of view. Data collection largely
consists of data acquisition, data labeling, and improvement of existing data
or models. We provide a research landscape of these operations, provide
guidelines on which technique to use when, and identify interesting research
challenges. The integration of machine learning and data management for data
collection is part of a larger trend of Big data and Artificial Intelligence
(AI) integration and opens many opportunities for new research.
Semi-Supervised Learning with Competitive Infection Models
The goal in semi-supervised learning is to effectively combine labeled and
unlabeled data. One way to do this is by encouraging smoothness across edges in
a graph whose nodes correspond to input examples. In many graph-based methods,
labels can be thought of as propagating over the graph, where the underlying
propagation mechanism is based on random walks or on averaging dynamics. While
theoretically elegant, these dynamics suffer from several drawbacks which can
hurt predictive performance.
Our goal in this work is to explore alternative mechanisms for propagating
labels. In particular, we propose a method based on dynamic infection
processes, where unlabeled nodes can be "infected" with the label of their
already infected neighbors. Our algorithm is efficient and scalable, and an
analysis of the underlying optimization objective reveals a surprising relation
to other Laplacian approaches. We conclude with a thorough set of experiments
across multiple benchmarks and various learning settings.
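The infection dynamic can be sketched as a breadth-first process in which seed labels spread to unlabeled neighbors; this toy version omits the competitive and stochastic aspects of the full model, and the function name is illustrative.

```python
from collections import deque

def infection_propagate(adj, seeds):
    # Toy infection-style label propagation: labeled seed nodes "infect"
    # unlabeled neighbors breadth-first, and each node keeps the label of
    # the first infection that reaches it.
    labels = dict(seeds)            # seeds: {node: label}
    queue = deque(seeds.items())
    while queue:
        u, lab = queue.popleft()
        for v, connected in enumerate(adj[u]):
            if connected and v not in labels:
                labels[v] = lab
                queue.append((v, lab))
    return labels
```

On a path graph with seeds at both ends, each interior node takes the label of the nearer seed, with ties broken by arrival order.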
Robust Graph Data Learning via Latent Graph Convolutional Representation
Graph Convolutional Representation (GCR) has achieved impressive performance
for graph data representation. However, existing GCR is generally defined on
a fixed input graph, which may restrict the representation capacity and also
be vulnerable to structural attacks and noise. To address this issue, we
propose a novel Latent Graph Convolutional Representation (LatGCR) for robust
graph data representation and learning. Our LatGCR is derived based on
reformulating graph convolutional representation from the aspect of graph
neighborhood reconstruction. Given an input graph, LatGCR aims to
generate a flexible latent graph for graph
convolutional representation, which enhances the representation
capacity and performs robustly w.r.t. graph structural attacks and noise.
Moreover, LatGCR is implemented in a self-supervised manner and thus provides a
basic block for both supervised and unsupervised graph learning tasks.
Experiments on several datasets demonstrate the effectiveness and robustness of
LatGCR.
GrAMME: Semi-Supervised Learning using Multi-layered Graph Attention Models
Modern data analysis pipelines are becoming increasingly complex due to the
presence of multi-view information sources. While graphs are effective in
modeling complex relationships, in many scenarios a single graph is rarely
sufficient to succinctly represent all interactions, and hence multi-layered
graphs have become popular. Though this leads to richer representations,
extending solutions from the single-graph case is not straightforward.
Consequently, there is a strong need for novel solutions to solve classical
problems, such as node classification, in the multi-layered case. In this
paper, we consider the problem of semi-supervised learning with multi-layered
graphs. Though deep network embeddings, e.g. DeepWalk, are widely adopted for
community discovery, we argue that feature learning with random node
attributes, using graph neural networks, can be more effective. To this end, we
propose to use attention models for effective feature learning, and develop two
novel architectures, GrAMME-SG and GrAMME-Fusion, that exploit the inter-layer
dependencies for building multi-layered graph embeddings. Using empirical
studies on several benchmark datasets, we evaluate the proposed approaches and
demonstrate significant performance improvements in comparison to
state-of-the-art network embedding strategies. The results also show that using
simple random features is an effective choice, even in cases where explicit
node attributes are not available.
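The random-feature observation can be illustrated with a small sketch: draw random node features and smooth them over the graph, so that structurally close nodes end up with similar representations. This is a generic illustration of the idea, not the GrAMME architecture, and the names are illustrative.

```python
import numpy as np

def propagate_random_features(adj, dim=8, steps=2, seed=0):
    # When nodes carry no attributes: draw random features and smooth them
    # with a few row-normalized mean-aggregation steps over the graph, so
    # the resulting features reflect graph structure.
    rng = np.random.default_rng(seed)
    n = len(adj)
    H = rng.standard_normal((n, dim))
    A = np.array(adj, dtype=float) + np.eye(n)  # add self-loops
    A = A / A.sum(axis=1, keepdims=True)        # row-stochastic averaging
    for _ in range(steps):
        H = A @ H
    return H
```

With many smoothing steps on a connected graph, all rows converge to the same stationary average; a small number of steps keeps the features distinct while encoding locality.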
Relation Extraction: A Survey
With the advent of the Internet, a large amount of digital text is generated
every day in the form of news articles, research publications, blogs, question
answering forums and social media. It is important to develop techniques for
extracting information automatically from these documents, as much important
information is hidden within them. This extracted information can be used to
improve access and management of knowledge hidden in large text corpora.
Several applications, such as Question Answering and Information Retrieval, would
benefit from this information. Entities like persons and organizations form
the most basic units of the information. Occurrences of entities in a sentence
are often linked through well-defined relations; e.g., occurrences of person
and organization in a sentence may be linked through relations such as employed
at. The task of Relation Extraction (RE) is to identify such relations
automatically. In this paper, we survey several important supervised,
semi-supervised and unsupervised RE techniques. We also cover the paradigms of
Open Information Extraction (OIE) and Distant Supervision. Finally, we describe
some of the recent trends in the RE techniques and possible future research
directions. This survey would be useful for three kinds of readers: i)
newcomers to the field who want to quickly learn about RE; ii) researchers who
want to know how the various RE techniques evolved over time and what the
possible future research directions are; and iii) practitioners who just need to
know which RE technique works best in various settings.
Regression-based Hypergraph Learning for Image Clustering and Classification
Inspired by the recent remarkable successes of Sparse Representation (SR),
Collaborative Representation (CR) and sparse graphs, we present a novel
hypergraph model named Regression-based Hypergraph (RH), which utilizes
regression models to construct high-quality hypergraphs. Moreover, we plug
RH into two conventional hypergraph learning frameworks, namely hypergraph
spectral clustering and hypergraph transduction, to present Regression-based
Hypergraph Spectral Clustering (RHSC) and Regression-based Hypergraph
Transduction (RHT) models for addressing the image clustering and
classification issues. Sparse Representation and Collaborative Representation
are employed to instantiate two RH instances and their RHSC and RHT algorithms.
The experimental results on six popular image databases demonstrate that the
proposed RH learning algorithms achieve promising image clustering and
classification performances, and also validate that RH can inherit the
desirable properties from both hypergraph models and regression models.
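The regression-to-hypergraph construction can be sketched as follows, with ridge regression standing in for the SR/CR models the paper actually instantiates: each sample is reconstructed from all the others, and a hyperedge is formed from the sample together with its strongest regression contributors. All names are illustrative.

```python
import numpy as np

def regression_hyperedges(X, k=2, lam=0.1):
    # Regression-based hyperedge construction (generic sketch): reconstruct
    # each sample x_i from the remaining samples via ridge regression, then
    # form a hyperedge from x_i plus its top-k contributors by coefficient
    # magnitude.
    n = X.shape[0]
    edges = []
    for i in range(n):
        idx = [j for j in range(n) if j != i]
        B = X[idx].T                          # dictionary of other samples
        # ridge solution: (B^T B + lam*I)^{-1} B^T x_i
        c = np.linalg.solve(B.T @ B + lam * np.eye(n - 1), B.T @ X[i])
        top = np.argsort(-np.abs(c))[:k]
        edges.append({i, *(idx[t] for t in top)})
    return edges
```

With a sparsity-inducing regression (as in SR), the nonzero coefficients would directly select the hyperedge members instead of a top-k cutoff.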
An Optimization Framework for Semi-Supervised and Transfer Learning using Multiple Classifiers and Clusterers
Unsupervised models can provide supplementary soft constraints to help
classify new, "target" data since similar instances in the target set are more
likely to share the same class label. Such models can also help detect possible
differences between training and target distributions, which is useful in
applications where concept drift may take place, as in transfer learning
settings. This paper describes a general optimization framework that takes as
input class membership estimates from existing classifiers learnt on previously
encountered "source" data, as well as a similarity matrix from a cluster
ensemble operating solely on the target data to be classified, and yields a
consensus labeling of the target data. This framework admits a wide range of
loss functions and classification/clustering methods. It exploits properties of
Bregman divergences in conjunction with Legendre duality to yield a principled
and scalable approach. A variety of experiments show that the proposed
framework can yield results substantially superior to those provided by popular
transductive learning techniques or by naively applying classifiers learnt on
the original task to the target data.
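A much-simplified version of the consensus idea can be sketched as follows; plain weighted averaging stands in for the paper's Bregman-divergence machinery, and all names are illustrative. Classifier estimates `P` come from the source data, while the similarity matrix `S` comes from a cluster ensemble on the target data.

```python
import numpy as np

def consensus_labels(P, S, alpha=0.5, iters=50):
    # Simplified consensus labeling: repeatedly mix each point's class
    # distribution with the similarity-weighted distributions of its
    # cluster-mates, while staying anchored to the classifier estimates P.
    W = S / S.sum(axis=1, keepdims=True)   # row-normalized target similarity
    Q = P.copy()
    for _ in range(iters):
        Q = alpha * P + (1 - alpha) * (W @ Q)
    return Q.argmax(axis=1)
```

Points the classifiers are uncertain about inherit the consensus of their cluster, which is exactly the soft constraint the framework exploits.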