SSFG: Stochastically Scaling Features and Gradients for Regularizing Graph Convolutional Networks
Graph convolutional networks have been successfully applied in various
graph-based tasks. In a typical graph convolutional layer, node features are
updated by aggregating neighborhood information. Repeatedly applying graph
convolutions can cause the oversmoothing issue, i.e., node features at deep
layers converge to similar values. Previous studies have suggested that
oversmoothing is one of the major issues that restrict the performance of graph
convolutional networks. In this paper, we propose a stochastic regularization
method to tackle the oversmoothing problem. In the proposed method, we
stochastically scale features and gradients (SSFG) by a factor sampled from a
probability distribution in the training procedure. By explicitly applying a
scaling factor to break feature convergence, the oversmoothing issue is
alleviated. We show that stochastic scaling at the gradient level is
complementary to scaling at the feature level in improving overall
performance. Our method does not increase the number of trainable parameters.
When used together with ReLU, our SSFG can be seen as a stochastic ReLU
activation function. We experimentally validate our SSFG regularization method
on three commonly used types of graph networks. Extensive experimental results
on seven benchmark datasets for four graph-based tasks demonstrate that our
SSFG regularization is effective in improving the overall performance of the
baseline graph networks.
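The core mechanism is simple enough to sketch. Below is a minimal PyTorch sketch of the idea as described above: one randomly sampled factor scales the features on the forward pass, and an independently sampled factor scales the gradient on the backward pass, with no effect at evaluation time. The normal distribution centred at 1 and the `sigma` parameter are illustrative assumptions; the paper's actual sampling distribution may differ.

```python
import torch

class _SSFG(torch.autograd.Function):
    """Scale features (forward) and gradients (backward) by
    independently sampled random factors."""

    @staticmethod
    def forward(ctx, x, sigma):
        ctx.sigma = sigma
        # Assumed distribution: normal centred at 1, clipped at 0.
        factor = (1.0 + sigma * torch.randn(1, device=x.device)).clamp(min=0.0)
        return x * factor

    @staticmethod
    def backward(ctx, grad_output):
        # Independently sampled scaling factor at the gradient level.
        factor = (1.0 + ctx.sigma * torch.randn(1, device=grad_output.device)).clamp(min=0.0)
        return grad_output * factor, None

def ssfg(x, sigma=0.2, training=True):
    """Identity at evaluation time; stochastic scaling while training."""
    return _SSFG.apply(x, sigma) if training else x
```

Applied after a graph convolution, e.g. `h = ssfg(relu(gconv(h, adj)), training=self.training)`, the scaling breaks exact feature convergence across layers while adding no trainable parameters; combined with ReLU it acts like the stochastic ReLU activation mentioned above.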
Efficient Deep Feature Learning and Extraction via StochasticNets
Deep neural networks are a powerful tool for feature learning and extraction
given their ability to model high-level abstractions in highly complex data.
One area worth exploring in feature learning and extraction using deep neural
networks is efficient neural connectivity formation for faster feature learning
and extraction. Motivated by findings of stochastic synaptic connectivity
formation in the brain as well as the brain's uncanny ability to efficiently
represent information, we propose the efficient learning and extraction of
features via StochasticNets, where sparsely-connected deep neural networks can
be formed via stochastic connectivity between neurons. To evaluate the
feasibility of such a deep neural network architecture for feature learning and
extraction, we train deep convolutional StochasticNets to learn abstract
features using the CIFAR-10 dataset, and extract the learned features from
images to perform classification on the SVHN and STL-10 datasets. Experimental
results show that features learned using deep convolutional StochasticNets,
with fewer neural connections than conventional deep convolutional neural
networks, can achieve classification accuracy better than or comparable to
that of conventional deep neural networks: a relative test error decrease of ~4.5% for
classification on the STL-10 dataset and ~1% for classification on the SVHN
dataset. Furthermore, it was shown that the deep features extracted using deep
convolutional StochasticNets can provide comparable classification accuracy
even when only 10% of the training data is used for feature learning. Finally,
it was also shown that significant gains in feature extraction speed can be
achieved in embedded applications using StochasticNets. As such, StochasticNets
allow for faster feature learning and extraction while achieving better or
comparable accuracy.
Comment: 10 pages. arXiv admin note: substantial text overlap with
arXiv:1508.0546
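As a rough illustration of stochastic connectivity formation, the sketch below builds a layer whose synapses each exist with probability `p`, sampled once at construction and fixed thereafter. The paper forms deep convolutional networks this way; the fully-connected layer and the names `StochasticLinear` and `p` are simplifying assumptions used for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StochasticLinear(nn.Module):
    """Sparsely-connected layer: each weight (synapse) is kept with
    probability p, decided once when the layer is built."""

    def __init__(self, in_features, out_features, p=0.5):
        super().__init__()
        self.weight = nn.Parameter(0.01 * torch.randn(out_features, in_features))
        self.bias = nn.Parameter(torch.zeros(out_features))
        # Fixed random connectivity: 1 with probability p, else 0.
        self.register_buffer("mask", (torch.rand(out_features, in_features) < p).float())

    def forward(self, x):
        # Masked weights stay zero, so only ~p of connections are active.
        return F.linear(x, self.weight * self.mask, self.bias)
```

With `p` well below 1, such a network carries far fewer connections than a densely-connected counterpart, which is where the reported gains in feature extraction speed come from.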
Bayesian graph convolutional neural networks for semi-supervised classification
Recently, techniques for applying convolutional neural networks to
graph-structured data have emerged. Graph convolutional neural networks (GCNNs)
have been used to address node and graph classification and matrix completion.
Although the performance has been impressive, the current implementations have
limited capability to incorporate uncertainty in the graph structure. Almost
all GCNNs process a graph as though it is a ground-truth depiction of the
relationship between nodes, but often the graphs employed in applications are
themselves derived from noisy data or modelling assumptions. Spurious edges may
be included; other edges may be missing between nodes that have very strong
relationships. In this paper we adopt a Bayesian approach, viewing the observed
graph as a realization from a parametric family of random graphs. We then
target inference of the joint posterior of the random graph parameters and the
node (or graph) labels. We present the Bayesian GCNN framework and develop an
iterative learning procedure for the case of assortative mixed-membership
stochastic block models. We present the results of experiments that demonstrate
that the Bayesian formulation can provide better performance when there are
very few labels available during the training process.
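The predictive step of this Bayesian formulation can be sketched as Monte Carlo averaging over sampled graphs and sampled weights. In the sketch below, `sample_graph` and `gcnn` are hypothetical stand-ins: the former would draw an adjacency matrix from the posterior of the random-graph model (an assortative mixed-membership stochastic block model in the paper), and the latter is any GCNN forward pass that yields a fresh weight sample per call (e.g. with dropout left on).

```python
import torch

def bayesian_gcnn_predict(features, sample_graph, gcnn,
                          n_graphs=5, n_weights=5):
    """Approximate the predictive label distribution by averaging
    softmax outputs over graph samples and weight samples."""
    probs = 0.0
    for _ in range(n_graphs):
        adj = sample_graph()              # graph drawn from its posterior
        for _ in range(n_weights):
            logits = gcnn(features, adj)  # one weight sample per call
            probs = probs + torch.softmax(logits, dim=-1)
    return probs / (n_graphs * n_weights)
```

Averaging over sampled graphs is what injects uncertainty about the observed edges into the prediction, which a single fixed adjacency matrix cannot capture.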
Interpretable Structure-Evolving LSTM
This paper develops a general framework for learning interpretable data
representation via Long Short-Term Memory (LSTM) recurrent neural networks over
hierarchical graph structures. Instead of learning LSTM models over pre-fixed
structures, we propose to further learn the intermediate interpretable
multi-level graph structures in a progressive and stochastic way from data
during the LSTM network optimization. We thus call this model the
structure-evolving LSTM. In particular, starting with an initial element-level
graph representation where each node is a small data element, the
structure-evolving LSTM gradually evolves the multi-level graph representations
by stochastically merging the graph nodes with high compatibilities along the
stacked LSTM layers. In each LSTM layer, we estimate the compatibility of two
connected nodes from their corresponding LSTM gate outputs, which is used to
generate a merging probability. The candidate graph structures are accordingly
generated where the nodes are grouped into cliques with their merging
probabilities. We then produce the new graph structure with a
Metropolis-Hastings algorithm, which alleviates the risk of getting stuck in
local optima by stochastic sampling with an acceptance probability. Once a
graph structure is accepted, a higher-level graph is then constructed by taking
the partitioned cliques as its nodes. During the evolving process,
the representation becomes more abstract at higher levels, where redundant
information is filtered out, allowing more efficient propagation of long-range
data dependencies. We evaluate the effectiveness of structure-evolving LSTM in
the application of semantic object parsing and demonstrate its advantage over
state-of-the-art LSTM models on standard benchmarks.
Comment: To appear in CVPR 2017 as a spotlight paper
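The merge-and-accept step described above can be sketched in a few lines. Here `compatibility` stands in for the merging probability derived from the LSTM gate outputs, and the scores fed to the Metropolis-Hastings test are hypothetical structure scores (e.g. negative task losses); both are illustrative assumptions rather than the paper's exact quantities.

```python
import math
import random

def propose_merges(edges, compatibility):
    """Stochastically merge connected node pairs: edge (u, v) is merged
    with probability compatibility(u, v)."""
    return [(u, v) for (u, v) in edges if random.random() < compatibility(u, v)]

def mh_accept(candidate_score, current_score, temperature=1.0):
    """Metropolis-Hastings test for a candidate graph structure. A worse
    candidate is still accepted with probability exp(gap / T), which is
    what lets the structure evolution escape local optima."""
    if candidate_score >= current_score:
        return True
    return random.random() < math.exp((candidate_score - current_score) / temperature)
```

Each accepted structure's cliques become the nodes of the next, higher-level graph, so the stacked LSTM layers operate on progressively coarser and more abstract representations.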