Deeper Insights into Graph Convolutional Networks for Semi-Supervised Learning
Many interesting problems in machine learning are being revisited with new
deep learning tools. For graph-based semi-supervised learning, a recent
important development is graph convolutional networks (GCNs), which nicely
integrate local vertex features and graph topology in the convolutional layers.
Although the GCN model compares favorably with other state-of-the-art methods,
its mechanisms are not clear and it still requires a considerable amount of
labeled data for validation and model selection. In this paper, we develop
deeper insights into the GCN model and address its fundamental limits. First,
we show that the graph convolution of the GCN model is actually a special form
of Laplacian smoothing, which is the key reason why GCNs work, but it also
brings potential concerns of over-smoothing with many convolutional layers.
Second, to overcome the limits of the GCN model with shallow architectures, we
propose both co-training and self-training approaches to train GCNs. Our
approaches significantly improve GCNs in learning with very few labels, and
exempt them from requiring additional labels for validation. Extensive
experiments on benchmarks have verified our theory and proposals.
Comment: AAAI-2018 Oral Presentation
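The Laplacian-smoothing view described in this abstract can be illustrated with a minimal sketch. It assumes the commonly used GCN propagation matrix D^{-1/2} (A + I) D^{-1/2} and applies it repeatedly to a single scalar feature, with no learned weights and no nonlinearity, so it shows only the smoothing effect and is not the authors' implementation. The 6-node cycle graph, the feature values, and all function names are hypothetical choices made for illustration.

```python
import math


def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a dense adjacency matrix (list of lists)."""
    n = len(adj)
    # Add self-loops, as in the usual GCN renormalization.
    a_tilde = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_tilde]
    return [[a_tilde[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def propagate(a_hat, features, steps):
    """Apply the propagation matrix `steps` times to one scalar feature per node."""
    x = list(features)
    for _ in range(steps):
        x = [sum(a_hat[i][j] * x[j] for j in range(len(x))) for i in range(len(x))]
    return x


def spread(values):
    """Gap between the largest and smallest node feature."""
    return max(values) - min(values)


def cycle_graph(n):
    """Adjacency matrix of an n-node ring (hypothetical toy graph)."""
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        adj[i][(i + 1) % n] = 1.0
        adj[(i + 1) % n][i] = 1.0
    return adj


a_hat = normalized_adjacency(cycle_graph(6))
x0 = [1.0, 1.0, 1.0, 0.0, 0.0, 0.0]  # two "classes" placed on opposite arcs of the ring

for k in (0, 2, 10, 50):
    # The spread shrinks as more propagation steps are stacked.
    print(k, spread(propagate(a_hat, x0, k)))
```

On this regular graph every node feature drifts toward the same average value, which is one way to picture the over-smoothing concern raised for many convolutional layers: after enough steps the two groups are no longer distinguishable from the features alone.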
Efficient Decomposed Learning for Structured Prediction
Structured prediction is the cornerstone of several machine learning
applications. Unfortunately, in structured prediction settings with expressive
inter-variable interactions, exact inference-based learning algorithms, e.g.
Structural SVM, are often intractable. We present a new approach, Decomposed
Learning (DecL), which performs efficient learning by restricting the inference
step to a limited part of the structured spaces. We provide characterizations
based on the structure, target parameters, and gold labels, under which DecL is
equivalent to exact learning. We then show that in real-world settings, where
our theoretical assumptions may not completely hold, DecL-based algorithms are
significantly more efficient than, and as accurate as, exact learning.
Comment: ICML201
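The idea of restricting inference to a limited part of the structured space can be sketched as follows. The abstract does not say how that part is chosen, so this sketch assumes one simple choice: binary labelings within a small Hamming distance k of the gold labeling. The functions `hamming_ball` and `restricted_argmax` and the toy score are hypothetical and only illustrate how much smaller the search becomes.

```python
from itertools import combinations


def hamming_ball(gold, k):
    """Yield every binary labeling that differs from `gold` in at most k positions."""
    n = len(gold)
    for r in range(k + 1):
        for positions in combinations(range(n), r):
            y = list(gold)
            for p in positions:
                y[p] = 1 - y[p]  # flip the selected binary variables
            yield tuple(y)


def restricted_argmax(score, gold, k):
    """Best-scoring labeling inside the restricted space (assumed neighborhood)."""
    return max(hamming_ball(gold, k), key=score)


gold = (0,) * 10
restricted = list(hamming_ball(gold, 2))
print(len(restricted), "labelings searched instead of", 2 ** len(gold))

# Toy score (an assumption): prefer labelings with many ones.
best = restricted_argmax(lambda y: sum(y), gold, 2)
print(best)
```

With 10 binary variables and k = 2, the restricted search visits 56 labelings rather than all 1024, which gives a rough sense of where the efficiency gain would come from under this assumption.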
A Two-stage Classification Method for High-dimensional Data and Point Clouds
High-dimensional data classification is a fundamental task in machine
learning and imaging science. In this paper, we propose a two-stage multiphase
semi-supervised classification method for classifying high-dimensional data and
unstructured point clouds. To begin with, a fuzzy classification method such as
the standard support vector machine is used to generate a warm initialization.
We then apply a two-stage approach named SaT (smoothing and thresholding) to
improve the classification. In the first stage, an unconstrained convex
variational model is implemented to purify and smooth the initialization,
followed by the second stage, which projects the smoothed partition
obtained in stage one onto a binary partition. These two stages can be repeated,
with the latest result as a new initialization, to keep improving the
classification quality. We show that the convex model of the smoothing stage
has a unique solution and can be solved by a specifically designed primal-dual
algorithm whose convergence is guaranteed. We test our method and compare it
with the state-of-the-art methods on several benchmark data sets. The
experimental results demonstrate clearly that our method is superior in both
classification accuracy and computation speed for high-dimensional data and
point clouds.
Comment: 21 pages, 4 figures
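The smoothing-and-thresholding loop described in this abstract can be sketched in a few lines. The paper's convex variational model and its primal-dual solver are not given here, so the smoothing stage below substitutes a simple quadratic graph-smoothing energy solved by Jacobi iterations; that substitution, the neighbor graph, the fuzzy scores, and all function names are assumptions made for illustration.

```python
def smooth(f, neighbors, mu=1.0, iters=200):
    """Stage 1 stand-in: Jacobi iterations for the quadratic energy
    (mu/2) * sum_i |u_i - f_i|^2 + (1/2) * sum_edges |u_i - u_j|^2."""
    u = [list(row) for row in f]
    for _ in range(iters):
        u = [
            [
                (mu * f[i][c] + sum(u[j][c] for j in neighbors[i])) / (mu + len(neighbors[i]))
                for c in range(len(f[i]))
            ]
            for i in range(len(f))
        ]
    return u


def threshold(u):
    """Stage 2: project fuzzy scores onto a binary (one-hot) partition."""
    labels = [max(range(len(row)), key=row.__getitem__) for row in u]
    one_hot = [[1.0 if c == k else 0.0 for c in range(len(row))] for row, k in zip(u, labels)]
    return labels, one_hot


def sat(f, neighbors, mu=1.0, rounds=1):
    """Repeat smoothing and thresholding, feeding each result back in as the new start."""
    current, labels = f, None
    for _ in range(rounds):
        labels, current = threshold(smooth(current, neighbors, mu))
    return labels


# Six points on a chain; point 1 receives a noisy fuzzy score from the warm start.
neighbors = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4]]
fuzzy = [[0.9, 0.1], [0.4, 0.6], [0.9, 0.1], [0.1, 0.9], [0.1, 0.9], [0.1, 0.9]]

print(threshold(fuzzy)[0])   # labels from the warm initialization alone
print(sat(fuzzy, neighbors))  # labels after one smoothing-and-thresholding round
```

In this toy case the smoothing stage pulls the noisy point back toward its neighbors before the thresholding stage commits to hard labels, which is the division of labor the two stages are meant to have.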
Multiclass Semi-Supervised Learning on Graphs using Ginzburg-Landau Functional Minimization
We present a graph-based variational algorithm for classification of
high-dimensional data, generalizing the binary diffuse interface model to the
case of multiple classes. Motivated by total variation techniques, the method
involves minimizing an energy functional made up of three terms. The first two
terms promote a stepwise continuous classification function with sharp
transitions between classes, while preserving symmetry among the class labels.
The third term is a data fidelity term, allowing us to incorporate prior
information into the model in a semi-supervised framework. The performance of
the algorithm on synthetic data, as well as on the COIL and MNIST benchmark
datasets, is competitive with state-of-the-art graph-based multiclass
segmentation methods.
Comment: 16 pages, to appear in Springer's Lecture Notes in Computer Science
volume "Pattern Recognition Applications and Methods 2013", part of series on
Advances in Intelligent and Soft Computing
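The three-term energy mentioned in this abstract can be made concrete with a small sketch. It evaluates the binary graph Ginzburg-Landau energy that the abstract says is being generalized, assuming the usual form: a graph Dirichlet term, a double-well potential with wells at -1 and +1, and a quadratic fidelity term on labeled nodes. The multiclass functional from the paper is not reproduced; the function name, the unnormalized Laplacian, and the parameters `eps` and `mu` are assumptions.

```python
def gl_energy(u, edges, labeled, eps=1.0, mu=1.0):
    """Binary graph Ginzburg-Landau energy (assumed form):
    (eps/2) * sum_edges w * (u_i - u_j)^2     smoothness on the graph
    + (1/(4*eps)) * sum_i (u_i^2 - 1)^2       double well, favors u_i in {-1, +1}
    + (mu/2) * sum_labeled (u_i - y_i)^2      fidelity to the known labels
    """
    dirichlet = sum(w * (u[i] - u[j]) ** 2 for i, j, w in edges)
    double_well = sum((ui * ui - 1.0) ** 2 for ui in u)
    fidelity = sum((u[i] - y) ** 2 for i, y in labeled.items())
    return 0.5 * eps * dirichlet + double_well / (4.0 * eps) + 0.5 * mu * fidelity


# Path graph with four nodes; the two end nodes carry known labels.
edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0)]
labeled = {0: 1.0, 3: -1.0}

sharp = [1.0, 1.0, -1.0, -1.0]    # one sharp transition, consistent with the labels
constant = [1.0, 1.0, 1.0, 1.0]   # no transition, contradicts the label at node 3

print(gl_energy(sharp, edges, labeled))
print(gl_energy(constant, edges, labeled))
```

For the sharp labeling only the single cut edge contributes, while the constant labeling pays only through the fidelity term. This shows how the first two terms reward piecewise-constant functions with few transitions and the third ties the result to the labeled data.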