Structure fusion based on graph convolutional networks for semi-supervised classification
Because multi-view data are diverse and complex, most existing graph
convolutional networks for semi-supervised classification focus on network
architecture construction or on preserving the salient graph structure, and
ignore the contribution of the complete graph structure. To mine a more
complete distribution structure from multi-view data while accounting for both
specificity and commonality, we propose structure fusion based on graph
convolutional networks (SF-GCN) to improve the performance of semi-supervised
classification. SF-GCN not only retains the special characteristics of each
view by spectral embedding, but also captures the common style of multi-view
data through a distance metric between multi-graph structures. Assuming a
linear relationship between multi-graph structures, we construct the
optimization function of the structure fusion model by balancing the
specificity loss and the commonality loss. By solving this function, we
simultaneously obtain the fused spectral embedding of the multi-view data and
the fused structure, which serves as the adjacency matrix input to graph
convolutional networks for semi-supervised classification. Experiments
demonstrate that SF-GCN outperforms state-of-the-art methods on three
challenging citation network datasets: Cora, Citeseer and Pubmed.
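The fusion idea in this abstract can be illustrated with a minimal sketch. It assumes the fused structure is a fixed weighted linear combination of per-view adjacency matrices (the abstract only states that a linear relationship is assumed) and then applies the symmetric renormalization commonly used before a graph convolution. The function names, the weights and the toy graphs are hypothetical, and the SF-GCN optimization that would actually learn the weights is not reproduced here.

```python
import math

def fuse_adjacency(views, weights):
    """Weighted linear combination of same-sized adjacency matrices (assumed form)."""
    n = len(views[0])
    return [[sum(w * a[i][j] for a, w in zip(views, weights)) for j in range(n)]
            for i in range(n)]

def normalize_adjacency(adj):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} with self-loops added."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

# Two toy views of the same three nodes, each seeing a different edge.
view_a = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
view_b = [[0, 0, 0], [0, 0, 1], [0, 1, 0]]

# Hypothetical equal weights; a real model would learn these.
fused = fuse_adjacency([view_a, view_b], [0.5, 0.5])
norm = normalize_adjacency(fused)
```

The fused graph contains both edges at half strength, so information from either view can reach every node once `norm` is used as the propagation matrix.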
Graph-based Semi-supervised Learning: Algorithms and Applications.
114 p.
Graph-based semi-supervised learning has attracted a large number of researchers and is an important part of semi-supervised learning. Graph construction and semi-supervised embedding are the two main steps in graph-based semi-supervised learning algorithms. In this thesis, we propose two graph construction algorithms and two semi-supervised embedding algorithms. The main work of this thesis is summarized as follows:
1. A new graph construction algorithm named Graph construction based on self-representativeness and Laplacian smoothness (SRLS), together with several variants, is proposed. Research shows that the coefficients obtained by data representation algorithms reflect the similarity between data samples and can be regarded as a similarity measure. This measure can be used as the weights of the edges between data samples in graph construction. Each column of the coefficient matrix obtained by data self-representation algorithms can be regarded as a new representation of the original data. The new representations should share features with the original data samples. Thus, if two data samples are close to each other in the original space, the corresponding representations should be highly similar. This constraint is called Laplacian smoothness. The SRLS graph is based on l2-norm minimized data self-representation and Laplacian smoothness. Since the representation matrix obtained by l2 minimization is dense, a two-phase SRLS method (TPSRLS) is proposed to increase the sparsity of the graph matrix. By extending the linear space to a Hilbert space, two kernelized versions of SRLS are proposed. In addition, a direct solution to the kernelized SRLS algorithm is introduced.
2. A new sparse graph construction algorithm named Sparse graph with Laplacian smoothness (SGLS), together with several variants, is proposed. The SGLS graph algorithm is based on sparse representation and uses Laplacian smoothness as a constraint. A kernelized version of the SGLS algorithm and a direct solution to the kernelized SGLS algorithm are also proposed.
3. SPP is a successful unsupervised learning method. To extend SPP to a semi-supervised embedding method, we introduce the idea of in-class constraints from CGE into SPP and propose a new semi-supervised data embedding method named Constrained Sparsity Preserving Embedding (CSPE).
4. The weakness of CSPE is that it cannot handle newly arriving samples, which means a cascade regression must be performed after the non-linear mapping is obtained by CSPE over all training samples. Inspired by FME, we add a regression term to the objective function to obtain an approximate linear projection at the same time as the non-linear embedding is estimated, and propose Flexible Constrained Sparsity Preserving Embedding (FCSPE).
Extensive experiments on several datasets (including facial images, handwritten digit images and object images) show that the proposed algorithms can improve on state-of-the-art results.
Hypergraph Neural Networks
In this paper, we present a hypergraph neural networks (HGNN) framework for
data representation learning, which can encode high-order data correlation in a
hypergraph structure. Confronting the challenges of learning representation for
complex data in real practice, we propose to incorporate such data structure in
a hypergraph, which is more flexible on data modeling, especially when dealing
with complex data. In this method, a hyperedge convolution operation is
designed to handle the data correlation during representation learning. In this
way, traditional hypergraph learning procedure can be conducted using hyperedge
convolution operations efficiently. HGNN is able to learn the hidden layer
representation considering the high-order data structure, which is a general
framework considering the complex data correlations. We have conducted
experiments on citation network classification and visual object recognition
tasks and compared HGNN with graph convolutional networks and other traditional
methods. Experimental results demonstrate that the proposed HGNN method
outperforms recent state-of-the-art methods. The results also show that the
proposed HGNN is superior to existing methods when dealing with multi-modal
data. Comment: Accepted in AAAI'201
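To make the hyperedge convolution more concrete, the following sketch computes one propagation step of the form D_v^{-1/2} H W D_e^{-1} H^T D_v^{-1/2} X, which is the form usually given for HGNN; the abstract itself does not state the formula, so treat it as an assumption. The learnable filter and the nonlinearity are left out, the function name and toy data are hypothetical, and every vertex is assumed to belong to at least one hyperedge.

```python
import math

def hyperedge_convolution(features, incidence, edge_weights):
    """One propagation step without learned weights or activation.

    features:     n x d list of vertex features (X)
    incidence:    n x m 0/1 matrix, incidence[i][e] = 1 if vertex i is in hyperedge e (H)
    edge_weights: length-m list of hyperedge weights (diagonal of W)
    """
    n, m = len(incidence), len(incidence[0])
    dim = len(features[0])
    # Vertex degree: weighted count of hyperedges containing each vertex.
    d_v = [sum(incidence[i][e] * edge_weights[e] for e in range(m)) for i in range(n)]
    # Hyperedge degree: number of vertices in each hyperedge.
    d_e = [sum(incidence[i][e] for i in range(n)) for e in range(m)]
    # Step 1: gather degree-scaled vertex features into each hyperedge.
    edge_feat = [[sum(incidence[i][e] * features[i][k] / math.sqrt(d_v[i])
                      for i in range(n)) / d_e[e]
                  for k in range(dim)] for e in range(m)]
    # Step 2: scatter weighted hyperedge features back to the vertices.
    return [[sum(incidence[i][e] * edge_weights[e] * edge_feat[e][k]
                 for e in range(m)) / math.sqrt(d_v[i])
             for k in range(dim)] for i in range(n)]

# Three vertices, two hyperedges: e0 = {0, 1}, e1 = {1, 2}.
H = [[1, 0], [1, 1], [0, 1]]
X = [[1.0], [1.0], [1.0]]
out = hyperedge_convolution(X, H, [1.0, 1.0])
```

Splitting the product into a vertex-to-hyperedge step and a hyperedge-to-vertex step is a design choice for readability; a matrix library would compute the same product directly.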
Analysis of label noise in graph-based semi-supervised learning
In machine learning, one must acquire labels to help supervise a model that
will be able to generalize to unseen data. However, the labeling process can be
tedious, long, costly, and error-prone. It is often the case that most of our
data is unlabeled. Semi-supervised learning (SSL) alleviates that by making
strong assumptions about the relation between the labels and the input data
distribution. This paradigm has been successful in practice, but most SSL
algorithms end up fully trusting the few available labels. In real life, both
humans and automated systems are prone to mistakes; it is essential that our
algorithms are able to work with labels that are both few and also unreliable.
Our work aims to perform an extensive empirical evaluation of existing
graph-based semi-supervised algorithms, such as Gaussian Fields and Harmonic
Functions, Local and Global Consistency, Laplacian Eigenmaps, and Graph
Transduction Through Alternating Minimization. To do that, we compare the
accuracy of classifiers while varying the amount of labeled data and label
noise for many different samples. Our results show that, if the dataset is
consistent with SSL assumptions, we are able to detect the noisiest instances,
although this gets harder when the number of available labels decreases. Also,
the Laplacian Eigenmaps algorithm performed better than label propagation when
the data came from high-dimensional clusters.
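The sensitivity to label noise described above can be seen in a small sketch of harmonic-function label propagation, one of the algorithm families the abstract names. This is a simplified iterative version under stated assumptions: binary labels encoded as 0.0 and 1.0, a dense weight matrix, and a fixed iteration count. The function name and the toy chain graph are hypothetical and are not taken from the paper.

```python
def harmonic_label_propagation(weights, labels, iterations=200):
    """Iteratively set each unlabeled node to the weighted mean of its neighbors.

    weights: n x n symmetric non-negative edge weights
    labels:  dict mapping labeled node index -> 0.0 or 1.0 (kept fixed, i.e. fully trusted)
    """
    n = len(weights)
    scores = [labels.get(i, 0.5) for i in range(n)]
    for _ in range(iterations):
        new_scores = list(scores)
        for i in range(n):
            if i in labels:
                continue  # labeled nodes are clamped to their given label
            total = sum(weights[i])
            if total > 0:
                new_scores[i] = sum(weights[i][j] * scores[j] for j in range(n)) / total
        scores = new_scores
    return scores

# Chain graph 0 - 1 - 2 - 3 with the two end nodes labeled.
chain = [[0, 1, 0, 0],
         [1, 0, 1, 0],
         [0, 1, 0, 1],
         [0, 0, 1, 0]]
clean = harmonic_label_propagation(chain, {0: 0.0, 3: 1.0})
# Same graph, but the label of node 0 has been flipped by noise.
noisy = harmonic_label_propagation(chain, {0: 1.0, 3: 1.0})
```

With clean labels the interior nodes interpolate between the two classes; with one flipped label every node is pulled to class 1, because the clamped labels are never questioned.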
Self-weighted Multiple Kernel Learning for Graph-based Clustering and Semi-supervised Classification
Multiple kernel learning (MKL) methods are generally believed to perform better
than single kernel methods. However, some empirical studies show that this is
not always true: a combination of multiple kernels may yield even worse
performance than a single kernel. There are two possible reasons for the
failure: (i) most existing MKL methods assume that the optimal kernel is a
linear combination of base kernels, which may not hold true; and (ii) some
kernel weights are inappropriately assigned due to noise and carelessly
designed algorithms. In this paper, we propose a novel MKL framework by
following two intuitive assumptions: (i) each kernel is a perturbation of the
consensus kernel; and (ii) the kernel that is close to the consensus kernel
should be assigned a large weight. Impressively, the proposed method can
automatically assign an appropriate weight to each kernel without introducing
the additional parameters that existing methods require. The proposed
framework is integrated into a unified framework for graph-based clustering and
semi-supervised classification. We have conducted experiments on multiple
benchmark datasets and our empirical results verify the superiority of the
proposed framework. Comment: Accepted by IJCAI 2018, Code is availabl
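The two assumptions in this abstract suggest a simple alternating scheme, sketched below. It assumes an inverse-distance rule, where each weight is 1 / (2 * Frobenius distance to the consensus kernel), and a consensus that is the weighted average of the base kernels. The abstract does not give the authors' update rule or how the consensus couples with the clustering objective, so the rule, the function names and the toy kernels here are all illustrative assumptions.

```python
import math

def frobenius_distance(a, b):
    """Frobenius norm of the difference between two equal-sized matrices."""
    return math.sqrt(sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)))

def self_weighted_consensus(kernels, iterations=5, eps=1e-12):
    """Alternate between a weighted-average consensus and inverse-distance weights."""
    n = len(kernels[0])
    weights = [1.0 / len(kernels)] * len(kernels)
    consensus = None
    for _ in range(iterations):
        total = sum(weights)
        # Consensus kernel: weighted average of the base kernels.
        consensus = [[sum(w * k[i][j] for w, k in zip(weights, kernels)) / total
                      for j in range(n)] for i in range(n)]
        # Kernels close to the consensus get large weights (assumed rule).
        weights = [1.0 / (2.0 * frobenius_distance(k, consensus) + eps) for k in kernels]
    total = sum(weights)
    return consensus, [w / total for w in weights]

# Two similar kernels and one outlier on two samples.
k1 = [[1.0, 0.8], [0.8, 1.0]]
k2 = [[1.0, 0.7], [0.7, 1.0]]
k3 = [[1.0, -0.5], [-0.5, 1.0]]
consensus, weights = self_weighted_consensus([k1, k2, k3])
```

No weighting hyperparameter appears in the update, which mirrors the parameter-free claim; in isolation, however, this rule tends to concentrate on a single kernel, so it should be read only as an illustration of the weighting idea.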
Extracting Information from Multimodal Remote Sensing Data for Sea Ice Characterization
Remote sensing is the discipline that studies the acquisition, preparation and analysis of spectral, spatial and temporal properties of objects without direct touch or contact. It is a field of great importance for understanding the climate system and its changes, as well as for conducting operations in the Arctic. A current challenge, however, is that most sensory equipment can capture only one or a few of the characteristics needed to accurately describe ground objects through their temporal, spatial, spectral and radiometric resolution characteristics. This in turn motivates the fusing of complementary modalities for potentially improved accuracy and stability in analysis, but it also leads to problems when trying to merge heterogeneous data with different statistical, geometric and physical qualities.
Another concern in the remote sensing of Arctic regions is the scarcity of high-quality labeled data alongside an abundance of unlabeled data, as gathering labeled data can be both costly and time consuming. It could therefore be of great value to explore routes that can automate this process in ways that address both the availability of data and the difficulties of fusing heterogeneous multimodal data. To this end, semi-supervised methods were considered for their ability to leverage smaller amounts of carefully labeled data in combination with more widely available unlabeled data to achieve greater classification performance.
Strengths and limitations of three algorithms for real-life applications are assessed through experiments on datasets from Arctic and urban areas. The first two algorithms, Deep Semi-Supervised Label Propagation (LP) and MixMatch Holistic SSL (MixMatch), consider simultaneous processing of multimodal remote sensing data with additional extracted Gray Level Co-occurrence Matrix (GLCM) texture features for image classification. LP alternates between steps of supervised learning on potentially pseudolabeled data and steps of deciding new labels through node propagation, while MixMatch mixes loss terms from several leading algorithms to gain their respective benefits. The third method, Graph Fusion Merriman Bence Osher (GMBO), explores processing of modalities in parallel by constructing a fused graph from complementary input modalities and performing Ginzburg-Landau minimization on an approximated graph Laplacian. Results imply that inclusion of extracted GLCM features could be beneficial for classification of multimodal remote sensing data, and that GMBO has merits for operational use in the Arctic, given that certain data prerequisites are met.
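As a brief illustration of the GLCM texture features mentioned above, the sketch below counts gray-level co-occurrences for one pixel offset and derives the contrast statistic from them. It assumes an image already quantized to a small number of integer gray levels and a non-symmetric, unnormalized count matrix. The function names and the toy image are hypothetical and are not taken from the thesis.

```python
def glcm(image, levels, dx=1, dy=0):
    """Count co-occurrences of gray levels (i, j) for pixel pairs at offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            # Only count pairs whose second pixel lies inside the image.
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    return counts

def contrast(counts):
    """GLCM contrast: mean squared gray-level difference over all counted pairs."""
    total = sum(sum(row) for row in counts)
    size = len(counts)
    return sum(counts[i][j] * (i - j) ** 2
               for i in range(size) for j in range(size)) / total

# Toy 4x4 image with four gray levels; pairs are taken with the right-hand neighbor.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 2, 2, 2],
         [2, 2, 3, 3]]
counts = glcm(image, levels=4)
texture_contrast = contrast(counts)
```

Statistics such as contrast, computed per image patch, could then be stacked with the original bands as additional input features for a classifier.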