785 research outputs found

    Structure fusion based on graph convolutional networks for semi-supervised classification

    Full text link
    Suffering from the multi-view data diversity and complexity for semi-supervised classification, most of existing graph convolutional networks focus on the networks architecture construction or the salient graph structure preservation, and ignore the the complete graph structure for semi-supervised classification contribution. To mine the more complete distribution structure from multi-view data with the consideration of the specificity and the commonality, we propose structure fusion based on graph convolutional networks (SF-GCN) for improving the performance of semi-supervised classification. SF-GCN can not only retain the special characteristic of each view data by spectral embedding, but also capture the common style of multi-view data by distance metric between multi-graph structures. Suppose the linear relationship between multi-graph structures, we can construct the optimization function of structure fusion model by balancing the specificity loss and the commonality loss. By solving this function, we can simultaneously obtain the fusion spectral embedding from the multi-view data and the fusion structure as adjacent matrix to input graph convolutional networks for semi-supervised classification. Experiments demonstrate that the performance of SF-GCN outperforms that of the state of the arts on three challenging datasets, which are Cora,Citeseer and Pubmed in citation networks

    Graph-based Semi-supervised Learning: Algorithms and Applications.

    Get PDF
    114 p.Graph-based semi-supervised learning have attracted large numbers of researchers and it is an important part of semi-supervised learning. Graph construction and semi-supervised embedding are two main steps in graph-based semi-supervised learning algorithms. In this thesis, we proposed two graph construction algorithms and two semi-supervised embedding algorithms. The main work of this thesis is summarized as follows:1. A new graph construction algorithm named Graph construction based on self-representativeness and Laplacian smoothness (SRLS) and several variants are proposed. Researches show that the coefficients obtained by data representation algorithms reflect the similarity between data samples and can be considered as a measurement of the similarity. This kind of measurement can be used for the weights of the edges between data samples in graph construction. Each column of the coefficient matrix obtained by data self-representation algorithms can be regarded as a new representation of original data. The new representations should have common features as the original data samples. Thus, if two data samples are close to each other in the original space, the corresponding representations should be highly similar. This constraint is called Laplacian smoothness.SRLS graph is based on l2-norm minimized data self-representation and Laplacian smoothness. Since the representation matrix obtained by l2 minimization is dense, a two phrase SRLS method (TPSRLS) is proposed to increase the sparsity of graph matrix. By extending the linear space to Hilbert space, two kernelized versions of SRLS are proposed. Besides, a direct solution to kernelized SRLS algorithm is also introduced.2. A new sparse graph construction algorithm named Sparse graph with Laplacian smoothness (SGLS) and several variants are proposed. SGLS graph algorithm is based on sparse representation and use Laplacian smoothness as a constraint (SGLS). A kernelized version of the SGLS algorithm and a direct solution to kernelized SGLS algorithm are also proposed. 3. SPP is a successful unsupervised learning method. To extend SPP to a semi-supervised embedding method, we introduce the idea of in-class constraints in CGE into SPP and propose a new semi-supervised method for data embedding named Constrained Sparsity Preserving Embedding (CSPE).4. The weakness of CSPE is that it cannot handle the new coming samples which means a cascade regression should be performed after the non-linear mapping is obtained by CSPE over the whole training samples. Inspired by FME, we add a regression term in the objective function to obtain an approximate linear projection simultaneously when non-linear embedding is estimated and proposed Flexible Constrained Sparsity Preserving Embedding (FCSPE).Extensive experiments on several datasets (including facial images, handwriting digits images and objects images) prove that the proposed algorithms can improve the state-of-the-art results

    Graph-based Semi-supervised Learning: Algorithms and Applications.

    Get PDF
    114 p.Graph-based semi-supervised learning have attracted large numbers of researchers and it is an important part of semi-supervised learning. Graph construction and semi-supervised embedding are two main steps in graph-based semi-supervised learning algorithms. In this thesis, we proposed two graph construction algorithms and two semi-supervised embedding algorithms. The main work of this thesis is summarized as follows:1. A new graph construction algorithm named Graph construction based on self-representativeness and Laplacian smoothness (SRLS) and several variants are proposed. Researches show that the coefficients obtained by data representation algorithms reflect the similarity between data samples and can be considered as a measurement of the similarity. This kind of measurement can be used for the weights of the edges between data samples in graph construction. Each column of the coefficient matrix obtained by data self-representation algorithms can be regarded as a new representation of original data. The new representations should have common features as the original data samples. Thus, if two data samples are close to each other in the original space, the corresponding representations should be highly similar. This constraint is called Laplacian smoothness.SRLS graph is based on l2-norm minimized data self-representation and Laplacian smoothness. Since the representation matrix obtained by l2 minimization is dense, a two phrase SRLS method (TPSRLS) is proposed to increase the sparsity of graph matrix. By extending the linear space to Hilbert space, two kernelized versions of SRLS are proposed. Besides, a direct solution to kernelized SRLS algorithm is also introduced.2. A new sparse graph construction algorithm named Sparse graph with Laplacian smoothness (SGLS) and several variants are proposed. SGLS graph algorithm is based on sparse representation and use Laplacian smoothness as a constraint (SGLS). A kernelized version of the SGLS algorithm and a direct solution to kernelized SGLS algorithm are also proposed. 3. SPP is a successful unsupervised learning method. To extend SPP to a semi-supervised embedding method, we introduce the idea of in-class constraints in CGE into SPP and propose a new semi-supervised method for data embedding named Constrained Sparsity Preserving Embedding (CSPE).4. The weakness of CSPE is that it cannot handle the new coming samples which means a cascade regression should be performed after the non-linear mapping is obtained by CSPE over the whole training samples. Inspired by FME, we add a regression term in the objective function to obtain an approximate linear projection simultaneously when non-linear embedding is estimated and proposed Flexible Constrained Sparsity Preserving Embedding (FCSPE).Extensive experiments on several datasets (including facial images, handwriting digits images and objects images) prove that the proposed algorithms can improve the state-of-the-art results

    Hypergraph Neural Networks

    Full text link
    In this paper, we present a hypergraph neural networks (HGNN) framework for data representation learning, which can encode high-order data correlation in a hypergraph structure. Confronting the challenges of learning representation for complex data in real practice, we propose to incorporate such data structure in a hypergraph, which is more flexible on data modeling, especially when dealing with complex data. In this method, a hyperedge convolution operation is designed to handle the data correlation during representation learning. In this way, traditional hypergraph learning procedure can be conducted using hyperedge convolution operations efficiently. HGNN is able to learn the hidden layer representation considering the high-order data structure, which is a general framework considering the complex data correlations. We have conducted experiments on citation network classification and visual object recognition tasks and compared HGNN with graph convolutional networks and other traditional methods. Experimental results demonstrate that the proposed HGNN method outperforms recent state-of-the-art methods. We can also reveal from the results that the proposed HGNN is superior when dealing with multi-modal data compared with existing methods.Comment: Accepted in AAAI'201

    Analysis of label noise in graph-based semi-supervised learning

    Full text link
    In machine learning, one must acquire labels to help supervise a model that will be able to generalize to unseen data. However, the labeling process can be tedious, long, costly, and error-prone. It is often the case that most of our data is unlabeled. Semi-supervised learning (SSL) alleviates that by making strong assumptions about the relation between the labels and the input data distribution. This paradigm has been successful in practice, but most SSL algorithms end up fully trusting the few available labels. In real life, both humans and automated systems are prone to mistakes; it is essential that our algorithms are able to work with labels that are both few and also unreliable. Our work aims to perform an extensive empirical evaluation of existing graph-based semi-supervised algorithms, like Gaussian Fields and Harmonic Functions, Local and Global Consistency, Laplacian Eigenmaps, Graph Transduction Through Alternating Minimization. To do that, we compare the accuracy of classifiers while varying the amount of labeled data and label noise for many different samples. Our results show that, if the dataset is consistent with SSL assumptions, we are able to detect the noisiest instances, although this gets harder when the number of available labels decreases. Also, the Laplacian Eigenmaps algorithm performed better than label propagation when the data came from high-dimensional clusters

    Self-weighted Multiple Kernel Learning for Graph-based Clustering and Semi-supervised Classification

    Full text link
    Multiple kernel learning (MKL) method is generally believed to perform better than single kernel method. However, some empirical studies show that this is not always true: the combination of multiple kernels may even yield an even worse performance than using a single kernel. There are two possible reasons for the failure: (i) most existing MKL methods assume that the optimal kernel is a linear combination of base kernels, which may not hold true; and (ii) some kernel weights are inappropriately assigned due to noises and carelessly designed algorithms. In this paper, we propose a novel MKL framework by following two intuitive assumptions: (i) each kernel is a perturbation of the consensus kernel; and (ii) the kernel that is close to the consensus kernel should be assigned a large weight. Impressively, the proposed method can automatically assign an appropriate weight to each kernel without introducing additional parameters, as existing methods do. The proposed framework is integrated into a unified framework for graph-based clustering and semi-supervised classification. We have conducted experiments on multiple benchmark datasets and our empirical results verify the superiority of the proposed framework.Comment: Accepted by IJCAI 2018, Code is availabl

    Extracting Information from Multimodal Remote Sensing Data for Sea Ice Characterization

    Get PDF
    Remote sensing is the discipline that studies acquisition, preparation and analysis of spectral, spatial and temporal properties of objects without direct touch or contact. It is a field of great importance to understanding the climate system and its changes, as well as for conducting operations in the Arctic. A current challenge however is that most sensory equipment can only capture one or fewer of the characteristics needed to accurately describe ground objects through their temporal, spatial, spectral and radiometric resolution characteristics. This in turn motivates the fusing of complimentary modalities for potentially improved accuracy and stability in analysis but it also leads to problems when trying to merge heterogeneous data with different statistical, geometric and physical qualities. Another concern in the remote sensing of arctic regions is the scarcity of high quality labeled data but simultaneous abundance of unlabeled data as the gathering of labeled data can be both costly and time consuming. It could therefore be of great value to explore routes that can automate this process in ways that target both the situation regarding available data and the difficulties from fusing of heterogeneous multimodal data. To this end Semi-Supervised methods were considered for their ability to leverage smaller amounts of carefully labeled data in combination with more widely available unlabeled data in achieving greater classification performance. Strengths and limitations of three algorithms for real life applications are assessed through experiments on datasets from arctic and urban areas. The first two algorithms, Deep Semi-Supervised Label Propagation (LP) and MixMatch Holistic SSL (MixMatch), consider simultaneous processing of multimodal remote sensing data with additional extracted Gray Level Co-occurrence Matrix texture features for image classification. LP trains in alternating steps of supervised learning on potentially pseudolabeled data and steps of deciding new labels through node propagation while MixMatch mixes loss terms from several leading algorithms to gain their respective benefits. Another method, Graph Fusion Merriman Bence Osher (GMBO), explores processing of modalities in parallel by constructing a fused graph from complimentary input modalities and Ginzburg-Landau minimization on an approximated Graph Laplacian. Results imply that inclusion of extracted GLCM features could be beneficial for classification of multimodal remote sensing data, and that GMBO has merits for operational use in the Arctic given that certain data prerequisites are met
    corecore