Graph Augmentation using Spectral moments
Graph representation learning focuses on learning real-valued vectors for nodes, edges, or the whole graph, such that these vectors capture adequate information about these entities. Graph data augmentation focuses on changing the structure or features of a graph to improve classification performance and generalization. It can be broadly categorized into feature-based augmentation and structure-based augmentation. Feature augmentation changes the feature matrix, without changing the structure of the graph, to improve the performance of the graph neural network. Structure augmentation manipulates the adjacency matrix of a given graph to achieve better classification performance.
Our approach addresses graph augmentation from a spectral standpoint. More specifically, we attempt to augment a graph using spectral moments. Recent results have indicated that the second, third, and fourth spectral moments of a graph have strong connections to the graph's properties, such as degree distribution, clustering coefficient, and connectivity [1]. Our contribution is twofold. First, we describe a formal method to find a spectral moment that helps maximize node classification performance. Second, we provide an algorithm that augments the graph using its spectral moments, moving the graph to the spectral point that helps maximize classification performance while making the graph sparse. For node classification, we use the GraphSAGE model with no node sampling and the mean aggregator. We observe that node classification performance after augmentation goes up in a majority of our datasets, and that the graph also gets sparser across all our datasets.
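As an illustrative sketch (not the authors' code), the k-th spectral moment of a graph can be computed as the mean of the k-th powers of the eigenvalues of a graph matrix. The abstract does not say which matrix is used; the adjacency matrix is assumed here, and the function name `spectral_moment` is hypothetical. Under that assumption, for a simple undirected graph the second moment equals the average degree and the third moment equals six times the number of triangles divided by the number of nodes, which is one way to see the link to degree and clustering statistics.

```python
import numpy as np

def spectral_moment(adj, k):
    """k-th spectral moment: mean of the k-th powers of the eigenvalues.

    Assumption: `adj` is a symmetric adjacency matrix (the abstract does
    not state which graph matrix the authors use).
    """
    eigvals = np.linalg.eigvalsh(adj)  # real eigenvalues of a symmetric matrix
    return float(np.mean(eigvals ** k))

# Toy example: a triangle (3 nodes, 3 edges, 1 triangle).
A = np.array([[0, 1, 1],
              [1, 0, 1],
              [1, 1, 0]], dtype=float)

moments = [spectral_moment(A, k) for k in (2, 3, 4)]
```

For the triangle, the second moment is the average degree (2) and the third moment is 6 * 1 / 3 = 2.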
Spectral clustering in the Gaussian mixture block model
Gaussian mixture block models are distributions over graphs that strive to
model modern networks: to generate a graph from such a model, we associate each
vertex $i$ with a latent feature vector $u_i \in \mathbb{R}^d$ sampled from a
mixture of Gaussians, and we add edge $(i,j)$ if and only if the feature
vectors are sufficiently similar, in that $\langle u_i, u_j \rangle \ge \tau$
for a pre-specified threshold $\tau$. The different components of the Gaussian
mixture represent the fact that there may be different types of nodes with
different distributions over features -- for example, in a social network each
component represents the different attributes of a distinct community. Natural
algorithmic tasks associated with these networks are embedding (recovering the
latent feature vectors) and clustering (grouping nodes by their mixture
component).
In this paper we initiate the study of clustering and embedding graphs
sampled from high-dimensional Gaussian mixture block models, where the
dimension of the latent feature vectors $d \to \infty$ as the size of the
network $n \to \infty$. This high-dimensional setting is most appropriate in
the context of modern networks, in which we think of the latent feature space
as being high-dimensional. We analyze the performance of canonical spectral
clustering and embedding algorithms for such graphs in the case of 2-component
spherical Gaussian mixtures, and begin to sketch out the
information-computation landscape for clustering and embedding in these models. Comment: 41 pages
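The generative process described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's construction: similarity is taken to be the inner product compared against the threshold, the mixture has two equally likely spherical components with means `+mu` and `-mu` and identity covariance, and the names `sample_gmbm`, `mu`, and `tau` are hypothetical.

```python
import numpy as np

def sample_gmbm(n, d, mu, tau, seed=0):
    """Sample a graph from a 2-component Gaussian mixture block model.

    Assumptions (not from the abstract): components are N(+mu, I) and
    N(-mu, I), chosen with equal probability, and an edge is added when
    the inner product of two latent vectors is at least tau.
    """
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, 2, size=n)          # mixture component per vertex
    signs = 2 * labels - 1                       # map {0, 1} to {-1, +1}
    U = signs[:, None] * mu[None, :] + rng.standard_normal((n, d))
    gram = U @ U.T                               # pairwise inner products
    upper = np.triu(gram >= tau, k=1)            # threshold, skip self-loops
    adj = (upper | upper.T).astype(int)          # symmetrize
    return adj, labels, U

mu = np.zeros(20)
mu[0] = 3.0
adj, labels, U = sample_gmbm(n=50, d=20, mu=mu, tau=5.0)
```

Clustering then means recovering `labels` from `adj` alone, and embedding means recovering `U` (up to the symmetries of the model).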
Laplacian Mixture Modeling for Network Analysis and Unsupervised Learning on Graphs
Laplacian mixture models identify overlapping regions of influence in
unlabeled graph and network data in a scalable and computationally efficient
way, yielding useful low-dimensional representations. By combining Laplacian
eigenspace and finite mixture modeling methods, they provide probabilistic or
fuzzy dimensionality reductions or domain decompositions for a variety of input
data types, including mixture distributions, feature vectors, and graphs or
networks. Provable optimal recovery using the algorithm is analytically shown
for a nontrivial class of cluster graphs. Heuristic approximations for scalable
high-performance implementations are described and empirically tested.
Connections to PageRank and community detection in network analysis demonstrate
the wide applicability of this approach. The origins of fuzzy spectral methods,
beginning with generalized heat or diffusion equations in physics, are reviewed
and summarized. Comparisons to other dimensionality reduction and clustering
methods for challenging unsupervised machine learning problems are also
discussed. Comment: 13 figures, 35 references
Compressive Spectral Clustering
Spectral clustering has become a popular technique due to its high
performance in many contexts. It comprises three main steps: create a
similarity graph between N objects to cluster, compute the first k eigenvectors
of its Laplacian matrix to define a feature vector for each object, and run
k-means on these features to separate objects into k classes. Each of these
three steps becomes computationally intensive for large N and/or k. We propose
to speed up the last two steps based on recent results in the emerging field of
graph signal processing: graph filtering of random signals, and random sampling
of bandlimited graph signals. We prove that our method, with a gain in
computation time that can reach several orders of magnitude, is in fact an
approximation of spectral clustering, for which we are able to control the
error. We test the performance of our method on artificial and real-world
network data. Comment: 12 pages, 2 figures
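The three steps listed in this abstract (similarity graph, first k Laplacian eigenvectors, k-means) can be sketched as below. This is the plain baseline pipeline, not the compressive method the paper proposes; the unnormalized Laplacian, the farthest-point initialization, and the function name `spectral_clustering` are assumptions made for the illustration.

```python
import numpy as np

def spectral_clustering(W, k, n_iter=50):
    """Baseline spectral clustering on a symmetric similarity matrix W."""
    # Step 2: first k eigenvectors of the (unnormalized) Laplacian L = D - W.
    L = np.diag(W.sum(axis=1)) - W
    _, vecs = np.linalg.eigh(L)        # eigenvalues in ascending order
    X = vecs[:, :k]                    # one k-dimensional feature per node

    # Step 3: k-means (Lloyd's algorithm) with farthest-point initialization.
    centers = [X[0]]
    for _ in range(1, k):
        d2 = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[d2.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

# Step 1 (toy similarity graph): two 4-node cliques joined by one weak edge.
W = np.zeros((8, 8))
W[:4, :4] = 1.0
W[4:, 4:] = 1.0
np.fill_diagonal(W, 0.0)
W[3, 4] = W[4, 3] = 0.1

labels = spectral_clustering(W, k=2)
```

The dense eigendecomposition and the k-means loop over all N nodes are the two steps the abstract says become expensive for large N and k.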
Pattern vectors from algebraic graph theory
Graph structures have proven computationally cumbersome for pattern analysis. The reason for this is that, before graphs can be converted to pattern vectors, correspondences must be established between the nodes of structures which are potentially of different size. To overcome this problem, in this paper, we turn to the spectral decomposition of the Laplacian matrix. We show how the elements of the spectral matrix for the Laplacian can be used to construct symmetric polynomials that are permutation invariants. The coefficients of these polynomials can be used as graph features which can be encoded in a vectorial manner. We extend this representation to graphs in which there are unary attributes on the nodes and binary attributes on the edges by using the spectral decomposition of a Hermitian property matrix that can be viewed as a complex analogue of the Laplacian. To embed the graphs in a pattern space, we explore whether the vectors of invariants can be embedded in a low-dimensional space using a number of alternative strategies, including principal components analysis (PCA), multidimensional scaling (MDS), and locality preserving projection (LPP). Experimentally, we demonstrate that the embeddings result in well-defined graph clusters. Our experiments with the spectral representation involve both synthetic and real-world data. The experiments with synthetic data demonstrate that the distances between spectral feature vectors can be used to discriminate between graphs on the basis of their structure. The real-world experiments show that the method can be used to locate clusters of graphs.
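The idea of permutation-invariant spectral features can be illustrated with a simplified sketch. It is not the paper's construction, which builds symmetric polynomials from the elements of the spectral matrix; here the assumption is to use only the Laplacian eigenvalues, whose elementary symmetric polynomials are (up to sign) the coefficients of the characteristic polynomial. The function names are hypothetical.

```python
import numpy as np

def laplacian(adj):
    """Unnormalized graph Laplacian L = D - A."""
    return np.diag(adj.sum(axis=1)) - adj

def invariant_features(adj):
    """Characteristic-polynomial coefficients of the Laplacian.

    These are signed elementary symmetric polynomials of the eigenvalues,
    so they do not depend on how the nodes are labeled.
    """
    eigvals = np.linalg.eigvalsh(laplacian(adj))
    return np.poly(eigvals)

# Path graph on 4 nodes, and the same graph with its nodes relabeled.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
perm = [2, 0, 3, 1]
A_perm = A[np.ix_(perm, perm)]

f_orig = invariant_features(A)
f_perm = invariant_features(A_perm)
```

Both feature vectors agree up to floating-point error, so no node correspondence has to be established before comparing the two graphs.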
A survey of kernel and spectral methods for clustering
Clustering algorithms are a useful tool to explore data structures and have been employed in many disciplines. The focus of this paper is the partitioning clustering problem with a special interest in two recent approaches: kernel and spectral methods. The aim of this paper is to present a survey of kernel and spectral clustering methods, two approaches able to produce nonlinear separating hypersurfaces between clusters. The presented kernel clustering methods are the kernel versions of many classical clustering algorithms, e.g., K-means, SOM, and neural gas. Spectral clustering arises from concepts in spectral graph theory, and the clustering problem is configured as a graph cut problem where an appropriate objective function has to be optimized. An explicit proof that these two paradigms have the same objective is reported, since it has been proven that these two seemingly different approaches have the same mathematical foundation. In addition, fuzzy kernel clustering methods are presented as extensions of the kernel K-means clustering algorithm. (C) 2007 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.
Image classification by visual bag-of-words refinement and reduction
This paper presents a new framework for visual bag-of-words (BOW) refinement
and reduction to overcome the drawbacks associated with the visual BOW model
which has been widely used for image classification. Although very influential
in the literature, the traditional visual BOW model has two distinct drawbacks.
Firstly, for efficiency purposes, the visual vocabulary is commonly constructed
by directly clustering the low-level visual feature vectors extracted from
local keypoints, without considering the high-level semantics of images. That
is, the visual BOW model still suffers from the semantic gap, and thus may lead
to significant performance degradation in more challenging tasks (e.g. social
image classification). Secondly, typically thousands of visual words are
generated to obtain better performance on a relatively large image dataset. Due
to such a large vocabulary size, the subsequent image classification may take
a considerable amount of time. To overcome the first drawback, we develop a graph-based
method for visual BOW refinement by exploiting the tags (easy to access
although noisy) of social images. More notably, for efficient image
classification, we further reduce the refined visual BOW model to a much
smaller size through semantic spectral clustering. Extensive experimental
results show the promising performance of the proposed framework for visual BOW
refinement and reduction.