12,330 research outputs found
Similarity Search Over Graphs Using Localized Spectral Analysis
This paper provides a new similarity detection algorithm. Given an input set
of multi-dimensional data points, where each data point is assumed to be
multi-dimensional, and an additional reference data point for similarity
finding, the algorithm uses kernel method that embeds the data points into a
low dimensional manifold. Unlike other kernel methods, which consider the
entire data for the embedding, our method selects a specific set of kernel
eigenvectors. The eigenvectors are chosen to separate between the data points
and the reference data point so that similar data points can be easily
identified as being distinct from most of the members in the dataset.Comment: Published in SampTA 201
Laplacian Mixture Modeling for Network Analysis and Unsupervised Learning on Graphs
Laplacian mixture models identify overlapping regions of influence in
unlabeled graph and network data in a scalable and computationally efficient
way, yielding useful low-dimensional representations. By combining Laplacian
eigenspace and finite mixture modeling methods, they provide probabilistic or
fuzzy dimensionality reductions or domain decompositions for a variety of input
data types, including mixture distributions, feature vectors, and graphs or
networks. Provable optimal recovery using the algorithm is analytically shown
for a nontrivial class of cluster graphs. Heuristic approximations for scalable
high-performance implementations are described and empirically tested.
Connections to PageRank and community detection in network analysis demonstrate
the wide applicability of this approach. The origins of fuzzy spectral methods,
beginning with generalized heat or diffusion equations in physics, are reviewed
and summarized. Comparisons to other dimensionality reduction and clustering
methods for challenging unsupervised machine learning problems are also
discussed.Comment: 13 figures, 35 reference
Flow-based Influence Graph Visual Summarization
Visually mining a large influence graph is appealing yet challenging. People
are amazed by pictures of newscasting graph on Twitter, engaged by hidden
citation networks in academics, nevertheless often troubled by the unpleasant
readability of the underlying visualization. Existing summarization methods
enhance the graph visualization with blocked views, but have adverse effect on
the latent influence structure. How can we visually summarize a large graph to
maximize influence flows? In particular, how can we illustrate the impact of an
individual node through the summarization? Can we maintain the appealing graph
metaphor while preserving both the overall influence pattern and fine
readability?
To answer these questions, we first formally define the influence graph
summarization problem. Second, we propose an end-to-end framework to solve the
new problem. Our method can not only highlight the flow-based influence
patterns in the visual summarization, but also inherently support rich graph
attributes. Last, we present a theoretic analysis and report our experiment
results. Both evidences demonstrate that our framework can effectively
approximate the proposed influence graph summarization objective while
outperforming previous methods in a typical scenario of visually mining
academic citation networks.Comment: to appear in IEEE International Conference on Data Mining (ICDM),
Shen Zhen, China, December 201
Recent Advances in Graph Partitioning
We survey recent trends in practical algorithms for balanced graph
partitioning together with applications and future research directions
A short-graph Fourier transform via personalized PageRank vectors
The short-time Fourier transform (STFT) is widely used to analyze the spectra
of temporal signals that vary through time. Signals defined over graphs, due to
their intrinsic complexity, exhibit large variations in their patterns. In this
work we propose a new formulation for an STFT for signals defined over graphs.
This formulation draws on recent ideas from spectral graph theory, using
personalized PageRank vectors as its fundamental building block. Furthermore,
this work establishes and explores the connection between local spectral graph
theory and localized spectral analysis of graph signals. We accompany the
presentation with synthetic and real-world examples, showing the suitability of
the proposed approach
Compressive Embedding and Visualization using Graphs
Visualizing high-dimensional data has been a focus in data analysis
communities for decades, which has led to the design of many algorithms, some
of which are now considered references (such as t-SNE for example). In our era
of overwhelming data volumes, the scalability of such methods have become more
and more important. In this work, we present a method which allows to apply any
visualization or embedding algorithm on very large datasets by considering only
a fraction of the data as input and then extending the information to all data
points using a graph encoding its global similarity. We show that in most
cases, using only samples is sufficient to diffuse the
information to all data points. In addition, we propose quantitative
methods to measure the quality of embeddings and demonstrate the validity of
our technique on both synthetic and real-world datasets
NetLSD: Hearing the Shape of a Graph
Comparison among graphs is ubiquitous in graph analytics. However, it is a
hard task in terms of the expressiveness of the employed similarity measure and
the efficiency of its computation. Ideally, graph comparison should be
invariant to the order of nodes and the sizes of compared graphs, adaptive to
the scale of graph patterns, and scalable. Unfortunately, these properties have
not been addressed together. Graph comparisons still rely on direct approaches,
graph kernels, or representation-based methods, which are all inefficient and
impractical for large graph collections.
In this paper, we propose the Network Laplacian Spectral Descriptor (NetLSD):
the first, to our knowledge, permutation- and size-invariant, scale-adaptive,
and efficiently computable graph representation method that allows for
straightforward comparisons of large graphs. NetLSD extracts a compact
signature that inherits the formal properties of the Laplacian spectrum,
specifically its heat or wave kernel; thus, it hears the shape of a graph. Our
evaluation on a variety of real-world graphs demonstrates that it outperforms
previous works in both expressiveness and efficiency.Comment: KDD '18: The 24th ACM SIGKDD International Conference on Knowledge
Discovery & Data Mining, August 19--23, 2018, London, United Kingdo
- …