Consistency of Spectral Hypergraph Partitioning under Planted Partition Model
Hypergraph partitioning lies at the heart of a number of problems in machine
learning and network sciences. Many algorithms for hypergraph partitioning have
been proposed that extend standard approaches for graph partitioning to the
case of hypergraphs. However, theoretical aspects of such methods have seldom
received attention in the literature as compared to the extensive studies on
the guarantees of graph partitioning. For instance, consistency results of
spectral graph partitioning under the stochastic block model are well known. In
this paper, we present a planted partition model for sparse random non-uniform
hypergraphs that generalizes the stochastic block model. We derive an error
bound for a spectral hypergraph partitioning algorithm under this model using
matrix concentration inequalities. To the best of our knowledge, this is the
first consistency result related to partitioning non-uniform hypergraphs.
Comment: 35 pages, 2 figures, 1 table
Hypergraph p-Laplacian: A Differential Geometry View
The graph Laplacian plays key roles in information processing of relational
data, and has analogies with the Laplacian in differential geometry. In this
paper, we generalize the analogy between graph Laplacian and differential
geometry to the hypergraph setting, and propose a novel hypergraph
p-Laplacian. Unlike the existing two-node graph Laplacians, this
generalization makes it possible to analyze hypergraphs, where the edges are
allowed to connect any number of nodes. Moreover, we propose a semi-supervised
learning method based on the proposed hypergraph p-Laplacian, and formalize
it as the analogue of the Dirichlet problem, which often appears in physics.
We further explore theoretical connections to normalized hypergraph cut on a
hypergraph, and propose a normalized cut corresponding to the hypergraph
p-Laplacian. The proposed p-Laplacian is shown to outperform standard
hypergraph Laplacians in experiments on hypergraph semi-supervised
learning and normalized cut settings.
Comment: Extended version of our AAAI-18 paper
Multilayer Networks
In most natural and engineered systems, a set of entities interact with each
other in complicated patterns that can encompass multiple types of
relationships, change in time, and include other types of complications. Such
systems include multiple subsystems and layers of connectivity, and it is
important to take such "multilayer" features into account to try to improve our
understanding of complex systems. Consequently, it is necessary to generalize
"traditional" network theory by developing (and validating) a framework and
associated tools to study multilayer systems in a comprehensive fashion. The
origins of such efforts date back several decades and arose in multiple
disciplines, and now the study of multilayer networks has become one of the
most important directions in network science. In this paper, we discuss the
history of multilayer networks (and related concepts) and review the exploding
body of work on such networks. To unify the disparate terminology in the large
body of recent work, we discuss a general framework for multilayer networks,
construct a dictionary of terminology to relate the numerous existing concepts
to each other, and provide a thorough discussion that compares, contrasts, and
translates between related notions such as multilayer networks, multiplex
networks, interdependent networks, networks of networks, and many others. We
also survey and discuss existing data sets that can be represented as
multilayer networks. We review attempts to generalize single-layer-network
diagnostics to multilayer networks. We also discuss the rapidly expanding
research on multilayer-network models and notions like community structure,
connected components, tensor decompositions, and various types of dynamical
processes on multilayer networks. We conclude with a summary and an outlook.
Comment: Working paper; 59 pages, 8 figures
Hypergraph Learning with Line Expansion
Previous hypergraph expansions are solely carried out on either vertex level
or hyperedge level, thereby missing the symmetric nature of data co-occurrence,
and resulting in information loss. To address the problem, this paper treats
vertices and hyperedges equally and proposes a new hypergraph formulation named
the line expansion (LE) for hypergraph learning. The new expansion
bijectively induces a homogeneous structure from the hypergraph by treating
vertex-hyperedge pairs as "line nodes". By reducing the hypergraph to a simple
graph, the proposed line expansion makes existing graph learning
algorithms compatible with the higher-order structure and has been proven to be a
unifying framework for various hypergraph expansions. We evaluate the proposed
line expansion on five hypergraph datasets; the results show that our method
beats state-of-the-art (SOTA) baselines by a significant margin.
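The construction described in this abstract can be illustrated with a small sketch. It builds one "line node" per vertex-hyperedge pair, as the abstract states. The adjacency rule used here (two line nodes are linked when they share a vertex or a hyperedge) is an assumption made for illustration, and the function name `line_expansion` is hypothetical rather than taken from the paper.

```python
from itertools import combinations


def line_expansion(hyperedges):
    """Return (line_nodes, edges) for a list of hyperedges (sets of vertices).

    A line node is a (vertex, hyperedge_index) pair, one per incidence.
    """
    line_nodes = [
        (v, e)
        for e, members in enumerate(hyperedges)
        for v in sorted(members)
    ]
    edges = set()
    # Assumed rule: connect two line nodes that share a vertex or a hyperedge.
    for a, b in combinations(line_nodes, 2):
        if a[0] == b[0] or a[1] == b[1]:
            edges.add((a, b))
    return line_nodes, edges


# Toy hypergraph: vertex "c" sits in both hyperedges.
hyperedges = [{"a", "b", "c"}, {"c", "d"}]
nodes, edges = line_expansion(hyperedges)
print(len(nodes), len(edges))
```

The result is an ordinary graph, so any standard graph learning routine could be run on `nodes` and `edges`; mapping predictions on line nodes back to the original vertices is left out of this sketch.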
Overlapping and Robust Edge-Colored Clustering in Hypergraphs
A recent trend in data mining has explored (hyper)graph clustering algorithms
for data with categorical relationship types. Such algorithms have applications
in the analysis of social, co-authorship, and protein interaction networks, to
name a few. Many such applications naturally have some overlap between
clusters, a nuance which is missing from current combinatorial models.
Additionally, existing models lack a mechanism for handling noise in datasets.
We address these concerns by generalizing Edge-Colored Clustering, a recent
framework for categorical clustering of hypergraphs. Our generalizations allow
for a budgeted number of either (a) overlapping cluster assignments or (b) node
deletions. For each new model we present a greedy algorithm which approximately
minimizes an edge mistake objective, as well as bicriteria approximations where
the second approximation factor is on the budget. Additionally, we address the
parameterized complexity of each problem, providing FPT algorithms and hardness
results.
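To make the "edge mistake" objective concrete, the sketch below assumes the basic (non-overlapping, noise-free) setting: every hyperedge carries one color, every node receives one color, and a hyperedge counts as a mistake when any of its nodes has a different color. The majority-vote assignment is a simple baseline chosen for illustration, not the greedy algorithm of the paper, and all names are hypothetical.

```python
from collections import Counter, defaultdict


def majority_vote_coloring(hyperedges):
    """Give each node the most frequent color among its incident hyperedges."""
    votes = defaultdict(Counter)
    for nodes, color in hyperedges:
        for v in nodes:
            votes[v][color] += 1
    # Ties are broken alphabetically so the result is deterministic.
    return {
        v: min(counts.items(), key=lambda kv: (-kv[1], kv[0]))[0]
        for v, counts in votes.items()
    }


def edge_mistakes(hyperedges, coloring):
    """Count hyperedges with at least one node not matching the edge color."""
    return sum(
        1
        for nodes, color in hyperedges
        if any(coloring[v] != color for v in nodes)
    )


hyperedges = [
    ({"a", "b"}, "red"),
    ({"a", "b", "c"}, "red"),
    ({"c", "d"}, "blue"),
    ({"d", "e"}, "blue"),
    ({"b", "e"}, "green"),
]
coloring = majority_vote_coloring(hyperedges)
mistakes = edge_mistakes(hyperedges, coloring)
print(coloring, mistakes)
```

The budgeted variants in the abstract would extend this by letting a limited number of nodes hold extra colors or be deleted before mistakes are counted.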
Wasserstein Soft Label Propagation on Hypergraphs: Algorithm and Generalization Error Bounds
Inspired by recent interest in developing machine learning and data mining
algorithms on hypergraphs, we investigate in this paper the semi-supervised
learning algorithm of propagating "soft labels" (e.g. probability
distributions, class membership scores) over hypergraphs, by means of optimal
transportation. Borrowing insights from Wasserstein propagation on graphs
[Solomon et al. 2014], we re-formulate the label propagation procedure as a
message-passing algorithm, which lends itself naturally to a generalization
applicable to hypergraphs through Wasserstein barycenters. Furthermore, in a
PAC learning framework, we provide generalization error bounds for propagating
one-dimensional distributions on graphs and hypergraphs using 2-Wasserstein
distance, by establishing the algorithmic stability of the proposed
semi-supervised learning algorithm. These theoretical results also shed new
light on Wasserstein propagation on graphs.
Comment: To appear in Proc. AAAI'1
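For one-dimensional distributions, the 2-Wasserstein barycenter has a closed form: its quantile function is the weighted average of the input quantile functions. The sketch below assumes each soft label is given as an equal-size list of samples, so that sorted samples act as quantiles. The function names are hypothetical, and the single averaging step shown is only an illustration of a barycenter update, not the paper's full propagation algorithm.

```python
import math


def w2_distance(xs, ys):
    """2-Wasserstein distance between two equal-size 1-D empirical samples."""
    if len(xs) != len(ys):
        raise ValueError("samples must have the same size")
    xs, ys = sorted(xs), sorted(ys)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


def w2_barycenter(samples, weights):
    """Weighted 2-Wasserstein barycenter of equal-size 1-D empirical samples.

    In one dimension this is the weighted average of the sorted samples.
    """
    sorted_samples = [sorted(s) for s in samples]
    n = len(sorted_samples[0])
    if any(len(s) != n for s in sorted_samples):
        raise ValueError("samples must have the same size")
    total = sum(weights)
    return [
        sum(w * s[i] for w, s in zip(weights, sorted_samples)) / total
        for i in range(n)
    ]


# Soft labels held by two nodes of one hyperedge (toy data).
labels = [[0.0, 1.0, 2.0], [4.0, 5.0, 6.0]]
bary = w2_barycenter(labels, weights=[1.0, 1.0])
dist = w2_distance(labels[0], bary)
print(bary, dist)
```

With equal weights the barycenter lies halfway between the two inputs in the Wasserstein sense, which is why the distance from either input to `bary` is the same.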