324 research outputs found
Robust Densest Subgraph Discovery
Dense subgraph discovery is an important primitive in graph mining, which has
a wide variety of applications in diverse domains. In the densest subgraph
problem, given an undirected graph $G=(V,E)$ with an edge-weight vector
$w=(w_e)_{e\in E}$, we aim to find $S\subseteq V$ that maximizes the density,
i.e., $w(S)/|S|$, where $w(S)$ is the sum of the weights of the edges in the
subgraph induced by $S$. Although the densest subgraph problem is one of the
most well-studied optimization problems for dense subgraph discovery, there is
an implicit strong assumption; it is assumed that the weights of all the edges
are known exactly as input. In real-world applications, there are often cases
where we have only uncertain information of the edge weights. In this study, we
provide a framework for dense subgraph discovery under the uncertainty of edge
weights. Specifically, we address such an uncertainty issue using the theory of
robust optimization. First, we formulate our fundamental problem, the robust
densest subgraph problem, and present a simple algorithm. We then formulate the
robust densest subgraph problem with sampling oracle that models dense subgraph
discovery using an edge-weight sampling oracle, and present an algorithm with a
strong theoretical performance guarantee. Computational experiments using both
synthetic graphs and popular real-world graphs demonstrate the effectiveness of
our proposed algorithms. Comment: 10 pages; Accepted to ICDM 201
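As a point of reference for the deterministic problem this abstract starts from, the following Python sketch computes the density of a vertex set and runs the classical greedy peeling heuristic, which is known to give a 1/2-approximation when edge weights are nonnegative. It is not the robust algorithm proposed in the paper. The density is assumed here to be the total induced edge weight divided by the number of vertices, and the names `density`, `greedy_peeling`, and the toy graph are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def density(weights, subset):
    """Total weight of edges inside `subset`, divided by |subset|.

    `weights` maps undirected edges (u, v) to nonnegative weights.
    """
    s = set(subset)
    if not s:
        return 0.0
    total = sum(w for (u, v), w in weights.items() if u in s and v in s)
    return total / len(s)


def greedy_peeling(weights):
    """Repeatedly delete a vertex of minimum weighted degree.

    Returns the densest vertex set seen along the way and its density.
    Assumption: exact, nonnegative weights (the non-robust setting).
    """
    adj = defaultdict(dict)
    for (u, v), w in weights.items():
        adj[u][v] = w
        adj[v][u] = w
    remaining = set(adj)
    if not remaining:
        return set(), 0.0
    # degree[v] always holds the weight from v to vertices still remaining
    degree = {v: sum(adj[v].values()) for v in adj}
    total = sum(weights.values())
    best_set, best_density = set(remaining), total / len(remaining)
    while len(remaining) > 1:
        v = min(remaining, key=lambda x: degree[x])
        remaining.remove(v)
        total -= degree[v]
        for u, w in adj[v].items():
            if u in remaining:
                degree[u] -= w
        current = total / len(remaining)
        if current > best_density:
            best_set, best_density = set(remaining), current
    return best_set, best_density


# Toy graph (hypothetical): a unit-weight triangle plus a weak pendant edge.
toy_weights = {("a", "b"): 1.0, ("b", "c"): 1.0, ("a", "c"): 1.0, ("c", "d"): 0.5}
toy_set, toy_density = greedy_peeling(toy_weights)
```

On the toy graph, peeling first removes the weakly attached vertex `d`, after which the triangle is the densest set encountered. A robust variant would have to reason about a set of possible weight vectors rather than the single `weights` dictionary assumed here.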
Linear optimization over homogeneous matrix cones
A convex cone is homogeneous if its automorphism group acts transitively on
the interior of the cone, i.e., for every pair of points in the interior of the
cone, there exists a cone automorphism that maps one point to the other. Cones
that are homogeneous and self-dual are called symmetric. The symmetric cones
include the positive semidefinite matrix cone and the second order cone as
important practical examples. In this paper, we consider the less well-studied
conic optimization problems over cones that are homogeneous but not necessarily
self-dual. We start with cones of positive semidefinite symmetric matrices with
a given sparsity pattern. Homogeneous cones in this class are characterized by
nested block-arrow sparsity patterns, a subset of the chordal sparsity
patterns. We describe transitive subsets of the automorphism groups of the
cones and their duals, and important properties of the composition of log-det
barrier functions with the automorphisms in this set. Next, we consider
extensions to linear slices of the positive semidefinite cone, i.e.,
intersection of the positive semidefinite cone with a linear subspace, and
review conditions that make the cone homogeneous. In the third part of the
paper we give a high-level overview of the classical algebraic theory of
homogeneous cones due to Vinberg and Rothaus. A fundamental consequence of this
theory is that every homogeneous cone admits a spectrahedral (linear matrix
inequality) representation. We conclude by discussing the role of homogeneous
cone structure in primal-dual symmetric interior-point methods. Comment: 59 pages, 10 figures, to appear in Acta Numeric
Correlation Clustering with Low-Rank Matrices
Correlation clustering is a technique for aggregating data based on
qualitative information about which pairs of objects are labeled 'similar' or
'dissimilar.' Because the optimization problem is NP-hard, much of the previous
literature focuses on finding approximation algorithms. In this paper we
explore how to solve the correlation clustering objective exactly when the data
to be clustered can be represented by a low-rank matrix. We prove in particular
that correlation clustering can be solved in polynomial time when the
underlying matrix is positive semidefinite with small constant rank, but that
the task remains NP-hard in the presence of even one negative eigenvalue. Based
on our theoretical results, we develop an algorithm for efficiently "solving"
low-rank positive semidefinite correlation clustering by employing a procedure
for zonotope vertex enumeration. We demonstrate the effectiveness and speed of
our algorithm by using it to solve several clustering problems on both
synthetic and real-world data
On prisms, Möbius ladders and the cycle space of dense graphs
For a graph X, let f_0(X) denote its number of vertices, d(X) its minimum
degree and Z_1(X;Z/2) its cycle space in the standard graph-theoretical sense
(i.e. 1-dimensional cycle group in the sense of simplicial homology theory with
Z/2-coefficients). Call a graph Hamilton-generated if and only if the set of
all Hamilton circuits is a Z/2-generating system for Z_1(X;Z/2). The main
purpose of this paper is to prove the following: for every s > 0 there exists
n_0 such that for every graph X with f_0(X) >= n_0 vertices, (1) if d(X) >=
(1/2 + s) f_0(X) and f_0(X) is odd, then X is Hamilton-generated, (2) if d(X)
>= (1/2 + s) f_0(X) and f_0(X) is even, then the set of all Hamilton circuits
of X generates a codimension-one subspace of Z_1(X;Z/2), and the set of all
circuits of X having length either f_0(X)-1 or f_0(X) generates all of
Z_1(X;Z/2), (3) if d(X) >= (1/4 + s) f_0(X) and X is square bipartite, then X
is Hamilton-generated. All these degree-conditions are essentially
best-possible. The implications in (1) and (2) give an asymptotic affirmative
answer to a special case of an open conjecture which according to [European J.
Combin. 4 (1983), no. 3, p. 246] originates with A. Bondy. Comment: 33 pages; 5 figure
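The notion of a Hamilton-generated graph can be checked by brute force on very small graphs. The Python sketch below enumerates all Hamilton circuits, encodes each as a 0/1 edge vector, and compares the rank of their span over Z/2 with the dimension of the cycle space, |E| - |V| + 1 for a connected graph. All function names are hypothetical, vertices are assumed to be labelled 0..n-1 with n >= 3, and the enumeration is exponential, so this only illustrates the definition and says nothing about the asymptotic regime of the theorem.

```python
from itertools import combinations, permutations


def hamilton_cycle_masks(n, edges):
    """Edge-set bitmasks of all Hamilton circuits of a graph on 0..n-1 (n >= 3)."""
    index = {frozenset(e): i for i, e in enumerate(edges)}
    masks = set()
    # Fix vertex 0 as the start; each circuit is found once per direction,
    # and the set removes the duplicate.
    for perm in permutations(range(1, n)):
        order = (0,) + perm
        mask = 0
        for i in range(n):
            e = frozenset((order[i], order[(i + 1) % n]))
            if e not in index:
                break
            mask |= 1 << index[e]
        else:
            masks.add(mask)
    return masks


def gf2_rank(masks):
    """Rank over Z/2 of vectors given as integer bitmasks (XOR elimination)."""
    basis = {}  # highest set bit -> basis vector with that leading bit
    for m in masks:
        while m:
            h = m.bit_length() - 1
            if h not in basis:
                basis[h] = m
                break
            m ^= basis[h]
    return len(basis)


def cycle_space_dimension(n, edges):
    """Dimension of the cycle space of a connected graph: |E| - |V| + 1."""
    return len(edges) - n + 1


def complete_graph(n):
    """Edge list of the complete graph on vertices 0..n-1."""
    return list(combinations(range(n), 2))


k4 = complete_graph(4)
k4_rank = gf2_rank(hamilton_cycle_masks(4, k4))
k4_dim = cycle_space_dimension(4, k4)
```

For the complete graph on four vertices, the three Hamilton circuits sum to zero over Z/2, so they span a subspace of dimension 2 inside a cycle space of dimension 3. This tiny even-order example shows a codimension-one gap of the same kind as in case (2), although it is far outside the range n >= n_0 that the theorem covers.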
Knowledge Graph Embedding: An Overview
Many mathematical models have been leveraged to design embeddings for
representing Knowledge Graph (KG) entities and relations for link prediction
and many downstream tasks. These mathematically-inspired models are not only
highly scalable for inference in large KGs, but also have many explainable
advantages in modeling different relation patterns that can be validated
through both formal proofs and empirical results. In this paper, we provide a
comprehensive overview of the current state of research in KG completion. In
particular, we focus on two main branches of KG embedding (KGE) design: 1)
distance-based methods and 2) semantic matching-based methods. We discover the
connections between recently proposed models and present an underlying trend
that might help researchers invent novel and more effective models. Next, we
delve into CompoundE and CompoundE3D, which draw inspiration from 2D and 3D
affine operations, respectively. They encompass a broad spectrum of techniques
including distance-based and semantic-based methods. We also discuss an
emerging approach for KG completion which leverages pre-trained language models
(PLMs) and textual descriptions of entities and relations, and offer insights
into the integration of KGE methods with PLMs for KG completion.
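To make the two branches concrete, the Python sketch below scores a (head, relation, tail) triple with one widely used representative of each family: a TransE-style translation distance for the distance-based branch and a DistMult-style trilinear product for the semantic matching branch. The abstract does not name these two models, so their choice here is an assumption made for illustration, and the toy vectors are hypothetical.

```python
import math


def transe_score(h, r, t):
    """Distance-based score: negative Euclidean distance between h + r and t.

    Higher is better; a triple that fits perfectly scores 0.
    """
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def distmult_score(h, r, t):
    """Semantic matching score: trilinear product sum_i h_i * r_i * t_i.

    Higher is better; the score is symmetric in h and t.
    """
    return sum(hi * ri * ti for hi, ri, ti in zip(h, r, t))


# Hypothetical 2-dimensional embeddings, for illustration only.
head, relation = [1.0, 0.0], [0.0, 1.0]
good_tail, bad_tail = [1.0, 1.0], [0.0, 0.0]

translation_prefers_good = (
    transe_score(head, relation, good_tail) > transe_score(head, relation, bad_tail)
)
```

The symmetry of the trilinear product means a DistMult-style score cannot distinguish (h, r, t) from (t, r, h), which is one example of the relation-pattern considerations that distinguish models within these families.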
Approximating Nash Equilibria and Dense Bipartite Subgraphs via an Approximate Version of Carathéodory's Theorem
We present algorithmic applications of an approximate version of Carathéodory's theorem. The theorem states that given a set of vectors X in R^d, for every vector in the convex hull of X there exists an ε-close (under the p-norm distance, for 2 ≤ p < ∞) vector that can be expressed as a convex combination of at most b vectors of X, where the bound b depends on ε and the norm p and is independent of the dimension d. This theorem can be derived by instantiating Maurey's lemma, early references to which can be found in the work of Pisier (1981) and Carl (1985). However, in this paper we present a self-contained proof of this result.
Using this theorem we establish that in a bimatrix game with n x n payoff matrices A, B, if the number of non-zero entries in any column of A+B is at most s then an ε-Nash equilibrium of the game can be computed in time n^O(log s/ε^2). This, in particular, gives us a polynomial-time approximation scheme for Nash equilibrium in games with fixed column sparsity s. Moreover, for arbitrary bimatrix games, since s can be at most n, the running time of our algorithm matches the best-known upper bound, which was obtained by Lipton, Markakis, and Mehta (2003).
The approximate Carathéodory's theorem also leads to an additive approximation algorithm for the densest k-bipartite subgraph problem. Given a graph with n vertices and maximum degree d, the developed algorithm determines a k x k bipartite subgraph with density within ε (in the additive sense) of the optimal density in time n^O(log d/ε^2)
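The sparsification idea behind the approximate Carathéodory theorem can be illustrated with the standard sampling argument associated with Maurey's lemma: draw b vectors independently according to the convex weights and take their uniform average. The Python sketch below does this for the 2-norm only. It is a minimal illustration under stated assumptions, not the self-contained proof or the algorithms of the paper, and the function names, the seed, and the choice of standard basis vectors as the set X are hypothetical.

```python
import math
import random


def sparse_convex_approx(vectors, weights, b, rng):
    """Average of b vectors sampled i.i.d. from `vectors` with probabilities `weights`.

    Returns the average (a convex combination of at most b of the vectors)
    and the list of sampled indices.
    """
    d = len(vectors[0])
    idx = rng.choices(range(len(vectors)), weights=weights, k=b)
    avg = [0.0] * d
    for i in idx:
        for j in range(d):
            avg[j] += vectors[i][j] / b
    return avg, idx


def l2_distance(x, y):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - c) ** 2 for a, c in zip(x, y)))


# Hypothetical instance: X is the standard basis of R^200 and the target is
# the uniform convex combination, which needs all 200 vectors exactly.
d, b = 200, 50
basis = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
uniform = [1.0 / d] * d
target = [1.0 / d] * d

approx, sampled = sparse_convex_approx(basis, uniform, b, random.Random(0))
error = l2_distance(approx, target)
```

In this instance the expected squared 2-norm error of the sampled average is (1/b)(1 - 1/d), which depends on the dimension only through the vanishing 1/d term, so the number of vectors needed for a given accuracy is governed by b rather than by d.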