Phase Transitions in Semidefinite Relaxations
Statistical inference problems arising within signal processing, data mining,
and machine learning naturally give rise to hard combinatorial optimization
problems. These problems become intractable when the dimensionality of the data
is large, as is often the case for modern datasets. A popular idea is to
construct convex relaxations of these combinatorial problems, which can be
solved efficiently for large scale datasets.
Semidefinite programming (SDP) relaxations are among the most powerful
methods in this family, and are surprisingly well-suited for a broad range of
problems where data take the form of matrices or graphs. It has been observed
several times that, when the `statistical noise' is small enough, SDP
relaxations correctly detect the underlying combinatorial structures.
In this paper we develop asymptotic predictions for several `detection
thresholds,' as well as for the estimation error above these thresholds. We
study some classical SDP relaxations for statistical problems motivated by
graph synchronization and community detection in networks. We map these
optimization problems to statistical mechanics models with vector spins, and
use non-rigorous techniques from statistical mechanics to characterize the
corresponding phase transitions. Our results clarify the effectiveness of SDP
relaxations in solving high-dimensional statistical problems.
Comment: 71 pages, 24 pdf figures
Tightness of the maximum likelihood semidefinite relaxation for angular synchronization
Maximum likelihood estimation problems are, in general, intractable
optimization problems. As a result, it is common to approximate the maximum
likelihood estimator (MLE) using convex relaxations. In some cases, the
relaxation is tight: it recovers the true MLE. Most tightness proofs only apply
to situations where the MLE exactly recovers a planted solution (known to the
analyst). It is then sufficient to establish that the optimality conditions
hold at the planted signal. In this paper, we study an estimation problem
(angular synchronization) for which the MLE is not a simple function of the
planted solution, yet for which the convex relaxation is tight. To establish
tightness in this context, the proof is less direct because the point at which
to verify optimality conditions is not known explicitly.
Angular synchronization consists in estimating a collection of phases,
given noisy measurements of the pairwise relative phases. The MLE for angular
synchronization is the solution of a (hard) non-bipartite Grothendieck problem
over the complex numbers. We consider a stochastic model for the data: a
planted signal (that is, a ground truth set of phases) is corrupted with
non-adversarial random noise. Even though the MLE does not coincide with the
planted signal, we show that the classical semidefinite relaxation for it is
tight, with high probability. This holds even for high levels of noise.
Comment: 2 figures
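The angular synchronization setup described in this abstract can be made concrete with a small simulation. The sketch below is not the semidefinite relaxation studied in the paper: it swaps in a simple projected power iteration as a stand-in estimator, and the Gaussian noise model, the function names and the parameter `sigma` are all assumptions made for illustration.

```python
import cmath
import math
import random


def make_measurements(phases, sigma, rng):
    """Hermitian matrix C with C[i][j] = z_i * conj(z_j) + noise (assumed Gaussian)."""
    n = len(phases)
    z = [cmath.exp(1j * t) for t in phases]
    C = [[0j] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            noise = complex(rng.gauss(0, sigma), rng.gauss(0, sigma))
            C[i][j] = z[i] * z[j].conjugate() + noise
            C[j][i] = C[i][j].conjugate()
    return C


def unit(v):
    """Project a complex number onto the unit circle."""
    return v / abs(v) if abs(v) > 0 else 1 + 0j


def projected_power_method(C, iters=50):
    """Stand-in estimator (not the SDP): iterate x <- phase(C x)."""
    n = len(C)
    # Initialise from the first column: phases relative to node 0.
    x = [1 + 0j] + [unit(C[i][0]) for i in range(1, n)]
    for _ in range(iters):
        y = [sum(C[i][j] * x[j] for j in range(n)) for i in range(n)]
        x = [unit(v) for v in y]
    return x


def alignment(estimate, phases):
    """|<z, x>| / n: equals 1 when x matches z up to a global phase."""
    z = [cmath.exp(1j * t) for t in phases]
    return abs(sum(a.conjugate() * b for a, b in zip(z, estimate))) / len(z)
```

Because the phases are only identifiable up to a global rotation, `alignment` compares the estimate with the planted signal modulo that rotation. At low noise the estimate should be close to, but in general not equal to, the planted phases.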
Exact Clustering of Weighted Graphs via Semidefinite Programming
As a model problem for clustering, we consider the densest k-disjoint-clique
problem of partitioning a weighted complete graph into k disjoint subgraphs
such that the sum of the densities of these subgraphs is maximized. We
establish that such subgraphs can be recovered from the solution of a
particular semidefinite relaxation with high probability if the input graph is
sampled from a distribution of clusterable graphs. Specifically, the
semidefinite relaxation is exact if the graph consists of k large disjoint
subgraphs, corresponding to clusters, with weight concentrated within these
subgraphs, plus a moderate number of outliers. Further, we establish that if
noise is weakly obscuring these clusters, i.e., the between-cluster edges are
assigned very small weights, then we can recover significantly smaller
clusters. For example, we show that in approximately sparse graphs, where the
between-cluster weights tend to zero as the size n of the graph tends to
infinity, we can recover clusters of size polylogarithmic in n. Empirical
evidence from numerical simulations is also provided to support these
theoretical phase transitions to perfect recovery of the cluster structure.
Multireference Alignment using Semidefinite Programming
The multireference alignment problem consists of estimating a signal from
multiple noisy shifted observations. Inspired by existing Unique-Games
approximation algorithms, we provide a semidefinite program (SDP) based
relaxation which approximates the maximum likelihood estimator (MLE) for the
multireference alignment problem. Although we show that the MLE problem is
Unique-Games hard to approximate within any constant, we observe that our
poly-time approximation algorithm for the MLE appears to perform quite well in
typical instances, outperforming existing methods. In an attempt to explain
this behavior we provide stability guarantees for our SDP under a random noise
model on the observations. This case is more challenging to analyze than
traditional semi-random instances of Unique-Games: the noise model is on
vertices of a graph and translates into dependent noise on the edges.
Interestingly, we show that if certain positivity constraints in the SDP are
dropped, its solution becomes equivalent to performing phase correlation, a
popular method used for pairwise alignment in imaging applications. Finally, we
show how symmetry reduction techniques from matrix representation theory can
simplify the analysis and computation of the SDP, greatly decreasing its
computational cost.
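Phase correlation, the pairwise alignment method mentioned in this abstract, can be sketched in a few lines. The example below only illustrates phase correlation for a single pair of cyclically shifted signals; it does not implement the SDP or demonstrate the equivalence claimed in the paper. The naive O(n^2) DFT and the function names are assumptions chosen to keep the sketch dependency-free.

```python
import cmath


def dft(x, inverse=False):
    """Naive discrete Fourier transform (O(n^2)), adequate for short signals."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                for t in range(n))
        out.append(s / n if inverse else s)
    return out


def phase_correlation_shift(a, b):
    """Estimate the cyclic shift s such that b[t] is close to a[(t - s) % n]."""
    A, B = dft(a), dft(b)
    cross = []
    for Ak, Bk in zip(A, B):
        p = Bk * Ak.conjugate()
        # Keep only the phase of the cross-power spectrum.
        cross.append(p / abs(p) if abs(p) > 1e-12 else 0j)
    r = dft(cross, inverse=True)
    # The inverse transform peaks at the relative shift.
    return max(range(len(r)), key=lambda t: r[t].real)


signal = [1.0, 2.0, 0.5, -1.0, 0.0, 3.0, 0.25, -0.5]
shifted = [signal[(t - 3) % len(signal)] for t in range(len(signal))]
estimated_shift = phase_correlation_shift(signal, shifted)
```

In the noiseless case the normalised cross-power spectrum is a pure complex exponential, so its inverse transform is a single spike at the shift. With noisy observations the peak is blurred, which is the regime where jointly aligning all observations becomes relevant.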
A Riemannian low-rank method for optimization over semidefinite matrices with block-diagonal constraints
We propose a new algorithm to solve optimization problems of the form $\min f(X)$
for a smooth function $f$ under the constraints that $X$ is positive
semidefinite and the diagonal blocks of $X$ are small identity matrices. Such
problems often arise as the result of relaxing a rank constraint (lifting). In
particular, many estimation tasks involving phases, rotations, orthonormal
bases or permutations fit in this framework, and so do certain relaxations of
combinatorial problems such as Max-Cut. The proposed algorithm exploits the
facts that (1) such formulations admit low-rank solutions, and (2) their
rank-restricted versions are smooth optimization problems on a Riemannian
manifold. Combining insights from both the Riemannian and the convex geometries
of the problem, we characterize when second-order critical points of the smooth
problem reveal KKT points of the semidefinite problem. We compare against state
of the art, mature software and find that, on certain interesting problem
instances, what we call the staircase method is orders of magnitude faster, is
more accurate and scales better. Code is available.
Comment: 37 pages, 3 figures
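The rank-restricted formulation described in this abstract can be illustrated on the Max-Cut relaxation, where the diagonal blocks are 1x1. The sketch below is not the staircase method: it runs plain Riemannian gradient descent at one fixed rank `p`, writing $X = YY^T$ with unit-norm rows of $Y$. The step size, iteration count, function names and the 4-cycle test graph are all assumptions made for illustration.

```python
import math
import random


def normalize(v):
    """Retraction onto the unit sphere: rescale v to unit Euclidean norm."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def cost(C, Y):
    """<C, Y Y^T> = sum over i, j of C[i][j] * <y_i, y_j>."""
    n = len(Y)
    return sum(C[i][j] * sum(a * b for a, b in zip(Y[i], Y[j]))
               for i in range(n) for j in range(n))


def riemannian_descent(C, p, steps=2000, step_size=0.1, seed=0):
    """Minimise <C, Y Y^T> over n-by-p matrices Y with unit-norm rows."""
    rng = random.Random(seed)
    n = len(C)
    Y = [normalize([rng.gauss(0, 1) for _ in range(p)]) for _ in range(n)]
    for _ in range(steps):
        new_Y = []
        for i in range(n):
            # Euclidean gradient with respect to row y_i (C is symmetric).
            g = [2 * sum(C[i][j] * Y[j][k] for j in range(n)) for k in range(p)]
            # Project onto the tangent space of the sphere at y_i.
            radial = sum(a * b for a, b in zip(g, Y[i]))
            rg = [a - radial * b for a, b in zip(g, Y[i])]
            # Gradient step followed by retraction.
            new_Y.append(normalize([y - step_size * r
                                    for y, r in zip(Y[i], rg)]))
        Y = new_Y
    return Y


# Adjacency matrix of a 4-cycle; minimising <A, X> is its Max-Cut relaxation.
A = [[0, 1, 0, 1],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [1, 0, 1, 0]]
Y = riemannian_descent(A, p=3)
value = cost(A, Y)
```

For this bipartite graph the semidefinite optimum is -8, attained when the rows of `Y` alternate between a unit vector and its negative, so `value` should approach -8. A practical solver would use a line search or a second-order method and would adapt the rank rather than fix it.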