17 research outputs found

Projected Power Iteration for Network Alignment

Author: Abbe
Aflalo
Babai
Berthet
Bruna
Chandrasekaran
Chen
Cullina
Deshpande
Deshpande
Donoho
Donoho
Feizi
Garfinkel
Journee
Kelley
Makarychev
Massoulie
Moitra
Mossel
Nowak
Onaran
Perry
Villar
Publication date: 16/07/2017

The network alignment problem asks for the best correspondence between two given graphs, so that the largest possible number of edges are matched. This problem arises in many scientific settings (such as the study of protein-protein interactions) and is closely related to the quadratic assignment problem, which has the graph isomorphism, traveling salesman, and minimum bisection problems as particular cases. The graph matching problem is NP-hard in general. However, under some restrictive models for the graphs, algorithms can approximate the alignment efficiently. In that spirit, the recent work by Feizi and collaborators introduces EigenAlign, a fast spectral method with convergence guarantees for Erdős-Rényi graphs. In this work we propose the algorithm Projected Power Alignment, which is a projected power iteration version of EigenAlign. We numerically show that it improves the recovery rates of EigenAlign, and we describe the theory that may be used to provide performance guarantees for Projected Power Alignment.

Comment: 8 pages

arXiv.org e-Print Archive

Crossref
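
The projected power iteration idea named in the abstract above can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: the objective is simplified to the plain matched-edge count trace(A P B Pᵀ) rather than EigenAlign's scored alignment matrix, the projection onto permutation matrices is done by brute force (so it only works for very small graphs), and all function names are hypothetical.

```python
from itertools import permutations

def mat_mul(X, Y):
    """Plain-Python matrix product of two square lists of lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def project_to_permutation(M):
    """Nearest permutation matrix: argmax over p of sum_i M[i][p[i]].
    Brute force over all n! permutations, an assumption made for brevity."""
    n = len(M)
    best = max(permutations(range(n)),
               key=lambda p: sum(M[i][p[i]] for i in range(n)))
    return [[1.0 if best[i] == j else 0.0 for j in range(n)] for i in range(n)]

def matched_edges(A, B, P):
    """Count edges (i, j) of A whose images (p(i), p(j)) are edges of B."""
    n = len(A)
    p = [row.index(1.0) for row in P]
    return sum(1 for i in range(n) for j in range(i + 1, n)
               if A[i][j] and B[p[i]][p[j]])

def projected_power_alignment(A, B, iters=10):
    """Repeat X <- Proj(A X B), starting from the uniform matrix.
    A X B is the power step for the simplified objective trace(A X B X^T)."""
    n = len(A)
    X = [[1.0 / n] * n for _ in range(n)]
    for _ in range(iters):
        X = project_to_permutation(mat_mul(mat_mul(A, X), B))
    return X

# Toy input: a path 0-1-2-3 and a relabelled copy with edges 0-2, 0-3, 1-3.
A = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
B = [[0, 0, 1, 1], [0, 0, 0, 1], [1, 0, 0, 0], [1, 1, 0, 0]]
P = projected_power_alignment(A, B)
print(matched_edges(A, B, P), "of 3 edges matched")
```

On this toy pair the iteration settles on an alignment that matches all three edges. For graphs of realistic size the brute-force projection would have to be replaced by a linear assignment solver, and nothing here says how the simplified iteration behaves on the random graph models the paper studies.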

Exact Clustering of Weighted Graphs via Semidefinite Programming

Author: Ames Brendan
Pirinen Aleksis
Publication date: 01/01/2019

As a model problem for clustering, we consider the densest k-disjoint-clique problem of partitioning a weighted complete graph into k disjoint subgraphs such that the sum of the densities of these subgraphs is maximized. We establish that such subgraphs can be recovered from the solution of a particular semidefinite relaxation with high probability if the input graph is sampled from a distribution of clusterable graphs. Specifically, the semidefinite relaxation is exact if the graph consists of k large disjoint subgraphs, corresponding to clusters, with weight concentrated within these subgraphs, plus a moderate number of outliers. Further, we establish that if noise is weakly obscuring these clusters, i.e., the between-cluster edges are assigned very small weights, then we can recover significantly smaller clusters. For example, we show that in approximately sparse graphs, where the between-cluster weights tend to zero as the size n of the graph tends to infinity, we can recover clusters of size polylogarithmic in n. Empirical evidence from numerical simulations is also provided to support these theoretical phase transitions to perfect recovery of the cluster structure.

arXiv.org e-Print Archive

Lund University Publications

Relax, no need to round: integrality of clustering formulations

Author: Awasthi Pranjal
Bandeira Afonso S.
Charikar Moses
Krishnaswamy Ravishankar
Villar Soledad
Ward Rachel
Publication date: 01/01/2015

We study exact recovery conditions for convex relaxations of point cloud clustering problems, focusing on two of the most common optimization problems for unsupervised clustering: $k$-means and $k$-median clustering. Motivations for focusing on convex relaxations are: (a) they come with a certificate of optimality, and (b) they are generic tools which are relatively parameter-free, not tailored to specific assumptions over the input. More precisely, we consider the distributional setting where there are $k$ clusters in $\mathbb{R}^m$ and data from each cluster consists of $n$ points sampled from a symmetric distribution within a ball of unit radius. We ask: what is the minimal separation distance between cluster centers needed for convex relaxations to exactly recover these $k$ clusters as the optimal integral solution? For the $k$-median linear programming relaxation we show a tight bound: exact recovery is obtained given arbitrarily small pairwise separation $\epsilon > 0$ between the balls. In other words, the pairwise center separation is $\Delta > 2+\epsilon$. Under the same distributional model, the $k$-means LP relaxation fails to recover such clusters at separation as large as $\Delta = 4$. Yet, if we enforce PSD constraints on the $k$-means LP, we get exact cluster recovery at center separation $\Delta > 2\sqrt{2}(1+\sqrt{1/m})$. In contrast, common heuristics such as Lloyd's algorithm (a.k.a. the $k$-means algorithm) can fail to recover clusters in this setting; even with arbitrarily large cluster separation, $k$-means++ with overseeding by any constant factor fails with high probability at exact cluster recovery. To complement the theoretical analysis, we provide an experimental study of the recovery guarantees for these various methods, and discuss several open problems which these experiments suggest.

Comment: 30 pages, ITCS 201

arXiv.org e-Print Archive

CiteSeerX

Princeton University Open Access Repository

Crossref
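
To make the separation thresholds quoted in the abstract above concrete, the following short sketch evaluates the stated k-means SDP bound 2√2(1+√(1/m)) for a few dimensions m and sets it beside the two LP figures from the same abstract. The function and constant names are hypothetical, and the chosen values of m are arbitrary illustrations.

```python
import math

def kmeans_sdp_separation(m):
    """Center separation 2*sqrt(2)*(1 + sqrt(1/m)) quoted for the k-means SDP,
    where m is the ambient dimension and balls have unit radius."""
    return 2 * math.sqrt(2) * (1 + math.sqrt(1 / m))

# Figures quoted in the abstract for the LP relaxations.
K_MEDIAN_LP_SEPARATION = 2.0   # recovery for any separation above 2 + eps
K_MEANS_LP_FAILURE = 4.0       # k-means LP can still fail at this separation

for m in (1, 2, 10, 100, 10_000):
    print(f"m = {m:>6}: SDP bound = {kmeans_sdp_separation(m):.3f}")
```

The bound decreases with m and approaches 2√2 ≈ 2.83 as m grows, so in high dimension the quoted SDP guarantee applies at separations below the value 4 at which the k-means LP is reported to fail, while still sitting above the k-median LP threshold of 2.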

Applied Harmonic Analysis and Sparse Approximation

Publication venue: Zürich : EMS Publ. House
Publication date: 01/01/2015

Efficiently analyzing functions, in particular multivariate functions, is a key problem in applied mathematics. The area of applied harmonic analysis has a significant impact on this problem by providing methodologies both for theoretical questions and for a wide range of applications in technology and science, such as image processing. Approximation theory, in particular the theory of sparse approximations, is closely intertwined with this area, with many exciting recent developments at the intersection of the two. Research topics typically also involve related areas such as convex optimization, probability theory, and Banach space geometry. The workshop was the continuation of a first event in 2012 and was intended to bring together world-leading experts in these areas, to report on recent developments, and to foster new developments and collaborations.

Repositorium für Naturwissenschaften und Technik

Size matters: cardinality-constrained clustering and outlier detection via conic optimization

Author: Kuhn D
Rujeerapaiboon N
Schindler K
Wiesemann W
Publication venue: Society for Industrial & Applied Mathematics (SIAM)
Publication date: 10/01/2019

Plain vanilla K-means clustering has proven to be successful in practice, yet it suffers from outlier sensitivity and may produce highly unbalanced clusters. To mitigate both shortcomings, we formulate a joint outlier detection and clustering problem, which assigns a prescribed number of datapoints to an auxiliary outlier cluster and performs cardinality-constrained K-means clustering on the residual dataset, treating the cluster cardinalities as a given input. We cast this problem as a mixed-integer linear program (MILP) that admits tractable semidefinite and linear programming relaxations. We propose deterministic rounding schemes that transform the relaxed solutions to feasible solutions for the MILP. We also prove that these solutions are optimal in the MILP if a cluster separation condition holds.

arXiv.org e-Print Archive

Spiral - Imperial College Digital Repository

ScholarBank@NUS