
    Correlation Clustering with Low-Rank Matrices

    Correlation clustering is a technique for aggregating data based on qualitative information about which pairs of objects are labeled 'similar' or 'dissimilar.' Because the optimization problem is NP-hard, much of the previous literature focuses on finding approximation algorithms. In this paper we explore how to solve the correlation clustering objective exactly when the data to be clustered can be represented by a low-rank matrix. We prove in particular that correlation clustering can be solved in polynomial time when the underlying matrix is positive semidefinite with small constant rank, but that the task remains NP-hard in the presence of even one negative eigenvalue. Based on our theoretical results, we develop an algorithm for efficiently "solving" low-rank positive semidefinite correlation clustering by employing a procedure for zonotope vertex enumeration. We demonstrate the effectiveness and speed of our algorithm by using it to solve several clustering problems on both synthetic and real-world data.
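
    The objective being optimized here is easy to state concretely. Below is a minimal illustrative sketch, not the paper's zonotope-based algorithm: it counts the disagreements a candidate clustering incurs against a ±1 similarity matrix. The function name and toy matrix are made up for the example.

```python
# Illustrative sketch only (not the paper's zonotope-based method):
# count the disagreements a candidate clustering incurs against a
# +/-1 similarity matrix, i.e. the correlation clustering objective.
import numpy as np

def disagreements(A, labels):
    """Count 'similar' pairs (A[i, j] = +1) split across clusters plus
    'dissimilar' pairs (A[i, j] = -1) placed in the same cluster."""
    n = len(labels)
    cost = 0
    for i in range(n):
        for j in range(i + 1, n):
            same = labels[i] == labels[j]
            if A[i, j] > 0 and not same:
                cost += 1
            elif A[i, j] < 0 and same:
                cost += 1
    return cost

# Toy instance with two clear clusters, {0, 1} and {2, 3}.
A = np.array([[ 0,  1, -1, -1],
              [ 1,  0, -1, -1],
              [-1, -1,  0,  1],
              [-1, -1,  1,  0]])
print(disagreements(A, [0, 0, 1, 1]))  # 0: perfect clustering
print(disagreements(A, [0, 1, 0, 1]))  # 4: every cluster mixes the groups
```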

    Chordal Decomposition in Rank Minimized Semidefinite Programs with Applications to Subspace Clustering

    Semidefinite programs (SDPs) often arise in relaxations of NP-hard problems, and if the solution of the SDP obeys certain rank constraints, the relaxation will be tight. Decomposition methods based on chordal sparsity have already been applied to speed up the solution of sparse SDPs, but methods for dealing with rank constraints are underdeveloped. This paper leverages a minimum-rank completion result to decompose the rank constraint on a single large matrix into multiple rank constraints on a set of smaller matrices. A re-weighted heuristic is used as a proxy for rank, and the specific form of the heuristic preserves the sparsity pattern between iterations. Implementations of rank-minimized SDPs through interior-point and first-order algorithms are discussed. The problem of subspace clustering is used to demonstrate the computational improvement of the proposed method. Comment: 6 pages, 6 figures.
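
    To illustrate the kind of re-weighted heuristic the abstract refers to, the following sketch applies a generic iteratively re-weighted trace heuristic to a small PSD problem, assuming CVXPY is available. This is a toy, not the paper's chordally decomposed implementation; the constraint set, function names, and parameters are invented for the example.

```python
# Generic iteratively re-weighted trace heuristic for rank minimization
# over a PSD matrix (illustrative; the paper applies a re-weighted
# heuristic inside a chordally decomposed SDP).
import numpy as np
import cvxpy as cp

def reweighted_rank_min(constraints_fn, n, iters=5, delta=1e-3):
    W = np.eye(n)  # first pass is the plain trace heuristic
    for _ in range(iters):
        X = cp.Variable((n, n), PSD=True)
        prob = cp.Problem(cp.Minimize(cp.trace(W @ X)), constraints_fn(X))
        prob.solve()
        # Re-weight: W = (X + delta*I)^{-1} pushes small eigenvalues to 0.
        W = np.linalg.inv(X.value + delta * np.eye(n))
    return X.value

# Example: low-rank PSD matrix with unit diagonal and one fixed entry.
n = 4
cons = lambda X: [cp.diag(X) == 1, X[0, 1] == 0.9]
X = reweighted_rank_min(cons, n)
print(np.round(np.linalg.eigvalsh(X), 3))  # near-zero eigenvalues => low rank
```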

    Fast Graph Laplacian regularized kernel learning via semidefinite-quadratic-linear programming

    Wu, Xiaoming. Thesis (M.Phil.), Chinese University of Hong Kong, 2011. Includes bibliographical references (p. 30-34). Abstracts in English and Chinese. Contents:
    Abstract
    Acknowledgement
    1. Introduction
    2. Preliminaries
        2.1 Kernel Learning Theory (2.1.1 Positive Semidefinite Kernel; 2.1.2 The Reproducing Kernel Map; 2.1.3 Kernel Tricks)
        2.2 Spectral Graph Theory (2.2.1 Graph Laplacian; 2.2.2 Eigenvectors of Graph Laplacian)
        2.3 Convex Optimization (2.3.1 From Linear to Conic Programming; 2.3.2 Second-Order Cone Programming; 2.3.3 Semidefinite Programming)
    3. Fast Graph Laplacian Regularized Kernel Learning
        3.1 The Problems (3.1.1 MVU; 3.1.2 PCP; 3.1.3 Low-Rank Approximation: from SDP to QSDP)
        3.2 Previous Approach: from QSDP to SDP
        3.3 Our Formulation: from QSDP to SQLP
        3.4 Experimental Results (3.4.1 The Results)
    4. Conclusion
    Bibliography
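
    For readers unfamiliar with the objects in Chapter 2.2, here is a minimal sketch of the graph Laplacian and its eigenvectors, the building blocks of Laplacian-regularized kernel learning. The toy graph is arbitrary and not taken from the thesis.

```python
# Minimal sketch: the unnormalized graph Laplacian L = D - W of a
# similarity graph, and its smallest eigenvectors (the "smooth" basis
# that Laplacian regularization favors).
import numpy as np

W = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)  # symmetric weight matrix
D = np.diag(W.sum(axis=1))                 # degree matrix
L = D - W                                  # graph Laplacian (PSD)

evals, evecs = np.linalg.eigh(L)
print(np.round(evals, 3))        # smallest eigenvalue is 0 (constant vector)
print(np.round(evecs[:, :2], 3)) # smoothest eigenvectors on the graph
```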

    Eigenvector Synchronization, Graph Rigidity and the Molecule Problem

    The graph realization problem has received a great deal of attention in recent years, due to its importance in applications such as wireless sensor networks and structural biology. In this paper, we extend previous work and propose the 3D-ASAP algorithm for the graph realization problem in $\mathbb{R}^3$, given a sparse and noisy set of distance measurements. 3D-ASAP is a divide-and-conquer, non-incremental and non-iterative algorithm, which integrates local distance information into a global structure determination. Our approach starts with identifying, for every node, a subgraph of its 1-hop neighborhood graph which can be accurately embedded in its own coordinate system. In the noise-free case, the computed coordinates of the sensors in each patch must agree with their global positioning up to some unknown rigid motion, that is, up to translation, rotation and possibly reflection. In other words, to every patch there corresponds an element of the Euclidean group Euc(3) of rigid transformations in $\mathbb{R}^3$, and the goal is to estimate the group elements that will properly align all the patches in a globally consistent way. Furthermore, 3D-ASAP successfully incorporates information specific to the molecule problem in structural biology, in particular information on known substructures and their orientation. In addition, we also propose 3D-SP-ASAP, a faster version of 3D-ASAP, which uses a spectral partitioning algorithm as a preprocessing step for dividing the initial graph into smaller subgraphs. Our extensive numerical simulations show that 3D-ASAP and 3D-SP-ASAP are very robust to high levels of noise in the measured distances and to sparse connectivity in the measurement graph, and compare favorably to similar state-of-the-art localization algorithms. Comment: 49 pages, 8 figures.
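
    One local ingredient of such patch-based localization is estimating the rigid motion that aligns two overlapping patches. The sketch below uses the standard Kabsch/orthogonal-Procrustes method for that single step; it is illustrative only and is not the paper's global eigenvector synchronization, and all names and data are invented.

```python
# Estimate the rigid motion (rotation + translation) aligning a patch
# P, given in its own coordinates, to the same points Q in the global
# frame, via the Kabsch / orthogonal Procrustes method.
import numpy as np

def rigid_align(P, Q):
    """Find R in SO(3) and t minimizing sum ||R @ P[i] + t - Q[i]||^2."""
    cP, cQ = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cP).T @ (Q - cQ)                 # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))    # keep a proper rotation
    R = Vt.T @ np.diag([1, 1, d]) @ U.T
    t = cQ - R @ cP
    return R, t

rng = np.random.default_rng(0)
P = rng.standard_normal((5, 3))               # patch in its own frame
R_true, _ = np.linalg.qr(rng.standard_normal((3, 3)))
if np.linalg.det(R_true) < 0:
    R_true[:, 0] *= -1                        # force det = +1
Q = P @ R_true.T + np.array([1.0, 2.0, 3.0])  # same points, global frame
R, t = rigid_align(P, Q)
print(np.allclose(P @ R.T + t, Q, atol=1e-6))  # True: motion recovered
```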

    An Efficient Dual Approach to Distance Metric Learning

    Distance metric learning is of fundamental interest in machine learning because the distance metric employed can significantly affect the performance of many learning methods. Quadratic Mahalanobis metric learning is a popular approach to the problem, but typically requires solving a semidefinite programming (SDP) problem, which is computationally expensive. Standard interior-point SDP solvers typically have a complexity of $O(D^{6.5})$ (with $D$ the dimension of the input data), and can thus only practically solve problems with fewer than a few thousand variables. Since the number of variables is $D(D+1)/2$, this implies a limit of around a few hundred dimensions on the size of problem that can practically be solved. The complexity of the popular quadratic Mahalanobis metric learning approach thus limits the size of problem to which metric learning can be applied. Here we propose a significantly more efficient approach to the metric learning problem based on the Lagrange dual formulation of the problem. The proposed formulation is much simpler to implement, and therefore allows much larger Mahalanobis metric learning problems to be solved. The time complexity of the proposed method is $O(D^3)$, which is significantly lower than that of the SDP approach. Experiments on a variety of datasets demonstrate that the proposed method achieves an accuracy comparable to the state-of-the-art, but is applicable to significantly larger problems. We also show that the proposed method can be applied to approximately solve more general Frobenius-norm regularized SDP problems.
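
    The $O(D^3)$ figure is consistent with the cost of one dense eigendecomposition per iteration, the primitive commonly used to project a candidate metric onto the positive semidefinite cone. The sketch below shows that projection; it illustrates the complexity argument, not the paper's specific dual algorithm.

```python
# The O(D^3) primitive behind many fast metric-learning methods:
# project a symmetric matrix onto the PSD cone with one
# eigendecomposition (clip negative eigenvalues to zero).
import numpy as np

def project_psd(M):
    """Nearest PSD matrix in Frobenius norm."""
    M = (M + M.T) / 2                     # symmetrize against round-off
    evals, evecs = np.linalg.eigh(M)      # O(D^3) cost dominates
    return (evecs * np.maximum(evals, 0)) @ evecs.T

D = 5
rng = np.random.default_rng(1)
M = rng.standard_normal((D, D))
M = (M + M.T) / 2                         # random symmetric, indefinite
X = project_psd(M)
print(np.round(np.linalg.eigvalsh(X), 3))  # all eigenvalues >= 0
```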