
    Approximation Algorithms for Semi-random Graph Partitioning Problems

    In this paper, we propose and study a new semi-random model for graph partitioning problems. We believe that it captures many properties of real-world instances. The model is more flexible than the semi-random model of Feige and Kilian and the planted random model of Bui, Chaudhuri, Leighton, and Sipser. We develop a general framework for solving semi-random instances and apply it to several problems of interest. We present constant-factor bi-criteria approximation algorithms for semi-random instances of the Balanced Cut, Multicut, Min Uncut, Sparsest Cut, and Small Set Expansion problems. We also show how to almost recover the optimal solution if the instance satisfies an additional expanding condition. Our algorithms work in a wider range of parameters than most algorithms for previously studied random and semi-random models. Additionally, we study a new planted algebraic expander model and develop constant-factor bi-criteria approximation algorithms for graph partitioning problems in this model. (To appear at the 44th ACM Symposium on Theory of Computing, STOC 2012.)
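    To make the planted-cut idea concrete, here is a rough generator for one such instance, under my own reading of a semi-random model (the exact powers given to the adversary in the paper may differ): edges crossing the planted cut are sampled randomly, while edges inside each side are chosen arbitrarily. The `inside_edges` argument is a hypothetical stand-in for the adversarial choices.

```python
import numpy as np

def semirandom_cut_instance(n, eps, inside_edges, seed=0):
    """Hedged sketch of a planted-cut semi-random instance: vertices split
    into S = {0..n//2-1} and its complement; each S-to-complement edge
    appears independently with probability eps (the random part), while
    edges inside each side come from an arbitrary caller-supplied list."""
    rng = np.random.default_rng(seed)
    half = n // 2
    A = np.zeros((n, n), dtype=int)
    # Random crossing edges between S and its complement.
    cross = rng.random((half, n - half)) < eps
    A[:half, half:] = cross
    A[half:, :half] = cross.T
    # "Adversarial" edges within each side, supplied by the caller.
    for u, v in inside_edges:
        A[u, v] = A[v, u] = 1
    return A
```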

    Planted Models for the Densest k-Subgraph Problem

    Given an undirected graph G, the Densest k-Subgraph problem (DkS) asks to compute a set S ⊆ V of cardinality |S| ≤ k such that the weight of edges inside S is maximized. This is a fundamental NP-hard problem whose approximability, despite many decades of research, is yet to be settled. The current best known approximation algorithm, due to Bhaskara et al. (2010), computes an O(n^{1/4 + ε}) approximation in time n^{O(1/ε)}, for any ε > 0. We ask: what are some "easier" instances of this problem? We propose some natural semi-random models of instances with a planted dense subgraph and study approximation algorithms for computing the densest subgraph in them. These models are inspired by the semi-random models of instances studied for various other graph problems, such as the independent set problem and graph partitioning problems. For a large range of parameters of these models, we get significantly better approximation factors for the Densest k-Subgraph problem. Moreover, our algorithm recovers a large part of the planted solution.
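    For orientation, a minimal baseline for DkS on an unweighted graph is greedy peeling: repeatedly delete a minimum-degree vertex until only k remain. This is a classic heuristic shown purely for illustration; it is neither the paper's algorithm nor the Bhaskara et al. algorithm.

```python
def greedy_peel_dks(adj, k):
    """Greedy peeling heuristic for Densest k-Subgraph.
    adj: dict mapping each vertex to the set of its neighbors."""
    alive = set(adj)
    deg = {v: len(adj[v] & alive) for v in alive}
    while len(alive) > k:
        v = min(alive, key=deg.get)   # peel a minimum-degree vertex
        alive.remove(v)
        for u in adj[v]:
            if u in alive:
                deg[u] -= 1
    return alive                       # k surviving vertices
```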

    Improved Cheeger's Inequality: Analysis of Spectral Partitioning Algorithms through Higher Order Spectral Gap

    Let \phi(G) be the minimum conductance of an undirected graph G, and let 0 = \lambda_1 <= \lambda_2 <= ... <= \lambda_n <= 2 be the eigenvalues of the normalized Laplacian matrix of G. We prove that for any graph G and any k >= 2, \phi(G) = O(k) \lambda_2 / \sqrt{\lambda_k}, and this performance guarantee is achieved by the spectral partitioning algorithm. This improves Cheeger's inequality, and the bound is optimal up to a constant factor for any k. Our result shows that the spectral partitioning algorithm is a constant-factor approximation algorithm for finding a sparse cut if \lambda_k is a constant for some constant k. This provides some theoretical justification for its empirical performance in image segmentation and clustering problems. We extend the analysis to other graph partitioning problems, including multi-way partition, balanced separator, and maximum cut.
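    The spectral partitioning algorithm referenced here is the classic embed-and-sweep procedure: compute the second eigenvector of the normalized Laplacian, sort vertices by their coordinates, and return the best threshold ("sweep") cut by conductance. A self-contained numpy sketch, assuming a dense adjacency matrix A with no isolated vertices:

```python
import numpy as np

def spectral_sweep_cut(A):
    """Sort vertices by the second eigenvector of the normalized
    Laplacian, then return the sweep cut of minimum conductance."""
    n = A.shape[0]
    d = A.sum(axis=1).astype(float)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L = np.eye(n) - D_inv_sqrt @ A @ D_inv_sqrt   # normalized Laplacian
    _, eigvecs = np.linalg.eigh(L)                 # ascending eigenvalues
    x = D_inv_sqrt @ eigvecs[:, 1]                 # second eigenvector, rescaled
    order = np.argsort(x)
    vol_total = d.sum()
    in_S = np.zeros(n, dtype=bool)
    best_phi, best_set = np.inf, None
    for v in order[:-1]:                           # sweep over the n-1 prefixes
        in_S[v] = True
        cut = A[np.ix_(in_S, ~in_S)].sum()
        vol = d[in_S].sum()
        phi = cut / min(vol, vol_total - vol)      # conductance of this prefix
        if phi < best_phi:
            best_phi, best_set = phi, np.flatnonzero(in_S)
    return best_phi, best_set
```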

    Approximate Computation and Implicit Regularization for Very Large-scale Data Analysis

    Database theory and database practice are typically the domain of computer scientists who adopt what may be termed an algorithmic perspective on their data. This perspective is very different from the more statistical perspective adopted by statisticians, scientific computing researchers, machine learners, and others who work on what may be broadly termed statistical data analysis. In this article, I will address fundamental aspects of this algorithmic-statistical disconnect, with an eye to bridging the gap between these two very different approaches. A concept that lies at the heart of this disconnect is that of statistical regularization, a notion that has to do with how robust the output of an algorithm is to the noise properties of the input data. Although it is nearly completely absent from computer science, which historically has taken the input data as given and modeled algorithms discretely, regularization in one form or another is central to nearly every application domain that applies algorithms to noisy data. Using several case studies, I will illustrate, both theoretically and empirically, the nonobvious fact that approximate computation, in and of itself, can implicitly lead to statistical regularization. This and other recent work suggests that, by exploiting in a more principled way the statistical properties implicit in worst-case algorithms, one can in many cases satisfy the bicriteria of having algorithms that are scalable to very large-scale databases and that also have good inferential or predictive properties. (To appear in the Proceedings of the 2012 ACM Symposium on Principles of Database Systems, PODS 2012.)
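    The paper's case studies concern large-scale data analysis, but the core claim, that approximate computation can itself act as implicit regularization, is easy to see in a toy setting of my own choosing: on an underdetermined least-squares problem, gradient descent stopped early returns a solution of much smaller norm than the exact minimum-norm solver, much as explicit ridge regularization would.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 200                        # ill-posed: more features than samples
X = rng.normal(size=(n, p))
y = X @ rng.normal(size=p) + 0.5 * rng.normal(size=n)

# Exact least squares (minimum-norm solution via pseudoinverse).
w_exact = np.linalg.pinv(X) @ y

# "Approximate" solver: gradient descent on the same objective, stopped early.
w = np.zeros(p)
step = 1.0 / np.linalg.norm(X, 2) ** 2
for _ in range(25):                   # few iterations => strong implicit shrinkage
    w -= step * X.T @ (X @ w - y)

# The early-stopped solution has markedly smaller norm: regularization
# arose from the approximation itself, not from an explicit penalty term.
print(np.linalg.norm(w_exact), np.linalg.norm(w))
```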

    Bilu-Linial Stable Instances of Max Cut and Minimum Multiway Cut

    We investigate the notion of stability proposed by Bilu and Linial. We obtain an exact polynomial-time algorithm for \gamma-stable Max Cut instances with \gamma \geq c\sqrt{\log n}\log\log n for some absolute constant c > 0. Our algorithm is robust: it never returns an incorrect answer; if the instance is \gamma-stable, it finds the maximum cut; otherwise, it either finds the maximum cut or certifies that the instance is not \gamma-stable. We prove that there is no robust polynomial-time algorithm for \gamma-stable instances of Max Cut when \gamma < \alpha_{SC}(n/2), where \alpha_{SC} is the best approximation factor for Sparsest Cut with non-uniform demands. Our algorithm is based on semidefinite programming. We show that the standard SDP relaxation for Max Cut (with \ell_2^2 triangle inequalities) is integral if \gamma \geq D_{\ell_2^2 \to \ell_1}(n), where D_{\ell_2^2 \to \ell_1}(n) is the least distortion with which every n-point metric space of negative type embeds into \ell_1. On the negative side, we show that the SDP relaxation is not integral when \gamma < D_{\ell_2^2 \to \ell_1}(n/2). Moreover, there is no tractable convex relaxation for \gamma-stable instances of Max Cut when \gamma < \alpha_{SC}(n/2). This suggests that solving \gamma-stable instances with \gamma = o(\sqrt{\log n}) might be difficult or impossible. Our results significantly improve previously known results: the best previously known algorithm for \gamma-stable instances of Max Cut required that \gamma \geq c\sqrt{n} for some c > 0 [Bilu, Daniely, Linial, and Saks], and no hardness results were known for the problem. Additionally, we present an algorithm for 4-stable instances of Minimum Multiway Cut. We also study a relaxed notion of weak stability.
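    For reference, the standard SDP relaxation for Max Cut is the Goemans-Williamson program: maximize (1/4) \sum_{ij} w_{ij} (1 - X_{ij}) over positive semidefinite matrices X with unit diagonal. The sketch below (using cvxpy) omits the \ell_2^2 triangle inequalities that the abstract's strengthened relaxation includes, and adds standard random-hyperplane rounding to recover a cut.

```python
import numpy as np
import cvxpy as cp

def maxcut_sdp(W, seed=0):
    """Basic Max Cut SDP relaxation (no triangle inequalities)
    followed by Goemans-Williamson hyperplane rounding.
    W: symmetric nonnegative weight matrix."""
    n = W.shape[0]
    X = cp.Variable((n, n), PSD=True)
    prob = cp.Problem(cp.Maximize(0.25 * cp.sum(cp.multiply(W, 1 - X))),
                      [cp.diag(X) == 1])
    prob.solve()
    # Recover Gram vectors (rows of V satisfy X = V V^T, up to jitter)
    # and cut along a random hyperplane.
    V = np.linalg.cholesky(X.value + 1e-8 * np.eye(n))
    r = np.random.default_rng(seed).normal(size=n)
    return np.sign(V @ r)              # +1/-1 side assignment per vertex
```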