
    The Hardness of Approximation of Euclidean k-means

    The Euclidean $k$-means problem is a classical problem that has been extensively studied in the theoretical computer science, machine learning, and computational geometry communities. In this problem, we are given a set of $n$ points in Euclidean space $R^d$, and the goal is to choose $k$ centers in $R^d$ so that the sum of squared distances of each point to its nearest center is minimized. The best approximation algorithms for this problem include a polynomial-time constant-factor approximation for general $k$ and a $(1+\epsilon)$-approximation which runs in time $\mathrm{poly}(n)\, 2^{O(k/\epsilon)}$. At the other extreme, the only known computational complexity result for this problem is NP-hardness [ADHP'09]. The main difficulty in obtaining hardness results stems from the Euclidean nature of the problem, and the fact that any point in $R^d$ can be a potential center. This gap in understanding left open the intriguing possibility that the problem might admit a PTAS for all $k,d$. In this paper we provide the first hardness of approximation result for the Euclidean $k$-means problem. Concretely, we show that there exists a constant $\epsilon > 0$ such that it is NP-hard to approximate the $k$-means objective to within a factor of $(1+\epsilon)$. We show this via an efficient reduction from the vertex cover problem on triangle-free graphs: given a triangle-free graph, the goal is to choose the fewest vertices that are incident on all the edges. Additionally, we give a proof that the current best hardness results for vertex cover carry over to triangle-free graphs. To show this we transform $G$, a known hard vertex cover instance, by taking a graph product with a suitably chosen graph $H$, and show that the size of the (normalized) maximum independent set is almost exactly preserved in the product graph, using a spectral analysis which might be of independent interest.
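    For reference, the objective being approximated is easy to state directly. Below is a minimal NumPy sketch (not from the paper; function and variable names are illustrative) that evaluates the $k$-means cost of a fixed set of centers.

```python
import numpy as np

def kmeans_cost(points, centers):
    """Sum of squared Euclidean distances from each point to its nearest center.

    points: (n, d) array; centers: (k, d) array.
    This only evaluates the k-means objective; it does not choose the centers.
    """
    # Pairwise squared distances, shape (n, k).
    d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return d2.min(axis=1).sum()

# Tiny usage example: n = 4 points in R^2 and k = 2 centers.
pts = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 0.0], [11.0, 0.0]])
ctrs = np.array([[0.5, 0.0], [10.5, 0.0]])
print(kmeans_cost(pts, ctrs))  # each point is at squared distance 0.25, so 1.0
```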

    A Novel Approach to Finding Near-Cliques: The Triangle-Densest Subgraph Problem

    Many graph mining applications rely on detecting subgraphs which are near-cliques. There exists a dichotomy between the results in the existing work related to this problem: on the one hand, the densest subgraph problem (DSP), which maximizes the average degree over all subgraphs, is solvable in polynomial time, but for many networks it fails to find subgraphs which are near-cliques. On the other hand, formulations that are geared towards finding near-cliques are NP-hard and frequently inapproximable due to connections with the Maximum Clique problem. In this work, we propose a formulation which combines the best of both worlds: it is solvable in polynomial time and finds near-cliques when the DSP fails. Surprisingly, our formulation is a simple variation of the DSP. Specifically, we define the triangle densest subgraph problem (TDSP): given $G(V,E)$, find a subset of vertices $S^*$ such that $\tau(S^*)=\max_{S \subseteq V} \frac{t(S)}{|S|}$, where $t(S)$ is the number of triangles induced by the set $S$. We provide various exact and approximation algorithms which solve the TDSP efficiently. Furthermore, we show how our algorithms adapt to the more general problem of maximizing the $k$-clique average density. Finally, we provide empirical evidence that the TDSP should be used whenever the DSP fails to output a near-clique. Comment: 42 pages.
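    As a concrete illustration of the objective (not the paper's polynomial-time algorithm), the following Python sketch evaluates the triangle density $\tau(S)$ of a given vertex subset by brute-force enumeration; the function name and graph encoding are ours.

```python
from itertools import combinations

def triangle_density(adj, S):
    """Triangle density tau(S) = t(S) / |S| of a vertex subset S.

    adj: dict mapping each vertex to the set of its neighbours (undirected graph).
    Brute-force evaluation of the TDSP objective only, for illustration.
    """
    S = set(S)
    if not S:
        return 0.0
    # Count triangles with all three endpoints inside S.
    t = sum(
        1
        for u, v, w in combinations(sorted(S), 3)
        if v in adj[u] and w in adj[u] and w in adj[v]
    )
    return t / len(S)

# Usage: a 4-clique induces C(4,3) = 4 triangles, so tau = 4 / 4 = 1.0.
K4 = {i: {j for j in range(4) if j != i} for i in range(4)}
print(triangle_density(K4, range(4)))
```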

    Approximation Algorithms for Multi-Criteria Traveling Salesman Problems

    In multi-criteria optimization problems, several objective functions have to be optimized. Since the different objective functions are usually in conflict with each other, one cannot consider only one particular solution as the optimal solution. Instead, the aim is to compute a so-called Pareto curve of solutions. Since Pareto curves cannot be computed efficiently in general, we have to be content with approximations to them. We design a deterministic polynomial-time algorithm for multi-criteria g-metric STSP that computes (min{1+g, 2g^2/(2g^2-2g+1)} + eps)-approximate Pareto curves for all 1/2 <= g <= 1. In particular, we obtain a (2+eps)-approximation for multi-criteria metric STSP. We also present two randomized approximation algorithms for multi-criteria g-metric STSP that achieve approximation ratios of (2g^3+2g^2)/(3g^2-2g+1) + eps and (1+g)/(1+3g-4g^2) + eps, respectively. Moreover, we present randomized approximation algorithms for multi-criteria g-metric ATSP (ratio 1/2 + g^3/(1-3g^2) + eps for g < 1/sqrt(3)), STSP with weights 1 and 2 (ratio 4/3), and ATSP with weights 1 and 2 (ratio 3/2). To do this, we design randomized approximation schemes for multi-criteria cycle cover and graph factor problems. Comment: To appear in Algorithmica. A preliminary version was presented at the 4th Workshop on Approximation and Online Algorithms (WAOA 2006).
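    To make the parametrized ratio tangible, the short Python sketch below (function name ours, eps term omitted) evaluates the deterministic guarantee as a function of g and recovers the stated factor 2 in the metric case g = 1.

```python
def stsp_det_ratio(g):
    """Deterministic ratio from the abstract, without the eps term:
    min{1 + g, 2 g^2 / (2 g^2 - 2 g + 1)} for 1/2 <= g <= 1.
    """
    assert 0.5 <= g <= 1.0
    return min(1.0 + g, 2 * g * g / (2 * g * g - 2 * g + 1))

# g = 1 is the metric case: min{2, 2} = 2, matching the (2 + eps)-approximation.
print(stsp_det_ratio(1.0))  # 2.0
print(stsp_det_ratio(0.5))  # min{1.5, 1.0} = 1.0
```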

    Bilu-Linial Stable Instances of Max Cut and Minimum Multiway Cut

    We investigate the notion of stability proposed by Bilu and Linial. We obtain an exact polynomial-time algorithm for $\gamma$-stable Max Cut instances with $\gamma \geq c\sqrt{\log n}\log\log n$ for some absolute constant $c > 0$. Our algorithm is robust: it never returns an incorrect answer; if the instance is $\gamma$-stable, it finds the maximum cut; otherwise, it either finds the maximum cut or certifies that the instance is not $\gamma$-stable. We prove that there is no robust polynomial-time algorithm for $\gamma$-stable instances of Max Cut when $\gamma < \alpha_{SC}(n/2)$, where $\alpha_{SC}$ is the best approximation factor for Sparsest Cut with non-uniform demands. Our algorithm is based on semidefinite programming. We show that the standard SDP relaxation for Max Cut (with $\ell_2^2$ triangle inequalities) is integral if $\gamma \geq D_{\ell_2^2\to \ell_1}(n)$, where $D_{\ell_2^2\to \ell_1}(n)$ is the least distortion with which every $n$-point metric space of negative type embeds into $\ell_1$. On the negative side, we show that the SDP relaxation is not integral when $\gamma < D_{\ell_2^2\to \ell_1}(n/2)$. Moreover, there is no tractable convex relaxation for $\gamma$-stable instances of Max Cut when $\gamma < \alpha_{SC}(n/2)$. This suggests that solving $\gamma$-stable instances with $\gamma = o(\sqrt{\log n})$ might be difficult or impossible. Our results significantly improve previously known results. The best previously known algorithm for $\gamma$-stable instances of Max Cut required that $\gamma \geq c\sqrt{n}$ for some $c > 0$ [Bilu, Daniely, Linial, and Saks]. No hardness results were known for the problem. Additionally, we present an algorithm for 4-stable instances of Minimum Multiway Cut. We also study a relaxed notion of weak stability. Comment: 24 pages.
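    For orientation, the relaxation referred to is the standard Max Cut vector program strengthened with $\ell_2^2$ triangle inequalities; the following is the textbook form (edge weights $w_{ij}$, unit vectors $v_i$), written in our notation rather than reproduced from the paper.

```latex
\max \ \sum_{(i,j)\in E} w_{ij}\,\frac{1-\langle v_i, v_j\rangle}{2}
\quad\text{s.t.}\quad \|v_i\|^2 = 1 \ \ \forall i, \qquad
\|v_i-v_j\|^2 + \|v_j-v_k\|^2 \ \ge\ \|v_i-v_k\|^2 \ \ \forall i,j,k .
```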

    A QPTAS for Maximum Weight Independent Set of Polygons with Polylogarithmically Many Vertices

    The Maximum Weight Independent Set of Polygons (MWISP) problem is a fundamental problem in computational geometry. Given a set of weighted polygons in the 2-dimensional plane, the goal is to find a set of pairwise non-overlapping polygons with maximum total weight. Due to its wide range of applications, the MWISP problem and its special cases have been extensively studied both in the approximation algorithms and the computational geometry communities. Despite a lot of research, its general case is not well understood. Currently the best known polynomial-time algorithm achieves an approximation ratio of n^epsilon [Fox and Pach, SODA 2011], and it is not even clear whether the problem is APX-hard. We present a (1+epsilon)-approximation algorithm, assuming that each polygon in the input has at most a polylogarithmic number of vertices. Our algorithm has quasi-polynomial running time. We use a recently introduced framework for approximating maximum weight independent set in geometric intersection graphs. The framework has been used to construct a QPTAS in the much simpler case of axis-parallel rectangles. We extend it in two ways to adapt it to our much more general setting. First, we show that its technical core can be reduced to the case when all input polygons are triangles. Secondly, we replace its key technical ingredient, which is a method to partition the plane using only few edges such that the objects stemming from the optimal solution are evenly distributed among the resulting faces and each object is intersected only a few times. Our new procedure for this task is not more complex than the original one, and it can handle the difficulties arising from the arbitrary angles of the polygons. Note that this obstacle alone makes the known analysis of the above framework fail. Also, in general it is not well understood how to handle this difficulty with efficient approximation algorithms.
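    As a point of reference (a standard formulation, not taken from the paper), MWISP is the maximum weight independent set problem on the intersection graph of the polygons, which can be written as the integer program below.

```latex
\max \ \sum_{i} w_i x_i
\quad\text{s.t.}\quad x_i + x_j \le 1 \ \text{ whenever polygons } P_i \text{ and } P_j \text{ overlap},
\qquad x_i \in \{0,1\}.
```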

    Exact asymptotics of the optimal Lp-error of asymmetric linear spline approximation

    In this paper we study the best asymmetric (sometimes also called penalized or sign-sensitive) approximation in the metrics of the space $L_p$, $1\leqslant p\leqslant\infty$, of functions $f\in C^2([0,1]^2)$ with nonnegative Hessian by piecewise linear splines $s\in S(\triangle_N)$ generated by given triangulations $\triangle_N$ with $N$ elements. We find the exact asymptotic behavior of the optimal (over triangulations $\triangle_N$ and splines $s\in S(\triangle_N)$) error of such approximation as $N\to \infty$.
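    One common way to formalize the sign-sensitive error is via $(\alpha,\beta)$-weighting of the positive and negative parts of the residual; the exact convention used in the paper may differ, so the display below is only an assumed sketch of the kind of quantity whose asymptotics are studied.

```latex
\|f-s\|_{p;\,\alpha,\beta} \;=\; \bigl\|\alpha\,(f-s)_+ + \beta\,(f-s)_-\bigr\|_{L_p([0,1]^2)},
\qquad (f-s)_{\pm} = \max\{\pm(f-s),\,0\},
\qquad
E_N(f)_{p;\,\alpha,\beta} \;=\; \inf_{\triangle_N}\ \inf_{s\in S(\triangle_N)} \|f-s\|_{p;\,\alpha,\beta}.
```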