43 research outputs found

    Detecting High Log-Densities -- an O(n^1/4) Approximation for Densest k-Subgraph

    Full text link
    In the Densest k-Subgraph problem, given a graph G and a parameter k, one needs to find a subgraph of G induced on k vertices that contains the largest number of edges. There is a significant gap between the best known upper and lower bounds for this problem. It is NP-hard, and does not have a PTAS unless NP has subexponential time algorithms. On the other hand, the current best known algorithm of Feige, Kortsarz and Peleg, gives an approximation ratio of n^(1/3-epsilon) for some specific epsilon > 0 (estimated at around 1/60). We present an algorithm that for every epsilon > 0 approximates the Densest k-Subgraph problem within a ratio of n^(1/4+epsilon) in time n^O(1/epsilon). In particular, our algorithm achieves an approximation ratio of O(n^1/4) in time n^O(log n). Our algorithm is inspired by studying an average-case version of the problem where the goal is to distinguish random graphs from graphs with planted dense subgraphs. The approximation ratio we achieve for the general case matches the distinguishing ratio we obtain for this planted problem. At a high level, our algorithms involve cleverly counting appropriately defined trees of constant size in G, and using these counts to identify the vertices of the dense subgraph. Our algorithm is based on the following principle. We say that a graph G(V,E) has log-density alpha if its average degree is Theta(|V|^alpha). The algorithmic core of our result is a family of algorithms that output k-subgraphs of nontrivial density whenever the log-density of the densest k-subgraph is larger than the log-density of the host graph.Comment: 23 page

    Diversity Maximization in Doubling Metrics

    Get PDF
    Diversity maximization is an important geometric optimization problem with many applications in recommender systems, machine learning or search engines among others. A typical diversification problem is as follows: Given a finite metric space (X,d) and a parameter k in N, find a subset of k elements of X that has maximum diversity. There are many functions that measure diversity. One of the most popular measures, called remote-clique, is the sum of the pairwise distances of the chosen elements. In this paper, we present novel results on three widely used diversity measures: Remote-clique, remote-star and remote-bipartition. Our main result are polynomial time approximation schemes for these three diversification problems under the assumption that the metric space is doubling. This setting has been discussed in the recent literature. The existence of such a PTAS however was left open. Our results also hold in the setting where the distances are raised to a fixed power q >= 1, giving rise to more variants of diversity functions, similar in spirit to the variations of clustering problems depending on the power applied to the pairwise distances. Finally, we provide a proof of NP-hardness for remote-clique with squared distances in doubling metric spaces

    Approximating k-Forest with Resource Augmentation: A Primal-Dual Approach

    Full text link
    In this paper, we study the kk-forest problem in the model of resource augmentation. In the kk-forest problem, given an edge-weighted graph G(V,E)G(V,E), a parameter kk, and a set of mm demand pairs V×V\subseteq V \times V, the objective is to construct a minimum-cost subgraph that connects at least kk demands. The problem is hard to approximate---the best-known approximation ratio is O(min{n,k})O(\min\{\sqrt{n}, \sqrt{k}\}). Furthermore, kk-forest is as hard to approximate as the notoriously-hard densest kk-subgraph problem. While the kk-forest problem is hard to approximate in the worst-case, we show that with the use of resource augmentation, we can efficiently approximate it up to a constant factor. First, we restate the problem in terms of the number of demands that are {\em not} connected. In particular, the objective of the kk-forest problem can be viewed as to remove at most mkm-k demands and find a minimum-cost subgraph that connects the remaining demands. We use this perspective of the problem to explain the performance of our algorithm (in terms of the augmentation) in a more intuitive way. Specifically, we present a polynomial-time algorithm for the kk-forest problem that, for every ϵ>0\epsilon>0, removes at most mkm-k demands and has cost no more than O(1/ϵ2)O(1/\epsilon^{2}) times the cost of an optimal algorithm that removes at most (1ϵ)(mk)(1-\epsilon)(m-k) demands

    Weakly Submodular Functions

    Full text link
    Submodular functions are well-studied in combinatorial optimization, game theory and economics. The natural diminishing returns property makes them suitable for many applications. We study an extension of monotone submodular functions, which we call {\em weakly submodular functions}. Our extension includes some (mildly) supermodular functions. We show that several natural functions belong to this class and relate our class to some other recent submodular function extensions. We consider the optimization problem of maximizing a weakly submodular function subject to uniform and general matroid constraints. For a uniform matroid constraint, the "standard greedy algorithm" achieves a constant approximation ratio where the constant (experimentally) converges to 5.95 as the cardinality constraint increases. For a general matroid constraint, a simple local search algorithm achieves a constant approximation ratio where the constant (analytically) converges to 10.22 as the rank of the matroid increases

    Max-sum diversity via convex programming

    Get PDF
    Diversity maximization is an important concept in information retrieval, computational geometry and operations research. Usually, it is a variant of the following problem: Given a ground set, constraints, and a function f()f(\cdot) that measures diversity of a subset, the task is to select a feasible subset SS such that f(S)f(S) is maximized. The \emph{sum-dispersion} function f(S)=x,ySd(x,y)f(S) = \sum_{x,y \in S} d(x,y), which is the sum of the pairwise distances in SS, is in this context a prominent diversification measure. The corresponding diversity maximization is the \emph{max-sum} or \emph{sum-sum diversification}. Many recent results deal with the design of constant-factor approximation algorithms of diversification problems involving sum-dispersion function under a matroid constraint. In this paper, we present a PTAS for the max-sum diversification problem under a matroid constraint for distances d(,)d(\cdot,\cdot) of \emph{negative type}. Distances of negative type are, for example, metric distances stemming from the 2\ell_2 and 1\ell_1 norm, as well as the cosine or spherical, or Jaccard distance which are popular similarity metrics in web and image search

    Efficient Approximations for the Online Dispersion Problem

    Get PDF
    The dispersion problem has been widely studied in computational geometry and facility location, and is closely related to the packing problem. The goal is to locate n points (e.g., facilities or persons) in a k-dimensional polytope, so that they are far away from each other and from the boundary of the polytope. In many real-world scenarios however, the points arrive and depart at different times, and decisions must be made without knowing future events. Therefore we study, for the first time in the literature, the online dispersion problem in Euclidean space. There are two natural objectives when time is involved: the all-time worst-case (ATWC) problem tries to maximize the minimum distance that ever appears at any time; and the cumulative distance (CD) problem tries to maximize the integral of the minimum distance throughout the whole time interval. Interestingly, the online problems are highly non-trivial even on a segment. For cumulative distance, this remains the case even when the problem is time-dependent but offline, with all the arriving and departure times given in advance. For the online ATWC problem on a segment, we construct a deterministic polynomial-time algorithm which is (2ln2+epsilon)-competitive, where epsilon>0 can be arbitrarily small and the algorithm\u27s running time is polynomial in 1/epsilon. We show this algorithm is actually optimal. For the same problem in a square, we provide a 1.591-competitive algorithm and a 1.183 lower-bound. Furthermore, for arbitrary k-dimensional polytopes with k>=2, we provide a 2/(1-epsilon)-competitive algorithm and a 7/6 lower-bound. All our lower-bounds come from the structure of the online problems and hold even when computational complexity is not a concern. Interestingly, for the offline CD problem in arbitrary k-dimensional polytopes, we provide a polynomial-time black-box reduction to the online ATWC problem, and the resulting competitive ratio increases by a factor of at most 2. Our techniques also apply to online dispersion problems with different boundary conditions
    corecore