
    Approximating k-Forest with Resource Augmentation: A Primal-Dual Approach

    In this paper, we study the $k$-forest problem in the model of resource augmentation. In the $k$-forest problem, given an edge-weighted graph $G(V,E)$, a parameter $k$, and a set of $m$ demand pairs $\subseteq V \times V$, the objective is to construct a minimum-cost subgraph that connects at least $k$ demands. The problem is hard to approximate: the best-known approximation ratio is $O(\min\{\sqrt{n}, \sqrt{k}\})$. Furthermore, $k$-forest is as hard to approximate as the notoriously hard densest $k$-subgraph problem. While the $k$-forest problem is hard to approximate in the worst case, we show that with the use of resource augmentation, we can efficiently approximate it up to a constant factor. First, we restate the problem in terms of the number of demands that are not connected. In particular, the objective of the $k$-forest problem can be viewed as removing at most $m-k$ demands and finding a minimum-cost subgraph that connects the remaining demands. We use this perspective of the problem to explain the performance of our algorithm (in terms of the augmentation) in a more intuitive way. Specifically, we present a polynomial-time algorithm for the $k$-forest problem that, for every $\epsilon>0$, removes at most $m-k$ demands and has cost no more than $O(1/\epsilon^{2})$ times the cost of an optimal algorithm that removes at most $(1-\epsilon)(m-k)$ demands.
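The guarantee stated in this abstract can be written compactly as follows. This is only a restatement under assumed notation: $\mathrm{ALG}$ and $\mathrm{OPT}_r$ are names introduced here for illustration, not symbols taken from the abstract, and the numbers in the comment are a hypothetical instance.

```latex
% Assumed notation (not from the abstract):
%   cost(ALG)  = cost of the subgraph returned by the algorithm,
%                which leaves at most m-k demands unconnected;
%   OPT_r      = minimum cost of a subgraph that leaves at most r demands unconnected.
\[
  \mathrm{cost}(\mathrm{ALG}) \;\le\; O\!\left(\frac{1}{\epsilon^{2}}\right)\cdot \mathrm{OPT}_{(1-\epsilon)(m-k)}
  \qquad \text{for every } \epsilon > 0 .
\]
% Hypothetical instance: m = 100, k = 80, epsilon = 1/2.
% The algorithm may drop up to m-k = 20 demands, and is compared against
% an optimum that may drop only (1-epsilon)(m-k) = 10 demands.
```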

    The Strongish Planted Clique Hypothesis and Its Consequences

    We formulate a new hardness assumption, the Strongish Planted Clique Hypothesis (SPCH), which postulates that any algorithm for planted clique must run in time n^Ω(log n) (so that the state-of-the-art running time of n^O(log n) is optimal up to a constant in the exponent). We provide two sets of applications of the new hypothesis. First, we show that SPCH implies (nearly) tight inapproximability results for the following well-studied problems in terms of the parameter k: Densest k-Subgraph, Smallest k-Edge Subgraph, Densest k-Subhypergraph, Steiner k-Forest, and Directed Steiner Network with k terminal pairs. For example, we show, under SPCH, that no polynomial time algorithm achieves o(k)-approximation for Densest k-Subgraph. This inapproximability ratio improves upon the previous best k^o(1) factor from (Chalermsook et al., FOCS 2017). Furthermore, our lower bounds hold even against fixed-parameter tractable algorithms with parameter k. Our second application focuses on the complexity of graph pattern detection. For both induced and non-induced graph pattern detection, we prove hardness results under SPCH, improving the running time lower bounds obtained by (Dalirrooyfard et al., STOC 2019) under the Exponential Time Hypothesis.

    The Maximum Clique Problem: Algorithms, Applications, and Implementations

    Computationally hard problems are routinely encountered during the course of solving practical problems. This is commonly dealt with by settling for less than optimal solutions, through the use of heuristics or approximation algorithms. This dissertation examines the alternate possibility of solving such problems exactly, through a detailed study of one particular problem, the maximum clique problem. It discusses algorithms, implementations, and the application of maximum clique results to real-world problems. First, the theoretical roots of the algorithmic method employed are discussed. Then a practical approach is described, which separates out important algorithmic decisions so that the algorithm can be easily tuned for different types of input data. This general and modifiable approach is also meant as a tool for research so that different strategies can easily be tried for different situations. Next, a specific implementation is described. The program is tuned, by use of experiments, to work best for two different graph types, real-world biological data and a suite of synthetic graphs. A parallel implementation is then briefly discussed and tested. After considering implementation, an example of applying these clique-finding tools to a specific case of real-world biological data is presented. Results are analyzed using both statistical and biological metrics. Then the development of practical algorithms based on clique-finding tools is explored in greater detail. New algorithms are introduced and preliminary experiments are performed. Next, some relaxations of clique are discussed along with the possibility of developing new practical algorithms from these variations. Finally, conclusions and future research directions are given

    Approximating Bipartite Minimum Vertex Cover in the CONGEST Model

    Multipartite Graph Algorithms for the Analysis of Heterogeneous Data

    The explosive growth in the rate of data generation in recent years threatens to outpace the growth in computer power, motivating the need for new, scalable algorithms and big data analytic techniques. No field may be more emblematic of this data deluge than the life sciences, where technologies such as high-throughput mRNA arrays and next generation genome sequencing are routinely used to generate datasets of extreme scale. Data from experiments in genomics, transcriptomics, metabolomics and proteomics are continuously being added to existing repositories. A goal of exploratory analysis of such omics data is to illuminate the functions and relationships of biomolecules within an organism. This dissertation describes the design, implementation and application of graph algorithms, with the goal of seeking dense structure in data derived from omics experiments in order to detect latent associations between often heterogeneous entities, such as genes, diseases and phenotypes. Exact combinatorial solutions are developed and implemented, rather than relying on approximations or heuristics, even when problems are exceedingly large and/or difficult. Datasets on which the algorithms are applied include time series transcriptomic data from an experiment on the developing mouse cerebellum, gene expression data measuring acute ethanol response in the prefrontal cortex, and the analysis of a predicted protein-protein interaction network. A bipartite graph model is used to integrate heterogeneous data types, such as genes with phenotypes and microbes with mouse strains. The techniques are then extended to a multipartite algorithm to enumerate dense substructure in multipartite graphs, constructed using data from three or more heterogeneous sources, with applications to functional genomics. Several new theoretical results are given regarding multipartite graphs and the multipartite enumeration algorithm. 
    In all cases, practical implementations are demonstrated to expand the frontier of computational feasibility.

    A Novel Approach to Finding Near-Cliques: The Triangle-Densest Subgraph Problem

    Many graph mining applications rely on detecting subgraphs which are near-cliques. There exists a dichotomy between the results in the existing work related to this problem: on the one hand, the densest subgraph problem (DSP), which maximizes the average degree over all subgraphs, is solvable in polynomial time but for many networks fails to find subgraphs which are near-cliques. On the other hand, formulations that are geared towards finding near-cliques are NP-hard and frequently inapproximable due to connections with the Maximum Clique problem. In this work, we propose a formulation which combines the best of both worlds: it is solvable in polynomial time and finds near-cliques when the DSP fails. Surprisingly, our formulation is a simple variation of the DSP. Specifically, we define the triangle densest subgraph problem (TDSP): given $G(V,E)$, find a subset of vertices $S^*$ such that $\tau(S^*)=\max_{S \subseteq V} \frac{t(S)}{|S|}$, where $t(S)$ is the number of triangles induced by the set $S$. We provide various exact and approximation algorithms which solve the TDSP efficiently. Furthermore, we show how our algorithms adapt to the more general problem of maximizing the $k$-clique average density. Finally, we provide empirical evidence that the TDSP should be used whenever the DSP fails to output a near-clique. Comment: 42 pages
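To make the TDSP objective concrete, here is a minimal brute-force sketch that evaluates $t(S)/|S|$ for every non-empty vertex subset of a toy graph. It is not one of the paper's algorithms (the abstract describes polynomial-time methods; this enumeration is exponential), and the function names and the example graph are assumptions chosen for illustration.

```python
from itertools import combinations


def triangle_count(edges, subset):
    """Count triangles whose three vertices all lie in `subset`."""
    adj = {v: set() for v in subset}
    for u, v in edges:
        if u in adj and v in adj and u != v:
            adj[u].add(v)
            adj[v].add(u)
    return sum(
        1
        for a, b, c in combinations(sorted(subset), 3)
        if b in adj[a] and c in adj[a] and c in adj[b]
    )


def triangle_densest_subgraph(vertices, edges):
    """Exhaustively maximise t(S)/|S| over all non-empty S (exponential time)."""
    vertices = list(vertices)
    best_set, best_density = None, -1.0
    for size in range(1, len(vertices) + 1):
        for subset in combinations(vertices, size):
            density = triangle_count(edges, subset) / size
            if density > best_density:
                best_set, best_density = set(subset), density
    return best_set, best_density


# Hypothetical toy graph: a 4-clique on {0, 1, 2, 3} with a pendant path 3-4-5.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
best_set, best_density = triangle_densest_subgraph(range(6), edges)
print(best_set, best_density)
```

On this toy input the 4-clique contains four triangles on four vertices, so it attains triangle density 1, while adding either path vertex only dilutes the ratio.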

    Fully Dynamic Algorithms for Minimum Weight Cycle and Related Problems

    We consider the directed minimum weight cycle problem in the fully dynamic setting. To the best of our knowledge, so far no fully dynamic algorithms have been designed specifically for the minimum weight cycle problem in general digraphs. One can achieve $\tilde{O}(n^2)$ amortized update time by simply invoking the fully dynamic APSP algorithm of Demetrescu and Italiano [J. ACM'04]. This bound, however, yields no improvement over the trivial recompute-from-scratch algorithm for sparse graphs. Our first contribution is a very simple deterministic $(1+\epsilon)$-approximate algorithm supporting vertex updates (i.e., changing all edges incident to a specified vertex) in conditionally near-optimal $\tilde{O}(m\log{(W)}/\epsilon)$ amortized time for digraphs with real edge weights in $[1,W]$. Using known techniques, the algorithm can be implemented on planar graphs and also gives some new sublinear fully dynamic algorithms maintaining approximate cuts and flows in planar digraphs. Additionally, we show a Monte Carlo randomized exact fully dynamic minimum weight cycle algorithm with $\tilde{O}(mn^{2/3})$ worst-case update time that works for real edge weights. To this end, we generalize the exact fully dynamic APSP data structure of Abraham et al. [SODA'17] to solve the "multiple-pairs shortest paths problem", where one is interested in computing distances for some $k$ (instead of all $n^2$) fixed source-target pairs after each update. We show that in such a scenario, $\tilde{O}((m+k)n^{2/3})$ worst-case update time is possible. Comment: Full version of an ICALP 2021 paper
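For orientation, the "trivial recompute-from-scratch" baseline mentioned in this abstract can be sketched as below: after every update, run Dijkstra from each vertex and close a cycle with each incoming edge. This is only an illustration of that baseline, not the paper's dynamic algorithms; it assumes non-negative edge weights, and the function names and example edge list are hypothetical.

```python
import heapq


def dijkstra(adj, source):
    """Single-source shortest path distances; assumes non-negative weights."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def min_weight_cycle(edges):
    """Minimum weight of a directed cycle, or infinity if the digraph is acyclic."""
    adj, incoming = {}, {}
    for u, v, w in edges:
        adj.setdefault(u, []).append((v, w))
        incoming.setdefault(v, []).append((u, w))
    best = float("inf")
    for s in adj:
        dist = dijkstra(adj, s)
        # A cycle through s is a shortest path s -> x followed by an edge x -> s.
        for x, w in incoming.get(s, []):
            if x in dist:
                best = min(best, dist[x] + w)
    return best


# Hypothetical example: recompute from scratch after an edge-weight update.
edges = [("a", "b", 2), ("b", "c", 2), ("c", "a", 2), ("b", "a", 5)]
before = min_weight_cycle(edges)
edges[3] = ("b", "a", 1)  # update: cheaper back edge b -> a
after = min_weight_cycle(edges)
print(before, after)
```

Each call costs roughly one Dijkstra run per vertex, which is the per-update cost the dynamic algorithms in the abstract aim to beat.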