
    Precedence-Constrained Min Sum Set Cover

    We introduce a version of the Min Sum Set Cover (MSSC) problem in which there are "AND" precedence constraints on the m sets. In the Precedence-Constrained Min Sum Set Cover (PCMSSC) problem, the constraints, interpreted as directed edges, induce a directed acyclic graph. PCMSSC models the aim of scheduling software tests to prioritize the rate of fault detection subject to dependencies between tests. Our greedy scheme for PCMSSC is similar to the approaches of Feige, Lovász, and Tetali for MSSC, and of Chekuri and Motwani for precedence-constrained scheduling to minimize weighted completion time. With a factor-4 increase in the approximation ratio, we reduce PCMSSC to the problem of finding a maximum-density precedence-closed sub-family of sets, where density is the ratio of the size of the sub-family's union to its cardinality. We provide a greedy factor-sqrt(m) algorithm for maximizing density; on forests of in-trees, we show this algorithm finds an optimal solution. Harnessing an alternative greedy argument of Chekuri and Kumar for Maximum Coverage with Group Budget Constraints, on forests of out-trees, we design an algorithm with approximation ratio equal to the maximum tree height. Finally, with a reduction from the Planted Dense Subgraph detection problem, we show that its conjectured hardness implies there is no polynomial-time algorithm for PCMSSC with approximation factor O(m^{1/12 - epsilon}).
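    For illustration, the density objective at the heart of the reduction can be sketched in a few lines. The following is a minimal brute-force sketch over a toy instance; the names `sets` and `preds` and the dict-based encoding are our own, and the exhaustive search merely stands in for the paper's greedy factor-sqrt(m) algorithm.

```python
from itertools import combinations

def is_precedence_closed(subfam, preds):
    """A sub-family is precedence-closed if it contains every predecessor
    of each of its members (the 'AND' constraints)."""
    return all(p in subfam for s in subfam for p in preds.get(s, ()))

def density(subfam, sets):
    """Density = size of the union of the chosen sets / number of sets."""
    union = set().union(*(sets[s] for s in subfam))
    return len(union) / len(subfam)

def max_density_closed_subfamily(sets, preds):
    """Brute-force max-density precedence-closed sub-family (exponential
    in m; illustration only)."""
    names = list(sets)
    best, best_d = None, -1.0
    for r in range(1, len(names) + 1):
        for combo in combinations(names, r):
            fam = set(combo)
            if is_precedence_closed(fam, preds) and density(fam, sets) > best_d:
                best, best_d = fam, density(fam, sets)
    return best, best_d

# Toy instance: test 'b' depends on 'a'; set elements are covered faults.
sets = {"a": {1}, "b": {2, 3, 4}, "c": {1, 2}}
preds = {"b": ["a"]}          # 'a' must be scheduled before 'b'
print(max_density_closed_subfamily(sets, preds))
```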

    Improved Approximation Algorithms for (Budgeted) Node-weighted Steiner Problems

    Moss and Rabani [12] study constrained node-weighted Steiner tree problems with two independent weight values associated with each node, namely, cost and prize (or penalty). They give an O(log n)-approximation algorithm for the prize-collecting node-weighted Steiner tree problem (PCST). They use the algorithm for PCST to obtain a bicriteria (2, O(log n))-approximation algorithm for the Budgeted node-weighted Steiner tree problem: their solution may cost up to twice the budget, but collects a factor Omega(1/log n) of the optimal prize. We improve these results in at least two respects. Our first main result is a primal-dual O(log h)-approximation algorithm for a more general problem, prize-collecting node-weighted Steiner forest, where we have h demands, each requesting the connectivity of a pair of vertices. Our algorithm can be seen as a greedy algorithm which reduces the number of demands by choosing a structure with minimum cost-to-reduction ratio. This natural style of argument (also used by Klein and Ravi [10] and Guha et al. [8]) leads to a much simpler algorithm than that of Moss and Rabani [12] for PCST. Our second main contribution is for the Budgeted node-weighted Steiner tree problem, which is also an improvement over [12] and [8]. In the unrooted case, we improve upon the O(log^2 n)-approximation of [8] and present an O(log n)-approximation algorithm without any budget violation. For the rooted case, where a specified vertex has to appear in the solution tree, we improve the bicriteria result of [12] to a bicriteria approximation ratio of (1 + eps, O(log n)/eps^2) for any positive (possibly subconstant) eps. That is, for any permissible budget violation 1 + eps, we present an algorithm achieving a tradeoff in the guarantee for the prize. Indeed, we show that this is almost tight for the natural linear-programming relaxation used by us as well as in [12]. Comment: To appear in ICALP 201
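    The "minimum cost-to-reduction ratio" greedy style the abstract describes can be sketched generically. In the sketch below, the flat list of (cost, satisfied-demands) candidate structures is a hypothetical stand-in for the spiders/trees the actual primal-dual algorithm grows; only the selection rule is the point.

```python
def greedy_min_ratio(demands, structures):
    """Generic min cost-to-reduction-ratio loop: repeatedly buy the
    candidate structure whose cost per still-unsatisfied demand is
    smallest, until all demands are connected."""
    remaining = set(demands)
    bought, total_cost = [], 0.0
    while remaining:
        # ratio = cost / number of still-unsatisfied demands it connects
        cost, newly = min(
            ((c, sat & remaining) for c, sat in structures),
            key=lambda cs: cs[0] / len(cs[1]) if cs[1] else float("inf"),
        )
        if not newly:
            raise ValueError("no structure satisfies the remaining demands")
        bought.append((cost, newly))
        total_cost += cost
        remaining -= newly
    return bought, total_cost

# Toy run: three demand pairs, candidate structures with node costs.
demands = {"d1", "d2", "d3"}
structures = [(4.0, {"d1", "d2"}), (3.0, {"d3"}), (9.0, {"d1", "d2", "d3"})]
print(greedy_min_ratio(demands, structures))   # buys the two cheap structures
```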

    On Kernelization and Approximation for the Vector Connectivity Problem

    In the Vector Connectivity problem we are given an undirected graph G = (V, E), a demand function phi: V -> {0, ..., d}, and an integer k. The question is whether there exists a set S of at most k vertices such that every vertex v in V \ S has at least phi(v) vertex-disjoint paths to S; this abstractly captures questions about placing servers or warehouses relative to demands. The problem is NP-hard already for instances with d = 4 (Cicalese et al., arXiv '14), admits a log-factor approximation (Boros et al., Networks '14), and is fixed-parameter tractable in terms of k (Lokshtanov, unpublished '14). We prove several results regarding kernelization and approximation for Vector Connectivity and the variant Vector d-Connectivity, where the upper bound d on demands is a fixed constant. For Vector d-Connectivity we give a factor-d approximation algorithm and construct a vertex-linear kernelization, i.e., an efficient reduction to an equivalent instance with f(d)k = O(k) vertices. For Vector Connectivity we have a factor-opt approximation, and we show that it has no kernelization to size polynomial in k or even k + d unless NP ⊆ coNP/poly, making kernel size f(d) * poly(k) optimal for Vector d-Connectivity. Finally, we provide a write-up for fixed-parameter tractability of Vector Connectivity(k) by giving an alternative FPT algorithm based on matroid intersection. Comment: Non-constructive kernelization argument, improved technical details of signature
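    As a companion sketch, the feasibility condition itself is easy to check with a max-flow library. The following assumes networkx and uses the standard node-splitting construction to count vertex-disjoint paths; all function names are ours, not from the paper.

```python
import networkx as nx

def vertex_disjoint_paths_to_set(G, v, S):
    """Count vertex-disjoint paths from v to the set S (paths sharing only
    v) via max-flow: every vertex other than v is split into an in/out
    pair with capacity 1."""
    if not S:
        return 0
    H = nx.DiGraph()
    for u in G:
        if u != v:
            H.add_edge((u, "in"), (u, "out"), capacity=1)
    for a, b in G.edges():
        for x, y in ((a, b), (b, a)):
            if y == v:
                continue  # flow never needs to re-enter the source v
            src = "v" if x == v else (x, "out")
            H.add_edge(src, (y, "in"), capacity=len(G))
    for s in S:
        H.add_edge((s, "out"), "T", capacity=1)
    if "v" not in H:
        return 0  # v is isolated
    return nx.maximum_flow_value(H, "v", "T")

def is_vector_connectivity_solution(G, phi, S):
    """Check that every vertex outside S has phi(v) vertex-disjoint paths to S."""
    return all(vertex_disjoint_paths_to_set(G, u, S) >= phi.get(u, 0)
               for u in G if u not in S)

# Toy instance: star with center 0; S = two leaves, demand phi(0) = 2.
G = nx.star_graph(3)
print(is_vector_connectivity_solution(G, {0: 2}, S={1, 2}))  # True
```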

    On The Hardness of Approximate and Exact (Bichromatic) Maximum Inner Product

    In this paper we study the (Bichromatic) Maximum Inner Product Problem (Max-IP), in which we are given sets A and B of vectors, and the goal is to find a in A and b in B maximizing the inner product a * b. Max-IP is very basic and serves as the base problem in the recent breakthrough of [Abboud et al., FOCS 2017] on hardness of approximation for polynomial-time problems. It is also used (implicitly) in the argument for hardness of exact l_2-Furthest Pair (and other important problems in computational geometry) in poly-log-log dimensions in [Williams, SODA 2018]. We have three main results regarding this problem.
    - Characterization of multiplicative approximation. First, we study the best multiplicative approximation ratio for Boolean Max-IP in sub-quadratic time. We show that, for Max-IP with two sets of n vectors from {0,1}^d, there is an n^{2 - Omega(1)}-time (d/log n)^{Omega(1)}-multiplicative-approximation algorithm, and we show this is conditionally optimal, as such a (d/log n)^{o(1)}-approximation algorithm would refute SETH. A similar characterization is also achieved for additive approximation for Max-IP.
    - 2^{O(log^* n)}-dimensional hardness for exact Max-IP over the integers. Second, we revisit the hardness of solving Max-IP exactly for vectors with integer entries. We show that, under SETH, for Max-IP with sets of n vectors from Z^d for some d = 2^{O(log^* n)}, every exact algorithm requires n^{2 - o(1)} time. With the reduction from [Williams, SODA 2018], it follows that l_2-Furthest Pair and Bichromatic l_2-Closest Pair in 2^{O(log^* n)} dimensions require n^{2 - o(1)} time.
    - Connection with NP · UPP communication protocols. Last, we establish a connection between conditional lower bounds for exact Max-IP with integer entries and NP · UPP communication protocols for Set-Disjointness, parallel to the connection between conditional lower bounds for approximating Max-IP and MA communication protocols for Set-Disjointness.
    The lower bound in our first result is a direct corollary of the new MA protocol for Set-Disjointness introduced in [Rubinstein, STOC 2018], and our algorithms utilize the polynomial method and simple random sampling. Our second result follows from a new dimensionality self-reduction from the Orthogonal Vectors problem for n vectors from {0,1}^d to n vectors from Z^l where l = 2^{O(log^* d)}, dramatically improving the previous reduction in [Williams, SODA 2018]. The key technical ingredient is a recursive application of the Chinese Remainder Theorem. As a side product, we obtain an MA communication protocol for Set-Disjointness with complexity O(sqrt(n log n log log n)), slightly improving the O(sqrt(n) log n) bound of [Aaronson and Wigderson, TOCT 2009] and approaching the Omega(sqrt(n)) lower bound of [Klauck, CCC 2003]. Moreover, we show that (under SETH) one can apply the O(sqrt(n)) BQP communication protocol for Set-Disjointness to prove near-optimal hardness of approximation for Max-IP with vectors in {-1,1}^d. This answers a question from [Abboud et al., FOCS 2017] in the affirmative.
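    For concreteness, the n^2 baseline against which these conditional lower bounds are measured is just a brute-force scan over all pairs; below is a minimal numpy version of that baseline (our own illustration, not code from the paper).

```python
import numpy as np

def max_ip_exact(A, B):
    """Exact bichromatic Max-IP by brute force: compute all n^2 inner
    products with one matrix product and return the best pair. This is
    the n^2-time baseline the conditional lower bounds refer to."""
    P = A @ B.T                        # P[i, j] = <a_i, b_j>
    i, j = np.unravel_index(np.argmax(P), P.shape)
    return int(P[i, j]), i, j

rng = np.random.default_rng(0)
A = rng.integers(0, 2, size=(100, 16))   # Boolean vectors, d = 16
B = rng.integers(0, 2, size=(100, 16))
print(max_ip_exact(A, B))
```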

    Approximating solution structure of the Weighted Sentence Alignment problem

    We study the complexity of approximating the solution structure of the bijective weighted sentence alignment problem of DeNero and Klein (2008). In particular, we consider the complexity of finding an alignment that has a significant overlap with an optimal alignment. We discuss ways of representing the solution for the general weighted sentence alignment as well as the phrases-to-words alignment problem, and show that computing a string which agrees with the optimal sentence partition on more than half of the positions (plus an arbitrarily small polynomial fraction) is NP-hard for the phrases-to-words alignment. For the general weighted sentence alignment we obtain such a bound from agreement on a little over 2/3 of the bits. Additionally, we generalize the Hamming distance approximation of a solution structure to approximation with respect to the edit distance metric, obtaining similar lower bounds.
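    The two agreement measures involved, positionwise Hamming agreement between equal-length solution strings and edit distance, can be stated in a few lines. This is an illustrative sketch of the metrics only, not of the hardness constructions.

```python
def hamming_agreement(x, y):
    """Fraction of positions where two equal-length solution strings agree:
    the 'overlap with an optimal alignment' measure from the abstract."""
    assert len(x) == len(y)
    return sum(a == b for a, b in zip(x, y)) / len(x)

def edit_distance(x, y):
    """Classic single-row dynamic-programming edit distance, the second
    metric under which similar lower bounds are obtained."""
    m, n = len(x), len(y)
    d = list(range(n + 1))
    for i in range(1, m + 1):
        prev, d[0] = d[0], i
        for j in range(1, n + 1):
            prev, d[j] = d[j], min(d[j] + 1, d[j - 1] + 1,
                                   prev + (x[i - 1] != y[j - 1]))
    return d[n]

print(hamming_agreement("0110", "0100"))  # 0.75
print(edit_distance("0110", "0100"))      # 1
```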

    Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions

    Low-rank matrix approximations, such as the truncated singular value decomposition and the rank-revealing QR decomposition, play a central role in data analysis and scientific computing. This work surveys and extends recent research which demonstrates that randomization offers a powerful tool for performing low-rank matrix approximation. These techniques exploit modern computational architectures more fully than classical methods and open the possibility of dealing with truly massive data sets. This paper presents a modular framework for constructing randomized algorithms that compute partial matrix decompositions. These methods use random sampling to identify a subspace that captures most of the action of a matrix. The input matrix is then compressed—either explicitly or implicitly—to this subspace, and the reduced matrix is manipulated deterministically to obtain the desired low-rank factorization. In many cases, this approach beats its classical competitors in terms of accuracy, robustness, and/or speed. These claims are supported by extensive numerical experiments and a detailed error analysis. The specific benefits of randomized techniques depend on the computational environment. Consider the model problem of finding the k dominant components of the singular value decomposition of an m × n matrix. (i) For a dense input matrix, randomized algorithms require O(mn log(k)) floating-point operations (flops) in contrast to O(mnk) for classical algorithms. (ii) For a sparse input matrix, the flop count matches classical Krylov subspace methods, but the randomized approach is more robust and can easily be reorganized to exploit multiprocessor architectures. (iii) For a matrix that is too large to fit in fast memory, the randomized techniques require only a constant number of passes over the data, as opposed to O(k) passes for classical algorithms. In fact, it is sometimes possible to perform matrix approximation with a single pass over the data.
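    The two-stage framework the abstract describes (random sampling to capture the action of A, then deterministic factorization of the compressed matrix) reduces to a few lines of numpy. This is a minimal sketch with Gaussian test vectors and a fixed oversampling parameter p, omitting refinements such as power iterations and structured test matrices that the paper discusses.

```python
import numpy as np

def randomized_svd(A, k, p=10):
    """Basic two-stage proto-algorithm: (1) random sampling finds a
    subspace capturing most of the action of A; (2) the compressed
    matrix is factored deterministically. p is a small oversampling term."""
    m, n = A.shape
    rng = np.random.default_rng()
    Omega = rng.standard_normal((n, k + p))   # random test matrix
    Q, _ = np.linalg.qr(A @ Omega)            # orthonormal basis for the sampled range
    B = Q.T @ A                               # compress A to the subspace
    U_hat, s, Vt = np.linalg.svd(B, full_matrices=False)
    return (Q @ U_hat)[:, :k], s[:k], Vt[:k]  # rank-k approximate SVD

# Sanity check on a synthetic rank-40 matrix: error near machine precision.
rng = np.random.default_rng(1)
A = rng.standard_normal((500, 40)) @ rng.standard_normal((40, 300))
U, s, Vt = randomized_svd(A, k=40)
print(np.linalg.norm(A - (U * s) @ Vt) / np.linalg.norm(A))
```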
