Precedence-Constrained Min Sum Set Cover
We introduce a version of the Min Sum Set Cover (MSSC) problem in which there are "AND" precedence constraints on the m sets. In the Precedence-Constrained Min Sum Set Cover (PCMSSC) problem, the constraints, when interpreted as directed edges, induce a directed acyclic graph. PCMSSC models the aim of scheduling software tests to prioritize the rate of fault detection subject to dependencies between tests.
Our greedy scheme for PCMSSC is similar to the approaches of Feige, Lovász, and Tetali for MSSC, and of Chekuri and Motwani for precedence-constrained scheduling to minimize weighted completion time. With a factor-4 increase in approximation ratio, we reduce PCMSSC to the problem of finding a maximum-density precedence-closed sub-family of sets, where density is the ratio of sub-family union size to cardinality. We provide a greedy factor-sqrt(m) algorithm for maximizing density; on forests of in-trees, we show this algorithm finds an optimal solution. Harnessing an alternative greedy argument of Chekuri and Kumar for Maximum Coverage with Group Budget Constraints, on forests of out-trees, we design an algorithm with approximation ratio equal to the maximum tree height.
Finally, via a reduction from the Planted Dense Subgraph detection problem, we show that its conjectured hardness implies that no polynomial-time algorithm for PCMSSC achieves an approximation factor in O(m^{1/12-epsilon}).
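To make the scheduling model concrete, here is a toy greedy in the spirit described above, though it is not the paper's max-density subroutine: at each step it schedules, among sets whose predecessors are all done, the set covering the most still-uncovered elements. All names (`greedy_pcmssc`, `sets`, `preds`) are illustrative.

```python
def greedy_pcmssc(sets, preds, universe):
    """Schedule all sets, respecting precedence: set i is available once
    every index in preds[i] has been scheduled.  Among available sets,
    pick the one covering the most still-uncovered elements."""
    scheduled, order = set(), []
    uncovered = set(universe)
    while len(order) < len(sets):
        avail = [i for i in range(len(sets))
                 if i not in scheduled and preds[i] <= scheduled]
        best = max(avail, key=lambda i: len(sets[i] & uncovered))
        scheduled.add(best)
        order.append(best)
        uncovered -= sets[best]
    return order
```

This single-set rule can be much weaker than choosing a whole maximum-density precedence-closed sub-family; it is only meant to illustrate the objective and the constraints.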
Improved Approximation Algorithms for (Budgeted) Node-weighted Steiner Problems
Moss and Rabani[12] study constrained node-weighted Steiner tree problems
with two independent weight values associated with each node, namely, cost and
prize (or penalty). They give an O(log n)-approximation algorithm for the
prize-collecting node-weighted Steiner tree problem (PCST). They use the
algorithm for PCST to obtain a bicriteria (2, O(log n))-approximation algorithm
for the Budgeted node-weighted Steiner tree problem. Their solution may cost up
to twice the budget, but collects a factor Omega(1/log n) of the optimal prize.
We improve these results in at least two respects.
Our first main result is a primal-dual O(log h)-approximation algorithm for a
more general problem, prize-collecting node-weighted Steiner forest, where we
have h demands, each requesting connectivity between a pair of vertices. Our
algorithm can be seen as a greedy algorithm which reduces the number of demands
by choosing a structure with minimum cost-to-reduction ratio. This natural
style of argument (also used by Klein and Ravi[10] and Guha et al.[8]) leads to
a much simpler algorithm than that of Moss and Rabani[12] for PCST.
Our second main contribution is for the Budgeted node-weighted Steiner tree
problem, which also improves on [12] and [8]. In the unrooted case, we
improve upon an O(log^2(n))-approximation of [8], and present an O(log
n)-approximation algorithm without any budget violation. For the rooted case,
where a specified vertex has to appear in the solution tree, we improve the
bicriteria result of [12] to a bicriteria approximation ratio of (1+eps, O(log
n)/(eps^2)) for any positive (possibly subconstant) eps. That is, for any
permissible budget violation 1+eps, we present an algorithm achieving a
tradeoff in the guarantee for the prize. Indeed, we show that this tradeoff is
almost tight for the natural linear-programming relaxation used by us as well
as in [12].
Comment: To appear in ICALP 201
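The minimum cost-to-reduction greedy style mentioned above can be sketched abstractly. In this illustration, `structures` is a hypothetical precomputed list of (cost, satisfied-demands) candidates; the actual algorithm instead finds a minimum-ratio structure on the fly via a primal-dual subroutine.

```python
def ratio_greedy(demands, structures):
    """Repeatedly buy the candidate structure minimizing
    cost / (number of outstanding demands it satisfies).
    Assumes the structures jointly cover all demands."""
    outstanding = set(demands)
    total, picked = 0, []
    while outstanding:
        best = min((s for s in structures if s[1] & outstanding),
                   key=lambda s: s[0] / len(s[1] & outstanding))
        total += best[0]
        picked.append(best)
        outstanding -= best[1]
    return total, picked
```

The standard set-cover-style analysis of such a rule is what yields logarithmic approximation factors in the number of demands.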
On Kernelization and Approximation for the Vector Connectivity Problem
In the Vector Connectivity problem we are given an undirected graph
G = (V, E), a demand function phi: V -> {0, ..., d}, and an integer k. The
question is whether there exists a set S of at most k vertices such that every
vertex v not in S has at least phi(v) vertex-disjoint paths to S; this
abstractly captures questions about placing servers or warehouses relative to
demands. The problem is NP-hard already for instances with d = 4 (Cicalese et
al., arXiv '14), admits a log-factor approximation (Boros et al., Networks
'14), and is fixed-parameter tractable in terms of k (Lokshtanov, unpublished
'14). We prove several results regarding kernelization and approximation for
Vector Connectivity and the variant Vector d-Connectivity, where the upper
bound d on demands is a fixed constant. For Vector d-Connectivity we give a
factor-d approximation algorithm and construct a vertex-linear kernelization,
i.e., an efficient reduction to an equivalent instance with O(kd) vertices.
For Vector Connectivity we give a factor-opt approximation and show that it
admits no kernelization to size polynomial in k unless NP is in coNP/poly,
making O(kd) optimal for Vector d-Connectivity. Finally, we provide a write-up
for the fixed-parameter tractability of Vector Connectivity(k) by giving an
alternative FPT algorithm based on matroid intersection.
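The connectivity requirement itself is easy to make concrete: by Menger's theorem, the number of paths from a vertex v to a set S that are pairwise vertex-disjoint except at v equals a max flow in a unit-capacity network obtained by vertex splitting. A minimal sketch, assuming an adjacency-list graph; all names are illustrative:

```python
from collections import deque

def max_flow(cap, s, t):
    """Edmonds-Karp on a dict-of-dicts residual graph, augmenting by 1."""
    flow = 0
    while True:
        parent, q = {s: None}, deque([s])
        while q and t not in parent:            # BFS for an augmenting path
            u = q.popleft()
            for w, c in cap[u].items():
                if c > 0 and w not in parent:
                    parent[w] = u
                    q.append(w)
        if t not in parent:
            return flow
        w = t
        while parent[w] is not None:            # push one unit along the path
            u = parent[w]
            cap[u][w] -= 1
            cap[w][u] += 1
            w = u
        flow += 1

def disjoint_paths_to_set(adj, v, S):
    """Paths from v to S, pairwise vertex-disjoint except at v: split each
    vertex u into (u,'in') -> (u,'out') with capacity 1 (unbounded at v)
    and run max flow to a super-sink 'T' behind S."""
    big = len(adj)                              # effectively infinite here
    cap = {}
    def add(a, b, c):
        cap.setdefault(a, {})[b] = c
        cap.setdefault(b, {}).setdefault(a, 0)  # residual back-edge
    for u in adj:
        add((u, 'in'), (u, 'out'), big if u == v else 1)
        for w in adj[u]:
            add((u, 'out'), (w, 'in'), big)
    for s in S:
        add((s, 'out'), 'T', 1)
    return max_flow(cap, (v, 'in'), 'T')
```

Such a check only verifies a candidate solution; the algorithmic difficulty in the problem lies in choosing the set S.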
On The Hardness of Approximate and Exact (Bichromatic) Maximum Inner Product
In this paper we study the (Bichromatic) Maximum Inner Product Problem (Max-IP), in which we are given sets A and B of vectors, and the goal is to find a in A and b in B maximizing inner product a * b. Max-IP is very basic and serves as the base problem in the recent breakthrough of [Abboud et al., FOCS 2017] on hardness of approximation for polynomial-time problems. It is also used (implicitly) in the argument for hardness of exact l_2-Furthest Pair (and other important problems in computational geometry) in poly-log-log dimensions in [Williams, SODA 2018]. We have three main results regarding this problem.
- Characterization of Multiplicative Approximation. First, we study the best multiplicative approximation ratio for Boolean Max-IP in sub-quadratic time. We show that, for Max-IP with two sets of n vectors from {0,1}^{d}, there is an n^{2 - Omega(1)} time (d/log n)^{Omega(1)}-multiplicative-approximating algorithm, and we show this is conditionally optimal, as such a (d/log n)^{o(1)}-approximating algorithm would refute SETH. Similar characterization is also achieved for additive approximation for Max-IP.
- 2^{O(log^* n)}-dimensional Hardness for Exact Max-IP Over The Integers. Second, we revisit the hardness of solving Max-IP exactly for vectors with integer entries. We show that, under SETH, for Max-IP with sets of n vectors from Z^{d} for some d = 2^{O(log^* n)}, every exact algorithm requires n^{2 - o(1)} time. With the reduction from [Williams, SODA 2018], it follows that l_2-Furthest Pair and Bichromatic l_2-Closest Pair in 2^{O(log^* n)} dimensions require n^{2 - o(1)} time.
- Connection with NP · UPP Communication Protocols. Last, we establish a connection between conditional lower bounds for exact Max-IP with integer entries and NP · UPP communication protocols for Set-Disjointness, parallel to the connection between conditional lower bounds for approximating Max-IP and MA communication protocols for Set-Disjointness.
The lower bound in our first result is a direct corollary of the new MA protocol for Set-Disjointness introduced in [Rubinstein, STOC 2018], and our algorithms utilize the polynomial method and simple random sampling. Our second result follows from a new dimensionality self-reduction from the Orthogonal Vectors problem for n vectors from {0,1}^{d} to n vectors from Z^{l} where l = 2^{O(log^* d)}, dramatically improving the previous reduction in [Williams, SODA 2018]. The key technical ingredient is a recursive application of the Chinese Remainder Theorem.
As a side product, we obtain an MA communication protocol for Set-Disjointness with complexity O(sqrt{n log n log log n}), slightly improving the O(sqrt{n} log n) bound of [Aaronson and Wigderson, TOCT 2009] and approaching the Omega(sqrt{n}) lower bound of [Klauck, CCC 2003].
Moreover, we show that (under SETH) one can apply the O(sqrt{n}) BQP communication protocol for Set-Disjointness to prove near-optimal hardness of approximation for Max-IP with vectors in {-1,1}^d. This answers a question from [Abboud et al., FOCS 2017] in the affirmative.
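For concreteness, exact bichromatic Max-IP always has a trivial O(n^2 d) brute-force baseline, and the SETH-based lower bounds above say this quadratic behavior is essentially optimal for exact algorithms in the stated dimension regimes. A minimal sketch with an illustrative function name:

```python
def max_ip(A, B):
    """Exact bichromatic Max-IP: try every pair a in A, b in B
    and return the largest inner product.  O(n^2 d) time."""
    return max(sum(x * y for x, y in zip(a, b)) for a in A for b in B)
```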
Approximating solution structure of the Weighted Sentence Alignment problem
We study the complexity of approximating solution structure of the bijective
weighted sentence alignment problem of DeNero and Klein (2008). In particular,
we consider the complexity of finding an alignment that has a significant
overlap with an optimal alignment. We discuss ways of representing the solution
for the general weighted sentence alignment as well as phrases-to-words
alignment problems, and show that computing a string which agrees with the
optimal sentence partition on more than half (plus an arbitrarily small
polynomial fraction) of the positions for the phrases-to-words alignment is
NP-hard. For the general weighted sentence alignment we obtain such a bound
from agreement on a little over 2/3 of the bits. Additionally, we generalize
the Hamming-distance approximation of a solution structure to approximation
with respect to the edit distance metric, obtaining similar lower bounds.
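The Hamming-style agreement measure underlying these results is simply the fraction of positions on which a candidate solution string matches the optimum; a minimal sketch with illustrative names:

```python
def agreement(candidate, optimum):
    """Fraction of positions where two equal-length solution strings agree."""
    assert len(candidate) == len(optimum)
    return sum(a == b for a, b in zip(candidate, optimum)) / len(candidate)
```

The hardness results say that guaranteeing agreement above 1/2 (respectively, roughly 2/3) with the optimal solution is already NP-hard.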
Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions
Low-rank matrix approximations, such as the truncated singular value decomposition and the rank-revealing QR decomposition, play a central role in data analysis and scientific computing. This work surveys and extends recent research which demonstrates that randomization offers a powerful tool for performing low-rank matrix approximation. These techniques exploit modern computational architectures more fully than classical methods and open the possibility of dealing with truly massive data sets. This paper presents a modular framework for constructing randomized algorithms that compute partial matrix decompositions. These methods use random sampling to identify a subspace that captures most of the action of a matrix. The input matrix is then compressed—either explicitly or
implicitly—to this subspace, and the reduced matrix is manipulated deterministically to obtain the desired low-rank factorization. In many cases, this approach beats its classical competitors in terms of accuracy, robustness, and/or speed. These claims are supported by extensive numerical experiments and a detailed error analysis. The specific benefits of randomized techniques depend on the computational environment. Consider the model problem of finding the k dominant components of the singular value decomposition of an m × n matrix. (i) For a dense input matrix, randomized algorithms require O(mn log(k))
floating-point operations (flops) in contrast to O(mnk) for classical algorithms. (ii) For a sparse input matrix, the flop count matches classical Krylov subspace methods, but the randomized approach is more robust and can easily be reorganized to exploit multiprocessor architectures. (iii) For a matrix that is too large to fit in fast memory, the randomized techniques require only a constant number of passes over the data, as opposed to O(k) passes for classical algorithms. In fact, it is sometimes possible to perform matrix approximation with a single pass over the data
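The two-stage framework described above (randomized range finding, then a small deterministic factorization) can be sketched in a few lines of NumPy; the `oversample` value and fixed seed are illustrative choices, not prescriptions from the paper:

```python
import numpy as np

def randomized_svd(A, k, oversample=10, seed=0):
    """Stage 1: sample the range of A with a Gaussian test matrix and
    orthonormalize.  Stage 2: compress A to that subspace and take an
    exact SVD of the small matrix, lifting the left factors back."""
    rng = np.random.default_rng(seed)
    omega = rng.standard_normal((A.shape[1], k + oversample))
    Q, _ = np.linalg.qr(A @ omega)   # orthonormal basis for the sampled range
    B = Q.T @ A                      # (k + oversample) x n compressed matrix
    U_hat, s, Vt = np.linalg.svd(B, full_matrices=False)
    return Q @ U_hat[:, :k], s[:k], Vt[:k, :]
```

On a matrix of exact rank k, the sampled subspace captures the range almost surely, so the factorization is exact up to floating-point error; for general matrices the oversampling parameter controls the failure probability.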