
    Distributed PCP Theorems for Hardness of Approximation in P

    We present a new distributed model of probabilistically checkable proofs (PCP). A satisfying assignment $x \in \{0,1\}^n$ to a CNF formula $\varphi$ is shared between two parties, where Alice knows $x_1, \dots, x_{n/2}$, Bob knows $x_{n/2+1}, \dots, x_n$, and both parties know $\varphi$. The goal is to have Alice and Bob jointly write a PCP that $x$ satisfies $\varphi$, while exchanging little or no information. Unfortunately, this model as-is does not allow for nontrivial query complexity. Instead, we focus on a non-deterministic variant, where the players are helped by Merlin, a third party who knows all of $x$. Using our framework, we obtain, for the first time, PCP-like reductions from the Strong Exponential Time Hypothesis (SETH) to approximation problems in P. In particular, under SETH we show that there are no truly-subquadratic approximation algorithms for Bichromatic Maximum Inner Product over {0,1}-vectors, Bichromatic LCS Closest Pair over permutations, Approximate Regular Expression Matching, and Diameter in Product Metric. All our inapproximability factors are nearly-tight. In particular, for the first two problems we obtain nearly-polynomial factors of $2^{(\log n)^{1-o(1)}}$; only $(1+o(1))$-factor lower bounds (under SETH) were known before.
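    For orientation, here is a minimal Python sketch (not from the paper; names are illustrative) of the trivial quadratic-time baseline for Bichromatic Maximum Inner Product over {0,1}-vectors, the problem for which the reduction above rules out truly-subquadratic algorithms even with a $2^{(\log n)^{1-o(1)}}$ approximation factor, assuming SETH:

        # Naive quadratic baseline: compare every pair (a, b) with a in A, b in B.
        def bichromatic_max_inner_product(A, B):
            """A, B: lists of equal-length 0/1 tuples; returns the largest inner product <a, b>."""
            best = 0
            for a in A:
                for b in B:
                    best = max(best, sum(x * y for x, y in zip(a, b)))
            return best

        # Example:
        # bichromatic_max_inner_product([(1, 0, 1, 1), (0, 1, 0, 0)],
        #                               [(1, 1, 1, 0), (0, 0, 1, 1)])  # -> 2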

    Four Soviets Walk the Dog: Improved Bounds for Computing the Fréchet Distance

    Given two polygonal curves in the plane, there are many ways to define a notion of similarity between them. One popular measure is the Fréchet distance. Since it was proposed by Alt and Godau in 1992, many variants and extensions have been studied. Nonetheless, even more than 20 years later, the original $O(n^2 \log n)$ algorithm by Alt and Godau for computing the Fréchet distance remains the state of the art (here, $n$ denotes the number of edges on each curve). This has led Helmut Alt to conjecture that the associated decision problem is 3SUM-hard. In recent work, Agarwal et al. show how to break the quadratic barrier for the discrete version of the Fréchet distance, where one considers sequences of points instead of polygonal curves. Building on their work, we give a randomized algorithm to compute the Fréchet distance between two polygonal curves in time $O(n^2 \sqrt{\log n}\,(\log\log n)^{3/2})$ on a pointer machine and in time $O(n^2 (\log\log n)^2)$ on a word RAM. Furthermore, we show that there exists an algebraic decision tree for the decision problem of depth $O(n^{2-\varepsilon})$, for some $\varepsilon > 0$. We believe that this reveals an intriguing new aspect of this well-studied problem. Finally, we show how to obtain the first subquadratic algorithm for computing the weak Fréchet distance on a word RAM. Comment: 34 pages, 15 figures. A preliminary version appeared in SODA 2014.
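    For context on the quadratic baseline, the discrete variant mentioned above (sequences of points instead of curves) admits a classic $O(nm)$ dynamic program (commonly attributed to Eiter and Mannila); the Python sketch below shows that baseline, not the subquadratic techniques of the paper:

        from math import dist  # Euclidean distance between two points (Python 3.8+)

        def discrete_frechet(P, Q):
            """P, Q: non-empty lists of 2D points; returns their discrete Fréchet distance."""
            n, m = len(P), len(Q)
            INF = float("inf")
            # dp[i][j] = discrete Fréchet distance between P[:i+1] and Q[:j+1]
            dp = [[INF] * m for _ in range(n)]
            for i in range(n):
                for j in range(m):
                    d = dist(P[i], Q[j])
                    if i == 0 and j == 0:
                        dp[i][j] = d
                    else:
                        prev = min(dp[i - 1][j] if i > 0 else INF,
                                   dp[i][j - 1] if j > 0 else INF,
                                   dp[i - 1][j - 1] if i > 0 and j > 0 else INF)
                        dp[i][j] = max(prev, d)
            return dp[n - 1][m - 1]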

    Optimization with Sparsity-Inducing Penalties

    Sparse estimation methods are aimed at using or obtaining parsimonious representations of data or models. They were first dedicated to linear variable selection but numerous extensions have now emerged such as structured sparsity or kernel selection. It turns out that many of the related estimation problems can be cast as convex optimization problems by regularizing the empirical risk with appropriate non-smooth norms. The goal of this paper is to present from a general perspective optimization tools and techniques dedicated to such sparsity-inducing penalties. We cover proximal methods, block-coordinate descent, reweighted $\ell_2$-penalized techniques, working-set and homotopy methods, as well as non-convex formulations and extensions, and provide an extensive set of experiments to compare various algorithms from a computational point of view.
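    As a concrete instance of the proximal methods surveyed, here is a minimal NumPy sketch (illustrative, not code from the paper) of ISTA for the $\ell_1$-regularized least-squares (lasso) problem, where the proximal operator of the $\ell_1$ norm is elementwise soft-thresholding:

        import numpy as np

        def soft_threshold(v, t):
            """Proximal operator of t * ||.||_1: elementwise soft-thresholding."""
            return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

        def ista_lasso(X, y, lam, step, n_iter=500):
            """Proximal-gradient (ISTA) sketch for
            minimize_w 0.5 * ||X w - y||^2 + lam * ||w||_1.
            'step' should be at most 1 / ||X||_2^2 for convergence."""
            w = np.zeros(X.shape[1])
            for _ in range(n_iter):
                grad = X.T @ (X @ w - y)                         # gradient of the smooth part
                w = soft_threshold(w - step * grad, step * lam)  # proximal step on the l1 term
            return w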

    Improved Approximation for Longest Common Subsequence over Small Alphabets

    This paper investigates the approximability of the Longest Common Subsequence (LCS) problem. The fastest algorithm for solving the LCS problem exactly runs in essentially quadratic time in the length of the input, and it is known that under the Strong Exponential Time Hypothesis the quadratic running time cannot be beaten. There are no such limitations for the approximate computation of the LCS, however, except in some limited scenarios. There is also a scarcity of approximation algorithms. When the two given strings are over an alphabet of size k, returning the subsequence formed by the most frequent symbol occurring in both strings achieves a 1/k approximation for the LCS. It is an open problem whether a better than 1/k approximation can be achieved in truly subquadratic time (O(n^{2-δ}) time for constant δ > 0). A recent result [Rubinstein and Song, SODA 2020] showed that a 1/2+ε approximation for the LCS over a binary alphabet is possible in truly subquadratic time, provided the input strings have the same length. In this paper we show that if a 1/2+ε approximation (for ε > 0) is achievable for binary LCS in truly subquadratic time when the input strings can be unequal, then for every constant k, there is a truly subquadratic time algorithm that achieves a 1/k+δ approximation for k-ary alphabet LCS for some δ > 0. Thus the binary case is the hardest. We also show that for every constant k, if one is given two strings of equal length over a k-ary alphabet, one can obtain a 1/k+δ approximation for some constant δ > 0 in truly subquadratic time, thus extending the Rubinstein and Song result to all alphabets of constant size.
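    The 1/k baseline mentioned above (return the subsequence formed by the most frequent symbol occurring in both strings) is simple to state; a minimal Python sketch, with names chosen here for illustration:

        from collections import Counter

        def single_symbol_lcs_approx(s, t):
            """Return a common subsequence consisting of a single repeated symbol.
            Over an alphabet of size k this has length at least LCS(s, t) / k,
            i.e. it is the 1/k-approximation described above."""
            cs, ct = Counter(s), Counter(t)
            best_char, best_len = None, 0
            for c in cs:
                common = min(cs[c], ct[c])  # Counter returns 0 for missing symbols
                if common > best_len:
                    best_char, best_len = c, common
            return best_char * best_len if best_char is not None else ""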

    Approximating Approximate Pattern Matching

    Given a text $T$ of length $n$ and a pattern $P$ of length $m$, the approximate pattern matching problem asks for computation of a particular \emph{distance} function between $P$ and every $m$-substring of $T$. We consider a $(1\pm\varepsilon)$ multiplicative approximation variant of this problem, for the $\ell_p$ distance function. In this paper, we describe two $(1+\varepsilon)$-approximate algorithms with a runtime of $\widetilde{O}(\frac{n}{\varepsilon})$ for all (constant) non-negative values of $p$. For constant $p \ge 1$ we show a deterministic $(1+\varepsilon)$-approximation algorithm. Previously, such a running time was known only for the case of the $\ell_1$ distance, by Gawrychowski and Uznański [ICALP 2018], and only with a randomized algorithm. For constant $0 \le p \le 1$ we show a randomized algorithm for $\ell_p$, thereby providing a smooth tradeoff between the algorithms of Kopelowitz and Porat [FOCS 2015, SOSA 2018] for the Hamming distance (the case $p=0$) and of Gawrychowski and Uznański for the $\ell_1$ distance.
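    For reference, the exact values being approximated can be computed naively in $O(nm)$ time; the Python sketch below is only that baseline (illustrative; the normalization convention for $0 < p < 1$ may differ from the paper's exact definition), in contrast to the $\widetilde{O}(n/\varepsilon)$ approximation algorithms described above:

        def sliding_lp_distances(T, P, p):
            """Exact l_p distance between P and every |P|-substring of T (naive O(n*m)).
            T, P: sequences of numbers; p = 0 is interpreted as the Hamming distance."""
            n, m = len(T), len(P)
            out = []
            for i in range(n - m + 1):
                if p == 0:
                    d = sum(1 for j in range(m) if T[i + j] != P[j])
                else:
                    d = sum(abs(T[i + j] - P[j]) ** p for j in range(m)) ** (1.0 / p)
                out.append(d)
            return out

        # Example:
        # sliding_lp_distances([1, 3, 2, 5, 4], [2, 4], p=1)  # -> [2.0, 3.0, 1.0, 3.0]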

    On the Complexity of Exact Pattern Matching in Graphs: Binary Strings and Bounded Degree

    Exact pattern matching in labeled graphs is the problem of searching for paths of a graph $G=(V,E)$ that spell the same string as the pattern $P[1..m]$. This basic problem can be found at the heart of more complex operations on variation graphs in computational biology, of query operations in graph databases, and of analysis operations in heterogeneous networks, where the nodes of some paths must match a sequence of labels or types. We describe a simple conditional lower bound showing that, for any constant $\epsilon>0$, neither an $O(|E|^{1-\epsilon}\,m)$-time nor an $O(|E|\,m^{1-\epsilon})$-time algorithm for exact pattern matching on graphs, with node labels and patterns drawn from a binary alphabet, can be achieved unless the Strong Exponential Time Hypothesis (SETH) is false. The result holds even if restricted to undirected graphs of maximum degree three or directed acyclic graphs of maximum sum of indegree and outdegree three. Although a conditional lower bound of this kind can be somehow derived from previous results (Backurs and Indyk, FOCS'16), we give a direct reduction from SETH for dissemination purposes, as the result might interest researchers from several areas, such as computational biology, graph databases, and graph mining, as mentioned before. Indeed, as approximate pattern matching on graphs can be solved in $O(|E|\,m)$ time, exact and approximate matching are thus equally hard (quadratic time) on graphs under the SETH assumption. In comparison, the same problems restricted to strings have linear-time vs quadratic-time solutions, respectively, where the latter have a matching SETH lower bound on computing the edit distance of two strings (Backurs and Indyk, STOC'15). Comment: Using Lemma 12 and Lemma 13 might be enough to prove Lemma 14. However, the proof of Lemma 14 is correct if one assumes that the graph used in the reduction is a DAG. Hence, since the problem is already quadratic for a DAG and a binary alphabet, it has to be quadratic also for a general graph and a binary alphabet.
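    For orientation, the $O(|E|\,m)$ upper bound referenced above corresponds to a simple dynamic program over pattern positions; a minimal Python sketch (interface and names are illustrative, not taken from the paper):

        def graph_pattern_match(num_nodes, edges, labels, P):
            """Decide whether some path in a node-labeled directed graph spells the
            non-empty pattern P. edges: list of directed pairs (u, v); labels[v] is
            the label of node v. Runs in O(|E| * m) time, the bound the SETH
            reduction above shows is essentially optimal."""
            m = len(P)
            # reachable[v] is True iff P[:i+1] can be spelled by a path ending at v
            reachable = [labels[v] == P[0] for v in range(num_nodes)]
            for i in range(1, m):
                nxt = [False] * num_nodes
                for (u, v) in edges:
                    if reachable[u] and labels[v] == P[i]:
                        nxt[v] = True
                reachable = nxt
            return any(reachable)

        # Example: the path graph 0 -> 1 -> 2 labeled "a", "b", "a" contains "ba":
        # graph_pattern_match(3, [(0, 1), (1, 2)], ["a", "b", "a"], "ba")  # -> True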