1,159 research outputs found

    Lower bounds for approximation schemes for Closest String

    Get PDF
    In the Closest String problem one is given a family S\mathcal S of equal-length strings over some fixed alphabet, and the task is to find a string yy that minimizes the maximum Hamming distance between yy and a string from S\mathcal S. While polynomial-time approximation schemes (PTASes) for this problem are known for a long time [Li et al., J. ACM'02], no efficient polynomial-time approximation scheme (EPTAS) has been proposed so far. In this paper, we prove that the existence of an EPTAS for Closest String is in fact unlikely, as it would imply that FPT=W[1]\mathrm{FPT}=\mathrm{W}[1], a highly unexpected collapse in the hierarchy of parameterized complexity classes. Our proof also shows that the existence of a PTAS for Closest String with running time f(ε)no(1/ε)f(\varepsilon)\cdot n^{o(1/\varepsilon)}, for any computable function ff, would contradict the Exponential Time Hypothesis

    Approximate Hamming distance in a stream

    Get PDF
    We consider the problem of computing a (1+ϵ)(1+\epsilon)-approximation of the Hamming distance between a pattern of length nn and successive substrings of a stream. We first look at the one-way randomised communication complexity of this problem, giving Alice the first half of the stream and Bob the second half. We show the following: (1) If Alice and Bob both share the pattern then there is an O(ϵ4log2n)O(\epsilon^{-4} \log^2 n) bit randomised one-way communication protocol. (2) If only Alice has the pattern then there is an O(ϵ2nlogn)O(\epsilon^{-2}\sqrt{n}\log n) bit randomised one-way communication protocol. We then go on to develop small space streaming algorithms for (1+ϵ)(1+\epsilon)-approximate Hamming distance which give worst case running time guarantees per arriving symbol. (1) For binary input alphabets there is an O(ϵ3nlog2n)O(\epsilon^{-3} \sqrt{n} \log^{2} n) space and O(ϵ2logn)O(\epsilon^{-2} \log{n}) time streaming (1+ϵ)(1+\epsilon)-approximate Hamming distance algorithm. (2) For general input alphabets there is an O(ϵ5nlog4n)O(\epsilon^{-5} \sqrt{n} \log^{4} n) space and O(ϵ4log3n)O(\epsilon^{-4} \log^3 {n}) time streaming (1+ϵ)(1+\epsilon)-approximate Hamming distance algorithm.Comment: Submitted to ICALP' 201

    Online Pattern Matching for String Edit Distance with Moves

    Full text link
    Edit distance with moves (EDM) is a string-to-string distance measure that includes substring moves in addition to ordinal editing operations to turn one string to the other. Although optimizing EDM is intractable, it has many applications especially in error detections. Edit sensitive parsing (ESP) is an efficient parsing algorithm that guarantees an upper bound of parsing discrepancies between different appearances of the same substrings in a string. ESP can be used for computing an approximate EDM as the L1 distance between characteristic vectors built by node labels in parsing trees. However, ESP is not applicable to a streaming text data where a whole text is unknown in advance. We present an online ESP (OESP) that enables an online pattern matching for EDM. OESP builds a parse tree for a streaming text and computes the L1 distance between characteristic vectors in an online manner. For the space-efficient computation of EDM, OESP directly encodes the parse tree into a succinct representation by leveraging the idea behind recent results of a dynamic succinct tree. We experimentally test OESP on the ability to compute EDM in an online manner on benchmark datasets, and we show OESP's efficiency.Comment: This paper has been accepted to the 21st edition of the International Symposium on String Processing and Information Retrieval (SPIRE2014

    On Computing Centroids According to the p-Norms of Hamming Distance Vectors

    Get PDF
    In this paper we consider the p-Norm Hamming Centroid problem which asks to determine whether some given strings have a centroid with a bound on the p-norm of its Hamming distances to the strings. Specifically, given a set S of strings and a real k, we consider the problem of determining whether there exists a string s^* with (sum_{s in S} d^{p}(s^*,s))^(1/p) <=k, where d(,) denotes the Hamming distance metric. This problem has important applications in data clustering and multi-winner committee elections, and is a generalization of the well-known polynomial-time solvable Consensus String (p=1) problem, as well as the NP-hard Closest String (p=infty) problem. Our main result shows that the problem is NP-hard for all fixed rational p > 1, closing the gap for all rational values of p between 1 and infty. Under standard complexity assumptions the reduction also implies that the problem has no 2^o(n+m)-time or 2^o(k^(p/(p+1)))-time algorithm, where m denotes the number of input strings and n denotes the length of each string, for any fixed p > 1. The first bound matches a straightforward brute-force algorithm. The second bound is tight in the sense that for each fixed epsilon > 0, we provide a 2^(k^(p/((p+1))+epsilon))-time algorithm. In the last part of the paper, we complement our hardness result by presenting a fixed-parameter algorithm and a factor-2 approximation algorithm for the problem

    Approximation and Parameterized Complexity of Minimax Approval Voting

    Full text link
    We present three results on the complexity of Minimax Approval Voting. First, we study Minimax Approval Voting parameterized by the Hamming distance dd from the solution to the votes. We show Minimax Approval Voting admits no algorithm running in time O(2o(dlogd))\mathcal{O}^\star(2^{o(d\log d)}), unless the Exponential Time Hypothesis (ETH) fails. This means that the O(d2d)\mathcal{O}^\star(d^{2d}) algorithm of Misra et al. [AAMAS 2015] is essentially optimal. Motivated by this, we then show a parameterized approximation scheme, running in time O((3/ϵ)2d)\mathcal{O}^\star(\left({3}/{\epsilon}\right)^{2d}), which is essentially tight assuming ETH. Finally, we get a new polynomial-time randomized approximation scheme for Minimax Approval Voting, which runs in time nO(1/ϵ2log(1/ϵ))poly(m)n^{\mathcal{O}(1/\epsilon^2 \cdot \log(1/\epsilon))} \cdot \mathrm{poly}(m), almost matching the running time of the fastest known PTAS for Closest String due to Ma and Sun [SIAM J. Comp. 2009].Comment: 14 pages, 3 figures, 2 pseudocode

    Streaming and Small Space Approximation Algorithms for Edit Distance and Longest Common Subsequence

    Get PDF
    corecore