23,453 research outputs found

    Consensus Strings with Small Maximum Distance and Small Distance Sum

    Get PDF
    The parameterised complexity of consensus string problems (Closest String, Closest Substring, Closest String with Outliers) is investigated in a more general setting, i. e., with a bound on the maximum Hamming distance and a bound on the sum of Hamming distances between solution and input strings. We completely settle the parameterised complexity of these generalised variants of Closest String and Closest Substring, and partly for Closest String with Outliers; in addition, we answer some open questions from the literature regarding the classical problem variants with only one distance bound. Finally, we investigate the question of polynomial kernels and respective lower bounds

    Lower bounds for approximation schemes for Closest String

    Get PDF
    In the Closest String problem one is given a family S\mathcal S of equal-length strings over some fixed alphabet, and the task is to find a string yy that minimizes the maximum Hamming distance between yy and a string from S\mathcal S. While polynomial-time approximation schemes (PTASes) for this problem are known for a long time [Li et al., J. ACM'02], no efficient polynomial-time approximation scheme (EPTAS) has been proposed so far. In this paper, we prove that the existence of an EPTAS for Closest String is in fact unlikely, as it would imply that FPT=W[1]\mathrm{FPT}=\mathrm{W}[1], a highly unexpected collapse in the hierarchy of parameterized complexity classes. Our proof also shows that the existence of a PTAS for Closest String with running time f(Δ)⋅no(1/Δ)f(\varepsilon)\cdot n^{o(1/\varepsilon)}, for any computable function ff, would contradict the Exponential Time Hypothesis

    On Computing Centroids According to the p-Norms of Hamming Distance Vectors

    Get PDF
    In this paper we consider the p-Norm Hamming Centroid problem which asks to determine whether some given strings have a centroid with a bound on the p-norm of its Hamming distances to the strings. Specifically, given a set S of strings and a real k, we consider the problem of determining whether there exists a string s^* with (sum_{s in S} d^{p}(s^*,s))^(1/p) <=k, where d(,) denotes the Hamming distance metric. This problem has important applications in data clustering and multi-winner committee elections, and is a generalization of the well-known polynomial-time solvable Consensus String (p=1) problem, as well as the NP-hard Closest String (p=infty) problem. Our main result shows that the problem is NP-hard for all fixed rational p > 1, closing the gap for all rational values of p between 1 and infty. Under standard complexity assumptions the reduction also implies that the problem has no 2^o(n+m)-time or 2^o(k^(p/(p+1)))-time algorithm, where m denotes the number of input strings and n denotes the length of each string, for any fixed p > 1. The first bound matches a straightforward brute-force algorithm. The second bound is tight in the sense that for each fixed epsilon > 0, we provide a 2^(k^(p/((p+1))+epsilon))-time algorithm. In the last part of the paper, we complement our hardness result by presenting a fixed-parameter algorithm and a factor-2 approximation algorithm for the problem

    Interacting Agents and Continuous Opinions Dynamics

    Full text link
    We present a model of opinion dynamics in which agents adjust continuous opinions as a result of random binary encounters whenever their difference in opinion is below a given threshold. High thresholds yield convergence of opinions towards an average opinion, whereas low thresholds result in several opinion clusters. The model is further generalised to network interactions, threshold heterogeneity, adaptive thresholds and binary strings of opinions.Comment: 21 pages, 13 figures. http://www.lps.ens.fr/~weisbuch/contopidyn/contopidyn.htm

    Sequence alignment, mutual information, and dissimilarity measures for constructing phylogenies

    Get PDF
    Existing sequence alignment algorithms use heuristic scoring schemes which cannot be used as objective distance metrics. Therefore one relies on measures like the p- or log-det distances, or makes explicit, and often simplistic, assumptions about sequence evolution. Information theory provides an alternative, in the form of mutual information (MI) which is, in principle, an objective and model independent similarity measure. MI can be estimated by concatenating and zipping sequences, yielding thereby the "normalized compression distance". So far this has produced promising results, but with uncontrolled errors. We describe a simple approach to get robust estimates of MI from global pairwise alignments. Using standard alignment algorithms, this gives for animal mitochondrial DNA estimates that are strikingly close to estimates obtained from the alignment free methods mentioned above. Our main result uses algorithmic (Kolmogorov) information theory, but we show that similar results can also be obtained from Shannon theory. Due to the fact that it is not additive, normalized compression distance is not an optimal metric for phylogenetics, but we propose a simple modification that overcomes the issue of additivity. We test several versions of our MI based distance measures on a large number of randomly chosen quartets and demonstrate that they all perform better than traditional measures like the Kimura or log-det (resp. paralinear) distances. Even a simplified version based on single letter Shannon entropies, which can be easily incorporated in existing software packages, gave superior results throughout the entire animal kingdom. But we see the main virtue of our approach in a more general way. For example, it can also help to judge the relative merits of different alignment algorithms, by estimating the significance of specific alignments.Comment: 19 pages + 16 pages of supplementary materia

    Approximation and Parameterized Complexity of Minimax Approval Voting

    Full text link
    We present three results on the complexity of Minimax Approval Voting. First, we study Minimax Approval Voting parameterized by the Hamming distance dd from the solution to the votes. We show Minimax Approval Voting admits no algorithm running in time O⋆(2o(dlog⁥d))\mathcal{O}^\star(2^{o(d\log d)}), unless the Exponential Time Hypothesis (ETH) fails. This means that the O⋆(d2d)\mathcal{O}^\star(d^{2d}) algorithm of Misra et al. [AAMAS 2015] is essentially optimal. Motivated by this, we then show a parameterized approximation scheme, running in time O⋆((3/Ï”)2d)\mathcal{O}^\star(\left({3}/{\epsilon}\right)^{2d}), which is essentially tight assuming ETH. Finally, we get a new polynomial-time randomized approximation scheme for Minimax Approval Voting, which runs in time nO(1/Ï”2⋅log⁥(1/Ï”))⋅poly(m)n^{\mathcal{O}(1/\epsilon^2 \cdot \log(1/\epsilon))} \cdot \mathrm{poly}(m), almost matching the running time of the fastest known PTAS for Closest String due to Ma and Sun [SIAM J. Comp. 2009].Comment: 14 pages, 3 figures, 2 pseudocode
    • 

    corecore