Consensus Strings with Small Maximum Distance and Small Distance Sum
The parameterised complexity of consensus string problems (Closest String, Closest Substring, Closest String with Outliers) is investigated in a more general setting, i.e., with a bound on the maximum Hamming distance and a bound on the sum of Hamming distances between solution and input strings. We completely settle the parameterised complexity of these generalised variants of Closest String and Closest Substring, and partly for Closest String with Outliers; in addition, we answer some open questions from the literature regarding the classical problem variants with only one distance bound. Finally, we investigate the question of polynomial kernels and respective lower bounds.
Lower bounds for approximation schemes for Closest String
In the Closest String problem one is given a family S of
equal-length strings over some fixed alphabet, and the task is to find a string
y that minimizes the maximum Hamming distance between y and a string from
S. While polynomial-time approximation schemes (PTASes) for this
problem have been known for a long time [Li et al., J. ACM'02], no efficient
polynomial-time approximation scheme (EPTAS) has been proposed so far. In this
paper, we prove that the existence of an EPTAS for Closest String is in fact
unlikely, as it would imply that FPT = W[1], a highly
unexpected collapse in the hierarchy of parameterized complexity classes. Our
proof also shows that the existence of a PTAS for Closest String with running
time f(epsilon) * n^o(1/epsilon), for any computable function
f, would contradict the Exponential Time Hypothesis.
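The problem statement in the abstract above can be made concrete with a small exhaustive search. This is only an illustrative sketch, not an algorithm from the paper: the function names `hamming` and `closest_string` are hypothetical, and the running time is exponential in the string length, so it is usable only on toy inputs.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def closest_string(strings, alphabet=None):
    """Try every candidate string and keep one with the smallest radius.

    The radius of a candidate is its maximum Hamming distance to an input
    string. By default only letters that occur in the input are tried; a
    letter that occurs nowhere mismatches every input string at that
    position, so it can never improve the radius.
    """
    length = len(strings[0])
    if alphabet is None:
        alphabet = sorted(set("".join(strings)))
    best, best_radius = None, length + 1
    for candidate in product(alphabet, repeat=length):
        radius = max(hamming(candidate, s) for s in strings)
        if radius < best_radius:
            best, best_radius = "".join(candidate), radius
    return best, best_radius
```

For example, `closest_string(["aaa", "aab", "abb"])` examines all eight strings over {a, b} of length 3 and reports a centre at maximum distance 1 from the inputs.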
On Computing Centroids According to the p-Norms of Hamming Distance Vectors
In this paper we consider the p-Norm Hamming Centroid problem which asks to determine whether some given strings have a centroid with a bound on the p-norm of its Hamming distances to the strings. Specifically, given a set S of strings and a real k, we consider the problem of determining whether there exists a string s^* with (sum_{s in S} d^{p}(s^*,s))^(1/p) <= k, where d(·,·) denotes the Hamming distance metric. This problem has important applications in data clustering and multi-winner committee elections, and is a generalization of the well-known polynomial-time solvable Consensus String (p=1) problem, as well as the NP-hard Closest String (p=infty) problem.
Our main result shows that the problem is NP-hard for all fixed rational p > 1, closing the gap for all rational values of p between 1 and infty. Under standard complexity assumptions the reduction also implies that the problem has no 2^o(n+m)-time or 2^o(k^(p/(p+1)))-time algorithm, where m denotes the number of input strings and n denotes the length of each string, for any fixed p > 1. The first bound matches a straightforward brute-force algorithm. The second bound is tight in the sense that for each fixed epsilon > 0, we provide a 2^(k^(p/(p+1)+epsilon))-time algorithm. In the last part of the paper, we complement our hardness result by presenting a fixed-parameter algorithm and a factor-2 approximation algorithm for the problem.
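The decision condition stated in the abstract, (sum_{s in S} d^{p}(s^*,s))^(1/p) <= k, can be evaluated directly for a given candidate. The sketch below is a hedged illustration with hypothetical names (`p_norm_cost`, `has_centroid`); the exhaustive search is an assumption made for demonstration and is not the fixed-parameter or approximation algorithm of the paper.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def p_norm_cost(candidate, strings, p):
    """p-norm of the vector of Hamming distances from candidate to strings.

    p = float("inf") gives the maximum distance (the Closest String
    objective); p = 1 gives the distance sum (the Consensus String objective).
    """
    distances = [hamming(candidate, s) for s in strings]
    if p == float("inf"):
        return float(max(distances))
    return sum(d ** p for d in distances) ** (1.0 / p)


def has_centroid(strings, k, p, alphabet=None):
    """Exhaustively decide whether some string has p-norm cost at most k."""
    length = len(strings[0])
    if alphabet is None:
        alphabet = sorted(set("".join(strings)))
    return any(
        p_norm_cost(candidate, strings, p) <= k
        for candidate in product(alphabet, repeat=length)
    )
```

With the strings 000, 011 and 101, the candidate 001 is at distance 1 from each, so its 2-norm cost is the square root of 3, and no other binary string of length 3 does better.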
Interacting Agents and Continuous Opinions Dynamics
We present a model of opinion dynamics in which agents adjust continuous
opinions as a result of random binary encounters whenever their difference in
opinion is below a given threshold. High thresholds yield convergence of
opinions towards an average opinion, whereas low thresholds result in several
opinion clusters. The model is further generalised to network interactions,
threshold heterogeneity, adaptive thresholds and binary strings of opinions.
Comment: 21 pages, 13 figures.
http://www.lps.ens.fr/~weisbuch/contopidyn/contopidyn.htm
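The basic mechanism described in this abstract (random pairwise encounters that change opinions only when the two opinions differ by less than a threshold) can be sketched as a short simulation. The abstract does not state the update rule, so the rule used here is an assumption: both agents move toward each other by a fraction `mu` of their difference. The function name `simulate` and all parameter names and defaults are hypothetical.

```python
import random


def simulate(n_agents=200, threshold=0.5, mu=0.5, steps=50000, seed=0):
    """Bounded-confidence opinion dynamics on a fully mixed population.

    Opinions start uniformly at random in [0, 1). At each step two distinct
    agents meet; if their opinions differ by less than `threshold`, each
    moves toward the other by the fraction `mu` of the difference
    (assumed rule, not taken from the abstract).
    """
    rng = random.Random(seed)
    opinions = [rng.random() for _ in range(n_agents)]
    for _ in range(steps):
        i, j = rng.sample(range(n_agents), 2)
        diff = opinions[j] - opinions[i]
        if abs(diff) < threshold:
            opinions[i] += mu * diff
            opinions[j] -= mu * diff
    return opinions
```

Because the assumed update is symmetric, the population mean is conserved; with a threshold large enough that every pair can interact, the opinions collapse onto that mean, which matches the high-threshold behaviour described in the abstract.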
Sequence alignment, mutual information, and dissimilarity measures for constructing phylogenies
Existing sequence alignment algorithms use heuristic scoring schemes which
cannot be used as objective distance metrics. Therefore one relies on measures
like the p- or log-det distances, or makes explicit, and often simplistic,
assumptions about sequence evolution. Information theory provides an
alternative, in the form of mutual information (MI) which is, in principle, an
objective and model independent similarity measure. MI can be estimated by
concatenating and zipping sequences, yielding thereby the "normalized
compression distance". So far this has produced promising results, but with
uncontrolled errors. We describe a simple approach to get robust estimates of
MI from global pairwise alignments. Using standard alignment algorithms, this
gives for animal mitochondrial DNA estimates that are strikingly close to
estimates obtained from the alignment free methods mentioned above. Our main
result uses algorithmic (Kolmogorov) information theory, but we show that
similar results can also be obtained from Shannon theory. Due to the fact that
it is not additive, normalized compression distance is not an optimal metric
for phylogenetics, but we propose a simple modification that overcomes the
issue of additivity. We test several versions of our MI based distance measures
on a large number of randomly chosen quartets and demonstrate that they all
perform better than traditional measures like the Kimura or log-det (resp.
paralinear) distances. Even a simplified version based on single letter Shannon
entropies, which can be easily incorporated in existing software packages, gave
superior results throughout the entire animal kingdom. But we see the main
virtue of our approach in a more general way. For example, it can also help to
judge the relative merits of different alignment algorithms, by estimating the
significance of specific alignments.
Comment: 19 pages + 16 pages of supplementary material
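The "concatenating and zipping" estimate mentioned in this abstract is commonly written as NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is a compressed size. The sketch below uses that standard formula with Python's `zlib` as the compressor, which is an assumption and not necessarily the compressor used by the authors. The helpers `random_dna` and `mutate` are hypothetical and exist only to build toy inputs; the alignment-based MI estimates proposed in the paper are not reproduced here.

```python
import random
import zlib


def compressed_size(data):
    """Length in bytes of the zlib-compressed input (level 9)."""
    return len(zlib.compress(data, 9))


def ncd(x, y):
    """Normalized compression distance between two byte strings."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def random_dna(length, seed):
    """Toy input: a uniformly random sequence over A, C, G, T."""
    rng = random.Random(seed)
    return "".join(rng.choice("ACGT") for _ in range(length)).encode("ascii")


def mutate(seq, n_sites, seed):
    """Toy input: redraw the base at n_sites random positions.

    A redrawn base may coincide with the original one.
    """
    rng = random.Random(seed)
    out = bytearray(seq)
    for pos in rng.sample(range(len(out)), n_sites):
        out[pos] = rng.choice(b"ACGT")
    return bytes(out)
```

A sequence and a lightly mutated copy of it should come out closer under `ncd` than two independently generated sequences, since the compressor can reuse most of the first sequence when encoding the second.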
Approximation and Parameterized Complexity of Minimax Approval Voting
We present three results on the complexity of Minimax Approval Voting. First,
we study Minimax Approval Voting parameterized by the Hamming distance d from
the solution to the votes. We show Minimax Approval Voting admits no algorithm
running in time O*(2^o(d log d)), unless the Exponential
Time Hypothesis (ETH) fails. This means that the O*(d^(2d))
algorithm of Misra et al. [AAMAS 2015] is essentially optimal. Motivated by
this, we then show a parameterized approximation scheme, running in time
O*((3/epsilon)^(2d)), which is essentially
tight assuming ETH. Finally, we get a new polynomial-time randomized
approximation scheme for Minimax Approval Voting, which runs in time
n^O(1/epsilon^2 * log(1/epsilon)) * poly(m),
almost matching the running time of the fastest known PTAS for Closest String
due to Ma and Sun [SIAM J. Comp. 2009].
Comment: 14 pages, 3 figures, 2 pseudocode
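For orientation, the Minimax Approval Voting objective can be illustrated by exhaustive search: each vote is a 0/1 string marking approved candidates, and the goal is a committee of exactly k candidates whose maximum Hamming distance to a vote is as small as possible. This is a toy sketch under that standard definition, which the abstract itself does not spell out; the name `minimax_committee` is hypothetical, and none of the algorithms from the paper is implemented.

```python
from itertools import combinations


def minimax_committee(votes, k):
    """Try every committee of size k and keep one with the smallest radius.

    votes: list of equal-length strings over "0" and "1".
    Returns (committee, radius), where committee is a tuple of candidate
    indices and radius is its maximum Hamming distance to a vote.
    """
    m = len(votes[0])
    best, best_radius = None, m + 1
    for committee in combinations(range(m), k):
        chosen = set(committee)
        # A position counts as a mismatch when approval and membership disagree.
        radius = max(
            sum((vote[c] == "1") != (c in chosen) for c in range(m))
            for vote in votes
        )
        if radius < best_radius:
            best, best_radius = committee, radius
    return best, best_radius
```

For the votes 1100, 1010 and 0110 with k = 3, the committee of the first three candidates differs from every vote in exactly one position.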