Consensus Strings with Small Maximum Distance and Small Distance Sum
The parameterised complexity of consensus string problems (Closest String, Closest Substring, Closest String with Outliers) is investigated in a more general setting, i.e., with a bound on the maximum Hamming distance and a bound on the sum of Hamming distances between the solution and the input strings. We completely settle the parameterised complexity of these generalised variants of Closest String and Closest Substring, and partly that of Closest String with Outliers; in addition, we answer some open questions from the literature regarding the classical problem variants with only one distance bound. Finally, we investigate the question of polynomial kernels and respective lower bounds.
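The two bounds discussed above are easy to state concretely. The following is a minimal sketch, with illustrative names of our own choosing (not from the paper), of checking a candidate consensus string against both a maximum-distance bound and a distance-sum bound:

```python
def hamming(a: str, b: str) -> int:
    """Hamming distance between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def is_consensus(candidate: str, strings: list[str],
                 max_bound: int, sum_bound: int) -> bool:
    """True iff candidate satisfies both distance bounds at once."""
    dists = [hamming(candidate, s) for s in strings]
    return max(dists) <= max_bound and sum(dists) <= sum_bound

# "abab" is within Hamming distance 1 of each input string,
# and the distances sum to 2.
inputs = ["abab", "abbb", "aaab"]
print(is_consensus("abab", inputs, max_bound=1, sum_bound=2))  # True
```

The generalised problems ask for a string satisfying both bounds simultaneously; verifying a candidate is trivial, and the hardness lies in finding one.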
A Computational Method for the Rate Estimation of Evolutionary Transpositions
Genome rearrangements are evolutionary events that shuffle genomic
architectures. The most frequent genome rearrangements are reversals,
translocations, fusions, and fissions. While there are more complex genome
rearrangements, such as transpositions, they are rarely observed and are
believed to constitute only a small fraction of the genome rearrangements
happening in the course of evolution. The analysis of transpositions is
further obfuscated by the intractability of the underlying computational
problems.
We propose a computational method for estimating the rate of transpositions
in evolutionary scenarios between genomes. We applied our method to a set of
mammalian genomes and estimated the transposition rate in mammalian evolution
to be around 0.26. Comment: Proceedings of the 3rd International Work-Conference
on Bioinformatics and Biomedical Engineering (IWBBIO), 2015 (to appear).
Inapproximability of maximal strip recovery
In comparative genomics, the first step of sequence analysis is usually to
decompose two or more genomes into syntenic blocks, that is, segments of
homologous chromosomes. For the reliable recovery of syntenic blocks, noise and
ambiguities in the genomic maps need to be removed first. Maximal Strip
Recovery (MSR) is an optimization problem proposed by Zheng, Zhu, and Sankoff
for reliably recovering syntenic blocks from genomic maps in the midst of noise
and ambiguities. Given d genomic maps as sequences of gene markers, the
objective of MSR-d is to find d subsequences, one subsequence of each
genomic map, such that the total length of syntenic blocks in these
subsequences is maximized. For any constant d, a polynomial-time
2d-approximation for MSR-d was previously known. In this paper, we show that
MSR-d is APX-hard for any fixed d, even for the most basic version of the
problem in which all gene markers are distinct and appear in positive
orientation in each genomic map. Moreover, we provide the first explicit lower
bounds on approximating MSR-d. In particular, we show that
MSR-d is NP-hard to approximate within an explicit constant factor. From the
other direction, we show that the previous 2d-approximation for MSR-d can be
optimized into a polynomial-time algorithm even if d is not a constant but is
part of the input. We then extend our inapproximability results to several
related problems, including CMSR-d, δ-gap-MSR-d, and
δ-gap-CMSR-d. Comment: A preliminary version of this paper appeared in two parts in the
Proceedings of the 20th International Symposium on Algorithms and Computation
(ISAAC 2009) and the Proceedings of the 4th International Frontiers of
Algorithmics Workshop (FAW 2010).
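The objective above can be illustrated on a toy instance with d = 2. The sketch below, with names of our own choosing (not the paper's), decomposes a pair of subsequences over distinct markers into strips, i.e. maximal runs that appear consecutively in both sequences in the same order (reversed strips are ignored for simplicity):

```python
def strip_lengths(a: list[int], b: list[int]) -> list[int]:
    """Lengths of maximal runs of a that are also consecutive in b.
    Assumes a and b contain the same distinct markers."""
    pos = {g: i for i, g in enumerate(b)}  # position of each marker in b
    strips, cur = [], 1
    for x, y in zip(a, a[1:]):
        if pos[y] == pos[x] + 1:
            cur += 1           # y directly follows x in b: extend the strip
        else:
            strips.append(cur)  # strip broken: record its length
            cur = 1
    strips.append(cur)
    return strips

# Marker 5 interrupts the run 1 2 | 3 4, leaving strips of lengths 2, 1, 2.
print(strip_lengths([1, 2, 5, 3, 4], [1, 2, 3, 4, 5]))  # [2, 1, 2]
```

In MSR, only strips of length at least two count as syntenic blocks, so here the recovered total length would be 4; the optimization problem is to choose the subsequences that maximize this total.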
Approximating Weighted Duo-Preservation in Comparative Genomics
Motivated by comparative genomics, Chen et al. [9] introduced the Maximum
Duo-preservation String Mapping (MDSM) problem, in which we are given two
strings over the same alphabet and the goal is to find a
mapping between them that maximizes the number of duos preserved. A
duo is any two consecutive characters in a string, and it is preserved by the
mapping if its two consecutive characters in one string are mapped to the same two
consecutive characters in the other string. The MDSM problem is known to be NP-hard, and
there are approximation algorithms for this problem [3, 5, 13], but all of them
consider only the "unweighted" version of the problem, in the sense that a duo
in one string is preserved by mapping it to any identical duo in the other string,
regardless of their positions in the respective strings. However, it is well-desired
in comparative genomics to find mappings that preserve duos that are "closer" to
each other under some distance measure [19]. In this paper, we introduce a
generalized version of the problem, called the Maximum-Weight Duo-preservation
String Mapping (MWDSM) problem, that captures both duo preservation and
duo distance, in the sense that mapping a duo to each
preserved duo has a weight indicating the "closeness" of the two
duos. The objective of the MWDSM problem is to find a mapping that maximizes
the total weight of preserved duos. We give a polynomial-time
6-approximation algorithm for this problem. Comment: Appeared in the proceedings
of the 23rd International Computing and Combinatorics Conference (COCOON 2017).
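The preservation condition above is easy to make concrete. A minimal sketch, with a mapping represented as a dictionary from positions in one string to positions in the other (our own illustrative encoding, not the paper's):

```python
def preserved_duos(s: str, t: str, mapping: dict[int, int]) -> int:
    """Count duos of s preserved by the given position mapping into t."""
    count = 0
    for i in range(len(s) - 1):
        j = mapping.get(i)
        if j is None or mapping.get(i + 1) != j + 1:
            continue  # positions i, i+1 are not mapped to consecutive positions
        if s[i] == t[j] and s[i + 1] == t[j + 1]:
            count += 1  # duo s[i]s[i+1] is preserved as t[j]t[j+1]
    return count

# Mapping "ab" in s = "abc" onto "ab" in t = "cab" preserves one duo.
print(preserved_duos("abc", "cab", {0: 1, 1: 2}))  # 1
```

In the weighted variant, each such preserved pair of duos would contribute its weight rather than 1, with weights encoding the "closeness" of the two positions.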
Reversal Distances for Strings with Few Blocks or Small Alphabets
We study the String Reversal Distance problem, an extension of the well-known Sorting by Reversals problem. String Reversal Distance takes two strings S and T as input and asks for a minimum number of reversals to obtain T from S. We consider four variants: String Reversal Distance, String Prefix Reversal Distance (in which any reversal must include the first letter of the string), and the signed variants of these problems, namely Signed String Reversal Distance and Signed String Prefix Reversal Distance. We study algorithmic properties of these four problems in connection with two parameters of the input strings: the number of blocks they contain (a block being a maximal substring in which all letters are equal) and the alphabet size |Σ|. For instance, we show that Signed String Reversal Distance and Signed String Prefix Reversal Distance are NP-hard even if the input strings have only one letter.
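The block parameter mentioned above is straightforward to compute. A quick sketch (names are ours, for illustration): a block is a maximal run of equal letters, so "aaabba" has the three blocks aaa, bb, a.

```python
def num_blocks(s: str) -> int:
    """Number of maximal runs of equal letters in s."""
    return sum(1 for i, c in enumerate(s) if i == 0 or s[i - 1] != c)

def reversal(s: str, i: int, j: int) -> str:
    """Reverse the substring s[i..j] (inclusive); with i = 0 this is a
    prefix reversal, the operation allowed in the prefix variants."""
    return s[:i] + s[i:j + 1][::-1] + s[j + 1:]

print(num_blocks("aaabba"))          # 3
print(reversal("aaabba", 0, 4))      # prefix reversal of the first 5 letters
```

A reversal can merge at most a bounded number of adjacent blocks, which is what makes the block count a natural parameter for these distance problems.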
A combinatorial algorithm for microbial consortia synthetic design
Synthetic biology has boomed since the early 2000s, when it was first shown that compounds of interest could be synthesized much more rapidly and effectively by using organisms other than those naturally producing them. However, engineering a single organism, often a microbe, to optimise one or a collection of metabolic tasks may lead to difficulties when attempting to obtain an efficient production system, or to avoid toxic effects for the recruited microorganism. The idea of using a microbial consortium instead has thus started being developed in the last decade, motivated by the fact that such consortia may perform more complicated functions than single populations can, and be more robust to environmental fluctuations. However, success is not always guaranteed. In particular, establishing which consortium is best for the production of a given compound or set of compounds remains a great challenge. This is the problem we address in this paper. We introduce an initial model and a method that enable us to propose a consortium to synthetically produce compounds that are either exogenous to it, or are endogenous but where interaction among the species in the consortium could improve the production line. Synthetic biology has been defined by the European Commission as "the application of science, technology, and engineering to facilitate and accelerate the design, manufacture, and/or modification of genetic materials in living organisms to alter living or nonliving materials". It is a field that has boomed since the early 2000s, when in particular Jay Keasling showed that it was possible to efficiently synthesize a compound, artemisinic acid, which after a few more steps leads to an effective anti-malaria drug, artemisinin.
More Natural Models of Electoral Control by Partition
"Control" studies attempts to set the outcome of elections through the
addition, deletion, or partition of voters or candidates. The set of benchmark
control types was largely set in the seminal 1992 paper by Bartholdi, Tovey,
and Trick that introduced control, and there now is a large literature studying
how many of the benchmark types various election systems are vulnerable to,
i.e., have polynomial-time attack algorithms for.
However, although the longstanding benchmark models of addition and deletion
capture relatively well the real-world settings that inspire them, the
longstanding benchmark models of partition are arguably quite distant from
the settings they seek to capture.
In this paper, we introduce, and for some important cases analyze the
complexity of, new partition models that seek to better capture many real-world
partition settings. In particular, in many partition settings one wants the two
parts of the partition to be of (almost) equal size, or is partitioning into
more than two parts, or has groups of actors who must be placed in the same
part of the partition. Our hope is that these new partition types will
allow studies of control attacks to include models that more realistically
capture many settings.
Treatment of electrical status epilepticus in sleep : A pooled analysis of 575 cases
OBJECTIVE: Epileptic encephalopathy with electrical status epilepticus in sleep (ESES) is a pediatric epilepsy syndrome with sleep-induced epileptic discharges and acquired impairment of cognition or behavior. Treatment of ESES is assumed to improve cognitive outcome. The aim of this study is to create an overview of the current evidence for different treatment regimens in children with ESES syndrome. METHODS: A literature search using PubMed and Embase was performed. Articles were selected that contain original treatment data of patients with ESES syndrome. Authors were contacted for additional information. Individual patient data were collected, coded, and analyzed using logistic regression analysis. The three predefined main outcome measures were improvement in cognitive function, electroencephalography (EEG) pattern, and any improvement (cognition or EEG). RESULTS: The literature search yielded 1,766 articles. After applying inclusion and exclusion criteria, 112 articles and 950 treatments in 575 patients could be analyzed. Antiepileptic drugs (AEDs, n = 495) were associated with improvement (i.e., cognition or EEG) in 49% of patients, benzodiazepines (n = 171) in 68%, and steroids (n = 166) in 81%. Surgery (n = 62) resulted in improvement in 90% of patients. In a subgroup analysis of patients who were consecutively reported (585 treatments in 282 patients), we found improvement in a smaller proportion treated with AEDs (34%), benzodiazepines (59%), and steroids (75%), whereas the improvement percentage after surgery was preserved (93%). Possible predictors of improved outcome were treatment category, normal development before ESES onset, and the absence of structural abnormalities. SIGNIFICANCE: Although most included studies were small and retrospective and their heterogeneity allowed analysis of only qualitative outcome data, this pooled analysis suggests superior efficacy of steroids and surgery in encephalopathy with ESES
Triangle Counting in Dynamic Graph Streams
Estimating the number of triangles in graph streams using a limited amount of
memory has become a popular topic in the last decade. Different variations of
the problem have been studied, depending on whether the graph edges are
provided in an arbitrary order or as incidence lists. However, with a few
exceptions, the algorithms have considered insert-only streams. We
present a new algorithm estimating the number of triangles in dynamic
graph streams where edges can be both inserted and deleted. We show that our
algorithm achieves better time and space complexity than previous solutions for
various graph classes, for example sparse graphs with a relatively small number
of triangles. Also, for graphs with constant transitivity coefficient, a common
situation in real graphs, this is the first algorithm achieving constant
processing time per edge. The result is achieved by a novel approach combining
sampling of vertex triples and sparsification of the input graph. In the course
of the analysis of the algorithm we present a lower bound on the number of
pairwise independent 2-paths in general graphs which might be of independent
interest. At the end of the paper we discuss lower bounds on the space
complexity of triangle counting algorithms that make no assumptions on the
structure of the graph. Comment: New version of a SWAT 2014 paper with improved results.
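The vertex-triple sampling ingredient mentioned above can be illustrated on a static graph. The following is a toy sketch of that idea only, not the paper's dynamic-stream algorithm (all names are ours): sample random triples of vertices and scale the fraction that form triangles by the total number of triples.

```python
import itertools
import random

def estimate_triangles(n, edges, samples=20000, seed=0):
    """Estimate the triangle count of an n-vertex graph by sampling
    vertex triples uniformly at random (requires n >= 3)."""
    rng = random.Random(seed)
    edge_set = set(map(frozenset, edges))
    hits = 0
    for _ in range(samples):
        u, v, w = rng.sample(range(n), 3)
        if {frozenset((u, v)), frozenset((v, w)),
                frozenset((u, w))} <= edge_set:
            hits += 1  # the sampled triple is a triangle
    total_triples = n * (n - 1) * (n - 2) // 6  # C(n, 3)
    return hits * total_triples / samples

# The complete graph K4 has exactly 4 triangles; here every sampled
# triple is closed, so the estimate is exact.
k4 = list(itertools.combinations(range(4), 2))
print(round(estimate_triangles(4, k4)))  # 4
```

Plain triple sampling needs many samples when triangles are rare; this is the motivation for combining it with sparsification of the input graph, as the paper does, and for handling deletions, which this static sketch ignores.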