221 research outputs found

    Consensus Strings with Small Maximum Distance and Small Distance Sum

    The parameterised complexity of consensus string problems (Closest String, Closest Substring, Closest String with Outliers) is investigated in a more general setting, i.e., with a bound on the maximum Hamming distance and a bound on the sum of Hamming distances between solution and input strings. We completely settle the parameterised complexity of these generalised variants of Closest String and Closest Substring, and partly for Closest String with Outliers; in addition, we answer some open questions from the literature regarding the classical problem variants with only one distance bound. Finally, we investigate the question of polynomial kernels and respective lower bounds.
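The two distance bounds of this generalised setting can be illustrated with a small sketch for the Closest String case. The names `hamming` and `satisfies_bounds` and the parameters `d_max` and `d_sum` are assumptions chosen for illustration and do not come from the paper. The code only verifies a given candidate against both bounds; it does not search for a solution.

```python
from typing import Sequence


def hamming(a: str, b: str) -> int:
    """Hamming distance between two strings of equal length."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(x != y for x, y in zip(a, b))


def satisfies_bounds(candidate: str, strings: Sequence[str],
                     d_max: int, d_sum: int) -> bool:
    """Check a candidate consensus string against both distance bounds.

    d_max bounds the largest Hamming distance to any input string,
    d_sum bounds the sum of all Hamming distances (hypothetical names).
    """
    dists = [hamming(candidate, s) for s in strings]
    return max(dists, default=0) <= d_max and sum(dists) <= d_sum
```

For instance, the candidate `"abcd"` has distances 0, 1 and 1 to the inputs `"abcd"`, `"abce"` and `"bbcd"`, so it satisfies the bounds `d_max = 1`, `d_sum = 2` but not `d_max = 1`, `d_sum = 1`.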

    A Computational Method for the Rate Estimation of Evolutionary Transpositions

    Genome rearrangements are evolutionary events that shuffle genomic architectures. The most frequent genome rearrangements are reversals, translocations, fusions, and fissions. While there are some more complex genome rearrangements such as transpositions, they are rarely observed and believed to constitute only a small fraction of the genome rearrangements happening in the course of evolution. The analysis of transpositions is further obfuscated by the intractability of the underlying computational problems. We propose a computational method for estimating the rate of transpositions in evolutionary scenarios between genomes. We applied our method to a set of mammalian genomes and estimated the transposition rate in mammalian evolution to be around 0.26. Comment: Proceedings of the 3rd International Work-Conference on Bioinformatics and Biomedical Engineering (IWBBIO), 2015. (to appear)

    Inapproximability of maximal strip recovery

    In comparative genomics, the first step of sequence analysis is usually to decompose two or more genomes into syntenic blocks that are segments of homologous chromosomes. For the reliable recovery of syntenic blocks, noise and ambiguities in the genomic maps need to be removed first. Maximal Strip Recovery (MSR) is an optimization problem proposed by Zheng, Zhu, and Sankoff for reliably recovering syntenic blocks from genomic maps in the midst of noise and ambiguities. Given $d$ genomic maps as sequences of gene markers, the objective of MSR-$d$ is to find $d$ subsequences, one subsequence of each genomic map, such that the total length of syntenic blocks in these subsequences is maximized. For any constant $d \ge 2$, a polynomial-time $2d$-approximation for MSR-$d$ was previously known. In this paper, we show that for any $d \ge 2$, MSR-$d$ is APX-hard, even for the most basic version of the problem in which all gene markers are distinct and appear in positive orientation in each genomic map. Moreover, we provide the first explicit lower bounds on approximating MSR-$d$ for all $d \ge 2$. In particular, we show that MSR-$d$ is NP-hard to approximate within $\Omega(d/\log d)$. From the other direction, we show that the previous $2d$-approximation for MSR-$d$ can be optimized into a polynomial-time algorithm even if $d$ is not a constant but is part of the input. We then extend our inapproximability results to several related problems including CMSR-$d$, $\delta$-gap-MSR-$d$, and $\delta$-gap-CMSR-$d$. Comment: A preliminary version of this paper appeared in two parts in the Proceedings of the 20th International Symposium on Algorithms and Computation (ISAAC 2009) and the Proceedings of the 4th International Frontiers of Algorithmics Workshop (FAW 2010)

    Approximating Weighted Duo-Preservation in Comparative Genomics

    Motivated by comparative genomics, Chen et al. [9] introduced the Maximum Duo-preservation String Mapping (MDSM) problem in which we are given two strings $s_1$ and $s_2$ from the same alphabet and the goal is to find a mapping $\pi$ between them so as to maximize the number of duos preserved. A duo is any two consecutive characters in a string and it is preserved in the mapping if its two consecutive characters in $s_1$ are mapped to the same two consecutive characters in $s_2$. The MDSM problem is known to be NP-hard and there are approximation algorithms for this problem [3, 5, 13], but all of them consider only the "unweighted" version of the problem in the sense that a duo from $s_1$ is preserved by mapping to any same duo in $s_2$ regardless of their positions in the respective strings. However, it is desirable in comparative genomics to find mappings that consider preserving duos that are "closer" to each other under some distance measure [19]. In this paper, we introduce a generalized version of the problem, called the Maximum-Weight Duo-preservation String Mapping (MWDSM) problem, that captures both duos-preservation and duos-distance measures in the sense that mapping a duo from $s_1$ to each preserved duo in $s_2$ has a weight, indicating the "closeness" of the two duos. The objective of the MWDSM problem is to find a mapping so as to maximize the total weight of preserved duos. In this paper, we give a polynomial-time 6-approximation algorithm for this problem. Comment: Appeared in proceedings of the 23rd International Computing and Combinatorics Conference (COCOON 2017)
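The notion of a preserved duo in the unweighted problem can be made concrete with a short sketch. It assumes, as an interpretation of the abstract, that the mapping is a bijection between positions of the two strings that sends each character to an equal character; the function name `preserved_duos` and the list encoding of the mapping are hypothetical. The code counts the duos preserved by a given mapping and is not the approximation algorithm of the paper.

```python
from typing import Sequence


def preserved_duos(s1: str, s2: str, pi: Sequence[int]) -> int:
    """Count duos of s1 preserved by the position mapping pi.

    pi[i] is the position in s2 to which position i of s1 is mapped.
    Assumption: pi is a bijection and s1[i] == s2[pi[i]] for every i.
    """
    n = len(s1)
    if len(s2) != n or sorted(pi) != list(range(n)):
        raise ValueError("pi must be a bijection between positions")
    if any(s1[i] != s2[pi[i]] for i in range(n)):
        raise ValueError("pi must map each character to an equal character")
    # The duo (i, i + 1) is preserved when its two characters land on
    # consecutive positions of s2, in the same order.
    return sum(pi[i + 1] == pi[i] + 1 for i in range(n - 1))
```

With `s1 = "abab"`, `s2 = "baba"` and `pi = [1, 2, 3, 0]`, the duos starting at positions 0 and 1 of `s1` are preserved and the duo starting at position 2 is not, so the count is 2.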

    Reversal Distances for Strings with Few Blocks or Small Alphabets

    We study the String Reversal Distance problem, an extension of the well-known Sorting by Reversals problem. String Reversal Distance takes two strings S and T as input, and asks for a minimum number of reversals to obtain T from S. We consider four variants: String Reversal Distance, String Prefix Reversal Distance (in which any reversal must include the first letter of the string), and the signed variants of these problems, namely Signed String Reversal Distance and Signed String Prefix Reversal Distance. We study algorithmic properties of these four problems, in connection with two parameters of the input strings: the number of blocks they contain (a block being a maximal substring such that all letters in the substring are equal), and the alphabet size Σ. For instance, we show that Signed String Reversal Distance and Signed String Prefix Reversal Distance are NP-hard even if the input strings have only one letter.
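The block parameter and the two unsigned operations can be sketched in a few lines. The function names `count_blocks`, `reverse` and `prefix_reverse` and the 0-based inclusive indexing are assumptions made for illustration; the sketch covers only unsigned strings and does not compute a reversal distance.

```python
from itertools import groupby


def count_blocks(s: str) -> int:
    """Number of maximal substrings of s consisting of one repeated letter."""
    return sum(1 for _ in groupby(s))


def reverse(s: str, i: int, j: int) -> str:
    """Reverse the substring s[i..j] (0-based, both ends inclusive)."""
    if not 0 <= i <= j < len(s):
        raise ValueError("need 0 <= i <= j < len(s)")
    return s[:i] + s[i:j + 1][::-1] + s[j + 1:]


def prefix_reverse(s: str, j: int) -> str:
    """Reverse the first j letters of s, so the first letter is always included."""
    if not 1 <= j <= len(s):
        raise ValueError("need 1 <= j <= len(s)")
    return reverse(s, 0, j - 1)
```

For example, `"aabbbac"` consists of the four blocks `aa`, `bbb`, `a` and `c`, and reversing the prefix of length 3 of `"abcde"` gives `"cbade"`.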

    A combinatorial algorithm for microbial consortia synthetic design

    Synthetic biology has boomed since the early 2000s when it started being shown that it was possible to efficiently synthesise compounds of interest in a much more rapid and effective way by using other organisms than those naturally producing them. However, engineering a single organism, often a microbe, to optimise one or a collection of metabolic tasks may lead to difficulties when attempting to obtain a production system that is efficient, or to avoid toxic effects for the recruited microorganism. The idea of using instead a microbial consortium has thus started being developed in the last decade. This was motivated by the fact that such consortia may perform more complicated functions than could single populations and be more robust to environmental fluctuations. Success is however not always guaranteed. In particular, establishing which consortium is best for the production of a given compound or set thereof remains a great challenge. This is the problem we address in this paper. We thus introduce an initial model and a method that make it possible to propose a consortium to synthetically produce compounds that are either exogenous to it, or are endogenous but where interaction among the species in the consortium could improve the production line. Synthetic biology has been defined by the European Commission as "the application of science, technology, and engineering to facilitate and accelerate the design, manufacture, and/or modification of genetic materials in living organisms to alter living or nonliving materials". It is a field that has boomed since the early 2000s when in particular Jay Keasling showed that it was possible to efficiently synthesise a compound, artemisinic acid, which after a few more tricks then leads to an effective anti-malaria drug, artemisinin

    More Natural Models of Electoral Control by Partition

    "Control" studies attempts to set the outcome of elections through the addition, deletion, or partition of voters or candidates. The set of benchmark control types was largely set in the seminal 1992 paper by Bartholdi, Tovey, and Trick that introduced control, and there now is a large literature studying how many of the benchmark types various election systems are vulnerable to, i.e., have polynomial-time attack algorithms for. However, although the longstanding benchmark models of addition and deletion model relatively well the real-world settings that inspire them, the longstanding benchmark models of partition model settings that are arguably quite distant from those they seek to capture. In this paper, we introduce--and for some important cases analyze the complexity of--new partition models that seek to better capture many real-world partition settings. In particular, in many partition settings one wants the two parts of the partition to be of (almost) equal size, or is partitioning into more than two parts, or has groups of actors who must be placed in the same part of the partition. Our hope is that having these new partition types will allow studies of control attacks to include such models that more realistically capture many settings.

    Treatment of electrical status epilepticus in sleep : A pooled analysis of 575 cases

    OBJECTIVE: Epileptic encephalopathy with electrical status epilepticus in sleep (ESES) is a pediatric epilepsy syndrome with sleep-induced epileptic discharges and acquired impairment of cognition or behavior. Treatment of ESES is assumed to improve cognitive outcome. The aim of this study is to create an overview of the current evidence for different treatment regimens in children with ESES syndrome. METHODS: A literature search using PubMed and Embase was performed. Articles were selected that contain original treatment data of patients with ESES syndrome. Authors were contacted for additional information. Individual patient data were collected, coded, and analyzed using logistic regression analysis. The three predefined main outcome measures were improvement in cognitive function, electroencephalography (EEG) pattern, and any improvement (cognition or EEG). RESULTS: The literature search yielded 1,766 articles. After applying inclusion and exclusion criteria, 112 articles and 950 treatments in 575 patients could be analyzed. Antiepileptic drugs (AEDs, n = 495) were associated with improvement (i.e., cognition or EEG) in 49% of patients, benzodiazepines (n = 171) in 68%, and steroids (n = 166) in 81%. Surgery (n = 62) resulted in improvement in 90% of patients. In a subgroup analysis of patients who were consecutively reported (585 treatments in 282 patients), we found improvement in a smaller proportion treated with AEDs (34%), benzodiazepines (59%), and steroids (75%), whereas the improvement percentage after surgery was preserved (93%). Possible predictors of improved outcome were treatment category, normal development before ESES onset, and the absence of structural abnormalities. SIGNIFICANCE: Although most included studies were small and retrospective and their heterogeneity allowed analysis of only qualitative outcome data, this pooled analysis suggests superior efficacy of steroids and surgery in encephalopathy with ESES.

    Triangle Counting in Dynamic Graph Streams

    Estimating the number of triangles in graph streams using a limited amount of memory has become a popular topic in the last decade. Different variations of the problem have been studied, depending on whether the graph edges are provided in an arbitrary order or as incidence lists. However, with a few exceptions, the algorithms have considered insert-only streams. We present a new algorithm estimating the number of triangles in dynamic graph streams where edges can be both inserted and deleted. We show that our algorithm achieves better time and space complexity than previous solutions for various graph classes, for example sparse graphs with a relatively small number of triangles. Also, for graphs with constant transitivity coefficient, a common situation in real graphs, this is the first algorithm achieving constant processing time per edge. The result is achieved by a novel approach combining sampling of vertex triples and sparsification of the input graph. In the course of the analysis of the algorithm we present a lower bound on the number of pairwise independent 2-paths in general graphs which might be of independent interest. At the end of the paper we discuss lower bounds on the space complexity of triangle counting algorithms that make no assumptions on the structure of the graph. Comment: New version of a SWAT 2014 paper with improved results
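To make the dynamic setting concrete, the sketch below maintains the exact triangle count while edges are inserted and deleted. This is a plain baseline and not the algorithm of the paper: it stores the whole graph, so its memory grows with the number of edges, whereas the streaming algorithm described above estimates the count in limited memory by sampling and sparsification. The class name `ExactTriangleCounter` and its methods are hypothetical.

```python
from collections import defaultdict


class ExactTriangleCounter:
    """Exact triangle count of a simple undirected graph under edge updates."""

    def __init__(self) -> None:
        self.adj = defaultdict(set)  # vertex -> set of neighbours
        self.triangles = 0

    def insert(self, u, v) -> None:
        """Insert edge {u, v}; self-loops and duplicate edges are ignored."""
        if u == v or v in self.adj[u]:
            return
        # Every common neighbour of u and v closes one new triangle.
        self.triangles += len(self.adj[u] & self.adj[v])
        self.adj[u].add(v)
        self.adj[v].add(u)

    def delete(self, u, v) -> None:
        """Delete edge {u, v}; missing edges are ignored."""
        if v not in self.adj[u]:
            return
        self.adj[u].discard(v)
        self.adj[v].discard(u)
        # Every common neighbour of u and v formed a triangle with the edge.
        self.triangles -= len(self.adj[u] & self.adj[v])
```

A stream of updates is then processed by calling `insert` or `delete` once per edge event and reading `triangles` at any point; each update costs time proportional to the smaller of the two neighbourhoods in typical set implementations, which is the per-edge cost that sampling-based estimators aim to avoid.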