28 research outputs found

    Inapproximability of maximal strip recovery

    Get PDF
    In comparative genomic, the first step of sequence analysis is usually to decompose two or more genomes into syntenic blocks that are segments of homologous chromosomes. For the reliable recovery of syntenic blocks, noise and ambiguities in the genomic maps need to be removed first. Maximal Strip Recovery (MSR) is an optimization problem proposed by Zheng, Zhu, and Sankoff for reliably recovering syntenic blocks from genomic maps in the midst of noise and ambiguities. Given dd genomic maps as sequences of gene markers, the objective of \msr{d} is to find dd subsequences, one subsequence of each genomic map, such that the total length of syntenic blocks in these subsequences is maximized. For any constant d2d \ge 2, a polynomial-time 2d-approximation for \msr{d} was previously known. In this paper, we show that for any d2d \ge 2, \msr{d} is APX-hard, even for the most basic version of the problem in which all gene markers are distinct and appear in positive orientation in each genomic map. Moreover, we provide the first explicit lower bounds on approximating \msr{d} for all d2d \ge 2. In particular, we show that \msr{d} is NP-hard to approximate within Ω(d/logd)\Omega(d/\log d). From the other direction, we show that the previous 2d-approximation for \msr{d} can be optimized into a polynomial-time algorithm even if dd is not a constant but is part of the input. We then extend our inapproximability results to several related problems including \cmsr{d}, \gapmsr{\delta}{d}, and \gapcmsr{\delta}{d}.Comment: A preliminary version of this paper appeared in two parts in the Proceedings of the 20th International Symposium on Algorithms and Computation (ISAAC 2009) and the Proceedings of the 4th International Frontiers of Algorithmics Workshop (FAW 2010

    A 2-Approximation Algorithm for the Complementary Maximal Strip Recovery Problem

    Get PDF
    The Maximal Strip Recovery problem (MSR) and its complementary (CMSR) are well-studied NP-hard problems in computational genomics. The input of these dual problems are two signed permutations. The goal is to delete some gene markers from both permutations, such that, in the remaining permutations, each gene marker has at least one common neighbor. Equivalently, the resulting permutations could be partitioned into common strips of length at least two. Then MSR is to maximize the number of remaining genes, while the objective of CMSR is to delete the minimum number of gene markers. In this paper, we present a new approximation algorithm for the Complementary Maximal Strip Recovery (CMSR) problem. Our approximation factor is 2, improving the currently best 7/3-approximation algorithm. Although the improvement on the factor is not huge, the analysis is greatly simplified by a compensating method, commonly referred to as the non-oblivious local search technique. In such a method a substitution may not always increase the value of the current solution (it sometimes may even decrease the solution value), though it always improves the value of another function seemingly unrelated to the objective function

    LIPIcs, Volume 248, ISAAC 2022, Complete Volume

    Get PDF
    LIPIcs, Volume 248, ISAAC 2022, Complete Volum

    27th Annual European Symposium on Algorithms: ESA 2019, September 9-11, 2019, Munich/Garching, Germany

    Get PDF

    Algorithms approaching the threshold for semi-random planted clique

    Full text link
    We design new polynomial-time algorithms for recovering planted cliques in the semi-random graph model introduced by Feige and Kilian~\cite{FK01}. The previous best algorithms for this model succeed if the planted clique has size at least n2/3n^{2/3} in a graph with nn vertices (Mehta, Mckenzie, Trevisan, 2019 and Charikar, Steinhardt, Valiant 2017). Our algorithms work for planted-clique sizes approaching n1/2n^{1/2} -- the information-theoretic threshold in the semi-random model~\cite{steinhardt2017does} and a conjectured computational threshold even in the easier fully-random model. This result comes close to resolving open questions by Feige and Steinhardt. Our algorithms are based on higher constant degree sum-of-squares relaxation and rely on a new conceptual connection that translates certificates of upper bounds on biclique numbers in \emph{unbalanced} bipartite Erd\H{o}s--R\'enyi random graphs into algorithms for semi-random planted clique. The use of a higher-constant degree sum-of-squares is essential in our setting: we prove a lower bound on the basic SDP for certifying bicliques that shows that the basic SDP cannot succeed for planted cliques of size k=o(n2/3)k =o(n^{2/3}). We also provide some evidence that the information-computation trade-off of our current algorithms may be inherent by proving an average-case lower bound for unbalanced bicliques in the low-degree-polynomials model.Comment: 51 pages, the arxiv landing page contains a shortened abstrac

    LIPIcs, Volume 244, ESA 2022, Complete Volume

    Get PDF
    LIPIcs, Volume 244, ESA 2022, Complete Volum

    Self-Evaluation Applied Mathematics 2003-2008 University of Twente

    Get PDF
    This report contains the self-study for the research assessment of the Department of Applied Mathematics (AM) of the Faculty of Electrical Engineering, Mathematics and Computer Science (EEMCS) at the University of Twente (UT). The report provides the information for the Research Assessment Committee for Applied Mathematics, dealing with mathematical sciences at the three universities of technology in the Netherlands. It describes the state of affairs pertaining to the period 1 January 2003 to 31 December 2008

    29th International Symposium on Algorithms and Computation: ISAAC 2018, December 16-19, 2018, Jiaoxi, Yilan, Taiwan

    Get PDF
    corecore