77 research outputs found

    Finding Optimal Tree Decompositions

    The task of organizing a given graph into a structure called a tree decomposition is relevant in multiple areas of computer science. In particular, many NP-hard problems can be solved in polynomial time if a suitable tree decomposition of a graph describing the problem instance is given as part of the input. This motivates the task of finding tree decompositions that are as good as possible or, ideally, optimal. This thesis is about finding optimal tree decompositions of graphs with respect to several notions of optimality. Each of the considered notions measures the quality of a tree decomposition in the context of an application. In particular, we consider a total of seven problems that are formulated as finding optimal tree decompositions: treewidth, minimum fill-in, generalized and fractional hypertreewidth, total table size, phylogenetic character compatibility, and treelength. For each of these problems we consider the BT algorithm of Bouchitté and Todinca as the method of finding optimal tree decompositions. The BT algorithm is well known on the theoretical side, but to our knowledge it was first implemented only recently, for the 2nd Parameterized Algorithms and Computational Experiments Challenge (PACE 2017). The author's implementation of the BT algorithm took second place in the minimum fill-in track of PACE 2017. In this thesis we review and extend the BT algorithm and our implementation. In particular, we improve the efficiency of the algorithm in terms of both theory and practice. We also implement the algorithm for each of the seven problems considered, introducing a novel adaptation of the algorithm for the maximum compatibility problem of phylogenetic characters. 
Our implementation outperforms alternative state-of-the-art approaches in terms of the number of test instances solved on well-known benchmarks on minimum fill-in, generalized hypertreewidth, fractional hypertreewidth, total table size, and the maximum compatibility problem of phylogenetic characters. Furthermore, to our understanding the implementation is the first exact approach for the treelength problem.
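
    As a rough illustration of the object this abstract is about, the sketch below checks whether a proposed set of bags is a valid tree decomposition of a graph and reports its width. It is not the BT algorithm and not the author's implementation; the function names, the edge-list input, and the bag indexing are assumptions made for this example only.

```python
from collections import deque

def _connected(nodes, adj):
    """Return True if `nodes` is non-empty and induces a connected subgraph of `adj`."""
    if not nodes:
        return False
    start = next(iter(nodes))
    seen = {start}
    queue = deque([start])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y in nodes and y not in seen:
                seen.add(y)
                queue.append(y)
    return seen == nodes

def is_tree_decomposition(edges, bags, tree_edges):
    """Check the standard tree-decomposition conditions.

    edges      -- graph edges as pairs (the graph is assumed to have no isolated vertices)
    bags       -- list of vertex sets; bag i is tree node i
    tree_edges -- pairs of bag indices
    """
    vertices = {v for e in edges for v in e}
    k = len(bags)
    # The bags must be arranged in a tree: k - 1 edges and connected.
    if len(tree_edges) != k - 1:
        return False
    adj = {i: set() for i in range(k)}
    for a, b in tree_edges:
        adj[a].add(b)
        adj[b].add(a)
    if not _connected(set(range(k)), adj):
        return False
    # Every vertex must occur in at least one bag.
    if not vertices <= set().union(*bags):
        return False
    # Every edge must be contained in at least one bag.
    for u, v in edges:
        if not any(u in bag and v in bag for bag in bags):
            return False
    # The bags containing any fixed vertex must form a connected subtree.
    for v in vertices:
        holding = {i for i, bag in enumerate(bags) if v in bag}
        if not _connected(holding, adj):
            return False
    return True

def width(bags):
    """Width of a decomposition: largest bag size minus one."""
    return max(len(bag) for bag in bags) - 1

# A 4-cycle 1-2-3-4-1, split into two triangles that share the chord {1, 3}.
cycle = [(1, 2), (2, 3), (3, 4), (4, 1)]
good_bags = [{1, 2, 3}, {1, 3, 4}]
good_tree = [(0, 1)]
# Using the edges themselves as bags on a path fails the subtree condition for vertex 1.
bad_bags = [{1, 2}, {2, 3}, {3, 4}, {4, 1}]
bad_tree = [(0, 1), (1, 2), (2, 3)]
```

    Here `is_tree_decomposition(cycle, good_bags, good_tree)` holds and the width is 2, which matches the treewidth of a 4-cycle, while the second candidate is rejected.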

    Finding Optimal Triangulations Parameterized by Edge Clique Cover

    Publisher Copyright: © 2022, The Author(s). We consider problems that can be formulated as the task of finding an optimal triangulation of a graph w.r.t. some notion of optimality. We present algorithms for such problems parameterized by the size of a minimum edge clique cover (cc). This parameterization occurs naturally in many problems in this setting, e.g., in the perfect phylogeny problem cc is at most the number of taxa, in fractional hypertreewidth cc is at most the number of hyperedges, and in treewidth of Bayesian networks cc is at most the number of non-root nodes. We show that the number of minimal separators of a graph is at most $2^{cc}$, the number of potential maximal cliques is at most $3^{cc}$, and these objects can be listed in times $O^*(2^{cc})$ and $O^*(3^{cc})$, respectively, even when no edge clique cover is given as input; the $O^*(\cdot)$ notation omits factors polynomial in the input size. These enumeration algorithms imply $O^*(3^{cc})$ time algorithms for problems such as treewidth, weighted minimum fill-in, and feedback vertex set. For generalized and fractional hypertreewidth we give $O^*(4^m)$ time and $O^*(3^m)$ time algorithms, respectively, where $m$ is the number of hyperedges. When an edge clique cover of size $cc'$ is given as part of the input we give $O^*(2^{cc'})$ time algorithms for treewidth, minimum fill-in, and chordal sandwich. This implies an $O^*(2^n)$ time algorithm for perfect phylogeny, where $n$ is the number of taxa. We also give polynomial space algorithms with time complexities $O^*(9^{cc'})$ and $O^*(9^{cc + O(\log^2 cc)})$ for problems in this framework. Peer reviewed
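
    The remark that cc is at most the number of hyperedges can be made concrete with a small sketch: the hyperedges of a hypergraph are themselves an edge clique cover of its primal graph. The code below only verifies that a given family of vertex sets is an edge clique cover; it does not compute a minimum one and is not taken from the paper. The function names are assumptions made for this example.

```python
from itertools import combinations

def primal_graph_edges(hyperedges):
    """Edges of the primal graph: two vertices are adjacent if they share a hyperedge."""
    return {frozenset(pair) for h in hyperedges for pair in combinations(sorted(h), 2)}

def is_edge_clique_cover(edges, cliques):
    """Return True if every set in `cliques` is a clique and together they cover all edges."""
    edge_set = {frozenset(e) for e in edges}
    covered = set()
    for clique in cliques:
        for pair in combinations(sorted(clique), 2):
            key = frozenset(pair)
            if key not in edge_set:
                return False  # this vertex set is not a clique of the graph
            covered.add(key)
    return covered == edge_set

# Two hyperedges give a primal graph with four edges and a cover of size two.
hyperedges = [{1, 2, 3}, {3, 4}]
edges = primal_graph_edges(hyperedges)
```

    In this example `is_edge_clique_cover(edges, hyperedges)` holds, so the primal graph has an edge clique cover of size 2, the number of hyperedges.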

    The PACE 2017 Parameterized Algorithms and Computational Experiments Challenge: The Second Iteration

    In this article, the Program Committee of the Second Parameterized Algorithms and Computational Experiments challenge (PACE 2017) reports on the second iteration of the PACE challenge. Track A featured the Treewidth problem and Track B the Minimum Fill-In problem. Over 44 participants on 17 teams from 11 countries submitted their implementations to the competition.

    Dynamic representation of consecutive-ones matrices and interval graphs

    2015 Spring. Includes bibliographical references. We give an algorithm for updating a consecutive-ones ordering of a consecutive-ones matrix when a row or column is added or deleted. When the addition of the row or column would result in a matrix that does not have the consecutive-ones property, we return a well-known minimal forbidden submatrix for the consecutive-ones property, known as a Tucker submatrix, which serves as a certificate of correctness of the output in this case, in O(n log n) time. The ability to return such a certificate within this time bound is one of the new contributions of this work. Using this result, we obtain an O(n) algorithm for updating an interval model of an interval graph when an edge or vertex is added or deleted. This matches the bounds obtained by a previous dynamic interval-graph recognition algorithm due to Crespelle. We improve on Crespelle's result by producing an easy-to-check certificate, known as a Lekkerkerker-Boland subgraph, when a proposed change to the graph results in a graph that is not an interval graph. Our algorithm takes O(n log n) time to produce this certificate. The ability to return such a certificate within this time bound is the second main contribution of this work.
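
    To make the consecutive-ones property concrete, here is a brute-force sketch that tries every column order of a small 0/1 matrix. It is exponential in the number of columns and has nothing to do with the dynamic algorithm described in the abstract; the convention that the ones must be consecutive within each row, and the function names, are assumptions made for this example.

```python
from itertools import permutations

def rows_consecutive(matrix, order):
    """Return True if, with columns arranged as `order`, the ones in every row are contiguous."""
    for row in matrix:
        ones = [pos for pos, col in enumerate(order) if row[col] == 1]
        if ones and ones[-1] - ones[0] + 1 != len(ones):
            return False
    return True

def consecutive_ones_ordering(matrix):
    """Return some column order witnessing the property, or None if no order works."""
    n_cols = len(matrix[0]) if matrix else 0
    for order in permutations(range(n_cols)):
        if rows_consecutive(matrix, order):
            return list(order)
    return None

# This matrix needs its last two columns swapped.
has_property = [[1, 0, 1],
                [0, 1, 1]]
# Each pair of columns would have to be adjacent, which three columns in a line cannot satisfy.
lacks_property = [[1, 1, 0],
                  [0, 1, 1],
                  [1, 0, 1]]
```

    For `has_property` the search returns a valid column order, and for `lacks_property` it returns `None`.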

    Fast and accurate supertrees: towards large scale phylogenies

    Phylogenetics is the study of evolutionary relationships between biological entities; phylogenetic trees (phylogenies) are a visualization of these evolutionary relationships. Accurate approaches to reconstruct phylogenies from sequence data usually result in NP-hard optimization problems, hence local search heuristics have to be applied in practice. These methods are highly accurate and fast enough as long as the input data is not too large. Divide-and-conquer techniques are a promising approach to boost scalability and accuracy of those local search heuristics on very large datasets. A divide-and-conquer method breaks down a large phylogenetic problem into smaller sub-problems that are computationally easier to solve. The sub-problems (overlapping trees) are then combined using a supertree method. Supertree methods merge a set of overlapping phylogenetic trees into a supertree containing all taxa of the input trees. The challenge in supertree reconstruction is the way of dealing with conflicting information in the input trees. Many different algorithms for different objective functions have been suggested to resolve these conflicts. In particular, there are methods that encode the source trees in a matrix and construct the supertree by applying a local search heuristic to optimize the respective objective function. The most widely used supertree methods use such local search heuristics. However, to really improve the scalability of accurate tree reconstruction by divide-and-conquer approaches, accurate polynomial time methods are needed for the supertree reconstruction step. In this work, we present approaches for accurate polynomial time supertree reconstruction, in particular Bad Clade Deletion (BCD), a novel heuristic supertree algorithm with polynomial running time. BCD uses minimum cuts to greedily delete a locally minimal number of columns from a matrix representation to make it compatible. 
Unlike local search heuristics, it is guaranteed to return the directed perfect phylogeny for the input matrix, corresponding to the parent tree of the input trees if one exists. BCD can take support values of the source trees into account without an increase in complexity. We show how reliable clades can be used to restrict the search space for BCD and how those clades can be collected from the input data using the Greedy Strict Consensus Merger. Finally, we introduce a beam search extension for the BCD algorithm that keeps alive a constant number of partial solutions in each top-down iteration phase. The guaranteed worst-case running time of BCD with beam search extension is still polynomial. We present an exact and a randomized subroutine to generate suboptimal partial solutions. In our thorough evaluation on several simulated and biological datasets against a representative set of supertree methods we found that BCD is more accurate than the most accurate supertree methods when using support values and search space restriction on simulated data. At the same time, BCD is faster than any other evaluated method. The beam search approach improved the accuracy of BCD on all evaluated datasets at the cost of speed. We found that BCD supertrees can boost maximum likelihood tree reconstruction when used as the starting tree. Further, BCD could handle large-scale datasets where local search heuristics did not converge in reasonable time. Due to its combination of speed, accuracy, and the ability to reconstruct the parent tree if one exists, BCD is a promising approach to enable outstanding scalability of divide-and-conquer approaches.
    Phylogenetics studies the evolutionary relationships between biological entities. Phylogenetic trees are a visualization of these relationships. Accurate approaches for reconstructing phylogenies from sequence data usually lead to NP-hard optimization problems, so local search heuristics have to be applied in practice. These methods yield accurate trees and are fast enough as long as the input data does not become too large. Divide-and-conquer methods are a promising approach to improving the scalability and accuracy of these local search heuristics on very large datasets. A divide-and-conquer approach decomposes a large phylogenetic problem into smaller sub-problems that are easier and faster to solve. The sub-problems, in this case overlapping subtrees, must then be combined into a single overall tree. Supertree methods merge such overlapping phylogenetic trees into a supertree that contains all taxa of the input trees. The challenge in supertree reconstruction is dealing with conflicting input trees. Many different algorithms with different objective functions have been developed to resolve such conflicts as sensibly as possible. Methods based on encoding the input trees as a matrix representation are the most widely used. The objective functions used to resolve the conflicts usually lead to NP-hard optimization problems, so local search heuristics are used in practice here as well. Because these approaches do not scale substantially better with the size of the input data than direct reconstruction from sequence data, accurate polynomial-time methods are needed for supertree reconstruction in divide-and-conquer approaches. This work deals with the accurate reconstruction of supertrees in polynomial time. We present Bad Clade Deletion (BCD), a new polynomial-time heuristic for supertree reconstruction. 
BCD uses minimum cuts in graphs to delete a minimal number of columns from the matrix representation so that it becomes conflict-free. In contrast to local search heuristics, BCD guarantees the reconstruction of a perfect phylogeny, provided one exists for the input matrix. BCD makes it possible to take support values of the input trees into account without increasing the complexity. Furthermore, we show how reliable clades can be used to restrict the search space for BCD and how they can be obtained from the input data with the help of the Greedy Strict Consensus Merger. Finally, we present a beam search for BCD. It allows a certain number of suboptimal partial solutions (instead of only the optimal one) to be considered in order to improve the overall result. The worst-case running time of the beam search is still polynomial. For computing suboptimal partial solutions we present an exact and a randomized algorithm. In an extensive evaluation on several simulated and biological datasets we compare BCD with a representative selection of supertree methods. We found that, when using support values and search space restriction on simulated data, BCD is more accurate than the most accurate supertree methods evaluated. At the same time, BCD is considerably faster than all evaluated methods. Beam search improves the quality of the BCD trees on all datasets, albeit at the cost of running time. We further found that a BCD supertree used as the starting tree can improve the quality of a maximum likelihood tree reconstruction. In addition, BCD can process datasets so large that local search heuristics no longer converge on them in reasonable time. 
Owing to its combination of speed, accuracy, and the ability to reconstruct the parent tree if one exists, BCD is a promising approach for decisively improving the scalability of divide-and-conquer methods.
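
    The notion of a compatible (conflict-free) matrix mentioned in this abstract can be illustrated with the classical pairwise test for directed perfect phylogeny: a complete 0/1 matrix is compatible exactly when the taxon sets of any two columns are disjoint or nested. The sketch below applies only that test. It is not BCD, it uses no minimum cuts, and it assumes a matrix with no missing entries, with rows as taxa and columns as clades; all names are made up for this example.

```python
from itertools import combinations

def column_sets(matrix):
    """For each column, the set of row indices (taxa) that carry a 1."""
    n_cols = len(matrix[0]) if matrix else 0
    return [{i for i, row in enumerate(matrix) if row[j] == 1} for j in range(n_cols)]

def conflicting_columns(matrix):
    """Pairs of columns whose taxon sets overlap without one containing the other."""
    sets = column_sets(matrix)
    conflicts = []
    for a, b in combinations(range(len(sets)), 2):
        s, t = sets[a], sets[b]
        if s & t and not (s <= t or t <= s):
            conflicts.append((a, b))
    return conflicts

def admits_directed_perfect_phylogeny(matrix):
    """Pairwise test; assumes every entry is 0 or 1 (no missing data)."""
    return not conflicting_columns(matrix)

# Four taxa, three clades; clades 0 and 2 overlap in taxon 2 only, so they conflict.
matrix = [[1, 1, 0],
          [1, 1, 0],
          [1, 0, 1],
          [0, 0, 1]]
# Deleting the last column removes the only conflict.
reduced = [row[:2] for row in matrix]
```

    In this toy input the single conflict is between columns 0 and 2, and dropping column 2 leaves a compatible matrix.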
