Search CORE

77 research outputs found

Finding Optimal Tree Decompositions

Author: Korhonen Tuukka
Publication venue: Helsingfors universitet
Publication date: 01/01/2020
Field of study

The task of organizing a given graph into a structure called a tree decomposition is relevant in multiple areas of computer science. In particular, many NP-hard problems can be solved in polynomial time if a suitable tree decomposition of a graph describing the problem instance is given as a part of the input. This motivates the task of finding as good tree decompositions as possible, or ideally, optimal tree decompositions. This thesis is about finding optimal tree decompositions of graphs with respect to several notions of optimality. Each of the considered notions measures the quality of a tree decomposition in the context of an application. In particular, we consider a total of seven problems that are formulated as finding optimal tree decompositions: treewidth, minimum fill-in, generalized and fractional hypertreewidth, total table size, phylogenetic character compatibility, and treelength. For each of these problems we consider the BT algorithm of Bouchitté and Todinca as the method of finding optimal tree decompositions. The BT algorithm is well-known on the theoretical side, but to our knowledge the first time it was implemented was only recently for the 2nd Parameterized Algorithms and Computational Experiments Challenge (PACE 2017). The author’s implementation of the BT algorithm took the second place in the minimum fill-in track of PACE 2017. In this thesis we review and extend the BT algorithm and our implementation. In particular, we improve the eciency of the algorithm in terms of both theory and practice. We also implement the algorithm for each of the seven problems considered, introducing a novel adaptation of the algorithm for the maximum compatibility problem of phylogenetic characters. Our implementation outperforms alternative state-of-the-art approaches in terms of numbers of test instances solved on well-known benchmarks on minimum fill-in, generalized hypertreewidth, fractional hypertreewidth, total table size, and the maximum compatibility problem of phylogenetic characters. Furthermore, to our understanding the implementation is the first exact approach for the treelength problem

Helsingin yliopiston digitaalinen arkisto

Finding Optimal Triangulations Parameterized by Edge Clique Cover

Author: Korhonen Tuukka
Publication venue
Publication date: 02/02/2022
Field of study

Publisher Copyright: © 2022, The Author(s).We consider problems that can be formulated as a task of finding an optimal triangulation of a graph w.r.t. some notion of optimality. We present algorithms parameterized by the size of a minimum edge clique cover (cc) to such problems. This parameterization occurs naturally in many problems in this setting, e.g., in the perfect phylogeny problem cc is at most the number of taxa, in fractional hypertreewidth cc is at most the number of hyperedges, and in treewidth of Bayesian networks cc is at most the number of non-root nodes. We show that the number of minimal separators of graphs is at most 2 cc, the number of potential maximal cliques is at most 3 cc, and these objects can be listed in times O∗(2 cc) and O∗(3 cc) , respectively, even when no edge clique cover is given as input; the O∗(·) notation omits factors polynomial in the input size. These enumeration algorithms imply O∗(3 cc) time algorithms for problems such as treewidth, weighted minimum fill-in, and feedback vertex set. For generalized and fractional hypertreewidth we give O∗(4 m) time and O∗(3 m) time algorithms, respectively, where m is the number of hyperedges. When an edge clique cover of size cc′ is given as a part of the input we give O∗(2cc′) time algorithms for treewidth, minimum fill-in, and chordal sandwich. This implies an O∗(2 n) time algorithm for perfect phylogeny, where n is the number of taxa. We also give polynomial space algorithms with time complexities O∗(9cc′) and O∗(9cc+O(log2cc)) for problems in this framework.Peer reviewe

arXiv.org e-Print Archive

Helsingin yliopiston digitaalinen arkisto

Finding Optimal Triangulations Parameterized by Edge Clique Cover

Author: Korhonen Tuukka
Publication venue: Schloss Dagstuhl - Leibniz-Zentrum für Informatik
Publication date: 01/01/2020
Field of study

Peer reviewe

Dagstuhl Research Online Publication Server

Helsingin yliopiston digitaalinen arkisto

The PACE 2017 Parameterized Algorithms and Computational Experiments Challenge: The Second Iteration

Author: Dell Holger
Komusiewicz Christian
Talmon Nimrod
Weller Mathias
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 12th International Symposium on Parameterized and Exact Computation (IPEC 2017)
Publication date: 01/01/2018
Field of study

In this article, the Program Committee of the Second Parameterized Algorithms and Computational Experiments challenge (PACE 2017) reports on the second iteration of the PACE challenge. Track A featured the Treewidth problem and Track B the Minimum Fill-In problem. Over 44 participants on 17 teams from 11 countries submitted their implementations to the competition

Dagstuhl Research Online Publication Server

Solving Graph Problems via Potential Maximal Cliques : An Experimental Evaluation of the Bouchitté-Todinca Algorithm

Author: Berg Jeremias
Järvisalo Matti
Korhonen Tuukka
Publication venue
Publication date: 01/06/2019
Field of study

Peer reviewe

Helsingin yliopiston digitaalinen arkisto

Dynamic representation of consecutive-ones matrices and interval graphs

Author: Springer William M., II
Publication venue: Colorado State University. Libraries
Publication date: 01/01/2015
Field of study

2015 Spring.Includes bibliographical references.We give an algorithm for updating a consecutive-ones ordering of a consecutive-ones matrix when a row or column is added or deleted. When the addition of the row or column would result in a matrix that does not have the consecutive-ones property, we return a well-known minimal forbidden submatrix for the consecutive-ones property, known as a Tucker submatrix, which serves as a certificate of correctness of the output in this case, in O(n log n) time. The ability to return such a certificate within this time bound is one of the new contributions of this work. Using this result, we obtain an O(n) algorithm for updating an interval model of an interval graph when an edge or vertex is added or deleted. This matches the bounds obtained by a previous dynamic interval-graph recognition algorithm due to Crespelle. We improve on Crespelle's result by producing an easy-to-check certificate, known as a Lekkerkerker-Boland subgraph, when a proposed change to the graph results in a graph that is not an interval graph. Our algorithm takes O(n log n) time to produce this certificate. The ability to return such a certificate within this time bound is the second main contribution of this work

Mountain Scholar (Digital Collections of Colorado and Wyoming)

Fast and accurate supertrees: towards large scale phylogenies

Author: Fleischauer Markus
Publication venue
Publication date: 01/01/2018
Field of study

Phylogenetics is the study of evolutionary relationships between biological entities; phylogenetic trees (phylogenies) are a visualization of these evolutionary relationships. Accurate approaches to reconstruct hylogenies from sequence data usually result in NPhard optimization problems, hence local search heuristics have to be applied in practice. These methods are highly accurate and fast enough as long as the input data is not too large. Divide-and-conquer techniques are a promising approach to boost scalability and accuracy of those local search heuristics on very large datasets. A divide-and-conquer method breaks down a large phylogenetic problem into smaller sub-problems that are computationally easier to solve. The sub-problems (overlapping trees) are then combined using a supertree method. Supertree methods merge a set of overlapping phylogenetic trees into a supertree containing all taxa of the input trees. The challenge in supertree reconstruction is the way of dealing with conflicting information in the input trees. Many different algorithms for different objective functions have been suggested to resolve these conflicts. In particular, there are methods that encode the source trees in a matrix and the supertree is constructed applying a local search heuristic to optimize the respective objective function. The most widely used supertree methods use such local search heuristics. However, to really improve the scalability of accurate tree reconstruction by divide-and-conquer approaches, accurate polynomial time methods are needed for the supertree reconstruction step. In this work, we present approaches for accurate polynomial time supertree reconstruction in particular Bad Clade Deletion (BCD), a novel heuristic supertree algorithm with polynomial running time. BCD uses minimum cuts to greedily delete a locally minimal number of columns from a matrix representation to make it compatible. Different from local search heuristics, it guarantees to return the directed perfect phylogeny for the input matrix, corresponding to the parent tree of the input trees if one exists. BCD can take support values of the source trees into account without an increase in complexity. We show how reliable clades can be used to restrict the search space for BCD and how those clades can be collected from the input data using the Greedy Strict Consensus Merger. Finally, we introduce a beam search extension for the BCD algorithm that keeps alive a constant number of partial solutions in each top-down iteration phase. The guaranteed worst-case running time of BCD with beam search extension is still polynomial. We present an exact and a randomized subroutine to generate suboptimal partial solutions. In our thorough evaluation on several simulated and biological datasets against a representative set of supertree methods we found that BCD is more accurate than the most accurate supertree methods when using support values and search space restriction on simulated data. Simultaneously BCD is faster than any other evaluated method. The beam search approach improved the accuracy of BCD on all evaluated datasets at the cost of speed. We found that BCD supertrees can boost maximum likelihood tree reconstruction when used as starting tree. Further, BCD could handle large scale datasets where local search heuristics did not converge in reasonable time. Due to its combination of speed, accuracy, and the ability to reconstruct the parent tree if one exists, BCD is a promising approach to enable outstanding scalability of divide-and-conquer approaches.Die Phylogenetik studiert die evolutionären Beziehungen zwischen biologischen Entitäten. Phylogenetische Bäume sind eine Visualisierung dieser Beziehungen. Akkurate Ansätze zur Rekonstruktion von Phylogenien aus Sequenzdaten führen in der Regel zu NP-schweren Optimierungsproblemen, sodass in der Praxis lokale Suchheuristiken angewendet werden müssen. Diese Methoden liefern akkurate Bäume und sind schnell genug, solange die Eingabedaten nicht zu groß werden. Teile-und-herrsche-Verfahren sind ein vielversprechender Ansatz, um Skalierbarkeit und Genauigkeit dieser lokalen Suchheuristiken auf sehr großen Datensätzen zu verbessern. Beim Teile-und-herrsche-Ansatz zerlegt man ein großes phylogenetisches Problem in kleinere Teilprobleme, die einfacher und schneller zu lösen sind. Die Teilprobleme, in diesem Fall überlappende Teilbäume, müssen dann zu einem gesamtheitlichen Baum kombiniert werden. Superbaummethoden verschmelzen solche überlappenden phylogenetischen Bäume zu einem Superbaum, der alle Taxa der Eingangsbäume enthält. Die Herausforderung bei der Superbaumrekonstruktion besteht darin, mit widersprüchlichen Eingabebäumen umzugehen. Es wurden viele verschiedene Algorithmen mit unterschiedlichen Zielfunktionen entwickelt, um solche Widersprüche möglichst sinnvoll aufzulösen. Verfahren, die auf der Kodierung der Eingabebäume als Matrixrepräsentation basieren, sind am weitesten verbreitet. Die zum Auflösen der Konflikte verwendeten Zielfunktionen führen in der Regel zu NP-schweren Optimierungsproblemen, sodass in der Praxis auch hier lokale Suchheuristiken zum Einsatz kommen. Da diese Ansätze nicht wesentlich besser mit der Größe der Eingabedaten skalieren als die direkte Rekonstruktion aus Sequenzdaten, werden für die Superbaumrekonstruktion in Teile-undherrsche-Ansätzen akkurate Polynomialzeitmethoden benötigt. Diese Arbeit beschäftigt sich mit der akkuraten Rekonstruktion von Superbäumen in Polynomialzeit. Wir präsentieren Bad Clade Deletion (BCD), eine neue Polynomialzeitheuristik zur Superbaumrekonstruktion. BCD verwendet minimale Schnitte in Graphen, um eine minimale Anzahl von Spalten aus der Matrixrepräsentation zu löschen, sodass diese konfliktfrei wird. Im Gegensatz zu lokalen Suchheuristiken garantiert BCD die Rekonstruktion einer perfekten Phylogenie, sofern eine solche für die Eingabematrix existiert. BCD ermöglicht es, Gütekriterien der Eingabebäume zu berücksichtigen, ohne dass sich dadurch die Komplexität erhöht. Weiterhin zeigen wir, wie zuverlässige Kladen verwendet werden können, um den Suchraum für BCD einzuschränken und wie man diese mit Hilfe des Greedy Strict Consensus Mergers aus den Eingabedaten gewinnen kann. Schließlich stellen wir eine Strahlensuche für BCD vor. Diese erlaubt es eine bestimmte Anzahl suboptimaler Teillösungen (anstatt nur der optimalen) zu berücksichtigen, um so das Gesamtergebnis zu verbessern. Die Worst-Case-Laufzeit der Strahlensuche ist immer noch polynomiell. Zur Berechnung suboptimaler Teillösungen stellen wir einen exakten und einen randomisierten Algorithmus vor. In einer ausführlichen Evaluation auf mehreren simulierten und biologischen Datensätzen vergleichen wir BCD mit einer repräsentativen Auswahl an Superbaummethoden. Wir haben herausgefunden, dass BCD bei Verwendung von Gütekriterien und Suchraumbeschränkung auf simulierten Daten genauer ist als die akkuratesten evaluierten Superbaummethoden. Gleichzeitig ist BCD deutlich schneller als alle evaluierten Methoden. Die Strahlensuche verbessert die Qualität der BCD-Bäume auf allen Datensätzen, allerdings auf Kosten der Laufzeit. Weiterhin fanden wir heraus, dass ein BCD-Superbaum, der als Startbaum verwendet wird, die Qualität einer Maximum-Likelihood-Baumrekonstruktion verbessern kann. Außerdem kann BCD Datensätze verarbeiten, die so groß sind, dass lokale Suchheuristiken auf diesen nicht mehr in angemessener Zeit konvergieren. Aufgrund der Kombination aus Geschwindigkeit, Genauigkeit und der Fähigkeit, den Elternbaum zu rekonstruieren, sofern ein solcher existiert, ist BCD ein vielversprechender Ansatz um die Skalierbarkeit von Teile-und-herrsche-Methoden entscheidend zu verbessern

Digitale Bibliothek Thüringen