1,170 research outputs found

Cavity Matchings, Label Compressions, and Unrooted Evolutionary Trees

Author: Kao Ming-Yang
Lam Tak-Wah
Sung Wing-Kin
Ting Hing-Fung
Publication date: 01/01/2000

We present an algorithm for computing a maximum agreement subtree of two unrooted evolutionary trees. It takes O(n^{1.5} log n) time for trees with unbounded degrees, matching the best known time complexity for the rooted case. Our algorithm allows the input trees to be mixed trees, i.e., trees that may contain directed and undirected edges at the same time. Our algorithm adopts a recursive strategy exploiting a technique called label compression. The backbone of this technique is an algorithm that computes the maximum weight matchings over many subgraphs of a bipartite graph as fast as it takes to compute a single matching.

arXiv.org e-Print Archive

CiteSeerX

HKU Scholars Hub

An Even Faster and More Unifying Algorithm for Comparing Trees via Unbalanced Bipartite Matchings

Author: Kao Ming-Yang
Lam Tak-Wah
Sung Wing-Kin
Ting Hing-Fung
Publication date: 01/01/2001

A widely used method for determining the similarity of two labeled trees is to compute a maximum agreement subtree of the two trees. Previous work on this similarity measure is only concerned with the comparison of labeled trees of two special kinds, namely, uniformly labeled trees (i.e., trees with all their nodes labeled by the same symbol) and evolutionary trees (i.e., leaf-labeled trees with distinct symbols for distinct leaves). This paper presents an algorithm for comparing trees that are labeled in an arbitrary manner. In addition to this generality, this algorithm is faster than the previous algorithms. Another contribution of this paper is on maximum weight bipartite matchings. We show how to speed up the best known matching algorithms when the input graphs are node-unbalanced or weight-unbalanced. Based on these enhancements, we obtain an efficient algorithm for a new matching problem called the hierarchical bipartite matching problem, which is at the core of our maximum agreement subtree algorithm. Comment: To appear in Journal of Algorithms.

arXiv.org e-Print Archive

HKU Scholars Hub

Tree Contractions and Evolutionary Trees

Author: Kao Ming-Yang
Publication date: 26/01/2001

An evolutionary tree is a rooted tree where each internal vertex has at least two children and where the leaves are labeled with distinct symbols representing species. Evolutionary trees are useful for modeling the evolutionary history of species. An agreement subtree of two evolutionary trees is an evolutionary tree which is also a topological subtree of the two given trees. We give an algorithm to determine the largest possible number of leaves in any agreement subtree of two trees T_1 and T_2 with n leaves each. If the maximum degree d of these trees is bounded by a constant, the time complexity is O(n log^2(n)) and is within a log(n) factor of optimal. For general d, this algorithm runs in O(n d^2 log(d) log^2(n)) time or alternatively in O(n d sqrt(d) log^3(n)) time.
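
To make the agreement-subtree definition above concrete, here is a minimal Python sketch that computes the largest number of leaves in an agreement subtree of two rooted trees. It is not the algorithm from this paper: it is the simple textbook recurrence with memoization, and it assumes both trees are binary. The nested-tuple tree encoding and the names `leaves` and `mast_size` are illustrative assumptions.

```python
from functools import lru_cache

# Assumed encoding: a leaf is a string label, an internal vertex is a
# pair (left, right). Only rooted *binary* trees are handled here.

@lru_cache(maxsize=None)
def leaves(t):
    """Return the set of leaf labels below t."""
    if isinstance(t, str):
        return frozenset([t])
    left, right = t
    return leaves(left) | leaves(right)

@lru_cache(maxsize=None)
def mast_size(a, b):
    """Largest number of leaves in an agreement subtree of a and b."""
    # A single leaf agrees with the other tree iff its label occurs there.
    if isinstance(a, str):
        return 1 if a in leaves(b) else 0
    if isinstance(b, str):
        return 1 if b in leaves(a) else 0
    a1, a2 = a
    b1, b2 = b
    return max(
        # Map the two roots to each other and pair up their children.
        mast_size(a1, b1) + mast_size(a2, b2),
        mast_size(a1, b2) + mast_size(a2, b1),
        # Or discard one child of one root entirely.
        mast_size(a1, b), mast_size(a2, b),
        mast_size(a, b1), mast_size(a, b2),
    )

t1 = ((("a", "b"), "c"), "d")
t2 = ((("a", "c"), "b"), "d")
print(mast_size(t1, t2))
```

In this example the two trees disagree on how a, b and c are grouped, so at most one of those leaves' conflicts can be resolved by dropping a leaf. The sketch examines every pair of subtrees, so its running time is quadratic in the number of vertices, well above the bounds quoted in the abstract.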

arXiv.org e-Print Archive

CiteSeerX

A Duality Based 2-Approximation Algorithm for Maximum Agreement Forest

Author: Schalekamp Frans
van der Ster Suzanne
van Zuylen Anke
Publication date: 01/01/2016

We give a 2-approximation algorithm for the Maximum Agreement Forest problem on two rooted binary trees. This NP-hard problem has been studied extensively in the past two decades, since it can be used to compute the Subtree Prune-and-Regraft (SPR) distance between two phylogenetic trees. Our result improves on the very recent 2.5-approximation algorithm due to Shi, Feng, You and Wang (2015). Our algorithm is the first approximation algorithm for this problem that uses LP duality in its analysis.

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Computational Molecular Biology

Author: Lenhof H.
Mutzel P.
Vingron M.
Publication venue: Max-Planck-Institut für Informatik
Publication date: 01/01/1996

Computational Biology is a fairly new subject that arose in response to the computational problems posed by the analysis and the processing of biomolecular sequence and structure data. The field was initiated in the late 60's and early 70's largely by pioneers working in the life sciences. Physicists and mathematicians entered the field in the 70's and 80's, while Computer Science became involved with the new biological problems in the late 1980's. Computational problems have gained further importance in molecular biology through the various genome projects which produce enormous amounts of data. For this bibliography we focus on those areas of computational molecular biology that involve discrete algorithms or discrete optimization. We thus neglect several other areas of computational molecular biology, like most of the literature on the protein folding problem, as well as databases for molecular and genetic data, and genetic mapping algorithms. Due to the availability of review papers and a bibliography this bibliography

Fast alignment of fragmentation trees

Author: Hufsky Franziska
Dührkop Kai
Rasche Florian
Chimani Markus
Böcker Sebastian
Publication venue: Oxford University Press
Publication date: 11/06/2012

Motivation: Mass spectrometry allows sensitive, automated and high-throughput analysis of small molecules such as metabolites. One major bottleneck in metabolomics is the identification of ‘unknown’ small molecules not in any database. Recently, fragmentation tree alignments have been introduced for the automated comparison of the fragmentation patterns of small molecules. Fragmentation pattern similarities are strongly correlated with the chemical similarity of the molecules, and allow us to cluster compounds based solely on their fragmentation patterns.

Novel methods for the analysis of small molecule fragmentation mass spectra

Author: Hufsky Franziska
Publication date: 05/06/2014

The identification of small molecules, such as metabolites, in a high-throughput manner plays an important role in many research areas. Mass spectrometry (MS) is one of the predominant analysis technologies and is much more sensitive than nuclear magnetic resonance spectroscopy. Fragmentation of the molecules is used to obtain information beyond their mass. Gas chromatography-MS is one of the oldest and most widespread techniques for the analysis of small molecules. Commonly, the molecule is fragmented using electron ionization (EI). Using this technique, the molecular ion peak is often barely visible in the mass spectrum or even absent. We present a method to calculate fragmentation trees from high mass accuracy EI spectra, which annotate the peaks in the mass spectrum with molecular formulas of fragments and explain relevant fragmentation pathways. Fragmentation trees enable the identification of the molecular ion and its molecular formula if the molecular ion is present in the spectrum. The method works even if the molecular ion is of very low abundance. MS experts confirm that the calculated trees correspond very well to known fragmentation mechanisms. Using pairwise local alignments of fragmentation trees, structural and chemical similarities to already-known molecules can be determined. In order to compare a fragmentation tree of an unknown metabolite to a huge database of fragmentation trees, fast algorithms for solving the tree alignment problem are required. Unfortunately, the alignment of unordered trees, such as fragmentation trees, is NP-hard. We present three exact algorithms for the problem. Evaluation of our methods showed that thousands of alignments can be computed in a matter of minutes. Both the computation and the comparison of fragmentation trees are rule-free approaches that require no chemical knowledge about the unknown molecule and thus will be very helpful in the automated analysis of metabolites that are not included in common libraries.