Cavity Matchings, Label Compressions, and Unrooted Evolutionary Trees
We present an algorithm for computing a maximum agreement subtree of two
unrooted evolutionary trees. It takes O(n^{1.5} log n) time for trees with
unbounded degrees, matching the best known time complexity for the rooted case.
Our algorithm allows the input trees to be mixed trees, i.e., trees that may
contain directed and undirected edges at the same time. Our algorithm adopts a
recursive strategy exploiting a technique called label compression. The
backbone of this technique is an algorithm that computes the maximum weight
matchings over many subgraphs of a bipartite graph as fast as it takes to
compute a single matching.
An Even Faster and More Unifying Algorithm for Comparing Trees via Unbalanced Bipartite Matchings
A widely used method for determining the similarity of two labeled trees is
to compute a maximum agreement subtree of the two trees. Previous work on this
similarity measure is only concerned with the comparison of labeled trees of
two special kinds, namely, uniformly labeled trees (i.e., trees with all their
nodes labeled by the same symbol) and evolutionary trees (i.e., leaf-labeled
trees with distinct symbols for distinct leaves). This paper presents an
algorithm for comparing trees that are labeled in an arbitrary manner. In
addition to this generality, this algorithm is faster than the previous
algorithms.
Another contribution of this paper is on maximum weight bipartite matchings.
We show how to speed up the best known matching algorithms when the input
graphs are node-unbalanced or weight-unbalanced. Based on these enhancements,
we obtain an efficient algorithm for a new matching problem called the
hierarchical bipartite matching problem, which is at the core of our maximum
agreement subtree algorithm. Comment: To appear in Journal of Algorithms
Tree Contractions and Evolutionary Trees
An evolutionary tree is a rooted tree where each internal vertex has at least
two children and where the leaves are labeled with distinct symbols
representing species. Evolutionary trees are useful for modeling the
evolutionary history of species. An agreement subtree of two evolutionary trees
is an evolutionary tree which is also a topological subtree of the two given
trees. We give an algorithm to determine the largest possible number of leaves
in any agreement subtree of two trees T_1 and T_2 with n leaves each. If the
maximum degree d of these trees is bounded by a constant, the time complexity
is O(n log^2(n)) and is within a log(n) factor of optimal. For general d, this
algorithm runs in O(n d^2 log(d) log^2(n)) time or alternatively in O(n d
sqrt(d) log^3(n)) time.
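The agreement-subtree definition in the abstract above can be made concrete with a small sketch. The code below is not the O(n log^2(n)) algorithm the abstract describes; it is the simple textbook dynamic program for two rooted binary trees, which takes roughly quadratic time. The tree encoding (a leaf is a string, an internal vertex is a pair of subtrees) and the names `leaves` and `mast_size` are assumptions made for this illustration only.

```python
from functools import lru_cache


def leaves(tree):
    """Return the set of leaf labels of a tree (a leaf is a str, an internal node a 2-tuple)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return leaves(tree[0]) | leaves(tree[1])


@lru_cache(maxsize=None)
def mast_size(a, b):
    """Number of leaves in a maximum agreement subtree of rooted binary trees a and b."""
    # Base cases: a single leaf agrees with any tree that contains its label.
    if isinstance(a, str):
        return 1 if a in leaves(b) else 0
    if isinstance(b, str):
        return 1 if b in leaves(a) else 0
    a1, a2 = a
    b1, b2 = b
    return max(
        # Map the root of a to the root of b, pairing the children either way.
        mast_size(a1, b1) + mast_size(a2, b2),
        mast_size(a1, b2) + mast_size(a2, b1),
        # Or discard one child subtree of a, or one child subtree of b.
        mast_size(a1, b), mast_size(a2, b),
        mast_size(a, b1), mast_size(a, b2),
    )


# Hypothetical example trees on the species a, b, c, d.
t1 = ((("a", "b"), "c"), "d")
t2 = ((("a", "c"), "b"), "d")
print(mast_size(t1, t2))
```

For these two example trees the largest agreement subtree keeps three of the four leaves (for instance a, b and d), since the trees disagree on how a, b and c are grouped.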
A Duality Based 2-Approximation Algorithm for Maximum Agreement Forest
We give a 2-approximation algorithm for the Maximum Agreement Forest problem
on two rooted binary trees. This NP-hard problem has been studied extensively
in the past two decades, since it can be used to compute the Subtree
Prune-and-Regraft (SPR) distance between two phylogenetic trees. Our result
improves on the very recent 2.5-approximation algorithm due to Shi, Feng, You
and Wang (2015). Our algorithm is the first approximation algorithm for this
problem that uses LP duality in its analysis.
Computational Molecular Biology
Computational Biology is a fairly new subject that arose in response to the computational problems posed by the analysis and the processing of biomolecular sequence and structure data. The field was initiated in the late 60's and early 70's largely by pioneers working in the life sciences. Physicists and mathematicians entered the field in the 70's and 80's, while Computer Science became involved with the new biological problems in the late 1980's. Computational problems have gained further importance in molecular biology through the various genome projects which produce enormous amounts of data. For this bibliography we focus on those areas of computational molecular biology that involve discrete algorithms or discrete optimization. We thus neglect several other areas of computational molecular biology, like most of the literature on the protein folding problem, as well as databases for molecular and genetic data, and genetic mapping algorithms. Due to the availability of review papers and a bibliography this bibliography
Fast alignment of fragmentation trees
Motivation: Mass spectrometry allows sensitive, automated and high-throughput analysis of small molecules such as metabolites. One major bottleneck in metabolomics is the identification of "unknown" small molecules not in any database. Recently, fragmentation tree alignments have been introduced for the automated comparison of the fragmentation patterns of small molecules. Fragmentation pattern similarities are strongly correlated with the chemical similarity of the molecules, and allow us to cluster compounds based solely on their fragmentation patterns.
Novel methods for the analysis of small molecule fragmentation mass spectra
The identification of small molecules, such as metabolites, in a high-throughput manner plays an important role in many research areas. Mass spectrometry (MS) is one of the predominant analysis technologies and is much more sensitive than nuclear magnetic resonance spectroscopy. Fragmentation of the molecules is used to obtain information beyond their mass. Gas chromatography-MS is one of the oldest and most widespread techniques for the analysis of small molecules. Commonly, the molecule is fragmented using electron ionization (EI). Using this technique, the molecular ion peak is often barely visible in the mass spectrum or even absent. We present a method to calculate fragmentation trees from high mass accuracy EI spectra, which annotate the peaks in the mass spectrum with molecular formulas of fragments and explain relevant fragmentation pathways. Fragmentation trees enable the identification of the molecular ion and its molecular formula if the molecular ion is present in the spectrum. The method works even if the molecular ion is of very low abundance. MS experts confirm that the calculated trees correspond very well to known fragmentation mechanisms.
Using pairwise local alignments of fragmentation trees, structural and chemical similarities to already-known molecules can be determined. In order to compare a fragmentation tree of an unknown metabolite to a huge database of fragmentation trees, fast algorithms for solving the tree alignment problem are required. Unfortunately, the alignment of unordered trees, such as fragmentation trees, is NP-hard. We present three exact algorithms for the problem. Evaluation of our methods showed that thousands of alignments can be computed in a matter of minutes.
Both the computation and the comparison of fragmentation trees are rule-free approaches that require no chemical knowledge about the unknown molecule and thus will be very helpful in the automated analysis of metabolites that are not included in common libraries.