35,387 research outputs found
An Even Faster and More Unifying Algorithm for Comparing Trees via Unbalanced Bipartite Matchings
A widely used method for determining the similarity of two labeled trees is
to compute a maximum agreement subtree of the two trees. Previous work on this
similarity measure is only concerned with the comparison of labeled trees of
two special kinds, namely, uniformly labeled trees (i.e., trees with all their
nodes labeled by the same symbol) and evolutionary trees (i.e., leaf-labeled
trees with distinct symbols for distinct leaves). This paper presents an
algorithm for comparing trees that are labeled in an arbitrary manner. In
addition to this generality, this algorithm is faster than the previous
algorithms.
Another contribution of this paper is on maximum weight bipartite matchings.
We show how to speed up the best known matching algorithms when the input
graphs are node-unbalanced or weight-unbalanced. Based on these enhancements,
we obtain an efficient algorithm for a new matching problem called the
hierarchical bipartite matching problem, which is at the core of our maximum
agreement subtree algorithm.Comment: To appear in Journal of Algorithm
Do branch lengths help to locate a tree in a phylogenetic network?
Phylogenetic networks are increasingly used in evolutionary biology to
represent the history of species that have undergone reticulate events such as
horizontal gene transfer, hybrid speciation and recombination. One of the most
fundamental questions that arise in this context is whether the evolution of a
gene with one copy in all species can be explained by a given network. In
mathematical terms, this is often translated in the following way: is a given
phylogenetic tree contained in a given phylogenetic network? Recently this tree
containment problem has been widely investigated from a computational
perspective, but most studies have only focused on the topology of the phylo-
genies, ignoring a piece of information that, in the case of phylogenetic
trees, is routinely inferred by evolutionary analyses: branch lengths. These
measure the amount of change (e.g., nucleotide substitutions) that has occurred
along each branch of the phylogeny. Here, we study a number of versions of the
tree containment problem that explicitly account for branch lengths. We show
that, although length information has the potential to locate more precisely a
tree within a network, the problem is computationally hard in its most general
form. On a positive note, for a number of special cases of biological
relevance, we provide algorithms that solve this problem efficiently. This
includes the case of networks of limited complexity, for which it is possible
to recover, among the trees contained by the network with the same topology as
the input tree, the closest one in terms of branch lengths
Parallel Graph Partitioning for Complex Networks
Processing large complex networks like social networks or web graphs has
recently attracted considerable interest. In order to do this in parallel, we
need to partition them into pieces of about equal size. Unfortunately, previous
parallel graph partitioners originally developed for more regular mesh-like
networks do not work well for these networks. This paper addresses this problem
by parallelizing and adapting the label propagation technique originally
developed for graph clustering. By introducing size constraints, label
propagation becomes applicable for both the coarsening and the refinement phase
of multilevel graph partitioning. We obtain very high quality by applying a
highly parallel evolutionary algorithm to the coarsened graph. The resulting
system is both more scalable and achieves higher quality than state-of-the-art
systems like ParMetis or PT-Scotch. For large complex networks the performance
differences are very big. For example, our algorithm can partition a web graph
with 3.3 billion edges in less than sixteen seconds using 512 cores of a high
performance cluster while producing a high quality partition -- none of the
competing systems can handle this graph on our system.Comment: Review article. Parallelization of our previous approach
arXiv:1402.328
- …