A practical approximation algorithm for solving massive instances of hybridization number for binary and nonbinary trees
Reticulate events play an important role in determining evolutionary
relationships. The problem of computing the minimum number of such events to
explain discordance between two phylogenetic trees is a hard computational
problem. Even for binary trees, exact solvers struggle to solve instances with
reticulation number larger than 40-50. Here we present CycleKiller and
NonbinaryCycleKiller, the first methods to produce solutions verifiably close
to optimality for instances with hundreds or even thousands of reticulations.
Using simulations, we demonstrate that these algorithms run quickly for large
and difficult instances, producing solutions that are very close to optimality.
As a spin-off from our simulations we also present TerminusEst, which is the
fastest exact method currently available that can handle nonbinary trees: this
is used to measure the accuracy of the NonbinaryCycleKiller algorithm. All
three methods are based on extensions of previous theoretical work and are
publicly available. We also apply our methods to real data.
A Duality Based 2-Approximation Algorithm for Maximum Agreement Forest
We give a 2-approximation algorithm for the Maximum Agreement Forest problem
on two rooted binary trees. This NP-hard problem has been studied extensively
in the past two decades, since it can be used to compute the Subtree
Prune-and-Regraft (SPR) distance between two phylogenetic trees. Our result
improves on the very recent 2.5-approximation algorithm due to Shi, Feng, You
and Wang (2015). Our algorithm is the first approximation algorithm for this
problem that uses LP duality in its analysis.
On unrooted and root-uncertain variants of several well-known phylogenetic network problems
The hybridization number problem requires us to embed a set of binary rooted
phylogenetic trees into a binary rooted phylogenetic network such that the
number of nodes with indegree two is minimized. However, from a biological
point of view accurately inferring the root location in a phylogenetic tree is
notoriously difficult and poor root placement can artificially inflate the
hybridization number. To this end we study a number of relaxed variants of this
problem. We start by showing that the fundamental problem of determining
whether an \emph{unrooted} phylogenetic network displays (i.e. embeds) an
\emph{unrooted} phylogenetic tree, is NP-hard. On the positive side we show
that this problem is FPT in reticulation number. In the rooted case the
corresponding FPT result is trivial, but here we require more subtle
argumentation. Next we show that the hybridization number problem for unrooted
networks (when given two unrooted trees) is equivalent to the problem of
computing the Tree Bisection and Reconnect (TBR) distance of the two unrooted
trees. In the third part of the paper we consider the "root uncertain" variant
of hybridization number. Here we are free to choose the root location in each
of a set of unrooted input trees such that the hybridization number of the
resulting rooted trees is minimized. On the negative side we show that this
problem is APX-hard. On the positive side, we show that the problem is FPT in
the hybridization number, via kernelization, for any number of input trees.
Comment: 28 pages, 8 Figures
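The quantity being minimized above, the number of nodes with indegree two, can be illustrated with a minimal sketch. It assumes a rooted binary network given as a list of (parent, child) edges. The function name `reticulation_nodes` and the toy network are hypothetical and not taken from the paper. The sketch only counts reticulations in a given network; it does not solve the hybridization number problem.

```python
from collections import defaultdict


def reticulation_nodes(edges):
    """Return the nodes with indegree exactly two in a directed network.

    `edges` is an iterable of (parent, child) pairs. In a binary rooted
    phylogenetic network these are the reticulation (hybrid) nodes.
    """
    indegree = defaultdict(int)
    for _parent, child in edges:
        indegree[child] += 1
    return {node for node, degree in indegree.items() if degree == 2}


# Hypothetical toy network on leaves a, b, c with one hybrid node "h".
edges = [
    ("root", "u"), ("root", "v"),
    ("u", "a"), ("u", "h"),
    ("v", "h"), ("v", "c"),
    ("h", "b"),
]

hybrids = reticulation_nodes(edges)
print(sorted(hybrids), len(hybrids))
```

In this toy network only "h" has two incoming edges, so the network has one reticulation.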
A Duality Based 2-Approximation Algorithm for Maximum Agreement Forest
We give a 2-approximation algorithm for the Maximum Agreement Forest problem
on two rooted binary trees. This NP-hard problem has been studied extensively
in the past two decades, since it can be used to compute the rooted Subtree
Prune-and-Regraft (rSPR) distance between two phylogenetic trees. Our algorithm
is combinatorial and its running time is quadratic in the input size. To prove
the approximation guarantee, we construct a feasible dual solution for a novel
linear programming formulation. In addition, we show this linear program is
stronger than previously known formulations, and we give a compact formulation,
showing that it can be solved in polynomial time.
Active Mean Fields for Probabilistic Image Segmentation: Connections with Chan-Vese and Rudin-Osher-Fatemi Models
Segmentation is a fundamental task for extracting semantically meaningful
regions from an image. The goal of segmentation algorithms is to accurately
assign object labels to each image location. However, image-noise, shortcomings
of algorithms, and image ambiguities cause uncertainty in label assignment.
Estimating the uncertainty in label assignment is important in multiple
application domains, such as segmenting tumors from medical images for
radiation treatment planning. One way to estimate these uncertainties is
through the computation of posteriors of Bayesian models, which is
computationally prohibitive for many practical applications. On the other hand,
most computationally efficient methods fail to estimate label uncertainty. We
therefore propose in this paper the Active Mean Fields (AMF) approach, a
technique based on Bayesian modeling that uses a mean-field approximation to
efficiently compute a segmentation and its corresponding uncertainty. Based on
a variational formulation, the resulting convex model combines any
label-likelihood measure with a prior on the length of the segmentation
boundary. A specific implementation of that model is the Chan-Vese segmentation
model (CV), in which the binary segmentation task is defined by a Gaussian
likelihood and a prior regularizing the length of the segmentation boundary.
Furthermore, the Euler-Lagrange equations derived from the AMF model are
equivalent to those of the popular Rudin-Osher-Fatemi (ROF) model for image
denoising. Solutions to the AMF model can thus be implemented by directly
utilizing highly-efficient ROF solvers on log-likelihood ratio fields. We
qualitatively assess the approach on synthetic data as well as on real natural
and medical images. For a quantitative evaluation, we apply our approach to the
icgbench dataset.
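For reference, the standard Rudin-Osher-Fatemi denoising problem mentioned above can be written as follows. Here $f$ is the input field on the image domain $\Omega$ (per the abstract, a log-likelihood ratio field in the AMF setting), $u$ is the smoothed output, and $\lambda > 0$ is a trade-off weight. The symbols and the placement of $\lambda$ follow the common textbook convention and are an assumption, since the abstract does not give the authors' exact formulation.

```latex
\min_{u} \; \int_{\Omega} \lvert \nabla u \rvert \, dx
  \;+\; \frac{\lambda}{2} \int_{\Omega} \bigl( u - f \bigr)^{2} \, dx
```

The first term is the total variation of $u$, which penalizes boundary length; the second keeps $u$ close to the data $f$.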
The Statistics of Density Peaks and the Column Density Distribution of the Lyman-Alpha Forest
We develop a method to calculate the column density distribution of the
Lyman-alpha forest for column densities in the range . The Zel'dovich approximation, with appropriate smoothing, is used to
compute the density and peculiar velocity fields. The effect of the latter on
absorption profiles is discussed and it is shown to have little effect on the
column density distribution. An approximation is introduced in which the column
density distribution is related to a statistic of density peaks (involving its
height and first and second derivatives along the line of sight) in real space.
We show that the slope of the column density distribution is determined by the
temperature-density relation as well as the power spectrum on scales . An expression relating the three is given. We
find very good agreement between the column density distribution obtained by
applying the Voigt-profile-fitting technique to the output of a full
hydrodynamic simulation and that obtained using our approximate method for a
test model. This formalism then is applied to study a group of CDM as well as
CHDM models. We show that the amplitude of the column density distribution
depends on the combination of parameters , which is not well-constrained by independent observations. The
slope of the distribution, on the other hand, can be used to distinguish
between different models: those with a smaller amplitude and a steeper slope of
the power spectrum on small scales give rise to steeper distributions, for the
range of column densities we study. Comparison with high resolution Keck data
is made.
Comment: match accepted version; discussion added: the effect of the shape of
the power spectrum on the slope of the column density distribution
Distributed Dominating Set Approximations beyond Planar Graphs
The Minimum Dominating Set (MDS) problem is one of the most fundamental and
challenging problems in distributed computing. While it is well-known that
minimum dominating sets cannot be approximated locally on general graphs, in
recent years there has been much progress on computing local approximations
on sparse graphs, and in particular planar graphs.
In this paper we study distributed and deterministic MDS approximation
algorithms for graph classes beyond planar graphs. In particular, we show that
existing approximation bounds for planar graphs can be lifted to bounded genus
graphs, and present (1) a local constant-time, constant-factor MDS
approximation algorithm and (2) a local -time
approximation scheme. Our main technical contribution is a new analysis of a
slightly modified variant of an existing algorithm by Lenzen et al.
Interestingly, unlike existing proofs for planar graphs, our analysis does not
rely on direct topological arguments.
Comment: arXiv admin note: substantial text overlap with arXiv:1602.0299
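To make the object of study concrete, here is a minimal sketch of what a dominating set is, together with a simple sequential greedy heuristic. This is not the distributed, local algorithm analyzed in the paper, nor the algorithm of Lenzen et al. It is a centralized illustration only, and the function names and the example graph are hypothetical.

```python
def is_dominating_set(adj, candidate):
    """Check that every vertex is in `candidate` or adjacent to a member.

    `adj` maps each vertex to the set of its neighbours (undirected graph).
    """
    candidate = set(candidate)
    return all(v in candidate or bool(candidate & adj[v]) for v in adj)


def greedy_dominating_set(adj):
    """Sequential greedy heuristic (assumption: not the paper's method).

    Repeatedly pick the vertex whose closed neighbourhood covers the most
    vertices that are not yet dominated.
    """
    undominated = set(adj)
    chosen = set()
    while undominated:
        best = max(
            sorted(adj),  # sorted for a deterministic tie-break
            key=lambda v: len(({v} | adj[v]) & undominated),
        )
        chosen.add(best)
        undominated -= {best} | adj[best]
    return chosen


# Hypothetical example: a path on five vertices, 0 - 1 - 2 - 3 - 4.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
solution = greedy_dominating_set(path)
print(sorted(solution), is_dominating_set(path, solution))
```

The greedy rule needs global knowledge of which vertices remain undominated, which is exactly what a constant-time local algorithm cannot rely on.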