A practical approximation algorithm for solving massive instances of hybridization number for binary and nonbinary trees

Authors: Kelk Steven; Lekić Nela; Scornavacca Celine; van Iersel Leo
Publication date: 01/01/2014

Reticulate events play an important role in determining evolutionary relationships. Computing the minimum number of such events needed to explain discordance between two phylogenetic trees is a hard computational problem. Even for binary trees, exact solvers struggle to solve instances with reticulation number larger than 40-50. Here we present CycleKiller and NonbinaryCycleKiller, the first methods to produce solutions verifiably close to optimality for instances with hundreds or even thousands of reticulations. Using simulations, we demonstrate that these algorithms run quickly for large and difficult instances, producing solutions that are very close to optimality. As a spin-off from our simulations we also present TerminusEst, which is the fastest exact method currently available that can handle nonbinary trees: this is used to measure the accuracy of the NonbinaryCycleKiller algorithm. All three methods are based on extensions of previous theoretical work and are publicly available. We also apply our methods to real data.

arXiv.org e-Print Archive

Maastricht University Research Portal

Springer - Publisher Connector

CWI's Institutional Repository

A Duality Based 2-Approximation Algorithm for Maximum Agreement Forest

Authors: Schalekamp Frans; van der Ster Suzanne; van Zuylen Anke
Publication date: 01/01/2016

We give a 2-approximation algorithm for the Maximum Agreement Forest problem on two rooted binary trees. This NP-hard problem has been studied extensively in the past two decades, since it can be used to compute the Subtree Prune-and-Regraft (SPR) distance between two phylogenetic trees. Our result improves on the very recent 2.5-approximation algorithm due to Shi, Feng, You and Wang (2015). Our algorithm is the first approximation algorithm for this problem that uses LP duality in its analysis.

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

On unrooted and root-uncertain variants of several well-known phylogenetic network problems

Authors: Boes Olivier; Kelk Steven; Stamoulis Georgios; Stougie Leen; van Iersel Leo
Publication date: 01/01/2016

The hybridization number problem requires us to embed a set of binary rooted phylogenetic trees into a binary rooted phylogenetic network such that the number of nodes with indegree two is minimized. However, from a biological point of view, accurately inferring the root location in a phylogenetic tree is notoriously difficult, and poor root placement can artificially inflate the hybridization number. To this end we study a number of relaxed variants of this problem. We start by showing that the fundamental problem of determining whether an unrooted phylogenetic network displays (i.e., embeds) an unrooted phylogenetic tree is NP-hard. On the positive side we show that this problem is FPT in reticulation number. In the rooted case the corresponding FPT result is trivial, but here we require more subtle argumentation. Next we show that the hybridization number problem for unrooted networks (when given two unrooted trees) is equivalent to the problem of computing the Tree Bisection and Reconnect (TBR) distance of the two unrooted trees. In the third part of the paper we consider the "root uncertain" variant of hybridization number. Here we are free to choose the root location in each of a set of unrooted input trees such that the hybridization number of the resulting rooted trees is minimized. On the negative side we show that this problem is APX-hard. On the positive side, we show that the problem is FPT in the hybridization number, via kernelization, for any number of input trees. Comment: 28 pages, 8 figures.

arXiv.org e-Print Archive

Maastricht University Research Portal

VU Research Portal

CWI's Institutional Repository

INRIA a CCSD electronic archive server

A Duality Based 2-Approximation Algorithm for Maximum Agreement Forest

Authors: Olver Neil; Schalekamp Frans; Stougie Leen; van der Ster Suzanne; van Zuylen Anke
Publication date: 01/01/2018

We give a 2-approximation algorithm for the Maximum Agreement Forest problem on two rooted binary trees. This NP-hard problem has been studied extensively in the past two decades, since it can be used to compute the rooted Subtree Prune-and-Regraft (rSPR) distance between two phylogenetic trees. Our algorithm is combinatorial and its running time is quadratic in the input size. To prove the approximation guarantee, we construct a feasible dual solution for a novel linear programming formulation. In addition, we show this linear program is stronger than previously known formulations, and we give a compact formulation, showing that it can be solved in polynomial time.

arXiv.org e-Print Archive

VU Research Portal

CWI's Institutional Repository

Active Mean Fields for Probabilistic Image Segmentation: Connections with Chan-Vese and Rudin-Osher-Fatemi Models

Authors: Janoos Firdaus; Niethammer Marc; Pohl Kilian M.; Wells III William M.
Publication date: 04/10/2016

Segmentation is a fundamental task for extracting semantically meaningful regions from an image. The goal of segmentation algorithms is to accurately assign object labels to each image location. However, image noise, shortcomings of algorithms, and image ambiguities cause uncertainty in label assignment. Estimating the uncertainty in label assignment is important in multiple application domains, such as segmenting tumors from medical images for radiation treatment planning. One way to estimate these uncertainties is through the computation of posteriors of Bayesian models, which is computationally prohibitive for many practical applications. On the other hand, most computationally efficient methods fail to estimate label uncertainty. We therefore propose in this paper the Active Mean Fields (AMF) approach, a technique based on Bayesian modeling that uses a mean-field approximation to efficiently compute a segmentation and its corresponding uncertainty. Based on a variational formulation, the resulting convex model combines any label-likelihood measure with a prior on the length of the segmentation boundary. A specific implementation of that model is the Chan-Vese segmentation model (CV), in which the binary segmentation task is defined by a Gaussian likelihood and a prior regularizing the length of the segmentation boundary. Furthermore, the Euler-Lagrange equations derived from the AMF model are equivalent to those of the popular Rudin-Osher-Fatemi (ROF) model for image denoising. Solutions to the AMF model can thus be implemented by directly utilizing highly efficient ROF solvers on log-likelihood ratio fields. We qualitatively assess the approach on synthetic data as well as on real natural and medical images. For a quantitative evaluation, we apply our approach to the icgbench dataset.

arXiv.org e-Print Archive

Carolina Digital Repository

The Statistics of Density Peaks and the Column Density Distribution of the Lyman-Alpha Forest

Authors: Bagla J. S.; Bi H.; Coles P.; Cooke A. J.; Cristiani S.; Doroshkevich A. G.; Hernquist L.; Hu E.; Kofman L.; Lu L.; Matarrese S.; McGill C.; Melott A. L.; Petitjean P.; Rees M.; Rugers M.; Zeldovich Ya. B.
Publication venue: University of Chicago Press
Publication date: 24/08/1996

We develop a method to calculate the column density distribution of the Lyman-alpha forest for column densities in the range $10^{12.5} - 10^{14.5}\,\mathrm{cm}^{-2}$. The Zel'dovich approximation, with appropriate smoothing, is used to compute the density and peculiar velocity fields. The effect of the latter on absorption profiles is discussed and it is shown to have little effect on the column density distribution. An approximation is introduced in which the column density distribution is related to a statistic of density peaks (involving its height and first and second derivatives along the line of sight) in real space. We show that the slope of the column density distribution is determined by the temperature-density relation as well as the power spectrum on scales $2\,h\,\mathrm{Mpc}^{-1} < k < 20\,h\,\mathrm{Mpc}^{-1}$. An expression relating the three is given. We find very good agreement between the column density distribution obtained by applying the Voigt-profile-fitting technique to the output of a full hydrodynamic simulation and that obtained using our approximate method for a test model. This formalism is then applied to study a group of CDM as well as CHDM models. We show that the amplitude of the column density distribution depends on the combination of parameters $(\Omega_b h^2)^2 T_0^{-0.7} J_{HI}^{-1}$, which is not well constrained by independent observations. The slope of the distribution, on the other hand, can be used to distinguish between different models: those with a smaller amplitude and a steeper slope of the power spectrum on small scales give rise to steeper distributions, for the range of column densities we study. Comparison with high resolution Keck data is made. Comment: match accepted version; discussion added: the effect of the shape of the power spectrum on the slope of the column density distribution.
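The parameter combination quoted in this abstract can be explored numerically. The sketch below is only an illustration: the function name, the numerical values, and the units are assumptions rather than anything taken from the paper, and the overall normalization constant is left out. It shows why the amplitude alone cannot separate the individual parameters: different parameter choices can give the same value of the combination.

```python
import math


def amplitude_factor(omega_b_h2, t0, j_hi):
    """Return (Omega_b h^2)^2 * T_0^-0.7 * J_HI^-1.

    This is the combination quoted in the abstract, up to an unspecified
    normalization constant. All inputs are in arbitrary (assumed) units.
    """
    return omega_b_h2 ** 2 * t0 ** -0.7 / j_hi


# Illustrative baseline values (assumptions, not taken from the paper).
base = amplitude_factor(0.0125, 1.0e4, 1.0)

# Doubling the baryon density while quadrupling the ionizing background
# leaves the combination unchanged, so the amplitude cannot tell them apart.
degenerate = amplitude_factor(0.025, 1.0e4, 4.0)

# Doubling the temperature alone rescales the combination by 2 ** -0.7.
hotter = amplitude_factor(0.0125, 2.0e4, 1.0)
ratio_hotter = hotter / base

print(math.isclose(base, degenerate), ratio_hotter)
```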

arXiv.org e-Print Archive

Crossref

CERN Document Server

Distributed Dominating Set Approximations beyond Planar Graphs

Authors: Amiri Saeed Akhoondian; Schmid Stefan; Siebertz Sebastian
Publication date: 01/01/2019

The Minimum Dominating Set (MDS) problem is one of the most fundamental and challenging problems in distributed computing. While it is well known that minimum dominating sets cannot be approximated locally on general graphs, in recent years there has been much progress on computing local approximations on sparse graphs, and in particular planar graphs. In this paper we study distributed and deterministic MDS approximation algorithms for graph classes beyond planar graphs. In particular, we show that existing approximation bounds for planar graphs can be lifted to bounded genus graphs, and present (1) a local constant-time, constant-factor MDS approximation algorithm and (2) a local $\mathcal{O}(\log^* n)$-time approximation scheme. Our main technical contribution is a new analysis of a slightly modified variant of an existing algorithm by Lenzen et al. Interestingly, unlike existing proofs for planar graphs, our analysis does not rely on direct topological arguments. Comment: arXiv admin note: substantial text overlap with arXiv:1602.0299
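To make the notion of a constant-factor MDS approximation concrete, the following minimal sketch checks the standard definition of a dominating set (every vertex is in the set or adjacent to a member) and compares a simple sequential greedy heuristic against a brute-force optimum on a small graph. The greedy routine is a centralized stand-in chosen for illustration; it is not the distributed algorithm of the paper or of Lenzen et al., and all function names and the example graph are assumptions.

```python
from itertools import combinations


def is_dominating_set(adj, candidate):
    """True if every vertex is in `candidate` or has a neighbour in it."""
    chosen = set(candidate)
    return all(v in chosen or chosen & adj[v] for v in adj)


def minimum_dominating_set(adj):
    """Brute-force optimum; exponential time, for tiny graphs only."""
    vertices = sorted(adj)
    for size in range(len(vertices) + 1):
        for subset in combinations(vertices, size):
            if is_dominating_set(adj, subset):
                return set(subset)


def greedy_dominating_set(adj):
    """Sequential greedy: repeatedly pick the vertex that dominates the
    most not-yet-dominated vertices (ties go to the smallest label)."""
    undominated = set(adj)
    chosen = set()
    while undominated:
        best = max(sorted(adj), key=lambda v: len(({v} | adj[v]) & undominated))
        chosen.add(best)
        undominated -= {best} | adj[best]
    return chosen


# Example graph (an assumption for illustration): a cycle on six vertices.
n = 6
cycle = {v: {(v - 1) % n, (v + 1) % n} for v in range(n)}

optimum = minimum_dominating_set(cycle)
approx = greedy_dominating_set(cycle)
ratio = len(approx) / len(optimum)  # approximation ratio on this instance
print(sorted(optimum), sorted(approx), ratio)
```

The ratio computed here is a property of one instance only; an approximation guarantee of the kind discussed in the abstract bounds this ratio over every graph in the class.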

arXiv.org e-Print Archive

MPG.PuRe