Algorithms for weighted multidimensional search and perfect phylogeny
This dissertation is a collection of papers from two independent areas: convex optimization problems in R^d and the construction of evolutionary trees. The paper on convex optimization problems in R^d gives improved algorithms for solving the Lagrangian duals of problems that have both of the following properties. First, in the absence of the bad constraints, the problems can be solved in strongly polynomial time by combinatorial algorithms. Second, the number of bad constraints is fixed. As part of our solution to these problems, we extend Cole's circuit simulation approach and develop a weighted version of Megiddo's multidimensional search technique. The papers on evolutionary tree construction deal with the perfect phylogeny problem, where species are specified by a set of characters and each character can occur in a species in one of a fixed number of states. This problem is known to be NP-complete. The dissertation contains the following results on the perfect phylogeny problem: (1) A linear time algorithm when all the characters have two states. (2) A polynomial time algorithm when the number of character states is fixed. (3) A polynomial time algorithm when the number of characters is fixed
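To make the two-state case concrete, here is a minimal sketch of the classical pairwise (four-gamete) test for binary characters. It is not the linear-time algorithm from the dissertation: it compares every pair of characters and so runs in quadratic time in the number of characters. The function names and the 0/1 matrix layout are assumptions chosen for illustration.

```python
from itertools import combinations


def pair_is_compatible(col_a, col_b):
    """Four-gamete test: two binary characters conflict exactly when
    all four state combinations 00, 01, 10 and 11 occur among the species."""
    gametes = set(zip(col_a, col_b))
    return len(gametes) < 4


def has_perfect_phylogeny(matrix):
    """matrix[i][j] is the 0/1 state of character j in species i (assumed layout).

    A set of two-state characters admits an (unrooted) perfect phylogeny
    if and only if every pair of characters passes the four-gamete test.
    """
    if not matrix:
        return True
    columns = list(zip(*matrix))  # one tuple per character
    return all(pair_is_compatible(a, b) for a, b in combinations(columns, 2))


# Hypothetical example data: rows are species, columns are characters.
compatible = [[1, 1, 0], [1, 0, 0], [0, 0, 1], [0, 0, 0]]
conflicting = [[0, 0], [0, 1], [1, 0], [1, 1]]
```

With these inputs, `has_perfect_phylogeny(compatible)` holds, while `has_perfect_phylogeny(conflicting)` fails because the two characters exhibit all four gametes.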
Potential Maximal Clique Algorithms for Perfect Phylogeny Problems
Kloks, Kratsch, and Spinrad showed how treewidth and minimum-fill, NP-hard
combinatorial optimization problems related to minimal triangulations, are
broken into subproblems by block subgraphs defined by minimal separators. These
ideas were expanded on by Bouchitté and Todinca, who used potential maximal
cliques to solve these problems using a dynamic programming approach in time
polynomial in the number of minimal separators of a graph. It is known that
solutions to the perfect phylogeny problem, maximum compatibility problem, and
unique perfect phylogeny problem are characterized by minimal triangulations of
the partition intersection graph. In this paper, we show that techniques
similar to those proposed by Bouchitté and Todinca can be used to solve the
perfect phylogeny problem with missing data, the two-state maximum
compatibility problem with missing data, and the unique perfect phylogeny
problem with missing data in time polynomial in the number of minimal
separators of the partition intersection graph
A Simple Characterization of the Minimal Obstruction Sets for Three-State Perfect Phylogenies
Lam, Gusfield, and Sridhar (2009) showed that a set of three-state characters
has a perfect phylogeny if and only if every subset of three characters has a
perfect phylogeny. They also gave a complete characterization of the sets of
three three-state characters that do not have a perfect phylogeny. However, it
is not clear from their characterization how to find a subset of three
characters that does not have a perfect phylogeny without testing all triples
of characters. In this note, we build upon their result by giving a simple
characterization of when a set of three-state characters does not have a
perfect phylogeny that can be inferred from testing all pairs of characters
Improved Lower Bounds on the Compatibility of Multi-State Characters
We study a long standing conjecture on the necessary and sufficient
conditions for the compatibility of multi-state characters: There exists a
function $f(r)$ such that, for any set $C$ of $r$-state characters, $C$ is
compatible if and only if every subset of $f(r)$ characters of $C$ is
compatible. We show that for every $r \ge 2$, there exists an incompatible set
$C$ of $\lfloor r/2 \rfloor \cdot \lceil r/2 \rceil + 1$ $r$-state
characters such that every proper subset of $C$ is compatible. Thus,
$f(r) \ge \lfloor r/2 \rfloor \cdot \lceil r/2 \rceil + 1$ for every $r \ge 2$.
This improves the previous lower bound of $f(r) \ge r$ given by Meacham (1983),
and generalizes the construction showing that $f(4) \ge 5$ given by Habib and
To (2011). We prove our result via a result on quartet compatibility that may
be of independent interest: For every integer $n \ge 4$, there exists an
incompatible set $Q$ of
$\lfloor (n-2)/2 \rfloor \cdot \lceil (n-2)/2 \rceil + 1$ quartets over $n$
labels such that every proper subset of $Q$ is compatible. We contrast this
with a result on the compatibility of triplets: For every $n \ge 3$, if $R$ is
an incompatible set of more than $n-1$ triplets over $n$ labels, then some
proper subset of $R$ is incompatible. We show this upper bound is tight by
exhibiting, for every $n \ge 3$, a set $R$ of $n-1$ triplets over $n$ taxa such
that $R$ is incompatible, but every proper subset of $R$ is compatible
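The triplet statement can be checked on small inputs with the classical BUILD procedure of Aho, Sagiv, Szymanski, and Ullman for rooted triplet compatibility. The sketch below is an illustration only, not the construction from the paper; the function names and the tuple encoding `(a, b, c)` for the triplet ab|c (a and b are closer to each other than to c) are assumptions.

```python
def triplets_compatible(labels, triplets):
    """BUILD-style test: is there a rooted tree displaying every triplet?

    Each triplet (a, b, c) encodes ab|c (assumed encoding).
    """
    labels = set(labels)
    if len(labels) <= 2:
        return True
    # Only triplets whose three labels all lie in the current block matter.
    inside = [t for t in triplets if set(t) <= labels]

    # Union-find: join a and b for every triplet ab|c.
    parent = {x: x for x in labels}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, _ in inside:
        parent[find(a)] = find(b)

    blocks = {}
    for x in labels:
        blocks.setdefault(find(x), set()).add(x)

    # A single block means the labels cannot be split below a root.
    if len(blocks) == 1:
        return False
    return all(triplets_compatible(block, inside) for block in blocks.values())


def is_minimally_incompatible(labels, triplets):
    """True if the set is incompatible but dropping any one triplet fixes it.

    Checking single removals suffices: subsets of a compatible set are compatible.
    """
    triplets = list(triplets)
    if triplets_compatible(labels, triplets):
        return False
    return all(
        triplets_compatible(labels, triplets[:i] + triplets[i + 1:])
        for i in range(len(triplets))
    )
```

For three labels, the two triplets ab|c and ac|b form an incompatible set whose proper subsets are all compatible, so `is_minimally_incompatible("abc", [("a", "b", "c"), ("a", "c", "b")])` holds. This is a hand-picked instance with two triplets over three labels, not the general family exhibited in the paper.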
Phylogenetic Trees and Their Analysis
Determining the best possible evolutionary history, the lowest-cost phylogenetic tree, to fit a given set of taxa and character sequences using maximum parsimony is an active area of research due to its underlying importance in understanding biological processes. As several steps in this process are NP-Hard when using popular, biologically-motivated optimality criteria, significant amounts of resources are dedicated both to heuristics and to making exact methods more computationally tractable. We examine both phylogenetic data and the structure of the search space in order to suggest methods to reduce the number of possible trees that must be examined to find an exact solution for any given set of taxa and associated character data. Our work on four related problems combines theoretical insight with empirical study to improve searching of the tree space. First, we show that there is a Hamiltonian path through tree space for the most common tree metrics, answering Bryant's Challenge for the minimal such path. We next examine the topology of the search space under various metrics, showing that some metrics have local maxima and minima even with perfect data, while some others do not. We further characterize conditions for which sequences simulated under the Jukes-Cantor model of evolution yield well-behaved search spaces. Next, we reduce the search space needed for an exact solution by splitting the set of characters into mutually-incompatible subsets of compatible characters, building trees based on the perfect phylogenies implied by these sets, and then searching in the neighborhoods of these trees. We validate this work empirically. Finally, we compare two approaches to the generalized tree alignment problem, or GTAP: Sequence alignment followed by tree search vs. Direct Optimization, on both biological and simulated data
Inference of single-cell phylogenies from lineage tracing data using Cassiopeia.
The pairing of CRISPR/Cas9-based gene editing with massively parallel single-cell readouts now enables large-scale lineage tracing. However, the rapid growth in complexity of data from these assays has outpaced our ability to accurately infer phylogenetic relationships. First, we introduce Cassiopeia, a suite of scalable maximum parsimony approaches for tree reconstruction. Second, we provide a simulation framework for evaluating algorithms and exploring lineage tracer design principles. Finally, we generate the most complex experimental lineage tracing dataset to date, 34,557 human cells continuously traced over 15 generations, and use it for benchmarking phylogenetic inference approaches. We show that Cassiopeia outperforms traditional methods by several metrics and under a wide variety of parameter regimes, and provide insight into the principles for the design of improved Cas9-enabled recorders. Together, these should broadly enable large-scale mammalian lineage tracing efforts. Cassiopeia and its benchmarking resources are publicly available at www.github.com/YosefLab/Cassiopeia
Tracing evolutionary links between species
The idea that all life on earth traces back to a common beginning dates back
at least to Charles Darwin's Origin of Species. Ever since, biologists
have tried to piece together parts of this 'tree of life' based on what we can
observe today: fossils, and the evolutionary signal that is present in the
genomes and phenotypes of different organisms. Mathematics has played a key
role in helping transform genetic data into phylogenetic (evolutionary) trees
and networks. Here, I will explain some of the central concepts and basic
results in phylogenetics, which benefit from several branches of mathematics,
including combinatorics, probability and algebra.
Comment: 18 pages, 6 figures (Invited review paper (draft version) for AMM
- …