Unique Perfect Phylogeny Characterizations via Uniquely Representable Chordal Graphs
The perfect phylogeny problem is a classic problem in computational biology,
where we seek an unrooted phylogeny that is compatible with a set of
qualitative characters. Such a tree exists precisely when an intersection graph
associated with the character set, called the partition intersection graph, can
be triangulated using a restricted set of fill edges. Semple and Steel used the
partition intersection graph to characterize when a character set has a unique
perfect phylogeny. Bordewich, Huber, and Semple showed how to use the partition
intersection graph to find a maximum compatible set of characters. In this
paper, we build on these results, characterizing when a unique perfect
phylogeny exists for a subset of partial characters. Our characterization is
stated in terms of minimal triangulations of the partition intersection graph
that are uniquely representable, also known as ur-chordal graphs. Our
characterization is motivated by the structure of ur-chordal graphs, and the
fact that the block structure of minimal triangulations is mirrored in the
graph that has been triangulated.
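For readers unfamiliar with the partition intersection graph mentioned above, here is a minimal sketch of how it can be built, assuming the standard definition: one vertex per (character, state) cell, and an edge between two cells whenever they share a taxon. The function name `partition_intersection_graph` and the dictionary-based input format are hypothetical choices for this illustration and do not come from the paper.

```python
from itertools import combinations


def partition_intersection_graph(characters):
    """Build the partition intersection graph of a character set.

    `characters` maps a character name to a dict {taxon: state}.
    Partial characters simply omit the taxa for which no state is known.
    Returns (cells, edges): cells maps each (character, state) vertex to
    its set of taxa; edges is a set of vertex pairs whose cells intersect.
    """
    cells = {}
    for name, assignment in characters.items():
        for taxon, state in assignment.items():
            cells.setdefault((name, state), set()).add(taxon)

    edges = set()
    for u, v in combinations(sorted(cells), 2):
        # Cells of the same character are disjoint, so only cells of
        # different characters can be adjacent.
        if u[0] != v[0] and cells[u] & cells[v]:
            edges.add((u, v))
    return cells, edges


# Hypothetical toy input: two binary characters on taxa a, b, c.
toy = {
    "c1": {"a": 0, "b": 0, "c": 1},
    "c2": {"a": 0, "b": 1, "c": 1},
}
cells, edges = partition_intersection_graph(toy)
for edge in sorted(edges):
    print(edge)
```

The sketch only constructs the graph; deciding whether it admits a suitable triangulation, which is the substance of the characterization, is not attempted here.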
Contributions to computational phylogenetics and algorithmic self-assembly
This dissertation addresses some of the algorithmic and combinatorial problems at the interface between biology and computation.
In particular, it focuses on problems in both computational phylogenetics, an area of study in which computation is used to better understand evolutionary relationships, and algorithmic self-assembly, an area of study in which biological processes are used to perform computation.
The first set of results investigates inferring phylogenetic trees from multi-state character data. We give a novel characterization of when a set of three-state characters has a perfect phylogeny and make progress on a long-standing conjecture regarding the compatibility of multi-state characters.
The next set of results investigates inferring phylogenetic supertrees from collections of smaller input trees when the input trees do not fully agree on the relative positions of the taxa. Two approaches to dealing with such conflicting input trees are considered. The first is to contract a set of edges in the input trees so that the resulting trees have an agreement supertree. The second is to remove a set of taxa from the input trees so that the resulting trees have an agreement supertree. We give fixed-parameter tractable algorithms for both approaches.
We then turn to the algorithmic self-assembly of fractal structures from DNA tiles and investigate approximating the Sierpinski triangle and the Sierpinski carpet with strict self-assembly. We prove tight bounds on approximating the Sierpinski triangle and exhibit a class of fractals that are generalizations of the Sierpinski carpet that can approximately self-assemble.
We conclude by discussing some ideas for further research.
Fixed parameter algorithms for compatible and agreement supertree problems
Biologists represent the evolutionary history of species through phylogenetic trees. Leaves of a phylogenetic tree represent the species and internal vertices represent the extinct ancestors. Given a collection of input phylogenetic trees, a common problem in computational biology is to build a supertree that captures the evolutionary history of all the species in the input trees and is consistent with each of the input trees. In this document we study the tree compatibility and agreement supertree problems.
The tree compatibility problem is NP-complete but has been shown to be fixed-parameter tractable when parametrized by the number of input trees. We characterize the compatible supertree problem in terms of triangulations of a structure called the display graph. We also give an alternative characterization in terms of cuts of the display graph. We show how these characterizations are related to the characterization given in terms of triangulations of the edge label intersection graph. We then give a characterization of the agreement supertree problem.
In real world data, consistent supertrees do not always exist. Inconsistencies can be dealt with by contraction of edges or removal of taxa. The agreement supertree edge contraction (AST-EC) problem asks if a collection of k rooted trees can be made to agree by contraction of at most p edges. Similarly, the agreement supertree taxon removal (AST-TR) problem asks if a collection of k rooted trees can be made to agree by removal of at most p taxa. We give fixed parameter algorithms for both cases when parametrized by k and p.
We study the long-standing conjecture on the perfect phylogeny problem: there exists a function f(r) such that a given collection C of r-state characters is compatible if and only if every subset of f(r) characters of C is compatible. We show that for r ≥ 2, f(r) ≥ ⌈r/2⌉ · ⌊r/2⌋ + 1.
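To make the stated lower bound concrete, the following minimal sketch evaluates ceil(r/2) * floor(r/2) + 1 for a few small values of r. The function name `lower_bound_f` is hypothetical and used only for this illustration; the code computes the bound from the abstract and says nothing about the true value of f(r).

```python
def lower_bound_f(r: int) -> int:
    """Return ceil(r/2) * floor(r/2) + 1, the lower bound on f(r) stated above."""
    if r < 2:
        raise ValueError("the bound is stated for r >= 2")
    # (r + 1) // 2 is ceil(r/2) and r // 2 is floor(r/2) for integers r.
    return ((r + 1) // 2) * (r // 2) + 1


for r in range(2, 7):
    print(f"r = {r}: f(r) >= {lower_bound_f(r)}")
```

For r = 2 through 6 the bound evaluates to 2, 3, 5, 7 and 10, so it grows quadratically in the number of states.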
Explaining Evolution via Constrained Persistent Perfect Phylogeny
BACKGROUND:
The perfect phylogeny is a widely used model in phylogenetics, since it provides an efficient basic procedure for representing the evolution of genomic binary characters in several frameworks, for example in haplotype inference. The model, which is conceptually the simplest, is based on the infinite sites assumption, that is, no character can mutate more than once in the whole tree. A main open problem regarding the model is finding generalizations that retain the computational tractability of the original model but are more flexible in modeling biological data when the infinite sites assumption is violated because of, e.g., back mutations. A special case of back mutations that has been considered in the study of the evolution of protein domains (where a domain is acquired and then lost) is persistency, that is, the fact that a character is allowed to return to the ancestral state. In this model characters can be gained and lost at most once. In this paper we consider the computational problem of explaining binary data by the Persistent Perfect Phylogeny model (referred to as PPP), and for this purpose we investigate the problem of reconstructing an evolution where some constraints are imposed on the paths of the tree.
RESULTS:
We define a natural generalization of the PPP problem obtained by requiring that for some pairs (character, species), neither the species nor any of its ancestors can have the character. In other words, some characters cannot be persistent for some species. This new problem is called Constrained PPP (CPPP). Based on a graph formulation of the CPPP problem, we are able to provide a polynomial time solution for the CPPP problem for matrices whose conflict graph has no edges. Using this result, we develop a parameterized algorithm for solving the CPPP problem where the parameter is the number of characters.
CONCLUSIONS:
A preliminary experimental analysis shows that the constrained persistent perfect phylogeny model can efficiently explain data that do not conform to the classical perfect phylogeny model.
Approximately counting locally-optimal structures
A locally-optimal structure is a combinatorial structure such as a maximal
independent set that cannot be improved by certain (greedy) local moves, even
though it may not be globally optimal. It is trivial to construct an
independent set in a graph. It is easy to (greedily) construct a maximal
independent set. However, it is NP-hard to construct a globally-optimal
(maximum) independent set. In general, constructing a locally-optimal structure
is somewhat more difficult than constructing an arbitrary structure, and
constructing a globally-optimal structure is more difficult than constructing a
locally-optimal structure. The same situation arises with listing. The
differences between the problems become obscured when we move from listing to
counting because nearly everything is #P-complete. However, we highlight an
interesting phenomenon that arises in approximate counting, where the situation
is apparently reversed. Specifically, we show that counting maximal independent
sets is complete for #P with respect to approximation-preserving reductions,
whereas counting all independent sets, or counting maximum independent sets is
complete for an apparently smaller class, which has a
prominent role in the complexity of approximate counting. Motivated by the
difficulty of approximately counting maximal independent sets in bipartite
graphs, we also study the problem of approximately counting other
locally-optimal structures that arise in algorithmic applications, particularly
problems involving minimal separators and minimal edge separators. Minimal
separators have applications via fixed-parameter-tractable algorithms for
constructing triangulations and phylogenetic trees. Although exact
(exponential-time) algorithms exist for listing these structures, we show that
the counting problems are #P-complete with respect to both exact and
approximation-preserving reductions.
Comment: Accepted to JCSS; preliminary version accepted to ICALP 2015 (Track A).
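The remark above that a maximal independent set is easy to construct greedily can be sketched as follows. This is a minimal illustration, assuming the graph is given as an adjacency dictionary; the function names are hypothetical and are not taken from the paper. The star-graph example shows that the greedy result is locally optimal (maximal) without being globally optimal (maximum).

```python
def greedy_maximal_independent_set(adj):
    """Greedily build a maximal independent set.

    `adj` maps each vertex to an iterable of its neighbours (undirected graph).
    Vertices are scanned in the dictionary's iteration order.
    """
    chosen = set()
    blocked = set()  # vertices that are chosen or adjacent to a chosen vertex
    for v in adj:
        if v not in blocked:
            chosen.add(v)
            blocked.add(v)
            blocked.update(adj[v])
    return chosen


def is_maximal_independent(adj, s):
    """Check that `s` is independent and that no vertex can be added to it."""
    independent = all(u not in s for v in s for u in adj[v])
    maximal = all(v in s or any(u in s for u in adj[v]) for v in adj)
    return independent and maximal


# Star graph: centre "c" joined to leaves "x", "y", "z".
star = {"c": ["x", "y", "z"], "x": ["c"], "y": ["c"], "z": ["c"]}
result = greedy_maximal_independent_set(star)
print(result, is_maximal_independent(star, result))
```

Scanning the centre first yields the single-vertex set containing "c", which cannot be extended, whereas the three leaves form a larger (maximum) independent set.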
Computer Science and Technology Series : XV Argentine Congress of Computer Science. Selected papers
CACIC'09 was the fifteenth Congress in the CACIC series. It was organized by the School of Engineering of the National University of Jujuy. The Congress included 9 Workshops with 130 accepted papers, 1 main Conference, 4 invited tutorials, various meetings related to Computer Science Education (Professors, PhD students, Curricula) and an International School with 5 courses. CACIC 2009 was organized following the traditional Congress format, with 9 Workshops covering a diversity of dimensions of Computer Science Research. Each topic was supervised by a committee of three chairs from different Universities.
The call for papers attracted a total of 267 submissions. An average of 2.7 review reports were collected for each paper, for a grand total of 720 review reports that involved about 300 different reviewers.
A total of 130 full papers were accepted and 20 of them were selected for this book.
Red de Universidades con Carreras en Informática (RedUNCI)
Proceedings of the Workshop on Algorithmic Aspects of Advanced Programming Languages: WAAAPL'99: Paris, France, September 30, 1999
The first Workshop on Algorithmic Aspects of Advanced Programming Languages was held on September 30, 1999, in Paris, France, in conjunction with the PLI'99 conferences and workshops. The choice of programming language has a huge effect on the algorithms and data structures that are to be implemented in that language. Traditionally, algorithms and data structures have been studied in the context of imperative languages. This workshop considers the algorithmic implications of choosing an advanced functional or logic programming language instead. A total of eight papers were selected for presentation at the workshop, together with an invited lecture by Robert Harper. We would like to thank Didier Rémy, general chair of PLI'99, for his assistance in organizing this workshop.