    Extracting Conflict-free Information from Multi-labeled Trees

    A multi-labeled tree, or MUL-tree, is a phylogenetic tree where two or more leaves share a label, e.g., a species name. A MUL-tree can imply multiple conflicting phylogenetic relationships for the same set of taxa, but can also contain conflict-free information that is of interest and yet is not obvious. We define the information content of a MUL-tree T as the set of all conflict-free quartet topologies implied by T, and define the maximal reduced form of T as the smallest tree that can be obtained from T by pruning leaves and contracting edges while retaining the same information content. We show that any two MUL-trees with the same information content exhibit the same reduced form. This introduces an equivalence relation in MUL-trees with potential applications to comparing MUL-trees. We present an efficient algorithm to reduce a MUL-tree to its maximally reduced form and evaluate its performance on empirical datasets in terms of both quality of the reduced tree and the degree of data reduction achieved. Comment: Submitted in Workshop on Algorithms in Bioinformatics 2012 (http://algo12.fri.uni-lj.si/?file=wabi

    The multiple gene duplication problem revisited

    Motivation: Deciphering the location of gene duplications and multiple gene duplication episodes on the Tree of Life is fundamental to understanding the way gene families and genomes evolve. The multiple gene duplication problem provides a framework for placing gene duplication events onto nodes of a given species tree, and detecting episodes of multiple gene duplication. One version of the multiple gene duplication problem was defined by Guigó et al. in 1996. Several heuristic solutions have since been proposed for this problem, but no exact algorithms were known

    Generating functions for multi-labeled trees

    Multi-labeled trees are a generalization of phylogenetic trees that are used, for example, in the study of gene versus species evolution and as the basis for phylogenetic network construction. Unlike phylogenetic trees, in a leaf-multi-labeled tree it is possible to label more than one leaf by the same element of the underlying label set. In this paper we derive formulae for generating functions of leaf-multi-labeled trees and use these to derive recursions for counting such trees. In particular, we prove results which generalize previous theorems by Harding on so-called tree-shapes, and by Otter on relating the number of rooted and unrooted phylogenetic trees
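
As a rough illustration of what a "recursion for counting such trees" looks like, the sketch below counts unlabeled rooted binary tree shapes by number of leaves (the Wedderburn-Etherington numbers). It is not the recursion derived in the paper: it assumes, for illustration only, that "tree-shapes" means unlabeled rooted binary trees, and it does not handle multi-labeled leaves. The function name `rooted_binary_tree_shapes` is hypothetical.

```python
def rooted_binary_tree_shapes(max_leaves: int) -> list[int]:
    """Return [s(1), ..., s(max_leaves)], where s(n) is the number of
    unlabeled rooted binary tree shapes with n leaves.

    Assumption: this is the classical simplified case, not the
    multi-labeled counting derived in the abstract above.
    """
    if max_leaves < 1:
        return []
    counts = [0] * (max_leaves + 1)  # counts[n] holds s(n); index 0 is unused
    counts[1] = 1  # a single leaf is the only shape with one leaf
    for n in range(2, max_leaves + 1):
        total = 0
        # The root splits the n leaves into two subtrees of sizes i and n - i.
        # Left/right order does not matter, so only take i < n - i here.
        for i in range(1, (n - 1) // 2 + 1):
            total += counts[i] * counts[n - i]
        # Equal split (only when n is even): choose an unordered pair of
        # shapes, repetition allowed, from the s(n/2) shapes available.
        if n % 2 == 0:
            half = counts[n // 2]
            total += half * (half + 1) // 2
        counts[n] = total
    return counts[1:]


if __name__ == "__main__":
    print(rooted_binary_tree_shapes(10))
    # prints [1, 1, 1, 2, 3, 6, 11, 23, 46, 98]
```

The equal-split term is what distinguishes counting shapes from counting ordered trees: two subtrees of the same size are interchangeable, so pairs of shapes are counted without regard to order.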

    Completeness Results for Parameterized Space Classes

    The parameterized complexity of a problem is considered "settled" once it has been shown to lie in FPT or to be complete for a class in the W-hierarchy or a similar parameterized hierarchy. Several natural parameterized problems have, however, resisted such a classification. At least in some cases, the reason is that upper and lower bounds for their parameterized space complexity have recently been obtained that rule out completeness results for parameterized time classes. In this paper, we make progress in this direction by proving that the associative generability problem and the longest common subsequence problem are complete for parameterized space classes. These classes are defined in terms of different forms of bounded nondeterminism and in terms of simultaneous time-space bounds. As a technical tool we introduce a "union operation" that translates between problems complete for classical complexity classes and for W-classes. Comment: IPEC 201

    A list of parameterized problems in bioinformatics

    In this report we present a list of problems that originated in bioinformatics. Our aim is to collect information on such problems that have been analyzed from the point of view of Parameterized Complexity. For every problem we give its definition and biological motivation together with known complexity results. Postprint (published version)

    Algorithms for efficient phylogenetic tree construction

    The rapidly increasing amount of available genomic sequence data provides an abundance of potential information for phylogenetic analyses. Many models and methods have been developed to build evolutionary trees based on this information. A common feature of most of these models is that they start out with fragments of the genome, called genes. Depending on the genes and species, and the methods used to perform the phylogenetic analyses, one typically ends up with a large number of phylogenetic trees which may not agree with one another. Simply put, the problem now is the following: Given several discordant phylogenetic trees as input, infer the (presumably) correct phylogeny. This thesis seeks to address some of the methodological and algorithmic challenges posed by this problem. In particular, we present two new algorithms related to inferring phylogenetic trees in the presence of gene duplication, and introduce a new distance measure for comparing phylogenetic trees

    Using Trees: Myrmecocystus Phylogeny and Character Evolution and New Methods for Investigating Trait Evolution and Species Delimitation (PhD Dissertation)

    1) Rates of phenotypic evolution have changed throughout the history of life, producing variation in levels of morphological, functional, and ecological diversity among groups. Testing for the presence of these rate shifts is a key component of evaluating hypotheses about what causes them. General predictions regarding changes in phenotypic diversity as a function of evolutionary history and rates are developed, and tests are derived to evaluate rate changes. Simulations show that these tests are more powerful than existing tests using standardized contrasts. 
2) Species delimitation and species tree inference are difficult problems in the case of recent divergences, especially when different loci have different histories. I quantify the difficulty of the problem and introduce a non-parametric method for simultaneously dividing anonymous samples into different species and inferring a species tree, using individual gene trees as input. This heuristic method seeks to both minimize gene tree – species tree discordance and excess population structure within a species. Analyses suggest that the method may provide useful insights for systematists working at the species level with molecular data.
3) The phylogeny of Myrmecocystus ants is estimated using nine loci, finding that none of the three subgenera are monophyletic, implying repeated evolution of foraging times and particular morphologies. A new partitioned likelihood program, MrFisher, is created from MrBayes to aid analysis of multilocus datasets without assuming priors. Simulations show that using a partitioned likelihood approach in the presence of rate heterogeneity and missing data, as is common in supermatrix analyses, can recover correct branch lengths where non-partitioned likelihood gives predictably biased estimates of branch lengths but the correct topology.
4) Evolution of foraging time and coevolution of behavior and morphology in Myrmecocystus ants is examined. New models for reconstructing discrete states along branches of a tree and for examining continuous trait evolution and coevolution with discrete traits are developed and implemented. Foraging transitions between diurnal and nocturnal foraging evidently go through crepuscular intermediates. There is some evidence for increased rates of morphological character evolution associated with changes in foraging regime, but little evidence for particular optimum values for morphological traits associated with foraging

    Dynamics of HIV-1 Infection and Therapy In Vivo

    Human immunodeficiency virus type 1 (HIV-1) is the causative agent of acquired immune deficiency syndrome (AIDS), a disease responsible for extensive morbidity and mortality worldwide. Despite more than thirty years of research since the discovery of HIV-1, no cure or vaccine yet exists. HIV-1 infection, while treatable with suppressive antiretroviral therapy drugs (ART), establishes lifelong persistence in the infected host as a natural consequence of the viral life cycle and the dynamic properties of the human immune cells in which HIV-1 propagates. This persistence is driven by populations of rare, long-lived HIV-1-infected cells, termed latently infected cells (LICs), that are refractory to immune clearance and viral cytopathic effects. Interruption of suppressive therapy – even after years of continuous and effective treatment – rapidly leads to virological rebound, requiring infected persons to remain on ART indefinitely. As the need to maintain lifelong daily ART imposes a substantial compliance burden on those infected, two major goals of HIV-1 research, broadly, concern (1) developing new therapeutic modalities that may alleviate some drawbacks to ART, and (2) identifying means with which to target and eradicate LICs as an approach to curing HIV-1 infection. To these ends, in the first three chapters of my thesis, I discuss my work demonstrating the utility of highly potent anti-HIV-1 antibodies in a number of therapeutic contexts. As antibody therapy expectedly did not result in cure, I was later motivated to study the nature of LIC formation and persistence. The fourth chapter of this thesis outlines my work to develop new molecular tools to interrogate LICs in a humanized mouse model of HIV-1 infection