Rooting for phylogenetic networks
This paper studies the relationship between undirected (unrooted) and
directed (rooted) phylogenetic networks. We describe a polynomial-time
algorithm for deciding whether an undirected binary phylogenetic network, given
the locations of the root and reticulation vertices, can be oriented as a
directed phylogenetic network. Moreover, we give a mathematical
characterization of when this is the case and show that this directed
phylogenetic network is then always unique. These results are generalized to
the nonbinary case. In addition, we describe an algorithm for deciding whether
an undirected binary phylogenetic network can be oriented as a directed
phylogenetic network of a certain class. The algorithm is fixed-parameter
tractable (FPT) when the parameter is the level of the network and is
applicable to a wide range of network classes, including tree-child,
tree-based, stack-free and orchard networks. It can also be used to decide
whether an undirected phylogenetic network is tree-based and whether a
partly-directed phylogenetic network can be oriented as a directed phylogenetic
network. Finally, we show that, in general, it is NP-hard to decide whether an
undirected network can be oriented as a tree-based network.
On unrooted and root-uncertain variants of several well-known phylogenetic network problems
The hybridization number problem requires us to embed a set of binary rooted
phylogenetic trees into a binary rooted phylogenetic network such that the
number of nodes with indegree two is minimized. However, from a biological
point of view accurately inferring the root location in a phylogenetic tree is
notoriously difficult and poor root placement can artificially inflate the
hybridization number. To this end we study a number of relaxed variants of this
problem. We start by showing that the fundamental problem of determining
whether an unrooted phylogenetic network displays (i.e. embeds) an
unrooted phylogenetic tree is NP-hard. On the positive side we show
that this problem is FPT in reticulation number. In the rooted case the
corresponding FPT result is trivial, but here we require more subtle
argumentation. Next we show that the hybridization number problem for unrooted
networks (when given two unrooted trees) is equivalent to the problem of
computing the Tree Bisection and Reconnect (TBR) distance of the two unrooted
trees. In the third part of the paper we consider the "root uncertain" variant
of hybridization number. Here we are free to choose the root location in each
of a set of unrooted input trees such that the hybridization number of the
resulting rooted trees is minimized. On the negative side we show that this
problem is APX-hard. On the positive side, we show that the problem is FPT in
the hybridization number, via kernelization, for any number of input trees.
Comment: 28 pages, 8 figures
Phylogenetic patterns recover known HIV epidemiological relationships and reveal common transmission of multiple variants.
The growth of human immunodeficiency virus (HIV) sequence databases resulting from drug resistance testing has motivated efforts using phylogenetic methods to assess how HIV spreads [1-4]. Such inference is potentially both powerful and useful for tracking the epidemiology of HIV and the allocation of resources to prevention campaigns. We recently used simulation and a small number of illustrative cases to show that certain phylogenetic patterns are associated with different types of epidemiological linkage [5]. Our original approach was later generalized for large next-generation sequencing datasets and implemented as a free computational pipeline [6]. Previous work has claimed that direction and directness of transmission could not be established from phylogeny because one could not be sure that there were no intervening or missing links involved [7-9]. Here, we address this issue by investigating phylogenetic patterns from 272 previously identified HIV transmission chains with 955 transmission pairs representing diverse geography, risk groups, subtypes, and genomic regions. These HIV transmissions had known linkage based on epidemiological information such as partner studies, mother-to-child transmission, pairs identified by contact tracing, and criminal cases. We show that the resulting phylogeny inferred from real HIV genetic sequences indeed reveals distinct patterns associated with direct transmission contra transmissions from a common source. Thus, our results establish how to interpret phylogenetic trees based on HIV sequences when tracking who-infected-whom, when and how genetic information can be used for improved tracking of HIV spread. We also investigate limitations that stem from limited sampling and genetic time-trends in the donor and recipient HIV populations.
Computational phylogenetics and the classification of South American languages
In recent years, South Americanist linguists have embraced computational phylogenetic methods to resolve the numerous outstanding questions about the genealogical relationships among the languages of the continent. We provide a critical review of the methods and language classification results that have accumulated thus far, emphasizing the superiority of character-based methods over distance-based ones and the importance of developing adequate comparative datasets for producing well-resolved classifications.
Rings Reconcile Genotypic and Phenotypic Evolution within the Proteobacteria.
Although prokaryotes are usually classified using molecular phylogenies instead of phenotypes after the advent of gene sequencing, neither of these methods is satisfactory because the phenotypes cannot explain the molecular trees and the trees do not fit the phenotypes. This scientific crisis still exists and the profound disconnection between these two pillars of evolutionary biology--genotypes and phenotypes--grows larger. We use rings and a genomic form of goods thinking to resolve this conundrum (McInerney JO, Cummins C, Haggerty L. 2011. Goods thinking vs. tree thinking. Mobile Genet Elements. 1:304-308; Nelson-Sathi S, et al. 2015. Origins of major archaeal clades correspond to gene acquisitions from bacteria. Nature 517:77-80). The Proteobacteria is the most speciose prokaryotic phylum known. It is an ideal phylogenetic model for reconstructing Earth's evolutionary history. It contains diverse free living, pathogenic, photosynthetic, sulfur metabolizing, and symbiotic species. Due to its large number of species (Whitman WB, Coleman DC, Wiebe WJ. 1998. Prokaryotes: the unseen majority. Proc Nat Acad Sci U S A. 95:6578-6583) it was initially expected to provide strong phylogenetic support for a proteobacterial tree of life. But despite its many species, sequence-based tree analyses are unable to resolve its topology. Here we develop new rooted ring analyses and study proteobacterial evolution. Using protein family data and new genome-based outgroup rooting procedures, we reconstruct the complex evolutionary history of the proteobacterial rings (combinations of tree-like divergences and endosymbiotic-like convergences). We identify and map the origins of major gene flows within the rooted proteobacterial rings (P < 3.6 × 10^-6) and find that the evolution of the "Alpha-," "Beta-," and "Gammaproteobacteria" is represented by a unique set of rings. Using new techniques presented here we also root these rings using outgroups.
We also map the independent flows of genes involved in DNA-, RNA-, ATP-, and membrane- related processes within the Proteobacteria and thereby demonstrate that these large gene flows are consistent with endosymbioses (P < 3.6 × 10^-9). Our analyses illustrate what it means to find that a gene is present, or absent, within a gene flow, and thereby clarify the origin of the apparent conflicts between genotypes and phenotypes. Here we identify the gene flows that introduced photosynthesis into the Alpha-, Beta-, and Gammaproteobacteria from the common ancestor of the Actinobacteria and the Firmicutes. Our results also explain why rooted rings, unlike trees, are consistent with the observed genotypic and phenotypic relationships observed among the various proteobacterial classes. We find that ring phylogenies can explain the genotypes and the phenotypes of biological processes within large and complex groups like the Proteobacteria.
How tree-based is my network? Proximity measures for unrooted phylogenetic networks
Tree-based networks are a class of phylogenetic networks that attempt to
formally capture what is meant by "tree-like" evolution. A given non-tree-based
phylogenetic network, however, might appear to be very close to being
tree-based, or very far. In this paper, we formalise the notion of proximity to
tree-based for unrooted phylogenetic networks, with a range of proximity
measures. These measures also provide characterisations of tree-based networks.
One measure in particular, related to the nearest neighbour interchange
operation, allows us to define the notion of "tree-based rank". This provides a
subclassification within the tree-based networks themselves, identifying those
networks that are "very" tree-based. Finally, we prove results relating
tree-based networks in the settings of rooted and unrooted phylogenetic
networks, showing effectively that an unrooted network is tree-based if and
only if it can be made a rooted tree-based network by rooting it and orienting
the edges appropriately. This leads to a clarification of the contrasting
decision problems for tree-based networks, which are polynomial in the rooted
case but NP-complete in the unrooted.
The early expansion and evolutionary dynamics of POU class genes.
The POU genes represent a diverse class of animal-specific transcription factors that play important roles in neurogenesis, pluripotency, and cell-type specification. Although previous attempts have been made to reconstruct the evolution of the POU class, these studies have been limited by a small number of representative taxa, and a lack of sequences from basally branching organisms. In this study, we performed comparative analyses on available genomes and sequences recovered through "gene fishing" to better resolve the topology of the POU gene tree. We then used ancestral state reconstruction to map the most likely changes in amino acid evolution for the conserved domains. Our work suggests that four of the six POU families evolved before the last common ancestor of living animals (doubling previous estimates) and were followed by extensive clade-specific gene loss. Amino acid changes are distributed unequally across the gene tree, consistent with a neofunctionalization model of protein evolution. We consider our results in the context of early animal evolution, and the role of POU5 genes in maintaining stem cell pluripotency.
Molecular diversity of arbuscular mycorrhizal fungi in onion roots from organic and conventional farming systems in the Netherlands
Diversity and colonization levels of naturally occurring arbuscular mycorrhizal fungi (AMF) in onion roots were studied to compare organic and conventional farming systems in the Netherlands. In 2004, 20 onion fields were sampled in a balanced survey between farming systems and between two regions, namely, Zeeland and Flevoland. In 2005, nine conventional and ten organic fields were additionally surveyed in Flevoland. AMF phylotypes were identified by rDNA sequencing. All plants were colonized, with 60% for arbuscular colonization and 84% for hyphal colonization as grand means. In Zeeland, onion roots from organic fields had higher fractional colonization levels than those from conventional fields. Onion yields in conventional farming were positively correlated with colonization level. Overall, 14 AMF phylotypes were identified. The number of phylotypes per field ranged from one to six. Two phylotypes associated with the Glomus mosseae-coronatum and the G. caledonium-geosporum species complexes were the most abundant, whereas other phylotypes were infrequently found. Organic and conventional farming systems had a similar number of phylotypes per field and similar Shannon diversity indices. A few organic and conventional fields had a larger number of phylotypes, including phylotypes associated with the genera Glomus-B, Archaeospora, and Paraglomus. This suggests that farming systems as such did not influence AMF diversity, but rather specific environmental conditions or agricultural practice.