Search CORE

46 research outputs found

Fast neighbor joining

Author: Elias Isaac
Lagergren Jens
Publication venue: Published by Elsevier B.V.
Publication date: 17/05/2009
Field of study

Abstract: Reconstructing the evolutionary history of a set of species is a fundamental problem in biology, and methods for solving this problem are gauged on two characteristics: accuracy and efficiency. Neighbor Joining (NJ) is a so-called distance-based method that, thanks to its good accuracy and speed, has been embraced by the phylogeny community. It takes the distances between n taxa and produces in Θ(n^3) time a phylogenetic tree, i.e., a tree which aims to describe the evolutionary history of the taxa. In addition to performing well in practice, the NJ algorithm has optimal reconstruction radius. The contribution of this paper is twofold: (1) we present an algorithm called Fast Neighbor Joining (FNJ) with optimal reconstruction radius and optimal run time complexity O(n^2) and (2) we present a greatly simplified proof for the correctness of NJ. Initial experiments show that FNJ in practice has almost the same accuracy as NJ, indicating that the property of optimal reconstruction radius has great importance to their good performance. Moreover, we show how improved running time can be achieved for computing the so-called correction formulas.
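To make the Θ(n^3) behaviour mentioned in the abstract concrete, here is a minimal sketch of the textbook Neighbor Joining procedure (the classic Q-criterion version, not the FNJ algorithm the paper introduces). The function name `neighbor_joining`, the `internalK` node labels, and the dictionary-of-pairs input format are illustrative assumptions, not taken from the paper.

```python
from itertools import combinations


def neighbor_joining(names, dist):
    """Textbook NJ sketch (assumed interface, not the paper's FNJ).

    names: list of taxon labels.
    dist:  dict mapping frozenset({a, b}) -> distance between a and b.
    Returns a list of (node, neighbour, branch_length) edges of an unrooted tree.
    """
    nodes = list(names)
    d = dict(dist)
    edges = []
    next_id = 0
    while len(nodes) > 2:
        n = len(nodes)
        # Row sums: total distance from each active node to all the others.
        r = {a: sum(d[frozenset((a, b))] for b in nodes if b != a) for a in nodes}
        # Join the pair minimising Q(a, b) = (n - 2) * d(a, b) - r[a] - r[b].
        a, b = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * d[frozenset(p)] - r[p[0]] - r[p[1]],
        )
        dab = d[frozenset((a, b))]
        u = f"internal{next_id}"  # hypothetical label for the new internal node
        next_id += 1
        # Branch lengths from the new node u to the two joined nodes.
        la = 0.5 * dab + (r[a] - r[b]) / (2 * (n - 2))
        lb = dab - la
        edges += [(a, u, la), (b, u, lb)]
        # Distances from u to every remaining node.
        for c in nodes:
            if c not in (a, b):
                d[frozenset((u, c))] = 0.5 * (
                    d[frozenset((a, c))] + d[frozenset((b, c))] - dab
                )
        nodes = [c for c in nodes if c not in (a, b)] + [u]
    # Two nodes remain: connect them directly.
    a, b = nodes
    edges.append((a, b, d[frozenset((a, b))]))
    return edges
```

Each pass of the loop scans all pairs of active nodes, which costs O(n^2), and there are about n passes, which is where the cubic running time comes from. The quadratic FNJ algorithm avoids the full pair scan; this sketch does not attempt that.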

Elsevier - Publisher Connector

Why neighbor-joining works

Author: Levy Dan
Mihaescu Radu
Pachter Lior
Publication venue: Springer Fachmedien Wiesbaden GmbH
Publication date: 01/05/2009
Field of study

We show that the neighbor-joining algorithm is a robust quartet method for constructing trees from distances. This leads to a new performance guarantee that contains Atteson's optimal radius bound as a special case and explains many cases where neighbor-joining is successful even when Atteson's criterion is not satisfied. We also provide a proof for Atteson's conjecture on the optimal edge radius of the neighbor-joining algorithm. The strong performance guarantees we provide also hold for the quadratic time fast neighbor-joining algorithm, thus providing a theoretical basis for inferring very large phylogenies with neighbor-joining
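Atteson's radius bound, as it is usually stated, says that neighbor-joining recovers the true tree whenever every estimated distance is within half the shortest edge length of the corresponding true tree distance. The sketch below checks that sufficient condition for a pair of square distance matrices; the function name `within_atteson_radius` and the list-of-lists matrix layout are assumptions made for illustration. It says nothing about the broader guarantee the abstract describes, which covers cases where this criterion fails.

```python
def within_atteson_radius(estimated, true, min_edge_length):
    """Check the sufficient condition max |D_est - D_true| < min_edge_length / 2.

    estimated, true: square matrices (lists of lists) of pairwise distances.
    min_edge_length: length of the shortest edge of the true tree.
    """
    n = len(true)
    # Largest absolute deviation over all unordered pairs of taxa.
    max_error = max(
        abs(estimated[i][j] - true[i][j])
        for i in range(n)
        for j in range(i + 1, n)
    )
    return max_error < min_edge_length / 2
```

A return value of False does not mean neighbor-joining fails on the input, only that this particular guarantee does not apply.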

Live neighbor-joining

Author: Almeida Júnior Nalvo Franco de
Araújo Graziela S.
Brígido Marcelo de Macedo
Telles Guilherme P.
Walter Maria Emília Machado Telles
Publication venue: Springer Science and Business Media LLC
Publication date: 01/05/2018
Field of study

Background: In phylogenetic reconstruction the result is a tree where all taxa are leaves and internal nodes are hypothetical ancestors. In a live phylogeny, both ancestral and living taxa may coexist, leading to a tree where internal nodes may be living taxa. The well-known Neighbor-Joining heuristic is widely used for phylogenetic reconstruction. Results: We present Live Neighbor-Joining, a heuristic for building a live phylogeny. We have investigated Live Neighbor-Joining on datasets of viral genomes, a plausible scenario for its application, which allowed the construction of alternative hypotheses for the relationships among viruses that embrace both ancestral and descendant taxa. We also applied Live Neighbor-Joining to a set of bacterial genomes and to sets of images and texts. Non-biological data may be better explored visually when their relationship in terms of content similarity is represented by means of a phylogeny. Conclusion: Our experiments have shown interesting alternative phylogenetic hypotheses for RNA virus genomes and bacterial genomes, and alternative relationships among images and texts, illustrating a wide range of scenarios where Live Neighbor-Joining may be used.

Repositório Institucional da Universidade de Brasília

Directory of Open Access Journals

XplorSeq: A software environment for integrated management and phylogenetic analysis of metagenomic sequence data

Author: A Stamatakis
AB Dalby
AE Magurran
B Ewing
C Lozupone
CJ McManus
D Papineau
DA Peterson
Daniel N Frank
DG Higgins
DN Frank
E Pruesse
GA Denisov
J Felsenstein
JD Thompson
JF Rawls
JJ Walker
JK Harris
JR Spear
JW Sahl
L Lee
L Sheneman
LK Baumgartner
LM Feazel
PD Schloss
PJ Turnbaugh
R Chenna
RE Ley
SF Altschul
TA Isenbarger
TM Salmassi
TZ DeSantis Jr
W Ludwig
Publication venue: BioMed Central
Publication date: 01/10/2008
Field of study

Abstract. Background: Advances in automated DNA sequencing technology have accelerated the generation of metagenomic DNA sequences, especially environmental ribosomal RNA gene (rDNA) sequences. As the scale of rDNA-based studies of microbial ecology has expanded, a need has arisen for software that is capable of managing, annotating, and analyzing the plethora of diverse data accumulated in these projects. Results: XplorSeq is a software package that facilitates the compilation, management and phylogenetic analysis of DNA sequences. XplorSeq was developed for, but is not limited to, high-throughput analysis of environmental rRNA gene sequences. XplorSeq integrates and extends several commonly used UNIX-based analysis tools by use of a Macintosh OS-X-based graphical user interface (GUI). Through this GUI, users may perform basic sequence import and assembly steps (base-calling, vector/primer trimming, contig assembly), perform BLAST (Basic Local Alignment Search Tool; [1-3]) searches of NCBI and local databases, create multiple sequence alignments, build phylogenetic trees, assemble Operational Taxonomic Units, estimate biodiversity indices, and summarize data in a variety of formats. Furthermore, sequences may be annotated with user-specified meta-data, which then can be used to sort data and organize analyses and reports. A document-based architecture permits parallel analysis of sequence data from multiple clones or amplicons, with sequences and other data stored in a single file. Conclusion: XplorSeq should benefit researchers who are engaged in analyses of environmental sequence data, especially those with little experience using bioinformatics software. Although XplorSeq was developed for management of rDNA sequence data, it can be applied to almost any sequencing project. The application is available free of charge for non-commercial use at http://vent.colorado.edu/phyloware.

Crossref

Directory of Open Access Journals

PubMed Central

Neighbor Joining And Leaf Status

Author: Weller Mathias
Publication venue
Publication date: 30/05/2023
Field of study

The Neighbor Joining Algorithm is among the most fundamental algorithmic results in computational biology. However, its definition and correctness proof are not straightforward. In particular, "the question 'what does the NJ method seek to do?' has until recently proved somewhat elusive" [Gascuel & Steel, 2006]. While a rigorous mathematical analysis is now available, it is still considered somewhat hard to follow and its proof tedious at best. In this work, we present an alternative interpretation of the goal of the Neighbor Joining algorithm by proving that it chooses to merge the two taxa u and v that maximize the "leaf-status", that is, the sum of distances of all leaves to the unique u-v-path.

arXiv.org e-Print Archive

Phylogenomic identification of five new human homologs of the DNA repair enzyme AlkB

Author: Bhagwat Ashok S
Bujnicki Janusz M
Kurowski Michal A
Papaj Grzegorz
Publication venue: BioMed Central
Publication date: 01/01/2003
Field of study

BACKGROUND: A combination of biochemical and bioinformatic analyses led to the discovery of oxidative demethylation, a novel DNA repair mechanism catalyzed by the Escherichia coli AlkB protein and its two human homologs, hABH2 and hABH3. This discovery was based on the prediction made by Aravind and Koonin that AlkB is a member of the 2OG-Fe(2+) oxygenase superfamily. RESULTS: In this article, we report the identification and sequence analysis of five human members of the 2OG-Fe(2+) oxygenase superfamily, designated here as hABH4 through hABH8. These experimentally uncharacterized and poorly annotated genes were not associated with the AlkB family in any database, but are predicted here to be phylogenetically and functionally related to the AlkB family (and specifically to the lineage that groups together hABH2 and hABH3) rather than to any other oxygenase family. Our analysis reveals the history of ABH gene duplications in the evolution of vertebrate genomes. CONCLUSIONS: We hypothesize that hABH4-8 could either be back-up enzymes for hABH1-3 or may code for novel DNA or RNA repair activities. For example, enzymes that can dealkylate N3-methylpurines or N7-methylpurines in DNA have not been described. Our analysis will guide experimental confirmation of these novel human putative DNA repair enzymes.

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Digital Commons@Wayne State University

ISOLASI KHAMIR DARI BATANG TANAMAN TEBU DAN IDENTIFIKASINYA BERDASARKAN SEKUENS INTERNAL TRANSCRIBED SPACER

Author: Anggraini Ika
Ferniah Rejeki Siti
Kusdiyantini Endang
Publication venue: Badan Pengkajian dan Penerapan Teknologi (BPPT)
Publication date: 24/06/2019
Field of study

Isolation of Yeasts from Sugarcane Stems and Their Identification Based on Internal Transcribed Spacer Sequences. ABSTRACT: Fermentative yeasts used in the food, health, and energy industries need to be explored to discover their potential. The purpose of this study was to obtain fermentative yeast isolates from sugarcane stems and subsequently to undertake morphological, biochemical, and molecular identification. The isolation of epiphytic and endophytic yeasts was carried out by the spread plate method using sugarcane soak water and sugarcane juice on potato dextrose agar (PDA) and yeast-glucose-peptone (YGP) agar media. Morphological identification was based on macroscopic and microscopic observations. Biochemical identification was performed using carbohydrate fermentation and 50%-glucose media tests. Selected isolates were identified molecularly using the Internal Transcribed Spacer (ITS). Seven yeast isolates were obtained, of which isolate Ed 1B was selected. Isolate Ed 1B formed round, creamy white, shiny, raised, wavy colonies and had ovoid cells with a cell diameter of 4.74 µm. It had budding cells, was able to ferment glucose and sucrose (but not lactose), and grew on 50%-glucose media. BLAST results showed that isolate Ed 1B had 99% homology with Kodamaea ohmeri. Keywords: isolation, ITS, molecular identification, Saccharum officinarum L., yeast

Jurnal Bioteknologi & Biosains Indonesia (JBBI)

Fast computation of distance estimators

Author: A Rambaut
D Swofford
F Barker
H Kishino
I Elias
Isaac Elias
J Felsenstein
Jens Lagergren
K Tamura
K Tuplin
L Arvestad
M Kimura
N Saitou
T Jukes
Publication venue: BioMed Central
Publication date: 01/01/2007
Field of study

BACKGROUND: Some distance methods are among the most commonly used methods for reconstructing phylogenetic trees from sequence data. The input to a distance method is a distance matrix, containing estimated pairwise distances between all pairs of taxa. Distance methods themselves are often fast, e.g., the famous and popular Neighbor Joining (NJ) algorithm reconstructs a phylogeny of n taxa in time O(n^3). Unfortunately, the fastest practical algorithms known for computing the distance matrix from n sequences of length l take time proportional to l·n^2. Since the sequence length typically is much larger than the number of taxa, the distance estimation is the bottleneck in phylogeny reconstruction. This bottleneck is especially apparent in the reconstruction of large phylogenies or in applications where many trees have to be reconstructed, e.g., bootstrapping and genome-wide applications. RESULTS: We give an advanced algorithm for computing the number of mutational events between DNA sequences which is significantly faster than both Phylip and Paup. Moreover, we give a new method for estimating pairwise distances between sequences which contain ambiguity symbols. This new method is shown to be more accurate as well as faster than earlier methods. CONCLUSION: Our novel algorithm for computing distance estimators provides a valuable tool in phylogeny reconstruction. Since the running time of our distance estimation algorithm is comparable to that of most distance methods, the previous bottleneck is removed. All distance methods, such as NJ, require a distance matrix as input and, hence, our novel algorithm significantly improves the overall running time of all distance methods. In particular, we show for real-world biological applications how the running time of phylogeny reconstruction using NJ is improved from a matter of hours to a matter of seconds.
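The l·n^2 cost described in the abstract can be seen in a naive baseline: compare every pair of aligned sequences position by position, then apply a correction formula. The sketch below uses the Jukes-Cantor correction as one well-known example of such a formula. It is not the paper's fast algorithm, it ignores ambiguity symbols, and the function names (`p_distance`, `jukes_cantor`, `distance_matrix`) are assumptions chosen for illustration.

```python
import math


def p_distance(s, t):
    """Fraction of aligned positions at which two equal-length sequences differ."""
    if len(s) != len(t) or not s:
        raise ValueError("sequences must be non-empty and of equal length")
    # One pass over the alignment: O(l) work per pair.
    return sum(1 for x, y in zip(s, t) if x != y) / len(s)


def jukes_cantor(p):
    """Jukes-Cantor corrected distance d = -(3/4) * ln(1 - 4p/3)."""
    if p >= 0.75:
        # The correction is undefined at or beyond saturation.
        return math.inf
    return -0.75 * math.log(1 - 4 * p / 3)


def distance_matrix(seqs):
    """Naive pairwise distance matrix: O(n^2) pairs times O(l) per pair."""
    n = len(seqs)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = jukes_cantor(p_distance(seqs[i], seqs[j]))
    return d
```

For example, `distance_matrix(["ACGT", "ACGA", "TCGA"])` returns a symmetric 3×3 matrix that could be passed on to a distance method such as NJ.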

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Exploring Hierarchical Visualization Designs Using Phylogenetic Trees

Author: Chang Remco
Childs Hank
Crouser R. Jordan
Gramazio Connor
Griffin Garth
Li Shaomeng
Schulz Hans-Jörg
Publication venue: Smith ScholarWorks
Publication date: 01/01/2015
Field of study

Ongoing research on information visualization has produced an ever-increasing number of visualization designs. Despite this activity, limited progress has been made in categorizing this large number of information visualizations. This makes understanding their common design features challenging, and obscures the yet unexplored areas of novel designs. With this work, we provide categorization from an evolutionary perspective, leveraging a computational model to represent evolutionary processes, the phylogenetic tree. The result, a phylogenetic tree of a design corpus of hierarchical visualizations, enables better understanding of the various design features of hierarchical information visualizations, and further illuminates the space in which the visualizations lie, through support for interactive clustering and novel design suggestions. We demonstrate these benefits with our software system, where a corpus of two-dimensional hierarchical visualization designs is constructed into a phylogenetic tree. This software system supports visual interactive clustering and suggestions for novel designs; the latter capacity is also demonstrated via collaboration with an artist who sketched new designs using our system.

CiteSeerX

Smith College: Smith ScholarWorks