6 research outputs found

Visualization of large influenza virus sequence datasets using adaptively aggregated trees with sampling-based subscale representation

Publication venue: BioMed Central

Springer - Publisher Connector

Visualization of large influenza virus sequence datasets using adaptively aggregated trees with sampling-based subscale representation

Author: AM MacEachren
AS Fauci
D Beermann
D Bryant
E Ghedin
F Chevenet
G Mather
J Baron
J Felsenstein
J Kramer
J-DPC Fekete
JB Plotkin
JF Dufayard
JPP Lamping
L Zaslavsky
Leonid Zaslavsky
N Amenta
PS Levy
S Weiss
SKND Card
Tatiana A Tatusova
U Rost
Y Bao
YI Wolf
Yiming Bao
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Virus variation resources at the National Center for Biotechnology Information: dengue virus

Author: Bao Yiming
Kiryutin Boris
Resch Wolfgang
Rozanov Michael
Tatusova Tatiana A
Zaslavsky Leonid
Publication venue: BioMed Central
Publication date: 01/01/2009

Abstract
Background: There is an increasing number of complete and incomplete virus genome sequences available in public databases. This large body of sequence data harbors information about epidemiology, phylogeny, and virulence. Several specialized databases, such as the NCBI Influenza Virus Resource or the Los Alamos HIV database, offer sophisticated query interfaces along with integrated exploratory data analysis tools for individual virus species to facilitate extracting this information. Thus far, there has not been a comprehensive database for dengue virus, a significant public health threat.
Results: We have created an integrated web resource for dengue virus. The technology developed for the NCBI Influenza Virus Resource has been extended to process non-segmented dengue virus genomes. To allow efficient processing of the dengue genome, which is large in comparison with individual influenza segments, we developed an offline pre-alignment procedure that generates a multiple sequence alignment of all dengue sequences. The pre-calculated alignment is then used to rapidly create alignments of sequence subsets in response to user queries. This improvement in technology will also facilitate the incorporation of additional virus species in the future. The set of virus-specific databases at NCBI, which will be referred to as Virus Variation Resources (VVR), allows users to build complex queries against virus-specific databases and then apply exploratory data analysis tools to the results. The metadata is automatically collected where possible, and extended with data extracted from the literature.
Conclusion: The NCBI Dengue Virus Resource integrates dengue sequence information with relevant metadata (sample collection time and location, disease severity, serotype, sequenced genome region) and facilitates retrieval and preliminary analysis of dengue sequences using integrated web analysis and visualization tools.
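
The abstract describes reusing one pre-calculated alignment to answer subset queries. The following is a minimal sketch of one way such a projection could work, not the NCBI implementation: it assumes the master alignment is held as a mapping from sequence ID to an equal-length gapped string, and the function name `subset_alignment` and the toy sequences are hypothetical.

```python
def subset_alignment(master, ids):
    """Project a precomputed alignment onto a subset of sequence IDs.

    master: dict mapping sequence ID -> aligned string (all the same length).
    ids: the IDs selected by a query.
    Columns that contain only gaps within the subset are dropped.
    """
    rows = [master[i] for i in ids]
    if not rows:
        return {}
    # Keep a column if at least one selected sequence has a residue there.
    keep = [col for col in range(len(rows[0]))
            if any(row[col] != "-" for row in rows)]
    return {i: "".join(master[i][col] for col in keep) for i in ids}


# Toy master alignment (made-up sequences, not real dengue data).
master = {
    "seqA": "ATG-CC-A",
    "seqB": "ATGTCC-A",
    "seqC": "A-G-CCGA",
}
result = subset_alignment(master, ["seqA", "seqC"])
print(result)
```

Because the selected rows are already aligned to each other, the subset needs no realignment; only columns that became all-gap after selection are removed, so the cost is linear in the size of the subset.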

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Tree pruner: An efficient tool for selecting data from a biased genetic database

Author: Dietrich Jonathan
Dimitrijevic Mira
Green Margaret
Krishnamoorthy Mohan
Macken Catherine
Patel Pragneshkumar
Publication venue: BioMed Central
Publication date: 01/01/2011

Abstract
Background: Large databases of genetic data are often biased in their representation. Thus, selection of genetic data with desired properties, such as evolutionary representation or shared genotypes, is problematic. Selection on the basis of epidemiological variables may not achieve the desired properties. Available automated approaches to the selection of influenza genetic data make a tradeoff between speed and simplicity on the one hand and control over quality and contents of the dataset on the other hand. A poorly chosen dataset may be detrimental to subsequent analyses.
Results: We developed a tool, Tree Pruner, for obtaining a dataset with desired evolutionary properties from a large, biased genetic database. Tree Pruner provides the user with an interactive phylogenetic tree as a means of editing the initial dataset from which the tree was inferred. The tree visualization changes dynamically, using colors and shading, reflecting Tree Pruner actions. At the end of a Tree Pruner session, the editing actions are implemented in the dataset. Currently, Tree Pruner is implemented on the Influenza Research Database (IRD). The data management capabilities of the IRD allow the user to store a pruned dataset for additional pruning or for subsequent analysis. Tree Pruner can be easily adapted for use with other organisms.
Conclusions: Tree Pruner is an efficient, manual tool for selecting a high-quality dataset with desired evolutionary properties from a biased database of genetic sequences. It offers an important alternative to automated approaches to the same goal, by providing the user with a dynamic, visual guide to the ongoing selection process and ultimate control over the contents (and therefore quality) of the dataset.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

PhyloMap: an algorithm for visualizing relationships of large sequence data sets and its application to the influenza A virus genome

Author: Chang Suhua
Hilgenfeld Rolf
Mamlouk Amir Madany
Martinetz Thomas
Wang Jing
Zhang Jiajie
Publication venue: BioMed Central
Publication date: 01/01/2011

Abstract
Background: Results of phylogenetic analysis are often visualized as phylogenetic trees. Such a tree can typically only include up to a few hundred sequences. When more than a few thousand sequences are to be included, analyzing the phylogenetic relationships among them becomes a challenging task. The recent frequent outbreaks of influenza A viruses have resulted in the rapid accumulation of corresponding genome sequences. Currently, there are more than 7500 influenza A virus genomes in the database. There are no efficient ways of representing this huge data set as a whole, thus preventing a further understanding of the diversity of the influenza A virus genome.
Results: Here we present a new algorithm, "PhyloMap", which combines ordination, vector quantization, and phylogenetic tree construction to give an elegant representation of a large sequence data set. The use of PhyloMap on influenza A virus genome sequences reveals the phylogenetic relationships of the internal genes that cannot be seen when only a subset of sequences is analyzed.
Conclusions: The application of PhyloMap to influenza A virus genome data shows that it is a robust algorithm for analyzing large sequence data sets. It utilizes the entire data set, minimizes bias, and provides intuitive visualization. PhyloMap is implemented in Java, and the source code is freely available at http://www.biochem.uni-luebeck.de/public/software/phylomap.html

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Institute of Psychology, Chinese Academy of Sciences

Institutional Repository of Institute of Psychology, Chinese Academy of Sciences

Panorama phylogenetic diversity and distribution of type A influenza viruses based on their six internal gene sequences

Author: Chen Ji-Ming
Chen Ji-Wang
Liu Shuo
Peng Dong
Shen Chao-Jian
Sun Xiang-Dong
Sun Ying-Xue
Yu Jian-Min
Publication venue: BioMed Central
Publication date: 01/01/2009

Abstract
Background: Type A influenza viruses are important pathogens of humans, birds, pigs, horses and some marine mammals. The viruses have evolved into multiple complicated subtypes, lineages and sublineages. Recently, the overall phylogenetic diversity of type A influenza viruses has been described based on the viral external HA and NA gene sequences, but it remains unclear for their six internal genes (PB2, PB1, PA, NP, MP and NS).
Methods: In this report, 2798 representative sequences of the six viral internal genes were selected from GenBank using the web servers in the NCBI Influenza Virus Resource. The phylogenetic relationships among the representative sequences were then calculated using the software tools MEGA 4.1 and RAxML 7.0.4. Lineages and sublineages were classified mainly according to the topology of the phylogenetic trees and the distribution of the viruses in hosts, regions and time.
Results: The panorama phylogenetic trees of the six internal genes of type A influenza viruses were constructed. Lineages and sublineages within the type based on the six internal genes were classified and designated by a tentative universal numerical nomenclature system. The diversity of influenza viruses circulating in different regions, periods, and hosts was analyzed based on the panorama trees.
Conclusion: This study presents the first comprehensive view of the phylogenetic diversity and distribution of type A influenza viruses based on their six internal genes. It also proposes a tentative universal nomenclature system for the viral lineages and sublineages. These can serve as a candidate framework for summarizing the history and exploring the future of the viruses, and will facilitate future scientific communication on the phylogenetic diversity and evolution of the viruses. In addition, the study provides a novel phylogenetic perspective (i.e. the whole view) for characterizing the viruses, including the origin of the pandemic A(H1N1) influenza viruses.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central