Visualization of comparative genomic analyses by BLAST score ratio

Author: Myers Garry SA
Rasko David A
Ravel Jacques
Publication venue: BioMed Central
Publication date: 01/01/2005

BACKGROUND: The first microbial genome sequence, Haemophilus influenzae, was published in 1995. Since then, more than 400 microbial genome sequences have been completed or commenced. This massive influx of data provides the opportunity to obtain biological insights through comparative genomics. However, few tools are available for this scale of comparative analysis. RESULTS: The BLAST Score Ratio (BSR) approach, implemented in a Perl script, classifies all putative peptides within three genomes using a measure of similarity based on the ratio of BLAST scores. The output of the BSR analysis enables global visualization of the degree of proteome similarity between all three genomes. Additional output enables the genomic synteny (conserved gene order) between each genome pair to be assessed. Furthermore, we extend this synteny analysis by overlaying BSR data as a color dimension, enabling visualization of the degree of similarity of the peptides being compared. CONCLUSIONS: Combining the degree of similarity, synteny and annotation will allow rapid identification of conserved genomic regions as well as a number of common genomic rearrangements such as insertions, deletions and inversions. The script and example visualizations are available at:
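The ratio described in this abstract can be illustrated with a minimal Python sketch. It assumes that each peptide's BLAST score against a comparison genome is normalized by the score of the peptide aligned against itself, and that a single cutoff separates "similar" from "not similar". The function names, the labels, and the 0.4 cutoff are hypothetical choices for illustration and are not taken from the published Perl script.

```python
from typing import Optional


def blast_score_ratio(self_score: float, hit_score: Optional[float]) -> float:
    """Return hit_score / self_score, or 0.0 when there was no BLAST hit.

    self_score: score of the peptide aligned against itself (assumed > 0).
    hit_score:  best score of the same peptide against another genome.
    """
    if self_score <= 0:
        raise ValueError("self_score must be positive")
    if hit_score is None:
        return 0.0
    return hit_score / self_score


def classify_peptide(bsr_a: float, bsr_b: float, cutoff: float = 0.4) -> str:
    """Place a reference peptide in one of four classes for a three-genome comparison.

    The cutoff is an illustrative assumption, not a value from the abstract.
    """
    in_a = bsr_a >= cutoff
    in_b = bsr_b >= cutoff
    if in_a and in_b:
        return "conserved in all three genomes"
    if in_a:
        return "shared with genome A only"
    if in_b:
        return "shared with genome B only"
    return "unique to the reference genome"


# Hypothetical scores for one reference peptide.
ratio_a = blast_score_ratio(self_score=200.0, hit_score=180.0)
ratio_b = blast_score_ratio(self_score=200.0, hit_score=None)
label = classify_peptide(ratio_a, ratio_b)
```

Plotting the two ratios of every peptide as x and y coordinates would give the kind of global similarity view the abstract mentions, although the exact plot layout used by the authors is not stated here.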

Springer - Publisher Connector

Directory of Open Access Journals

OPUS - University of Technology Sydney

PubMed Central

Insights into enterotoxigenic Escherichia coli diversity in Bangladesh utilizing genomic epidemiology

Author: A Bankevich
A Mentzer von
A Sjoling
AC Vicente
AM Almeida de
AR Quinlan
AR Wattam
C Lindenthal
CA Jacobi
D Hernandez
DA Rasko
DG Evans
DJ Ingle
EW Myers
F Canto Del
F Qadri
GB Lindblom
GP Munson
H Steinsland
HL DuPont
I Bolin
I Letunic
JM Fleckenstein
JS Farris
JW Sahl
K Hayashi
K Roy
K Schliep
KL Kotloff
LC Crossman
M So
MA Lasaro
MN Price
MS Donnenberg
P Kumar
Q Luo
R Lozano
RA Finkelstein
RA Nada
RB Sack
RC Edgar
RI Walker
S Schubert
SF Altschul
SK Patel
SM Turner
T Seemann
T Wirth
TH Hazen
W Gaastra
WHO
Publication venue: Digital Commons@Becker
Publication date: 01/01/2017

Crossref

Digital Commons@Becker

Tracing Lifestyle Adaptation in Prokaryotic Genomes

Author: Altermann Eric
Publication venue: Frontiers Research Foundation
Publication date: 01/01/2012

Lifestyle adaptation of microbes due to changes in their ecological niches or acquisition of new environments is a major driving force for genetic changes in their respective genomes. Moving into more specialized niches often results in the acquisition of new gene sets via horizontal gene transfer to utilize previously unavailable metabolites, while genetic ballast is shed by gene loss and/or gene inactivation. In some cases, larger genome rearrangements can be observed, such as the incorporation of whole genetic islands, providing a range of new phenotypic capabilities. Until recently, these changes could not be comprehensively followed and identified due to the lack of complete microbial genome sequences. The advent of high-throughput DNA sequencing has dramatically changed the scientific landscape, and today microbial genomes have become increasingly abundant. Currently, more than 2,900 genomes are published and more than 11,000 genome projects are listed in the Genomes Online Database. Although this wealth of information provides many new opportunities to assess microbial functionality, it also creates a new array of challenges when a comparison between multiple microbial genomes is required. Here, functional genome distribution (FGD) is introduced, analyzing the diversity between microbes based on their predicted ORFeome. FGD is therefore a comparative genomics approach, emphasizing the assessment of gene complements. To further facilitate the comparison between two or more genomes, degrees of amino-acid similarity between ORFeomes can be visualized in the Artemis comparison tool, graphically depicting small- and large-scale genome rearrangements, insertion and deletion events, and levels of similarity between individual open reading frames. FGD provides a new tool for comparative microbial genomics and the interpretation of differences in the genetic makeup of bacteria.

Crossref

Directory of Open Access Journals

PubMed Central

Frontiers - Publisher Connector

Software tools for comparing genomic sequence

Author: Henley Morel
Publication venue: University of New Hampshire Scholars' Repository
Publication date: 01/01/2007

We describe three software tools related to research in comparative genomics, a growing research area that explores the variation within and between organisms. We developed a set of tools that explore sequence similarity and differences in genomes. Two of these tools are specifically aimed at examining DNA sequence data from two or more genomes: (1) The Magenta's OPUS tool compares genomic sequences to identify shared or unique segments between closely related species. This tool looks for functional similarities and differences in genomic data by classifying sequences into groups based on genomic categories: Orthologs, Paralogs, and Unique Sequence. (2) The DSNP tool looks at the nucleotide level to find single nucleotide polymorphisms (SNPs) within an individual. This program is a collection of existing and custom-built tools to discover and analyze SNPs within the Daphnia pulex genome. (3) The third tool supports a user evaluation of two different visualization techniques for comparing nucleotide or protein sequences.

UNH Scholars' Repository

The large-scale blast score ratio (LS-BSR) pipeline: a method to rapidly compare genetic content between bacterial genomes

Author: Caporaso J. Gregory
Keim Paul
Rasko David A.
Sahl Jason W.
Publication venue: PeerJ
Publication date: 01/01/2014

Background. As whole genome sequence data from bacterial isolates becomes cheaper to generate, computational methods are needed to correlate sequence data with biological observations. Here we present the large-scale BLAST score ratio (LS-BSR) pipeline, which rapidly compares the genetic content of hundreds to thousands of bacterial genomes, and returns a matrix that describes the relatedness of all coding sequences (CDSs) in all genomes surveyed. This matrix can be easily parsed in order to identify genetic relationships between bacterial genomes. Although pipelines have been published that group peptides by sequence similarity, no other software performs the rapid, large-scale, full-genome comparative analyses carried out by LS-BSR. Results. To demonstrate the utility of the method, the LS-BSR pipeline was tested on 96 Escherichia coli and Shigella genomes; the pipeline ran in 163 min using 16 processors, which is a greater than 7-fold speedup compared to using a single processor. The BSR values for each CDS, which indicate a relative level of relatedness, were then mapped to each genome on an independent core genome single nucleotide polymorphism (SNP) based phylogeny. Comparisons were then used to identify clade-specific CDS markers and validate the LS-BSR pipeline based on molecular markers that delineate between classical E. coli pathogenic variant (pathovar) designations. Scalability tests demonstrated that the LS-BSR pipeline can process 1,000 E. coli genomes in 27-57 h, depending upon the alignment method, using 16 processors. Conclusions. LS-BSR is an open-source, parallel implementation of the BSR algorithm, enabling rapid comparison of the genetic content of large numbers of genomes. The results of the pipeline can be used to identify specific markers between user-defined phylogenetic groups, and to identify the loss and/or acquisition of genetic information between bacterial isolates. Taxa-specific genetic markers can then be translated into clinical diagnostics, or can be used to identify broadly conserved putative therapeutic candidates.
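As a rough illustration of how a CDS-by-genome matrix of BSR values might be parsed for group-specific markers, the Python sketch below keeps every CDS that scores high in all genomes of one user-defined group and low in all genomes of another. The in-memory dictionary layout, the function name, the genome labels, and the 0.8 and 0.4 thresholds are assumptions made for this example; they are not the file format or defaults of the LS-BSR software.

```python
from typing import Dict, Iterable, List

# Hypothetical layout: CDS identifier -> {genome name -> BSR value}.
BsrMatrix = Dict[str, Dict[str, float]]


def group_specific_markers(
    matrix: BsrMatrix,
    in_group: Iterable[str],
    out_group: Iterable[str],
    present: float = 0.8,  # assumed cutoff for "conserved"
    absent: float = 0.4,   # assumed cutoff for "missing or divergent"
) -> List[str]:
    """Return CDS ids conserved in every in_group genome and absent from every out_group genome."""
    in_genomes = list(in_group)
    out_genomes = list(out_group)
    markers = []
    for cds_id, row in matrix.items():
        # A genome missing from the row is treated as BSR 0.0 (no hit).
        conserved = all(row.get(g, 0.0) >= present for g in in_genomes)
        missing = all(row.get(g, 0.0) < absent for g in out_genomes)
        if conserved and missing:
            markers.append(cds_id)
    return sorted(markers)


# Toy matrix with made-up values for four genomes.
example = {
    "cds_001": {"g1": 1.00, "g2": 0.95, "g3": 0.10, "g4": 0.00},
    "cds_002": {"g1": 0.98, "g2": 0.97, "g3": 0.99, "g4": 0.96},
    "cds_003": {"g1": 0.90, "g2": 0.30, "g3": 0.05, "g4": 0.00},
}
markers = group_specific_markers(example, ["g1", "g2"], ["g3", "g4"])
```

In this toy matrix only `cds_001` qualifies: `cds_002` is conserved everywhere, and `cds_003` is divergent in one of the in-group genomes.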

OpenKnowledge@NAU

Directory of Open Access Journals

PubMed Central

Progressive Mauve: Multiple alignment of genomes with gene flux and rearrangement

Author: Darling Aaron E.
Mau Bob
Perna Nicole T.
Publication venue
Publication date: 01/01/2009

Multiple genome alignment remains a challenging problem. Effects of recombination including rearrangement, segmental duplication, gain, and loss can create a mosaic pattern of homology even among closely related organisms. We describe a method to align two or more genomes that have undergone large-scale recombination, particularly genomes that have undergone substantial amounts of gene gain and loss (gene flux). The method utilizes a novel alignment objective score, referred to as a sum-of-pairs breakpoint score. We also apply a probabilistic alignment filtering method to remove erroneous alignments of unrelated sequences, which are commonly observed in other genome alignment methods. We describe new metrics for quantifying genome alignment accuracy which measure the quality of rearrangement breakpoint predictions and indel predictions. The progressive genome alignment algorithm demonstrates markedly improved accuracy over previous approaches in situations where genomes have undergone realistic amounts of genome rearrangement, gene gain, loss, and duplication. We apply the progressive genome alignment algorithm to a set of 23 completely sequenced genomes from the genera Escherichia, Shigella, and Salmonella. The 23 enterobacteria have an estimated 2.46 Mbp of genomic content conserved among all taxa and total unique content of 15.2 Mbp. We document substantial population-level variability among these organisms driven by homologous recombination, gene gain, and gene loss. Free, open-source software implementing the described genome alignment approach is available from http://gel.ahabs.wisc.edu/mauve.

arXiv.org e-Print Archive

CiteSeerX

EDGAR: A software framework for the comparative analysis of prokaryotic genomes

Author: A Altenhoff
A Becker
A da Silva
A Delcher
A Muzzi
Alexander Goesmann
Alfred Pühler
B Lee
C Dessimoz
C Schoen
D Dye
D Medini
D Zwickl
Daniel Doppmeier
E Lerat
E Zdobnov
F García-Ochoa
F Meyer
F Thieme
FJ Vorhölter
Frank-Jörg Vorhölter
G Talavera
H Ochiai
H Tettelin
I Tamas
I Uchiyama
J Badger
J Felsenstein
J Peterson
J Rademaker
J Swings
J Young
Jochen Blom
K Hollricher
KA Frazer
L Jensen
L Li
L Vauterin
LG Wayne
M Starr
Martha Zakrzewski
NL Hiller
R Ciria
R Edgar
R Tatusov
RR Chaudhuri
S Altschul
S Salzberg
Stefan P Albaum
T DeLuca
T Hubbard
T Hulsen
W Fitch
W Qian
Publication venue: BioMed Central
Publication date: 01/01/2009

Blom J, Albaum S, Doppmeier D, et al. EDGAR: a software framework for the comparative analysis of prokaryotic genomes. BMC Bioinformatics. 2009;10(1):154. Background: The introduction of next generation sequencing approaches has caused a rapid increase in the number of completely sequenced genomes. As one result of this development, it is now feasible to analyze large groups of related genomes in a comparative approach. A main task in comparative genomics is the identification of orthologous genes in different genomes and the classification of genes as core genes or singletons. Results: To support these studies, EDGAR – "Efficient Database framework for comparative Genome Analyses using BLAST score Ratios" – was developed. EDGAR is designed to automatically perform genome comparisons in a high-throughput approach. Comparative analyses for 582 genomes across 75 genus groups taken from the NCBI genomes database were conducted with the software and the results were integrated into an underlying database. To demonstrate a specific application case, we analyzed ten genomes of the bacterial genus Xanthomonas, for which phylogenetic studies were awkward due to divergent taxonomic systems. The resultant phylogeny EDGAR provided was consistent with outcomes from traditional approaches performed recently and, moreover, it was possible to root each strain with unprecedented accuracy. Conclusion: EDGAR provides novel analysis features and significantly simplifies the comparative analysis of related genomes. The software supports a quick survey of evolutionary relationships and simplifies the process of obtaining new biological insights into the differential gene content of kindred genomes. Visualization features, like synteny plots or Venn diagrams, are offered to the scientific community through a web-based and therefore platform-independent user interface, http://edgar.cebitec.uni-bielefeld.de, where the precomputed data sets can be browsed.
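The core-gene versus singleton distinction named in this abstract can be sketched in Python as follows. The sketch assumes that ortholog detection has already been done and has produced, for each gene family, the set of genomes that contain a member. The function name, the dictionary layout, and the third label "dispensable" (for families found in some but not all genomes) are illustrative assumptions and do not describe EDGAR's actual implementation.

```python
from typing import Dict, Set


def classify_gene_families(
    families: Dict[str, Set[str]], genomes: Set[str]
) -> Dict[str, str]:
    """Label each gene family as 'core', 'singleton', or 'dispensable'.

    families: hypothetical mapping of family id -> genomes containing a member.
    genomes:  the full set of genomes being compared.
    """
    labels = {}
    for family_id, members in families.items():
        present = members & genomes  # ignore genomes outside the comparison
        if not present:
            continue  # family not found in any compared genome
        if present == genomes:
            labels[family_id] = "core"
        elif len(present) == 1:
            labels[family_id] = "singleton"
        else:
            labels[family_id] = "dispensable"
    return labels


# Toy input with three genomes and made-up family ids.
genomes = {"strain_A", "strain_B", "strain_C"}
families = {
    "fam_1": {"strain_A", "strain_B", "strain_C"},
    "fam_2": {"strain_B"},
    "fam_3": {"strain_A", "strain_C"},
}
labels = classify_gene_families(families, genomes)
```

Counting the families per label would give the kind of numbers typically shown in a Venn diagram of shared gene content.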

Crossref

Springer - Publisher Connector

PubMed Central

Publications at Bielefeld University

Aerobic Lineage of the Oxidative Stress Response Protein Rubrerythrin Emerged in an Ancient Microaerobic, (Hyper)Thermophilic Environment

Author: Cardenas JP
Holmes DS
Quatrini R
Publication venue: Frontiers Media SA
Publication date: 01/01/2016

Indexing: Web of Science; Scopus. Rubrerythrins (RBRs) are non-heme di-iron proteins belonging to the ferritin-like superfamily. They are involved in oxidative stress defense as peroxide scavengers in a wide range of organisms. The vast majority of RBRs, including classical forms of this protein, contain a C-terminal rubredoxin-like domain involved in electron transport that is used during catalysis in anaerobic conditions. Rubredoxin is an ancient and large protein family of short length (<100 residues) that contains a Fe-S center involved in electron transfer. However, functional forms of the enzyme lacking the rubredoxin-like domain have been reported (e.g., sulerythrin and ferriperoxin). In this study, phylogenomic evidence is presented that suggests that a complete lineage of rubrerythrins, lacking the rubredoxin-like domain, arose in an ancient microaerobic and (hyper)thermophilic environment in the ancestors of the Archaea Thermoproteales and Sulfolobales. This lineage (termed the "aerobic-type" lineage) subsequently evolved to become adapted to environments with progressively lower temperatures and higher oxygen concentrations via the acquisition of two co-localized genes, termed DUF3501 and RFO, encoding a conserved protein of unknown function and a predicted Fe-S oxidoreductase, respectively. Proposed horizontal gene transfer events from these archaeal ancestors to Bacteria expanded the opportunities for further evolution of this RBR, including adaptation to lower temperatures. The second lineage (termed the cyanobacterial lineage) is proposed to have evolved in cyanobacterial ancestors, perhaps in direct response to the production of oxygen via oxygenic photosynthesis during the Great Oxygen Event (GOE). It is hypothesized that both lineages of RBR emerged in a largely anaerobic world with "whiffs" of oxygen and that their subsequent independent evolutionary trajectories allowed microorganisms to transition from this anaerobic world to an aerobic one. http://journal.frontiersin.org/article/10.3389/fmicb.2016.01822/ful

Frontiers - Publisher Connector

PubMed Central

Repositorio Institucional Académico Universidad Andrés Bello

Comparative genomic analysis and molecular examination of the diversity of enterotoxigenic Escherichia coli isolates from Chile

Author: Del Canto Felipe
Fleckenstein James M
Hazen Tracy H
Luo Qingwei
Rasko David A
Vidal Roberto
Publication venue: Digital Commons@Becker
Publication date: 01/11/2019

Enterotoxigenic Escherichia coli (ETEC) is one of the most common diarrheal pathogens in the low- and middle-income regions of the world; however, a systematic examination of the genomic content of isolates from Chile has not yet been undertaken. Whole genome sequencing and comparative analysis of a collection of 125 ETEC isolates from three geographic locations in Chile allowed the interrogation of phylogenomic groups, sequence types and genes specific to isolates from the different geographic locations. A total of 80.8% (101/125) of the ETEC isolates were identified in E. coli phylogroup A, 15.2% (19/125) in phylogroup B, and 4.0% (5/125) in phylogroup E. The over-representation of genomes in phylogroup A was significantly different from other global ETEC genomic studies. The Chilean ETEC isolates could be further subdivided into sub-clades similar to previously defined global ETEC reference lineages that had conserved multi-locus sequence types and toxin profiles. Comparison of the gene content of the Chilean ETEC identified genes that were unique based on geographic location within Chile, phylogenomic classifications or sequence type. Completion of a limited number of genomes provided insight into the ETEC plasmid content, which is conserved in some phylogenomic groups and not conserved in others. These findings suggest that the Chilean ETEC isolates contain unique virulence factor combinations and genomic content compared to global reference ETEC isolates.

Directory of Open Access Journals

Digital Commons@Becker

Genome of Drosophila suzukii, the spotted wing drosophila.

Author: Begun David J
Chiu Joanna C
Cridland Julie M
Hamby Kelly A
Hamm Christopher A
Jiang Xuanting
Kwok Rosanna S
Lee Ernest K
Saelao Perot
Walton Vaughn M
Zalom Frank G
Zhang Guojie
Zhao Li
Publication venue: eScholarship, University of California
Publication date: 18/10/2013

Drosophila suzukii Matsumura (spotted wing drosophila) has recently become a serious pest of a wide variety of fruit crops in the United States as well as in Europe, leading to substantial yearly crop losses. To enable basic and applied research of this important pest, we sequenced the D. suzukii genome to obtain a high-quality reference sequence. Here, we discuss the basic properties of the genome and transcriptome and describe patterns of genome evolution in D. suzukii and its close relatives. Our analyses and genome annotations are presented in a web portal, SpottedWingFlyBase, to facilitate public access.

PubMed Central

eScholarship - University of California