Cold Spring Harbor Laboratory Institutional Repository

Springer - Publisher Connector

Digital Repository at the University of Maryland

Genome re-annotation: a wiki solution?

Author: AL Delcher
AV Lukashin
JC Venter
JD Peterson
O White
RD Fleischmann
SF Altschul
SR Eddy
Steven L Salzberg
The International Human Genome Sequencing Consortium
Publication venue: BioMed Central
Publication date: 01/02/2007

The annotation of most genomes becomes outdated over time, owing in part to our ever-improving knowledge of genomes and in part to improvements in bioinformatics software. Unfortunately, annotation is rarely if ever updated, and resources to support routine reannotation are scarce. Wiki software, which would allow many scientists to edit each genome's annotation, offers one possible solution.

Digital Repository at the University of Maryland

CGAT: a comparative genome analysis tool for visualizing alignments in the analysis of complex evolutionary changes between closely related genomes

Author: A Chinen
A Nobusato
A van Belkum
AL Delcher
AL Delcher
AL Delcher
B Gottgens
B Ma
C Josenhans
D Gusfield
D Romero
DA Nix
DA Pollard
E Gilson
F Kunst
FR Blattner
G Levinson
H Takami
I Uchiyama
I Uchiyama
Ichizo Kobayashi
Ikuo Uchiyama
J Parkhill
J Yang
JF Tomb
JH Choi
JM Claverie
K Ishikawa
KA Frazer
M Brudno
M Brudno
M Brudno
M Kawai
M Kawai
MY Leung
N Bray
N Jareborg
N Jareborg
NA Moran
NJ Saunders
P Siguier
RA Alm
S Karlin
S Schwartz
S Schwartz
SB Needleman
SF Altschul
T Hayashi
T Tsuru
TJ Carver
Toshio Higuchi
U Dobrindt
W Huang
WJ Kent
WJ Kent
WR Pearson
Z Ning
Z Zhang
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: The recent accumulation of closely related genomic sequences provides a valuable resource for the elucidation of the evolutionary histories of various organisms. However, although numerous alignment calculation and visualization tools have been developed to date, the analysis of complex genomic changes, such as large insertions, deletions, inversions, translocations and duplications, still presents certain difficulties. RESULTS: We have developed a comparative genome analysis tool, named CGAT, which allows detailed comparisons of closely related bacteria-sized genomes mainly through visualizing middle-to-large-scale changes to infer underlying mechanisms. CGAT displays precomputed pairwise genome alignments on both dotplot and alignment viewers with scrolling and zooming functions, and allows users to move along the pre-identified orthologous alignments. Users can place several types of information on this alignment, such as the presence of tandem repeats or interspersed repetitive sequences and changes in G+C contents or codon usage bias, thereby facilitating the interpretation of the observed genomic changes. In addition to displaying precomputed alignments, the viewer can dynamically calculate the alignments between specified regions; this feature is especially useful for examining the alignment boundaries, as these boundaries are often obscure and can vary between programs. Besides the alignment browser functionalities, CGAT also contains an alignment data construction module, which contains various procedures that are commonly used for pre- and post-processing for large-scale alignment calculation, such as the split-and-merge protocol for calculating long alignments, chaining adjacent alignments, and ortholog identification. Indeed, CGAT provides a general framework for the calculation of genome-scale alignments using various existing programs as alignment engines, which allows users to compare the outputs of different alignment programs. 
Earlier versions of this program have been used successfully in our research to infer the evolutionary history of apparently complex genome changes between closely related eubacteria and archaea. CONCLUSION: CGAT is a practical tool for analyzing complex genomic changes between closely related genomes using existing alignment programs and other sequence analysis tools combined with extensive manual inspection.
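
The abstract above mentions overlaying changes in G+C content on an alignment. As a rough illustration of what such a track contains, the following sketch computes G+C fraction in sliding windows. It is not CGAT's implementation; the function names, window size, and toy sequence are all assumptions made for this example.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C among the unambiguous A/C/G/T bases in seq."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    # Windows with no unambiguous bases (e.g. all N) are reported as 0.0.
    return gc / total if total else 0.0


def sliding_gc(seq: str, window: int, step: int) -> list[tuple[int, float]]:
    """Return (window start, G+C fraction) for every full window along seq."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [
        (start, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    # Hypothetical toy sequence: AT-rich flanks around a GC-rich core.
    toy = "ATATATATGCGCGCGCATATATAT"
    for start, frac in sliding_gc(toy, window=8, step=4):
        print(start, frac)
```

On a real genome the window would be far larger (hundreds to thousands of bases), and a sharp shift in the resulting profile is the kind of signal that can flag a candidate inserted region for closer inspection.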

Springer - Publisher Connector

Context-driven discovery of gene cassettes in mobile integrons using a computational grammar

Author: A Moura
ACE Darling
AL Delcher
AL Delcher
CJ van Rijsbergen
D Frishman
DA Rowe-Magnus
DB Searls
E Rivas
Enrico Coiera
F Baquero
F Meyer
F Meyer
Guy Tsafnat
H Quesneville
HW Stokes
HW Stokes
IT Paulsen
J Fleiss
J Landis
Jaron Schaeffer
Jon R Iredell
K Rutherford
L Stein
M Ashburner
M Kanehisa
MA Andrade
MJ Joss
R Overbeek
RM Hall
RS Levings
S Ji
S Leung
Sally R Partridge
SF Altschul
SR Partridge
U Bohnebeck
WR Pearson
Y Boucher
Publication venue: BioMed Central
Publication date: 01/01/2009

BACKGROUND: Gene discovery algorithms typically examine sequence data for low-level patterns. A novel method to computationally discover higher-order DNA structures is presented, using a context-sensitive grammar. The algorithm was applied to the discovery of gene cassettes associated with integrons. The discovery and annotation of antibiotic resistance genes in such cassettes is essential for effective monitoring of antibiotic resistance patterns and formulation of public health antibiotic prescription policies. RESULTS: We discovered two new putative gene cassettes using the method, from 276 integron features and 978 GenBank sequences. The system achieved κ = 0.972 annotation agreement with an expert gold standard of 300 sequences. In rediscovery experiments, we deleted 789,196 cassette instances over 2030 experiments and correctly relabelled 85.6% (α ≥ 95%, E ≤ 1%, mean sensitivity = 0.86, specificity = 1, F-score = 0.93), with no false positives. Error analysis demonstrated that for 72,338 missed deletions, two adjacent deleted cassettes were labeled as a single cassette, increasing performance to 94.8% (mean sensitivity = 0.92, specificity = 1, F-score = 0.96). CONCLUSION: Using grammars we were able to represent heuristic background knowledge about large and complex structures in DNA. Importantly, we were also able to use the context embedded in the model to discover new putative antibiotic resistance gene cassettes. The method is complementary to existing automatic annotation systems which operate at the sequence level.
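
The evaluation figures in this abstract (κ, sensitivity, F-score) follow standard definitions, sketched below. This is a generic illustration, not the authors' evaluation code: the function names and the example confusion table are assumptions. Because the abstract reports no false positives, precision is taken as 1, and sensitivity plays the role of recall.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def cohen_kappa(table: list[list[int]]) -> float:
    """Cohen's kappa for a square table; table[i][j] counts items that
    annotator A put in category i and annotator B put in category j."""
    n = sum(sum(row) for row in table)
    k = len(table)
    observed = sum(table[i][i] for i in range(k)) / n
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(
        sum(table[i]) * sum(row[i] for row in table) for i in range(k)
    ) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Precision 1 (no false positives) with the two reported sensitivities.
    print(round(f_score(1.0, 0.86), 3))
    print(round(f_score(1.0, 0.92), 3))
    # Hypothetical 2x2 agreement table, not data from the paper.
    print(round(cohen_kappa([[45, 5], [5, 45]]), 3))
```

With precision 1, a sensitivity of 0.86 gives F ≈ 0.925 and 0.92 gives F ≈ 0.958. The abstract's reported values are means over many experiments, so they need not match these single-point calculations exactly.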

Springer - Publisher Connector

Macquarie University ResearchOnline

The genome and transcriptome of Trichormus sp. NMC-1: insights into adaptation to extreme environments on the Qinghai-Tibet Plateau

Author: A Stamatakis
A Zorina
AL Delcher
B Langmead
BA Methé
C Xie
DA Los
DJ Wright
EP Balskus
G Blanc
G Norsang
HÄ Suh
J Qi
J Qi
J Zhang
JF Hess
JI Carreto
JM Shick
JP Zehr
K Mavromatis
KS Siddiqui
L Li
L R
L Ran
M Borodovsky
M Dassanayake
M Li
M Suyama
N Myers
P Pereira
P Puigbò
P Rajaniemi
PH Sudmant
PM Shih
Q Qiu
Q Tang
R Cavicchioli
RC Edgar
RL Tatusov
S Richter
SP Singh
SP Singh
SP Singh
T De Bie
T Kaneko
T Kogej
T Shi
U Consortium
U Nübel
WM Fitch
Z Xu
Z Yang
Z Yang
ZA Cheviron
Publication venue: Springer Science and Business Media LLC
Publication date: 06/07/2016

The Qinghai-Tibet Plateau (QTP) has the highest biodiversity for an extreme environment worldwide, and provides an ideal natural laboratory to study adaptive evolution. In this study, we generated a draft genome sequence of the cyanobacterium Trichormus sp. NMC-1 in the QTP and performed whole-transcriptome sequencing under low temperature to investigate the genetic mechanism by which T. sp. NMC-1 adapted to the specific environment. Its genome sequence was 5.9 Mb with a G+C content of 39.2% and encompassed a total of 5362 CDS. A phylogenomic tree indicated that this strain belongs to the Trichormus and Anabaena cluster. Genome comparison between T. sp. NMC-1 and six relatives showed that functionally unknown genes occupied a much higher proportion (28.12%) of the T. sp. NMC-1 genome. In addition, functions of specific, significantly positively selected, expanded orthogroups, and differentially expressed genes involved in signal transduction, cell wall/membrane biogenesis, secondary metabolite biosynthesis, and energy production and conversion were analyzed to elucidate specific adaptation traits. Further analyses showed that the CheY-like genes, extracellular polysaccharide and mycosporine-like amino acids might play major roles in adaptation to harsh environments. Our findings indicate that sophisticated genetic mechanisms are involved in cyanobacterial adaptation to the extreme environment of the QTP.

Institute of Hydrobiology, Chinese Academy Of Sciences

University of Bedfordshire Repository

GenePRIMP: a gene prediction improvement pipeline for prokaryotic genomes

Author: A Nagy
AL Delcher
Amrita Pati
Athanasios Lykidis
DA Benson
Galina Ovchinnikova
GX Yu
HQ Zhu
J Besemer
KL Smollett
M Tech
Natalia Mikhailova
Natalia N Ivanova
NC Kyrpides
NE Castellana
Nikos C Kyrpides
RK Aziz
S Bocs
Sean D Hooper
VM Markowitz
Y Ishino
Publication venue: Springer Science and Business Media LLC
Publication date: 01/04/2010

We present the 'gene prediction improvement pipeline' (GenePRIMP; http://geneprimp.jgi-psf.org/), a computational process that performs evidence-based evaluation of gene models in prokaryotic genomes and reports anomalies including inconsistent start sites, missed genes and split genes. We found that manual curation of gene models using the anomaly reports generated by GenePRIMP improved their quality, and we demonstrate the applicability of GenePRIMP in improving finishing quality and comparing different genome-sequencing and annotation technologies.

UNT Digital Library

Sequence and annotation of the Wizard007 mycobacterium phage genome

Author: AL Delcher
Anthony Falcone
Benjamin Howard
Brittney Howard
Claire Rinehart
Courtney Howard
Cynthia Tope
D Gordon
Ejike Anyanwu
Elizabeth Farnsworth
Heidi Sayre
J Besemer
Jordan Olberding
Kaitlyn Cole
Karlee Driver
LD Stein
Mackenzie Perkins
Prasanna Tamarapu Parthasarathy
Rodney King
Sarah Schrader
SE Lewis
SF Altschul
TM Lowe
Tyler Scaff
Publication venue: BioMed Central
Publication date: 01/07/2010

Public Library of Science (PLOS)

The Early Stage of Bacterial Genome-Reductive Evolution in the Host

Author: A Benenson
A Mira
A Tuanyok
AC Cheng
AI Nilsson
AL Delcher
AL Delcher
B Sallstrom
C McGilvray
C Romero
CH Lin
CJ Roy
D Dance
D DeShazer
D DeShazer
D Godoy
EW Myers
F Rodrigues
G Levinson
GC Whitlock
H Kim
Han Song
Heenam Stanley Kim
Howard Ochman
Hyojeong Yi
J Batut
J Malakooti
J Parkhill
J Parkhill
Junghyun Hwang
KW Deitsch
L Wilkinson
MTG Holden
NA Moran
NA Moran
NA Moran
NA Moran
RA Moore
RA Moore
RD Fleischmann
Ricky L. Ulrich
RL Ulrich
SE Schutzer
SL Salzberg
T Dharakul
TD Schneider
TJ Carver
TJ Treangen
TJJ Inglis
WC Nierman
William C. Nierman
Yan Yu
Publication venue: Public Library of Science
Publication date: 01/05/2010

The equine-associated obligate pathogen Burkholderia mallei was developed by reductive evolution involving a substantial portion of the genome from Burkholderia pseudomallei, a free-living opportunistic pathogen. With its short history of divergence (∼3.5 myr), B. mallei provides an excellent resource to study the early steps in bacterial genome reductive evolution in the host. By examining 20 genomes of B. mallei and B. pseudomallei, we found that stepwise massive expansion of IS (insertion sequence) elements ISBma1, ISBma2, and IS407A occurred during the evolution of B. mallei. Each element proliferated through the sites where its target selection preference was met. Then, ISBma1 and ISBma2 contributed to the further spread of IS407A by providing secondary insertion sites. This spread increased genomic deletions and rearrangements, which were predominantly mediated by IS407A. There were also nucleotide-level disruptions in a large number of genes. However, no significant signs of erosion were yet noted in these genes. Intriguingly, all these genomic modifications did not seriously alter the gene expression patterns inherited from B. pseudomallei. This efficient and elaborate genomic transition was enabled largely through the formation of the highly flexible IS-blended genome and the guidance by selective forces in the host. The detailed IS intervention, unveiled for the first time in this study, may represent the key component of a general mechanism for early bacterial evolution in the host.

Comparative genomic analysis of Vibrio parahaemolyticus: serotype conversion and virulence

Author: A Krogh
AI Gil
AL Delcher
AL Delcher
Ana I Gil
CY Chen
D Bramhill
DE Fouts
DE Fouts
Derrick E Fouts
EF Boyd
EW Myers
G Balakrish Nair
GB Nair
GB Nair
H Nasu
H Shirai
J Chun
J Tada
JF Heidelberg
Jonathan H Badger
JR Miller
K Makino
KS Park
L Feng
L Fujino
L Wang
M Dziejman
M Kamruzzaman
M Kamruzzaman
M Nishibuchi
M Nishibuchi
M Okura
Mitsuaki Nishibuchi
N Okada
NR Chowdhury
NR Chowdhury
O Colin Stine
PK Bag
RD Fleischmann
T Kodama
T Kodama
T Ono
T Popovic
X Zhou
X Zhou
Y Chen
Y Chen
Y Chen
YB Kim
Yuansha Chen
Publication venue: BioMed Central
Publication date: 01/06/2011

BACKGROUND: Vibrio parahaemolyticus is a common cause of foodborne disease. Beginning in 1996, a more virulent strain having serotype O3:K6 caused major outbreaks in India and other parts of the world, resulting in the emergence of a pandemic. Other serovariants of this strain emerged during its dissemination and together with the original O3:K6 were termed strains of the pandemic clone. Two genomes, one of this virulent strain and one of a pre-pandemic strain, have been sequenced. We sequenced four additional genomes of V. parahaemolyticus in this study that were isolated from different geographical regions and time points. Comparative genomic analyses of six strains of V. parahaemolyticus isolated from Asia and Peru were performed in order to advance knowledge concerning the evolution of V. parahaemolyticus; specifically, the genetic changes contributing to serotype conversion and virulence. Two pre-pandemic strains and three pandemic strains, isolated from different geographical regions, were serotype O3:K6 with toxin profiles of either (tdh+, trh-) or (tdh-, trh+). The sixth pandemic strain sequenced in this study was serotype O4:K68. RESULTS: Genomic analyses revealed that the trh+ and tdh+ strains had different types of pathogenicity islands and mobile elements as well as major structural differences between the tdh pathogenicity islands of the pre-pandemic and pandemic strains. In addition, the results of single nucleotide polymorphism (SNP) analysis showed that 94% of the SNPs between O3:K6 and O4:K68 pandemic isolates were within a 141 kb region surrounding the O- and K-antigen-encoding gene clusters. The "core" genes of V. parahaemolyticus were also compared to those of V. cholerae and V. vulnificus, in order to delineate differences between these three pathogenic species. Approximately one-half (49-59%) of each species' core genes were conserved in all three species, and 14-24% of the core genes were species-specific and in different functional categories. CONCLUSIONS: Our data support the idea that the pandemic strains are closely related and that recent South American outbreaks of foodborne disease caused by V. parahaemolyticus are closely linked to outbreaks in India. Serotype conversion from O3:K6 to O4:K68 was likely due to a recombination event involving a region much larger than the O-antigen- and K-antigen-encoding gene clusters. Major differences between pathogenicity islands and mobile elements are also likely driving the evolution of V. parahaemolyticus. In addition, our analyses categorized genes that may be useful in differentiating pathogenic Vibrios at the species level.
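
The core-gene comparison in this abstract reports, for each species, the share of its core genes conserved in all three species. A minimal sketch of that arithmetic follows, assuming each species' core genome has already been reduced to a set of shared ortholog-group identifiers. The identifiers, species labels, and function name are hypothetical, and the paper's actual ortholog-calling procedure is not reproduced here.

```python
def shared_core_fraction(core_sets: dict[str, set[str]]) -> dict[str, float]:
    """For each species, the fraction of its core ortholog groups that
    is present in the core sets of every species supplied."""
    if not core_sets:
        return {}
    shared = set.intersection(*core_sets.values())
    return {
        name: len(shared) / len(groups)
        for name, groups in core_sets.items()
        if groups  # skip empty sets to avoid dividing by zero
    }


if __name__ == "__main__":
    # Hypothetical ortholog-group IDs, not data from the paper.
    cores = {
        "species_A": {"og1", "og2", "og3", "og4"},
        "species_B": {"og1", "og2", "og5"},
        "species_C": {"og1", "og2", "og3", "og6", "og7"},
    }
    for name, frac in shared_core_fraction(cores).items():
        print(name, round(frac, 2))
```

The same shared set gives a different percentage for each species because the denominators differ, which is why the abstract reports a range rather than a single figure.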

The DNA60IFX contest

Author: AL Delcher
B Langmead
D Earl
D Marbach
G Marçais
GM Church
H Li
James Taylor
KR Bradnam
Michael C Schatz
N Attar
P Rice
Q Wang
RC Holland
S Boisvert
S Gnerre
S Kurtz
S Sun
SF Altschul
Sven-Eric Schelhorn
TZ DeSantis
Publication venue: Springer Science and Business Media LLC
Publication date