122 research outputs found

Efficient Plant Gene Identification Based on Interspecies Mapping of Full-Length cDNAs

Author: Alexandrov
Alexandrov
Altschul
Aoki
Bennetzen
Borodovsky
H. Numa
H. Sakai
Jabbari
Jaillon
Jia
Kikuchi
Lander
Liu
Lomsadze
Majoros
Ming
Mott
N. Amano
Paterson
Paux
Quinn
Ralph
Schnable
Schulte
Soderlund
Spannagl
T. Itoh
T. Tanaka
Tuskan
Usuka
Vogel
Wicker
Publication venue: Oxford University Press
Publication date: 01/01/2010

We present an annotation pipeline that accurately predicts exon–intron structures and protein-coding sequences (CDSs) on the basis of full-length cDNAs (FLcDNAs). This annotation pipeline was used to identify genes in 10 plant genomes. In particular, we show that interspecies mapping of FLcDNAs to genomes is of great value in fully utilizing FLcDNA resources whose availability is limited to several species. Because low sequence conservation at 5′- and 3′-ends of FLcDNAs between different species tends to result in truncated CDSs, we developed an improved algorithm to identify complete CDSs by the extension of both ends of truncated CDSs. Interspecies mapping of 71 801 monocot FLcDNAs to the Oryza sativa genome led to the detection of 22 142 protein-coding regions. Moreover, in comparing two mapping programs and three ab initio prediction programs, we found that our pipeline was more capable of identifying complete CDSs. As demonstrated by monocot interspecies mapping, in which nucleotide identity between FLcDNAs and the genome was ∼80%, the resultant inferred CDSs were sufficiently accurate. Finally, we applied both inter- and intraspecies mapping to 10 monocot and dicot genomes and identified genes in 210 551 loci. Interspecies mapping of FLcDNAs is expected to effectively predict genes and CDSs in newly sequenced genomes.

CiteSeerX

Crossref

PubMed Central

MaizeGDB becomes ‘sequence-centric’

Author: Buckler
C. J. Lawrence
C. M. Andorf
D. A. Campbell
E. Cannon
Gardiner
Haas
J. Duvick
L. C. Harper
Lawrence
Lawrence
Lawrence
Liu
Lomsadze
M. E. Sparks
M. L. Schaeffer
McCarty
McMullen
Schlueter
Schnable
Sparks
Sparks
Stein
T. Z. Sen
Till
V. P. Brendel
Wilkerson
Publication venue: Oxford University Press
Publication date: 01/01/2009

MaizeGDB is the maize research community’s central repository for genetic and genomic information about the crop plant and research model Zea mays ssp. mays. The MaizeGDB team endeavors to meet research needs as they evolve based on researcher feedback and guidance. Recent work has focused on better integrating existing data with sequence information as it becomes available for the B73, Mo17 and Palomero Toluqueño genomes. Major endeavors along these lines include the implementation of a genome browser to graphically represent genome sequences; implementation of POPcorn, a portal ancillary to MaizeGDB that offers access to independent maize projects and will allow BLAST similarity searches of participating projects’ data sets from a single point; and a joint MaizeGDB/PlantGDB project to involve the maize community in genome annotation. In addition to summarizing recent achievements and future plans, this article also discusses specific examples of community involvement in setting priorities and design aspects of MaizeGDB, which should be of interest to other database and resource providers seeking to better engage their users. MaizeGDB is accessible online at http://www.maizegdb.org

Digital Repository @ Iowa State University (ISU)

Crossref

PubMed Central

Genomic organization and evolution of the Atlantic salmon hemoglobin repertoire

Background: The genomes of salmonids are considered pseudo-tetraploid undergoing reversion to a stable diploid state. Given the genome duplication and extensive biological data available for salmonids, they are excellent model organisms for studying comparative genomics, evolutionary processes, fates of duplicated genes and the genetic and physiological processes associated with complex behavioral phenotypes. The evolution of the tetrapod hemoglobin genes is well studied; however, little is known about the genomic organization and evolution of teleost hemoglobin genes, particularly those of salmonids. The Atlantic salmon serves as a representative salmonid species for genomics studies. Given the well documented role of hemoglobin in adaptation to varied environmental conditions as well as its use as a model protein for evolutionary analyses, an understanding of the genomic structure and organization of the Atlantic salmon α and β hemoglobin genes is of great interest. Results: We identified four bacterial artificial chromosomes (BACs) comprising two hemoglobin gene clusters spanning the entire α and β hemoglobin gene repertoire of the Atlantic salmon genome. Their chromosomal locations were established using fluorescence in situ hybridization (FISH) analysis and linkage mapping, demonstrating that the two clusters are located on separate chromosomes. The BACs were sequenced and assembled into scaffolds, which were annotated for putatively functional and pseudogenized hemoglobin-like genes. This revealed that the tail-to-tail organization and alternating pattern of the α and β hemoglobin genes are well conserved in both clusters, as well as that the Atlantic salmon genome houses substantially more hemoglobin genes, including non-Bohr β globin genes, than the genomes of other teleosts that have been sequenced.
Conclusions: We suggest that the most parsimonious evolutionary path leading to the present organization of the Atlantic salmon hemoglobin genes involves the loss of a single hemoglobin gene cluster after the whole genome duplication (WGD) at the base of the teleost radiation but prior to the salmonid-specific WGD, which then produced the duplicated copies seen today. We also propose that the relatively high number of hemoglobin genes as well as the presence of non-Bohr β hemoglobin genes may be due to the dynamic life history of salmon and the diverse environmental conditions that the species encounters. Data deposition: BACs S0155C07 and S0079J05 (fps135): GenBank GQ898924; BACs S0055H05 and S0014B03 (fps1046): GenBank GQ898925.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Simon Fraser University Institutional Repository

Horizontal gene transfer of acetyltransferases, invertases and chorismate mutases from different bacteria to diverse recipients

Author: A Haegeman
A Haegeman
A Krogh
A Lomsadze
A Marchler-Bauer
A Marchler-Bauer
A Mitchell
AM Waterhouse
AY Lee
B Elsworth
B Gao
BK Wijayawardena
DL Theobald
DL Wheeler
EA Doyle
EG Danchin
EH Scholl
F Wang
G Huang
G Huang
H Peng
H Yu
I Kaplan
JA Cotton
Jason B. Noon
JB Noon
JP Craig
JP Craig
JT Jones
K Tamura
KN Lambert
L Bauters
L Kall
M Burke
M Holterman
M Kumar
MG Mitchum
MW Vetting
N Karim
P Abad
P Jones
P Puigbo
P Vieira
R Bentley
RC Edgar
S Bekal
S Bekal
S Eves-van den Akker
SF Altschul
T Kikuchi
T Wylie
TC Boothby
Thomas J. Baum
TN Petersen
YM Chook
Z Fu
Publication venue: Springer Science and Business Media LLC

Crossref

CodingQuarry: Highly accurate hidden Markov model gene prediction in fungal genomes using RNA-seq transcripts

Author: A Guida
A Kumar
A Lomsadze
AD Neverov
Alison C Testa
AM McGuire
AV Lukashin
BJ Haas
BJ Haas
BJ Haas
BJ Loftus
BL Cantarel
C Camacho
C Holt
C Trapnell
C Trapnell
C Zhao
D Cullen
D Kim
D Martinez
DHD Kulp
DM Kupfer
GC Cerqueira
I Korf
I Reid
J Liu
James K Hane
JE Galagan
JK Hane
KJ Hoff
KR Christie
L Wang
M Berg Van Den
M Burset
M Dashtban
M Kozak
M Marcet-Houben
M Martin
M Stanke
M Stanke
M Stanke
MG Grabherr
N Rhind
NR Coordinators
R Dean
R Leinonen
RD Finn
Richard P Oliver
RP Oliver
RY Eberhardt
SB Hedges
Simon R Ellwood
SL Forsburg
SR Ellwood
T Steijger
TL Friesen
TU Consortium
V Ter-Hovhannisyan
VM Bruno
WM Vos de
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2015

Background: The impact of gene annotation quality on functional and comparative genomics makes gene prediction an important process, particularly in non-model species, including many fungi. Sets of homologous protein sequences are rarely complete with respect to the fungal species of interest and are often small or unreliable, especially when closely related species have not been sequenced or annotated in detail. In these cases, protein homology-based evidence fails to correctly annotate many genes, or significantly improve ab initio predictions. Generalised hidden Markov models (GHMM) have proven to be invaluable tools in gene annotation and, recently, RNA-seq has emerged as a cost-effective means to significantly improve the quality of automated gene annotation. As these methods do not require sets of homologous proteins, improving gene prediction from these resources is of benefit to fungal researchers. While many pipelines now incorporate RNA-seq data in training GHMMs, there has been relatively little investigation into additionally combining RNA-seq data at the point of prediction, and room for improvement in this area motivates this study. Results: CodingQuarry is a highly accurate, self-training GHMM fungal gene predictor designed to work with assembled, aligned RNA-seq transcripts. RNA-seq data informs annotations both during gene-model training and in prediction. Our approach capitalises on the high quality of fungal transcript assemblies by incorporating predictions made directly from transcript sequences. Correct predictions are made despite transcript assembly problems, including those caused by overlap between the transcripts of adjacent gene loci. Stringent benchmarking against high-confidence annotation subsets showed CodingQuarry predicted 91.3% of Schizosaccharomyces pombe genes and 90.4% of Saccharomyces cerevisiae genes perfectly. These results are 4-5% better than those of AUGUSTUS, the next best performing RNA-seq driven gene predictor tested. 
Comparisons against whole genome Sc. pombe and S. cerevisiae annotations further substantiate a 4-5% improvement in the number of correctly predicted genes. Conclusions: We demonstrate the success of a novel method of incorporating RNA-seq data into GHMM fungal gene prediction. This shows that a high quality annotation can be achieved without relying on protein homology or a training set of genes. CodingQuarry is freely available (https://sourceforge.net/projects/codingquarry/), and suitable for incorporation into genome annotation pipelines.

Crossref

Springer - Publisher Connector

PubMed Central

espace@Curtin

Phylogenomics of Unusual Histone H2A Variants in Bdelloid Rotifers

Author: A Celeste
A Doron-Faigenboim
A Ehinger
A Li
A Lomsadze
A Stamatakis
A Stern
A Sundås
A Viera
BB Normark
BBJ Tops
C Redon
C Ricci
C Ricci
DB Mark Welch
DB Mark Welch
DB Mark Welch
E Birney
E Gladyshev
F Abascal
GE Crooks
GG Lindsey
Harmit S. Malik
HS Malik
HS Malik
J Ausio
J Ausio
J Cabrero
J Fillingham
J Hur
J Rozas
J Zhang
J Zhang
JA Downs
Jae H. Hur
JL Mark Welch
Julien Guglielmini
K Katoh
K Katoh
K Zahradka
Karine Van Doninck
L Marino-Ramirez
M Caprioli
M Nei
M Suzuki
Matthew Meselson
MEA Churchill
Michel C. Milinkovitch
Morgan L. Mandigo
O Fernandez-Capetillo
O Fernandez-Capetillo
O Fernandez-Capetillo
Peter Wang
R Keall
R Wernersson
S Jaeger
S Lee
SK Mahadevaiah
SL Ooi
T Furata
T Schröder
TD Schneider
TT Paull
UK Laemmli
V Mattimore
William S. Lane
WM Bonner
X Song
Z Dominski
Z Yang
Z Yang
Z Yang
Publication venue: Public Library of Science
Publication date: 01/03/2009

Rotifers of Class Bdelloidea are remarkable in having evolved for millions of years, apparently without males and meiosis. In addition, they are unusually resistant to desiccation and ionizing radiation and are able to repair hundreds of radiation-induced DNA double-strand breaks (DSBs) per genome with little effect on viability or reproduction. Because specific histone H2A variants are involved in DSB repair and certain meiotic processes in other eukaryotes, we investigated the histone H2A genes and proteins of two bdelloid species. Genomic libraries were built and probed to identify histone H2A genes in Adineta vaga and Philodina roseola, species representing two different bdelloid families. The expressed H2A proteins were visualized on SDS-PAGE gels and identified by tandem mass spectrometry. We find that neither the core histone H2A, present in nearly all other eukaryotes, nor the H2AX variant, a ubiquitous component of the eukaryotic DSB repair machinery, is present in bdelloid rotifers. Instead, they are replaced by unusual histone H2A variants of higher mass. In contrast, a species of rotifer belonging to the facultatively sexual, desiccation- and radiation-intolerant sister class of bdelloid rotifers, the monogononts, contains a canonical core histone H2A and appears to lack the bdelloid H2A variant genes. Applying phylogenetic tools, we demonstrate that the bdelloid-specific H2A variants arose as distinct lineages from canonical H2A separate from those leading to the H2AX and H2AZ variants. The replacement of core H2A and H2AX in bdelloid rotifers by previously uncharacterized H2A variants with extended carboxy-terminal tails is further evidence for evolutionary diversity within this class of histone H2A genes and may represent adaptation to unusual features specific to bdelloid rotifers.

Public Library of Science (PLOS)

Crossref

Harvard University - DASH

Directory of Open Access Journals

PubMed Central

DI-fusion

Repository of the University of Namur