52 research outputs found

Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments

Author: Allen Jonathan E
Buell C Robin
Haas Brian J
Orvis Joshua
Pertea Mihaela
Salzberg Steven L
White Owen
Wortman Jennifer R
Zhu Wei
Publication venue: BioMed Central
Publication date: 01/01/2008

EVidenceModeler (EVM) is an automated annotation tool that predicts protein-coding regions, alternatively spliced transcripts and untranslated regions of eukaryotic genes

CiteSeerX

Crossref

Springer - Publisher Connector

PubMed Central

Digital Repository at the University of Maryland

The TIGR Rice Genome Annotation Resource: improvements and new features

Author: Buell C. Robin
Campbell Matthew
Childs Kevin
Haas Brian
Hamilton John
Lee Yuandan
Lin Haining
Malek Renae L.
Orvis Joshua
Ouyang Shu
Thibaud-Nissen Françoise
Wortman Jennifer
Zheng Li
Zhu Wei
Publication venue: Oxford University Press
Publication date: 01/12/2006

In The Institute for Genomic Research Rice Genome Annotation project (), we have continued to update the rice genome sequence with new data and improve the quality of the annotation. In our current release of annotation (Release 4.0; January 12, 2006), we have identified 42 653 non-transposable element-related genes encoding 49 472 gene models as a result of the detection of alternative splicing. We have refined our identification methods for transposable element-related genes resulting in 13 237 genes that are related to transposable elements. Through incorporation of multiple transcript and proteomic expression data sets, we have been able to annotate 24 799 genes (31 739 gene models), representing ∼50% of the total gene models, as expressed in the rice genome. All structural and functional annotation is viewable through our Rice Genome Browser which currently supports 59 tracks. Enhanced data access is available through web interfaces, FTP downloads and a Data Extractor tool developed in order to support discrete dataset downloads

Crossref

PubMed Central

Re-annotation of the Theileria parva genome refines 53% of the proteome and uncovers essential components of N-glycosylation, a conserved pathway in many organisms

Author: Bishop Richard P.
Daubenberger Claudia A.
Fry Lindsay M.
Gotia Hanzel T.
Ifeonu Olukemi O.
Iqbal Shaikh B. A.
Kumari Priti
Nene Vishvanath M.
Orvis Joshua
Palmateer Nicholas C.
Pelle Roger
Silva Joana C.
Tretina Kyle
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2020

The apicomplexan parasite Theileria parva causes a livestock disease called East coast fever (ECF), with millions of animals at risk in sub-Saharan East and Southern Africa, the geographic distribution of T. parva. Over a million bovines die each year of ECF, with a tremendous economic burden to pastoralists in endemic countries. Comprehensive, accurate parasite genome annotation can facilitate the discovery of novel chemotherapeutic targets for disease treatment, as well as elucidate the biology of the parasite. However, genome annotation remains a significant challenge because of limitations in the quality and quantity of the data being used to inform the location and function of protein-coding genes and, when RNA data are used, the underlying biological complexity of the processes involved in gene expression. Here, we apply our recently published RNAseq dataset derived from the schizont life-cycle stage of T. parva to update structural and functional gene annotations across the entire nuclear genome. The re-annotation effort led to evidence-supported updates in over half of all protein-coding sequence (CDS) predictions, including exon changes, gene merges and gene splitting, an increase in average CDS length of approximately 50 base pairs, and the identification of 128 new genes. Among the new genes identified were those involved in N-glycosylation, a process previously thought not to exist in this organism and a potentially new chemotherapeutic target pathway for treating ECF. Alternatively-spliced genes were identified, and antisense and multi-gene family transcription were extensively characterized. The process of re-annotation led to novel insights into the organization and expression profiles of protein-coding sequences in this parasite, and uncovered a minimal N-glycosylation pathway that changes our current understanding of the evolution of this post-translational modification in apicomplexan parasites

edoc

CGSpace

Genome-wide diversity and gene expression profiling of Babesia microti isolates identify polymorphic genes that mediate host-pathogen interactions

Author: Ben Mamoun Choukri
Brancato Jana
Chibucos Marcus
Colinge Jacques
Cornillot Emmanuel
Crabtree Jonathan
Dwivedi Ankit
Fraser Claire M.
Frutos Roger
Gotia Hanzel T.
Hung Chris
Ifeonu Olukemi O.
Krause Peter J.
Kumar Vidya
Kumari Priti
Lawres Lauren
McCracken Carrie
Molina Douglas M.
Orvis Joshua
Ott Sandy
Pablo Jozelyn V.
Pazzi Joseph E.
Reynes Christelle
Sadzewicz Lisa
Sengamalay Naomi
Shetty Amol C.
Silva Joana C.
Su Qi
Tallon Luke
Tretina Kyle
Usmani-Brown Sahar
Virji Azan Z.
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2016

Babesia microti, a tick-transmitted, intraerythrocytic protozoan parasite circulating mainly among small mammals, is the primary cause of human babesiosis. While most cases are transmitted by Ixodes ticks, the disease may also be transmitted through blood transfusion and perinatally. A comprehensive analysis of genome composition, genetic diversity, and gene expression profiling of seven B. microti isolates revealed that genetic variation in isolates from the Northeast United States is almost exclusively associated with genes encoding the surface proteome and secretome of the parasite. Furthermore, we found that polymorphism is restricted to a small number of genes, which are highly expressed during infection. In order to identify pathogen-encoded factors involved in host-parasite interactions, we screened a proteome array comprised of 174 B. microti proteins, including several predicted members of the parasite secretome. Using this immuno-proteomic approach we identified several novel antigens that trigger strong host immune responses during the onset of infection. The genomic and immunological data presented herein provide the first insights into the determinants of B. microti interaction with its mammalian hosts and their relevance for understanding the selective pressures acting on parasite evolution

Crossref

INRIA a CCSD electronic archive server

HAL Descartes

PubMed Central

Agritrop

Capture-based enrichment of Theileria parva DNA enables full genome assembly of first buffalo-derived strain and reveals exceptional intra-specific genetic diversity

Author: Awino Elias
Bishop Richard P
Crabtree Jonathan
Daubenberger Claudia A
Drabék Elliott
Gotia Hanzel T
Ifeonu Olukemi O
Knowles Donald P
Morrison W Ivan
Munro James B
Nene Vish
Orvis Joshua
Palmateer Nicholas C
Pelle Roger
Silva Joana C
Tallon Luke
Tretina Kyle
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2020

Theileria parva is an economically important, intracellular, tick-transmitted parasite of cattle. A live vaccine against the parasite is effective against challenge from cattle-transmissible T. parva but not against genotypes originating from the African Cape buffalo, a major wildlife reservoir, prompting the need to characterize genome-wide variation within and between cattle- and buffalo-associated T. parva populations. Here, we describe a capture-based target enrichment approach that enables, for the first time, de novo assembly of nearly complete T. parva genomes derived from infected host cell lines. This approach has exceptionally high specificity and sensitivity and is successful for both cattle- and buffalo-derived T. parva parasites. De novo genome assemblies generated for cattle genotypes differ from the reference by ~54K single nucleotide polymorphisms (SNPs) throughout the 8.31 Mb genome, an average of 6.5 SNPs/kb. We report the first buffalo-derived T. parva genome, which is ~20 kb larger than the genome from the reference, cattle-derived, Muguga strain, and contains 25 new potential genes. The average non-synonymous nucleotide diversity (πN) per gene, between buffalo-derived T. parva and the Muguga strain, was 1.3%. This remarkably high level of genetic divergence is supported by an average Wright’s fixation index (FST), genome-wide, of 0.44, reflecting a degree of genetic differentiation between cattle- and buffalo-derived T. parva parasites more commonly seen between, rather than within, species. These findings present clear implications for vaccine development, further demonstrated by the ability to assemble nearly all known antigens in the buffalo-derived strain, which will be critical in design of next generation vaccines. The DNA capture approach used provides a clear advantage in specificity over alternative T. parva DNA enrichment methods used previously, such as those that utilize schizont purification, is less labor intensive, and enables in-depth comparative genomics in this apicomplexan parasite

edoc

Edinburgh Research Explorer

CGSpace

Rapid transcriptome sequencing of an invasive pest, the brown marmorated stink bug Halyomorpha halys

Author: Chibucos Marcus C
Creasy Todd
Daugherty Sean
Dunning Hotopp Julie C
Flowers Melissa
Ioannidis Panagiotis
Kumar Nikhil
Lu Yong
Orvis Joshua
Ott Sandra
Pick Leslie
Sengamalay Naomi
Shetty Amol
Tallon Luke J
Publication venue: Springer Nature
Publication date: 01/01/2014

Halyomorpha halys (Stål) (Insecta:Hemiptera;Pentatomidae), commonly known as the Brown Marmorated Stink Bug (BMSB), is an invasive pest of the mid-Atlantic region of the United States, causing economically important damage to a wide range of crops. Native to Asia, BMSB was first observed in Allentown, PA, USA, in 1996, and this pest is now well-established throughout the US mid-Atlantic region and beyond. In addition to the serious threat BMSB poses to agriculture, BMSB has become a nuisance to homeowners, invading home gardens and congregating in large numbers in human-made structures, including homes, to overwinter. Despite its significance as an agricultural pest with limited control options, only 100 bp of BMSB sequence data was available in public databases when this project began. Transcriptome sequencing was undertaken to provide a molecular resource to the research community to inform the development of pest control strategies and to provide molecular data for population genetics studies of BMSB. Using normalized, strand-specific libraries, we sequenced pools of all BMSB life stages on the Illumina HiSeq. Trinity was used to assemble 200,000 putative transcripts in >100,000 components. A novel bioinformatic method that analyzed the strand-specificity of the data reduced this to 53,071 putative transcripts from 18,573 components. By integrating multiple other data types, we narrowed this further to 13,211 representative transcripts. Bacterial endosymbiont genes were identified in this dataset, some of which have a copy number consistent with being lateral gene transfers between endosymbiont genomes and Hemiptera, including ankyrin-repeat related proteins, lysozyme, and mannanase. Such genes and endosymbionts may provide novel targets for BMSB-specific biocontrol. 
This study demonstrates the utility of strand-specific sequencing in generating shotgun transcriptomes and that rapid sequencing of shotgun transcriptomes is possible without the need for extensive inbreeding to generate homozygous lines. Such sequencing can provide a rapid response to pest invasions similar to that already described for disease epidemiology. https://doi.org/10.1186/1471-2164-15-73

Crossref

Springer - Publisher Connector

PubMed Central

Digital Repository at the University of Maryland

The Aspergillus Genome Database, a curated comparative genomics resource for gene, protein and sequence information for the Aspergillus research community

Author: Adil Lotia
Arnaud
Ashburner
Aslett
Boyle
Bult
Consortium
Costanzo
Crabtree
Diane O. Inglis
Gail Binkley
Galagan
Gavin Sherlock
Hong
Jennifer R. Wortman
Jonathan Crabtree
Joshua Orvis
Mabey Gilsenan
Machida
Marcus C. Chibucos
Marek S. Skrzypek
Maria C. Costanzo
Martha B. Arnaud
Nierman
Prachi Shah
Remm
Rhee
Sprague
Stein
Stuart R. Miyasato
Tweedie
Twigger
Wapinski
Wortman
Publication venue: Oxford University Press
Publication date

The Aspergillus Genome Database (AspGD) is an online genomics resource for researchers studying the genetics and molecular biology of the Aspergilli. AspGD combines high-quality manual curation of the experimental scientific literature examining the genetics and molecular biology of Aspergilli, cutting-edge comparative genomics approaches to iteratively refine and improve structural gene annotations across multiple Aspergillus species, and web-based research tools for accessing and exploring the data. All of these data are freely available at http://www.aspgd.org. We welcome feedback from users and the research community at [email protected]

Crossref

PubMed Central

Horizontal gene transfer in Histophilus somni and its role in the evolution of pathogenic strain 2336, as determined by comparative genomic analyses

Author: A Ekins
A Harrison
AC Darling
AE Darling
AL Delcher
Alison J Duncan
Allison F Gillaspy
Anonymous
AV Karlyshev
AW Paton
B Clantin
B Zekarias
BJ May
BME Moret
C Baker-Austin
C Canchaya
C Feschotte
C Kehrenberg
CA Worby
CK Ward
Cliff S Han
CM Fraser
CN Cornelissen
CS Han
D Gordon
David Bruce
David W Dyer
DC Bay
E Hoiczyk
ER Tillier
ES Lander
F Jacob-Dubuisson
F St Michael
G Lima-Mendez
Gentry Barnes
H Brussow
H Hasman
H Hodak
I Sandal
J Chris Detter
J Hacker
J Van Donkersgoed
J Young
JA Jurcisek
Jean F Challacombe
Jenny Gipson
Jeremy Zaitshik
JF Challacombe
JH McQuiston
JL Ramos
JL Watts
Joshua Orvis
K Yahiro
KA Kline
L Corbeil
LB Corbeil
Linda S Thompson
LM Hansen
M Blanco
M Olson
M Ventura
Matthew Carson
MB Lawrenz
MG Langille
MH Saier
MS Smith
MW Gilmour
Olga Chertkov
P Siguier
PR Widders
RD Fleischmann
RG Gerlach
RJ Siezen
RK Aziz
Roxanne Tapia
RS Geertsema
RS Munson
S Siddaramappa
S Yamamoto
SF Elswaifi
Shivakumara Siddaramappa
SJ Hultgren
SL Chissoe
SP Cole
T Heise
TE Fuller
TF Smith
Thomas J Inzana
TJ Inzana
TM Weaver
U Dobrindt
Y Wu
Z Mohd-Zain
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Pneumonia and myocarditis are the most commonly reported diseases due to Histophilus somni, an opportunistic pathogen of the reproductive and respiratory tracts of cattle. Thus far only a few genes involved in metabolic and virulence functions have been identified and characterized in H. somni using traditional methods. Analyses of the genome sequences of several Pasteurellaceae species have provided insights into their biology and evolution. In view of the economic and ecological importance of H. somni, the genome sequence of pneumonia strain 2336 has been determined and compared to that of commensal strain 129Pt and other members of the Pasteurellaceae. Results: The chromosome of strain 2336 (2,263,857 bp) contained 1,980 protein coding genes, whereas the chromosome of strain 129Pt (2,007,700 bp) contained only 1,792 protein coding genes. Although the chromosomes of the two strains differ in size, their average GC content, gene density (total number of genes predicted on the chromosome), and percentage of sequence (number of genes) that encodes proteins were similar. The chromosomes of these strains also contained a number of discrete prophage regions and genomic islands. One of the genomic islands in strain 2336 contained genes putatively involved in copper, zinc, and tetracycline resistance. Using the genome sequence data and comparative analyses with other members of the Pasteurellaceae, several H. somni genes that may encode proteins involved in virulence (e.g., filamentous haemagglutinins, adhesins, and polysaccharide biosynthesis/modification enzymes) were identified. The two strains contained a total of 17 ORFs that encode putative glycosyltransferases and some of these ORFs had characteristic simple sequence repeats within them. Most of the genes/loci common to both the strains were located in different regions of the two chromosomes and occurred in opposite orientations, indicating genome rearrangement since their divergence from a common ancestor. Conclusions: Since the genome of strain 129Pt was ~256,000 bp smaller than that of strain 2336, these genomes provide yet another paradigm for studying evolutionary gene loss and/or gain in regard to virulence repertoire and pathogenic ability. Analyses of the complete genome sequences revealed that bacteriophage- and transposon-mediated horizontal gene transfer had occurred at several loci in the chromosomes of strains 2336 and 129Pt. It appears that these mobile genetic elements have played a major role in creating genomic diversity and phenotypic variability between the two H. somni strains.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

A framework for human microbiome research

Author: Aagaard Kjersti M.
Abolude Olukemi O.
Abubucker Sahar
Allen-Vercoe Emma
Alm Eric J.
Alvarado Lucia
Andersen Gary L.
Anderson Scott
Appelbaum Elizabeth
Arachchi Harindra M.
Armitage Gary
Arze Cesar A.
Ayvaz Tulin
Badger Jonathan H.
Baker Carl C.
Begg Lisa
Belachew Tsegahiwot
Bhonagiri Veena
Bihan Monika
Birren Bruce W.
Blaser Martin J.
Bloom Toby
Brooks Paul
Buck Gregory A.
Buhay Christian J.
Busam Dana A.
Campbell Joseph L.
Canon Shane R.
Cantarel Brandi L.
Chain Patrick S.
Chen I-Min A.
Chen Lei
Chhibba Shaila
Chinwalla Asif T.
Chu Ken
Ciulla Dawn M.
Clemente Jose C.
Clifton Sandra W.
Conlan Sean
Crabtree Jonathan
Creasy Heather H.
Cutting Mary A.
Davidovics Noam J.
Davis Catherine C.
Deal Carolyn
Delehaunty Kimberley D.
DeSantis Todd Z.
Dewhirst Floyd Everett
Deych Elena
Di Francesco Valentina
Ding Yan
Dooling David J.
Dugan Shannon P.
Dunne Wm. Michael
Durkin A. Scott
Earl Ashlee M.
Edgar Robert C.
Erlich Rachel L.
Farmer Candace N.
Farrell Ruth M.
Faust Karoline
Feldgarden Michael
Felix Victor M.
Fisher Sheila
FitzGerald Michael G.
Fodor Anthony A.
Forney Larry
Foster Leslie
Friedman Jonathan
Friedrich Dennis C.
Fronick Catrina C.
Fulton Lucinda L.
Fulton Robert S.
Gao Hongyu
Garcia Nathalia
Gevers Dirk
Giannoukos Georgia
Gibbs Richard A.
Giblin Christina
Giglio Michelle G.
Giovanni Maria Y.
Goldberg Jonathan M.
Goll Johannes
Gonzalez Antonio
Griggs Allison
Gujja Sharvari
Haas Brian J.
Hallsworth-Pepin Kymberlie
Hamilton Holli A.
Harris Emily L.
Hepburn Theresa A.
Herter Brandi
Highlander Sarah K.
Hoffmann Diane E.
Holder Michael E.
Howarth Clinton
Huang Katherine H.
Huse Susan M.
Huttenhower Curtis
Izard Jacques Georges
Jansson Janet K.
Jiang Huaiyang
Jordan Catherine
Joshi Vandita
Katancik James A.
Keitel Wendy A.
Kelley Scott T.
Kells Cristyn
Kinder-Haake Susan
King Nicholas B.
Knight Rob
Knights Dan
Kong Heidi H.
Koren Omry
Koren Sergey
Kota Karthik C.
Kovar Christie L.
Kyrpides Nikos C.
La Rosa Patricio S.
Lee Sandra L.
Lemon Katherine Paige
Lennon Niall
Lewis Cecil M.
Lewis Lora
Ley Ruth E.
Li Kelvin
Liolios Konstantinos
Liu Bo
Liu Yue
Lo Chien-Chi
Lobos Elizabeth A.
Lozupone Catherine A.
Lunsford R. Dwayne
Madden Tessa
Madupu Ramana
Magrini Vincent
Mahurkar Anup A.
Mannon Peter J.
Mardis Elaine R.
Markowitz Victor M.
Martin John C.
Mavrommatis Konstantinos
McCorrison Jamison M.
McDonald Daniel
McEwen Jean
McGuire Amy L.
McInnes Pamela
Mehta Teena
Methé Barbara A.
Mihindukulasuriya Kathie A.
Miller Jason R.
Minx Patrick J.
Mitreva Makedonka
Muzny Donna M.
Nelson Karen E.
Newsham Irene
Nusbaum Chad
Orvis Joshua
O’Laughlin Michelle
Pagani Ioanna
Palaniappan Krishna
Pamela Sankar J.
Patel Shital M.
Pearson Matthew
Peterson Jane
Petrosino Joseph F.
Podar Mircea
Pohl Craig
Pollard Katherine S.
Pop Mihai
Priest Margaret E.
Proctor Lita M.
Qin Xiang
Raes Jeroen
Ravel Jacques
Reid Jeffrey G.
Rho Mina
Rhodes Rosamond
Riehle Kevin P.
Rivera Maria C.
Rodriguez-Mueller Beltran
Rogers Yu-Hui
Ross Matthew C.
Russ Carsten
Sanka Ravi K.
Sathirapongsasuti Fah
Schloss Jeffery A.
Schloss Patrick D.
Schmidt Thomas M.
Scholz Matthew
Schriml Lynn
Schubert Alyxandria M.
Segata Nicola
Segre Julia A.
Shannon William D.
Sharp Richard R.
Sharpton Thomas J.
Shenoy Narmada
Sheth Nihar U.
Simone Gina A.
Singh Indresh
Smillie Chris Scott
Sobel Jack D.
Sodergren Erica J.
Sommer Daniel D.
Spicer Paul
Sutton Granger G.
Sykes Sean M.
Tabbaa Diana G.
Thiagarajan Mathangi
Tomlinson Chad M.
Torralba Manolito
Treangen Todd J.
Truty Rebecca M.
Versalovic James
Vishnivetskaya Tatiana A.
Vivien Bonazzi J.
Walker Jason
Wang Lu
Wang Zhengyuan
Ward Doyle V.
Warren Wesley
Watson Mark A.
Weinstock George M.
Wellington Christopher
Wetterstrand Kris A.
White James R.
White Owen
Wilczek-Boney Katarzyna
Wilson Richard K.
Wollam Aye M.
Worley Kim C.
Wortman Jennifer R.
Wu Yuan Qing
Wylie Kristine M.
Wylie Todd
Yandava Chandri
Ye Liang
Ye Yuzhen
Yooseph Shibu
Youmans Bonnie P.
Young Sarah K.
Zeng Qiandong
Zhang Lan
Zhou Yanjiao
Zhu Yiming
Zoloth Laurie
Zucker Jeremy Daniel Hofeld
Publication venue: Springer Science and Business Media LLC
Publication date: 01/11/2011

A variety of microbial communities and their genes (the microbiome) exist throughout the human body, with fundamental roles in human health and disease. The National Institutes of Health (NIH)-funded Human Microbiome Project Consortium has established a population-scale framework to develop metagenomic protocols, resulting in a broad range of quality-controlled resources and data including standardized methods for creating, processing and interpreting distinct types of high-throughput metagenomic data available to the scientific community. Here we present resources from a population of 242 healthy adults sampled at 15 or 18 body sites up to three times, which have generated 5,177 microbial taxonomic profiles from 16S ribosomal RNA genes and over 3.5 terabases of metagenomic sequence so far. In parallel, approximately 800 reference strains isolated from the human body have been sequenced. Collectively, these data represent the largest resource describing the abundance and variety of the human microbiome, while providing a framework for current and future studies

CiteSeerX

DSpace@MIT

Crossref

Harvard University - DASH