Complete reannotation of the Arabidopsis genome: methods, tools, protocols and the final release
BACKGROUND: Since the initial publication of its complete genome sequence, Arabidopsis thaliana has become more important than ever as a model for plant research. However, the initial genome annotation was submitted by multiple centers using inconsistent methods, making the data difficult to use for many applications. RESULTS: Over the course of three years, TIGR has completed its effort to standardize the structural and functional annotation of the Arabidopsis genome. Using both manual and automated methods, Arabidopsis gene structures were refined and gene products were renamed and assigned to Gene Ontology categories. We present an overview of the methods employed, tools developed, and protocols followed, summarizing the contents of each data release with special emphasis on our final annotation release (version 5). CONCLUSION: Over the entire period, several thousand new genes and pseudogenes were added to the annotation. Approximately one third of the originally annotated gene models were significantly refined yielding improved gene structure annotations, and every protein-coding gene was manually inspected and classified using Gene Ontology terms
Comparative genomics of the pathogenic ciliate Ichthyophthirius multifiliis, its free-living relatives and a host species provide insights into adoption of a parasitic lifestyle and prospects for disease control
BACKGROUND: Ichthyophthirius multifiliis, commonly known as Ich, is a highly pathogenic ciliate responsible for 'white spot', a disease causing significant economic losses to the global aquaculture industry. Options for disease control are extremely limited, and Ich's obligate parasitic lifestyle makes experimental studies challenging. Unlike most well-studied protozoan parasites, Ich belongs to a phylum composed primarily of free-living members. Indeed, it is closely related to the model organism Tetrahymena thermophila. Genomic studies represent a promising strategy to reduce the impact of this disease and to understand the evolutionary transition to parasitism.
RESULTS: We report the sequencing, assembly and annotation of the Ich macronuclear genome. Compared with its free-living relative T. thermophila, the Ich genome is reduced approximately two-fold in length and gene density and three-fold in gene content. We analyzed in detail several gene classes with diverse functions in behavior, cellular function and host immunogenicity, including protein kinases, membrane transporters, proteases, surface antigens and cytoskeletal components and regulators. We also mapped by orthology Ich's metabolic pathways in comparison with other ciliates and a potential host organism, the zebrafish Danio rerio.
CONCLUSIONS: Knowledge of the complete protein-coding and metabolic potential of Ich opens avenues for rational testing of therapeutic drugs that target functions essential to this parasite but not to its fish hosts. Also, a catalog of surface protein-encoding genes will facilitate development of more effective vaccines. The potential to use T. thermophila as a surrogate model offers promise toward controlling 'white spot' disease and understanding the adaptation to a parasitic lifestyle
Sequencing of Culex quinquefasciatus establishes a platform for mosquito comparative genomics
Culex quinquefasciatus (the southern house mosquito) is an important mosquito vector of viruses such as West Nile virus and St. Louis encephalitis virus, as well as of nematodes that cause lymphatic filariasis. C. quinquefasciatus is one species within the Culex pipiens species complex and can be found throughout tropical and temperate climates of the world. The ability of C. quinquefasciatus to take blood meals from birds, livestock, and humans contributes to its ability to vector pathogens between species. Here, we describe the genomic sequence of C. quinquefasciatus: Its repertoire of 18,883 protein-coding genes is 22% larger than that of Aedes aegypti and 52% larger than that of Anopheles gambiae with multiple gene-family expansions, including olfactory and gustatory receptors, salivary gland genes, and genes associated with xenobiotic detoxification
Genomic Insights Into The Ixodes scapularis Tick Vector Of Lyme Disease
Ticks transmit more pathogens to humans and animals than any other arthropod. We describe the 2.1 Gbp nuclear genome of the tick, Ixodes scapularis (Say), which vectors pathogens that cause Lyme disease, human granulocytic anaplasmosis, babesiosis and other diseases. The large genome reflects accumulation of repetitive DNA, new lineages of retrotransposons, and gene architecture patterns resembling ancient metazoans rather than pancrustaceans. Annotation of scaffolds representing ∼57% of the genome reveals 20,486 protein-coding genes and expansions of gene families associated with tick–host interactions. We report insights from genome analyses into parasitic processes unique to ticks, including host ‘questing’, prolonged feeding, cuticle synthesis, blood meal concentration, novel methods of haemoglobin digestion, haem detoxification, vitellogenesis and prolonged off-host survival. We identify proteins associated with the agent of human granulocytic anaplasmosis, an emerging disease, and the encephalitis-causing Langat virus, and a population structure correlated to life-history traits and transmission of the Lyme disease agent.
The Gene Ontology in 2010: extensions and refinements
The Gene Ontology (GO) Consortium (http://www.geneontology.org) (GOC) continues to develop, maintain and use a set of structured, controlled vocabularies for the annotation of genes, gene products and sequences. The GO ontologies are expanding both in content and in structure. Several new relationship types have been introduced and used, along with existing relationships, to create links between and within the GO domains. These improve the representation of biology, facilitate querying, and allow GO developers to systematically check for and correct inconsistencies within the GO. Gene product annotation using GO continues to increase both in the number of total annotations and in species coverage. GO tools, such as OBO-Edit, an ontology-editing tool, and AmiGO, the GOC ontology browser, have seen major improvements in functionality, speed and ease of use.
This is the publisher’s final pdf. The published article is copyrighted by the author(s) and published by Oxford University Press. The published article can be found at: http://nar.oxfordjournals.org/
Transcriptomic Study Reveals Widespread Spliced Leader <i>Trans</i>-Splicing, Short 5′-UTRs and Potential Complex Carbon Fixation Mechanisms in the Euglenoid Alga <i>Eutreptiella</i> sp.
<div><p><i>Eutreptiella</i> are an evolutionarily unique and ecologically important genus of microalgae, but they are poorly understood with regard to their genomic make-up and expression profiles. Through the analysis of the full-length cDNAs from a <i>Eutreptiella</i> species, we found a conserved 28-nt spliced leader sequence (Eut-SL, ACACUUUCUGAGUGUCUAUUUUUUUUCG) was <i>trans</i>-spliced to the mRNAs of <i>Eutreptiella</i> sp. Using a primer derived from Eut-SL, we constructed four cDNA libraries under contrasting physiological conditions for 454 pyrosequencing. Clustering analysis of the ∼1.9×10<sup>6</sup> original reads (average length 382 bp) yielded 36,643 unique transcripts. Although only 28% of the transcripts matched documented genes, this fraction represents a functionally very diverse gene set, suggesting that SL <i>trans</i>-splicing is likely ubiquitous in this alga’s transcriptome. The mRNAs of <i>Eutreptiella</i> sp. seemed to have short 5′-untranslated regions, estimated to be 21 nucleotides on average. Among the diverse biochemical pathways represented in the transcriptome we obtained, carbonic anhydrase and genes known to function in the C<sub>4</sub> pathway and heterotrophic carbon fixation were found, posing a question whether <i>Eutreptiella</i> sp. employs multifaceted strategies to acquire and fix carbon efficiently. This first large-scale transcriptomic dataset for a euglenoid uncovers many potential novel genes and overall offers a valuable genetic resource for research on euglenoid algae.</p></div>
Candidate genes involved in carbon fixation in <i>Eutreptiella</i> sp. compared with other potential C<sub>4</sub> algae <i>Thalassiosira pseudonana</i> (<i>T.ps</i>) and <i>Ostreococcus tauri</i> (<i>O.ta</i>).
<p>The numbers for <i>Eutreptiella</i> represent the number of unique sequences found in our dataset. The numbers of genes in other organisms are based on genome annotations. Sequences and annotation results are listed in <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0060826#pone.0060826.s020" target="_blank">Table S15</a>.</p><p>– Genes were not found in our dataset. This might be due to insufficient sequencing depth.</p>a<p>Genes potentially involved in C<sub>4</sub> carbon fixation.</p>b<p>Obtained from cloning instead of transcriptome.</p>c<p>Data adapted from Derelle et al.<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0060826#pone.0060826-Derelle1" target="_blank">[63]</a>.</p>
Agarose gel electrophoresis of <i>Eutreptiella</i> sp. spliced leader (Eut-SL)-based cDNA libraries.
<p>Native SL, cDNA amplified with native Eut-SL pairing with 454AT7 primer; Modified SL, cDNA amplified with modified Eut-SL pairing with 454AT7 primer. −PL: the phosphate-depleted-light sample; +PL: the phosphate-replete-light sample; −PD: the phosphate-depleted-dark sample; +PD: the phosphate-replete-dark sample. (−) negative controls without cDNA template; S-, negative controls of PCR with SL single primer.</p>
Summary of <i>Eutreptiella</i> sp. transcriptome 454 sequencing and data analysis.
<p>Transcript sequences were compared to GenBank non-redundant (nr) database using the BLASTx algorithm <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0060826#pone.0060826-Altschul1" target="_blank">[27]</a>, with a cut-off E-value ≤10<sup>−3</sup>.</p>
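The E-value cut-off described in this caption can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it assumes the BLASTx results were saved in the standard 12-column tabular format (E-value in the 11th column), and the function names `filter_hits` and `annotated_queries` are hypothetical, introduced only for illustration.

```python
import csv
import io

# Cut-off taken from the caption (E-value <= 10^-3).
EVALUE_CUTOFF = 1e-3


def filter_hits(handle, cutoff=EVALUE_CUTOFF):
    """Yield (query_id, subject_id, evalue) for hits at or below the cut-off.

    Assumes standard 12-column tabular BLAST output, where column 11
    (index 10) holds the E-value. This layout is an assumption, not
    something stated in the source.
    """
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and comment lines.
        if not row or row[0].startswith("#"):
            continue
        evalue = float(row[10])
        if evalue <= cutoff:
            yield row[0], row[1], evalue


def annotated_queries(handle, cutoff=EVALUE_CUTOFF):
    """Return the set of query IDs with at least one hit passing the cut-off."""
    return {query for query, _, _ in filter_hits(handle, cutoff)}
```

Dividing the size of the set returned by `annotated_queries` by the total number of transcripts would give the fraction of transcripts with a database match under this threshold.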