24 research outputs found
Complete reannotation of the Arabidopsis genome: methods, tools, protocols and the final release
BACKGROUND: Since the initial publication of its complete genome sequence, Arabidopsis thaliana has become more important than ever as a model for plant research. However, the initial genome annotation was submitted by multiple centers using inconsistent methods, making the data difficult to use for many applications. RESULTS: Over the course of three years, TIGR has completed its effort to standardize the structural and functional annotation of the Arabidopsis genome. Using both manual and automated methods, Arabidopsis gene structures were refined and gene products were renamed and assigned to Gene Ontology categories. We present an overview of the methods employed, tools developed, and protocols followed, summarizing the contents of each data release with special emphasis on our final annotation release (version 5). CONCLUSION: Over the entire period, several thousand new genes and pseudogenes were added to the annotation. Approximately one third of the originally annotated gene models were significantly refined yielding improved gene structure annotations, and every protein-coding gene was manually inspected and classified using Gene Ontology terms
Sequencing of Culex quinquefasciatus establishes a platform for mosquito comparative genomics
Culex quinquefasciatus (the southern house mosquito) is an important mosquito vector of viruses such as West Nile virus and St. Louis encephalitis virus, as well as of nematodes that cause lymphatic filariasis. C. quinquefasciatus is one species within the Culex pipiens species complex and can be found throughout tropical and temperate climates of the world. The ability of C. quinquefasciatus to take blood meals from birds, livestock, and humans contributes to its ability to vector pathogens between species. Here, we describe the genomic sequence of C. quinquefasciatus: Its repertoire of 18,883 protein-coding genes is 22% larger than that of Aedes aegypti and 52% larger than that of Anopheles gambiae with multiple gene-family expansions, including olfactory and gustatory receptors, salivary gland genes, and genes associated with xenobiotic detoxification
Genomic Insights Into The Ixodes scapularis Tick Vector Of Lyme Disease
Ticks transmit more pathogens to humans and animals than any other arthropod. We describe the 2.1 Gbp nuclear genome of the tick, Ixodes scapularis (Say), which vectors pathogens that cause Lyme disease, human granulocytic anaplasmosis, babesiosis and other diseases. The large genome reflects accumulation of repetitive DNA, new lineages of retrotransposons, and gene architecture patterns resembling ancient metazoans rather than pancrustaceans. Annotation of scaffolds representing ∼57% of the genome reveals 20,486 protein-coding genes and expansions of gene families associated with tick–host interactions. We report insights from genome analyses into parasitic processes unique to ticks, including host ‘questing’, prolonged feeding, cuticle synthesis, blood meal concentration, novel methods of haemoglobin digestion, haem detoxification, vitellogenesis and prolonged off-host survival. We identify proteins associated with the agent of human granulocytic anaplasmosis, an emerging disease, and the encephalitis-causing Langat virus, and a population structure correlated to life-history traits and transmission of the Lyme disease agent
The Gene Ontology in 2010: extensions and refinements
The Gene Ontology (GO) Consortium (http://www.geneontology.org) (GOC) continues to develop,
maintain and use a set of structured, controlled
vocabularies for the annotation of genes, gene
products and sequences. The GO ontologies
are expanding both in content and in structure.
Several new relationship types have been introduced
and used, along with existing relationships,
to create links between and within the GO domains.
These improve the representation of biology,
facilitate querying, and allow GO developers to systematically
check for and correct inconsistencies
within the GO. Gene product annotation using GO
continues to increase both in the number of total
annotations and in species coverage. GO tools,
such as OBO-Edit, an ontology-editing tool, and
AmiGO, the GOC ontology browser, have seen
major improvements in functionality, speed and
ease of use. This is the publisher’s final PDF. The published article is copyrighted by the author(s) and published by Oxford University Press. The published article can be found at: http://nar.oxfordjournals.org/
Complex patterns of genomic admixture within Southern Africa
The original publication is available at www.plosgenetics.org. Within-population genetic diversity is greatest within Africa, while between-population genetic diversity is directly
proportional to geographic distance. The most divergent contemporary human populations include the click-speaking
forager peoples of southern Africa, broadly defined as Khoesan. Both intra- (Bantu expansion) and inter-continental
migration (European-driven colonization) have resulted in complex patterns of admixture between ancient geographically
isolated Khoesan and more recently diverged populations. Using gender-specific analysis and almost 1 million autosomal
markers, we determine the significance of estimated ancestral contributions that have shaped five contemporary southern
African populations in a cohort of 103 individuals. Limited by lack of available data for homogenous Khoesan
representation, we identify the Ju/’hoan (n = 19) as a distinct early diverging human lineage with little to no significant
non-Khoesan contribution. In contrast to the Ju/’hoan, we identify ancient signatures of Khoesan and Bantu unions resulting in
significant Khoesan- and Bantu-derived contributions to the Southern Bantu amaXhosa (n = 15) and Khoesan !Xun (n = 14),
respectively. Our data further suggests that contemporary !Xun represent distinct Khoesan prehistories. Khoesan
assimilation with European settlement at the most southern tip of Africa resulted in significant ancestral Khoesan
contributions to the Coloured (n = 25) and Baster (n = 30) populations. The latter populations were further impacted by 170
years of East Indian slave trade and intra-continental migrations resulting in a complex pattern of genetic variation
(admixture). The populations of southern Africa provide a unique opportunity to investigate the genomic variability from
some of the oldest human lineages to the implications of complex admixture patterns including ancient and recently
diverged human lineages. Funded by the J. Craig Venter Family Foundation, La Jolla, CA, USA, and Illumina, San Diego, CA, USA, to VMH. EAT is supported by a National Health and Medical Research Council (NHMRC) Australia Fellowship, and OL and NJS are supported in part by NIH/NCATS grant number UL1 RR025774. Publisher's versio
Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies
The spliced alignment of expressed sequence data to genomic sequence has proven a key tool in the comprehensive annotation of genes in eukaryotic genomes. A novel algorithm was developed to assemble clusters of overlapping transcript alignments (ESTs and full-length cDNAs) into maximal alignment assemblies, thereby comprehensively incorporating all available transcript data and capturing subtle splicing variations. Complete and partial gene structures identified by this method were used to improve The Institute for Genomic Research Arabidopsis genome annotation (TIGR release v.4.0). The alignment assemblies permitted the automated modeling of several novel genes and >1000 alternative splicing variations as well as updates (including UTR annotations) to nearly half of the ∼27 000 annotated protein coding genes. The algorithm of the Program to Assemble Spliced Alignments (PASA) tool is described, as well as the results of automated updates to Arabidopsis gene annotations
Ju/'hoan-Yoruba ancestry informative markers (AIMs) defined ancestral contributions to the !Xun and amaXhosa, providing evidence for two distinct !Xun lineages with differing ancestral contributions.
(A) STRUCTURE analysis for 2,687 Ju/'hoan-Yoruba AIMs identifies a third ‘unknown’ population cluster when assuming three ancestral populations. (B) Ancestral contributions to the !Xun show a diverse contribution of a Ju/'hoan and unknown likely Khoesan ancestral fraction and a constant Bantu-derived fraction. (C) Ancestral contributions to the amaXhosa demonstrate more even contributions. (D) Based on ancestral fractions the !Xun are further classified as Ju/'hoan ancestral (n = 6) and (E) unknown ancestral, suggesting two unique !Xun lineages, each with significant Bantu ancestral contributions. The single admixed !Xun (NF2 ⇓) and the three Angolan !Xun (V) are indicated.
Relatedness and demographic history of the Ju/'hoan to global populations defines early divergence and genomic impact of forager existence.
(A) Circular Neighbor Joining phylogenetic tree for 24,402 LD-pruned autosomal markers after merging our data with global population data for a total of 521 samples from 14 populations, with the Pan genome as the outgroup. We confirm early divergence of the Ju/'hoan and report independent branching of the Angolan !Xun. (B) Plot of total length of ROH against number of ROH (>500 kb) for each study sample against European and Yoruba using 716,734 markers (367 samples). Our foraging groups show smaller overall ROH lengths than the Europeans, yet longer than the Yoruba, suggesting small effective population sizes of a likely ancient population with minimal to no impact from a dramatic bottleneck.