24 research outputs found

    Complete reannotation of the Arabidopsis genome: methods, tools, protocols and the final release

    BACKGROUND: Since the initial publication of its complete genome sequence, Arabidopsis thaliana has become more important than ever as a model for plant research. However, the initial genome annotation was submitted by multiple centers using inconsistent methods, making the data difficult to use for many applications. RESULTS: Over the course of three years, TIGR has completed its effort to standardize the structural and functional annotation of the Arabidopsis genome. Using both manual and automated methods, Arabidopsis gene structures were refined and gene products were renamed and assigned to Gene Ontology categories. We present an overview of the methods employed, tools developed, and protocols followed, summarizing the contents of each data release with special emphasis on our final annotation release (version 5). CONCLUSION: Over the entire period, several thousand new genes and pseudogenes were added to the annotation. Approximately one third of the originally annotated gene models were significantly refined, yielding improved gene structure annotations, and every protein-coding gene was manually inspected and classified using Gene Ontology terms.

    Sequencing of Culex quinquefasciatus establishes a platform for mosquito comparative genomics

    Culex quinquefasciatus (the southern house mosquito) is an important mosquito vector of viruses such as West Nile virus and St. Louis encephalitis virus, as well as of nematodes that cause lymphatic filariasis. C. quinquefasciatus is one species within the Culex pipiens species complex and can be found throughout tropical and temperate climates of the world. The ability of C. quinquefasciatus to take blood meals from birds, livestock, and humans contributes to its ability to vector pathogens between species. Here, we describe the genomic sequence of C. quinquefasciatus: its repertoire of 18,883 protein-coding genes is 22% larger than that of Aedes aegypti and 52% larger than that of Anopheles gambiae, with multiple gene-family expansions, including olfactory and gustatory receptors, salivary gland genes, and genes associated with xenobiotic detoxification.

    Genomic Insights Into The Ixodes scapularis Tick Vector Of Lyme Disease

    Ticks transmit more pathogens to humans and animals than any other arthropod. We describe the 2.1 Gbp nuclear genome of the tick, Ixodes scapularis (Say), which vectors pathogens that cause Lyme disease, human granulocytic anaplasmosis, babesiosis and other diseases. The large genome reflects accumulation of repetitive DNA, new lineages of retrotransposons, and gene architecture patterns resembling ancient metazoans rather than pancrustaceans. Annotation of scaffolds representing ∼57% of the genome reveals 20,486 protein-coding genes and expansions of gene families associated with tick–host interactions. We report insights from genome analyses into parasitic processes unique to ticks, including host ‘questing’, prolonged feeding, cuticle synthesis, blood meal concentration, novel methods of haemoglobin digestion, haem detoxification, vitellogenesis and prolonged off-host survival. We identify proteins associated with the agent of human granulocytic anaplasmosis, an emerging disease, and the encephalitis-causing Langat virus, and a population structure correlated to life-history traits and transmission of the Lyme disease agent.


    Complex patterns of genomic admixture within Southern Africa

    The original publication is available at www.plosgenetics.org. Within-population genetic diversity is greatest within Africa, while between-population genetic diversity is directly proportional to geographic distance. The most divergent contemporary human populations include the click-speaking forager peoples of southern Africa, broadly defined as Khoesan. Both intra- (Bantu expansion) and inter-continental migration (European-driven colonization) have resulted in complex patterns of admixture between ancient geographically isolated Khoesan and more recently diverged populations. Using gender-specific analysis and almost 1 million autosomal markers, we determine the significance of estimated ancestral contributions that have shaped five contemporary southern African populations in a cohort of 103 individuals. Limited by lack of available data for homogenous Khoesan representation, we identify the Ju/’hoan (n = 19) as a distinct early diverging human lineage with little to no significant non-Khoesan contribution. In contrast to the Ju/’hoan, we identify ancient signatures of Khoesan and Bantu unions resulting in significant Khoesan- and Bantu-derived contributions to the Southern Bantu amaXhosa (n = 15) and Khoesan !Xun (n = 14), respectively. Our data further suggest that contemporary !Xun represent distinct Khoesan prehistories. Khoesan assimilation with European settlement at the most southern tip of Africa resulted in significant ancestral Khoesan contributions to the Coloured (n = 25) and Baster (n = 30) populations. The latter populations were further impacted by 170 years of East Indian slave trade and intra-continental migrations resulting in a complex pattern of genetic variation (admixture). The populations of southern Africa provide a unique opportunity to investigate the genomic variability from some of the oldest human lineages to the implications of complex admixture patterns including ancient and recently diverged human lineages. Funded by the J. Craig Venter Family Foundation, La Jolla, CA, USA, and Illumina, San Diego, CA, USA, to VMH. EAT is supported by a National Health and Medical Research Council (NHMRC) Australia Fellowship, and OL and NJS are supported in part by NIH/NCATS grant number UL1 RR025774. Publisher's version.

    Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies

    The spliced alignment of expressed sequence data to genomic sequence has proven a key tool in the comprehensive annotation of genes in eukaryotic genomes. A novel algorithm was developed to assemble clusters of overlapping transcript alignments (ESTs and full-length cDNAs) into maximal alignment assemblies, thereby comprehensively incorporating all available transcript data and capturing subtle splicing variations. Complete and partial gene structures identified by this method were used to improve The Institute for Genomic Research Arabidopsis genome annotation (TIGR release v.4.0). The alignment assemblies permitted the automated modeling of several novel genes and >1000 alternative splicing variations as well as updates (including UTR annotations) to nearly half of the ∼27 000 annotated protein coding genes. The algorithm of the Program to Assemble Spliced Alignments (PASA) tool is described, as well as the results of automated updates to Arabidopsis gene annotations.
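The first step the abstract mentions, grouping overlapping transcript alignments into clusters, can be illustrated with a minimal sketch. This is not the PASA algorithm: it only groups alignments whose genomic spans overlap and does not check splice-site compatibility or build maximal assemblies. The `Alignment` tuple layout, the function name `cluster_overlapping`, and the toy coordinates are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical record: (transcript id, genomic start, genomic end), inclusive coordinates.
Alignment = Tuple[str, int, int]


def cluster_overlapping(alignments: List[Alignment]) -> List[List[Alignment]]:
    """Group alignments whose genomic spans overlap (single strand, single contig)."""
    clusters: List[List[Alignment]] = []
    current_end = None  # right-most coordinate covered by the open cluster
    for aln in sorted(alignments, key=lambda a: (a[1], a[2])):
        _, start, end = aln
        if current_end is not None and start <= current_end:
            # Overlaps the open cluster: extend it.
            clusters[-1].append(aln)
            current_end = max(current_end, end)
        else:
            # No overlap: start a new cluster.
            clusters.append([aln])
            current_end = end
    return clusters


# Toy input (invented coordinates, not taken from the paper).
toy = [("est1", 100, 500), ("cdna1", 400, 900), ("est2", 1200, 1500)]
clusters = cluster_overlapping(toy)
```

In this toy input, `est1` and `cdna1` share the span 400-500 and fall into one cluster, while `est2` forms its own. A real assembler would then work within each cluster to merge only alignments with compatible intron structures.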

    Ju/'hoan-Yoruba ancestry informative markers (AIMs) defined ancestral contributions to the !Xun and amaXhosa, providing evidence for two distinct !Xun lineages with differing ancestral contributions.

    (A) STRUCTURE analysis for 2,687 Ju/'hoan-Yoruba AIMs identifies a third ‘unknown’ population cluster when assuming three ancestral populations. (B) Ancestral contributions to the !Xun show a diverse contribution of a Ju/'hoan and unknown likely Khoesan ancestral fraction and a constant Bantu-derived fraction. (C) Ancestral contributions to the amaXhosa demonstrate more even contributions. (D) Based on ancestral fractions the !Xun are further classified as Ju/'hoan ancestral (n = 6) and (E) unknown ancestral, suggesting two unique !Xun lineages, each with significant Bantu ancestral contributions. The single admixed !Xun (NF2 ⇓) and the three Angolan !Xun (V) are indicated.

    Relatedness and demographic history of the Ju/'hoan to global populations defines early divergence and genomic impact of forager existence.

    (A) Circular Neighbor Joining phylogenetic tree for 24,402 LD-pruned autosomal markers after merging our data with global population data for a total of 521 samples from 14 populations, with the Pan genome as the outgroup. We confirm early divergence of the Ju/'hoan and report independent branching of the Angolan !Xun. (B) Plot of total length of ROH against number of ROH (>500 kb) for each study sample against European and Yoruba using 716,734 markers (367 samples). Our foraging groups show smaller overall ROH lengths than the Europeans, yet longer than the Yoruba, suggesting small effective population sizes of a likely ancient population with minimal to no impact from a dramatic bottleneck.
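The two quantities plotted in panel (B), the number of ROH above 500 kb and their total length per sample, can be derived from a list of ROH segment lengths. The sketch below assumes the segments have already been called by some upstream tool; the function name `summarize_roh`, the input layout, and the sample values are hypothetical and are not the study's data or pipeline.

```python
from typing import Dict, List, Tuple

MIN_ROH_BP = 500_000  # 500 kb threshold, as stated in the caption


def summarize_roh(
    segments_bp: Dict[str, List[int]], min_bp: int = MIN_ROH_BP
) -> Dict[str, Tuple[int, int]]:
    """Return {sample: (number of ROH > min_bp, total length in bp of those ROH)}."""
    summary: Dict[str, Tuple[int, int]] = {}
    for sample, lengths in segments_bp.items():
        kept = [n for n in lengths if n > min_bp]  # drop segments at or below the threshold
        summary[sample] = (len(kept), sum(kept))
    return summary


# Toy input (invented segment lengths in base pairs).
toy_segments = {
    "sampleA": [600_000, 450_000, 1_200_000],
    "sampleB": [300_000],
}
roh_summary = summarize_roh(toy_segments)
```

Each resulting pair gives one point on a plot like panel (B): the count on one axis and the summed length on the other.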