25 research outputs found

    A Plasmodium Whole-Genome Synteny Map: Indels and Synteny Breakpoints as Foci for Species-Specific Genes

    Whole-genome comparisons are highly informative regarding genome evolution and can reveal the conservation of genome organization and gene content, gene regulatory elements, and presence of species-specific genes. Initial comparative genome analyses of the human malaria parasite Plasmodium falciparum and rodent malaria parasites (RMPs) revealed a core set of 4,500 Plasmodium orthologs located in the highly syntenic central regions of the chromosomes that sharply defined the boundaries of the variable subtelomeric regions. We used composite RMP contigs, based on partial DNA sequences of three RMPs, to generate a whole-genome synteny map of P. falciparum and the RMPs. The core regions of the 14 chromosomes of P. falciparum and the RMPs are organized in 36 synteny blocks, representing groups of genes that have been stably inherited since these malaria species diverged, but whose relative organization has altered as a result of a predicted minimum of 15 recombination events. P. falciparum-specific genes and gene families are found in the variable subtelomeric regions (575 genes), at synteny breakpoints (42 genes), and as intrasyntenic indels (126 genes). Of the 168 non-subtelomeric P. falciparum genes, including two newly discovered gene families, 68% are predicted to be exported to the surface of the blood stage parasite or infected erythrocyte. Chromosomal rearrangements are implicated in the generation and dispersal of P. falciparum-specific gene families, including one encoding receptor-associated protein kinases. The data show that both synteny breakpoints and intrasyntenic indels can be foci for species-specific genes with a predicted role in host-parasite interactions and suggest that, besides rearrangements in the subtelomeric regions, chromosomal rearrangements may also be involved in the generation of species-specific gene families. 
A majority of these genes are expressed in blood stages, suggesting that the vertebrate host exerts a greater selective pressure than the mosquito vector, resulting in the acquisition of diversity.

    Structure of the germline genome of Tetrahymena thermophila and relationship to the massively rearranged somatic genome

    The germline genome of the binucleated ciliate Tetrahymena thermophila undergoes programmed chromosome breakage and massive DNA elimination to generate the somatic genome. Here, we present a complete sequence assembly of the germline genome and analyze multiple features of its structure and its relationship to the somatic genome, shedding light on the mechanisms of genome rearrangement as well as the evolutionary history of this remarkable germline/soma differentiation. Our results strengthen the notion that a complex, dynamic, and ongoing interplay between mobile DNA elements and the host genome has shaped Tetrahymena chromosome structure, locally and globally. Non-standard outcomes of rearrangement events, including the generation of short-lived somatic chromosomes and excision of DNA interrupting protein-coding regions, may represent novel forms of developmental gene regulation. We also compare Tetrahymena's germline/soma differentiation to that of other characterized ciliates, illustrating the wide diversity of adaptations that have occurred within this phylum.

    Sequencing of Culex quinquefasciatus establishes a platform for mosquito comparative genomics

    Culex quinquefasciatus (the southern house mosquito) is an important mosquito vector of viruses such as West Nile virus and St. Louis encephalitis virus, as well as of nematodes that cause lymphatic filariasis. C. quinquefasciatus is one species within the Culex pipiens species complex and can be found throughout tropical and temperate climates of the world. The ability of C. quinquefasciatus to take blood meals from birds, livestock, and humans contributes to its ability to vector pathogens between species. Here, we describe the genomic sequence of C. quinquefasciatus: its repertoire of 18,883 protein-coding genes is 22% larger than that of Aedes aegypti and 52% larger than that of Anopheles gambiae, with multiple gene-family expansions, including olfactory and gustatory receptors, salivary gland genes, and genes associated with xenobiotic detoxification.

    Genomic Insights Into The Ixodes scapularis Tick Vector Of Lyme Disease

    Ticks transmit more pathogens to humans and animals than any other arthropod. We describe the 2.1 Gbp nuclear genome of the tick, Ixodes scapularis (Say), which vectors pathogens that cause Lyme disease, human granulocytic anaplasmosis, babesiosis and other diseases. The large genome reflects accumulation of repetitive DNA, new lineages of retrotransposons, and gene architecture patterns resembling ancient metazoans rather than pancrustaceans. Annotation of scaffolds representing ~57% of the genome reveals 20,486 protein-coding genes and expansions of gene families associated with tick–host interactions. We report insights from genome analyses into parasitic processes unique to ticks, including host ‘questing’, prolonged feeding, cuticle synthesis, blood meal concentration, novel methods of haemoglobin digestion, haem detoxification, vitellogenesis and prolonged off-host survival. We identify proteins associated with the agent of human granulocytic anaplasmosis, an emerging disease, and the encephalitis-causing Langat virus, and a population structure correlated to life-history traits and transmission of the Lyme disease agent.

    Transcriptome and methylome profiling reveals relics of genome dominance in the mesopolyploid Brassica oleracea

    Background: Brassica oleracea is a valuable vegetable species that has contributed to human health and nutrition for hundreds of years and comprises multiple distinct cultivar groups with diverse morphological and phytochemical attributes. In addition to this phenotypic wealth, B. oleracea offers unique insights into polyploid evolution, as it results from multiple ancestral polyploidy events and a final Brassiceae-specific triplication event. Further, B. oleracea represents one of the diploid genomes that formed the economically important allopolyploid oilseed, Brassica napus. A deeper understanding of B. oleracea genome architecture provides a foundation for crop improvement strategies throughout the Brassica genus. Results: We generate an assembly representing 75% of the predicted B. oleracea genome using a hybrid Illumina/Roche 454 approach. Two dense genetic maps are generated to anchor almost 92% of the assembled scaffolds to nine pseudo-chromosomes. Over 50,000 genes are annotated and 40% of the genome is predicted to be repetitive, thus contributing to the increased genome size of B. oleracea compared to its close relative B. rapa. A snapshot of both the leaf transcriptome and methylome allows comparisons to be made across the triplicated sub-genomes, which resulted from the most recent Brassiceae-specific polyploidy event. Conclusions: Differential expression of the triplicated syntelogs and cytosine methylation levels across the sub-genomes suggest residual marks of the genome dominance that led to the current genome architecture. Although cytosine methylation does not correlate with individual gene dominance, the independent methylation patterns of triplicated copies suggest epigenetic mechanisms play a role in the functional diversification of duplicate genes.

    Origin and Putative Mechanism of Expansion of the tstk Family in P. falciparum

    (A) Analysis of P. falciparum-specific genes at the SBPs revealed a gene family encoding receptor-associated protein kinases (TSTK). Maximum likelihood distances were calculated for the C-terminal 400 amino acids of all TSTKs, including those found for other Plasmodium species, Toxoplasma gondii, Cryptosporidium parvum, and C. hominis. The tree was rooted using the clade with the three non-Plasmodium sequences as the outgroup (shaded dark gray). The syntenic progenitor genes clearly form one clade (shaded light gray), while the clustering of the other 20 mainly subtelomeric pftstk is more ambiguous (the three non-subtelomeric copies are shown in bold and include pftstk7a, which appears most closely related to the clade of progenitor genes). Circles represent branch points with bootstrap values of 100% (white), 90%–99% (light gray) and 65%–89% (dark gray).
    (B) See Figure 2 (http://www.plospathogens.org/article/info:doi/10.1371/journal.ppat.0010044#ppat-0010044-g002) for the numbering of the SBs and the symbols used in this figure. Based on the 15 recombination events described in Figure 3 (http://www.plospathogens.org/article/info:doi/10.1371/journal.ppat.0010044#ppat-0010044-g003) and the phylogenetic analysis of the tstk family, we suggest the origin and putative evolution of the pftstk family as shown here. Phylogenetic analysis suggests that the intersyntenic pftstk7a is most closely related to the progenitor founder gene, pftstk0. Interestingly, this gene is the first nonsyntenic gene upstream of SB “VIIe:2b.” This SB is linked in the cRMP genome to SB “I:2a” that in P. falciparum is also flanked by a member of the tstk family, the subtelomeric pftstk1. Based on these observations we suggest that the founder gene pftstk0 was duplicated after the split of P. falciparum from the other Plasmodium species but before SBs “VIIe:2b” and “I:2a” were separated (1). This gene was then directly involved in the breakage of this link, creating Pfchr1 (“I:2a”) and destroying the telomere of “VIId:6d” by addition of “VIIe:2b” (2). During this recombination process, the gene was duplicated and is now present not only as two chromosome-internal copies on “VIIIc:12d” (pftstk0) and between “VIId:6d” and “VIIe:2b” (pftstk7a) but also as a first telomeric copy on the newly formed telomere of Pfchr1 (pftstk1). From here the gene could expand to the other subtelomeric regions (3). Local gene duplications resulted in the generation of seven copies on Pfchr9 and two copies on Pfchr4. After a copy of pftstk ended up at the left-hand cRMP subtelomeric end of SB “Xb:5a,” the telomere conversion linked SB “Xa:12a” to SB “Xb:5a,” which turned this telomeric copy into an intersyntenic gene (pftstk10a). The last non-subtelomeric copy, pftstk13, most likely resulted from a different process of mobility of P. falciparum-specific elements creating the intrasyntenic genes.

    Deduced Organization of the cRMP sera Locus

    (A) The combination of three P. berghei, six P. chabaudi, and two P. yoelii contigs (thick black lines) in a region of Pfchr2 containing eight sera copies demonstrates the strength of the “composite genome approach.” Syntenic genes (black, linked by dashed vertical lines; left, PFB0315w and PFB0320c; right, PFB0365w) flank the sera clusters and reveal the presence of five sera genes in the RMPs.
    (B) Phylogenetic analysis revealed a close relation between pfsera8, pfsera7, and pfsera6 and their syntenic orthologs in the RMPs (shaded gray, linked by dashed vertical lines in [A]). Other sera copies (pfsera1–5, pbsera1–2, and pysera1–2) clustered in species-specific groups (linked by solid horizontal lines in [A]). Circles represent branch points with bootstrap values of 100% (white), 90%–99% (light gray), and 65%–89% (dark gray).

    Hidden genomic evolution in a morphospecies-The landscape of rapidly evolving genes in Tetrahymena

    A morphospecies is defined as a taxonomic species based wholly on morphology, but often morphospecies consist of clusters of cryptic species that can be identified genetically or molecularly. The nature of the evolutionary novelty that accompanies speciation in a morphospecies is an intriguing question. Morphospecies are particularly common among ciliates, a group of unicellular eukaryotes that separates 2 kinds of nuclei, the silenced germline nucleus (micronucleus [MIC]) and the actively expressed somatic nucleus (macronucleus [MAC]), within a common cytoplasm. Because of their very similar morphologies, members of the Tetrahymena genus are considered a morphospecies. We explored the hidden genomic evolution within this genus by performing a comprehensive comparative analysis of the somatic genomes of 10 species and the germline genomes of 2 species of Tetrahymena. These species show high genetic divergence; phylogenomic analysis suggests that the genus originated about 300 million years ago (Mya). Seven universal protein domains are preferentially included among the species-specific (i.e., the youngest) Tetrahymena genes. In particular, leucine-rich repeat (LRR) genes make the largest contribution to the high level of genome divergence of the 10 species. LRR genes can be sorted into 3 different age groups. Parallel evolutionary trajectories have independently occurred among LRR genes in the different Tetrahymena species. Thousands of young LRR genes contain tandem arrays of exactly 90-bp exons. The introns separating these exons show a unique, extreme phase 2 bias, suggesting a clonal origin and successive expansions of 90-bp-exon LRR genes. Identifying LRR gene age groups allowed us to document a Tetrahymena intron length cycle. The youngest 90-bp exon LRR genes in T. thermophila are concentrated in pericentromeric and subtelomeric regions of the 5 micronuclear chromosomes, suggesting that these regions act as genome innovation centers. Copies of a Tetrahymena long interspersed element (LINE)-like retrotransposon are very frequently found physically adjacent to 90-bp exon/intron repeat units of the youngest LRR genes. We propose that Tetrahymena species have used a massive exon-shuffling mechanism, involving unequal crossing over possibly in concert with retrotransposition, to create the unique 90-bp exon array LRR genes.

    Protein interaction data curation: the International Molecular Exchange (IMEx) consortium.

    The International Molecular Exchange (IMEx) consortium is an international collaboration between major public interaction data providers to share literature-curation efforts and make a nonredundant set of protein interactions available in a single search interface on a common website (http://www.imexconsortium.org/). Common curation rules have been developed, and a central registry is used to manage the selection of articles to enter into the dataset. We discuss the advantages of such a service to the user, our quality-control measures, and our data-distribution practices.