    Evolutionary genomics of the human parasite Schistosoma mansoni and its hosts

    Traditionally, genomes were perceived as stagnant and passive entities in which genes occupied fixed locations along the chromosomes. This idea was later challenged by the identification of mobile genes (e.g., transposable elements or TEs). Through transposition, TEs move between loci and modify the genomic environment at their new location. Such movements create large-scale changes in genomes and transcriptomes through chromosomal rearrangements and the modification of gene expression patterns. The main objective of my thesis research is to analyze TE dynamics in the genome and transcriptome of Schistosoma mansoni, a human parasite whose genome is laden with TEs. Parasites of the genus Schistosoma are the causative agents of schistosomiasis, a widespread tropical disease infecting over 200 million people worldwide. S. mansoni is the most widespread species of schistosome, distributed in both the Old World and the New World. Having an Old World origin, S. mansoni reportedly invaded the New World through the trans-Atlantic slave trade during the last 500 years and quickly became established. I quantified TE copy number in the genomes and transcriptomes of Old World and New World strains of the parasite using a custom qPCR assay. Compared to the Old World strains, New World strains of S. mansoni appear to carry more copies of TEs in their genomes, a possible genomic signature of the New World invasion. This work provides empirical data to support the hypothesis that TEs proliferate extensively upon genomic stress (such as habitat invasion). The genomic dynamism caused by such proliferations may have enabled the parasite to evolve adaptations to thrive in the New World. TEs can also introduce adaptive advantages through Horizontal Gene Transfer (HGT), the movement of genes between distinct biological lineages. The literature suggests that HGTs often occur between hosts and parasites, presumably facilitated by the intimate nature of the host-parasite interaction. As HGTs provide novel adaptations to the gene recipients, the identification of HGTs will provide new insights into host-parasite co-evolution. In Schistosoma-host systems, several published reports have suggested the possibility of HGT between schistosomes and their hosts, and that some HGTs assist the parasite in evading the host immune system. However, most of these reports are based on circumstantial evidence and have not been critically and independently validated. I used molecular and bioinformatics approaches to analyze 13 published claims of HGT between schistosomes and their hosts. My research revealed that most supposed schistosome-host HGTs are false positives, and it affirms the importance of using multiple methodologies to validate HGTs, as both DNA samples and next-generation sequencing (NGS) datasets are prone to contamination. Specifically, sequence data from hosts that are infected with parasites often contain spurious parasite DNA as technical artifacts (“xenobiotics”) that are not indicative of HGT between hosts and parasites. Next, I explored the signatures of NGS contamination while studying changes in host gene expression in response to parasitic infection. I sequenced, assembled and annotated the liver transcriptomes of uninfected and S. mansoni-infected mice to a) trace the effect of xenobiotics on transcriptome assembly and b) identify global changes in host gene expression related to parasite infection. I found that xenobiotic transcripts can falsely appear as differentially expressed, significantly affecting downstream analysis. I also identified genes associated with metabolic, immunological and inflammatory responses that were differentially expressed in infected mice. These findings enhance our understanding of the host immunological repertoire involved in the response to S. mansoni infections, providing insights into host-parasite interplay at the transcript level.

    annotated genes_proteins

    16,571 genes were annotated in the golden eagle genome; this FASTA file contains their protein sequences (see readme.pdf for more information). The corresponding file archived in fortress is kmer70_min10000_scaffolds_revisedassembly.all.maker.proteins
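
    One quick sanity check on a protein FASTA file like this is to count its records and compare the total with the stated 16,571 annotated genes. The sketch below is only an illustration: the function name `count_fasta_records` and the toy input are assumptions, not part of the archived dataset, and the count equals the gene count only if the file holds one protein per gene.

```python
from io import StringIO


def count_fasta_records(handle):
    """Count sequences in a FASTA stream by counting '>' header lines."""
    return sum(1 for line in handle if line.startswith(">"))


# Toy stand-in for the protein file (hypothetical records, not real data).
toy_fasta = StringIO(">protein_1\nMKTLLV\nAAG\n>protein_2\nMSTNPK\n")
n_records = count_fasta_records(toy_fasta)
print(n_records)
```

    For the real file, the same function can be applied to an opened file handle, e.g. `with open(path) as fh: count_fasta_records(fh)`, where `path` is wherever the archive was saved locally.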

    Data from: The Genome sequence of a widespread apex predator, the golden eagle (Aquila chrysaetos)

    Biologists routinely use molecular markers to identify conservation units, to quantify genetic connectivity, to estimate population sizes, and to identify targets of selection. Many imperiled eagle populations require such efforts and would benefit from enhanced genomic resources. We sequenced, assembled, and annotated the first eagle genome using DNA from a male golden eagle (Aquila chrysaetos) captured in western North America. We constructed genomic libraries that were sequenced using Illumina technology and assembled the high-quality data to a depth of ~40x coverage. The genome assembly includes 2,552 scaffolds >10 Kb and 415 scaffolds >1.2 Mb. We annotated 16,571 genes that are involved in myriad biological processes, including such disparate traits as beak formation and color vision. We also identified repetitive regions spanning 92 Mb (~6% of the assembly), including LINEs, SINEs, LTR-RTs and DNA transposons. The mitochondrial genome encompasses 17,332 bp and is ~91% identical to that of the Mountain Hawk-Eagle (Nisaetus nipalensis). Finally, the data reveal that several anonymous microsatellites commonly used for population studies are embedded within protein-coding genes and thus may not have evolved in a neutral fashion. Because the genome sequence includes ~800,000 novel polymorphisms, markers can now be chosen based on their proximity to functional genes involved in migration, carnivory, and other biological processes.
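
    The assembly statistics quoted in the abstract (scaffold counts above a length cutoff, repeat content as a share of the assembly) are simple to recompute from a list of scaffold lengths. The sketch below uses made-up lengths and hypothetical function names. Its only use of the abstract's figures is the last calculation, which back-computes an approximate assembly size from "92 Mb (~6% of the assembly)". Because the 6% is rounded, the result is a rough estimate rather than a reported value.

```python
def summarize_scaffolds(lengths, thresholds=(10_000, 1_200_000)):
    """Return {threshold: number of scaffolds strictly longer than it}."""
    return {t: sum(1 for n in lengths if n > t) for t in thresholds}


def implied_assembly_size(repeat_bp, repeat_fraction):
    """Assembly size implied by a repeat span and its share of the assembly."""
    return repeat_bp / repeat_fraction


# Made-up scaffold lengths in bp, for illustration only.
toy_lengths = [500, 12_000, 1_500_000, 9_999, 2_000_000]
counts = summarize_scaffolds(toy_lengths)

# 92 Mb of repeats at roughly 6% of the assembly (figures from the abstract).
approx_size = implied_assembly_size(92e6, 0.06)
print(counts, round(approx_size / 1e9, 2), "Gb (approximate)")
```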

    annotated genes_gff

    16,571 genes were annotated in the golden eagle genome; this is the .gff file associated with those annotations (see readme.pdf for additional information)
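
    A rough way to check an annotation file against the stated gene count is to tally the feature types in its third column. The sketch below assumes the file follows the usual nine-column, tab-separated GFF3 layout and may end with a `##FASTA` section. The listing does not confirm either point, and the function name and toy records are hypothetical.

```python
from collections import Counter
from io import StringIO


def count_feature_types(handle):
    """Tally the 'type' column (column 3) of a GFF3-style stream."""
    counts = Counter()
    for line in handle:
        if line.startswith("##FASTA"):
            break  # any embedded sequence section follows; stop parsing
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        fields = line.rstrip("\n").split("\t")
        if len(fields) == 9:
            counts[fields[2]] += 1
    return counts


# Toy records (hypothetical IDs and coordinates, not from the eagle annotation).
rows = [
    ["scaf1", "maker", "gene", "100", "900", ".", "+", ".", "ID=g1"],
    ["scaf1", "maker", "mRNA", "100", "900", ".", "+", ".", "ID=g1-RA;Parent=g1"],
    ["scaf2", "maker", "gene", "50", "400", ".", "-", ".", "ID=g2"],
]
toy_gff = "##gff-version 3\n" + "".join("\t".join(r) + "\n" for r in rows)
feature_counts = count_feature_types(StringIO(toy_gff))
print(feature_counts["gene"])
```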

    microsatellites_kmer70_min200_scaffolds_revisedassembly_allmicrosatellites.fasta

    I used the MISA.pl script (http://pgrc.ipk-gatersleben.de/misa/misa.html) to identify all microsatellites present in scaffolds greater than 200 bp. Misa.pl kmer70_min200_scaffolds_revisedassembly.fast
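
    To illustrate what a microsatellite scan of this kind does, here is a minimal regex-based sketch. It is not the MISA.pl implementation. The repeat thresholds in `MIN_REPEATS`, the function names, and the test sequence are all assumptions chosen for illustration. The only detail taken from the description above is that sequences of 200 bp or shorter are skipped.

```python
import re

# Illustrative thresholds (motif length -> minimum number of tandem copies).
# These are assumptions for this sketch, not the settings used for the dataset.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_periodic(unit):
    """True if the motif is itself a repeat of a shorter motif (e.g. 'ACAC')."""
    return unit in (unit + unit)[1:-1]


def find_microsatellites(seq, min_length=200):
    """Return sorted (start, end, motif, copies) tuples, 0-based half-open."""
    seq = seq.upper()
    if len(seq) <= min_length:  # only scan sequences longer than min_length
        return []
    hits = []
    for k, m in MIN_REPEATS.items():
        # A k-mer followed by at least m-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, m - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            if is_periodic(unit):
                continue  # e.g. 'AA' tracts are reported as mononucleotide only
            copies = (match.end() - match.start()) // k
            hits.append((match.start(), match.end(), unit, copies))
    return sorted(hits)


# Synthetic sequence: non-repetitive flanks around an (AC)8 and an (AAT)5 tract.
flank = "GTCATGCA" * 15
toy_seq = flank + "AC" * 8 + flank + "AAT" * 5 + flank
hits = find_microsatellites(toy_seq)
for hit in hits:
    print(hit)
```

    The sketch reports each tract under its shortest motif only and ignores compound or interrupted repeats, which dedicated tools handle separately.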

    genome assembly_kmer70_min10000_scaffolds_revisedassembly

    The golden eagle genome was assembled in ABySS with the following parameters: abyss-pe s=202 n=10 k=70 l=30; followed by file specifications for 1) 'lib' -- the paired-end and mate-pair reads, 2) 'se' -- the unpaired and mate-pair reads used as single-end reads in assembly and 3) 'mp' -- the paired-end and mate-pair reads used in scaffolding

    transposable elements_repeatmasker_kmer70-v2-min200-scaffolds.fa

    ############################
    ###### RepeatMasker ########
    ############################
    ## RepeatMasker version 4.0.2
    ## RepeatMaskerLibrary-20130422 version
    ## command:
    RepeatMasker -nolow -no_is -norna -dir . \
        -lib RepeatMaskerLib.embl.lib \
        kmer70-v2-min200-scaffolds.fa
    ## output files:
    repeatmasker_kmer70-v2-min200-scaffolds.fa.masked
    repeatmasker_kmer70-v2-min200-scaffolds.fa.tb
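
    One way to inspect a `.masked` output is to compare each masked scaffold with its original sequence and count the bases that became `N`. The sketch below assumes hard masking with `N`, which is RepeatMasker's default when no soft-masking option is given. It ignores positions that were already `N` in the original, such as scaffold gaps. The function name and the toy sequences are hypothetical.

```python
def masked_fraction(original, masked):
    """Fraction of bases newly replaced by 'N' in the masked sequence."""
    if len(original) != len(masked):
        raise ValueError("original and masked sequences differ in length")
    if not original:
        return 0.0
    newly_masked = sum(
        1
        for o, m in zip(original.upper(), masked.upper())
        if m == "N" and o != "N"  # skip gaps that were already N
    )
    return newly_masked / len(original)


# Toy pair: two bases are newly masked; the two gap Ns are not counted.
fraction = masked_fraction("ACGTNNACGT", "ACNNNNACGT")
print(fraction)
```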

    transposable elements_RepeatProteinMask output_ge_v2_all

    ############################
    ##### RepeatProteinMask ####
    ############################
    ## version: 4.0.2
    ## command:
    RepeatProteinMask -noLowSimple -pvalue 1e-4 -engine abblast kmer70-v2-min200-scaffolds.fa
    ## output files:
    ge_v2_all.anno