116 research outputs found

    Comparasite: a database for comparative study of transcriptomes of parasites defined by full-length cDNAs

    Comparasite is a database for comparative studies of parasite transcriptomes. Each entry in the database is defined by full-length cDNAs from various apicomplexan parasites. It integrates seven individual databases, Full-Parasites, consisting of numerous full-length cDNA clones that we have produced and sequenced: 12 484 cDNA sequences from Plasmodium falciparum, 11 262 from Plasmodium yoelii, 9633 from Plasmodium vivax, 1518 from Plasmodium berghei, 7400 from Toxoplasma gondii, 5921 from Cryptosporidium parvum and 10 966 from the tapeworm Echinococcus multilocularis. Putative counterpart gene groups are clustered, and comparative analysis of any combination of the six apicomplexan species is implemented, such as interspecies comparisons of protein motifs (InterPro), predicted subcellular localization signals (PSORT), transmembrane regions (SOSUI) or upstream promoter elements. By specifying keywords and other search conditions, Comparasite retrieves putative counterpart gene groups containing a given feature in common or in a species-specific manner. By enabling multi-faceted comparative analyses of genes of apicomplexan protozoa, monophyletic organisms that have diversified to parasitize various hosts by adopting complex life cycles, Comparasite should help elucidate the mechanism behind parasitism. Our full-length cDNA databases and Comparasite are accessible from

    cDNA sequences reveal considerable gene prediction inaccuracy in the Plasmodium falciparum genome

    Background: The completion of the Plasmodium falciparum genome represents a milestone in malaria research. The genome sequence allows for the development of genome-wide approaches such as microarray and proteomics that will greatly facilitate our understanding of the parasite biology and accelerate new drug and vaccine development. Designing and applying these genome-wide assays, however, requires accurate information on gene prediction and genome annotation. Unfortunately, the genes in the parasite genome databases were mostly identified using computer software that could make some erroneous predictions. Results: We aimed to obtain cDNA sequences to examine the accuracy of gene prediction in silico. We constructed cDNA libraries from mixed blood stages of the P. falciparum parasite using the SMART cDNA library construction technique and generated 17332 high-quality expressed sequence tags (EST), including 2198 from primer-walking experiments. Assembly of our sequence tags produced 2548 contigs and 2671 singletons versus 5220 contigs and 5910 singletons when our EST were assembled with EST in public databases. Comparison of all the assembled EST/contigs with predicted CDS and genomic sequences in the PlasmoDB database identified 356 genes with predicted coding sequences fully covered by EST, including 85 genes (23.6%) with introns incorrectly predicted. Careful automatic software and manual alignments found an additional 308 genes that have introns different from those predicted, with 152 new introns discovered and 182 introns with sizes or locations different from those predicted. Alternatively spliced and antisense transcripts were also detected. Matching cDNA to predicted genes also revealed silent chromosomal regions, mostly at subtelomere regions. Conclusion: Our data indicated that approximately 24% of the genes in the current databases were predicted incorrectly, although some of these inaccuracies could represent alternatively spliced transcripts, and that more genes than currently predicted have one or more additional introns. It is therefore necessary to annotate the parasite genome with experimental data, although obtaining complete cDNA sequences from this parasite will be a formidable task due to the high AT nature of the genome. This study provides valuable information for genome annotation that will be critical for functional analyses.

    A Plasmodium falciparum FcB1-schizont-EST collection providing clues to schizont specific gene structure and polymorphism

    Background: The Plasmodium falciparum genome (3D7 strain), published in 2002, revealed ~5,400 genes, mostly based on in silico predictions. Experimental data are therefore required for structural and functional assessments of P. falciparum genes and expression, and polymorphic data are further necessary to exploit genomic information to further qualify therapeutic target candidates. Here, we undertook a large scale analysis of a P. falciparum FcB1-schizont-EST library previously constructed by suppression subtractive hybridization (SSH) to study genes expressed during merozoite morphogenesis, with the aim of: 1) obtaining an exhaustive collection of schizont specific ESTs, 2) experimentally validating or correcting P. falciparum gene models and 3) pinpointing genes displaying protein polymorphism between the FcB1 and 3D7 strains. Results: A total of 22,125 clones randomly picked from the SSH library were sequenced, yielding 21,805 usable ESTs that were then clustered on the P. falciparum genome. This allowed identification of 243 protein coding genes, including 121 previously annotated as hypothetical. Statistical analysis of GO terms, when available, indicated significant enrichment in genes involved in "entry into host-cells" and "actin cytoskeleton". Although most ESTs do not span full-length gene reading frames, detailed sequence comparison of FcB1-ESTs versus 3D7 genomic sequences allowed the confirmation of exon/intron boundaries in 29 genes, the detection of new boundaries in 14 genes and identification of protein polymorphism for 21 genes. In addition, a large number of non-protein coding ESTs were identified, mainly matching the two A-type rRNA units (on chromosomes 5 and 7) and, to a lesser extent, two atypical rRNA loci (on chromosomes 1 and 8), TARE subtelomeric regions (several chromosomes) and the recently described telomerase RNA gene (chromosome 9). Conclusion: This FcB1-schizont-EST analysis confirmed the actual expression of 243 protein coding genes, allowing the correction of structural annotations for a quarter of these sequences. In addition, this analysis demonstrated the actual transcription of several remarkable non-protein coding loci: 2 atypical rRNA, TARE region and telomerase RNA gene. Together with other collections of P. falciparum ESTs, usually generated from mixed parasite stages, this collection of FcB1-schizont-ESTs provides valuable data to gain further insight into the P. falciparum gene structure, polymorphism and expression.

    DNA-encoded nucleosome occupancy is associated with transcription levels in the human malaria parasite Plasmodium falciparum.

    Background: In eukaryotic organisms, packaging of DNA into nucleosomes controls gene expression by regulating access of the promoter to transcription factors. The human malaria parasite Plasmodium falciparum encodes relatively few transcription factors, while extensive nucleosome remodeling occurs during its replicative cycle in red blood cells. These observations point towards an important role of the nucleosome landscape in regulating gene expression. However, the relation between nucleosome positioning and transcriptional activity has thus far not been explored in detail in the parasite. Results: Here, we analyzed nucleosome positioning in the asexual and sexual stages of the parasite's erythrocytic cycle using chromatin immunoprecipitation of MNase-digested chromatin, followed by next-generation sequencing. We observed a relatively open chromatin structure at the trophozoite and gametocyte stages, consistent with high levels of transcriptional activity in these stages. Nucleosome occupancy of genes and promoter regions was subsequently compared to steady-state mRNA expression levels. Transcript abundance showed a strong inverse correlation with nucleosome occupancy levels in promoter regions. In addition, AT-repeat sequences were strongly unfavorable for nucleosome binding in P. falciparum, and were overrepresented in promoters of highly expressed genes. Conclusions: The connection between chromatin structure and gene expression in P. falciparum shares similarities with other eukaryotes. However, the remarkable nucleosome dynamics during the erythrocytic stages and the absence of a large variety of transcription factors may indicate that nucleosome binding and remodeling are critical regulators of transcript levels. Moreover, the strong dependency between chromatin structure and DNA sequence suggests that the P. falciparum genome may have been shaped by nucleosome binding preferences. Nucleosome remodeling mechanisms in this deadly parasite could thus provide potent novel anti-malarial targets.

    TBestDB: a taxonomically broad database of expressed sequence tags (ESTs)

    The TBestDB database contains ∼370 000 clustered expressed sequence tag (EST) sequences from 49 organisms, covering a taxonomically broad range of poorly studied, mainly unicellular eukaryotes, and includes experimental information, consensus sequences, gene annotations and metabolic pathway predictions. Most of these ESTs have been generated by the Protist EST Program, a collaboration among six Canadian research groups. EST sequences are read from trace files up to a minimum quality cut-off, vector and linker sequences are masked, and the ESTs are clustered using phrap. The resulting consensus sequences are automatically annotated using the AutoFACT program. The datasets are automatically checked for clustering errors due to chimerism and potential cross-contamination between organisms, and suspect data are flagged in or removed from the database. Access to data deposited in TBestDB by individual users can be restricted to those users for a limited period. With this first report on TBestDB, we open the database to the research community for free processing, annotation, interspecies comparisons and GenBank submission of EST data generated in individual laboratories. For instructions on submission to TBestDB, contact [email protected]. The database can be queried at

    Molecular Biology

    Core promoters are predicted by their distinct physicochemical properties in the genome of Plasmodium falciparum

    A method is presented to computationally identify core promoters in the Plasmodium falciparum genome using only DNA physicochemical properties.

    Vesicle Targeting In Plasmodium Falciparum: The Identification and Molecular Characterization of Plasmodium Falciparum Family of Snare Proteins

    Proteins of the SNARE (Soluble N-ethylmaleimide sensitive factor attachment protein receptor) super-family have been characterized as playing an essential role in vesicle targeting and fusion in all eukaryotes. The intracellular malaria parasite Plasmodium falciparum exhibits an unusual endomembrane system that is characterized by an unstacked Golgi apparatus, a developmentally induced apical complex, and various organellar structures of parasite origin in the infected host cells. How malaria parasites target nuclear-encoded proteins to these novel compartments is a central question in Plasmodium cell biology. Ultrastructural studies elsewhere have implicated specialized vesicular elements in the transport of virulence proteins, including various cytoadherence and host cell remodeling factors, into the infected erythrocyte cytoplasm. However, little is known about the machineries that define the directionality of vesicle trafficking in malaria parasites. We hypothesized that the P. falciparum SNARE proteins would exhibit novel features required for vesicle targeting to the parasite-specific compartments. We then identified for the first time and confirmed the expression of eighteen SNARE genes in P. falciparum. Members of the PfSNAREs exhibit atypical structural features (Ayong et al., 2007, Molecular & Biochemical Parasitology, 152(2), 113-122). Among the atypical PfSNAREs, PfSec22 contains an unusual insertion of the Plasmodium export element (PEXEL) within its profilin-like longin domain, preceded by an N-terminal hydrophobic segment. Localization analyses suggest that PfSec22 is predominantly a vesicle-associated SNARE of the ER/Golgi interface, but that it associates partially with mobile extraparasitic vesicles in P. falciparum-infected erythrocytes at trophozoite stages. We showed that PfSec22 export into host cells occurs via a two-step model that involves extraparasitic vesicle budding from the parasite plasma membrane and fusion with the parasitophorous vacuolar membrane. Export of PfSec22 was independent of its membrane insertion, suggesting that this protein might cross the vacuolar space as a single-pass type IV membrane protein. We demonstrated that the atypical longin domain dictates the steady-state localization of PfSec22, regulating its ER/Golgi trafficking and export into host cells. Our study provides the first experimental evidence for SNARE protein export in P. falciparum, and suggests a role of PfSec22 in vesicle trafficking within the infected host cell (Ayong et al., Eukaryotic Cell, Epub Jul 17, 2009). Next, to define the physiological function of the PfSec22 protein in Plasmodium parasites, we investigated its cognate partners. Using purified recombinant proteins, we showed that PfSec22 forms direct binding interactions with six other PfSNAREs in vitro: PfSyn5, PfBet1, PfGS27, PfSyn6, PfSyn16 and PfSyn18. By generating GFP-expressing parasites, we successfully localized the SNARE proteins PfSyn5, PfBet1 and PfGS27 to the parasite cis-Golgi compartment. We confirmed the association of PfSec22 with PfSyn5, PfBet1 and PfGS27 in vivo by immunoprecipitation analyses. Our data indicate a conserved ER-to-Golgi SNARE assembly in P. falciparum, and suggest that the malaria Sec22 protein might form novel SNARE complexes required for vesicle traffic within P. falciparum-infected erythrocytes.

    A spatial and temporal analysis of Plasmodium falciparum transcription

    Developmentally-linked gene expression is critical to the success of the human malaria parasite Plasmodium falciparum in ensuring colonisation, adaptation, replication and transmission during its complex life cycle as well as the manifestation of disease in humans. Yet, despite the wealth of high-throughput transcriptomic data, our understanding of the organisation of the transcriptional unit outside of the open reading frame is limited. The objectives of this study were to understand how intergenic space is organised over the entire P. falciparum genome and to determine the likely spatio-temporal organisation of transcripts over these intergenic regions. In addition, as homopolymeric poly dA.dT tracts are significantly overrepresented within these regions, a spatial analysis of poly dA.dT tract positional bias was undertaken and correlated with available nucleosome positioning data. These studies in P. falciparum were supported with comparative analyses using a range of other Apicomplexan parasites. Finally, the role of the 5’ untranslated region in directing transcriptional and translational efficiency for a typical housekeeping gene was investigated. Towards these aims, a range of approaches were employed, including bioinformatics, comparative genomics, data modelling and reporter gene assays. The findings presented in this thesis extend our understanding of the transcriptional landscape in this important human pathogen, generating models that can be experimentally validated when new RNAseq datasets become available. Ideas relating to how different selective forces are at play in shaping the organisation and sequence of intergenic regions are also presented. Moreover, we demonstrate comparable organisations of intergenic regions and homopolymer tracts within a number of Apicomplexan parasites, many important for human and animal health, providing the basis for a comparative approach to understanding transcriptional processes across this medically important phylum.

    Visualising Plasmodium falciparum functional genomic data in MaGnET: malaria genome exploration tool

    Malaria affects the lives of 500 million people around the world each year. The disease is caused by protozoan parasites of the genus Plasmodium, whose ability to evade the immune system and quickly evolve resistance to drugs poses a major challenge for disease control. The results of several Plasmodium genome sequencing projects have revealed how little is known about the function of their genes (over half of the approximately 5400 genes in Plasmodium falciparum, the most deadly human parasite, are annotated as 'hypothetical'). Recently, several large-scale studies have attempted to shed light on the processes in which genes are involved; for example, the use of DNA microarrays to profile the parasite's gene expression. With the emergence of varied types of functional genomic data comes a need for effective tools that allow biologists (and bioinformaticians) to explore these data. The goal of exploration/browsing-style analyses will typically be to derive clues towards the function of thus far uncharacterised gene products, and to formulate experimentally testable hypotheses. Graphic interfaces to individual data sets are obviously beneficial in this endeavour. However, effective visual data exploration also requires that interfaces to different functional genomic data are integrated and that the user can carry forward a selected group of genes (not merely one at a time) across a variety of data sets. Non-expert users especially benefit from workbench-like tools offering access to the data in this way. Still, very few contemporary publicly available software tools have implemented such functionality. This work introduces a novel software tool for the integrated visualisation of functional genomic data relating to P. falciparum: the Malaria Genome Exploration Tool (MaGnET). MaGnET consists of a light-weight Java program for effective visualisation linked to a MySQL database for data storage. In order to maximise accessibility, the program is publicly available over the World Wide Web (http://www.malariagenomeexplorer.org/). MaGnET incorporates a Genome Viewer for visualising the location of genomic features, a Protein-Protein Interaction Viewer for visualising networks of experimentally determined interactions and an Expression Data Viewer for displaying mRNA and protein expression data. Complex database queries can easily be constructed in the Data Analysis Viewer. An advantage over most other tools is that all sections are fully integrated, allowing users to carry selected groups of genes across different datasets. Furthermore, MaGnET provides useful advanced visualisation features, including mapping of expression data onto genomic location or protein-protein interaction network. The inclusion of available third-party Java software has expanded the visualisation capability of MaGnET; for example, the Jmol viewer has been incorporated for viewing 3-D protein structures. An effort has been made to only include data in MaGnET that is at least of reasonable quality. The MaGnET database collates experimental data from various public Plasmodium resources (e.g. PlasmoDB) and from published functional genomic studies, such as DNA microarrays. In addition, through careful filtering and labelling we have been able to include some predicted annotation that has not been experimentally confirmed, such as Gene Ontology and InterPro functional assignments and modelled protein structures. The application of MaGnET to malaria biology is demonstrated through a series of small studies. Initial examples show how MaGnET can be used to effectively demonstrate results from previously published analyses. This is followed up by using MaGnET to make a set of predictions about the possible functions of selected uncharacterised genes and suggesting follow-up experiments.