16 research outputs found

    Quantification of stochastic noise of splicing and polyadenylation in Entamoeba histolytica

    Alternative splicing and polyadenylation were observed pervasively in eukaryotic messenger RNAs. These alternative isoforms could either be consequences of physiological regulation or stochastic noise of RNA processing. To quantify the extent of stochastic noise in splicing and polyadenylation, we analyzed the alternative usage of splicing and polyadenylation sites in Entamoeba histolytica using RNA-Seq. First, we identified a large number of rarely spliced alternative junctions and then showed that the occurrence of these alternative splicing events is correlated with splicing site sequence, occurrence of constitutive splicing events and messenger RNA abundance. Our results implied the majority of these alternative splicing events are likely to be stochastic error of splicing machineries, and we estimated the corresponding error rates. Second, we observed extensive microheterogeneity of polyadenylation cleavage sites, and the extent of such microheterogeneity is correlated with the occurrence of constitutive cleavage events, suggesting most of such microheterogeneity is likely to be stochastic. Overall, we only observed a small fraction of alternative splicing and polyadenylation isoforms that are unlikely to be solely stochastic, implying the functional relevance of alternative splicing and polyadenylation in E. histolytica is limited. Lastly, we revised the gene models and annotated their 3′UTR in AmoebaDB, providing valuable resources to the community

    Regulation of Gene Expression in Protozoa Parasites

    Infections with protozoa parasites are associated with high burdens of morbidity and mortality across the developing world. Despite extensive efforts to control the transmission of these parasites, the spread of drug-resistant populations and the lack of effective vaccines against them contribute to their persistence as major public health problems. Parasites must strictly control the expression of genes involved in their pathogenicity, differentiation, immune evasion, or drug resistance, and understanding the mechanisms behind that control could help to develop novel therapeutic strategies. However, these mechanisms remain poorly understood in protozoa. Recent investigations into gene expression in protozoa parasites suggest that they possess many of the canonical machineries employed by higher eukaryotes for the control of gene expression at transcriptional, posttranscriptional, and epigenetic levels, but they also contain exclusive mechanisms. Here, we review the current understanding of the regulation of gene expression in Plasmodium sp., Trypanosomatids, Entamoeba histolytica and Trichomonas vaginalis

    RNA Sequencing Reveals Widespread Transcription of Natural Antisense RNAs in Entamoeba Species.

    Entamoeba is a genus of Amoebozoa that includes the intestine-colonizing pathogenic species Entamoeba histolytica. To understand the basis of gene regulation in E. histolytica from an evolutionary perspective, we have profiled the transcriptomes of its closely related species E. dispar, E. moshkovskii and E. invadens. Genome-wide identification of transcription start sites (TSS) and polyadenylation sites (PAS) revealed the similarities and differences of their gene regulatory sequences. In particular, we found the widespread initiation of antisense transcription from within the gene coding sequences is a common feature among all Entamoeba species. Interestingly, we observed the enrichment of antisense transcription in genes involved in several processes that are common to species infecting the human intestine, e.g., the metabolism of phospholipids. These results suggest a potentially conserved and compact gene regulatory system in Entamoeba

    Serum-Dependent Selective Expression of EhTMKB1-9, a Member of Entamoeba histolytica B1 Family of Transmembrane Kinases

    Entamoeba histolytica transmembrane kinases (EhTMKs) can be grouped into six distinct families on the basis of motifs and sequences. Analysis of the E. histolytica genome revealed the presence of 35 EhTMKB1 members on the basis of sequence identity (≥95%). Only six homologs were full length, containing an extracellular domain, a transmembrane segment and an intracellular kinase domain. Reverse transcription followed by polymerase chain reaction (RT-PCR) of the kinase domain was used to generate a library of expressed sequences. Sequencing of randomly picked clones from this library revealed that about 95% of the clones were identical with a single member, EhTMKB1-9, in proliferating cells. On serum starvation, the relative number of EhTMKB1-9 derived sequences decreased with a concomitant increase in the sequences derived from another member, EhTMKB1-18. The change in their relative expression was quantified by real time PCR. Northern analysis and RNase protection assay were used to study the temporal nature of EhTMKB1-9 expression after serum replenishment of starved cells. The results showed that the expression of EhTMKB1-9 was sinusoidal. Specific transcriptional induction of EhTMKB1-9 upon serum replenishment was further confirmed by reporter gene (luciferase) expression and the upstream sequence responsible for serum responsiveness was identified. EhTMKB1-9 is one of the first examples of an inducible gene in Entamoeba. The protein encoded by this member was functionally characterized. The recombinant kinase domain of EhTMKB1-9 displayed protein kinase activity. It is likely to have dual specificity as judged from its sensitivity to different kinase inhibitors. Immuno-localization showed EhTMKB1-9 to be a surface protein which decreased on serum starvation and was relocalized on serum replenishment. Cell lines expressing either EhTMKB1-9 without kinase domain, or EhTMKB1-9 antisense RNA, showed decreased cellular proliferation and target cell killing. Our results suggest that E. histolytica TMKs of B1 family are functional kinases likely to be involved in serum response and cellular proliferation

    Application of a Naïve Bayes Classifier to Assign Polyadenylation Sites from 3′ End Deep Sequencing Data: A Dissertation

    Cleavage and polyadenylation of a precursor mRNA is important for transcription termination, mRNA stability, and regulation of gene expression. This process is directed by a multitude of protein factors and cis elements in the pre-mRNA sequence surrounding the cleavage and polyadenylation site. Importantly, the location of the cleavage and polyadenylation site helps define the 3’ untranslated region of a transcript, which is important for regulation by microRNAs and RNA binding proteins. Additionally, these sites have generally been poorly annotated. To identify 3’ ends, many techniques utilize an oligo-dT primer to construct deep sequencing libraries. However, this approach can lead to identification of artifactual polyadenylation sites due to internal priming in homopolymeric stretches of adenines. Previously, simple heuristic filters relying on the number of adenines in the genomic sequence downstream of a putative polyadenylation site have been used to remove these sites of internal priming. However, these simple filters may not remove all sites of internal priming and may also exclude true polyadenylation sites. Therefore, I developed a naïve Bayes classifier to identify putative sites from oligo-dT primed 3’ end deep sequencing as true or false/internally primed. Notably, this algorithm uses a combination of sequence elements to distinguish between true and false sites. Finally, the resulting algorithm is highly accurate in multiple model systems and facilitates identification of novel polyadenylation sites

    Ancient roles of non-coding RNAs in eukaryotic evolution

    RNAs not coding for proteins, non-coding RNAs (ncRNAs), have many important roles in all kingdoms of life. Especially in eukaryotes, the regulatory functions of ncRNAs have been suggested as a major force in the evolution of complex traits. Cellular processes that are regulated by ncRNAs include, for example, cell differentiation, organ development and defense against viruses and transposable elements. This is achieved through a number of mechanisms like RNA destabilization and modification, transcriptional and translational control and chromatin modifications. Dictyostelium discoideum is a social amoeba and the best studied organism representing Amoebozoa, one of the eukaryotic supergroups. It has long served as an excellent model for many basic cellular events like chemotaxis, differentiation and development and recently also for infection. The ncRNA population in D. discoideum is in many ways typical of eukaryotes but also harbors particularities. In this thesis I have studied spliceosomal RNAs as well as the RNA interference and microRNA pathways, which probably were present in the last eukaryotic common ancestor. I have also characterized Class I RNAs, which seem to be specific to social amoebae. In addition, we have described the signal recognition particle RNA in several protists and also the involvement of a ncRNA during host interaction and stress in Giardia lamblia. Combining the well established molecular tools and knowledge about various pathways in D. discoideum with the growing understanding of ncRNA could in the future give important information about the function of ncRNAs as well as their ancient roles and evolution

    Comparative approaches to genome evolution in Blastocystis and Entamoeba

    Parasitism has arisen independently in numerous lineages of eukaryotes. Investigating the origins of parasitism is a core question in evolutionary biology and allows identification of parasite-specific factors that aid in diagnosis and treatment. Comparative genomic studies have often been applied within clades of parasites, which allows their ancestral state to be imagined, but cannot elucidate the processes that surrounded the emergence of parasitism. This question must be approached by comparison with a free-living out-group, to reconstruct the ancestral non-parasitic state. In this thesis, I examine free-living relatives of two intestinal protists of global importance, Blastocystis sp. and Entamoeba histolytica, to explore their evolution. A draft genome sequence for Proteromonas lacertae, the non-pathogenic sister-taxon of Blastocystis, is presented along with a transcriptome for Cafeteria roenbergensis, a free-living out-group to the Blastocystis-Proteromonas clade. Together with the published Blastocystis sp. genome sequences, the P. lacertae genome and the C. roenbergensis transcriptome were used in a comparative genomic analysis. This revealed that the Blastocystis genomes are genuinely small, compared to other Stramenopiles and that this reduction is genome-wide as well as with respect to specific cellular apparatus, such as the flagellum and other motility-associated genes, which have been totally lost from the ancestor of Blastocystis. Rather than observe the same loss of function from metabolic capability, this reduction was associated with loss of gene complexity and is indicative of genomic streamlining. This is coupled with gene family expansion of Ig-like domain-containing proteins, potentially bestowing adhesive qualities to the cell surface. A transcriptome for Mastigamoeba sp., a free-living out-group to the Entamoeba genus, is also presented. The Mastigamoeba sp. transcriptome was used in a comparative analysis of the E. histolytica genome. 
This analysis revealed large-scale expansion of Ras-family proteins in the ancestor of Entamoeba, which may be linked to motility and phagocytosis required for pathogenesis. Analysis of cathepsins revealed processes of genomic reduction and expansion occurring within the same gene family indicating genomic streamlining and subsequent specialisation in the parasite. I have shown how we might revisit crucial questions in evolutionary biology using the latest genome sequencing technology. By generating new genomic resources for free-living protists, this thesis exposes the mechanism by which two common intestinal parasites of humans and animals evolved. It makes substantial contribution to our understanding of the origins of parasite genomes, and of microbial biodiversity, while revealing numerous parasite-specific features that will sustain future research

    Spliceosomal intron and spliceosome evolution in Giardia lamblia and other diplomonads

    Spliceosomal introns interrupt protein coding genes in all characterized eukaryotic nuclear genomes and are removed by a large RNA-protein complex termed the spliceosome. Diplomonads are diverse unicellular eukaryotes that display compact genomes with few spliceosomal introns. My thesis objectives were to explore spliceosomal intron and spliceosome diversity as well as RNA processing mechanisms in the diplomonads Giardia lamblia and Spironucleus spp. Surprisingly, G. lamblia was found to contain a proportionally large number of fragmented spliceosomal introns that are spliced in trans from separate pre-mRNA molecules. Next, both evolutionarily divergent and conventional spliceosomal small nuclear RNAs were identified in G. lamblia and Spironucleus spp. and an RNA 3ʹ end motif was determined to be involved in processing of both non-coding RNAs and trans-introns in G. lamblia. These findings shed light on spliceosome and spliceosomal intron evolution in eukaryotes undergoing severe genomic reduction and potentially complete loss of their spliceosomal introns. Funding: University of Lethbridge (SGS Graduate Fellowship), Natural Sciences and Engineering Research Council of Canada (NSERC) (Discovery Grant and Alexander Graham Bell Canada Graduate Scholarships)

    Décodage de l'expression de gènes cryptiques

    Thanks to new high-throughput sequencing technologies and automatic annotation pipelines, proceeding from an Eppendorf tube to a GenBank file can be achieved in a single mouse click or so, for some species. Others, however, fiercely resist bioinformaticians with their confounding genomic complexity. Diplonemids are one of them. My thesis is centered on the discovery of new strategies for encrypting genetic information in eukaryotes, and the identification of molecular decoding processes. Diplonemids are a group of poorly studied marine protists. Unexpectedly, metagenomic studies have recently ranked this group as one of the most diverse in the oceans. Yet, their most distinctive feature is their multipartite mitochondrial genome with genes in pieces, and encryption by nucleotide deletions and substitutions. Genes are decrypted at the RNA level through three processes: (i) trans-splicing, (ii) polyuridylation at the junction of gene pieces and (iii) substitutions of A-to-I and C-to-T. Such a diverse arsenal of mitochondrial post-transcriptional processes is highly exceptional. Using a bioinformatics approach, I have reconstructed the complete mitochondrial transcriptome from RNA-seq libraries. We have identified six new genes, including one that presents alternative trans-splicing isoforms. In total, there are 216 positions edited by polyuridylation across 14 genes (up to 29 uridines per position), and 114 positions edited by deamination (A-to-I or C-to-T) among seven genes (nad4, nad7, rns, y1, y2, y3, y5). In order to identify the machinery that processes mitochondrial RNAs, the nuclear genome has been sequenced, and I have then assembled and annotated it. This machinery is probably unique and complex, because no cis signal or trans actor typical of known splicing machineries has been found. I have identified promising protein candidates that remain to be validated experimentally, notably RNA ligases, numerous members of the PPR family involved in RNA editing in plant organelles, and several deaminases. During my thesis, we have identified new types of post-transcriptional RNA processing in diplonemid mitochondria and identified promising new candidates for the machinery. These components, capable of precisely joining RNA fragments and editing them, could find biotechnological applications. From an evolutionary perspective, the discovery of new molecular systems gives insight into the process of gene recruitment, adaptation to new functions and establishment of complex molecular machineries