    In search of lost introns

    Many fundamental questions concerning the emergence and subsequent evolution of eukaryotic exon-intron organization are still unsettled. Genome-scale comparative studies, which can shed light on crucial aspects of eukaryotic evolution, require adequate computational tools. We describe novel computational methods for studying spliceosomal intron evolution. Our goal is to give a reliable characterization of the dynamics of intron evolution. Our algorithmic innovations address the identification of orthologous introns, and the likelihood-based analysis of intron data. We discuss a compression method for the evaluation of the likelihood function, which is noteworthy for phylogenetic likelihood problems in general. We prove that after $O(nL)$ preprocessing time, subsequent evaluations take $O(nL/\log L)$ time almost surely in the Yule-Harding random model of $n$-taxon phylogenies, where $L$ is the input sequence length. We illustrate the practicality of our methods by compiling and analyzing a data set involving 18 eukaryotes, more than in any other study to date. The study yields the surprising result that ancestral eukaryotes were fairly intron-rich. For example, the bilaterian ancestor is estimated to have had more than 90% as many introns as vertebrates do now.
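
A common way to speed up repeated likelihood evaluations, in the spirit of the compression idea mentioned in this abstract, is to collapse identical alignment columns and weight each distinct pattern by its count. The Python sketch below illustrates only that generic pattern-compression trick. It is not the authors' algorithm and makes no claim to the complexity bound stated above. The function names and the toy per-site model (`toy_site_log_lik`, with an assumed independent presence probability `p`) are hypothetical and exist only for illustration.

```python
import math
from collections import Counter


def compress_columns(rows):
    """Collapse identical columns of a presence/absence matrix.

    rows: one equal-length string per taxon, '1' = intron present, '0' = absent.
    Returns a Counter mapping each distinct column pattern to its multiplicity.
    """
    return Counter(zip(*rows))


def log_likelihood(pattern_counts, site_log_lik):
    """Sum per-pattern log-likelihoods, each weighted by how often it occurs."""
    return sum(count * site_log_lik(pattern)
               for pattern, count in pattern_counts.items())


def toy_site_log_lik(pattern, p=0.3):
    """Hypothetical stand-in model: each taxon carries the intron independently
    with probability p. A real analysis would evaluate a phylogenetic model here."""
    return sum(math.log(p if state == "1" else 1.0 - p) for state in pattern)


# Toy data: 3 taxa, 4 candidate intron sites (made up for illustration).
rows = ["1101", "1001", "1101"]
patterns = compress_columns(rows)

compressed = log_likelihood(patterns, toy_site_log_lik)
naive = sum(toy_site_log_lik(col) for col in zip(*rows))
```

The compressed sum visits each distinct pattern once, so the saving grows with the number of repeated columns, and the result matches the column-by-column sum.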

    A Quantitative Approach to Investigating the Hypothesis of Prokaryotic Intron Loss

    Using a novel method, we show that ordered triplets of motifs usually associated with spliceosomal intron recognition are underrepresented in the protein coding sequence of complete Thermotogae, archaeal and bacterial genomes. The underrepresentation observed does not extend to the noncoding strand, suggesting that the cause of the asymmetry is related to mRNA rather than DNA. Our data do not suggest that the underrepresentation is due to gene transfer from eukaryotes. We speculate that one possible explanation for these observations is that the protein coding sequence of Thermotogae, Archaea and Bacteria was at some time in the past subjected to selection against certain motifs appearing in an order which might initiate splicing in environments harboring a functional spliceosome. This is consistent with, but certainly does not prove, a hypothetical scenario in which at least some prokaryote lineages once possessed a functional spliceosome. Thus, we present a new quantitative method, observations obtained using the method, and a speculative discussion of a possible explanation of the observations

    Computational Identification of Four Spliceosomal snRNAs from the Deep-Branching Eukaryote Giardia intestinalis

    Funding: Marsden Fund New Zealand; Allan Wilson Centre. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. RNAs processing other RNAs is very general in eukaryotes, but it is not clear to what extent it is ancestral to eukaryotes. Here we focus on pre-mRNA splicing, one of the most important RNA-processing mechanisms in eukaryotes. In most eukaryotes splicing is predominantly catalysed by the major spliceosome complex, which consists of five uridine-rich small nuclear RNAs (U-snRNAs) and over 200 proteins in humans. Three major spliceosomal introns have been found experimentally in Giardia; one Giardia U-snRNA (U5) and a number of spliceosomal proteins have also been identified. However, because of the low sequence similarity between the Giardia ncRNAs and those of other eukaryotes, the other U-snRNAs of Giardia had not been found. Using two computational methods, candidates for Giardia U1, U2, U4 and U6 snRNAs were identified in this study and shown by RT-PCR to be expressed. We found that identifying a U2 candidate helped identify U6 and U4 based on interactions between them. Secondary structural modelling of the Giardia U-snRNA candidates revealed typical features of eukaryotic U-snRNAs. We demonstrate a successful approach to combine computational and experimental methods to identify expected ncRNAs in a highly divergent protist genome. Our findings reinforce the conclusion that spliceosomal small-nuclear RNAs existed in the last common ancestor of eukaryotes.

    A new reference genome assembly for the microcrustacean Daphnia pulex

    Comparing genomes of closely related genotypes from populations with distinct demographic histories can help reveal the impact of effective population size on genome evolution. For this purpose, we present a high quality genome assembly of Daphnia pulex (PA42), and compare this with the first sequenced genome of this species (TCO), which was derived from an isolate from a population with >90% reduction in nucleotide diversity. PA42 has numerous similarities to TCO at the gene level, with an average amino acid sequence identity of 98.8% and >60% of orthologous proteins identical. Nonetheless, there is a highly elevated number of genes in the TCO genome annotation, with approximately 7000 excess genes appearing to be false positives. This view is supported by the high GC content, lack of introns, and short length of these suspicious gene annotations. Consistent with the view that reduced effective population size can facilitate the accumulation of slightly deleterious genomic features, we observe more proliferation of transposable elements (TEs) and a higher frequency of gained introns in the TCO genome.

    NG peptides: A novel family of neurophysin-associated neuropeptides

    NOTICE: this is the author’s version of a work that was accepted for publication in GENE. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in GENE, [VOL 458, ISSUE 1-2, (2010)] DOI: 10.1016/j.gene.2010.03.00

    Complete chloroplast genome sequence of holoparasite Cistanche deserticola (Orobanchaceae) reveals gene loss and horizontal gene transfer from its host Haloxylon ammodendron (Chenopodiaceae)

    The central function of chloroplasts is to carry out photosynthesis, and their gene content and structure are highly conserved across land plants. Parasitic plants, which have reduced photosynthetic ability, suffer gene losses from the chloroplast (cp) genome accompanied by the relaxation of selective constraints. Compared with the rapid rise in the number of cp genome sequences of photosynthetic organisms, there are limited data sets from parasitic plants. The authors report the complete sequence of the cp genome of Cistanche deserticola, a holoparasitic desert species belonging to the family Orobanchaceae.

    Large introns in relation to alternative splicing and gene evolution: a case study of Drosophila bruno-3

    Background: Alternative splicing (AS) of maturing mRNA can generate structurally and functionally distinct transcripts from the same gene. Recent bioinformatic analyses of available genome databases inferred a positive correlation between intron length and AS. To study the interplay between intron length and AS empirically and in more detail, we analyzed the diversity of alternatively spliced transcripts (ASTs) in the Drosophila RNA-binding Bruno-3 (Bru-3) gene. This gene was known to encode thirteen exons separated by introns of diverse sizes, ranging from 71 to 41,973 nucleotides in D. melanogaster. Although Bru-3's structure is expected to be conducive to AS, only two ASTs of this gene were previously described. Results: Cloning of RT-PCR products of the entire ORF from four species representing three diverged Drosophila lineages provided an evolutionary perspective, high sensitivity, and long-range contiguity of splice choices currently unattainable by high-throughput methods. Consequently, we identified three new exons, a new exon fragment and thirty-three previously unknown ASTs of Bru-3. All exon-skipping events in the gene were mapped to the exons surrounded by introns of at least 800 nucleotides, whereas exons split by introns of less than 250 nucleotides were always spliced contiguously in mRNA. Cases of exon loss and creation during Bru-3 evolution in Drosophila were also localized within large introns. Notably, we identified a true de novo exon gain: exon 8 was created along the lineage of the obscura group from intronic sequence between cryptic splice sites conserved among all Drosophila species surveyed. Exon 8 was included in mature mRNA by the species representing all the major branches of the obscura group. To our knowledge, the origin of exon 8 is the first documented case of exonization of intronic sequence outside vertebrates. 
    Conclusion: We found that large introns can promote AS via exon-skipping and exon turnover during evolution, likely due to frequent errors in their removal from maturing mRNA. Large introns could be a reservoir of genetic diversity, because they have a greater number of mutable sites than short introns. Taken together, gene structure can constrain and/or promote gene evolution.

    Exon-phase symmetry and intrinsic structural disorder promote modular evolution in the human genome

    A key signature of module exchange in the genome is phase symmetry of exons, suggestive of exon shuffling events that occurred without disrupting translation reading frame. At the protein level, intrinsic structural disorder may be another key element because disordered regions often serve as functional elements that can be effectively integrated into a protein structure. Therefore, we asked whether exon-phase symmetry in the human genome and structural disorder in the human proteome are connected, signalling such evolutionary mechanisms in the assembly of multi-exon genes. We found an elevated level of structural disorder of regions encoded by symmetric exons and a preferred symmetry of exons encoding for mostly disordered regions (>70% predicted disorder). Alternatively spliced symmetric exons tend to correspond to the most disordered regions. The genes of mostly disordered proteins (>70% predicted disorder) tend to be assembled from symmetric exons, which often arise by internal tandem duplications. Preponderance of certain types of short motifs (e.g. SH3-binding motif) and domains (e.g. high-mobility group domains) suggests that certain disordered modules have been particularly effective in exon-shuffling events. Our observations suggest that structural disorder has facilitated modular assembly of complex genes in evolution of the human genome. © 2013 The Author(s)
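
Exon-phase symmetry can be made concrete with a small calculation. An intron's phase is the position within a codon at which it interrupts the coding sequence (0, 1 or 2), and an internal exon is symmetric when its upstream and downstream introns share the same phase, which is equivalent to its coding length being a multiple of three. The Python sketch below works under that standard definition. The function names and the example exon lengths are made up for illustration and are not taken from the study.

```python
def intron_phases(cds_exon_lengths):
    """Return the phase (0, 1 or 2) of each intron in a transcript.

    cds_exon_lengths: coding lengths of the exons in transcript order.
    The phase of the intron after exon i is the cumulative coding length
    up to and including exon i, taken modulo 3.
    """
    phases = []
    total = 0
    for length in cds_exon_lengths[:-1]:  # no intron follows the last exon
        total += length
        phases.append(total % 3)
    return phases


def classify_internal_exons(cds_exon_lengths):
    """List (exon_index, upstream_phase, downstream_phase, is_symmetric)
    for every internal exon (0-based index; first and last exons skipped)."""
    phases = intron_phases(cds_exon_lengths)
    result = []
    for i in range(1, len(cds_exon_lengths) - 1):
        upstream, downstream = phases[i - 1], phases[i]
        result.append((i, upstream, downstream, upstream == downstream))
    return result


# Hypothetical four-exon gene with a 300-nucleotide coding sequence.
example_lengths = [100, 90, 50, 60]
example_phases = intron_phases(example_lengths)
example_classes = classify_internal_exons(example_lengths)
```

In this toy gene the 90-nucleotide exon is a symmetric 1-1 exon, so inserting or removing it leaves the downstream reading frame intact, while the 50-nucleotide exon (1-0) is asymmetric.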

    Selection at a single locus leads to widespread expansion of Toxoplasma gondii lineages that are virulent in mice

    The determinants of virulence are rarely defined for eukaryotic parasites such as T. gondii, a widespread parasite of mammals that also infects humans, sometimes with serious consequences. Recent laboratory studies have established that variation in a single secreted protein, a serine/threonine kinase known as ROP18, controls whether or not mice survive infection. Here, we establish the extent and nature of variation in ROP18 among a collection of parasite strains from geographically diverse regions. Compared to other genes, ROP18 showed extremely high levels of diversification and changes in expression level, which correlated with severity of infection in mice. Comparison with an out-group demonstrated that changes in the upstream region that regulates expression of ROP18 led to an historical increase in the expression and exposed the protein to diversifying selective pressure. Surprisingly, only three atypically distinct protein variants exist despite marked genetic divergence elsewhere in the genome. These three forms of ROP18 are likely adaptations for different niches in nature, and they confer markedly different virulence to mice. The widespread distribution of a single mouse-virulent allele among geographically and genetically disparate parasites may have consequences for transmission and disease in other hosts, including humans.