166 research outputs found

    EST Analysis of Ostreococcus lucimarinus, the Most Compact Eukaryotic Genome, Shows an Excess of Introns in Highly Expressed Genes

    Background: The genome of the pico-eukaryotic (bacterial-sized) prasinophyte green alga Ostreococcus lucimarinus has one of the highest gene densities known in eukaryotes, yet it contains many introns. Phylogenetic studies suggest this unusually compact genome (13.2 Mb) is an evolutionarily derived state among prasinophytes. The presence of introns in the highly reduced O. lucimarinus genome appears to be in opposition to simple explanations of genome evolution based on unidirectional tendencies, either neutral or selective. Therefore, patterns of intron retention in this species can potentially provide insights into the forces governing intron evolution. Methodology/Principal Findings: Here we studied intron features and levels of expression in O. lucimarinus using expressed sequence tags (ESTs) to annotate the current genome assembly. ESTs were assembled into unigene clusters that were mapped back to the O. lucimarinus Build 2.0 assembly using BLAST and the level of gene expression was inferred from the number of ESTs in each cluster. We find a positive correlation between expression levels and both intron number (R = +0.0893, p < 0.0005) and intron density (number of introns/kb of CDS; R = +0.0753, p < 0.005). Conclusions/Significance: In a species with a genome that has been recently subjected to a great reduction of non-coding DNA, these results imply the existence of selective/functional roles for introns that are principally detectable in highly expressed genes. In these cases, introns are likely maintained by balancing the selective forces favoring their maintenanc
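
The two quantities correlated in the abstract above, intron density (introns per kb of CDS) and expression level (ESTs per unigene cluster), can be illustrated with a minimal sketch. The function names and the toy gene records below are hypothetical and are not taken from the paper; the sketch uses a plain Pearson coefficient, which may differ from the statistic the authors actually computed.

```python
from math import sqrt


def intron_density(n_introns, cds_length_bp):
    """Introns per kilobase of coding sequence (CDS)."""
    if cds_length_bp <= 0:
        raise ValueError("CDS length must be positive")
    return n_introns / (cds_length_bp / 1000.0)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical toy records: (EST count, number of introns, CDS length in bp).
toy_genes = [(1, 0, 900), (3, 1, 1200), (8, 2, 1500), (20, 3, 1100)]

est_counts = [g[0] for g in toy_genes]
densities = [intron_density(g[1], g[2]) for g in toy_genes]
r = pearson_r(est_counts, densities)
print(f"r = {r:+.4f}")
```

With real data, each record would correspond to one unigene cluster mapped to the genome assembly, and a significance test would be needed alongside the coefficient.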

    The surprising negative correlation of gene length and optimal codon use - disentangling translational selection from GC-biased gene conversion in yeast

    Background: Surprisingly, in several multi-cellular eukaryotes optimal codon use correlates negatively with gene length. This contrasts with the expectation under selection for translational accuracy. While suggested explanations focus on variation in the strength and efficiency of translational selection, it has rarely been noticed that the negative correlation is reported only in organisms whose optimal codons are biased towards codons that end with G or C (-GC). This raises the question of whether forces that affect base composition, such as GC-biased gene conversion, contribute to the negative correlation between optimal codon use and gene length. Results: Yeast is a good organism in which to study this, as equal numbers of optimal codons end in -GC and -AT, so one can compare the frequencies of optimal GC-ending with optimal AT-ending codons to disentangle the forces. The results of this study demonstrate that in yeast the frequencies of GC-ending (optimal and non-optimal) codons decrease with gene length and increase with recombination. A decrease of GC-ending codons along genes contributes to the negative correlation with gene length. Correlations with recombination and gene expression differentiate between GC-ending and optimal codons, and substitution patterns also support effects of GC-biased gene conversion. Conclusion: While the general effect of GC-biased gene conversion is well known, the negative correlation of optimal codon use with gene length has not been considered in this context before. Initiation of gene conversion events in promoter regions and the presence of a gene conversion gradient most likely explain the observed decrease of GC-ending codons with gene length and gene position.

    Compare the differences of synonymous codon usage between the two species within cardiovirus

    Background: Cardioviruses are positive-strand RNA viruses in the Picornaviridae family that can cause enteric infection in rodents and have also been detected at lower frequencies in other mammals such as pigs and human beings. The Cardiovirus genus consists of two distinct species: Encephalomyocarditis virus (EMCV) and Theilovirus (ThV). There are many differences between the two species. In this study, the differences in codon usage between EMCV and ThV were compared. Results: The mean ENC values of EMCV and ThV are 54.86 and 51.08 respectively, both higher than 40. There are correlations between (C+G)12% and (C+G)3% for both EMCV and ThV (r = -0.736; r = 0.986, P < 0.01, respectively). For ThV, (C+G)12%, (C+G)3%, axis f'1 and axis f'2 were significantly correlated, but this was not the case for EMCV. According to the RSCU values, EMCV seemed to prefer U-, G- and C-ending codons, while ThV seemed to prefer U- and A-ending codons. However, in both species AGA for Arg, AUU for Ile, UCU for Ser, and GGA for Gly were chosen preferentially. Correspondence analysis detected one major trend in the first axis (f'1), which accounted for 22.89% of the total variation, and another major trend in the second axis (f'2), which accounted for 17.64% of the total variation. Plots of the same serotype appeared to fall in the same region of the coordinate plane. Conclusion: The overall extent of codon usage bias in both EMCV and ThV is low. Mutational pressure is the main factor that determines the codon usage bias, but the (C+G) content plays a more important role in codon usage bias for ThV than for EMCV. The synonymous codon usage pattern in both EMCV and ThV genes is gene function and geography specific, but not host specific. Serotype may be one factor affecting the codon bias for ThV, and location has no significant effect on the variations of synonymous codon usage in these virus genes.
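
The RSCU (relative synonymous codon usage) values referred to in the abstract above follow a standard definition: the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below illustrates that definition only; the function name, the use of the standard genetic code, and the toy sequence are assumptions and do not come from the paper.

```python
from collections import Counter

# Standard genetic code, built in TCAG order (stop codons marked '*').
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def rscu(coding_sequence):
    """Return {codon: RSCU} for every sense codon of each amino acid seen.

    RSCU = observed count * (number of synonymous codons) / total count
    for that amino acid. RNA input (U) is converted to DNA (T).
    """
    seq = coding_sequence.upper().replace("U", "T")
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    # Count only recognised sense codons; skip stops and ambiguous codons.
    counts = Counter(c for c in codons if CODON_TABLE.get(c, "*") != "*")

    values = {}
    for aa in set(CODON_TABLE.values()) - {"*"}:
        synonyms = [c for c, a in CODON_TABLE.items() if a == aa]
        total = sum(counts[c] for c in synonyms)
        if total == 0:
            continue  # amino acid absent from the sequence
        for c in synonyms:
            values[c] = counts[c] * len(synonyms) / total
    return values


# Hypothetical toy sequence: three GGA and one GGC codon (all glycine).
toy = rscu("GGAGGAGGAGGC")
print(toy["GGA"], toy["GGC"], toy["GGT"])
```

An RSCU above 1 indicates a codon used more often than expected under equal usage, which is the sense in which the abstract speaks of preferred codons.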

    Large introns in relation to alternative splicing and gene evolution: a case study of Drosophila bruno-3

    Background: Alternative splicing (AS) of maturing mRNA can generate structurally and functionally distinct transcripts from the same gene. Recent bioinformatic analyses of available genome databases inferred a positive correlation between intron length and AS. To study the interplay between intron length and AS empirically and in more detail, we analyzed the diversity of alternatively spliced transcripts (ASTs) in the Drosophila RNA-binding Bruno-3 (Bru-3) gene. This gene was known to encode thirteen exons separated by introns of diverse sizes, ranging from 71 to 41,973 nucleotides in D. melanogaster. Although Bru-3's structure is expected to be conducive to AS, only two ASTs of this gene were previously described. Results: Cloning of RT-PCR products of the entire ORF from four species representing three diverged Drosophila lineages provided an evolutionary perspective, high sensitivity, and long-range contiguity of splice choices currently unattainable by high-throughput methods. Consequently, we identified three new exons, a new exon fragment and thirty-three previously unknown ASTs of Bru-3. All exon-skipping events in the gene were mapped to the exons surrounded by introns of at least 800 nucleotides, whereas exons split by introns of less than 250 nucleotides were always spliced contiguously in mRNA. Cases of exon loss and creation during Bru-3 evolution in Drosophila were also localized within large introns. Notably, we identified a true de novo exon gain: exon 8 was created along the lineage of the obscura group from intronic sequence between cryptic splice sites conserved among all Drosophila species surveyed. Exon 8 was included in mature mRNA by the species representing all the major branches of the obscura group. To our knowledge, the origin of exon 8 is the first documented case of exonization of intronic sequence outside vertebrates. 
Conclusion: We found that large introns can promote AS via exon-skipping and exon turnover during evolution likely due to frequent errors in their removal from maturing mRNA. Large introns could be a reservoir of genetic diversity, because they have a greater number of mutable sites than short introns. Taken together, gene structure can constrain and/or promote gene evolution

    Estimation of Isolation Times of the Island Species in the Drosophila simulans Complex from Multilocus DNA Sequence Data

    Background: The Drosophila simulans species complex continues to serve as an important model system for the study of new species formation. The complex is comprised of the cosmopolitan species, D. simulans, and two island endemics, D. mauritiana and D. sechellia. A substantial amount of effort has gone into reconstructing the natural history of the complex, in part to infer the context in which functional divergence among the species has arisen. In this regard, a key parameter to be estimated is the initial isolation time (t) of each island species. Loci in regions of low recombination have lower divergence within the complex than do other loci, yet divergence from D. melanogaster is similar for both classes. This might reflect gene flow of the low-recombination loci subsequent to initial isolation, but it might also reflect differential effects of changing population size on the two recombination classes of loci when the low-recombination loci are subject to genetic hitchhiking or pseudohitchhiking. Methodology/Principal Findings: New DNA sequence variation data for 17 loci corroborate the prior observation from 13 loci that DNA sequence divergence is reduced in genes of low recombination. Two models are presented to estimate t and other relevant parameters (substitution rate correction factors in lineages leading to the island species and, in the case of the 4-parameter model, the ratio of ancestral to extant effective population size) from the multilocus DNA sequence data. Conclusions/Significance: In general, it appears that both island species were isolated at about the same time, here estimated at ~250,000 years ago. It also appears that the difference in divergence patterns of genes in regions of low an

    Evidence for Centromere Drive in the Holocentric Chromosomes of Caenorhabditis

    In monocentric organisms with asymmetric meiosis, the kinetochore proteins, such as CENH3 and CENP-C, evolve adaptively to counterbalance the deleterious effects of centromere drive, which is caused by the expansion of centromeric satellite repeats. The selection regimes that act on CENH3 and CENP-C genes have not been analyzed in organisms with holocentric chromosomes, although holocentrism is speculated to have evolved to suppress centromere drive. We tested both CENH3 and CENP-C for positive selection in several species of the holocentric genus Caenorhabditis using the maximum likelihood approach and sliding-window analysis. Although CENP-C did not show any signs of positive selection, positive selection has been detected in the case of CENH3. These results support the hypothesis that centromere drive occurs in Nematoda, at least in the telokinetic meiosis of Caenorhabditis

    Mutational Biases and Selective Forces Shaping the Structure of Arabidopsis Genes

    Recently, features of gene expression profiles have been associated with structural parameters of gene sequences in organisms representing a diverse set of taxa. The emerging picture indicates that natural selection, mediated by gene expression profiles, has a significant role in determining genic structures. However, the current situation is less clear in plants, as the available data indicate that the effect of natural selection mediated by gene expression is very weak. Moreover, the direction of the patterns in plants appears to contradict those observed in animal genomes. In the present work we analyzed expression data for >18000 Arabidopsis genes retrieved from public datasets obtained with different technologies (MPSS and high density chip arrays) and compared them with gene parameters. Our results show that the impact of natural selection mediated by expression on gene sequences is significant and distinguishable from the effects of regional mutational biases. In addition, we provide evidence that the level and the breadth of gene expression are related in opposite ways to many structural parameters of gene sequences. Higher levels of expression abundance are associated with smaller transcripts, consistent with the need to reduce costs of both transcription and translation. Expression breadth, however, shows a contrasting pattern, i.e. longer genes have higher breadth of expression, possibly to ensure those structural features associated with gene plasticity. Based on these results, we propose that the specific balance between these two selective forces plays a significant role in shaping the structure of Arabidopsis genes

    Recombination Drives Vertebrate Genome Contraction

    Selective and/or neutral processes may govern variation in DNA content and, ultimately, genome size. The observation in several organisms of a negative correlation between recombination rate and intron size could be compatible with a neutral model in which recombination is mutagenic for length changes. We used whole-genome data on small insertions and deletions within transposable elements from chicken and zebra finch to demonstrate clear links between recombination rate and a number of attributes of reduced DNA content. Recombination rate was negatively correlated with the length of introns, transposable elements, and intergenic spacer and with the rate of short insertions. Importantly, it was positively correlated with gene density, the rate of short deletions, the deletion bias, and the net change in sequence length. All these observations point to a pattern of more condensed genome structure in regions of high recombination. Based on the observed rates of small insertions and deletions and assuming that these rates are representative for the whole genome, we estimate that the genome of the most recent common ancestor of birds and lizards has lost nearly 20% of its DNA content up until the present. Expansion of transposable elements can counteract the effect of deletions in an equilibrium mutation model; however, since the activity of transposable elements has been low in the avian lineage, the deletion bias is likely to have had a significant effect on genome size evolution in dinosaurs and birds, contributing to the maintenance of a small genome. We also demonstrate that most of the observed correlations between recombination rate and genome contraction parameters are seen in the human genome, including for segregating indel polymorphisms. Our data are compatible with a neutral model in which recombination drives vertebrate genome size evolution and give no direct support for a role of natural selection in this process

    Primula vulgaris (primrose) genome assembly, annotation and gene expression, with comparative genomics on the heterostyly supergene

    Primula vulgaris (primrose) exhibits heterostyly: plants produce self-incompatible pin- or thrum-form flowers, with anthers and stigma at reciprocal heights. Darwin concluded that this arrangement promotes insect-mediated cross-pollination; later studies revealed control by a cluster of genes, or supergene, known as the S (Style length) locus. The P. vulgaris S locus is absent from pin plants and hemizygous in thrum plants (thrum-specific); mutation of S locus genes produces self-fertile homostyle flowers with anthers and stigma at equal heights. Here, we present a 411 Mb P. vulgaris genome assembly of a homozygous inbred long homostyle, representing ~87% of the genome. We annotate over 24,000 P. vulgaris genes, and reveal more genes up-regulated in thrum than pin flowers. We show reduced genomic read coverage across the S locus in other Primula species, including P. veris, where we define the conserved structure and expression of the S locus genes in thrum. Further analysis reveals the S locus has elevated repeat content (64%) compared to the wider genome (37%). Our studies suggest conservation of S locus genetic architecture in Primula, and provide a platform for identification and evolutionary analysis of the S locus and downstream targets that regulate heterostyly in diverse heterostylous species

    Both Size and GC-Content of Minimal Introns Are Selected in Human Populations

    Background: We previously studied insertion and deletion polymorphism by sequencing no more than one hundred introns in a mixed human population and found that the minimal introns tended to maintain length at an optimal size. Here we analyzed 179 re-sequenced individual genomes (from African, European, and Asian populations) from the data released by the 1000 Genomes Project to study the size dynamics of minimal introns. Principal Findings: We not only confirmed that minimal introns in human populations are selected but also found two major effects in minimal intron evolution: (i) Size-effect: minimal introns longer than an optimal size (87 nt) tend to have a higher ratio of deletion to insertion than those that are shorter than the optimal size; (ii) GC-effect: minimal introns with lower GC content tend to be more frequently deleted than those with higher GC content. The GC-effect results in a higher GC content in minimal introns than in their flanking exons, as opposed to larger introns (≥125 nt) that always have a lower GC content than that of their flanking exons. We also observed that the two effects are distinguishable but not completely separable within and between populations. Conclusions: We validated the unique mutation dynamics of minimal introns in keeping their near-optimal size and GC content, and our observations suggest potentially important functions of human minimal introns in transcript processin