4,987 research outputs found

    Exploration of alternative splicing events in ten different grapevine cultivars

    Background: The complex dynamics of gene regulation in plants are still far from being fully understood. Among the many factors involved, alternative splicing (AS) is one of the least well documented. For many years, AS was considered less relevant in plants than in animals; however, since the introduction of next-generation sequencing techniques, the number of plant genes believed to be alternatively spliced has risen sharply. Results: Here, we performed comprehensive high-throughput transcript sequencing of ten different grapevine cultivars, which resulted in the first high-coverage atlas of the grape berry transcriptome. We also developed findAS, a software tool for the analysis of alternatively spliced junctions. We demonstrate that at least 44 % of multi-exonic genes undergo AS and that a large number of low-abundance splice variants are present among the 131,622 splice junctions we annotated from Pinot noir. Conclusions: Our analysis shows that ~70 % of AS events have relatively low expression levels; furthermore, alternative splice sites seem to be enriched near the constitutive ones to some extent, reflecting the noise of the splicing mechanism. However, AS seems to be extensively conserved among the 10 cultivars.

    N-terminal proteomics assisted profiling of the unexplored translation initiation landscape in Arabidopsis thaliana

    Proteogenomics is an emerging research field that still lacks a uniform method of analysis. Proteogenomic studies that combine N-terminal proteomics and ribosome profiling suggest that a high number of protein start sites are currently missing from genome annotations. We constructed a proteogenomic pipeline specific for the analysis of N-terminal proteomics data, with the aim of discovering novel translational start sites outside annotated protein-coding regions. In summary, unidentified MS/MS spectra were matched to a specific N-terminal peptide library encompassing protein N termini encoded in the Arabidopsis thaliana genome. After stringent false discovery rate filtering, 117 protein N termini compliant with N-terminal methionine excision specificity and indicative of translation initiation were found. These include N-terminal protein extensions and translation from transposable elements and pseudogenes. Gene prediction provided supporting protein-coding models for approximately half of the protein N termini. Besides the prediction of functional domains (partially) contained within the newly predicted ORFs, further supporting evidence of translation was found in the recently released Araport11 genome re-annotation of Arabidopsis and in computational translations of sequences stored in public repositories. Most interestingly, complementary evidence from ribosome profiling was found for 23 protein N termini. Finally, an in silico analysis of protein N-terminal peptides demonstrates the applicability of our N-terminal proteogenomics strategy in revealing protein-coding potential in species with well- and poorly annotated genomes.

    Natural antisense transcripts with coding capacity in Arabidopsis may have a regulatory role that is not linked to double-stranded RNA degradation

    BACKGROUND: Overlapping transcripts in antisense orientation have the potential to form double-stranded RNA (dsRNA), a substrate for a number of different RNA-modification pathways. One prominent route for dsRNA is its breakdown by Dicer enzyme complexes into small RNAs, a pathway that is widely exploited by RNA interference technology to inactivate defined genes in transgenic lines. The significance of this pathway for endogenous gene regulation remains unclear. RESULTS: We have examined transcription data for overlapping gene pairs in Arabidopsis thaliana. On the basis of an analysis of transcripts with coding regions, we find the majority of overlapping gene pairs to be convergently overlapping pairs (COPs), with the potential for dsRNA formation. In all tissues, COP transcripts are present at a higher frequency compared to the overall gene pool. The probability that both the sense and antisense copy of a COP are co-transcribed matches the theoretical value for coexpression under the assumption that the expression of one partner does not affect the expression of the other. Among COPs, we observe an over-representation of spliced (intron-containing) genes (90%) and of genes with alternatively spliced transcripts. For loci where antisense transcripts overlap with sense transcript introns, we also find a significant bias in favor of alternative splicing and variation of polyadenylation. CONCLUSION: The results argue against a predominant RNA degradation effect induced by dsRNA formation. Instead, our data support alternative roles for dsRNAs. They suggest that at least for a subgroup of COPs, antisense expression may induce alternative splicing or polyadenylation
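
    The independence comparison described in this abstract can be illustrated with a minimal sketch. It is not the authors' analysis: the function name `coexpression_rates` and the input format (one pair of presence/absence calls per COP) are assumptions made for illustration. Under independence, the expected co-expression frequency is the product of the two marginal expression frequencies, P(sense and antisense) = P(sense) × P(antisense).

```python
from typing import Iterable, Tuple


def coexpression_rates(calls: Iterable[Tuple[bool, bool]]) -> Tuple[float, float]:
    """Return (observed, expected) co-expression frequencies.

    `calls` holds one hypothetical (sense_expressed, antisense_expressed)
    pair per convergently overlapping gene pair (COP). The expected value
    assumes the two partners are expressed independently of each other.
    """
    calls = list(calls)
    n = len(calls)
    if n == 0:
        raise ValueError("at least one gene pair is required")

    # Marginal frequencies of expression for each partner.
    p_sense = sum(1 for sense, _ in calls if sense) / n
    p_antisense = sum(1 for _, antisense in calls if antisense) / n

    # Fraction of pairs in which both partners are expressed.
    observed = sum(1 for sense, antisense in calls if sense and antisense) / n

    # Expected fraction if one partner does not influence the other.
    expected = p_sense * p_antisense
    return observed, expected


if __name__ == "__main__":
    toy_calls = [(True, True), (True, False), (False, True), (False, False)]
    observed, expected = coexpression_rates(toy_calls)
    print(f"observed={observed:.2f} expected={expected:.2f}")
```

    An observed frequency well below the expected one would be consistent with dsRNA-triggered degradation of co-expressed partners; the abstract reports that the two values match.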

    SpliceGrapher: detecting patterns of alternative splicing from RNA-Seq data in the context of gene models and EST data

    We propose a method for predicting splice graphs that enhances curated gene models using evidence from RNA-Seq and EST alignments. Results obtained using RNA-Seq experiments in Arabidopsis thaliana show that predictions made by our SpliceGrapher method are more consistent with current gene models than predictions made by TAU and Cufflinks. Furthermore, analysis of plant and human data indicates that the machine learning approach used by SpliceGrapher is useful for discriminating between real and spurious splice sites, and can improve the reliability of detection of alternative splicing. SpliceGrapher is available for download at http://SpliceGrapher.sf.net

    Deciphering the Plant Splicing Code: Experimental and Computational Approaches for Predicting Alternative Splicing and Splicing Regulatory Elements

    Extensive alternative splicing (AS) of precursor mRNAs (pre-mRNAs) in multicellular eukaryotes increases the protein-coding capacity of a genome and allows novel ways to regulate gene expression. In flowering plants, up to 48% of intron-containing genes exhibit AS. However, the full extent of AS in plants is not yet known, as only a few high-throughput RNA-Seq studies have been performed. As the cost of obtaining RNA-Seq reads continues to fall, it is anticipated that huge amounts of plant sequence data will accumulate and help in obtaining a more complete picture of AS in plants. Although it is not an onerous task to obtain hundreds of millions of reads using high-throughput sequencing technologies, computational tools to accurately predict and visualize AS are still being developed and refined. This review will discuss the tools to predict and visualize transcriptome-wide AS in plants using short-reads and highlight their limitations. Comparative studies of AS events between plants and animals have revealed that there are major differences in the most prevalent types of AS events, suggesting that plants and animals differ in the way they recognize exons and introns. Extensive studies have been performed in animals to identify cis-elements involved in regulating AS, especially in exon skipping. However, few such studies have been carried out in plants. Here, we review the current state of research on splicing regulatory elements (SREs) and briefly discuss emerging experimental and computational tools to identify cis-elements involved in regulation of AS in plants. The availability of curated alternative splice forms in plants makes it possible to use computational tools to predict SREs involved in AS regulation, which can then be verified experimentally. Such studies will permit identification of plant-specific features involved in AS regulation and contribute to deciphering the splicing code in plants

    Assessing the impact of alternative splicing on the diversity and evolution of the proteome in plants

    Splicing is one of the key processing steps during the maturation of a gene’s primary transcript into the mRNA molecule used as a template for protein production. Splicing involves the removal of segments called introns and the re-joining of the remaining segments, called exons. It is by now well established that the same segments are not always removed from a gene’s primary transcript during splicing. The consequence of this splicing variation, termed Alternative Splicing (AS), is that multiple distinct mature mRNA molecules can be produced from a single gene. One of the two biological roles ascribed to AS is that of a mechanism which enables an organism to produce multiple functionally distinct proteins from a single gene. Alternatively, AS can serve as a means of controlling gene expression at the post-transcriptional level. Although many clear examples have been reported for both roles, the extent to which AS increases the functional diversity of the proteome, regulates gene expression or simply reflects noise in the splicing machinery is not well known. Determining the full functional impact of AS by designing and performing wet-lab experiments for all AS events is infeasible, and bioinformatics approaches have therefore been widely used for studying the impact of AS at a genome-wide scale. In this thesis, four bioinformatics studies are presented that were aimed at determining the extent to which AS is used in plants as a mechanism for producing multiple distinct functional proteins from a single gene. Each chapter uses a different method for analyzing specific properties of AS. Under the premise that functional genetic features are more likely to be conserved than non-functional ones, AS events that are present in two or more species are more likely to be biologically relevant than those that are confined to a single species.
In chapter 2 we analyzed the conservation of AS by performing a comparative analysis between three divergent plant species. The results of that study indicated that the vast majority of AS events do not persist over long periods of evolution. We concluded, based on this lack of conservation, that AS has only a limited impact on the functional diversity of the proteome in plants. Following this conclusion, it can be hypothesized that the variation that AS induces at the transcriptome level is not likely to be manifested at the protein level. In chapter 3 we tested this hypothesis by analyzing two independent proteomics datasets. This type of data can be used to directly identify proteins present in a biological sample. Our results indicated that the variation induced by AS at the transcriptome level is also manifested at the protein level. We concluded that many AS events either have a confined, species-specific (not conserved) function or simply produce protein variants that are stable enough to escape rapid turnover. Another way to determine whether AS increases the functional diversity of the proteome is to test whether protein sequence variations that are typically induced by AS are common within the plant kingdom. We found (chapter 4) that this is not the case in plants and concluded that novel functions do not frequently arise through AS. We also found that most of the AS-induced variation is lost, as is the case for redundant gene copies, within a very short evolutionary time period. One limitation of genome-wide analyses is that they capture only the more general patterns. However, the functional impact of AS can be very different in different genes or gene families. In order to fully assess the functional impact of AS, it is therefore important to also study the process within the functional context of individual genes or gene families. In chapter 5 we demonstrated this concept by performing a detailed analysis of AS within the MADS-box gene family.
We were able to provide clues as to how AS might impact the protein-protein interaction capabilities of individual MADS proteins. Some of our predictions were supported by experimental evidence. We further showed how AS can serve as an evolutionary mechanism for experimenting with novel functions (novel interactions) without the explicit loss of existing functions. The overall conclusion, based on the analyses performed, is as follows: AS is primarily a consequence of noise in the splicing machinery and results in an increased diversity of the proteome. However, only a small fraction of the proteins resulting from AS will have beneficial functions and be subsequently selected for during evolution. The large remaining fraction is, as is the case for redundant gene copies, lost within a very short evolutionary time period after its emergence.

    Genome-wide cloning and sequence analysis of leucine-rich repeat receptor-like protein kinase genes in Arabidopsis thaliana

    Background: Transmembrane receptor kinases play critical roles in both animal and plant signaling pathways regulating growth, development, differentiation, cell death, and pathogenic defense responses. In Arabidopsis thaliana, there are at least 223 Leucine-rich repeat receptor-like kinases (LRR-RLKs), representing one of the largest protein families. Although functional roles for a handful of LRR-RLKs have been revealed, the functions of the majority of members in this protein family have not been elucidated. Results: As a resource for the in-depth analysis of this important protein family, the complementary DNA sequences (cDNAs) of 194 LRR-RLKs were cloned into the Gateway® donor vector pDONR/Zeo® and analyzed by DNA sequencing. Among them, 157 clones showed sequences identical to the predictions in the Arabidopsis sequence resource, TAIR8. The other 37 cDNAs showed gene structures distinct from the predictions of TAIR8, which was mainly caused by alternative splicing of pre-mRNA. Most of the genes have been further cloned into Gateway® destination vectors with GFP or FLAG epitope tags and have been transformed into Arabidopsis for in planta functional analysis. All clones from this study have been submitted to the Arabidopsis Biological Resource Center (ABRC) at Ohio State University for full accessibility by the Arabidopsis research community. Conclusions: Most of the Arabidopsis LRR-RLK genes have been isolated and the sequence analysis showed a number of alternatively spliced variants. The generated resources, including cDNA entry clones, expression constructs and transgenic plants, will facilitate further functional analysis of the members of this important gene family.