32 research outputs found

    3D RNA-seq:A powerful and flexible tool for rapid and accurate differential expression and alternative splicing analysis of RNA-seq data for biologists

    RNA-sequencing (RNA-seq) analysis of gene expression and alternative splicing should be routine and robust but is often a bottleneck for biologists because of different and complex analysis programs and reliance on specialized bioinformatics skills. We have developed the ‘3D RNA-seq’ App, an R shiny App and web-based pipeline for the comprehensive analysis of RNA-seq data from any organism. It represents an easy-to-use, flexible and powerful tool for analysis of both gene and transcript-level gene expression to identify differential gene/transcript expression, differential alternative splicing and differential transcript usage (3D) as well as isoform switching from RNA-seq data. 3D RNA-seq integrates state-of-the-art differential expression analysis tools and adopts best practice for RNA-seq analysis. The program is designed to be run by biologists with minimal bioinformatics experience (or by bioinformaticians) allowing lab scientists to analyse their RNA-seq data. It achieves this by operating through a user-friendly graphical interface which automates the data flow through the programs in the pipeline. The comprehensive analysis performed by 3D RNA-seq is extremely rapid and accurate, can handle complex experimental designs, allows user setting of statistical parameters, visualizes the results through graphics and tables, and generates publication quality figures such as heat-maps, expression profiles and GO enrichment plots. The utility of 3D RNA-seq is illustrated by analysis of data from a time-series of cold-treated Arabidopsis plants and from dexamethasone-treated male and female mouse cortex and hypothalamus data identifying dexamethasone-induced sex- and brain region-specific differential gene expression and alternative splicing

    Cold-Dependent Expression and Alternative Splicing of Arabidopsis Long Non-coding RNAs

    Plants re-program their gene expression when responding to changing environmental conditions. Besides differential gene expression, extensive alternative splicing (AS) of pre-mRNAs and changes in expression of long non-coding RNAs (lncRNAs) are associated with stress responses. RNA-sequencing of a diel time-series of the initial response of Arabidopsis thaliana rosettes to low temperature showed massive and rapid waves of both transcriptional and AS activity in protein-coding genes. We exploited the high diversity of transcript isoforms in AtRTD2 to examine regulation and post-transcriptional regulation of lncRNA gene expression in response to cold stress. We identified 135 lncRNA genes with cold-dependent differential expression (DE) and/or differential alternative splicing (DAS) of lncRNAs including natural antisense RNAs, sORF lncRNAs, and precursors of microRNAs (miRNAs) and trans-acting small-interfering RNAs (tasiRNAs). The high resolution (HR) of the time-series allowed the dynamics of changes in transcription and AS to be determined and identified early and adaptive transcriptional and AS changes in the cold response. Some lncRNA genes were regulated only at the level of AS and using plants grown at different temperatures and a HR time-course of the first 3 h of temperature reduction, we demonstrated that the AS of some lncRNAs is highly sensitive to small temperature changes suggesting tight regulation of expression. In particular, a splicing event in TAS1a which removed an intron that contained the miR173 processing and phased siRNAs generation sites was differentially alternatively spliced in response to cold. The cold-induced reduction of the spliced form of TAS1a and of the tasiRNAs suggests that splicing may enhance production of the siRNAs. Our results identify candidate lncRNAs that may contribute to the regulation of expression that determines the physiological processes essential for acclimation and freezing tolerance

    How does temperature affect splicing events? Isoform switching of splicing factors regulates splicing of LATE ELONGATED HYPOCOTYL (LHY)

    One of the ways in which plants can respond to temperature is via alternative splicing (AS). Previous work showed that temperature changes affected the splicing of several circadian clock gene transcripts. Here we investigated the role of RNA‐binding splicing factors (SFs) in temperature‐sensitive alternative splicing (AS) of the clock gene LATE ELONGATED HYPOCOTYL (LHY). We characterised, in wild type plants, temperature‐associated isoform switching and expression patterns for SF transcripts from a high‐resolution temperature and time series RNA‐seq experiment. In addition we employed quantitative RT‐PCR of SF mutant plants to explore the role of the SFs in cooling‐associated AS of LHY. We show that the splicing and expression of several SFs responds sufficiently rapidly and sensitively to temperature changes to contribute to the splicing of the 5’UTR of LHY. Moreover the choice of splice site in LHY was altered in some SF mutants. The splicing of the 5’UTR region of LHY has characteristics of a molecular thermostat, where the ratio of transcript isoforms is sensitive to temperature changes as modest as 2°C and is scalable over a wide dynamic range of temperature. Our work provides novel insight into SF‐mediated coupling of the perception of temperature to post‐transcriptional regulation of the clock

    AtRTD - a comprehensive reference transcript dataset resource for accurate quantification of transcript-specific expression in Arabidopsis thaliana

    RNA-sequencing (RNA-seq) allows global gene expression analysis at the individual transcript level. Accurate quantification of transcript variants generated by alternative splicing (AS) remains a challenge. We have developed a comprehensive, nonredundant Arabidopsis reference transcript dataset (AtRTD) containing over 74 000 transcripts for use with algorithms to quantify AS transcript isoforms in RNA-seq. The AtRTD was formed by merging transcripts from TAIR10 and novel transcripts identified in an AS discovery project. We have estimated transcript abundance in RNA-seq data using the transcriptome-based alignment-free programmes Sailfish and Salmon and have validated quantification of splicing ratios from RNA-seq by high resolution reverse transcription polymerase chain reaction (HR RT-PCR). Good correlations between splicing ratios from RNA-seq and HR RT-PCR were obtained demonstrating the accuracy of abundances calculated for individual transcripts in RNA-seq. The AtRTD is a resource that will have immediate utility in analysing Arabidopsis RNA-seq data to quantify differential transcript abundance and expression.
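
The comparison of splicing ratios from RNA-seq with those from HR RT-PCR can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes a splicing ratio is an isoform's abundance divided by the summed abundance of the isoforms being compared, and the function names and all numbers are hypothetical.

```python
from math import sqrt


def splicing_ratios(abundances):
    """Convert per-isoform abundances (e.g. TPM) for one gene into ratios.

    Assumption: ratio = isoform abundance / total abundance of the isoforms
    being compared. Returns zeros when nothing is expressed.
    """
    total = sum(abundances.values())
    if total == 0:
        return {name: 0.0 for name in abundances}
    return {name: value / total for name, value in abundances.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical values for illustration only (not data from the study).
rnaseq = splicing_ratios({"isoform_1": 30.0, "isoform_2": 10.0})
rnaseq_ratios = [0.75, 0.40, 0.10, 0.55]  # one ratio per splicing event
rtpcr_ratios = [0.70, 0.45, 0.15, 0.50]   # matching HR RT-PCR ratios
r = pearson(rnaseq_ratios, rtpcr_ratios)
```

A correlation close to 1 between the two lists would indicate that the abundances estimated from RNA-seq reproduce the ratios measured experimentally.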

    REVEILLE2 thermosensitive splicing: a molecular basis for the integration of nocturnal temperature information by the Arabidopsis circadian clock

    • Cold stress is one of the major environmental factors that limit growth and yield of plants. However, it is still not fully understood how plants account for daily temperature fluctuations, nor how these temperature changes are integrated with other regulatory systems such as the circadian clock.
    • We demonstrate that REVEILLE2 undergoes alternative splicing after chilling that increases accumulation of a transcript isoform encoding a MYB-like transcription factor. We explore the biological function of REVEILLE2 in Arabidopsis thaliana using a combination of molecular genetics, transcriptomics, and physiology.
    • Disruption of REVEILLE2 alternative splicing alters regulatory gene expression, impairs circadian timing, and improves photosynthetic capacity. Changes in nuclear gene expression are particularly apparent in the initial hours following chilling, with chloroplast gene expression subsequently up-regulated.
    • The response of REVEILLE2 to chilling extends our understanding of plants' immediate response to cooling. We propose that the circadian component REVEILLE2 restricts plants' responses to nocturnal reductions in temperature, thereby enabling appropriate responses to daily environmental changes

    A high quality Arabidopsis transcriptome for accurate transcript-level analysis of alternative splicing

    Alternative splicing generates multiple transcript and protein isoforms from the same gene and thus is important in gene expression regulation. To date, RNA-sequencing (RNA-seq) is the standard method for quantifying changes in alternative splicing on a genome-wide scale. Understanding the current limitations of RNA-seq is crucial for reliable analysis and the lack of high quality, comprehensive transcriptomes for most species, including model organisms such as Arabidopsis, is a major constraint in accurate quantification of transcript isoforms. To address this, we designed a novel pipeline with stringent filters and assembled a comprehensive Reference Transcript Dataset for Arabidopsis (AtRTD2) containing 82,190 non-redundant transcripts from 34,212 genes. Extensive experimental validation showed that AtRTD2 and its modified version, AtRTD2-QUASI, for use in Quantification of Alternatively Spliced Isoforms, outperform other available transcriptomes in RNA-seq analysis. This strategy can be implemented in other species to build a pipeline for transcript-level expression and alternative splicing analyses

    Evaluation and improvement of the regulatory inference for large co-expression networks with limited sample size

    Background: Co-expression has been widely used to identify novel regulatory relationships using high throughput measurements, such as microarray and RNA-seq data. Evaluation studies on co-expression network analysis methods mostly focus on networks of small or medium size of up to a few hundred nodes. For large networks, simulated expression data usually consist of hundreds or thousands of profiles with different perturbations or knock-outs, which is uncommon in real experiments due to their cost and the amount of work required. Thus, the performance of co-expression network analysis methods on large co-expression networks consisting of a few thousand nodes, with only a small number of profiles with a single perturbation, which more accurately reflects normal experimental conditions, is generally uncharacterized.
    Methods: We propose a novel network inference method based on Relevance Low order Partial Correlation (RLowPC). The RLowPC method uses a two-step approach to select high-confidence edges: it first reduces the search space by picking only the top-ranked genes from an initial partial correlation analysis, and then computes the partial correlations in the confined search space by removing only the linear dependencies from the shared neighbours, largely ignoring the genes showing lower association.
    Results: We selected six co-expression-based methods with good performance in evaluation studies from the literature: Partial correlation, PCIT, ARACNE, MRNET, MRNETB and CLR. The evaluation of these methods was carried out on simulated time-series data with various network sizes ranging from 100 to 3000 nodes. Simulation results show low precision and recall for all of the above methods for large networks with a small number of expression profiles. We improved the inference significantly by refinement of the top weighted edges in the pre-inferred partial correlation networks using RLowPC. We found improved performance by partitioning large networks into smaller co-expressed modules when assessing the method performance within these modules.
    Conclusions: The evaluation results show that current methods suffer from low precision and recall for large co-expression networks where only a small number of profiles are available. The proposed RLowPC method effectively reduces the indirect edges predicted as regulatory relationships and increases the precision of top ranked predictions. Partitioning large networks into smaller highly co-expressed modules also helps to improve the performance of network inference methods. The RLowPC R package for network construction, refinement and evaluation is available at GitHub: https://github.com/wyguo/RLowPC
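
The idea of removing a shared neighbour's linear influence from a pairwise correlation can be illustrated with a first-order partial correlation. This is a minimal sketch under stated assumptions, not the RLowPC package's implementation: it conditions on a single third gene using the standard first-order formula, and the function names and expression values are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def first_order_partial(r_xy, r_xz, r_yz):
    """Partial correlation of x and y given z, from pairwise correlations.

    Standard formula:
        (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))
    """
    denom = sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    if denom == 0:
        raise ValueError("undefined when x or y is perfectly correlated with z")
    return (r_xy - r_xz * r_yz) / denom


def partial_correlation(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    return first_order_partial(pearson(x, y), pearson(x, z), pearson(y, z))


# Hypothetical expression profiles: genes x and y both follow regulator z.
z = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x = [1.1, 1.9, 3.2, 3.8, 5.1, 5.9]
y = [1.1, 2.1, 2.9, 3.9, 5.2, 5.8]
raw = pearson(x, y)
partial = partial_correlation(x, y, z)
```

In this toy example x and y are both driven by z, so their raw correlation is high while the partial correlation given z is much lower. An edge between x and y of this kind is the sort of indirect relationship that a partial-correlation refinement is meant to down-weight.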

    Experimental Design for Time-Series RNA-Seq Analysis of Gene Expression and Alternative Splicing

    RNA-sequencing (RNA-seq) is currently the method of choice for analysis of differential gene expression. To fully exploit the wealth of data generated from genome-wide transcriptomic approaches, the initial design of the experiment is of paramount importance. Biological rhythms in nature are pervasive and are driven by endogenous gene networks collectively known as circadian clocks. Measuring circadian gene expression requires time-course experiments which take into account time-of-day factors influencing variability in expression levels. We describe here an approach for characterizing diurnal changes in expression and alternative splicing for plants undergoing cooling. The method uses inexpensive everyday laboratory equipment and utilizes an RNA-seq application (3D RNA-seq) that can handle complex experimental designs and requires little or no prior bioinformatics expertise

    Additional file 1: of Evaluation and improvement of the regulatory inference for large co-expression networks with limited sample size

    File contains additional Figures and Tables. Figure S1. Bar plots of pAUROC values for top 1000 edge predictions. Figure S2. Bar plots of pAUROC values of top 1000 predictions for GNW3000 module-based. Figure S3. GNW settings for data simulation. Figure S4. Examples of evaluation results. Table S1. Summaries of evaluation of gene network inference methods. Table S2. R packages used to construct and evaluate GRNs. (DOCX 1867 kb)