
    rMAPS: RNA map analysis and plotting server for alternative exon regulation.

    RNA-binding proteins (RBPs) play a critical role in the regulation of alternative splicing (AS), a prevalent mechanism for generating transcriptomic and proteomic diversity in eukaryotic cells. Studies have shown that AS can be regulated by RBPs in a binding-site-position-dependent manner: depending on where an RBP binds, splicing of an alternative exon can be enhanced or suppressed. Therefore, spatial analyses of RBP motifs and binding sites around alternative exons will help elucidate splicing regulation by RBPs. The development of high-throughput sequencing technologies has allowed transcriptome-wide analyses of AS and RBP-RNA interactions. Given a set of differentially regulated alternative exons obtained from RNA sequencing (RNA-seq) experiments, the rMAPS web server (http://rmaps.cecsresearch.org) performs motif analyses of RBPs in the vicinity of alternatively spliced exons and creates RNA maps that depict the spatial patterns of RBP motifs. Similarly, rMAPS can also perform spatial analyses of RBP-RNA binding sites identified by cross-linking immunoprecipitation sequencing (CLIP-seq) experiments. We anticipate rMAPS will be a useful tool for elucidating RBP regulation of alternative exon splicing using high-throughput sequencing data.
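The idea of an RNA map, a positional profile of motif occurrences around a set of exons, can be illustrated with a minimal sketch. This is not the rMAPS implementation; the function name `motif_position_profile`, the exact-match motif search, and the toy sequences are assumptions made only for illustration.

```python
def motif_position_profile(sequences, motif):
    """Fraction of sequences in which `motif` starts at each position.

    Assumes all sequences are flanking regions of equal length, aligned
    to the same reference point (for example, a splice site).
    """
    if not sequences:
        raise ValueError("at least one sequence is required")
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same length")

    k = len(motif)
    profile = []
    for start in range(length - k + 1):
        # Count sequences with an exact motif match at this offset.
        hits = sum(1 for seq in sequences if seq[start:start + k] == motif)
        profile.append(hits / len(sequences))
    return profile


# Hypothetical toy input: three aligned 8-nt flanking regions.
flanks = ["AUGCAUGU", "GGGCAUGA", "UUUUUUUU"]
profile = motif_position_profile(flanks, "GCAUG")
print(profile)
```

Comparing such profiles between enhanced, suppressed, and unregulated exons is what reveals position-dependent regulation; a practical tool would also smooth over sliding windows and test for enrichment.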

    On the utility of RNA sample pooling to optimize cost and statistical power in RNA sequencing experiments

    Background: In gene expression studies, RNA sample pooling is sometimes considered because of budget constraints or lack of sufficient input material. Using microarray technology, RNA sample pooling strategies have been reported to optimize both the cost of data generation and the statistical power for differential gene expression (DGE) analysis. For RNA sequencing, with its different quantitative output in terms of counts and tunable dynamic range, the adequacy and empirical validation of RNA sample pooling strategies have not yet been evaluated. In this study, we comprehensively assessed the utility of pooling strategies in RNA-seq experiments using empirical and simulated RNA-seq datasets. Results: The data generating model in pooled experiments is defined mathematically to evaluate the mean and variability of gene expression estimates. The model is further used to examine the trade-off between the statistical power of testing for DGE and the data generating costs. Empirical assessment of pooling strategies is done through analysis of RNA-seq datasets under various pooling and non-pooling experimental settings. A simulation study is also used to rank experimental scenarios with respect to the rate of false and true discoveries in DGE analysis. The results demonstrate that pooling strategies in RNA-seq studies can be both cost-effective and powerful when the number of pools, pool size and sequencing depth are optimally defined. Conclusion: For high within-group gene expression variability, small RNA sample pools are effective to reduce the variability and compensate for the loss of the number of replicates. Unlike the typical cost-saving strategies, such as reducing sequencing depth or number of RNA samples (replicates), an adequate pooling strategy is effective in maintaining the power of testing DGE for genes with low to medium abundance levels, along with a substantial reduction of the total cost of the experiment. In general, pooling RNA samples or pooling RNA samples in conjunction with moderate reduction of the sequencing depth can be good options to optimize the cost and maintain the power.
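The trade-off between variability and cost described in this abstract can be sketched with a simple additive variance model. This is an assumption for illustration, not the data generating model defined in the paper: biological variance is taken to shrink by the pool size, each library adds independent technical variance, and all variance and cost values below are hypothetical.

```python
def group_mean_variance(n_pools, pool_size, bio_var, tech_var):
    """Variance of a group's mean expression estimate.

    Assumed model: each pool averages `pool_size` individuals, so the
    biological variance is divided by `pool_size`; each sequenced
    library then adds independent technical variance.
    """
    if n_pools < 1 or pool_size < 1:
        raise ValueError("n_pools and pool_size must be positive")
    return (bio_var / pool_size + tech_var) / n_pools


def experiment_cost(n_pools, pool_size, cost_per_sample, cost_per_library):
    """Total cost: one RNA extraction per individual, one library per pool."""
    return n_pools * pool_size * cost_per_sample + n_pools * cost_per_library


# Hypothetical designs: (number of libraries, individuals per library).
designs = {
    "8 libraries, unpooled": (8, 1),
    "4 pools of 2": (4, 2),
    "4 libraries, unpooled": (4, 1),
}
for name, (n_pools, pool_size) in designs.items():
    var = group_mean_variance(n_pools, pool_size, bio_var=1.0, tech_var=0.1)
    cost = experiment_cost(n_pools, pool_size, cost_per_sample=50, cost_per_library=300)
    print(f"{name}: variance={var:.4f}, cost={cost}")
```

Under these assumed values, four pools of two individuals give a variance close to that of eight unpooled libraries at a much lower cost, whereas simply dropping to four unpooled libraries doubles the variance. The benefit depends on biological variance dominating technical variance.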

    A High-Throughput Method for Illumina RNA-Seq Library Preparation.

    With the introduction of cost-effective, rapid, and superior quality next generation sequencing techniques, gene expression analysis has become viable for labs conducting small projects as well as large-scale gene expression analysis experiments. However, the available protocols for construction of RNA-sequencing (RNA-Seq) libraries are expensive and/or difficult to scale for high-throughput applications. Also, most protocols require isolated total RNA as a starting point. We provide a cost-effective RNA-Seq library synthesis protocol that is fast, starts with tissue, and is high-throughput from tissue to synthesized library. We have also designed and report a set of 96 unique barcodes for library adapters that are amenable to high-throughput sequencing by a large combination of multiplexing strategies. Our protocol has more power to detect differentially expressed genes than the standard Illumina protocol, probably owing to less technical variation among replicates. We also address the problem of gene-length biases affecting differential gene expression calls and demonstrate that such biases can be efficiently minimized during mRNA isolation for library preparation.

    FLASH: ultra-fast protocol to identify RNA-protein interactions in cells

    Determination of the in vivo binding sites of RNA-binding proteins (RBPs) is paramount to understanding their function and how they affect different aspects of gene regulation. With hundreds of RNA-binding proteins identified in human cells, a flexible, high-resolution, high-throughput, highly multiplexible and radioactivity-free method to determine their binding sites has not been described to date. Here we report FLASH (Fast Ligation of RNA after some sort of Affinity Purification for High-throughput Sequencing), which uses a special adapter design and an optimized protocol to determine protein-RNA interactions in living cells. The entire FLASH protocol, starting from cells on plates to a sequencing library, takes 1.5 days. We demonstrate the flexibility, speed and versatility of FLASH by using it to determine RNA targets of both tagged and endogenously expressed proteins under diverse conditions in vivo

    Finite mixtures of matrix-variate Poisson-log normal distributions for three-way count data

    Three-way data structures, characterized by three entities, the units, the variables and the occasions, are frequent in biological studies. In RNA sequencing, three-way data structures are obtained when high-throughput transcriptome sequencing data are collected for n genes across p conditions at r occasions. Matrix-variate distributions offer a natural way to model three-way data and mixtures of matrix-variate distributions can be used to cluster three-way data. Clustering of gene expression data is carried out as a means of discovering gene co-expression networks. In this work, a mixture of matrix-variate Poisson-log normal distributions is proposed for clustering read counts from RNA sequencing. By considering the matrix-variate structure, full information on the conditions and occasions of the RNA sequencing dataset is simultaneously considered, and the number of covariance parameters to be estimated is reduced. A Markov chain Monte Carlo expectation-maximization algorithm is used for parameter estimation and information criteria are used for model selection. The models are applied to both real and simulated data, giving favourable clustering results.
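The reduction in covariance parameters mentioned in this abstract can be made concrete with a small count. The sketch assumes the usual matrix-variate setup, in which the covariance of an r x p observation factors into one r x r and one p x p covariance matrix, and it ignores the scale identifiability constraint between the two factors. The function names are hypothetical.

```python
def unstructured_cov_params(r, p):
    """Free parameters in an unstructured covariance of the vectorised r x p matrix."""
    d = r * p
    return d * (d + 1) // 2


def matrix_variate_cov_params(r, p):
    """Free parameters when the covariance factors into r x r and p x p matrices."""
    return r * (r + 1) // 2 + p * (p + 1) // 2


# Hypothetical design: r = 3 occasions, p = 4 conditions.
r, p = 3, 4
print(unstructured_cov_params(r, p), matrix_variate_cov_params(r, p))
```

The gap widens quickly as r and p grow, because the unstructured count scales with the square of r times p while the factored count scales with the sum of the squares of r and p.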

    Evaluation of experimental design and computational parameter choices affecting analyses of ChIP-seq and RNA-seq data in undomesticated poplar trees.

    Background: One of the great advantages of next generation sequencing is the ability to generate large genomic datasets for virtually all species, including non-model organisms. It should be possible, in turn, to apply advanced computational approaches to these datasets to develop models of biological processes. In a practical sense, working with non-model organisms presents unique challenges. In this paper we discuss some of these challenges for ChIP-seq and RNA-seq experiments using the undomesticated tree species of the genus Populus. Results: We describe specific challenges associated with experimental design in Populus, including selection of optimal genotypes for different technical approaches and development of antibodies against Populus transcription factors. Execution of the experimental design included the generation and analysis of Chromatin immunoprecipitation-sequencing (ChIP-seq) data for RNA polymerase II and transcription factors involved in wood formation. We discuss criteria for analyzing the resulting datasets, determination of appropriate control sequencing libraries, evaluation of sequencing coverage needs, and optimization of parameters. We also describe the evaluation of ChIP-seq data from Populus, and discuss the comparison between ChIP-seq and RNA-seq data and biological interpretations of these comparisons. Conclusions: These and other "lessons learned" highlight the challenges but also the potential insights to be gained from extending next generation sequencing-supported network analyses to undomesticated non-model species.

    Advancing transcriptome platforms

    During the last decade, remarkable technological innovations have emerged that allow the direct or indirect determination of the transcriptome at unprecedented scale and speed. Studies using these methods have already altered our view of the extent and complexity of transcript profiling, which has advanced from one-gene-at-a-time to a holistic view of the genome. Here, we outline the major technical advances in transcriptome characterization, including the most widely used hybridization-based platform, the well-accepted tag-based sequencing platform, and the recently developed RNA-Seq (RNA sequencing) based platform. Importantly, these next-generation technologies, most recently RNA-Seq, are revolutionizing the assessment of the entire transcriptome.

    YAMAT-seq: an efficient method for high-throughput sequencing of mature transfer RNAs.

    Besides translation, transfer RNAs (tRNAs) play many non-canonical roles in various biological pathways and exhibit highly variable expression profiles. To unravel the emerging complexities of tRNA biology and the molecular mechanisms underlying them, an efficient tRNA sequencing method is required. However, the rigid structure of tRNA has presented a challenge to the development of such methods. We report the development of Y-shaped Adapter-ligated MAture TRNA sequencing (YAMAT-seq), an efficient and convenient method for high-throughput sequencing of mature tRNAs. YAMAT-seq circumvents the issue of inefficient adapter ligation, a characteristic of conventional RNA sequencing methods for mature tRNAs, by employing the efficient and specific ligation of a Y-shaped adapter to mature tRNAs using T4 RNA Ligase 2. Subsequent cDNA amplification and next-generation sequencing successfully yield numerous mature tRNA sequences. YAMAT-seq has high specificity for mature tRNAs and high sensitivity to detect most isoacceptors from minute amounts of total RNA. Moreover, YAMAT-seq shows quantitative capability to estimate expression levels of mature tRNAs, and has high reproducibility and broad applicability for various cell lines. YAMAT-seq thus provides a high-throughput technique for identifying tRNA profiles and their regulation in various transcriptomes, which could play important regulatory roles in translation and other biological processes.