
    aFold – using polynomial uncertainty modelling for differential gene expression estimation from RNA sequencing data

    Data normalization and identification of significant differential expression represent crucial steps in RNA-Seq analysis. Many available tools rely on assumptions that are often not met by real data, including the common assumption of a symmetrical distribution of up- and down-regulated genes, and the presence of only a few differentially expressed genes and/or few outliers. Moreover, the cut-off for selecting significantly differentially expressed genes for further downstream analysis often depends on arbitrary choices.

    Comparison of normalization and differential expression analyses using RNA-Seq data from 726 individual Drosophila melanogaster

    Comparison of normalization methods across conditions. Boxplots show the differences in the coefficient of variation across flies in each genotype/sex/environment condition. (PDF 245 kb)

    Differential gene expression analysis tools exhibit substandard performance for long non-coding RNA-sequencing data

    Background: Long non-coding RNAs (lncRNAs) are typically expressed at low levels and are inherently highly variable. This is a fundamental challenge for differential expression (DE) analysis. In this study, the performance of 25 pipelines for testing DE in RNA-seq data is comprehensively evaluated, with a particular focus on lncRNAs and low-abundance mRNAs. Fifteen performance metrics are used to evaluate DE tools and normalization methods using simulations and analyses of six diverse RNA-seq datasets. Results: Gene expression data are simulated using non-parametric procedures in such a way that realistic levels of expression and variability are preserved in the simulated data. Throughout the assessment, results for mRNA and lncRNA are tracked separately. All the pipelines exhibit inferior performance for lncRNAs compared to mRNAs across all simulated scenarios and benchmark RNA-seq datasets. The substandard performance of DE tools for lncRNAs applies also to low-abundance mRNAs. No single tool uniformly outperforms the others. Variability, number of samples, and fraction of DE genes markedly influence DE tool performance. Conclusions: Overall, linear modeling with empirical Bayes moderation (limma) and a non-parametric approach (SAMSeq) show good control of the false discovery rate and reasonable sensitivity. Of note, for achieving a sensitivity of at least 50%, more than 80 samples are required when studying expression levels in realistic settings such as in clinical cancer research. About half of the methods show a substantial excess of false discoveries, making these methods unreliable for DE analysis and jeopardizing reproducible science. The detailed results of our study can be consulted through a user-friendly web application, giving guidance on selection of the optimal DE tool (http://statapps.ugent.be/tools/AppDGE/).

    Cross platform standardisation and normalisation experimental pipeline for use in the biodiscovery of dysregulated human circulating miRNAs

    Introduction. MicroRNAs (miRNAs) are a class of highly conserved small non-coding RNAs that play an important part in the post-transcriptional regulation of gene expression. A substantial number of miRNAs have been proposed as biomarkers for diseases. While reverse transcriptase real-time PCR (RT-qPCR) is considered the gold standard for the evaluation and validation of miRNA biomarkers, small RNA sequencing is now routinely being adopted for the identification of dysregulated miRNAs. However, in many cases where putative miRNA biomarkers are identified using small RNA sequencing, they are not substantiated when RT-qPCR is used for validation. To date, there is a lack of consensus regarding optimal methodologies for miRNA detection, quantification and standardisation when different platform technologies are used. Materials and Methods. In this study we present an experimental pipeline that takes into consideration sample collection, processing, enrichment, and the subsequent comparative analysis of circulating small ribonucleic acids using small RNA sequencing and RT-qPCR. Results, Discussion, Conclusions. Initially, a panel of miRNAs dysregulated in circulating blood from breast cancer patients compared to healthy women was identified using small RNA sequencing. MiR-320a was identified as the most dysregulated miRNA between the two female cohorts. Total RNA and enriched small RNA populations (<30 bp) isolated from peripheral blood from the same female cohort samples were then tested using a miR-320a RT-qPCR assay. When total RNA was analysed with this miR-320a RT-qPCR assay, a 2.3-fold decrease in expression levels was observed between blood samples from healthy controls and breast cancer patients. However, upon enrichment for the small RNA population and subsequent analysis of miR-320a using RT-qPCR, its dysregulation in breast cancer patients was more pronounced, with an 8.89-fold decrease in miR-320a expression. We propose that the experimental pipeline outlined could serve as a robust approach for the identification and validation of small RNA biomarkers for disease.

    Comprehensive evaluation of differential gene expression analysis methods for RNA-seq data

    A large number of computational methods have been developed for analyzing differential gene expression in RNA-seq data. We describe a comprehensive evaluation of common methods using the SEQC benchmark dataset and ENCODE data. We consider a number of key features, including normalization, accuracy of differential expression detection and differential expression analysis when one condition has no detectable expression. We find significant differences among the methods, but note that array-based methods adapted to RNA-seq data perform comparably to methods designed for RNA-seq. Our results demonstrate that increasing the number of replicate samples improves detection power significantly more than increasing sequencing depth.

    Microarray Data Preprocessing: From Experimental Design to Differential Analysis

    DNA microarray data preprocessing is of utmost importance in the analytical path starting from the experimental design and leading to a reliable biological interpretation. In fact, when all relevant aspects regarding the experimental plan have been considered, the following steps from data quality check to differential analysis will lead to robust, trustworthy results. In this chapter, all the relevant aspects and considerations about microarray preprocessing will be discussed. Preprocessing steps are organized in an orderly manner, from experimental design to quality check and batch effect removal, including the most common visualization methods. Furthermore, we will discuss data representation and differential testing methods with a focus on the most common microarray technologies, such as gene expression and DNA methylation. Peer reviewed.