4,874 research outputs found

    Normal uniform mixture differential gene expression detection for cDNA microarrays

    BACKGROUND: One of the primary tasks in analysing gene expression data is finding genes that are differentially expressed in different samples. Multiple testing issues due to the thousands of tests run make some of the more popular methods for doing this problematic. RESULTS: We propose a simple method, Normal Uniform Differential Gene Expression (NUDGE) detection for finding differentially expressed genes in cDNA microarrays. The method uses a simple univariate normal-uniform mixture model, in combination with new normalization methods for spread as well as mean that extend the lowess normalization of Dudoit, Yang, Callow and Speed (2002) [1]. It takes account of multiple testing, and gives probabilities of differential expression as part of its output. It can be applied to either single-slide or replicated experiments, and it is very fast. Three datasets are analyzed using NUDGE, and the results are compared to those given by other popular methods: unadjusted and Bonferroni-adjusted t tests, Significance Analysis of Microarrays (SAM), and Empirical Bayes for microarrays (EBarrays) with both Gamma-Gamma and Lognormal-Normal models. CONCLUSION: The method gives a high probability of differential expression to genes known/suspected a priori to be differentially expressed and a low probability to the others. In terms of known false positives and false negatives, the method outperforms all multiple-replicate methods except for the Gamma-Gamma EBarrays method to which it offers comparable results with the added advantages of greater simplicity, speed, fewer assumptions and applicability to the single replicate case. An R package called nudge to implement the methods in this paper will be made available soon at
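The univariate normal-uniform mixture named in this abstract can be illustrated with a generic EM fit. The sketch below is not the nudge package and omits the paper's normalization steps. It assumes that non-differentially expressed log ratios follow a normal distribution, that differentially expressed ones are uniform over the observed range, and it uses a hypothetical function name (`fit_normal_uniform`) and simulated data.

```python
import math
import random


def fit_normal_uniform(x, n_iter=200, p_init=0.1):
    """EM fit of f(x) = (1 - p) * Normal(mu, sigma2) + p * Uniform(min(x), max(x)).

    Returns (p, mu, sigma2, post), where post[i] is the posterior probability
    that x[i] belongs to the uniform ("differentially expressed") component.
    """
    a, b = min(x), max(x)
    if a == b:
        raise ValueError("data must not be constant")
    unif = 1.0 / (b - a)  # uniform density over the observed range
    n = len(x)
    mu = sum(x) / n
    sigma2 = sum((v - mu) ** 2 for v in x) / n
    p = p_init
    post = [p] * n
    for _ in range(n_iter):
        # E-step: posterior probability of the uniform component for each point
        post = []
        for v in x:
            norm = math.exp(-((v - mu) ** 2) / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)
            num = p * unif
            post.append(num / (num + (1 - p) * norm))
        # M-step: update the mixing weight and the weighted normal parameters
        w = [1.0 - z for z in post]
        w_sum = sum(w)
        p = sum(post) / n
        mu = sum(wi * v for wi, v in zip(w, x)) / w_sum
        sigma2 = sum(wi * (v - mu) ** 2 for wi, v in zip(w, x)) / w_sum
    return p, mu, sigma2, post


# Simulated log ratios (an assumption for illustration, not data from the paper):
# 950 unchanged genes around 0 and 50 genes with large positive or negative ratios.
random.seed(0)
data = [random.gauss(0.0, 0.3) for _ in range(950)]
data += [random.choice([-1, 1]) * random.uniform(3.0, 6.0) for _ in range(50)]

p, mu, sigma2, post = fit_normal_uniform(data)
print(f"estimated proportion in uniform component: {p:.3f}")
print(f"normal component: mean {mu:.3f}, variance {sigma2:.3f}")
```

The posterior values in `post` play the role of the "probabilities of differential expression" that the abstract mentions, under the assumptions stated above.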

    Application of Volcano Plots in Analyses of mRNA Differential Expressions with Microarrays

    A volcano plot displays unstandardized signal (e.g. log-fold-change) against noise-adjusted/standardized signal (e.g. t-statistic or -log10(p-value) from the t test). We review the basic and interactive uses of the volcano plot, and its crucial role in understanding the regularized t-statistic. The joint filtering gene selection criterion based on regularized statistics has a curved discriminant line in the volcano plot, as compared to the two perpendicular lines for the "double filtering" criterion. This review attempts to provide a unifying framework for discussions on alternative measures of differential expression, improved methods for estimating variance, and visual display of a microarray analysis result. We also discuss the possibility of applying volcano plots to other fields beyond microarrays. Comment: 8 figures
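As a rough illustration of the two axes and the "double filtering" criterion described in this abstract, the following sketch computes one volcano-plot point per gene. The function names, the cutoffs, and the constant `s0` are hypothetical choices for this example, and the regularized statistic is written in a generic "add a constant to the standard error" form rather than taken from the paper.

```python
import math
from statistics import mean, variance


def volcano_point(group_a, group_b, s0=0.0):
    """Return (log-fold-change, t-like statistic) for one gene.

    group_a, group_b: log2 expression values in the two conditions
    (at least two values each).
    s0: hypothetical regularization constant added to the standard error;
        s0 = 0 gives the ordinary unequal-variance (Welch) t-statistic.
    """
    lfc = mean(group_b) - mean(group_a)
    se = math.sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    denom = se + s0
    if denom == 0:
        raise ValueError("zero variance in both groups; use s0 > 0")
    return lfc, lfc / denom


def double_filter(lfc, t, lfc_cut=1.0, t_cut=4.0):
    """Select a gene only if it passes both the fold-change and the t cutoff."""
    return abs(lfc) >= lfc_cut and abs(t) >= t_cut


# Made-up log2 intensities for a single gene
control = [1.0, 1.2, 0.8]
treated = [3.0, 3.2, 2.8]

lfc, t = volcano_point(control, treated)
_, t_reg = volcano_point(control, treated, s0=0.5)
print(f"log-fold-change {lfc:.2f}, t {t:.2f}, regularized t {t_reg:.2f}")
print("passes double filter:", double_filter(lfc, t))
```

In the (log-fold-change, t) plane the double filter is a pair of perpendicular cutoffs. A single threshold c on the regularized statistic, |lfc| / (se + s0) >= c, rewritten with se = |lfc| / |t|, becomes |lfc| >= c * s0 / (1 - c / |t|) for |t| > c, which is one way to see why the boundary is curved.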

    A statistical framework for the design of microarray experiments and effective detection of differential gene expression

    Four reasons why you might wish to read this paper: 1. We have devised a new statistical T test to determine differentially expressed genes (DEG) in the context of microarray experiments. This statistical test adds a new member to the traditional T-test family. 2. An exact formula for calculating the detection power of this T test is presented, which can also be fairly easily modified to cover the traditional T tests. 3. We have presented an accurate yet computationally very simple method to estimate the fraction of non-DEGs in a set of genes being tested. This method is superior to an existing one which is computationally much more involved. 4. We approach the multiple testing problem from a fresh angle, and discuss its relation to the classical Bonferroni procedure and to the FDR (false discovery rate) approach. This is most useful in the analysis of microarray data, where typically several thousands of genes are being tested simultaneously. Comment: 9 pages, 1 table; to appear in Bioinformatics
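For orientation, the two classical procedures that this abstract compares against can be sketched as follows. This shows the standard Bonferroni correction and the Benjamini-Hochberg step-up procedure for FDR control, not the paper's own approach. The p-values are made up for illustration.

```python
def bonferroni(pvalues, alpha=0.05):
    """Reject H0 for each test whose p-value is at most alpha / m."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]


def benjamini_hochberg(pvalues, alpha=0.05):
    """Benjamini-Hochberg step-up procedure controlling the FDR at level alpha."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank / m * alpha:
            k_max = rank
    # Reject the k_max smallest p-values
    reject = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= k_max:
            reject[i] = True
    return reject


# Hypothetical p-values for ten genes
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print("Bonferroni rejections:", sum(bonferroni(pvals)))
print("Benjamini-Hochberg rejections:", sum(benjamini_hochberg(pvals)))
```

Bonferroni controls the chance of any false positive and becomes very strict when thousands of genes are tested, whereas the FDR approach controls the expected proportion of false positives among the selected genes.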

    Prognostic meta-signature of breast cancer developed by two-stage mixture modeling of microarray data

    BACKGROUND: An increasing number of studies have profiled tumor specimens using distinct microarray platforms and analysis techniques. With the accumulating amount of microarray data, one of the most intriguing yet challenging tasks is to develop robust statistical models to integrate the findings. RESULTS: By applying a two-stage Bayesian mixture modeling strategy, we were able to assimilate and analyze four independent microarray studies to derive an inter-study validated "meta-signature" associated with breast cancer prognosis. Combining multiple studies (n = 305 samples) on a common probability scale, we developed a 90-gene meta-signature, which was strongly associated with survival in breast cancer patients. Given the set of independent studies using different microarray platforms, which included spotted cDNAs, Affymetrix GeneChip, and inkjet oligonucleotides, the individually identified classifiers yielded gene sets predictive of survival in each study cohort. The study-specific gene signatures, however, had minimal overlap with each other, and performed poorly in pairwise cross-validation. The meta-signature, on the other hand, accommodated such heterogeneity and achieved comparable or better prognostic performance when compared with the individual signatures. Further, in comparison with a global standardization method, the mixture model based data transformation demonstrated superior properties for data integration and provided a solid basis for building classifiers at the second stage. Functional annotation revealed that genes involved in cell cycle and signal transduction activities were over-represented in the meta-signature. CONCLUSION: The mixture modeling approach unifies disparate gene expression data on a common probability scale, allowing for robust, inter-study validated prognostic signatures to be obtained. With the emerging utility of microarrays for cancer prognosis, it will be important to establish paradigms to meta-analyze disparate gene expression data for prognostic signatures of potential clinical use.

    A full Bayesian hierarchical mixture model for the variance of gene differential expression

    BACKGROUND: In many laboratory-based high throughput microarray experiments, there are very few replicates of gene expression levels. Thus, estimates of gene variances are inaccurate. Visual inspection of graphical summaries of these data usually reveals that heteroscedasticity is present, and the standard approach to address this is to take a log2 transformation. In such circumstances, it is then common to assume that gene variability is constant when an analysis of these data is undertaken. However, this is perhaps too stringent an assumption. More careful inspection reveals that the simple log2 transformation does not remove the problem of heteroscedasticity. An alternative strategy is to assume independent gene-specific variances, although again this is problematic as variance estimates based on few replications are highly unstable. More meaningful and reliable comparisons of gene expression might be achieved, for different conditions or different tissue samples, where the test statistics are based on accurate estimates of gene variability, a crucial step in the identification of differentially expressed genes. RESULTS: We propose a Bayesian mixture model, which classifies genes according to similarity in their variance. The result is that genes in the same latent class share a similar variance, estimated from a larger number of replicates than purely those per gene, i.e. the total of all replicates of all genes in the same latent class. An example dataset, consisting of 9216 genes with four replicates per condition, resulted in four latent classes based on the similarity of their variances. CONCLUSION: The mixture variance model provides a realistic and flexible estimate for the variance of gene expression data under limited replicates. We believe that in using the latent class variances, estimated from a larger number of genes in each derived latent group, the p-values obtained are more robust than those obtained using either a constant gene or a gene-specific variance estimate.

    Gene expression profiling in prepubertal and adult male mice using cDNA and oligonucleotide microarrays

    Variations in gene expression are the basis of differences in cell and tissue function, response to DNA damaging agents, susceptibility to genetic disease, and cellular differentiation. The purpose of this dissertation research was to characterize variation in basal gene expression among adult mouse tissues for selected stress response, DNA repair and damage control genes and to utilize variation in temporal gene expression patterns to identify candidate genes associated with germ cell differentiation from mitosis through meiosis in the prepubertal mouse testis. To accomplish these goals, high throughput analyses of gene expression were performed using custom cDNA and random oligonucleotide microarrays. cDNA microarray technology was optimized by evaluating the effects of multiple hybridization and image analysis methodologies on the magnitude of background-subtracted hybridization signal intensities. The results showed that hybridizing lower probe quantities in a buffer developed at Lawrence Livermore National Laboratory to tryptone-blocked microarrays improved signal intensities. In addition, the error in expression ratio measurements was significantly reduced when microarray images were preprocessed. A custom cDNA microarray comprised of 417 genes and enriched for stress response, DNA repair, and damage control genes was used to investigate basal gene expression differences among adult mouse testis, brain, liver, spleen, and heart. Genes with functions related to stress response exhibited the most variation in expression among tissues whereas DNA repair-associated gene expression varied the least. Random oligonucleotide microarrays comprised of ~10,000 genes were used to profile changes in gene expression during the first wave of spermatogenesis in the prepubertal mouse testis. Approximately 550 genes were differentially expressed as male germ cells differentiated from spermatogonia to primary spermatocytes. These findings suggest that the 313 unannotated sequences and 178 genes with known functions in other biological pathways have spermatogenesis-associated roles. This dissertation research showed that microarrays are a useful tool for quantitating the expression of large numbers of genes in parallel under normal physiological conditions and during differentiation. It has also provided candidate genes for future investigations of the molecular mechanisms underlying (1) tissue-specific DNA damage response and genetic disease susceptibility and (2) cellular differentiation during the onset and progression of spermatogenesis.

    Microarray Data Preprocessing: From Experimental Design to Differential Analysis

    DNA microarray data preprocessing is of utmost importance in the analytical path starting from the experimental design and leading to a reliable biological interpretation. In fact, when all relevant aspects regarding the experimental plan have been considered, the following steps from data quality check to differential analysis will lead to robust, trustworthy results. In this chapter, all the relevant aspects and considerations about microarray preprocessing will be discussed. Preprocessing steps are organized in an orderly manner, from experimental design to quality check and batch effect removal, including the most common visualization methods. Furthermore, we will discuss data representation and differential testing methods with a focus on the most common microarray technologies, such as gene expression and DNA methylation. Peer reviewed