91 research outputs found

    Valproic Acid Teratogenicity: A Toxicogenomics Approach

    Embryonic development is a highly coordinated set of processes that depend on hierarchies of signaling and gene regulatory networks, and the disruption of such networks may underlie many cases of chemically induced birth defects. The antiepileptic drug valproic acid (VPA) is a potent inducer of neural tube defects (NTDs) in human and mouse embryos. As with many other developmental toxicants, however, the mechanism of VPA teratogenicity is unknown. Using microarray analysis, we compared the global gene expression responses to VPA in mouse embryos during the critical stages of teratogen action in vivo with those in cultured P19 embryocarcinoma cells in vitro. Among the identified VPA-responsive genes, some have been associated previously with NTDs or VPA effects [vinculin, metallothioneins 1 and 2 (Mt1, Mt2), keratin 1-18 (Krt1-18)], whereas others provide novel putative VPA targets, some of which are associated with processes relevant to neural tube formation and closure [transgelin 2 (Tagln2), thyroid hormone receptor interacting protein 6, galectin-1 (Lgals1), inhibitor of DNA binding 1 (Idb1), fatty acid synthase (Fasn), annexins A5 and A11 (Anxa5, Anxa11)], or with VPA effects or known molecular actions of VPA (Lgals1, Mt1, Mt2, Id1, Fasn, Anxa5, Anxa11, Krt1-18). A subset of genes with a transcriptional response to VPA that is similar in embryos and the cell model can be evaluated as potential biomarkers for VPA-induced teratogenicity that could be exploited directly in P19 cell–based in vitro assays. As several of the identified genes may be activated or repressed through a pathway of histone deacetylase (HDAC) inhibition and specificity protein 1 activation, our data support a role of HDAC as an important molecular target of VPA action in vivo.

    Differential expression analysis for sequence count data

    Motivation: High-throughput nucleotide sequencing provides quantitative readouts in assays for RNA expression (RNA-Seq), protein-DNA binding (ChIP-Seq) or cell counting (barcode sequencing). Statistical inference of differential signal in such data requires estimation of their variability throughout the dynamic range. When the number of replicates is small, error modelling is needed to achieve statistical power.

    Results: We propose an error model that uses the negative binomial distribution, with variance and mean linked by local regression, to model the null distribution of the count data. The method controls type-I error and provides good detection power.

    Availability: A free open-source R software package, DESeq, is available from the Bioconductor project and from http://www-huber.embl.de/users/anders/DESeq
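
    As a rough illustration of the mean-variance link mentioned in the abstract above, the sketch below estimates a per-gene negative binomial dispersion by the method of moments. It assumes the common parametrization Var = mu + alpha * mu^2; this is a simplification and not the local-regression fit that the abstract describes. The function name `moment_dispersion` and the example counts are hypothetical.

```python
from statistics import mean, variance


def moment_dispersion(counts):
    """Method-of-moments estimate of a negative binomial dispersion alpha.

    Assumes Var = mu + alpha * mu**2 (a common parametrization, chosen here
    for illustration only; it is not the local-regression variance fit).
    """
    mu = mean(counts)
    if mu == 0:
        return 0.0  # no counts at all: dispersion cannot be estimated
    var = variance(counts)  # unbiased sample variance across replicates
    # Variance in excess of the Poisson level (var == mu), scaled by mu**2;
    # clipped at zero because under-dispersion is not representable here.
    return max((var - mu) / mu ** 2, 0.0)


# Hypothetical counts for one gene in two replicates.
print(moment_dispersion([5, 15]))
```

    With only a handful of replicates such per-gene estimates are very noisy, which is the situation the abstract says motivates sharing information across genes.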

    Intra- and inter-individual genetic differences in gene expression

    Genetic variation is known to influence the amount of mRNA produced by a gene. Given that molecular machines control the mRNA levels of multiple genes, we expect that genetic variation in the components of these machines would influence multiple genes in a similar fashion. In this study we show that this assumption is correct by using correlation of mRNA levels measured independently in the brain, kidney or liver of multiple genetically typed mouse strains to detect shared genetic influences. These correlating groups of genes (CGG) have collective properties that account for 40-90% of the variability of their constituent genes and in some cases, but not all, contain genes encoding functionally related proteins. Critically, we show that the genetic influences are essentially tissue specific and consequently the same genetic variation in one animal may up-regulate a CGG in one tissue but down-regulate the same CGG in a second tissue. We further show similarly paradoxical behaviour of CGGs within the same tissues of different individuals. The implication of this study is that this class of genetic variation can result in complex inter- and intra-individual and tissue differences and that this will create substantial challenges to the investigation of phenotypic outcomes, particularly in humans where multiple tissues are not readily available.


    Algebraic Comparison of Partial Lists in Bioinformatics

    The outcome of a functional genomics pipeline is usually a partial list of genomic features, ranked by their relevance in modelling a biological phenotype in terms of a classification or regression model. Because of resampling protocols, or simply in a meta-analysis comparison, sets of alternative feature lists (possibly of different lengths) are often obtained instead of a single list. Here we introduce a method, based on the algebraic theory of symmetric groups, for studying the variability between lists ("list stability") in the case of lists of unequal length. We provide algorithms evaluating stability for lists embedded in the full feature set or just limited to the features occurring in the partial lists. The method is demonstrated first on synthetic data in a gene filtering task and then for finding gene profiles on a recent prostate cancer dataset.

    A simple approach to ranking differentially expressed gene expression time courses through Gaussian process regression.

    BACKGROUND: The analysis of gene expression from time series underpins many biological studies. Two basic forms of analysis recur for data of this type: removing inactive (quiet) genes from the study and determining which genes are differentially expressed. Often these analysis stages are applied disregarding the fact that the data is drawn from a time series. In this paper we propose a simple model, based on a Gaussian process, that accounts for the underlying temporal nature of the data. RESULTS: We review Gaussian process (GP) regression for estimating the continuous trajectories underlying gene expression time series. We present a simple approach which can be used to filter quiet genes or, for time series in the form of expression ratios, to quantify differential expression. We assess via ROC curves the rankings produced by our regression framework and compare them to a recently proposed hierarchical Bayesian model for the analysis of gene expression time series (BATS). We compare on both simulated and experimental data, showing that the proposed approach considerably outperforms the current state of the art. CONCLUSIONS: Gaussian processes offer an attractive trade-off between efficiency and usability for the analysis of microarray time series. The Gaussian process framework offers a natural way of handling biological replicates and missing values and provides confidence intervals along the estimated curves of gene expression. Therefore, we believe Gaussian processes should be a standard tool in the analysis of gene expression time series.

    Empirical Bayes models for multiple probe type microarrays at the probe level

    BACKGROUND: When analyzing microarray data a primary objective is often to find differentially expressed genes. With empirical Bayes and penalized t-tests the sample variances are adjusted towards a global estimate, producing more stable results compared to ordinary t-tests. However, for Affymetrix type data a clear dependency between variability and intensity-level generally exists, even for logged intensities, most clearly for data at the probe level but also for probe-set summaries such as the MAS5 expression index. As a consequence, adjustment towards a global estimate results in an intensity-level dependent false positive rate. RESULTS: We propose two new methods for finding differentially expressed genes, Probe level Locally moderated Weighted median-t (PLW) and Locally Moderated Weighted-t (LMW). Both methods use an empirical Bayes model taking the dependency between variability and intensity-level into account. A global covariance matrix is also used, allowing for differing variances between arrays as well as array-to-array correlations. PLW is specially designed for Affymetrix type arrays (or other multiple-probe arrays). Instead of making inference on probe-set summaries, comparisons are made separately for each perfect-match probe and are then summarized into one score for the probe-set. CONCLUSION: The proposed methods are compared to 14 existing methods using five spike-in data sets. For RMA and GCRMA processed data, PLW has the most accurate ranking of regulated genes in four out of the five data sets, and LMW consistently performs better than all examined moderated t-tests when used on RMA, GCRMA, and MAS5 expression indexes.
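
    To make "adjusted towards a global estimate" concrete, the sketch below shows the usual moderated-variance formula from standard empirical Bayes t-tests, in which a gene's sample variance is averaged with a prior variance, weighted by degrees of freedom. It illustrates only that global shrinkage; it is not the intensity-dependent PLW or LMW model proposed in the abstract above. The function names and the prior values used in the example are hypothetical.

```python
import math


def moderated_variance(sample_var, df, prior_var, prior_df):
    """Shrink a per-gene sample variance towards a global prior variance.

    Weighted average with degrees of freedom as weights; prior_var and
    prior_df are assumed to have been estimated from all genes beforehand.
    """
    return (prior_df * prior_var + df * sample_var) / (prior_df + df)


def moderated_t(mean_diff, sample_var, n1, n2, prior_var, prior_df):
    """Two-group t-like statistic using the moderated variance."""
    df = n1 + n2 - 2  # residual degrees of freedom for two groups
    s2 = moderated_variance(sample_var, df, prior_var, prior_df)
    return mean_diff / math.sqrt(s2 * (1.0 / n1 + 1.0 / n2))


# Hypothetical gene: pooled variance 3.0 from 3 + 3 arrays, prior 1.0 on 4 df.
print(moderated_variance(3.0, 4, 1.0, 4))
print(moderated_t(2.0, 3.0, 3, 3, 1.0, 4))
```

    Because the prior here is a single global value, a gene with small sample variance is pulled up and one with large variance is pulled down regardless of its intensity level, which is the behaviour the abstract identifies as problematic for probe-level data.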

    Generalized shrinkage F-like statistics for testing an interaction term in gene expression analysis in the presence of heteroscedasticity

    BACKGROUND: Many analyses of gene expression data involve hypothesis tests of an interaction term between two fixed effects, typically tested using a residual variance. In expression studies, the issue of variance heteroscedasticity has received much attention, and previous work has focused on either between-gene or within-gene heteroscedasticity. However, in a single experiment, heteroscedasticity may exist both within and between genes. Here we develop flexible shrinkage error estimators considering both between-gene and within-gene heteroscedasticity and use them to construct F-like test statistics for testing interactions, with cutoff values obtained by permutation. These permutation tests are complicated, and several permutation tests are investigated here. RESULTS: Our proposed test statistics are compared with other existing shrinkage-type test statistics through extensive simulation studies and a real data example. The results show that the choice of permutation procedures has dramatically more influence on detection power than the choice of F or F-like test statistics. When both types of gene heteroscedasticity exist, our proposed test statistics can control preselected type-I errors and are more powerful. Raw data permutation is not valid in this setting. Whether unrestricted or restricted residual permutation should be used depends on the specific type of test statistic. CONCLUSIONS: The F-like test statistic that uses the proposed flexible shrinkage error estimator considering both types of gene heteroscedasticity and unrestricted residual permutation can provide a statistically valid and powerful test. Therefore, we recommend that it always be applied in the analysis of real gene expression data to test an interaction term.