91 research outputs found

Linear Models for Microarray Data Analysis: Hidden Similarities and Differences

Author: Kerr M.K.
Lönnstedt I.
M. Kathleen Kerr
Publication venue: 'Mary Ann Liebert Inc'

Crossref

Selecting "Significant" Differentially Expressed Genes from the Combined Perspective of the Null and the Alternative

Author: B. Moerkerke
Benjamini Y.
E. Goetghebeur
Lönnstedt I.
Publication venue: 'Mary Ann Liebert Inc'

Crossref

Valproic Acid Teratogenicity: A Toxicogenomics Approach

Author: Barondes SH
Brouns MR
Caruso A
Chang H
Downs KM
Dressel U
Lönnstedt I
McBurney MW
Svensson K
Xu W
Publication venue: National Institute of Environmental Health Sciences
Publication date: 03/06/2004

Embryonic development is a highly coordinated set of processes that depend on hierarchies of signaling and gene regulatory networks, and the disruption of such networks may underlie many cases of chemically induced birth defects. The antiepileptic drug valproic acid (VPA) is a potent inducer of neural tube defects (NTDs) in human and mouse embryos. As with many other developmental toxicants, however, the mechanism of VPA teratogenicity is unknown. Using microarray analysis, we compared the global gene expression responses to VPA in mouse embryos during the critical stages of teratogen action in vivo with those in cultured P19 embryocarcinoma cells in vitro. Among the identified VPA-responsive genes, some have been associated previously with NTDs or VPA effects [vinculin, metallothioneins 1 and 2 (Mt1, Mt2), keratin 1-18 (Krt1-18)], whereas others provide novel putative VPA targets, some of which are associated with processes relevant to neural tube formation and closure [transgelin 2 (Tagln2), thyroid hormone receptor interacting protein 6, galectin-1 (Lgals1), inhibitor of DNA binding 1 (Idb1), fatty acid synthase (Fasn), annexins A5 and A11 (Anxa5, Anxa11)], or with VPA effects or known molecular actions of VPA (Lgals1, Mt1, Mt2, Id1, Fasn, Anxa5, Anxa11, Krt1-18). A subset of genes with a transcriptional response to VPA that is similar in embryos and the cell model can be evaluated as potential biomarkers for VPA-induced teratogenicity that could be exploited directly in P19 cell–based in vitro assays. As several of the identified genes may be activated or repressed through a pathway of histone deacetylase (HDAC) inhibition and specificity protein 1 activation, our data support a role of HDAC as an important molecular target of VPA action in vivo.

Crossref

Online Research @ Cardiff

PubMed Central

Differential expression analysis for sequence count data

Author: A Agresti
A Mortazavi
AC Cameron
AM Smith
AS Morrissy
B Langmead
C Loader
CI Bliss
DD Licatalosi
G Robertson
GK Smyth
I Lönnstedt
J Bullard
JC Marioni
JF Lawless
JS Bloom
K Saha
L Wang
L Whitaker
M Kasowski
MD Robinson
P Engström
P McCullagh
RC Gentleman
Simon Anders
SJ Clark
U Nagalakshmi
Wolfgang Huber
Y Benjamini
Publication date: 01/01/2010

Motivation: High-throughput nucleotide sequencing provides quantitative readouts in assays for RNA expression (RNA-Seq), protein-DNA binding (ChIP-Seq) or cell counting (barcode sequencing). Statistical inference of differential signal in such data requires estimation of their variability throughout the dynamic range. When the number of replicates is small, error modelling is needed to achieve statistical power.

Results: We propose an error model that uses the negative binomial distribution, with variance and mean linked by local regression, to model the null distribution of the count data. The method controls type-I error and provides good detection power.

Availability: A free open-source R software package, DESeq, is available from the Bioconductor project and from http://www-huber.embl.de/users/anders/DESeq

Crossref

Springer

Springer - Publisher Connector

PubMed Central

Institute of Mathematics AS CR, v. v. i.

Nature Precedings

Intra- and inter-individual genetic differences in gene expression

Genetic variation is known to influence the amount of mRNA produced by a gene. Given that molecular machines control the mRNA levels of multiple genes, we expect that genetic variation in the components of these machines would influence multiple genes in a similar fashion. In this study we show that this assumption is correct by using correlation of mRNA levels measured independently in the brain, kidney or liver of multiple, genetically typed mouse strains to detect shared genetic influences. These correlating groups of genes (CGG) have collective properties that account for 40-90% of the variability of their constituent genes and in some cases, but not all, contain genes encoding functionally related proteins. Critically, we show that the genetic influences are essentially tissue specific and consequently the same genetic variations in one animal may up-regulate a CGG in one tissue but down-regulate the same CGG in a second tissue. We further show similarly paradoxical behaviour of CGGs within the same tissues of different individuals. The implication of this study is that this class of genetic variation can result in complex inter- and intra-individual and tissue differences and that this will create substantial challenges to the investigation of phenotypic outcomes, particularly in humans where multiple tissues are not readily available.

Crossref

Springer - Publisher Connector

PubMed Central

eScholarship - University of California

The Australian National University

Nature Precedings

ScholarBank@NUS

Algebraic Comparison of Partial Lists in Bioinformatics

Author: A Gobbi
A Kalousis
A Kossenkov
A Sboner
AC Haury
AL Boulesteix
Arkady B. Khodursky
B Di Camillo
B Efron
B Schowe
C Cortes
C Furlanello
C Schneider
C Soneson
C Yao
Cesare Furlanello
Consortium The MicroArray Quality Control (MAQC)
D Albanese
D Cai
D Corrada
D Critchlow
D Saari
D Witten
G Guzzetta
G Jurman
G Lance
G Smyth
Giuseppe Jurman
GS Cheon
I Guyon
I Jeffery
I Lönnstedt
J Bar-Ilan
J Borda
J Chen
J Ioannidis
J Neter
J Storey
L Ein-Dor
L Kuncheva
L Yu
L Zhang
M Desarkar
M Kauers
M Kendall
M Schimek
M Slawski
M Villarino
O Bousquet
P Baldi
P Diaconis
P Hall
P Krízek
PC Boutros
R Fagin
R Gentleman
R Graham
R Pearson
R Pique-Regi
R Simon
Roberto Visintainer
S Abramov
S Dudoit
S Lin
S Mukherjee
S Setlur
S Simić
S Vanderlooy
Samantha Riccadonna
SK Lau
T Bø
T Calders
V Tusher
Visintainer
W Fury
W Hoeffding
W Shi
X Wang
X Yang
Y Xiao
Z He
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 08/04/2010

The outcome of a functional genomics pipeline is usually a partial list of genomic features, ranked by their relevance in modelling biological phenotype in terms of a classification or regression model. Because of resampling protocols, or simply within a meta-analysis comparison, sets of alternative feature lists (possibly of different lengths) are often obtained instead of a single list. Here we introduce a method, based on the algebraic theory of symmetric groups, for studying the variability between lists ("list stability") in the case of lists of unequal length. We provide algorithms evaluating stability for lists embedded in the full feature set or just limited to the features occurring in the partial lists. The method is demonstrated first on synthetic data in a gene filtering task and then for finding gene profiles on a recent prostate cancer dataset.

arXiv.org e-Print Archive

Crossref

Archivio della ricerca - Fondazione Bruno Kessler

Directory of Open Access Journals

PubMed Central

A simple approach to ranking differentially expressed gene expression time courses through Gaussian process regression.

Author: A Honkela
A Subramanian
Alfredo A Kalaitzis
B Efron
B Finkenstadt
C Angelini
CE Rasmussen
DJC MacKay
G Della Gatta
I Lönnstedt
J Ernst
J Vanhatalo
JD Storey
M Bansal
M Schena
M Yuan
ME Tipping
MF Möller
MK Kerr
N Friedman
ND Lawrence
Neil D Lawrence
O Stegle
P Gao
PDW Kirk
PT Spellman
RA Irizarry
RM Neal
S Dudoit
SD Bay
YC Tai
Z Bar-Joseph
Publication venue: BMC Bioinformatics
Publication date: 01/05/2011

BACKGROUND: The analysis of gene expression from time series underpins many biological studies. Two basic forms of analysis recur for data of this type: removing inactive (quiet) genes from the study and determining which genes are differentially expressed. Often these analysis stages are applied disregarding the fact that the data are drawn from a time series. In this paper we propose a simple model, based on a Gaussian process, that accounts for the underlying temporal nature of the data.

RESULTS: We review Gaussian process (GP) regression for estimating the continuous trajectories underlying gene expression time series. We present a simple approach which can be used to filter quiet genes or, for time series in the form of expression ratios, to quantify differential expression. We assess via ROC curves the rankings produced by our regression framework and compare them to a recently proposed hierarchical Bayesian model for the analysis of gene expression time series (BATS). We compare on both simulated and experimental data, showing that the proposed approach considerably outperforms the current state of the art.

CONCLUSIONS: Gaussian processes offer an attractive trade-off between efficiency and usability for the analysis of microarray time series. The Gaussian process framework offers a natural way of handling biological replicates and missing values and provides confidence intervals along the estimated curves of gene expression. Therefore, we believe Gaussian processes should be a standard tool in the analysis of gene expression time series.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Apollo (Cambridge)

White Rose Research Online

Empirical Bayes models for multiple probe type microarrays at the probe level

Author: A Hess
A Sjögren
A Spira
AM Hein
AP Dempster
B Efron
BP Durbin
D Gaile
D Holder
DM Rocke
E Kristiansson
GK Smyth
I Lönnstedt
IA Eaves
J Comander
J Hu
JW Tukey
LM Cope
M Åstrand
MA Sartor
Magnus Åstrand
Mats Rudemo
N Jain
P Baldi
P Munson
Petter Mostad
R Opgen-Rhein
RA Irizarry
RS Stearman
S Choe
SC Geller
T Hastie
VG Tusher
W Huber
W Lemon
X Liu
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: When analyzing microarray data, a primary objective is often to find differentially expressed genes. With empirical Bayes and penalized t-tests the sample variances are adjusted towards a global estimate, producing more stable results compared to ordinary t-tests. However, for Affymetrix type data a clear dependency between variability and intensity-level generally exists, even for logged intensities, most clearly for data at the probe level but also for probe-set summaries such as the MAS5 expression index. As a consequence, adjustment towards a global estimate results in an intensity-level dependent false positive rate.

Results: We propose two new methods for finding differentially expressed genes, Probe level Locally moderated Weighted median-t (PLW) and Locally Moderated Weighted-t (LMW). Both methods use an empirical Bayes model taking the dependency between variability and intensity-level into account. A global covariance matrix is also used, allowing for differing variances between arrays as well as array-to-array correlations. PLW is specially designed for Affymetrix type arrays (or other multiple-probe arrays). Instead of making inference on probe-set summaries, comparisons are made separately for each perfect-match probe and are then summarized into one score for the probe-set.

Conclusion: The proposed methods are compared to 14 existing methods using five spike-in data sets. For RMA and GCRMA processed data, PLW has the most accurate ranking of regulated genes in four out of the five data sets, and LMW consistently performs better than all examined moderated t-tests when used on RMA, GCRMA, and MAS5 expression indexes.

Crossref

Springer

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Chalmers Research

Chalmers Publication Library

At least two well-spaced samples are needed to genotype a solid tumor

Author: A Roth
A Sottoriva
AB Olshen
B Reva
B Vogelstein
D Shibata
Darryl Shibata
H Kang
I Hajirasouliha
IM Lönnstedt
J Gagan
K Cibulskis
Kimberly Siegmund
NE Navin
RA Burrell
RC Griffiths
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Generalized shrinkage F-like statistics for testing an interaction term in gene expression analysis in the presence of heteroscedasticity

Author: A Butte
BA Craig
BP Tu
CC Pritchard
D Chabas
DA Allison
EL Lehmann
ES Edgington
F Jaffrezic
GA Churchill
George Casella
GK Smyth
H Wu
I Lönnstedt
J Neter
J Sebat
Jie Yang
JTG Hwang
K Kizilkaya
KP White
Lauren M McIntyre
LM McIntyre
M Kirst
M Schena
MJ Anderson
MK Kerr
ML Wayne
N Rostoks
P Baldi
P Kelly
R Blekhman
RD Wolfinger
S Feng
S Pounds
SPA Fodor
SV Nuzhdin
SY Kim
T Chu
T Galitski
TJ Tong
VG Tusher
X Cui
X Zhang
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Many analyses of gene expression data involve hypothesis tests of an interaction term between two fixed effects, typically tested using a residual variance. In expression studies, the issue of variance heteroscedasticity has received much attention, and previous work has focused on either between-gene or within-gene heteroscedasticity. However, in a single experiment, heteroscedasticity may exist both within and between genes. Here we develop flexible shrinkage error estimators considering both between-gene and within-gene heteroscedasticity and use them to construct F-like test statistics for testing interactions, with cutoff values obtained by permutation. These permutation tests are complicated, and several permutation tests are investigated here.

Results: Our proposed test statistics are compared with other existing shrinkage-type test statistics through extensive simulation studies and a real data example. The results show that the choice of permutation procedures has dramatically more influence on detection power than the choice of F or F-like test statistics. When both types of gene heteroscedasticity exist, our proposed test statistics can control preselected type-I errors and are more powerful. Raw data permutation is not valid in this setting. Whether unrestricted or restricted residual permutation should be used depends on the specific type of test statistic.

Conclusions: The F-like test statistic that uses the proposed flexible shrinkage error estimator considering both types of gene heteroscedasticity and unrestricted residual permutation can provide a statistically valid and powerful test. Therefore, we recommend that it always be applied in the analysis of real gene expression data to test an interaction term.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central