
Novel DNA methylation profiles associated with key gene regulation and transcription pathways in blood and placenta of growth-restricted neonates

Author: Barg E
Benjamini Y
Chris Mathews
David J Williams
Graham A Hitman
Melissa C Smart
Robert Lowe
Sara L Hillman
Sarah Finer
Smyth GK
Vardhman K Rakyan
Publication venue: Informa UK Limited
Publication date: 11/12/2014

BB/H012494/1/ Biotechnology and Biological Sciences Research Council

UCL Discovery

Queen Mary Research Online

Differential expression analysis for sequence count data

Author: A Agresti
A Mortazavi
AC Cameron
AM Smith
AS Morrissy
B Langmead
C Loader
CI Bliss
DD Licatalosi
G Robertson
GK Smyth
I Lönnstedt
J Bullard
JC Marioni
JF Lawless
JS Bloom
K Saha
L Wang
L Whitaker
M Kasowski
MD Robinson
P Engström
P McCullagh
RC Gentleman
Simon Anders
SJ Clark
U Nagalakshmi
Wolfgang Huber
Y Benjamini
Publication venue
Publication date: 01/01/2010

*Motivation:* High-throughput nucleotide sequencing provides quantitative readouts in assays for RNA expression (RNA-Seq), protein-DNA binding (ChIP-Seq) or cell counting (barcode sequencing). Statistical inference of differential signal in such data requires estimation of their variability throughout the dynamic range. When the number of replicates is small, error modelling is needed to achieve statistical power.

*Results:* We propose an error model that uses the negative binomial distribution, with variance and mean linked by local regression, to model the null distribution of the count data. The method controls type-I error and provides good detection power. 

*Availability:* A free open-source R software package, _DESeq_, is available from the Bioconductor project and from http://www-huber.embl.de/users/anders/DESeq
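The error model summarised above can be illustrated with a minimal sketch. It is not the DESeq implementation: the nearest-neighbour average below is only a crude stand-in for the local regression the abstract mentions, and the function names (`mean_var`, `smoothed_variance`, `nb_pmf`) and the toy counts are hypothetical. The one standard piece is the conversion from a mean and variance to negative binomial parameters.

```python
import math


def mean_var(counts):
    """Sample mean and unbiased variance of one gene's replicate counts."""
    n = len(counts)
    m = sum(counts) / n
    v = sum((c - m) ** 2 for c in counts) / (n - 1)
    return m, v


def smoothed_variance(mu, points, k=3):
    """Assumed stand-in for local regression: average the variances of the
    k genes whose means lie closest to mu. points is a list of (mean, var)."""
    nearest = sorted(points, key=lambda p: abs(p[0] - mu))[:k]
    return sum(v for _, v in nearest) / len(nearest)


def nb_pmf(x, mu, var):
    """Negative binomial probability of count x, given mean and variance."""
    if var <= mu:
        raise ValueError("negative binomial requires var > mu (overdispersion)")
    r = mu * mu / (var - mu)  # size parameter
    p = mu / var              # success probability
    log_pmf = (math.lgamma(x + r) - math.lgamma(r) - math.lgamma(x + 1)
               + r * math.log(p) + x * math.log(1 - p))
    return math.exp(log_pmf)


# Hypothetical replicate counts for four genes under one condition.
toy_counts = [[12, 18, 9], [95, 140, 70], [480, 720, 390], [30, 52, 21]]
points = [mean_var(c) for c in toy_counts]
mu = points[0][0]
var = smoothed_variance(mu, points)
print(nb_pmf(25, mu, var))  # null probability of observing 25 reads
```

Pooling variance information across genes with similar means is what lets such a model work with few replicates; a real analysis would use the Bioconductor package rather than this toy.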

Springer

Institute of Mathematics AS CR, v. v. i.

Nature Precedings

Technical Variability Is Greater than Biological Variability in a Microarray Experiment but Both Are Outweighed by Changes Induced by Stimulation

Author: A Sureda
AJ Holloway
AR Whitney
Arkady B. Khodursky
C Adrain
CS Moller-Levet
DA Casciano
GK Smyth
Gordon K. Smyth
H Ashdown
I Schober
JC Pinheiro
JP Novak
K Asai
KA Hintz
KY Kim
M Diehn
M Morley
M Wachulec
ME Ritchie
N Brockdorff
Nigel Curtis
NJ Poindexter
P De Feo
PA Bryant
PD Lee
Penelope A. Bryant
RM Kerkhoven
Roy Robins-Browne
S Jozefowski
SD Su
SV Aulock
VG Cheung
Y Kimura
YH Yang
Z Yang
Publication venue: Public Library of Science
Publication date: 31/05/2011

INTRODUCTION: A central issue in the design of microarray-based analysis of global gene expression is that variability resulting from experimental processes may obscure changes resulting from the effect being investigated. This study quantified the variability in gene expression at each level of a typical in vitro stimulation experiment using human peripheral blood mononuclear cells (PBMC). The primary objective was to determine the magnitude of biological and technical variability relative to the effect being investigated, namely gene expression changes resulting from stimulation with lipopolysaccharide (LPS).

METHODS AND RESULTS: Human PBMC were stimulated in vitro with LPS, with replication at 5 levels: 5 subjects each on 2 separate days with technical replication of LPS stimulation, amplification and hybridisation. RNA from LPS-stimulated and unstimulated samples was hybridised against common reference RNA on oligonucleotide microarrays. There was a closer correlation in gene expression between replicate hybridisations (0.86-0.93) than between different subjects (0.66-0.78). Deconstruction of the variability at each level of the experimental process showed that technical variability (standard deviation (SD) 0.16) was greater than biological variability (SD 0.06), although both were low (SD<0.1 for all individual components). There was variability in gene expression both at baseline and after stimulation with LPS, and the proportion of cell subsets in PBMC was likely partly responsible for this. However, gene expression changes after stimulation with LPS were much greater than the variability from any source, either individually or combined.

CONCLUSIONS: Variability in gene expression was very low and likely to improve further as technical advances are made. The finding that stimulation with LPS has a markedly greater effect on gene expression than the degree of variability provides confidence that microarray-based studies can be used to detect changes in gene expression of biological interest in infectious diseases.
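As a worked illustration of how the two reported variability figures relate, the sketch below combines them in quadrature. This assumes the technical and biological components are independent and additive, which the abstract does not state; `combined_sd` is a hypothetical helper name, and only the two SD values come from the abstract.

```python
import math


def combined_sd(*sds):
    """SD of a sum of independent components: sqrt of the summed variances."""
    return math.sqrt(sum(sd ** 2 for sd in sds))


technical_sd = 0.16   # reported in the abstract
biological_sd = 0.06  # reported in the abstract

total = combined_sd(technical_sd, biological_sd)
technical_share = technical_sd ** 2 / total ** 2  # fraction of total variance

print(round(total, 3), round(technical_share, 2))
```

Under that independence assumption the combined SD is about 0.17, and the technical component accounts for roughly 88% of the total variance.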

Public Library of Science (PLOS)

Queen's University Belfast Research Portal

NuGO contributions to GenePattern

Author: A Alexa
A Subramanian
C. Mayer
C. Reiff
EL Lehman
GK Smyth
M Dai
M Reich
M. Müller
P. J. De Groot
Publication venue: Springer-Verlag
Publication date: 01/01/2008

NuGO, the European Nutrigenomics Organization, utilizes 31 powerful computers for, e.g., data storage and analysis. These so-called black boxes (NBXes) are located at the sites of different partners. NuGO decided to use GenePattern as the preferred genomic analysis tool on each NBX. To handle the custom-made Affymetrix NuGO arrays, new NuGO modules are added to GenePattern. These NuGO modules execute the latest Bioconductor version, ensuring up-to-date annotations and access to the latest scientific developments. The following GenePattern modules are provided by NuGO: NuGOArrayQualityAnalysis for comprehensive quality control, NuGOExpressionFileCreator for import and normalization of data, LimmaAnalysis for identification of differentially expressed genes, TopGoAnalysis for calculation of GO enrichment, and GetResultForGo for retrieval of information on genes associated with specific GO terms. Altogether, these NuGO modules allow comprehensive, up-to-date, and user-friendly analysis of Affymetrix data. A special feature of the NuGO modules is that for analysis they allow the use of either the standard Affymetrix or the MBNI custom CDF-files, which remap probes based on current knowledge. In both cases a .chip-file is created to enable GSEA analysis. The NuGO GenePattern installations are distributed as binary Ubuntu (.deb) packages via the NuGO repository.

Wageningen University & Research Publications

Identifying differential exon splicing using linear models and correlation coefficients

Author: C Delaloy
C Gieffers
D Das
E Purdom
F Verissimo
G Fiermonte
GB John
GK Smyth
JA Mayr
Jacqueline A Pallas
K Srinivasan
M Guipponi
M Huizing
M O'Reilly
MD Robinson
MJ Okoniewski
N Nakumara
RA Irizarry
RC Gentleman
Sonia H Shah
TA Clark
TW Gong
Y Xing
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: With the availability of the Affymetrix exon arrays, a number of tools have been developed to enable the analysis. These, however, can be expensive or have several pre-installation requirements. This led us to develop an analysis workflow for analysing differential splicing using freely available software packages that are already being widely used for gene expression analysis. The workflow uses the packages in the standard installation of R and Bioconductor (BiocLite) to identify differential splicing. We use the splice index method with the LIMMA framework. The main drawback with this approach is that it relies on accurate estimates of gene expression from the probe-level data. Methods such as RMA and PLIER may misestimate when a large proportion of exons are spliced. We therefore present the novel concept of a gene correlation coefficient calculated using only the probeset expression pattern within a gene. We show that genes with lower correlation coefficients are likely to be differentially spliced.

Results: The LIMMA approach was used to identify several tissue-specific transcripts and splicing events that are supported by previous experimental studies. Filtering the data is necessary, particularly removing exons and genes that are not expressed in all samples and cross-hybridising probesets, in order to reduce the false positive rate. The LIMMA approach ranked genes containing single or few differentially spliced exons much higher than genes containing several differentially spliced exons. On the other hand, we found the gene correlation coefficient approach better for identifying genes with a large number of differentially spliced exons.

Conclusion: We show that LIMMA can be used to identify differential exon splicing from Affymetrix exon array data. Though further work would be necessary to develop the use of correlation coefficients into a complete analysis approach, the preliminary results demonstrate their usefulness for identifying differentially spliced genes. The two approaches are complementary, as they can potentially identify different subsets of genes (single/few spliced exons vs. large transcript structure differences).
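The two ideas named in this abstract can be sketched on toy numbers. The splice index is taken here as exon-level log2 expression minus gene-level log2 expression. The abstract does not give the exact definition of the gene correlation coefficient, so reading it as a Pearson correlation between the probeset profiles of one gene in two tissues is an assumption, as are the gene-level estimate (a plain mean), the function names, and the data. No LIMMA-style moderated test is attempted.

```python
import math


def splice_index(exon_log2, gene_log2):
    """Exon-level log2 expression relative to the gene's log2 expression."""
    return exon_log2 - gene_log2


def pearson(x, y):
    """Pearson correlation between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical log2 probeset (exon) values for one gene in two tissues;
# the third exon is much lower in tissue B, as if it were skipped.
tissue_a = [8.0, 8.5, 7.9, 8.2]
tissue_b = [8.1, 8.4, 5.0, 8.3]

# Assumed gene-level estimate: the mean of the gene's probesets.
gene_a = sum(tissue_a) / len(tissue_a)
gene_b = sum(tissue_b) / len(tissue_b)

# Per-exon difference in splice index between the two tissues.
si_diff = [splice_index(b, gene_b) - splice_index(a, gene_a)
           for a, b in zip(tissue_a, tissue_b)]
print([round(d, 2) for d in si_diff])

# A low correlation between the two profiles hints at differential splicing.
print(round(pearson(tissue_a, tissue_b), 2))
```

The toy also shows the drawback the abstract mentions: the skipped exon drags down the gene-level estimate for tissue B, which shifts the splice index of every other exon in that gene.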

UQ eSpace (University of Queensland)

UCL Discovery

Messina: A Novel Analysis Tool to Identify Biologically Relevant Molecules in Disease

Author: Andrew V. Biankin
C Cortes
C Nadeau
C Widakowich
Christopher J. Scarlett
D Segara
Davendra Segara
DJ Slamon
Emily K. Colvin
GK Smyth
Hilal Lashuel
James G. Kench
JM Harvey
L Li
Mark Pinese
R Tibshirani
RA Irizarry
Robert L. Sutherland
SA Tomlins
Susan M. Henshall
Y Benjamini
Publication venue: Public Library of Science
Publication date: 01/01/2008

BACKGROUND: Morphologically similar cancers display heterogeneous patterns of molecular aberrations and follow substantially different clinical courses. This diversity has become the basis for the definition of molecular phenotypes, with significant implications for therapy. Microarray or proteomic expression profiling is conventionally employed to identify disease-associated genes; however, traditional approaches for the analysis of profiling experiments may miss molecular aberrations which define biologically relevant subtypes.

METHODOLOGY/PRINCIPAL FINDINGS: Here we present Messina, a method that can identify those genes that only sometimes show aberrant expression in cancer. We demonstrate with simulated data that Messina is highly sensitive and specific when used to identify genes which are aberrantly expressed in only a proportion of cancers, and compare Messina to contemporary analysis techniques. We illustrate Messina by using it to detect the aberrant expression of a gene that may play an important role in pancreatic cancer.

CONCLUSIONS/SIGNIFICANCE: Messina allows the detection of genes with profiles typical of markers of molecular subtype, and complements existing methods to assist the identification of such markers. Messina is applicable to any global expression profiling data and, to allow its easy application, has been packaged into a freely available stand-alone software package.

CiteSeerX

Public Library of Science (PLOS)

Data analysis issues for allele-specific expression using Illumina's GoldenGate assay.

Author: A Gimelbrant
AC Tan
Antigone S Dimas
AS Dimas
BE Stranger
BJ Main
C Daelemans
Caroline Daelemans
D Serre
Emmanouil T Dermitzakis
GK Smyth
HS Lo
HT Bjornsson
International HapMap Consortium
J Oosting
J Staaf
JB Fan
JC Knight
K Zhang
KB Meyer
KK Dobbin
Matthew E Ritchie
Matthew S Forrest
ME Ritchie
MJ Dunning
ML Martin-Magniette
MP Lee
Panagiotis Deloukas
PH van Bilsen
PR Buckland
PV Pant
R Development Core Team
S Davis
Simon Tavaré
X Feng
Publication venue: BMC Bioinformatics
Publication date: 01/01/2010

BACKGROUND: High-throughput measurement of allele-specific expression (ASE) is a relatively new and exciting application area for array-based technologies. In this paper, we explore several data sets which make use of Illumina's GoldenGate BeadArray technology to measure ASE. This platform exploits coding SNPs to obtain relative expression measurements for alleles at approximately 1500 positions in the genome.

RESULTS: We analyze data from a mixture experiment where genomic DNA samples from pairs of individuals of known genotypes are pooled to create allelic imbalances at varying levels for the majority of SNPs on the array. We observe that GoldenGate is less sensitive in detecting subtle allelic imbalances (around 1.3 fold) than extreme imbalances, and note the benefit of applying local background correction to the data. Analysis of data from a dye-swap control experiment allowed us to quantify dye-bias, which can be reduced considerably by careful normalization. The need to filter the data before carrying out further downstream analysis to remove non-responding probes, which show either weak or non-specific signal for each allele, was also demonstrated. Throughout this paper, we find that a linear model analysis of the data from each SNP is a flexible modelling strategy that allows for testing of allelic imbalances in each sample when replicate hybridizations are available.

CONCLUSIONS: Our analysis shows that local background correction carried out by Illumina's software, together with quantile normalization of the red and green channels within each array, provides optimal performance in terms of false positive rates. In addition, we strongly encourage intensity-based filtering to remove SNPs which only measure non-specific signal. We anticipate that a similar analysis strategy will prove useful when quantifying ASE on Illumina's higher density Infinium BeadChips.

RIGHTS: This article is licensed under the BioMed Central licence at http://www.biomedcentral.com/about/license which is similar to the 'Creative Commons Attribution Licence'. In brief, you may: copy, distribute, and display the work; make derivative works; or make commercial use of the work, under the following conditions: the original author must be given credit; for any reuse or distribution, it must be made clear to others what the license terms of this work are.
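The within-array quantile normalization of the red and green channels recommended above can be sketched as follows. This is a generic textbook version under simplifying assumptions (ties broken by position, no background correction or filtering), not the authors' pipeline; the function names and intensities are hypothetical.

```python
import math


def quantile_normalize(red, green):
    """Give both channels the same empirical distribution: the rank-wise
    mean of the two sorted channels, assigned back to each probe by rank."""
    if len(red) != len(green):
        raise ValueError("channels must have the same number of probes")
    reference = [(r + g) / 2 for r, g in zip(sorted(red), sorted(green))]

    def remap(values):
        # Indices of the probes ordered from lowest to highest intensity.
        order = sorted(range(len(values)), key=values.__getitem__)
        out = [0.0] * len(values)
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        return out

    return remap(red), remap(green)


def log_ratios(red, green):
    """Per-SNP log2 ratio of the two allele channels."""
    return [math.log2(r / g) for r, g in zip(red, green)]


# Hypothetical intensities for four SNPs on one array, with a red dye bias.
red = [1200.0, 800.0, 3000.0, 1500.0]
green = [600.0, 450.0, 1400.0, 2100.0]

norm_red, norm_green = quantile_normalize(red, green)
print(log_ratios(red, green))
print(log_ratios(norm_red, norm_green))
```

After normalization both channels contain exactly the same set of values, so a log ratio away from zero reflects a change in a SNP's rank between channels rather than an overall dye effect.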

Apollo (Cambridge)

Archive ouverte UNIGE

Haploid transcriptome analysis reveals allelic gene expression variants, co-expressed gene groups, and linkages between expression and copy number variation

Author: B Lemos
Brian Boyle
C Fraley
Christian R Landry
CR Landry
D Kliebenstein
GK Smyth
Isabelle Giguère
John MacKay
Jukka-Pekka Verta
Sebastien Caron
W Hsieh
Publication venue: BioMed Central
Publication date: 01/01/2011

Transcriptome analyses of mouse and human mammary cell subpopulations reveal multiple conserved genes and pathways

Author: A Orimo
A Raouf
Bhupinder Pal
BP Schneider
C Ginestier
Di Wu
E Charafe-Jauffret
E Lim
Elgene Lim
EY Sum
F Vaillant
François Vaillant
Geoffrey J Lindeman
GK Smyth
Gordon K Smyth
H Kendrick
Hideo Yagita
J Stingl
Jane E Visvader
JE Visvader
JI Herschkowitz
JP Thiery
K Noto
KE Sleeman
LA Carey
M Shackleton
M Shipitsin
M Zhang
Marie-Liesse Asselin-Labat
ME Ritchie
MF Holick
ML Asselin-Labat
N Barker
P Eirew
R Villadsen
RC Gentleman
RD Cardiff
RW Cho
SR Oakes
SY Hsu
T Bouras
TA DiMeo
Toula Bouras
Publication venue: BioMed Central
Publication date: 01/01/2010

INTRODUCTION: Molecular characterization of the normal epithelial cell types that reside in the mammary gland is an important step toward understanding pathways that regulate self-renewal, lineage commitment, and differentiation along the hierarchy. Here we determined the gene expression signatures of four distinct subpopulations isolated from the mouse mammary gland. The epithelial cell signatures were used to interrogate mouse models of mammary tumorigenesis and to compare with their normal human counterpart subsets to identify conserved genes and networks.

METHODS: RNA was prepared from freshly sorted mouse mammary cell subpopulations (mammary stem cell (MaSC)-enriched, committed luminal progenitor, mature luminal and stromal cell) and used for gene expression profiling analysis on the Illumina platform. Gene signatures were derived and compared with those previously reported for the analogous normal human mammary cell subpopulations. The mouse and human epithelial subset signatures were then subjected to Ingenuity Pathway Analysis (IPA) to identify conserved pathways.

RESULTS: The four mouse mammary cell subpopulations exhibited distinct gene signatures. Comparison of these signatures with the molecular profiles of different mouse models of mammary tumorigenesis revealed that tumors arising in MMTV-Wnt-1 and p53-/- mice were enriched for MaSC-subset genes, whereas the gene profiles of MMTV-Neu and MMTV-PyMT tumors were most concordant with the luminal progenitor cell signature. Comparison of the mouse mammary epithelial cell signatures with their human counterparts revealed substantial conservation of genes, whereas IPA highlighted a number of conserved pathways in the three epithelial subsets.

CONCLUSIONS: The conservation of genes and pathways across species further validates the use of the mouse as a model to study mammary gland development and highlights pathways that are likely to govern cell-fate decisions and differentiation. It is noteworthy that many of the conserved genes in the MaSC population have been considered as epithelial-mesenchymal transition (EMT) signature genes. Therefore, the expression of these genes in tumor cells may reflect basal epithelial cell characteristics and not necessarily cells that have undergone an EMT. Comparative analyses of normal mouse epithelial subsets with murine tumor models have implicated distinct cell types in contributing to tumorigenesis in the different models.