
UCL Discovery

Analyzing Multiple-Probe Microarray: Estimation and Application of Gene Expression Indexes

Author: Hu Jianhua
Huang Jianhua Z.
Maadooliat Mehdi
Publication venue: e-Publications@Marquette
Publication date: 01/09/2012

Gene expression index estimation is an essential step in analyzing multiple-probe microarray data. Various modeling methods have been proposed in this area. Among these, a popular method proposed in Li and Wong (2001) is based on a multiplicative model, which is similar to the additive model discussed in Irizarry et al. (2003a) at the logarithm scale. Along this line, Hu et al. (2006) proposed data transformation to improve expression index estimation based on an ad hoc entropy criterion and a naive grid search approach. In this work, we re-examine this problem using a new profile likelihood-based transformation estimation approach that is more statistically elegant and computationally efficient. We demonstrate the applicability of the proposed method using a benchmark Affymetrix U95A spiked-in experiment. Moreover, we introduce a new multivariate expression index and use the empirical study to show its promise in terms of improving model fitting and the power of detecting differential expression over the commonly used univariate expression index. As the other important content of the work, we discuss two commonly encountered practical issues in the application of gene expression indexes: normalization and the summary statistic used for detecting differential expression. Our empirical study shows somewhat different findings from the MAQC project (MAQC, 2006).
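
The multiplicative model mentioned in this abstract can be illustrated with a minimal sketch. It assumes a probe-level matrix `y` with arrays in rows and probes in columns, models `y[i, j]` as `theta[i] * phi[j]`, and fits it by plain alternating least squares. The function name, the synthetic inputs, and the identifiability constraint (sum of squared probe effects equal to the number of probes) are assumptions made for illustration; this is not the profile likelihood-based transformation approach the authors propose.

```python
import numpy as np


def fit_multiplicative(y, n_iter=50):
    """Fit y[i, j] ~ theta[i] * phi[j] by alternating least squares.

    y     : 2-D array, rows are arrays (samples), columns are probes.
    theta : expression index per array.
    phi   : probe affinity per probe, rescaled so that sum(phi**2)
            equals the number of probes (an assumed constraint that
            makes the factorisation identifiable).
    """
    y = np.asarray(y, dtype=float)
    n_probes = y.shape[1]
    phi = np.ones(n_probes)
    for _ in range(n_iter):
        # Least-squares update of theta with phi held fixed.
        theta = y @ phi / (phi @ phi)
        # Least-squares update of phi with theta held fixed.
        phi = theta @ y / (theta @ theta)
        # Re-impose the scale constraint on phi.
        phi *= np.sqrt(n_probes) / np.linalg.norm(phi)
    # Final theta update so that theta matches the rescaled phi.
    theta = y @ phi / (phi @ phi)
    return theta, phi
```

On noise-free positive data, taking logarithms of the fitted values gives `log(theta[i]) + log(phi[j])`, which is the sense in which the multiplicative model resembles an additive model at the logarithm scale.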

epublications@Marquette

University of Essex Research Repository

Physico-chemical foundations underpinning microarray and next-generation sequencing experiments

Author: A. Buhot
A. E. Pozhitkov
A. Halperin
A. Harrison
A. Ott
B. M. Pettitt
C. Gibas
C. J. Burden
D. P. Kreil
D. Tautz
E. Carlon
H. Binder
J. Hooyberghs
L. J. Gamble
P. A. Noble
R. Levicky
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2013

Hybridization of nucleic acids on solid surfaces is a key process involved in high-throughput technologies such as microarrays and, in some cases, next-generation sequencing (NGS). A physical understanding of the hybridization process helps to determine the accuracy of these technologies. The goal of a widespread research program is to develop reliable transformations between the raw signals reported by the technologies and individual molecular concentrations from an ensemble of nucleic acids. This research has inputs from many areas, from bioinformatics and biostatistics, to theoretical and experimental biochemistry and biophysics, to computer simulations. A group of leading researchers met in Ploen, Germany, in 2011 to discuss present knowledge and limitations of our physico-chemical understanding of high-throughput nucleic acid technologies. This meeting inspired us to write this summary, which provides an overview of state-of-the-art approaches, based on physico-chemical foundations, to modeling the nucleic acid hybridization process on solid surfaces. In addition, practical application of current knowledge is emphasized.

Hal - Université Grenoble Alpes

The Australian National University

HAL-CEA

MPG.PuRe

Can Zipf's law be adapted to normalize microarrays?

Author: Costello Christine M
Croucher Peter JP
Deuschl Günther
Häsler Robert
Lu Tim
Schreiber Stefan
Publication venue: BioMed Central
Publication date: 01/01/2005

BACKGROUND: Normalization is the process of removing non-biological sources of variation between array experiments. Recent investigations of data in gene expression databases for varying organisms and tissues have shown that the majority of expressed genes exhibit a power-law distribution with an exponent close to -1 (i.e. obey Zipf's law). Based on the observation that our single channel and two channel microarray data sets also followed a power-law distribution, we were motivated to develop a normalization method based on this law, and examine how it compares with existing published techniques. A computationally simple and intuitively appealing technique based on this observation is presented. RESULTS: Using pairwise comparisons in MA plots (log ratio vs. log intensity), we compared this novel method to previously published normalization techniques, namely global normalization to the mean, the quantile method, and a variation on the loess normalization method designed specifically for boutique microarrays. Results indicated that, for single channel microarrays, the quantile method was superior with regard to eliminating intensity-dependent effects (banana curves), but Zipf's law normalization does minimize this effect by rotating the data distribution such that the maximal number of data points lie on the zero of the log ratio axis. For two channel boutique microarrays, the Zipf's law normalizations performed as well as, or better than, existing techniques. CONCLUSION: Zipf's law normalization is a useful tool where the quantile method cannot be applied, as is the case with microarrays containing functionally specific gene sets (boutique arrays).
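
Three of the quantities named in this abstract can be sketched in a few lines: the power-law exponent of a rank-intensity distribution, the M and A values behind an MA plot, and quantile normalization. This is a hedged illustration only. The function names are hypothetical, ties are ignored in the quantile step, and none of it reproduces the authors' Zipf's law normalization itself, whose details the abstract does not give.

```python
import numpy as np


def zipf_exponent(intensities):
    """Slope of log(intensity) against log(rank), largest value ranked 1.

    A slope close to -1 is what Zipf's law predicts.
    """
    values = np.sort(np.asarray(intensities, dtype=float))[::-1]
    values = values[values > 0]  # the logarithm needs positive values
    ranks = np.arange(1, values.size + 1)
    slope, _intercept = np.polyfit(np.log(ranks), np.log(values), 1)
    return slope


def ma_values(x, y):
    """Return (M, A) for two positive intensity vectors.

    M is the log2 ratio and A is the mean log2 intensity.
    """
    lx = np.log2(np.asarray(x, dtype=float))
    ly = np.log2(np.asarray(y, dtype=float))
    return lx - ly, (lx + ly) / 2.0


def quantile_normalize(x):
    """Give every column (array) of x the same empirical distribution.

    Each column is sorted, the sorted columns are averaged row by row,
    and the averages are written back in each column's original order.
    """
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, axis=0)
    sorted_x = np.take_along_axis(x, order, axis=0)
    mean_quantiles = sorted_x.mean(axis=1)
    out = np.empty_like(x)
    np.put_along_axis(out, order, mean_quantiles[:, None], axis=0)
    return out
```

The quantile step relies on every array sharing one intensity distribution, an assumption that is hard to justify for the boutique arrays discussed in the abstract.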

Central Archive at the University of Reading

Using genomic DNA-based probe-selection to improve the sensitivity of high-density oligonucleotide arrays when applied to heterologous species

Author: Broadley Martin R.
Craigon David J.
Emmerson Zoe F.
Hammond John P.
Higgins Janet
May Sean T.
Townsend Henrik J.
White Philip J.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2005

High-density oligonucleotide (oligo) arrays are a powerful tool for transcript profiling. Arrays based on GeneChip® technology are amongst the most widely used, although GeneChip® arrays are currently available for only a small number of plant and animal species. Thus, we have developed a method to improve the sensitivity of high-density oligonucleotide arrays when applied to heterologous species and tested the method by analysing the transcriptome of Brassica oleracea L., a species for which no GeneChip® array is available, using a GeneChip® array designed for Arabidopsis thaliana (L.) Heynh. Genomic DNA from B. oleracea was labelled and hybridised to the ATH1-121501 GeneChip® array. Arabidopsis thaliana probe-pairs that hybridised to the B. oleracea genomic DNA on the basis of the perfect-match (PM) probe signal were then selected for subsequent B. oleracea transcriptome analysis using a .cel file parser script to generate probe mask files. The transcriptional response of B. oleracea to a mineral nutrient (phosphorus; P) stress was quantified using probe mask files generated for a wide range of gDNA hybridisation intensity thresholds. An example probe mask file generated with a gDNA hybridisation intensity threshold of 400 removed >68% of the available PM probes from the analysis but retained >96% of available A. thaliana probe-sets. Ninety-nine of these genes were then identified as significantly regulated under P stress in B. oleracea, including the homologues of P stress responsive genes in A. thaliana. Increasing the gDNA hybridisation intensity thresholds up to 500 for probe-selection increased the sensitivity of the GeneChip® array to detect regulation of gene expression in B. oleracea under P stress by up to 13-fold.
Our open-source software to create probe mask files is freely available at http://affymetrix.arabidopsis.info/xspecies/ and may be used to facilitate transcriptomic analyses of a wide range of plant and animal species in the absence of custom arrays.
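
The probe-selection step described above can be sketched as a simple threshold filter. The data layout (a dictionary from probe-set identifier to a list of PM intensities from the gDNA hybridisation), the function name, and the use of a strict "greater than" comparison are all assumptions for illustration; this is not the authors' .cel file parser script.

```python
def build_probe_mask(gdna_pm, threshold):
    """Select PM probes whose gDNA hybridisation intensity exceeds a threshold.

    gdna_pm   : dict mapping probe-set id -> list of PM intensities
                measured when genomic DNA is hybridised to the array.
    threshold : intensity cut-off (the abstract reports values such as 400).

    Returns a dict mapping probe-set id -> indices of retained probes.
    Probe-sets with no retained probe are dropped from the mask.
    """
    mask = {}
    for probe_set, intensities in gdna_pm.items():
        kept = [idx for idx, value in enumerate(intensities) if value > threshold]
        if kept:
            mask[probe_set] = kept
    return mask


def retained_fractions(gdna_pm, mask):
    """Return (fraction of probes kept, fraction of probe-sets kept)."""
    total_probes = sum(len(values) for values in gdna_pm.values())
    kept_probes = sum(len(indices) for indices in mask.values())
    return kept_probes / total_probes, len(mask) / len(gdna_pm)
```

Because a probe-set survives as long as one of its probes passes, a threshold can discard most probes while keeping most probe-sets, which is the pattern the abstract reports for a threshold of 400.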

Public Library of Science (PLOS)

Probing host pathogen cross-talk by transcriptional profiling of both Mycobacterium tuberculosis and infected human dendritic cells and macrophages

Author: Alessandra Mortellaro
Antoine Tanne
Brigitte Gicquel
Derya Unutmaz
Ludovic Tailleux
Maria Foti
Mattia Pelizzola
Michael Withers
Neil G. Stoker
Olivier Neyrolles
Paola Ricciardi Castagnoli
Philip D. Butcher
Simon J. Waddell
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2008

This study provides proof of principle that probing the host and microbe transcriptomes simultaneously is a valuable means of accessing unique information on host-pathogen interactions. Our results also underline the extraordinary plasticity of host cell and pathogen responses to infection, and provide a solid framework to further understand the complex mechanisms involved in immunity to M. tuberculosis and in mycobacterial adaptation to different intracellular environments.

St George's Online Research Archive

Sussex Research Online

HAL-Pasteur

ScholarBank@NUS

Motif effects in Affymetrix GeneChips seriously affect probe intensities

Author: Andrew P. Harrison
Graham J. G. Upton
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2012

An Affymetrix GeneChip consists of an array of hundreds of thousands of probes (each a sequence of 25 bases) with the probe values being used to infer the extent to which genes are expressed in the biological material under investigation. In this article, we demonstrate that these probe values are also strongly influenced by their precise base sequence. We use data from >28 000 CEL files relating to 10 different Affymetrix GeneChip platforms and involving nearly 1000 experiments. Our results confirm known effects (those due to the T7-primer and the formation of G-quadruplexes) but reveal other effects. We show that there can be huge variations from one experiment to another, and that there may also be sizeable disparities between batches within an experiment and between CEL files within a batch. © 2012 The Author(s)

University of Essex Research Repository

CiteSeerX

Harvard University - DASH


Evaluation of Normalization Procedures for Oligonucleotide Array Data Based On Spiked cRNA Controls

Author: Brown Eugene L
Hill Andrew A.
Hunter Craig P.
Slonim Donna K
Tucker-Kellogg Greg
Whitley Maryann Z
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 05/10/2010

Background: Affymetrix oligonucleotide arrays simultaneously measure the abundances of thousands of mRNAs in biological samples. Comparability of array results is necessary for the creation of large-scale gene expression databases. The standard strategy for normalizing oligonucleotide array readouts has practical drawbacks. We describe alternative normalization procedures for oligonucleotide arrays based on a common pool of known biotin-labeled cRNAs spiked into each hybridization. Results: We first explore the conditions for validity of the 'constant mean assumption', the key assumption underlying current normalization methods. We introduce 'frequency normalization', a 'spike-in'-based normalization method which estimates array sensitivity, reduces background noise and allows comparison between array designs. This approach does not rely on the constant mean assumption and so can be effective in conditions where standard procedures fail. We also define 'scaled frequency', a hybrid normalization method relying on both spiked transcripts and the constant mean assumption while maintaining all other advantages of frequency normalization. We compare these two procedures to a standard global normalization method using experimental data. We also use simulated data to estimate accuracy and investigate the effects of noise. We find that scaled frequency is as reproducible and accurate as global normalization while offering several practical advantages. Conclusions: Scaled frequency quantitation is a convenient, reproducible technique that performs as well as global normalization on serial experiments with the same array design, while offering several additional features. Specifically, the scaled-frequency method enables the comparison of expression measurements across different array designs, yields estimates of absolute message abundance in cRNA and determines the sensitivity of individual arrays.
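
The contrast between the two normalization ideas in this abstract can be shown with a deliberately simplified sketch. Global normalization rescales each array to a common mean, which is where the constant mean assumption enters. A spike-in calibration instead uses transcripts of known abundance to convert intensity to frequency. The function names, the straight-line-through-the-origin calibration, and the example units are assumptions for illustration; the paper's actual 'frequency' and 'scaled frequency' procedures are not specified in the abstract and are not reproduced here.

```python
import numpy as np


def global_scale(intensities, target_mean):
    """Rescale one array so that its mean intensity equals target_mean.

    This relies on the constant mean assumption: the true average
    expression is taken to be the same on every array.
    """
    intensities = np.asarray(intensities, dtype=float)
    return intensities * (target_mean / intensities.mean())


def spike_calibration_slope(spike_intensity, spike_frequency):
    """Least-squares slope of intensity against known spike frequency.

    Assumes a linear response through the origin; the slope serves as
    a simple per-array sensitivity estimate.
    """
    intensity = np.asarray(spike_intensity, dtype=float)
    frequency = np.asarray(spike_frequency, dtype=float)
    return float(frequency @ intensity / (frequency @ frequency))


def to_frequency(intensities, slope):
    """Convert intensities to estimated frequencies with the fitted slope."""
    return np.asarray(intensities, dtype=float) / slope
```

In this sketch the spike-based conversion never looks at the array-wide mean, so it does not depend on the constant mean assumption, at the cost of depending entirely on the quality of the spiked controls.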

Optimising the analysis of transcript data using high density oligonucleotide arrays and genomic DNA-based probe selection

Author: Broadley Martin R.
Graham Neil S.
Hammond John P.
May Sean T.
White Philip J.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2007

Background: Affymetrix GeneChip arrays are widely used for transcriptomic studies in a diverse range of species. Each gene is represented on a GeneChip array by a probe-set, consisting of up to 16 probe-pairs. Signal intensities across probe-pairs within a probe-set vary in part due to different physical hybridisation characteristics of individual probes with their target labelled transcripts. We have previously developed a technique to study the transcriptomes of heterologous species based on hybridising genomic DNA (gDNA) to a GeneChip array designed for a different species, and subsequently using only those probes with good homology. Results: Here we have investigated the effects of hybridising homologous species gDNA to study the transcriptomes of species for which the arrays have been designed. Genomic DNA from Arabidopsis thaliana and rice (Oryza sativa) was hybridised to the Affymetrix Arabidopsis ATH1 and Rice Genome GeneChip arrays, respectively. Probe selection based on gDNA hybridisation intensity increased the number of genes identified as significantly differentially expressed in two published studies of Arabidopsis development, and optimised the analysis of technical replicates obtained from pooled samples of RNA from rice. Conclusion: This mixed physical and bioinformatics approach can be used to optimise estimates of gene expression when using GeneChip arrays.

Central Archive at the University of Reading