A new non-linear normalization method for reducing variability in DNA microarray experiments

Author: Berka Randy
Brunak Søren
Gautier Laurent
Jarmer Hanne
Jensen Lars Juhl
Knudsen Steen
Nielsen Claus
Nielsen Henrik Bjørn
Saxild Hans-Henrik
Workman Christopher
Publication venue: BioMed Central
Publication date: 01/01/2002

BACKGROUND: Microarray data are subject to multiple sources of variation, of which biological sources are of interest whereas most others are only confounding. Recent work has identified systematic sources of variation that are intensity-dependent and non-linear in nature. Systematic sources of variation are not limited to the differing properties of the cyanine dyes Cy5 and Cy3 as observed in cDNA arrays, but are the general case for both oligonucleotide microarray (Affymetrix GeneChips) and cDNA microarray data. Current normalization techniques are most often linear and therefore not capable of fully correcting for these effects. RESULTS: We present here a simple and robust non-linear method for normalization using array signal distribution analysis and cubic splines. These methods compared favorably to normalization using robust local-linear regression (lowess). The application of these methods to oligonucleotide arrays reduced the relative error between replicates by 5-10% compared with a standard global normalization method. Application to cDNA arrays showed improvements over the standard method and over Cy3-Cy5 normalization based on dye-swap replication. In addition, a set of known differentially regulated genes was ranked higher by the t-test. In either cDNA or Affymetrix technology, signal-dependent bias was more than ten times greater than the observed print-tip or spatial effects. CONCLUSIONS: Intensity-dependent normalization is important for both high-density oligonucleotide array and cDNA array data. Both the regression and spline-based methods described here performed better than existing linear methods when assessed on the variability of replicate arrays. Dye-swap normalization was less effective at Cy3-Cy5 normalization than either regression or spline-based methods alone
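
The contrast drawn above between a global (linear) normalization and an intensity-dependent one can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract describes cubic splines fitted to the array signal distribution, whereas the sketch below substitutes piecewise-linear interpolation between matched quantiles as a simpler stand-in. The function names, the `n_quantiles` parameter, and the choice of a reference array are all assumptions made for illustration.

```python
from bisect import bisect_right
from statistics import mean, quantiles


def global_normalize(values, target_mean):
    """Linear normalization: rescale so the array mean equals target_mean."""
    scale = target_mean / mean(values)
    return [v * scale for v in values]


def _interpolate(x, xs, ys):
    """Piecewise-linear interpolation of x through the knots (xs, ys)."""
    i = max(1, min(bisect_right(xs, x), len(xs) - 1))
    x0, x1 = xs[i - 1], xs[i]
    if x1 == x0:  # tied knots: avoid division by zero
        return ys[i]
    t = (x - x0) / (x1 - x0)
    return ys[i - 1] + t * (ys[i] - ys[i - 1])


def quantile_map_normalize(values, reference, n_quantiles=10):
    """Intensity-dependent normalization (stand-in for a spline fit).

    Matches quantiles of `values` to the same quantiles of `reference`
    and maps every value through the resulting piecewise-linear curve.
    """
    xs = [min(values)] + quantiles(values, n=n_quantiles, method="inclusive") + [max(values)]
    ys = [min(reference)] + quantiles(reference, n=n_quantiles, method="inclusive") + [max(reference)]
    return [_interpolate(v, xs, ys) for v in values]


if __name__ == "__main__":
    array = list(range(1, 102))
    # Hypothetical reference with a non-linear, intensity-dependent bias.
    reference = [v ** 2 for v in array]
    print(global_normalize([1.0, 2.0, 3.0], target_mean=4.0))
    print(quantile_map_normalize(array, reference)[:3])
```

A single scale factor cannot follow a bias that changes with intensity, while the quantile-matched curve can bend to it; a cubic spline through the same knots would give a smoother mapping.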

CiteSeerX

PubMed Central

Online Research Database In Technology

A novel normalization method for effective removal of systematic variation in microarray data

Author: Chua Su-Wen
Nissom Peter M.
Vijayakumar Praveen
Wong Victor V.T.
Yam Chew-Yeam
Yang He
Publication venue: Oxford University Press
Publication date: 09/03/2006

Normalization of cDNA and oligonucleotide microarray data has become a standard procedure to offset non-biological differences between two samples for accurate identification of differentially expressed genes. Although there are many normalization techniques available, their ability to accurately remove systematic variation has not been sufficiently evaluated. In this study, we performed experimental validation of various normalization methods in order to assess their ability to accurately offset non-biological differences (systematic variation). The limitations of many existing normalization methods become apparent when there are unbalanced shifts in transcript levels. To overcome this limitation, we have proposed a novel normalization method that uses a matching algorithm for the distribution peaks of the expression log ratio. The robustness and effectiveness of this method were evaluated using both experimental and simulated data.

Crossref

PubMed Central

Physico-chemical foundations underpinning microarray and next-generation sequencing experiments

Author: A. Buhot
A. E. Pozhitkov
A. Halperin
A. Harrison
A. Ott
Amend
B. M. Pettitt
Berger
Binder
Bolstad
Bullard
Burden
C. Gibas
C. J. Burden
Chou
Czypionka
D. P. Kreil
D. Tautz
E. Carlon
Fasold
Fiche
Fuchs
H. Binder
Halperin
Harr
Harrison
He
Heim
Held
Hooyberghs
Huettel
Iltumur
Irizarry
Irving
J. Hooyberghs
Kane
L. J. Gamble
Lee
Letowski
Li
Liebich
Lockhart
Luebke
Marshall
Matveeva
Mueckstein
Mulders
Naiser
P. A. Noble
Pingel
Pozhitkov
R. Levicky
Relogio
Rouillard
Tanaka
Trapp
Upton
Vainrub
Wodicka
Yu
Zhang
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2013

Hybridization of nucleic acids on solid surfaces is a key process involved in high-throughput technologies such as microarrays and, in some cases, next-generation sequencing (NGS). A physical understanding of the hybridization process helps to determine the accuracy of these technologies. The goal of a widespread research program is to develop reliable transformations between the raw signals reported by the technologies and individual molecular concentrations from an ensemble of nucleic acids. This research has inputs from many areas, from bioinformatics and biostatistics, to theoretical and experimental biochemistry and biophysics, to computer simulations. A group of leading researchers met in Ploen, Germany, in 2011 to discuss present knowledge and limitations of our physico-chemical understanding of high-throughput nucleic acid technologies. This meeting inspired us to write this summary, which provides an overview of state-of-the-art approaches, based on physico-chemical foundations, to modeling the nucleic acid hybridization process on solid surfaces. In addition, practical application of current knowledge is emphasized.

University of Essex Research Repository

Crossref

Hal - Université Grenoble Alpes

PubMed Central

Warwick Research Archives Portal Repository

The Australian National University

HAL-CEA

MPG.PuRe

Can Zipf's law be adapted to normalize microarrays?

Author: Costello Christine M
Croucher Peter JP
Deuschl Günther
Häsler Robert
Lu Tim
Schreiber Stefan
Publication venue: BioMed Central
Publication date: 01/01/2005

BACKGROUND: Normalization is the process of removing non-biological sources of variation between array experiments. Recent investigations of data in gene expression databases for varying organisms and tissues have shown that the majority of expressed genes exhibit a power-law distribution with an exponent close to -1 (i.e. obey Zipf's law). Based on the observation that our single channel and two channel microarray data sets also followed a power-law distribution, we were motivated to develop a normalization method based on this law, and examine how it compares with existing published techniques. A computationally simple and intuitively appealing technique based on this observation is presented. RESULTS: Using pairwise comparisons on MA plots (log ratio vs. log intensity), we compared this novel method to previously published normalization techniques, namely global normalization to the mean, the quantile method, and a variation on the loess normalization method designed specifically for boutique microarrays. Results indicated that, for single channel microarrays, the quantile method was superior with regard to eliminating intensity-dependent effects (banana curves), but Zipf's law normalization does minimize this effect by rotating the data distribution such that the maximal number of data points lie on the zero of the log ratio axis. For two channel boutique microarrays, the Zipf's law normalizations performed as well as, or better than, existing techniques. CONCLUSION: Zipf's law normalization is a useful tool where the quantile method cannot be applied, as is the case with microarrays containing functionally specific gene sets (boutique arrays).
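
Two quantities mentioned above lend themselves to a short sketch: the MA-plot coordinates (log ratio M against mean log intensity A) and the power-law exponent of a rank-ordered intensity distribution. The code below is only a diagnostic illustration, not the Zipf's law normalization method itself, whose details are not given in the abstract. The function names and the use of an ordinary least-squares fit on log rank vs. log intensity are assumptions.

```python
import math


def ma_values(x, y):
    """Return (M, A) for paired positive intensities x and y.

    M is the log2 ratio and A is the mean log2 intensity of each pair.
    """
    m = [math.log2(a) - math.log2(b) for a, b in zip(x, y)]
    a_mean = [0.5 * (math.log2(a) + math.log2(b)) for a, b in zip(x, y)]
    return m, a_mean


def power_law_exponent(intensities):
    """Least-squares slope of log(intensity) against log(rank).

    Intensities are ranked from largest to smallest; a slope close
    to -1 is what Zipf's law would predict. Needs at least two values.
    """
    ranked = sorted(intensities, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(value) for value in ranked]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


if __name__ == "__main__":
    m, a = ma_values([8.0, 4.0], [2.0, 4.0])
    print(m, a)
    # Synthetic intensities that follow an exact 1/rank law.
    print(power_law_exponent([1000.0 / rank for rank in range(1, 51)]))
```

On an MA plot, a well-normalized pair of arrays scatters around M = 0 at every A; a curved trend is the "banana" shape referred to in the abstract.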

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Error, reproducibility and sensitivity : a pipeline for data processing of Agilent oligonucleotide expression arrays

Author: AR Dabney
Benjamin Chain
BM Bolstad
BP Durbin
BS Everitt
CR Hampton
D Wang
E Birney
Helen Bowen
J Fan
J Rasaiyaah
Jane Rasaiyaah
Jhen Tsang
John Hammond
JP Hammond
L Shi
M Noursadeghi
M Sultan
Mahdad Noursadeghi
MN McCall
PA 't Hoen
TA Patterson
TC Kroll
WE Johnson
Wilfried Posch
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010

Background: Expression microarrays are increasingly used to obtain large scale transcriptomic information on a wide range of biological samples. Nevertheless, there is still much debate on the best ways to process data, to design experiments and to analyse the output. Furthermore, many of the more sophisticated mathematical approaches to data analysis in the literature remain inaccessible to much of the biological research community. In this study we examine ways of extracting and analysing a large data set obtained using the Agilent long oligonucleotide transcriptomics platform, applied to a set of human macrophage and dendritic cell samples. Results: We describe and validate a series of data extraction, transformation and normalisation steps which are implemented via a new R function. Analysis of replicate normalised reference data demonstrates that intra-array variability is small (only around 2% of the mean log signal), while inter-array variability from replicate array measurements has a standard deviation (SD) of around 0.5 log2 units (6% of mean). The common practice of working with ratios of Cy5/Cy3 signal offers little further improvement in terms of reducing error. Comparison to expression data obtained using Arabidopsis samples demonstrates that the large number of genes in each sample showing a low level of transcription reflects the real complexity of the cellular transcriptome. Multidimensional scaling is used to show that the processed data identify an underlying structure which reflects some of the key biological variables which define the data set. This structure is robust, allowing reliable comparison of samples collected over a number of years and by a variety of operators. Conclusions: This study outlines a robust and easily implemented pipeline for extracting, transforming, normalising and visualising transcriptomic array data from the Agilent expression platform. The analysis is used to obtain quantitative estimates of the SD arising from experimental (non-biological) intra- and inter-array variability, and a lower threshold for determining whether an individual gene is expressed. The study provides a reliable basis for further, more extensive studies of the systems biology of eukaryotic cells.

Central Archive at the University of Reading

Crossref

Springer - Publisher Connector

UCL Discovery

PubMed Central

Warwick Research Archives Portal Repository

Normalized Affymetrix expression data are biased by G-quadruplex formation

Author: Altman
Andrew P. Harrison
Barrett
Bolstad
Burge
Cambon
Do
Dudoit
Eisen
Farhat N. Memon
Geller
Gellert
Giorgi
Graham J. G. Upton
Hammond
Harris
Hochreiter
Hubbell
Hugh P. Shanahan
Irizarry
Iwamoto
Kittleson
Langdon
Li
Memon
Naef
Patterson
Ringnér
Ryan
Sen
Stalteri
Upton
Walton
Wu
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2011

Probes with runs of four or more guanines (G-stacks) in their sequences can exhibit a level of hybridization that is unrelated to the expression levels of the mRNA that they are intended to measure. This is most likely caused by the formation of G-quadruplexes, in which guanines from neighbouring probes containing G-stacks form Hoogsteen hydrogen bonds with one another. We demonstrate that for a specific microarray data set using the Human HG-U133A Affymetrix GeneChip and RMA normalization there is significant bias in the expression levels, the fold change and the correlations between expression levels. These effects grow more pronounced as the number of G-stack probes in a probe set increases. Approximately 14% of the probe sets are directly affected. The analysis was repeated for a number of other normalization pipelines and two, FARMS and PLIER, minimized the bias to some extent. We estimate that ∼15% of the data sets deposited in the GEO database are susceptible to the effect. The inclusion of G-stack probes in the affected data sets can bias key parameters used in the selection and clustering of genes. The benefit of eliminating these probes from any analysis of such affected data sets outweighs the resulting increase of noise in the signal.
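
The G-stack definition given above (a run of four or more guanines in a probe sequence) is straightforward to check programmatically. The following is a minimal sketch under that definition only; the function names are hypothetical, the example sequences are invented, and the code does not reproduce the authors' analysis of HG-U133A probe sets.

```python
import re

# A G-stack is taken here to be four or more consecutive guanines,
# following the definition in the abstract.
G_STACK = re.compile(r"G{4,}")


def has_g_stack(probe_sequence: str) -> bool:
    """Return True if the probe sequence contains a G-stack."""
    return G_STACK.search(probe_sequence.upper()) is not None


def count_g_stack_probes(probe_set) -> int:
    """Count how many probes in a probe set contain a G-stack."""
    return sum(has_g_stack(seq) for seq in probe_set)


def filter_probe_set(probe_set):
    """Drop G-stack probes, keeping the remaining probes in order."""
    return [seq for seq in probe_set if not has_g_stack(seq)]


if __name__ == "__main__":
    # Invented example sequences, not real Affymetrix probes.
    probes = ["ACGGGGTACCA", "ACGGGTGACCA", "TTGGGGGCATA"]
    print(count_g_stack_probes(probes))
    print(filter_probe_set(probes))
```

Filtering before summarization trades a smaller number of probes per probe set for the removal of a signal that does not track mRNA abundance.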

University of Essex Research Repository

CiteSeerX

Royal Holloway Research Online

Crossref

Royal Holloway - Pure

PubMed Central

Analyzing Multiple-Probe Microarray: Estimation and Application of Gene Expression Indexes

Author: Hu Jianhua
Huang Jianhua Z.
Maadooliat Mehdi
Publication venue: e-Publications@Marquette
Publication date: 01/09/2012

Gene expression index estimation is an essential step in analyzing multiple-probe microarray data. Various modeling methods have been proposed in this area. Among these, a popular method proposed in Li and Wong (2001) is based on a multiplicative model, which is similar to the additive model discussed in Irizarry et al. (2003a) at the logarithm scale. Along this line, Hu et al. (2006) proposed data transformation to improve expression index estimation based on an ad hoc entropy criterion and a naive grid search approach. In this work, we re-examined this problem using a new profile likelihood-based transformation estimation approach that is more statistically elegant and computationally efficient. We demonstrate the applicability of the proposed method using a benchmark Affymetrix U95A spiked-in experiment. Moreover, we introduced a new multivariate expression index and used the empirical study to show its promise in terms of improving model fitting and the power of detecting differential expression over the commonly used univariate expression index. We also discussed two commonly encountered practical issues in the application of gene expression indexes: normalization and the summary statistic used for detecting differential expression. Our empirical study shows somewhat different findings from the MAQC project (MAQC, 2006).

epublications@Marquette

PubMed Central

A Genome-Wide Analysis Reveals Significant Overlap of Transcription and DNA Repair in Stationary Phase Yeast

Author: Abraham Korol
Aviv de Morgan
Eviatar Nevo
Leonid Brodsky
Yechezkel Kashi
Yefim Ronin
Publication venue
Publication date: 27/01/2008

The association between transcription and DNA repair is acknowledged as a player in the generation of mutations in a non-random fashion in prokaryotes and eukaryotes. Previous studies demonstrated that the transcription complex is capable of directing DNA repair to sites of transcription. This process is especially important to growth-arrested cells, in which many DNA repair capacities are diminished; it may also lead to mutations preferentially in transcribed genes. Using microarray analysis of growth-arrested yeast cultures, we demonstrated, on a genomic scale, the co-localization of a DNA-turnover marker, indicative of DNA-repair-associated DNA synthesis, with genes persistently transcribed during stationary phase. This may serve as a clue regarding the non-random manner in which non-dividing cells may potentially mutate in the absence of replication, solely as a result of their inherent, transcriptional stress response.

Nature Precedings

Profound effect of profiling platform and normalization strategy on detection of differentially expressed microRNAs

Author: Kaiser Sebastian
Meyer Swanhild U.
Pfaffl Michael W.
Thirion Christian
Wagner Carola
Publication venue: Ludwig-Maximilians-Universität München
Publication date: 01/01/2012

Adequate normalization minimizes the effects of systematic technical variations and is a prerequisite for detecting meaningful biological changes. However, reported miRNA normalization performances and recommendations are inconsistent. Thus, we investigated the impact of seven different normalization methods (reference gene index (RGI), global geometric mean, quantile, invariant selection, loess, loessM, and generalized procrustes analysis (GPA)) on intra- and inter-platform performance of two distinct and commonly used miRNA profiling platforms. We included data from miRNA profiling analyses derived from a hybridization-based platform (Agilent Technologies, AGL) and an RT-qPCR platform (Applied Biosystems, TLDA). Furthermore, we validated a subset of miRNAs by individual RT-qPCR assays. Our analyses incorporated data from the effect of differentiation and tumor necrosis factor alpha treatment on primary human skeletal muscle cells and a murine skeletal muscle cell line. Distinct normalization methods differed in their impact on (i) standard deviations, (ii) the area under the receiver operating characteristic (ROC) curve, and (iii) the similarity of differential expression. Loess, loessM, and quantile analysis were most effective in minimizing standard deviations on the Agilent and TLDA platforms. Moreover, loess, loessM, invariant selection and generalized procrustes analysis increased the area under the ROC curve, a measure of the statistical performance of a test. The Jaccard index revealed that inter-platform concordance of differential expression tended to be increased by loess, loessM, quantile, and GPA normalization of AGL and TLDA data as well as RGI normalization of TLDA data.
We recommend the application of loess, or loessM, and GPA normalization for miRNA Agilent arrays and qPCR cards, as these normalization approaches were shown to (i) effectively reduce standard deviations, (ii) increase sensitivity and accuracy of differential miRNA expression detection and (iii) increase inter-platform concordance. Results showed the successful adoption of loessM and generalized procrustes analysis for one-color miRNA profiling experiments.
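
Two of the tools named above, quantile normalization and the Jaccard index, are simple enough to sketch. The code below is an illustrative, standard-library-only version and not the pipeline used in the study; the function names are hypothetical, ties are broken by position rather than averaged, and the miRNA identifiers in the usage example are invented.

```python
def quantile_normalize(arrays):
    """Quantile-normalize equal-length arrays to a common distribution.

    Each value is replaced by the mean, across arrays, of the values
    that share its rank. Ties are broken by position (a simplification).
    """
    n = len(arrays[0])
    if any(len(a) != n for a in arrays):
        raise ValueError("all arrays must have the same length")
    sorted_arrays = [sorted(a) for a in arrays]
    rank_means = [sum(s[i] for s in sorted_arrays) / len(arrays) for i in range(n)]
    normalized = []
    for a in arrays:
        order = sorted(range(n), key=lambda i: a[i])
        out = [0.0] * n
        for rank, index in enumerate(order):
            out[index] = rank_means[rank]
        normalized.append(out)
    return normalized


def jaccard_index(first, second):
    """Size of the intersection divided by the size of the union."""
    a, b = set(first), set(second)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


if __name__ == "__main__":
    print(quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]))
    # Invented identifiers standing in for differentially expressed miRNAs
    # called on two platforms.
    print(jaccard_index({"miR-a", "miR-b", "miR-c"}, {"miR-b", "miR-c", "miR-d"}))
```

A Jaccard index of 1 would mean both platforms call exactly the same set of differentially expressed miRNAs, and 0 would mean the sets do not overlap.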

Open Access LMU