A robust measure of correlation between two genes on a microarray

Author: A Beaton
Aya Mitani
B Zhang
Brian VanKoten
C Brown
C Glasbey
D Jiang
D Rocke
DM Rocke
E Hubbel
E Marshall
E Schadt
F Mosteller
G Davidson
H Lopuhaä
HP Lopuhaa
I Gat-Vilks
J Ioannidis
J Qin
J Tukey
Johanna Hardin
K Kafadar
K Yeung
L Dodd
L Heyer
Leanne Hicks
M Eisen
P Huber
P Rousseeuw
P Spellman
R Wilcox
S Bergmann
S Carter
S Chu
S Datta
S Dudoit
T Golub
Toxicogenomics Research Consortium
X Wang
Y Yang
Z Bar-Joseph
Publication venue: BioMed Central
Publication date: 01/01/2007

Abstract

Background: The underlying goal of microarray experiments is to identify gene expression patterns across different experimental conditions. Genes that are contained in a particular pathway or that respond similarly to experimental conditions could be co-expressed and show similar patterns of expression on a microarray. Using any of a variety of clustering methods or gene network analyses, we can partition genes of interest into groups, clusters, or modules based on measures of similarity. Typically, Pearson correlation is used to measure distance (or similarity) before implementing a clustering algorithm. Pearson correlation is quite susceptible to outliers, however, which is an unfortunate characteristic when dealing with microarray data (well known to be typically quite noisy).

Results: We propose a resistant similarity metric based on Tukey's biweight estimate of multivariate scale and location. The resistant metric is simply the correlation obtained from a resistant covariance matrix of scale. We give results which demonstrate that our correlation metric is much more resistant than the Pearson correlation while being more efficient than other nonparametric measures of correlation (e.g., Spearman correlation). Additionally, our method gives a systematic gene flagging procedure which is useful when dealing with large amounts of noisy data.

Conclusion: When dealing with microarray data, which are known to be quite noisy, robust methods should be used. Specifically, robust distances, including the biweight correlation, should be used in clustering and gene network analysis.
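As a rough illustration of why a biweight-based correlation resists outliers, the sketch below computes the biweight midcorrelation, a simpler univariate relative of the estimator described in the abstract. It is not the authors' method, which is built on a multivariate biweight estimate of location and scale. The function names, the tuning constant c = 9, and the toy data are assumptions chosen for illustration.

```python
import numpy as np


def _biweight_normalized(x, c=9.0):
    """Return median-centered, biweight-weighted values scaled to unit norm.

    Hypothetical helper: c = 9 is a conventional tuning constant, not a
    value taken from the abstract.
    """
    x = np.asarray(x, dtype=float)
    med = np.median(x)
    mad = np.median(np.abs(x - med))  # median absolute deviation
    if mad == 0:
        raise ValueError("MAD is zero; biweight weights are undefined")
    u = (x - med) / (c * mad)
    # Tukey's biweight: weights fall smoothly to zero once |u| reaches 1.
    w = np.where(np.abs(u) < 1, (1 - u**2) ** 2, 0.0)
    d = (x - med) * w
    return d / np.sqrt(np.sum(d**2))


def biweight_midcorrelation(x, y, c=9.0):
    """Biweight midcorrelation of two equal-length 1-D samples."""
    return float(np.dot(_biweight_normalized(x, c), _biweight_normalized(y, c)))


# Toy "expression profiles": identical except for one gross outlier.
x = np.arange(20.0)
y = x.copy()
y[19] = -1000.0

pearson = float(np.corrcoef(x, y)[0, 1])
bicor = biweight_midcorrelation(x, y)
print(f"Pearson: {pearson:.3f}, biweight midcorrelation: {bicor:.3f}")
```

In this toy case the single corrupted value drives the Pearson correlation negative, while the biweight weights assign that point zero weight and the robust correlation stays high.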

Scholarship@Claremont

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Scatterplot of all pairwise correlations of the 1000 most variable genes in the yeast data

Author: Aya Mitani (77277)
Brian VanKoten (77279)
Johanna Hardin (77276)
Leanne Hicks (77278)

Copyright information: Taken from "A robust measure of correlation between two genes on a microarray", http://www.biomedcentral.com/1471-2105/8/220, BMC Bioinformatics 2007;8:220. Published online 25 Jun 2007. PMCID: PMC1929126.

The blackest hexagons represent 9,556 pairs of genes. The lightest hexagons represent one pair of genes. Notice that, though most of the points lie near the line y = x, there are many pairs of genes that give quite different correlations when measured with Pearson's correlation or the biweight correlation. Each letter refers to a gene pair that will be described in Figures 2, 3, and 4.

The Francis Crick Institute