1,174 research outputs found

Application of Pearson correlation coefficient (PCC) and Kolmogorov-Smirnov distance (KSD) metrics to identify disease-specific biomarker genes

Author: Hung-Chung Huang
P Jafari
Siyuan Zheng
UR Chandran
VG Tusher
Zhongming Zhao
Publication venue: BioMed Central
Publication date: 01/01/2010

Crossref

Springer - Publisher Connector

PubMed Central

EPEPT: A web service for enhanced P-value estimation in permutation tests

Author: A Subramanian
B Efron
E Edgington
G Benson
Hector Rovira
Ilya Shmulevich
J Boyle
Jake Lin
John Boyle
R Deidda
TA Knijnenburg
Theo A Knijnenburg
VG Tusher
Publication venue: BioMed Central
Publication date: 01/01/2011

Crossref

Springer - Publisher Connector

PubMed Central

MADAM - An open source meta-analysis toolbox for R and Bioconductor

Author: Armin Graber
DR Rhodes
F Hong
I Borozan
JD Storey
JK Choi
Karl G Kugler
Laurin AJ Mueller
O Troyanskaya
RA Fisher
RC Gentleman
VG Tusher
Y Moreau
Publication venue: BioMed Central
Publication date: 01/01/2010

Crossref

Springer - Publisher Connector

PubMed Central

Mulcom: a multiple comparison statistical test for microarray data in Bioconductor

Author: A Gentile
A Mira
C Dunnett
Claudio Isella
Davide Corà
DB Allison
Enzo Medico
GK Smyth
L Gautier
R: Development core team
RC Gentleman
S Fagoonee
S Tardito
Tommaso Renzulli
VG Tusher
Publication venue: BioMed Central
Publication date: 01/01/2011

Many microarray experiments compare a common control group with several "test" groups, as in time-course experiments where time zero serves as a common reference point. The MulCom package described here implements Dunnett’s t-test, which was specifically developed to handle multiple comparisons against a common reference, in a version tailored for genomic data analysis that we named the MulCom (Multiple Comparisons) test. The implementation includes two test parameters, namely the t value and an optional minimal fold-change value, m, with automated, permutation-based estimation of the False Discovery Rate (FDR) for parameter combinations of choice. The package permits automated optimization of the test parameters to obtain the maximum number of significant genes at a given FDR value. In this vignette we present the rationale, implementation and usage of the MulCom package, plus a practical application on a time-course microarra

CiteSeerX

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Archivio Istituzionale della Ricerca- Università del Piemonte Orientale

Institutional Research Information System University of Turin

KC-SMARTR: An R package for detection of statistically significant aberrations in multi-experiment aCGH data

Author: Arno Velds
C Klijn
Christiaan Klijn
D Hanahan
ES Venkatraman
H Fiegler
H Holstege
Henne Holstege
Jorma J de Ronde
Jos Jonkers
K Chin
Lodewyk FA Wessels
Marcel JT Reinders
VG Tusher
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Most approaches used to find recurrent or differential DNA Copy Number Alterations (CNA) in array Comparative Genomic Hybridization (aCGH) data from groups of tumour samples depend on the discretization of the aCGH data to gain, loss or no-change states. This causes loss of valuable biological information in tumour samples, which are frequently heterogeneous. We have previously developed an algorithm, KC-SMART, that bases its estimate of the magnitude of the CNA at a given genomic location on kernel convolution (Klijn et al., 2008). This accounts for the intensity of the probe signal, its local genomic environment and the signal distribution across multiple samples. Results: Here we extend the approach to allow comparative analyses of two groups of samples and introduce the R implementation of these two approaches. The comparative module allows for a supervised analysis to be performed, to enable the identification of regions that are differentially aberrated between two user-defined classes. We analyzed data from a series of B- and T-cell lymphomas and were able to retrieve all positive control regions (VDJ regions) in addition to a number of new regions. A t-test employing segmented data, that we implemented, was also able to locate all the positive control regions and a number of new regions but these regions were highly fragmented. Conclusions: KC-SMARTR offers recurrent CNA and class specific CNA detection, at different genomic scales, in a single package without the need for additional segmentation. It is memory efficient and runs on a wide range of machines. Most importantly, it does not rely on data discretization and therefore maximally exploits the biological information in the aCGH data.

Crossref

TU Delft Repository

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

A powerful method for detecting differentially expressed genes from GeneChip arrays that does not require replicates

Author: AK Hein
Anne-Mette K Hein
B Efron
CM Kendziorski
DB Allison
GK Smyth
KK Lin
P Baldi
R Gottardo
RA Irizarry
RC Gentleman
S Richardson
SE Choe
Sylvia Richardson
VG Tusher
Y Benjamini
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Studies of differential expression that use Affymetrix GeneChip arrays are often carried out with a limited number of replicates. Reasons for this include financial considerations and limits on the available amount of RNA for sample preparation. In addition, failed hybridizations are not uncommon, leading to a further reduction in the number of replicates available for analysis. Most existing methods for studying differential expression rely on the availability of replicates, and the demand for alternative methods that require few or no replicates is high. RESULTS: We describe a statistical procedure for performing differential expression analysis without replicates. The procedure relies on a Bayesian integrated approach (BGX) to the analysis of Affymetrix GeneChips. The BGX method estimates a posterior distribution of expression for each gene and condition, from a simultaneous consideration of the available probe intensities representing the gene in a condition. Importantly, posterior distributions of expression are obtained regardless of the number of replicates available. We exploit these posterior distributions to create ranked gene lists that take into account the estimated expression difference as well as its associated uncertainty. We estimate the proportion of non-differentially expressed genes empirically, allowing an informed choice of cut-off for the ranked gene list, adapting an approach proposed by Efron. We assess the performance of the method, and compare it to those of other methods, on publicly available spike-in data sets, as well as in a proper biological setting. CONCLUSION: The method presented is a powerful tool for extracting information on differential expression from GeneChip expression studies with limited or no replicates.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Patterns of subnet usage reveal distinct scales of regulation in the transcriptional regulatory network of Escherichia coli

Author: A Travers
C Marr
Carsten Marr
DP Sangurdekar
E Krause
Fabian J. Theis
G Balázsi
H Yu
J Vogel
J Ward Jr
JD Glasner
JJ Faith
Larry S. Liebovitch
M. Madan Babu
Marc-Thorsten Hütt
MJ Herrgard
N Blot
N Sonnenschein
NM Luscombe
O Alter
Q Cui
R Milo
RM Gutierrez-Rios
S Gama-Castro
S Gottesman
S Mangan
SS Shen-Orr
T Beissbarth
VG Tusher
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2010

The set of regulatory interactions between genes, mediated by transcription factors, forms a species' transcriptional regulatory network (TRN). By comparing this network with measured gene expression data one can identify functional properties of the TRN and gain general insight into transcriptional control. We define the subnet of a node as the subgraph consisting of all nodes topologically downstream of the node, including itself. Using a large set of microarray expression data of the bacterium Escherichia coli, we find that the gene expression in different subnets exhibits a structured pattern in response to environmental changes and genotypic mutation. Subnets with fewer changes in their expression pattern have a higher fraction of feed-forward loop motifs and a lower fraction of small RNA targets within them. Our study implies that the TRN consists of several scales of regulatory organization: 1) subnets with more varying gene expression controlled by both transcription factors and post-transcriptional RNA regulation, and 2) subnets with less varying gene expression having more feed-forward loops and less post-transcriptional RNA regulation.
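
The subnet definition in this abstract amounts to a reachability query on a directed graph. The following is a minimal Python sketch under stated assumptions: the network is represented as a mapping from each regulator to its direct targets, and the `subnet` helper and the toy node names are hypothetical illustrations, not taken from the paper.

```python
from collections import deque


def subnet(network, root):
    """Return the set of nodes downstream of `root`, including `root` itself.

    `network` maps each regulator to an iterable of its direct targets
    (an assumed representation chosen for this sketch).
    """
    seen = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        # Nodes without outgoing edges simply have no entry in the mapping.
        for target in network.get(node, ()):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# Toy network with made-up node names:
# E regulates A; A regulates B and C; B regulates C and D.
toy_trn = {"E": ["A"], "A": ["B", "C"], "B": ["C", "D"]}
```

Calling `subnet(toy_trn, "A")` collects A together with everything it regulates directly or indirectly. A breadth-first traversal is used here only for simplicity; any graph search that visits each reachable node once would serve.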

arXiv.org e-Print Archive

CiteSeerX

Public Library of Science (PLOS)

City University of New York

Crossref

Directory of Open Access Journals

PubMed Central

PuSH

Extended analysis of benchmark datasets for Agilent two-color microarrays

Author: Affymetrix
C Li
E Hubbell
External RNA Controls Consortium
Kathleen F Kerr
KF Kerr
L Guo
L Shi
LX Qin
M Zahurak
R Shippy
RA Irizarry
RD Canales
SC Baker
TA Patterson
VG Tusher
WD Tong
YH Yang
ZJ Wu
Publication venue: BioMed Central
Publication date: 01/01/2007

Background: As part of its broad and ambitious mission, the MicroArray Quality Control (MAQC) project reported the results of experiments using External RNA Controls (ERCs) on five microarray platforms. For most platforms, several different methods of data processing were considered. However, there was no similar consideration of different methods for processing the data from the Agilent two-color platform. While this omission is understandable given the scale of the project, it can create the false impression that there is consensus about the best way to process Agilent two-color data. It is also important to consider whether ERCs are representative of all the probes on a microarray. Results: A comparison of different methods of processing Agilent two-color data shows substantial differences among methods for low-intensity genes. The sensitivity and specificity for detecting differentially expressed genes vary substantially across methods. Analysis also reveals that the ERCs in the MAQC data only span the upper half of the intensity range, and therefore cannot be representative of all genes on the microarray. Conclusion: Although ERCs demonstrate good agreement between observed and expected log-ratios on the Agilent two-color platform, such an analysis is incomplete. Simple loess normalization outperformed data processing with Agilent's Feature Extraction software for accurate identification of differentially expressed genes. Results from studies using ERCs should not be over-generalized when ERCs are not representative of all probes on a microarray.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Examining smoking-induced differential gene expression changes in buccal mucosa

Author: A Spira
A Subramanian
AI Saeed
BM Bolstad
C Wu
Dennis Burian
Doris M Kupfer
J Hellemans
M Vondracek
Marita C Jenkins
MD Thompson
O Ceder
R Breitling
R Vadigepalli
RA Irizarry
S Sridhar
SD Spivack
VG Tusher
Vicky L White
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Gene expression changes resulting from conditions such as disease, environmental stimuli, and drug use can be monitored in the blood. However, a less invasive method of sample collection is of interest because of the discomfort and specialized personnel necessary for blood sampling, especially if multiple samples are being collected. Buccal mucosa cells are easily collected and may be an alternative sample material for biomarker testing. A limited number of studies, primarily in the smoker/oral cancer literature, address this tissue's efficacy as an RNA source for expression analysis. The current study was undertaken to determine if total RNA isolated from buccal mucosa could be used as an alternative tissue source to assay relative gene expression. Methods: Total RNA was isolated from swabs, reverse transcribed and amplified. The amplified cDNA was used in RT-qPCR and microarray analyses to evaluate gene expression in buccal cells. Initially, RT-qPCR was used to assess relative transcript levels of four genes from whole blood and buccal cells collected from the same seven individuals, concurrently. Second, buccal cell RNA was used for microarray-based differential gene expression studies by comparing gene expression between a group of female smokers and nonsmokers. Results: An amplification protocol allowed use of less buccal cell total RNA (50 ng) than had been reported previously with human microarrays. Total RNA isolated from buccal cells was degraded but was of sufficient quality to be used with RT-qPCR to detect expression of specific genes. We report here the finding of a small number of statistically significant differentially expressed genes between smokers and nonsmokers, using buccal cells as starting material. Gene Set Enrichment Analysis confirmed that these genes had a similar expression pattern to results from another study. Conclusions: Our results suggest that despite a high degree of degradation, RNA from buccal cells from cheek mucosa could be used to detect differential gene expression between smokers and nonsmokers. However, the RNA degradation, increase in sample variability and microarray failure rate show that buccal samples should be used with caution as source material in expression studies.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

A close examination of double filtering with fold change and t test in microarray analysis

Author: AE Gelfand
G Casella
I Hedenfalk
I Lonnstedt
J Cao
JD Storey
Jing Cao
M Sauer
MA Newton
MM Kittleson
N Jain
P Baldi
P Quinn
R Opgen-Rhein
RA Irizarry
SE Choe
Song Zhang
T Han
VG Tusher
X Cui
Y Li
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: Many researchers use the double filtering procedure with fold change and t test to identify differentially expressed genes, in the hope that the double filtering will provide extra confidence in the results. Due to its simplicity, the double filtering procedure has been popular with applied researchers despite the development of more sophisticated methods. Results: This paper, for the first time to our knowledge, provides theoretical insight on the drawback of the double filtering procedure. We show that fold change assumes all genes to have a common variance while the t statistic assumes gene-specific variances. The two statistics are based on contradicting assumptions. Under the assumption that gene variances arise from a mixture of a common variance and gene-specific variances, we develop the theoretically most powerful likelihood ratio test statistic. We further demonstrate that the posterior inference based on a Bayesian mixture model and the widely used significance analysis of microarrays (SAM) statistic are better approximations to the likelihood ratio test than the double filtering procedure. Conclusion: We demonstrate through hypothesis testing theory, simulation studies and real data examples that well-constructed shrinkage testing methods, which can be united under the mixture gene variance assumption, can considerably outperform the double filtering procedure.
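
For readers unfamiliar with the procedure this abstract examines, double filtering can be sketched roughly as follows. This is a minimal Python illustration and not the authors' code: it assumes log2-scale expression values, uses an unpooled (Welch-type) t statistic, and applies hypothetical cutoffs named `min_log_fc` and `min_abs_t`. It shows the procedure the paper critiques, not the shrinkage methods it recommends.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(x, y):
    """Two-sample t statistic with gene-specific (unpooled) variances.

    Assumes each group has at least two values and nonzero variance.
    """
    se = sqrt(variance(x) / len(x) + variance(y) / len(y))
    return (mean(x) - mean(y)) / se


def double_filter(expr_a, expr_b, min_log_fc=1.0, min_abs_t=3.0):
    """Keep genes that pass BOTH a fold-change and a t-statistic cutoff.

    expr_a, expr_b: dicts mapping gene name -> list of log2 expression
    values in each condition (an assumed layout for this sketch).
    """
    selected = []
    for gene in expr_a:
        a, b = expr_a[gene], expr_b[gene]
        log_fc = mean(a) - mean(b)  # difference of logs = log fold change
        if abs(log_fc) >= min_log_fc and abs(welch_t(a, b)) >= min_abs_t:
            selected.append(gene)
    return selected


# Made-up data: g1 passes both filters, g2 has a large but noisy fold
# change (fails the t cutoff), g3 is precise but small (fails the
# fold-change cutoff).
toy_a = {"g1": [8.0, 8.2, 7.9], "g2": [9.0, 5.0, 7.0], "g3": [5.30, 5.31, 5.29]}
toy_b = {"g1": [6.0, 6.1, 5.9], "g2": [5.0, 6.0, 4.0], "g3": [5.00, 5.01, 4.99]}
```

On the toy data only `g1` survives both filters. The fold-change filter ignores per-gene variability while the t filter depends entirely on it, which is the tension between the two statistics that the abstract describes.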

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central