130 research outputs found

Analysis with respect to instrumental variables for the exploration of microarray data structures

Author: AC Culhane
AL Wollenberg
CJF Ter Braak
CR Rao
D Chessel
Florent Baty
GJ Dennis
H Martens
Jan Wiegand
JD Lebreton
Joseph Schwager
K Fellenberg
Martin H Brutsche
Michaël Facompré
NC Kenkel
O Alter
Q Tan
R Development Core Team
R Sabatier
RA Irizarry
RC Gentleman
S Dolédec
S Dray
S Perelman
The Gene Ontology Consortium
V Makarenkov
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Evaluating the importance of the different sources of variation is essential in microarray data experiments. Complex experimental designs generally include various factors structuring the data which should be taken into account. The objective of these experiments is the exploration of some given factors while controlling other factors. RESULTS: We present here a family of methods, the analyses with respect to instrumental variables, which can be easily applied to the particular case of microarray data. An illustrative example of analysis with instrumental variables is given in the case of microarray data investigating the effect of beverage intake on peripheral blood gene expression. This approach is compared to an ANOVA-based gene-by-gene statistical method. CONCLUSION: Instrumental variables analyses provide a simple way to control several sources of variation in a multivariate analysis of microarray data. Due to their flexibility, these methods can be associated with a large range of ordination techniques combined with one or several qualitative and/or quantitative descriptive variables.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Public data and open source tools for multi-assay genomic investigation of disease

Author: Carey VJ
Castelo R
Culhane AC
Davis S
El-Hachem N
Gendoo DM
Gomez-Cabrero D
Haibe-Kains B
Hansen KD
Kannan L
Morgan M
Ramos M
Re A
Safikhani Z
Waldron L
Publication venue: Oxford University Press (OUP)
Publication date: 01/01/2015

Molecular interrogation of a biological sample through DNA sequencing, RNA and microRNA profiling, proteomics and other assays has the potential to provide a systems-level approach to predicting treatment response and disease progression, and to developing precision therapies. Large publicly funded projects have generated extensive and freely available multi-assay data resources; however, bioinformatic and statistical methods for the analysis of such experiments are still nascent. We review multi-assay genomic data resources in the areas of clinical oncology, pharmacogenomics and other perturbation experiments, population genomics, regulatory genomics and other areas, and tools for data acquisition. Finally, we review bioinformatic tools that are explicitly geared toward integrative genomic data visualization and analysis. This review provides starting points for accessing publicly available data and tools to support development of needed integrative methods.

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Investigating the global genomic diversity of Escherichia coli using a multi-genome DNA microarray platform with novel gene prediction strategies

Author: A Relogio
AC Culhane
AO Carter
BM Bolstad
CR Laing
DA Rasko
FR Blattner
H Ochman
Isha R Patel
J Letowski
Joseph E LeClerc
K Tamura
L Gautier
LM Wick
LW Riley
MA Karmali
ML Kotewicz
NT Perna
O Tenaillon
PS Mead
R Development Core Team (2010)
RA Welch
SA Jackson
Scott A Jackson
T Hayashi
T Wirth
TA Cebula
Tammy Barnaba
Thomas A Cebula
Publication venue: BioMed Central
Publication date: 01/01/2011

BACKGROUND: The gene content of a diverse group of 183 unique Escherichia coli and Shigella isolates was determined using the Affymetrix GeneChip® E. coli Genome 2.0 Array, originally designed for transcriptome analysis, as a genotyping tool. The probe set design utilized by this array provided the opportunity to determine the gene content of each strain very accurately and reliably. This array constitutes 10,112 independent genes representing four individual E. coli genomes, therefore providing the ability to survey genes of several different pathogen types. The entire ECOR collection, 80 EHEC-like isolates, and a diverse set of isolates from our FDA strain repository were included in our analysis. RESULTS: From this study we were able to define sets of genes that correspond to, and therefore define, the EHEC pathogen type. Furthermore, our sampling of 63 unique strains of O157:H7 showed the ability of this array to discriminate between closely related strains. We found that individual strains of O157:H7 differed, on average, by 197 probe sets. Finally, we describe an analysis method that utilizes the power of the probe sets to determine accurately the presence/absence of each gene represented on this array. CONCLUSIONS: These elements provide insights into understanding the microbial diversity that exists within extant E. coli populations. Moreover, these data demonstrate that this novel microarray-based analysis is a powerful tool in the field of molecular epidemiology and the newly emerging field of microbial forensics.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Supervised multivariate analysis of sequence groups to identify specificity determining residues

Author: A Carro
A del Sol Mesa
AC Culhane
AR Fersht
CD Livingstone
CL Tucker
D Charif
Desmond G Higgins
DG Higgins
DH Morgan
E Beitz
F Pazos
G Casari
G Zhang
H Yao
HM Wilks
Iain M Wallace
J Thioulouse
JC Gower
JD Thompson
JG Henikoff
KM Mayer
L Yuan
LA Mirny
M Clamp
N Saitou
O Lichtarge
OV Kalinina
RC Gentleman
RD Finn
RJ Edwards
S Dolédec
S Henikoff
SJ Hubbard
SS Hannenhalli
TD Schneider
V Vacic
W Pirovano
WR Atchley
X Gu
Publication venue: BioMed Central
Publication date: 01/01/2007

BACKGROUND: Proteins that evolve from a common ancestor can change functionality over time, and it is important to be able to identify the residues that cause this change. In this paper we show how a supervised multivariate statistical method, Between Group Analysis (BGA), can be used to identify these residues from families of proteins with different substrate specificities using multiple sequence alignments. RESULTS: We demonstrate the usefulness of this method on three different test cases. Two of these test cases, the Lactate/Malate dehydrogenase family and Nucleotidyl Cyclases, consist of two functional groups. The other family, Serine Proteases, consists of three groups. BGA was used to analyse and visualise these three families using two different encoding schemes for the amino acids. CONCLUSION: The overall combination of methods in this paper is powerful and flexible while being computationally very fast and simple. BGA is especially useful because it can be used to analyse any number of functional classes. In the examples in this paper we used only 2 or 3 classes for demonstration purposes, but any number can be used and visualised.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Integrated Analysis of Multiple Microarray Datasets Identifies a Reproducible Survival Predictor in Ovarian Cancer

BACKGROUND: Public data integration may help overcome challenges in clinical implementation of microarray profiles. We integrated several ovarian cancer datasets to identify a reproducible predictor of survival. METHODOLOGY/PRINCIPAL FINDINGS: Four microarray datasets from different institutions comprising 265 advanced stage tumors were uniformly reprocessed into a single training dataset, also adjusting for inter-laboratory variation ("batch-effect"). Supervised principal component survival analysis was employed to identify prognostic models. Models were independently validated in a 61-patient cohort using a custom array genechip and a publicly available 229-array dataset. Molecular correspondence of high- and low-risk outcome groups between training and validation datasets was demonstrated using Subclass Mapping. Previously established molecular phenotypes in the 2nd validation set were correlated with high- and low-risk outcome groups. Functional representational and pathway analysis was used to explore gene networks associated with high- and low-risk phenotypes. A 19-gene model showed optimal performance in the training set (median OS 31 and 78 months, p < 0.01), 1st validation set (median OS 32 months versus not-yet-reached, p = 0.026) and 2nd validation set (median OS 43 versus 61 months, p = 0.013), maintaining independent prognostic power in multivariate analysis. There was strong molecular correspondence of the respective high- and low-risk tumors between training and 1st validation set. Low- and high-risk tumors were enriched for favorable and unfavorable molecular subtypes and pathways, previously defined in the public 2nd validation set. CONCLUSIONS/SIGNIFICANCE: Integration of previously generated cancer microarray datasets may lead to robust and widely applicable survival predictors. These predictors are not simply a compilation of prognostic genes but appear to track true molecular phenotypes of good- and poor-outcome.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Stability of gene contributions and identification of outliers in multivariate analysis of microarray data

Author: A Bhattacharjee
A Spira
AC Culhane
B Efron
CR Rao
D Chessel
DA Jackson
Daniel Jaeger
F Baty
F Westad
Florent Baty
Frank Preiswerk
GJ Dennis
H Martens
J Tukey
K Fellenberg
L Lebart
L Wouters
M Greenacre
M Quenouille
Martin H Brutsche
Martin M Schumacher
MWJ Milan
O Alter
PR Peres-Neto
Q Tan
RM Rutherford
S Dray
TJ Ringrose
U Böckenholt
Y Benjamini
Publication venue: BioMed Central
Publication date: 01/01/2008

BACKGROUND: Multivariate ordination methods are powerful tools for the exploration of complex data structures present in microarray data. These methods have several advantages compared to common gene-by-gene approaches. However, due to their exploratory nature, multivariate ordination methods do not allow direct statistical testing of the stability of genes. RESULTS: In this study, we developed a computationally efficient algorithm for: i) the assessment of the significance of gene contributions and ii) the identification of sample outliers in multivariate analysis of microarray data. The approach is based on the use of resampling methods including bootstrapping and jackknifing. A statistical package of R functions was developed. This package includes tools for both inferring the statistical significance of gene contributions and identifying outliers among samples. CONCLUSION: The methodology was successfully applied to three published data sets with varying levels of signal intensities. Its relevance was compared with alternative methods. Overall, it proved to be particularly effective for the evaluation of the stability of microarray data.

Crossref

Springer - Publisher Connector

edoc

PubMed Central

A Resource for Discovering Specific and Universal Biomarkers for Distributed Stem Cells

Author: A Cicalese
A Mirza
AC Culhane
AD Boiko
AG Knudson
AJ Quyn
AV Capuco
B Boivin
CJ Luckey
DJ Rossi
F Hong
G Matioli
G Watanabe
H-S Lee
HJ Snippert
IL Weissman
J Cairns
J-F Paré
JA Martinez-Climent
James L. Sherley
Janet L. Smith
JB Kim
JK Sax
JL Sherley
Joseph Najbauer
JR Merok
K Kannan
KB Spurgers
L Becker
L Berglund
L Rambhatla
L Zhu
LG Lajtha
M Dai
M Kondo
M Loeffler
M Noh
M Zhang
MA Harris
Minsoo Noh
N Barker
N Blackett
N Gévry
NO Fortunel
P Sampath
R Taghizadeh
RA Irizarry
S Huang
SJ Morrison
SR Pine
T Enver
T Reya
V Jaks
Y Liu
Yang Hoon Huh
YH Loh
Z Darzynkiewicz
Z Tothova
Publication venue: Public Library of Science
Publication date: 19/07/2011

Specific and universal biomarkers for distributed stem cells (DSCs) have been elusive. A major barrier to discovery of such ideal DSC biomarkers is difficulty in obtaining DSCs in sufficient quantity and purity. To solve this problem, we used cell lines genetically engineered for conditional asymmetric self-renewal, the defining DSC property. In gene microarray analyses, we identified 85 genes whose expression is tightly asymmetric self-renewal associated (ASRA). The ASRA gene signature prescribed DSCs to undergo asymmetric self-renewal to a greater extent than committed progenitor cells, embryonic stem cells, or induced pluripotent stem cells. This delineation has several significant implications. These include: 1) providing experimental evidence that DSCs in vivo undergo asymmetric self-renewal as individual cells; 2) providing an explanation why earlier attempts to define a common gene expression signature for DSCs were unsuccessful; and 3) predicting that some ASRA proteins may be ideal biomarkers for DSCs. Indeed, two ASRA proteins, CXCR6 and BTG2, and two other related self-renewal pattern associated (SRPA) proteins identified in this gene resource, LGR5 and H2A.Z, display unique asymmetric patterns of expression that have a high potential for universal and specific DSC identification.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Combinations of newly confirmed Glioma-Associated loci link regions on chromosomes 1 and 9 to increased disease risk

Author: A Bernet
A Bisio
A Rzhetsky
A Subramanian
A Takehara
AC Culhane
AD Skol
AL Price
B Linghu
Charles DeLisi
CJ Sherr
D Hanahan
DS Lee
H Azuma
J Park
JU Kang
Jui-Hung Hung
K Huebner
KG Becker
KI Goh
LA Boardman
M Abdul
M Kanehisa
M Reyes-Mugica
M Wrensch
Mark Kon
Network CGAR
PJ Whiting
S Frank
S Leidel
S Purcell
S Shete
SN Stacey
Tun-Hsiang Yang
VA McKusick
Publication venue: BioMed Central
Publication date: 01/01/2011

BACKGROUND: Glioblastoma multiforme (GBM) tends to occur between the ages of 45 and 70. This relatively early onset and its poor prognosis make the impact of GBM on public health far greater than would be suggested by its relatively low frequency. Tissue and blood samples have now been collected for a number of populations, and predisposing alleles have been sought by several different genome-wide association (GWA) studies. The Cancer Genome Atlas (TCGA) at NIH has also collected a considerable amount of data. Because of the low concordance between the results obtained using different populations, only 14 predisposing single nucleotide polymorphism (SNP) candidates in five genomic regions have been replicated in two or more studies. The purpose of this paper is to present an improved approach to biomarker identification. METHODS: Association analysis was performed with control of population stratifications using the EIGENSTRAT package, under the null hypothesis of "no association between GBM and control SNP genotypes," based on an additive inheritance model. Genes that are strongly correlated with identified SNPs were determined by linkage disequilibrium (LD) or expression quantitative trait locus (eQTL) analysis. A new approach that combines meta-analysis and pathway enrichment analysis identified additional genes. RESULTS: (i) A meta-analysis of SNP data from TCGA and the Adult Glioma Study identifies 12 predisposing SNP candidates, seven of which are reported for the first time. These SNPs fall in five genomic regions (5p15.33, 9p21.3, 1p21.2, 3q26.2 and 7p15.3), three of which have not been previously reported. (ii) 25 genes are strongly correlated with these 12 SNPs, eight of which are known to be cancer-associated. (iii) The relative risk for GBM is highest for risk allele combinations on chromosomes 1 and 9. (iv) A combined meta-analysis/pathway analysis identified an additional four genes. All of these have been identified as cancer-related, but have not been previously associated with glioma. (v) Some SNPs that do not occur reproducibly across populations are in reproducible (invariant) pathways, suggesting that they affect the same biological process, and that population discordance can be partially resolved by evaluating processes rather than genes. CONCLUSION: We have uncovered 29 glioma-associated gene candidates; 12 of them known to be cancer related (p = 1.4 × 10^-6), providing additional statistical support for the relevance of the new candidates. This additional information on risk loci is potentially important for identifying Caucasian individuals at risk for glioma, and for assessing relative risk.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Therapeutic Implications of GIPC1 Silencing in Cancer

GIPC1 is a cytoplasmic scaffold protein that interacts with numerous receptor signaling complexes, and emerging evidence suggests that it plays a role in tumorigenesis. GIPC1 is highly expressed in a number of human malignancies, including breast, ovarian, gastric, and pancreatic cancers. Suppression of GIPC1 in human pancreatic cancer cells inhibits in vivo tumor growth in immunodeficient mice. To better understand GIPC1 function, we suppressed its expression in human breast and colorectal cancer cell lines and human mammary epithelial cells (HMECs) and assayed both gene expression and cellular phenotype. Suppression of GIPC1 promotes apoptosis in MCF-7, MDA-MB-231, SKBR-3, SW480, and SW620 cells and impairs anchorage-independent colony formation of HMECs. These observations indicate GIPC1 plays an essential role in oncogenic transformation, and its expression is necessary for the survival of human breast and colorectal cancer cells. Additionally, a GIPC1 knock-down gene signature was used to interrogate publicly available breast and ovarian cancer microarray datasets. This GIPC1 signature statistically correlates with a number of breast and ovarian cancer phenotypes and clinical outcomes, including patient survival. Taken together, these data indicate that GIPC1 inhibition may represent a new target for therapeutic development for the treatment of human cancers.

Public Library of Science (PLOS)

CiteSeerX

Crossref

Harvard University - DASH

Directory of Open Access Journals

PubMed Central

Oxford University Research Archive

University of Queensland eSpace

SIGNATURE: A workbench for gene expression signature analysis

Author: A Loboda
A Singh
A Subramanian
AC Culhane
CH Ooi
DA Barbie
DJ Wong
DM Langenau
DR Rhodes
F Huang
J Lamb
J Lucas
JE Lucas
Jeffrey T Chang
Joseph E Lucas
Joseph R Nevins
JT Chang
JT Leek
KA Furge
L Shi
M Barnes
M Reich
M Richter
M West
Michael L Gatza
ML Gatza
Peyton Vaughn
R Spang
RA Irizarry
S Gotz
S Kumar
S Maouche
TA Hall
William T Barry
XH Zhang
Z Liu
Publication venue: BioMed Central
Publication date: 01/01/2011

BACKGROUND: The biological phenotype of a cell, such as a characteristic visual image or behavior, reflects activities derived from the expression of collections of genes. As such, an ability to measure the expression of these genes provides an opportunity to develop more precise and varied sets of phenotypes. However, to use this approach requires computational methods that are difficult to implement and apply, and thus there is a critical need for intelligent software tools that can reduce the technical burden of the analysis. Tools for gene expression analyses are unusually difficult to implement in a user-friendly way because their application requires a combination of biological data curation, statistical computational methods, and database expertise. RESULTS: We have developed SIGNATURE, a web-based resource that simplifies gene expression signature analysis by providing software, data, and protocols to perform the analysis successfully. This resource uses Bayesian methods for processing gene expression data coupled with a curated database of gene expression signatures, all carried out within a GenePattern web interface for easy use and access. CONCLUSIONS: SIGNATURE is available for public use at http://genepattern.genome.duke.edu/signature/.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central