Search CORE

1,002 research outputs found

A robust prognostic signature for hormone-positive node-negative breast cancer

Author: Collisson Eric A
Enache Oana M
Gray Joe W
Griffith Obi L
Heiser Laura M
Pepin Francois
Spellman Paul T
Publication venue: Digital Commons@Becker
Publication date: 01/01/2013

BACKGROUND: Systemic chemotherapy in the adjuvant setting can cure breast cancer in some patients who would otherwise recur with incurable, metastatic disease. However, since only a fraction of patients would have recurrence after surgery alone, the challenge is to distinguish high-risk patients (who stand to benefit from systemic chemotherapy) from low-risk patients (who can safely be spared treatment-related toxicities and costs).
METHODS: We focus here on risk stratification in node-negative, ER-positive, HER2-negative breast cancer. We use a large database of publicly available microarray datasets to build a random forests classifier and develop a robust multi-gene mRNA transcription-based predictor of relapse-free survival at 10 years, which we call the Random Forests Relapse Score (RFRS). Performance was assessed by internal cross-validation, multiple independent data sets, and comparison to existing algorithms using receiver-operating characteristic and Kaplan-Meier survival analysis. Internal redundancy of features was determined using k-means clustering to define optimal signatures with smaller numbers of primary genes, each with multiple alternates.
RESULTS: Internal out-of-bag (OOB) cross-validation for the initial (full-gene-set) model on training data reported an ROC AUC of 0.704, which was comparable to or better than those reported previously or obtained by applying existing methods to our dataset. Three risk groups with probability cutoffs for low, intermediate, and high risk were defined. Survival analysis determined a highly significant difference in relapse rate between these risk groups. Validation of the models against independent test datasets showed highly similar results. Smaller 17-gene and 8-gene optimized models were also developed with minimal reduction in performance. Furthermore, the signature was shown to be almost equally effective on both hormone-treated and untreated patients.
CONCLUSIONS: RFRS allows flexibility in both the number and identity of genes utilized, from thousands to as few as 17 or 8 genes, each with multiple alternatives. The RFRS reports a probability score strongly correlated with risk of relapse. This score could therefore be used to assign systemic chemotherapy specifically to those high-risk patients most likely to benefit from further treatment.

Crossref

Springer - Publisher Connector

Digital Commons@Becker

PubMed Central

Integrated analysis of breast cancer cell lines reveals unique signaling pathways

Author: Barbara L Weber
Carolyn L Talcott
Jeffrey R Jackson
Joe W Gray
Keith R Laderoute
Laura M Heiser
Merrill Knapp
Nicholas J Wang
Paul T Spellman
Richard F Wooster
Safiyyah Ziyad
Sylvie Laquerre
Wen-Lin Kuo
Yinghui Guan
Zhi Hu
Publication venue: BioMed Central
Publication date: 01/01/2009

Mapping of sub-networks in the EGFR-MAPK pathway in different breast cancer cell lines reveals that PAK1 may be a marker for sensitivity to MEK inhibitors

CiteSeerX

Crossref

Springer - Publisher Connector

PubMed Central

UNT Digital Library

The Iterative Signature Algorithm for the analysis of large scale gene expression data

Author: A. Brazma
A. Schulze
C.M. Perou
D.D. Lee
E. Lander
G. Getz
G. Sherlock
J. Ihmels
J.E. Staunton
J.L. DeRisi
Jan Ihmels
L. Lazzeroni
M. Bittner
M. Schena
M.B. Eisen
N.S. Holter
Naama Barkai
O. Alter
P. Tamayo
P.T. Spellman
R.B. Altman
S. Tavazoie
Sven Bergmann
T. Hastie
T.G. Kolda
U. Alon
U. Scherf
Y. Cheng
Publication venue: American Physical Society (APS)
Publication date: 08/10/2002

We present a new approach for the analysis of genome-wide expression data. Our method is designed to overcome the limitations of traditional techniques when applied to large-scale data. Rather than allotting each gene to a single cluster, we assign both genes and conditions to context-dependent and potentially overlapping transcription modules. We provide a rigorous definition of a transcription module as the object to be retrieved from the expression data. We establish an efficient algorithm that searches for the modules encoded in the data by iteratively refining sets of genes and conditions until they match this definition. Each iteration involves a linear map, induced by the normalized expression matrix, followed by the application of a threshold function. We argue that our method is in fact a generalization of Singular Value Decomposition, which corresponds to the special case where no threshold is applied. We show analytically that for noisy expression data our approach leads to better classification due to the implementation of the threshold. This result is confirmed by numerical analyses based on in-silico expression data. We briefly discuss results obtained by applying our algorithm to expression data from the yeast S. cerevisiae.
Comment: LaTeX, 36 pages, 8 figures
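The iteration described in this abstract (a linear map through the expression matrix, then a threshold, repeated until the gene and condition sets stop changing) can be sketched roughly as follows. This is a simplified illustration and not the authors' implementation: it averages the raw matrix and z-scores the resulting score vectors, whereas the paper works with a normalized expression matrix. The function names `select_above`, `isa_step`, and `isa`, the default thresholds, and the toy matrix are all assumptions made for the example.

```python
from statistics import mean, pstdev


def select_above(scores, t):
    """Threshold step: keep indices whose z-scored value exceeds t."""
    mu, sigma = mean(scores), pstdev(scores)
    if sigma == 0:
        return set()
    return {i for i, s in enumerate(scores) if (s - mu) / sigma > t}


def isa_step(E, genes, t_cond, t_gene):
    """One refinement: genes -> conditions -> genes (assumed simplification)."""
    if not genes:
        return set(), set()
    n_genes, n_conds = len(E), len(E[0])
    # Linear map from the current gene set to a score per condition.
    cond_scores = [mean(E[g][c] for g in genes) for c in range(n_conds)]
    conds = select_above(cond_scores, t_cond)
    if not conds:
        return set(), set()
    # Linear map from the selected conditions back to a score per gene.
    gene_scores = [mean(E[g][c] for c in conds) for g in range(n_genes)]
    return select_above(gene_scores, t_gene), conds


def isa(E, seed_genes, t_cond=1.0, t_gene=1.0, max_iter=50):
    """Iterate until the gene and condition sets reach a fixed point."""
    genes, conds = set(seed_genes), set()
    for _ in range(max_iter):
        new_genes, new_conds = isa_step(E, genes, t_cond, t_gene)
        if new_genes == genes and new_conds == conds:
            break
        genes, conds = new_genes, new_conds
        if not genes:
            break
    return genes, conds


# Toy matrix (rows = genes, columns = conditions): genes 0 and 1 are
# co-expressed in conditions 0 and 1; everything else is flat.
E = [[5.0 if g < 2 and c < 2 else 0.0 for c in range(6)] for g in range(6)]
module_genes, module_conds = isa(E, seed_genes={0, 2})
print(sorted(module_genes), sorted(module_conds))
```

Starting from a partly wrong seed (gene 2 is not in the planted module) shows the point of the refinement: the threshold discards genes and conditions that do not score consistently with the rest of the set.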

arXiv.org e-Print Archive

Crossref

Graphitized Needle Cokes and Natural Graphites for Lithium Intercalation

Author: Goldberger W. M.
Kinoshita K.
Pekala R. W.
Spellman L. M.
Tran T. D.
Publication venue: Cambridge University Press (CUP)
Publication date: 01/01/1996

This paper examined the effects of heat treatment and milling (before or after heat treatment) on the electrochemical intercalating ability of needle petroleum coke; natural graphite particles were included for comparison. 1 tab, 4 figs, 7 refs

Crossref

UNT Digital Library

Carbohydrate structures of the human-immunodeficiency-virus (HIV) recombinant envelope glycoprotein gp120 produced in Chinese-hamster ovary cells

Author: J Solomon
L J Basa
M Larkin
M W Spellman
T Feizi
T Mizuochi
Publication venue: Portland Press Ltd.

Crossref

Genotype List String: a grammar for describing HLA and KIR genotyping results in a text string

Author: Bochtler W
Cooley S
Gragert L
Guethlein LA
Heuer ML
Hollenbach JA
Mack SJ
Maiers M
Marsh SGE
Milius RP
Mueller CR
Pollack J
Robinson J
Spellman S
Trachtenberg EA
Publication venue: WILEY-BLACKWELL
Publication date: 12/07/2013

Knowledge of an individual's human leukocyte antigen (HLA) genotype is essential for modern medical genetics, and is crucial for hematopoietic stem cell and solid-organ transplantation. However, the high levels of polymorphism known for the HLA genes make it difficult to generate an HLA genotype that unambiguously identifies the alleles that are present at a given HLA locus in an individual. For the last 20 years, the histocompatibility and immunogenetics community has recorded this HLA genotyping ambiguity using allele codes developed by the National Marrow Donor Program (NMDP). While these allele codes may have been effective for recording an HLA genotyping result when initially developed, their use today results in increased ambiguity in an HLA genotype, and they are no longer suitable in the era of rapid allele discovery and ultra-high allele polymorphism. Here, we present a text string format capable of fully representing HLA genotyping results. This Genotype List (GL) String format is an extension of a proposed standard for reporting killer-cell immunoglobulin-like receptor (KIR) genotype data that can be applied to any genetic data that use a standard nomenclature for identifying variants. The GL String format uses a hierarchical set of operators to describe the relationships between alleles, lists of possible alleles, phased alleles, genotypes, lists of possible genotypes, and multilocus unphased genotypes, without losing typing information or increasing typing ambiguity. When used in concert with appropriate tools to create, exchange, and parse these strings, we anticipate that GL Strings will replace NMDP allele codes for reporting HLA genotypes.
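To make the idea of a hierarchical operator set concrete, here is a minimal parsing sketch. The abstract does not list the operator characters; the ones used below (`^` between loci, `|` between possible genotypes, `+` between gene copies, `~` between phased alleles, `/` between ambiguous alleles) are assumed from the published GL String grammar and should be checked against the specification. The function names and the example string are illustrative only.

```python
# Operators from loosest to tightest binding (assumed from the GL String
# grammar; not stated in the abstract above).
GL_OPERATORS = ["^", "|", "+", "~", "/"]


def parse_gl_string(text, level=0):
    """Split on each operator in precedence order, yielding nested lists.

    Nesting, outermost first: loci > possible genotypes > gene copies >
    phased alleles > ambiguous alleles. The leaves are allele names.
    """
    if level == len(GL_OPERATORS):
        return text
    parts = text.split(GL_OPERATORS[level])
    return [parse_gl_string(part, level + 1) for part in parts]


def format_gl_string(tree, level=0):
    """Inverse of parse_gl_string: join nested lists back into a string."""
    if level == len(GL_OPERATORS):
        return tree
    return GL_OPERATORS[level].join(
        format_gl_string(child, level + 1) for child in tree
    )


# Illustrative input: one locus, one genotype, two gene copies, the first
# of which is ambiguous between two alleles.
example = "HLA-A*02:01/HLA-A*02:02+HLA-A*03:01"
tree = parse_gl_string(example)
print(tree[0][0][0][0])  # ambiguous alleles of the first gene copy
print(tree[0][0][1][0])  # alleles of the second gene copy
```

Because each operator is handled at its own nesting level, the ambiguity recorded in the string survives a parse and re-serialize round trip unchanged, which is the property the abstract emphasizes.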

UCL Discovery

Beyond element-wise interactions: identifying complex interactions in biological processes

Author: A Kahvejian
AJ Tate
B Gourévitch
C Granger
C Zou
Christophe Ladroue
CJ Needham
CWJ Granger
H Parkinson
HW Mewes
J Geweke
J Pearl
J Peirce
J Shendure
J Wu
J Yu
JF Geweke
Jianfeng Feng
K Friston
K Sachs
Keith Kendrick
L Royer
M Ding
M Eichler
M Fletcher
MC Teixeira
N Wiener
O David
PT Spellman
R Aebersold
RA Horn
RS Wang
S Guo
S Klamt
S Mukherjee
Shuixia Guo
SM Kosslyn
T Barrett
Vladimir Brezina
Y Chen
Publication venue: Public Library of Science (PLoS)
Publication date: 22/09/2009

Background: Biological processes typically involve the interactions of a number of elements (genes, cells) acting on each other. Such processes are often modelled as networks whose nodes are the elements in question and whose edges are pairwise relations between them (transcription, inhibition). But more often than not, elements actually work cooperatively or competitively to achieve a task. Alternatively, an element can act on the interaction between two others, as in the case of an enzyme controlling a reaction rate. We call these types of interaction “complex” and propose ways to identify them from time-series observations. Methodology: We use Granger Causality, a measure of the interaction between two signals, to characterize the influence of an enzyme on a reaction rate. We extend its traditional formulation to the case of multi-dimensional signals in order to capture group interactions, and not only element interactions. Our method is extensively tested on simulated data and applied to three biological datasets: microarray data of the Saccharomyces cerevisiae yeast, local field potential recordings of two brain areas and a metabolic reaction. Conclusions: Our results demonstrate that complex Granger causality can reveal new types of relation between signals and is particularly suited to biological data. Our approach raises some fundamental issues of the systems biology approach since finding all complex causalities (interactions) is an NP-hard problem.
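For orientation, the traditional pairwise formulation that this abstract starts from can be sketched as below. It covers only the classical two-signal case at lag 1, not the multi-dimensional "complex" extension the paper proposes. The score is the log ratio of residual variances between an autoregressive model of the target alone and one that also uses the source's past. The function names, the no-intercept fit, and the simulated signals are assumptions made for the example.

```python
import math
import random


def residual_variance_one(y, z):
    """Fit y ~ a*z by least squares (no intercept); return mean squared residual."""
    a = sum(p * q for p, q in zip(y, z)) / sum(q * q for q in z)
    return sum((p - a * q) ** 2 for p, q in zip(y, z)) / len(y)


def residual_variance_two(y, z1, z2):
    """Fit y ~ a1*z1 + a2*z2 via the 2x2 normal equations; return mean squared residual."""
    s11 = sum(p * p for p in z1)
    s22 = sum(q * q for q in z2)
    s12 = sum(p * q for p, q in zip(z1, z2))
    b1 = sum(v * p for v, p in zip(y, z1))
    b2 = sum(v * q for v, q in zip(y, z2))
    det = s11 * s22 - s12 * s12
    a1 = (b1 * s22 - b2 * s12) / det
    a2 = (b2 * s11 - b1 * s12) / det
    return sum((v - a1 * p - a2 * q) ** 2 for v, p, q in zip(y, z1, z2)) / len(y)


def granger_lag1(source, target):
    """Lag-1 Granger score of source -> target: ln(var_restricted / var_full)."""
    present = target[1:]
    target_past = target[:-1]
    source_past = source[:-1]
    restricted = residual_variance_one(present, target_past)
    full = residual_variance_two(present, target_past, source_past)
    return math.log(restricted / full)


# Simulated signals (assumed for illustration): x is noise, y follows x by one step.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(500)]
y = [0.0] + [0.8 * x[t - 1] + 0.1 * rng.gauss(0, 1) for t in range(1, 500)]

print("x -> y:", granger_lag1(x, y))
print("y -> x:", granger_lag1(y, x))
```

The score is close to zero when the source's past adds nothing to the prediction and grows as it explains more of the target, so the asymmetry between the two directions is what indicates the direction of influence.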

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Warwick Research Archives Portal Repository

A simple spreadsheet-based, MIAME-supportive format for microarray data: MAGE-TAB

Author: A Brazma
AI Saeed
Alvis Brazma
Anna Farne
AR Jones
B Dysvik
BR Zeeberg
CA Ball
Catherine A Ball
Christian J Stoeckert
Donald S Maier
E Manduchi
Ele Holloway
Farrell Wymore
Gavin Sherlock
Helen C Causton
Helen Parkinson
J White
John Quackenbush
Joseph White
Junmin Liu
Kjell Petersen
M Navarange
Michael Miller
MT Vass
P Spellman
Patricia L Whetzel
Paul T Spellman
Philippe Rocca-Serra
PL Whetzel
PT Spellman
R Anbazhagan
Rafael A Irizarry
Tim F Rayner
Ugis Sarkans
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Sharing of microarray data within the research community has been greatly facilitated by the development of the disclosure and communication standards MIAME and MAGE-ML by the MGED Society. However, the complexity of the MAGE-ML format has made its use impractical for laboratories lacking dedicated bioinformatics support. RESULTS: We propose a simple tab-delimited, spreadsheet-based format, MAGE-TAB, which will become a part of the MAGE microarray data standard and can be used for annotating and communicating microarray data in a MIAME-compliant fashion. CONCLUSION: MAGE-TAB will enable laboratories without bioinformatics experience or support to manage, exchange and submit well-annotated microarray data in a standard format using a spreadsheet. The MAGE-TAB format is self-contained, and does not require an understanding of MAGE-ML or XML.
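Because the format is tab-delimited, an annotation sheet of this kind can be read with a standard library reader and no XML tooling. The sketch below shows only that point. The column headers and sample rows are placeholders invented for illustration, not a validated MAGE-TAB file, and `read_tab_delimited` is a hypothetical helper name.

```python
import csv
import io

# Placeholder sheet: headers and values are illustrative assumptions,
# not taken from the MAGE-TAB specification.
SAMPLE_SHEET = (
    "Source Name\tOrganism\tArray Data File\n"
    "sample_1\tHomo sapiens\tsample_1.txt\n"
    "sample_2\tHomo sapiens\tsample_2.txt\n"
)


def read_tab_delimited(text):
    """Parse tab-delimited text into a list of row dictionaries keyed by header."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return list(reader)


rows = read_tab_delimited(SAMPLE_SHEET)
for row in rows:
    print(row["Source Name"], "->", row["Array Data File"])
```

The same rows could be exported from or pasted into a spreadsheet, which is the workflow the abstract has in mind for laboratories without bioinformatics support.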

University of Bergen

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

NORA - Norwegian Open Research Archives

Semi-supervised gene shaving method for predicting low variation biological pathways from genome-wide data

Author: A Blais
A Gasch
A Subramanian
B Manly
D Zhu
Dongxiao Zhu
E Glynn
E Rivas
G Carter
G Rustici
G Tseng
J Tomphor
K Do
K Yeung
L Lazzeroni
L Liang
L Tian
M Dequeant
M Eisen
M Wall
O Alter
P Larson
P Spellman
P Tamayo
S Pittler
T Hastie
V Mootha
Y Cao
Z Qin
Publication venue: BioMed Central
Publication date: 30/01/2009

Crossref

Springer - Publisher Connector

PubMed Central

A statistical framework for integrating two microarray data sets in differential expression analysis

Author: D Lockhart
D Singh
EM Conlon
F Hong
GJ McLachlan
I Borozan
JD Storey
Jin-Xiong She
JK Choi
KHS Wilson
L Ein-Dor
L Xu
M Miron
M Schena
M Zhang
P Cahan
PT Spellman
S Dudoit
Sarah E Eckenrode
SE Eckenrode
TR Golub
VK Mootha
X Cui
Y Benjamini
Y Lai
Yinglei Lai
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: Different microarray data sets can be collected for studying the same or similar diseases. We expect to achieve a more efficient analysis of differential expression if an efficient statistical method can be developed for integrating different microarray data sets. Although many statistical methods have been proposed for data integration, the genome-wide concordance of different data sets has not been well considered in the analysis. Results: Before considering data integration, it is necessary to evaluate the genome-wide concordance so that misleading results can be avoided. Based on the test results, different subsequent actions are suggested. The evaluation of genome-wide concordance and the data integration can be achieved using mixture models based on the normal distribution. Conclusion: The results from our simulation study suggest that misleading results can be generated if the genome-wide concordance issue is not appropriately considered. Our method provides a rigorous parametric solution. The results also show that our method is robust to certain model misspecification and is practically useful for the integrative analysis of differential expression.

Crossref

Directory of Open Access Journals

PubMed Central

George Washington University: Health Sciences Research Commons (HSRC)