1,063 research outputs found

A Comparison of Methods for Data-Driven Cancer Outlier Discovery, and An Application Scheme to Semisupervised Predictive Biomarker Discovery

Author: Karrila Seppo
Lee Julian Hock Ean
Tucker-Kellogg Greg
Publication venue: Libertas Academica
Publication date: 01/01/2011

A core component in translational cancer research is biomarker discovery using gene expression profiling for clinical tumors. This is often based on cell line experiments; one population is sampled for inference in another. We disclose a semisupervised workflow focusing on binary (switch-like, bimodal) informative genes that are likely cancer relevant, to mitigate this non-statistical problem. Outlier detection is a key enabling technology of the workflow, and aids in identifying the focus genes.

Crossref

Directory of Open Access Journals

PubMed Central

ScholarBank@NUS

Flexible and Accurate Detection of Genomic Copy-Number Changes from aCGH

Author: Greg Tucker-Kellogg
Oscar M Rueda
Ramón Díaz-Uriarte
Publication venue: Public Library of Science
Publication date: 01/01/2007

Genomic DNA copy-number alterations (CNAs) are associated with complex diseases, including cancer: CNAs are indeed related to tumoral grade, metastasis, and patient survival. CNAs discovered from array-based comparative genomic hybridization (aCGH) data have been instrumental in identifying disease-related genes and potential therapeutic targets. To be immediately useful in both clinical and basic research scenarios, aCGH data analysis requires accurate methods that do not impose unrealistic biological assumptions and that provide direct answers to the key question, “What is the probability that this gene/region has CNAs?” Current approaches fail, however, to meet these requirements. Here, we introduce reversible jump aCGH (RJaCGH), a new method for identifying CNAs from aCGH; we use a nonhomogeneous hidden Markov model fitted via reversible jump Markov chain Monte Carlo; and we incorporate model uncertainty through Bayesian model averaging. RJaCGH provides an estimate of the probability that a gene/region has CNAs while incorporating interprobe distance and the capability to analyze data on a chromosome or genome-wide basis. RJaCGH outperforms alternative methods, and the performance difference is even larger with noisy data and highly variable interprobe distance, both commonly found features in aCGH data. Furthermore, our probabilistic method allows us to identify minimal common regions of CNAs among samples and can be extended to incorporate expression data. In summary, we provide a rigorous statistical framework for locating genes and chromosomal regions with CNAs with potential applications to cancer and other complex human diseases.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Direct Inference of SNP Heterozygosity Rates and Resolution of LOH Detection

Author: Brian J Reid
Greg Tucker-Kellogg
Patricia C Galipeau
Steven G Self
Thomas G Paulson
Xiaohong Li
Publication venue: Public Library of Science
Publication date: 01/11/2007

Single nucleotide polymorphisms (SNPs) have been increasingly utilized to investigate somatic genetic abnormalities in premalignancy and cancer. LOH is a common alteration observed during cancer development, and SNP assays have been used to identify LOH at specific chromosomal regions. The design of such studies requires consideration of the resolution for detecting LOH throughout the genome and identification of the number and location of SNPs required to detect genetic alterations in specific genomic regions. Our study evaluated SNP distribution patterns and used probability models, Monte Carlo simulation, and real human subject genotype data to investigate the relationships between the number of SNPs, SNP HET rates, and the sensitivity (resolution) for detecting LOH. We report that variances of SNP heterozygosity rate in dbSNP are high for a large proportion of SNPs. Two statistical methods proposed for directly inferring SNP heterozygosity rates require much smaller sample sizes (intermediate sizes) and are feasible for practical use in SNP selection or verification. Using HapMap data, we showed that a region of LOH greater than 200 kb can be reliably detected, with losses smaller than 50 kb having a substantially lower detection probability when using all SNPs currently in the HapMap database. Higher densities of SNPs may exist in certain local chromosomal regions that provide some opportunities for reliably detecting LOH of segment sizes smaller than 50 kb. These results suggest that the interpretation of the results from genome-wide scans for LOH using commercial arrays needs to consider the relationships among inter-SNP distance, detection probability, and sample size for a specific study. New experimental designs for LOH studies would also benefit from considering the power of detection and sample sizes required to accomplish the proposed aims.

Crossref

Directory of Open Access Journals

PubMed Central

Evaluation of Normalization Procedures for Oligonucleotide Array Data Based On Spiked cRNA Controls

Author: Brown Eugene L
Hill Andrew A.
Hunter Craig P.
Slonim Donna K
Tucker-Kellogg Greg
Whitley Maryann Z
Publication venue: Springer Science and Business Media LLC
Publication date: 05/10/2010

Background: Affymetrix oligonucleotide arrays simultaneously measure the abundances of thousands of mRNAs in biological samples. Comparability of array results is necessary for the creation of large-scale gene expression databases. The standard strategy for normalizing oligonucleotide array readouts has practical drawbacks. We describe alternative normalization procedures for oligonucleotide arrays based on a common pool of known biotin-labeled cRNAs spiked into each hybridization. Results: We first explore the conditions for validity of the 'constant mean assumption', the key assumption underlying current normalization methods. We introduce 'frequency normalization', a 'spike-in'-based normalization method which estimates array sensitivity, reduces background noise and allows comparison between array designs. This approach does not rely on the constant mean assumption and so can be effective in conditions where standard procedures fail. We also define 'scaled frequency', a hybrid normalization method relying on both spiked transcripts and the constant mean assumption while maintaining all other advantages of frequency normalization. We compare these two procedures to a standard global normalization method using experimental data. We also use simulated data to estimate accuracy and investigate the effects of noise. We find that scaled frequency is as reproducible and accurate as global normalization while offering several practical advantages. Conclusions: Scaled frequency quantitation is a convenient, reproducible technique that performs as well as global normalization on serial experiments with the same array design, while offering several additional features. Specifically, the scaled-frequency method enables the comparison of expression measurements across different array designs, yields estimates of absolute message abundance in cRNA and determines the sensitivity of individual arrays. Molecular and Cellular Biolog

Harvard University - DASH

Genetic progression and the waiting time to cancer

Author: Arne Traulsen
Bert Vogelstein
David Dingli
Greg Tucker-Kellogg
Kenneth W Kinzler
Martin A Nowak
Niko Beerenwinkel
Tibor Antal
Victor E Velculescu
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2007

Cancer results from genetic alterations that disturb the normal cooperative behavior of cells. Recent high-throughput genomic studies of cancer cells have shown that the mutational landscape of cancer is complex and that individual cancers may evolve through mutations in as many as 20 different cancer-associated genes. We use data published by Sjoblom et al. (2006) to develop a new mathematical model for the somatic evolution of colorectal cancers. We employ the Wright-Fisher process for exploring the basic parameters of this evolutionary process and derive an analytical approximation for the expected waiting time to the cancer phenotype. Our results highlight the relative importance of selection over both the size of the cell population at risk and the mutation rate. The model predicts that the observed genetic diversity of cancer genomes can arise under a normal mutation rate if the average selective advantage per mutation is on the order of 1%. Increased mutation rates due to genetic instability would allow even smaller selective advantages during tumorigenesis. The complexity of cancer progression thus can be understood as the result of multiple sequential mutations, each of which has a relatively small but positive effect on net cell growth. Comment: Details available as supplementary material at http://www.people.fas.harvard.edu/~antal/publications.htm

arXiv.org e-Print Archive

CiteSeerX

Public Library of Science (PLOS)

Repository for Publications and Research Data

Crossref

Harvard University - DASH

Directory of Open Access Journals

PubMed Central

MPG.PuRe

Search algorithms as a framework for the optimization of drug combinations

Author: A Wagner
AJ Viterbi
AL Barabasi
Andrew D. McCulloch
B Schmidt
CM Reidys
D Calzolari
Diego Calzolari
E Lin
EK Kemsley
F Jelinek
G Paternostro
GA Bekey
Giovanni Paternostro
GR Zimmermann
Greg Tucker-Kellogg
J Bechhoefer
J Lamb
JA Radford
Jacob D. Feala
JB Fitzgerald
JD Feala
Jennifer Schofield
JG Hardman
JG Wood
JJ Schneider
JM Toivonen
John C. Reed
JW Gargano
K Wang
Laurence Coquin
M Bohm
M Nerenberg
PG Gobbi
PK Wong
R Johannesson
R Marcus
R Palmer
R Pfeifer
RA Kloner
RA Weinberg
RP Araujo
Stefania Bruschi
V Diehl
WR Greco
Publication venue: Public Library of Science (PLoS)
Publication date: 13/10/2008

Combination therapies are often needed for effective clinical outcomes in the management of complex diseases, but presently they are generally based on empirical clinical experience. Here we suggest a novel application of search algorithms, originally developed for digital communication, modified to optimize combinations of therapeutic interventions. In biological experiments measuring the restoration of the decline with age in heart function and exercise capacity in Drosophila melanogaster, we found that search algorithms correctly identified optimal combinations of four drugs with only one third of the tests performed in a fully factorial search. In experiments identifying combinations of three doses of up to six drugs for selective killing of human cancer cells, search algorithms resulted in a highly significant enrichment of selective combinations compared with random searches. In simulations using a network model of cell death, we found that the search algorithms identified the optimal combinations of 6-9 interventions in 80-90% of tests, compared with 15-30% for an equivalent random search. These findings suggest that modified search algorithms from information theory have the potential to enhance the discovery of novel therapeutic drug combinations. This report also helps to frame a biomedical problem that will benefit from an interdisciplinary effort and suggests a general strategy for its solution. Comment: 36 pages, 10 figures, revised versio

arXiv.org e-Print Archive

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Inferring Pathway Activity toward Precise Disease Classification

Author: A Agresti
A Bhattacharjee
A Subramanian
AA Alizadeh
AH Bild
B Tian
CL Banka
DG Beer
Doheon Lee
E Segal
EJ Yeoh
Eunjung Lee
Greg Tucker-Kellogg
GV Glinsky
Han-Yu Chuang
HY Chuang
J Chen
J Lapointe
JA Swets
Jong-Won Kim
JP Svensson
JP Vert
KM Mani
L Ein-Dor
L Tian
LJ van 't Veer
MJ van de Vijver
P Pavlidis
R Sharan
RA Fisher
RA Gatenby
S Draghici
S Efroni
S Ramaswamy
SA Tomlins
SS Gambhir
SW Doniger
T Breslin
T Ideker
TR Golub
Trey Ideker
VK Mootha
WF Symmans
Y Saeys
Y Wang
Z Guo
Publication venue: Public Library of Science
Publication date: 01/01/2008

The advent of microarray technology has made it possible to classify disease states based on gene expression profiles of patients. Typically, marker genes are selected by measuring the power of their expression profiles to discriminate among patients of different disease states. However, expression-based classification can be challenging in complex diseases due to factors such as cellular heterogeneity within a tissue sample and genetic heterogeneity across patients. A promising technique for coping with these challenges is to incorporate pathway information into the disease classification procedure in order to classify disease based on the activity of entire signaling pathways or protein complexes rather than on the expression levels of individual genes or proteins. We propose a new classification method based on pathway activities inferred for each patient. For each pathway, an activity level is summarized from the gene expression levels of its condition-responsive genes (CORGs), defined as the subset of genes in the pathway whose combined expression delivers optimal discriminative power for the disease phenotype. We show that classifiers using pathway activity achieve better performance than classifiers based on individual gene expression, for both simple and complex case-control studies including differentiation of perturbed from non-perturbed cells and subtyping of several different kinds of cancer. Moreover, the new method outperforms several previous approaches that use a static (i.e., non-conditional) definition of pathways. Within a pathway, the identified CORGs may facilitate the development of better diagnostic markers and the discovery of core alterations in human disease.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central