University of East Anglia digital repository

PubMed Central

Efficient sampling for Bayesian inference of conjunctive Bayesian networks

Author: Beerenwinkel N
Sakoparnig T
Publication venue: 'Oxford University Press (OUP)'
Publication date: 10/07/2012

Motivation: Cancer development is driven by the accumulation of advantageous mutations and subsequent clonal expansion of cells harbouring these mutations, but the order in which mutations occur remains poorly understood. Advances in genome sequencing and the soon-arriving flood of cancer genome data produced by large cancer sequencing consortia hold the promise to elucidate cancer progression. However, new computational methods are needed to analyse these large datasets. Results: We present a Bayesian inference scheme for Conjunctive Bayesian Networks, a probabilistic graphical model in which mutations accumulate according to partial order constraints and cancer genotypes are observed subject to measurement noise. We develop an efficient MCMC sampling scheme specifically designed to overcome local optima induced by dependency structures. We demonstrate the performance advantage of our sampler over traditional approaches on simulated data and show the advantages of adopting a Bayesian perspective when reanalyzing cancer datasets and comparing our results to previous maximum-likelihood-based approaches. Availability: An R package including the sampler and examples is available at http://www.cbg.ethz.ch/software/bayes-cbn. Contacts: [email protected]
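The phrase "mutations accumulate according to partial order constraints" can be made concrete with a small sketch. The poset below is a hypothetical toy example, and the dictionary encoding and function names are assumptions made for illustration; this is not the interface of the R package mentioned in the abstract, and it omits the measurement-noise model and the sampler entirely. It only shows which genotypes are compatible with a given partial order.

```python
from itertools import combinations

# Hypothetical toy poset: each mutation maps to the set of mutations
# that must already be present before it can occur.
PREREQS = {"A": set(), "B": {"A"}, "C": {"A"}, "D": {"B", "C"}}


def is_compatible(genotype, prereqs):
    """Return True if every mutation in the genotype has all of its
    prerequisite mutations present in the same genotype."""
    return all(prereqs[m] <= genotype for m in genotype)


def compatible_genotypes(prereqs):
    """Enumerate all genotypes (as frozensets) that respect the poset.
    Exhaustive enumeration is only feasible for a handful of mutations."""
    events = sorted(prereqs)
    result = []
    for size in range(len(events) + 1):
        for combo in combinations(events, size):
            genotype = frozenset(combo)
            if is_compatible(genotype, prereqs):
                result.append(genotype)
    return result


if __name__ == "__main__":
    for g in compatible_genotypes(PREREQS):
        print(sorted(g))
```

In this toy poset, a genotype containing B without A is incompatible, so under a noise-free model it could not be observed; a noise model, as the abstract describes, is what allows such observations to be explained as measurement errors.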

Simultaneous Inference of Cancer Pathways and Tumor Progression from Cross-Sectional Mutation Data

Author: A. Tofigh
B. Vogelstein
C. Kandoth
C.-H. Yeang
C.A. Miller
C.S.-O. Attolini
C.W. Brennan
E.R. Fearon
F. Vandin
G. Ciriello
J. Rahnenführer
L.D. Wood
M. Gerstung
M. Hjelm
M.S. Lawrence
N. Beerenwinkel
N.D. Dees
R. Desper
T. Sakoparnig
The Cancer Genome Atlas Network
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

Recent cancer sequencing studies provide a wealth of somatic mutation data from a large number of patients. One of the most intriguing and challenging questions arising from this data is to determine whether the temporal order of somatic mutations in a cancer follows any common progression. Since we usually obtain only one sample from a patient, such inferences are commonly made from cross-sectional data from different patients. This analysis is complicated by the extensive variation in the somatic mutations across different patients, variation that is reduced by examining combinations of mutations in various pathways. Thus far, methods to reconstruct tumor progression at the pathway level have restricted attention to known, a priori defined pathways. In this work we show how to simultaneously infer pathways and the temporal order of their mutations from cross-sectional data, leveraging the exclusivity property of driver mutations within a pathway. We define the pathway linear progression model, and derive a combinatorial formulation for the problem of finding the optimal model from mutation data. We show that with enough samples the optimal solution to this problem uniquely identifies the correct model with high probability even when errors are present in the mutation data. We then formulate the problem as an integer linear program (ILP), which allows the analysis of datasets from recent studies with large numbers of samples. We use our algorithm to analyze somatic mutation data from three cancer studies, including two studies from The Cancer Genome Atlas (TCGA) with large numbers of samples, on colorectal cancer and glioblastoma. The models reconstructed with our method capture most of the current knowledge of the progression of somatic mutations in these cancer types, while also providing new insights on the tumor progression at the pathway level.
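To illustrate the two properties the abstract combines, exclusivity within a pathway and a linear order between pathways, here is a minimal sketch. It assumes that an ideal sample has at most one mutated gene per pathway and that its mutated pathways form a prefix of the ordering; both the assumed reading of the model and the violation count below are illustrative simplifications, not the paper's combinatorial objective or its ILP. The gene names g1 to g5 and the pathway partition are hypothetical.

```python
def violations(sample, pathways):
    """Count how a sample deviates from an assumed linear pathway model.

    sample   : set of mutated gene names
    pathways : list of gene sets, ordered from earliest to latest
    """
    # Number of mutated genes of this sample in each pathway.
    hits = [len(sample & pathway) for pathway in pathways]

    # Exclusivity: every mutated gene beyond the first in a pathway
    # counts as one violation.
    extra = sum(h - 1 for h in hits if h > 1)

    # Progression: an unmutated pathway that precedes the latest
    # mutated pathway counts as one violation.
    last = max((i for i, h in enumerate(hits) if h > 0), default=-1)
    gaps = sum(1 for h in hits[: last + 1] if h == 0)

    return extra + gaps


if __name__ == "__main__":
    # Hypothetical ordered partition of genes into three pathways.
    model = [{"g1", "g2"}, {"g3"}, {"g4", "g5"}]
    for s in [{"g1", "g3"}, {"g3"}, {"g1", "g2", "g4"}]:
        print(sorted(s), violations(s, model))
```

Summing such a count over all samples gives a rough score for a candidate model; searching over partitions and orderings to minimise a score of this kind is the sort of problem the abstract says is formulated as an ILP.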

Syddansk Universitets Forskerportal

PubMed Central

Archivio istituzionale della ricerca - Università di Padova

Signatures of positive selection reveal a universal role of chromatin modifiers as cancer driver genes

Author: A Fischer
A Gonzalez-Perez
A Marusyk
A O’Shaughnessy
A Roth
A Schuh
A Sottoriva
AY Lai
B Vogelstein
B Zhao
C Kandoth
CA Miller
CW Roberts
D Szklarczyk
D Tamborero
DA Landau
ER Fearon
F Supek
H Li
J-Y Lee
JA Biegel
JM Smith
K Cibulskis
L Bassaganyas
L Oesper
M Le Gallo
M Gerlinger
M Gerstung
M Kircher
M Xie
MC Álvarez-Silva
MD Leiserson
MJ Bissell
MJ Williams
MR Stratton
MS Lawrence
MV Dieci
N Beerenwinkel
N Bolli
N McGranahan
ND Dees
NF de Miranda
PA Futreal
PC Nowell
PJ Campbell
R Nielsen
S Nik-Zainal
S Vohra
SC Li
SL Ostrow
T Orvis
T Sakoparnig
VN Babenko
XS Puente
Y Cai
Y Chudnovsky
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

Tumors are composed of an evolving population of cells subjected to tissue-specific selection, which fuels tumor heterogeneity and ultimately complicates cancer driver gene identification. Here, we integrate cancer cell fraction, population recurrence, and functional impact of somatic mutations as signatures of selection into a Bayesian model for driver prediction. We demonstrate that our model, cDriver, outperforms competing methods when analyzing solid tumors, hematological malignancies, and pan-cancer datasets. Applying cDriver to exome sequencing data of 21 cancer types from 6,870 individuals revealed 98 unreported tumor type-driver gene connections. These novel connections are highly enriched for chromatin-modifying proteins, hinting at a universal role of chromatin regulation in cancer etiology. Although infrequently mutated as single genes, we show that chromatin modifiers are altered in a large fraction of cancer patients. In summary, we demonstrate that integration of evolutionary signatures is key for identifying mutational driver genes, thereby facilitating the discovery of novel therapeutic targets for cancer treatment.

We acknowledge support of the Spanish Ministry of Economy and Competitiveness, 'Centro de Excelencia Severo Ochoa 2013-2017'. We acknowledge the support of the CERCA Programme/Generalitat de Catalunya. This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No. 635290. Luis Zapata has been supported by the International PhD scholarship program of La Caixa at CRG.

Public Library of Science (PLOS)

UPF Digital Repository

Combinatorial Modeling of Chromatin Features Quantitatively Predicts DNA Replication Timing in Drosophila

In metazoans, each cell type follows a characteristic, spatio-temporally regulated DNA replication program. Histone modifications (HMs) and chromatin binding proteins (CBPs) are fundamental for a faithful progression and completion of this process. However, no individual HM is strictly indispensable for origin function, suggesting that HMs may act combinatorially in analogy to the histone code hypothesis for transcriptional regulation. In contrast to gene expression, however, the relationship between combinations of chromatin features and DNA replication timing has not yet been demonstrated. Here, by exploiting a comprehensive data collection consisting of 95 CBPs and HMs, we investigated their combinatorial potential for the prediction of DNA replication timing in Drosophila using quantitative statistical models. We found that while combinations of CBPs exhibit moderate predictive power for replication timing, pairwise interactions between HMs lead to accurate predictions genome-wide that can be locally further improved by CBPs. Independent feature importance and model analyses led us to derive a simplified, biologically interpretable model of the relationship between chromatin landscape and replication timing reaching 80% of the full model accuracy using six model terms. Finally, we show that pairwise combinations of HMs are able to predict differential DNA replication timing across different cell types. All in all, our work provides support for the existence of combinatorial HM patterns for DNA replication and reveals cell-type independent key elements thereof, whose experimental investigation might contribute to elucidating the regulatory mode of this fundamental cellular process.

CiteSeerX

The Francis Crick Institute

PubMed Central

A probabilistic method for leveraging functional annotations to enhance estimation of the temporal order of pathway mutations during carcinogenesis

Author: A Szabo
A Youn
AG Deshwar
Arnold J. Stromberg
B Reva
BJ Raphael
C Zong
Chi Wang
Chunming Liu
CSO Attolini
CW Brennan
E Shtivelman
EC Pacheco-Pinedo
EM Ross
G Stanta
HS Farahani
IA Adzhubei
J Liu
J Wang
Jinpeng Liu
John L. Villano
K Jahn
L Zhang
Li Chen
M Gerstung
M Kanehisa
M Román
MD Leiserson
Menghan Wang
N Bansal
N Beerenwinkel
N McGranahan
P Perez-Moreno
PC Ng
R Desper
R Schwartz
RF Schwarz
S Constantinescu
S Cristea
Susanne M. Arnold
T Sakoparnig
Tianxin Yu
W Jiao
X Dai
Y Choi
YK Cheng
Publication venue: 'Springer Science and Business Media LLC'

Predicting cancer type from tumour DNA signatures

Author: A Fabregat
A Gonzalez-Perez
AS Ho
B Vogelstein
C Bettegowda
C Cortes
C Kandoth
C Rubio-Perez
CF Davis
CH Mermel
CW Brennan
D Amar
DA Haber
E Cerami
E Khurana
E Kirkizlar
Ewa Szczurek
G Ciriello
I Guyon
J Friedman
J Gao
J Khan
J Zhu
Kee Pang Soh
L Breiman
L Ein-Dor
LAJ Diaz
M Milacic
MN Wright
MS Lawrence
N Meinshausen
N Pavlidis
NA O’Leary
Niko Beerenwinkel
P Martinez
P Polak
PA Futreal
R Caruana
R Core Team
R Díaz-Uriarte
R Tibshirani
S Kang
S Ramaswamy
SA Forbes
SM Ahn
T Golub
T Hastie
The Cancer Genome Atlas Research Network
Thomas Sakoparnig
X Hao
X Zhou
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

Background: Establishing the cancer type and site of origin is important in determining the most appropriate course of treatment for cancer patients. Patients with cancer of unknown primary, where the site of origin cannot be established from an examination of the metastatic cancer cells, typically have poor survival. Here, we evaluate the potential and limitations of utilising gene alteration data from tumour DNA to identify cancer types. Methods: Using sequenced tumour DNA downloaded via the cBioPortal for Cancer Genomics, we collected presence or absence calls for gene alterations for 6640 tumour samples spanning 28 cancer types, as predictive features. We employed three machine-learning techniques, namely linear support vector machines with recursive feature selection, L1-regularised logistic regression and random forest, to select a small subset of gene alterations that are most informative for cancer-type prediction. We then evaluated the predictive performance of the models in a comparative manner. Results: We found the linear support vector machine to be the most predictive model of cancer type from gene alterations. Using only 100 somatic point-mutated genes for prediction, we achieved an overall accuracy of 49.4 ± 0.4% (95% confidence interval). We observed a marked increase in the accuracy when copy number alterations are included as predictors. With a combination of somatic point mutations and copy number alterations, a mere 50 genes are enough to yield an overall accuracy of 77.7 ± 0.3%. Conclusions: A general cancer diagnostic tool that utilises either only somatic point mutations or only copy number alterations is not sufficient for distinguishing a broad range of cancer types. The combination of both gene alteration types can dramatically improve the performance.

Uncovering the subtype-specific temporal order of cancer pathway dysregulation

Impact of Natural Genetic Variation on Gene Expression Dynamics

Author: A Abdollahi
A Alexa
A Bureau
A Califano
A Gerrits
A Kiani
A Yen
AA Motsinger-Reif
AC Nica
AL Price
Andreas Beyer
AS Dimas
AS Rodin
BA Goldstein
C Liu
CE Müller-Sieburg
Christine A. Wells
CL Fisher
CR Geest
CWM Reuter
D Altshuler
D Amaratunga
D Szklarczyk
DL Nicolae
DS Sieburth
E Petretto
EL Heinzen
EN Smith
ET Dermitzakis
FBS Briggs
FX Li
G Swiers
G Van Zant
H Iwasaki
H Zhong
HK Chung
HP Kang
I Dybedal
J Ding
J Dutkowski
J Fu
J Wang
JE Powell
JJ Michaelson
JP Shaffer
K Bullaughey
KE Lohmueller
KL Lunetta
L Dan
M Ackermann
M Ashburner
M Wang
M Yoshida
Marit Ackermann
MN Davies
MV Rockman
N Takakura
O González-Recio
R Alberts
RA Shivdasani
RM Johnson
S Baksh
S Durinck
S Ghaffari
S Loguercio
SE Jacobsen
SH Orkin
SSF Lee
T Ideker
T Sakoparnig
TH Lee
U Roshan
V Emilsson
Weronika Sikora-Wohlfeld
WW Yang
Y Benjamini
Y Li
Publication venue: 'Public Library of Science (PLoS)'