Conjunctive Bayesian networks
Conjunctive Bayesian networks (CBNs) are graphical models that describe the
accumulation of events which are constrained in the order of their occurrence.
A CBN is given by a partial order on a (finite) set of events. CBNs generalize
the oncogenetic tree models of Desper et al. by allowing the occurrence of an
event to depend on more than one predecessor event. The present paper studies
the statistical and algebraic properties of CBNs. We determine the maximum
likelihood parameters and present a combinatorial solution to the model
selection problem. Our method performs well on two datasets where the events
are HIV mutations associated with drug resistance. Concluding with a study of
the algebraic properties of CBNs, we show that CBNs are toric varieties after a
coordinate transformation and that their ideals possess a quadratic Gröbner
basis.
Comment: Published at http://dx.doi.org/10.3150/07-BEJ6133 in Bernoulli
(http://isi.cbs.nl/bernoulli/) by the International Statistical
Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm
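As a minimal sketch of the idea that a CBN is given by a partial order on events, the snippet below enumerates the event sets (genotypes) that respect such an order, i.e. every event occurs only together with all of its predecessors. The function name, the example events, and the relations are hypothetical and chosen only for illustration; this is not the authors' code.

```python
from itertools import combinations


def compatible_genotypes(events, relations):
    """Return all event subsets that contain the predecessors of each member.

    `relations` is a list of (before, after) pairs generating the partial order.
    """
    preds = {e: set() for e in events}
    for before, after in relations:
        preds[after].add(before)

    genotypes = []
    for k in range(len(events) + 1):
        for subset in combinations(events, k):
            s = set(subset)
            # Keep the subset only if each event's predecessors are present.
            if all(preds[e] <= s for e in s):
                genotypes.append(frozenset(s))
    return genotypes


# Hypothetical poset: event 3 requires both event 1 and event 2.
events = [1, 2, 3]
relations = [(1, 3), (2, 3)]
genotypes = compatible_genotypes(events, relations)
```

Because event 3 depends on more than one predecessor, this example is not an oncogenetic tree, which illustrates the generalization described in the abstract. The brute-force enumeration is exponential in the number of events and is meant only for small examples.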
Epistasis and Shapes of Fitness Landscapes
The relationship between the shape of a fitness landscape and the underlying
gene interactions, or epistasis, has been extensively studied in the two-locus
case. Gene interactions among multiple loci are usually reduced to two-way
interactions. We present a geometric theory of shapes of fitness landscapes for
multiple loci. A central concept is the genotope, which is the convex hull of
all possible allele frequencies in populations. Triangulations of the genotope
correspond to different shapes of fitness landscapes and reveal all the gene
interactions. The theory is applied to fitness data from HIV and Drosophila
melanogaster. In both cases, our findings refine earlier analyses and reveal
previously undetected gene interactions.
Comment: 31 pages, 7 figures; typos removed, Example 3.10 adde
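For the two-locus case mentioned in the abstract, epistasis is commonly summarized by the single quantity u = w00 + w11 - w01 - w10, where wij is the fitness of the genotype with alleles i and j. This is the standard textbook definition rather than something stated in the abstract, and the fitness values below are hypothetical. A minimal sketch:

```python
def two_locus_epistasis(w00, w01, w10, w11):
    """Two-locus epistasis: deviation of the fitness values from additivity."""
    return w00 + w11 - w01 - w10


# Hypothetical fitness values for genotypes 00, 01, 10, 11.
u = two_locus_epistasis(1.0, 1.2, 1.1, 1.5)

# An additive landscape (w11 = w01 + w10 - w00) has no epistasis.
u_additive = two_locus_epistasis(1.0, 1.2, 1.1, 1.3)
```

A positive value means the double mutant is fitter than the single-mutant effects would predict additively. The multi-locus theory in the paper goes beyond this single number.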
A Mutagenetic Tree Hidden Markov Model for Longitudinal Clonal HIV Sequence Data
RNA viruses provide prominent examples of measurably evolving populations. In
HIV infection, the development of drug resistance is of particular interest,
because precise predictions of the outcome of this evolutionary process are a
prerequisite for the rational design of antiretroviral treatment protocols. We
present a mutagenetic tree hidden Markov model for the analysis of longitudinal
clonal sequence data. Using HIV mutation data from clinical trials, we estimate
the order and rate of occurrence of seven amino acid changes that are
associated with resistance to the reverse transcriptase inhibitor efavirenz.
Comment: 20 pages, 6 figure
Parametric inference of recombination in HIV genomes
Recombination is an important event in the evolution of HIV. It affects the
global spread of the pandemic as well as evolutionary escape from host immune
response and from drug therapy within single patients. Comprehensive
computational methods are needed for detecting recombinant sequences in large
databases, and for inferring the parental sequences.
We present a hidden Markov model to annotate a query sequence as a
recombinant of a given set of aligned sequences. Parametric inference is used
to determine all optimal annotations for all parameters of the model. We show
that the inferred annotations recover most features of established hand-curated
annotations. Thus, parametric analysis of the hidden Markov model is feasible
for HIV full-length genomes, and it improves the detection and annotation of
recombinant forms.
All computational results, reference alignments, and C++ source code are
available at http://bio.math.berkeley.edu/recombination/.
Comment: 20 pages, 5 figure
Markov models for accumulating mutations
We introduce and analyze a waiting time model for the accumulation of genetic changes. The continuous-time conjunctive Bayesian network is defined by a partially ordered set of mutations and by the rate of fixation of each mutation. The partial order encodes constraints on the order in which mutations can fixate in the population, shedding light on the mutational pathways underlying the evolutionary process. We study a censored version of the model and derive equations for an EM algorithm to perform maximum likelihood estimation of the model parameters. We also show how to select the maximum likelihood partially ordered set. The model is applied to genetic data from cancer cells and from drug-resistant human immunodeficiency viruses, indicating implications for diagnosis and treatmen
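The waiting time model described above can be sketched as a small simulation. The sketch assumes that each mutation starts waiting once all of its predecessors have fixated and then takes an exponentially distributed time with its own rate. That reading is consistent with the abstract but is an assumption, as are the function names, the example poset, the rates, and the fixed observation time used in place of the paper's censoring scheme.

```python
import random


def sample_waiting_times(order, preds, rates, rng):
    """Sample one fixation time per mutation.

    `order` lists mutations so that predecessors come before successors.
    """
    times = {}
    for m in order:
        # A mutation can only start waiting after all predecessors have fixated.
        start = max((times[p] for p in preds[m]), default=0.0)
        times[m] = start + rng.expovariate(rates[m])
    return times


def observed_genotype(times, t_obs):
    """Mutations that have fixated by the (hypothetical) observation time."""
    return {m for m, t in times.items() if t <= t_obs}


rng = random.Random(0)
order = ["a", "b", "c"]                       # hypothetical mutations
preds = {"a": [], "b": [], "c": ["a", "b"]}   # c requires both a and b
rates = {"a": 1.0, "b": 0.5, "c": 2.0}        # hypothetical fixation rates
times = sample_waiting_times(order, preds, rates, rng)
genotype = observed_genotype(times, t_obs=1.0)
```

By construction, every sampled genotype respects the partial order: "c" can only be observed if "a" and "b" are observed as well.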
ISMB/ECCB 2015
ISSN: 1367-4803; ISSN: 1460-205
Efficient sampling for Bayesian inference of conjunctive Bayesian networks
Motivation: Cancer development is driven by the accumulation of advantageous mutations and subsequent clonal expansion of cells harbouring these mutations, but the order in which mutations occur remains poorly understood. Advances in genome sequencing and the soon-arriving flood of cancer genome data produced by large cancer sequencing consortia hold the promise to elucidate cancer progression. However, new computational methods are needed to analyse these large datasets.
Results: We present a Bayesian inference scheme for Conjunctive Bayesian Networks, a probabilistic graphical model in which mutations accumulate according to partial order constraints and cancer genotypes are observed subject to measurement noise. We develop an efficient MCMC sampling scheme specifically designed to overcome local optima induced by dependency structures. We demonstrate the performance advantage of our sampler over traditional approaches on simulated data and show the advantages of adopting a Bayesian perspective when reanalyzing cancer datasets and comparing our results to previous maximum-likelihood-based approaches.
Availability: An R package including the sampler and examples is available at http://www.cbg.ethz.ch/software/bayes-cbn.
Contacts: [email protected]