An alternative marginal likelihood estimator for phylogenetic models
Bayesian phylogenetic methods are generating noticeable enthusiasm in the
field of molecular systematics. Many phylogenetic models are often under
consideration, and different approaches are used to compare them within a Bayesian framework. The
Bayes factor, defined as the ratio of the marginal likelihoods of two competing
models, plays a key role in Bayesian model selection. We focus on an
alternative estimator of the marginal likelihood, whose computation is still a
challenging problem. Several computational solutions have been proposed, but
none of them outperforms the others simultaneously in terms of
simplicity of implementation, computational burden and precision of the
estimates. Practitioners and researchers, often led by available software, have
so far favored the simplicity of the harmonic mean estimator (HM) and the
arithmetic mean estimator (AM). However, it is known that the resulting
estimates of the Bayesian evidence in favor of one model are biased and often
inaccurate, to the point of having infinite variance, so that the reliability of the
corresponding conclusions is doubtful. Our new implementation of the
generalized harmonic mean (GHM) idea recycles MCMC simulations from the
posterior, shares the computational simplicity of the original HM estimator,
but, unlike it, overcomes the infinite variance issue. The alternative
estimator is applied to simulated phylogenetic data and produces fully
satisfactory results, outperforming the simple estimators currently provided
by most of the publicly available software.
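
The harmonic mean idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names and the user-supplied instrumental density `log_g` are assumptions made for illustration. The HM estimator rests on the identity 1/m = E_post[1/L(θ)], and the generalized version on 1/m = E_post[g(θ) / (L(θ)π(θ))] for any normalized density g, where the expectations are approximated by averaging over MCMC draws from the posterior. Choosing g equal to the prior recovers the plain HM estimator.

```python
import numpy as np


def _log_mean_exp(values):
    """Numerically stable log of the arithmetic mean of exp(values)."""
    values = np.asarray(values, dtype=float)
    peak = values.max()
    return peak + np.log(np.mean(np.exp(values - peak)))


def log_harmonic_mean(log_liks):
    """Log of the HM estimate of the marginal likelihood.

    log_liks: log-likelihood evaluated at each posterior draw.
    """
    log_liks = np.asarray(log_liks, dtype=float)
    # 1/m is estimated by the posterior average of 1/L(theta).
    return -_log_mean_exp(-log_liks)


def log_generalized_harmonic_mean(log_liks, log_priors, log_g):
    """Log of a GHM estimate of the marginal likelihood.

    log_g is the log of a normalized instrumental density evaluated at
    each posterior draw (hypothetical input, chosen by the user).
    """
    log_liks = np.asarray(log_liks, dtype=float)
    log_priors = np.asarray(log_priors, dtype=float)
    log_g = np.asarray(log_g, dtype=float)
    # 1/m is estimated by the posterior average of g / (L * prior).
    return -_log_mean_exp(log_g - log_liks - log_priors)
```

In this sketch the variance of the estimate depends on how the tails of g compare with those of the posterior; an instrumental density with lighter tails than the posterior is the usual way to keep the averaged ratio bounded.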
Statistical Phylogenetic Tree Analysis Using Differences of Means
We propose a statistical method to test whether two phylogenetic trees with
given alignments are significantly incongruent. Our method compares the two
distributions of phylogenetic trees given by the input alignments, instead of
comparing point estimates of trees. This statistical approach can be applied
to gene tree analysis, for example to detect unusual events in genome evolution
such as horizontal gene transfer and reshuffling. Our method uses the difference of
means to compare two distributions of trees, after embedding trees in a vector
space. Bootstrapping alignment columns can then be applied to obtain p-values.
To compute distances between means, we employ a "kernel trick" which speeds up
distance calculations when trees are embedded in a high-dimensional feature
space, e.g. the splits or quartets feature space. In this pilot study, we first
test our statistical method's ability to distinguish between sets of gene trees
generated under coalescence models with species trees of varying dissimilarity.
We follow our simulation results with applications to various data sets of
gophers and lice, grasses and their endophytes, and different fungal genes from
the same genome. A companion toolkit, Phylotree, is provided to
facilitate computational experiments.
Comment: 17 pages, 6 figures
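
The "kernel trick" for distances between means can be sketched as follows. This is an illustration under stated assumptions, not the Phylotree toolkit's API: each tree is represented as a Python set of splits (each split a `frozenset` of taxon names), and the kernel is the linear kernel in the splits feature space, i.e. the number of shared splits. All function names are hypothetical. The squared distance between the two sample means in feature space is mean k(X, X) + mean k(Y, Y) − 2 mean k(X, Y), so the high-dimensional feature vectors never need to be built.

```python
from itertools import product


def split_kernel(tree_a, tree_b):
    """Linear kernel on 0/1 split indicator vectors: count shared splits."""
    return len(tree_a & tree_b)


def mean_kernel(sample_a, sample_b, kernel):
    """Average kernel value over all pairs drawn from the two samples."""
    total = sum(kernel(a, b) for a, b in product(sample_a, sample_b))
    return total / (len(sample_a) * len(sample_b))


def squared_mean_distance(xs, ys, kernel=split_kernel):
    """Squared distance between the feature-space means of two tree samples."""
    return (
        mean_kernel(xs, xs, kernel)
        + mean_kernel(ys, ys, kernel)
        - 2 * mean_kernel(xs, ys, kernel)
    )


# Toy example: two single-tree samples that share one split out of two.
ab = frozenset({"A", "B"})
cd = frozenset({"C", "D"})
ce = frozenset({"C", "E"})
sample_x = [{ab, cd}]
sample_y = [{ab, ce}]
distance_sq = squared_mean_distance(sample_x, sample_y)
```

Under these assumptions, a p-value could then be obtained by recomputing this statistic on trees inferred from bootstrapped alignment columns, as the abstract describes.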
Evolutionary Inference via the Poisson Indel Process
We address the problem of the joint statistical inference of phylogenetic
trees and multiple sequence alignments from unaligned molecular sequences. This
problem is generally formulated in terms of string-valued evolutionary
processes along the branches of a phylogenetic tree. The classical evolutionary
process, the TKF91 model, is a continuous-time Markov chain model comprising
insertion, deletion and substitution events. Unfortunately, this model gives
rise to an intractable computational problem: the computation of the marginal
likelihood under the TKF91 model is exponential in the number of taxa. In this
work, we present a new stochastic process, the Poisson Indel Process (PIP), in
which the complexity of this computation is reduced to linear. The new model is
closely related to the TKF91 model, differing only in its treatment of
insertions, but the new model has a global characterization as a Poisson
process on the phylogeny. Standard results for Poisson processes allow key
computations to be decoupled, which yields the favorable computational profile
of inference under the PIP model. We present illustrative experiments in which
Bayesian inference under the PIP model is compared to separate inference of
phylogenies and alignments.
Comment: 33 pages, 6 figures