The identifiability of tree topology for phylogenetic models, including covarion and mixture models
For a model of molecular evolution to be useful for phylogenetic inference,
the topology of evolutionary trees must be identifiable. That is, from a joint
distribution the model predicts, it must be possible to recover the tree
parameter. We establish tree identifiability for a number of phylogenetic
models, including a covarion model and a variety of mixture models with a
limited number of classes. The proof is based on the introduction of a more
general model, allowing more states at internal nodes of the tree than at
leaves, and the study of the algebraic variety formed by the joint
distributions to which it gives rise. Tree identifiability is first established
for this general model through the use of certain phylogenetic invariants. Comment: 20 pages, 1 figure
Brain Weight and Life-Span in Primate Species
In haplorhine primates (tarsiers, monkeys, apes, and humans), there is a significant correlation between brain weight and maximum life-span when the effect of body size is removed. There is also a significant correlation in haplorhine primates between brain weight and female age at first reproduction. For strepsirhine primates (lorises and lemurs), there are no significant correlations between brain weight and either life-span or female reproductive age when the effect of body size is removed. This lack of correlation in strepsirhine primates may be related to the fact that these primates are nocturnal and/or natives of the island of Madagascar, both of which conditions may reduce competition for resources and predation pressure. These findings suggest that in haplorhine primates the genetic systems controlling brain growth are linked to the systems governing the life cycle so that species with longer cycles have larger brains. When the effect of body weight is removed, leaf-eating haplorhines have significantly smaller brains and shorter lives than haplorhines with other diets. Harem-living haplorhines also have significantly smaller brains and shorter life-spans than troop-living haplorhines when the effect of body weight is removed. We also sought to test the rate-of-living hypothesis by determining whether primates with basal metabolic rates that are higher than would be expected for their body size have shorter maximum life-spans than would be expected for their body size. Metabolic rate is not correlated with life-span or female age at first reproduction when the effect of body size is removed.
Identifying evolutionary trees and substitution parameters for the general Markov model with invariable sites
The general Markov plus invariable sites (GM+I) model of biological sequence
evolution is a two-class model in which an unknown proportion of sites are not
allowed to change, while the remainder undergo substitutions according to a
Markov process on a tree. For statistical use it is important to know if the
model is identifiable; can both the tree topology and the numerical parameters
be determined from a joint distribution describing sequences only at the leaves
of the tree? We establish that for generic parameters both the tree and all
numerical parameter values can be recovered, up to clearly understood issues of
'label swapping.' The method of analysis is algebraic, using phylogenetic
invariants to study the variety defined by the model. Simple rational formulas,
expressed in terms of determinantal ratios, are found for recovering numerical
parameters describing the invariable sites.
Charlie and Joyce at the Intersection of Science and Art
One evening over sushi in late 2014, Doris Tsao and I came up with the idea of inviting Charlie Gross and Joyce Carol Oates to come to Caltech to have a public conversation about science and art (Fig. 1). The idea was received enthusiastically by our colleagues, which led to the Division of Biology and Biological Engineering and the Division of the Humanities and Social Sciences through the James Michelin Distinguished Visitors program sponsoring “At the Intersection of Art and Sciences: a conversation with Joyce Carol Oates and Charles Gross” which played to a packed house of more than 300 at Caltech on April 29, 2015. Here is one image grabbed from the video made of their conversation. Charlie and Joyce are discussing how the amnesic subject, HM, whom she learned about from Charlie, inspired her to write A Man Without A Shadow. Her novel is about a neuropsychologist who is studying HM and falls in love with him even though he cannot remember her! Their fascinating conversation can be seen at: https://www.youtube.com/watch?v=dyRuSTjQmVg
Identifiability of parameters in latent structure models with many observed variables
While hidden class models of various types arise in many statistical
applications, it is often difficult to establish the identifiability of their
parameters. Focusing on models in which there is some structure of independence
of some of the observed variables conditioned on hidden ones, we demonstrate a
general approach for establishing identifiability utilizing algebraic
arguments. A theorem of J. Kruskal for a simple latent-class model with finite
state space lies at the core of our results, though we apply it to a diverse
set of models. These include mixtures of both finite and nonparametric product
distributions, hidden Markov models and random graph mixture models, and lead
to a number of new results and improvements to old ones. In the parametric
setting, this approach indicates that for such models, the classical definition
of identifiability is typically too strong. Instead generic identifiability
holds, which implies that the set of nonidentifiable parameters has measure
zero, so that parameter inference is still meaningful. In particular, this
sheds light on the properties of finite mixtures of Bernoulli products, which
have been used for decades despite being known to have nonidentifiable
parameters. In the nonparametric setting, we again obtain identifiability only
when certain restrictions are placed on the distributions that are mixed, but
we explicitly describe the conditions. Comment: Published at http://dx.doi.org/10.1214/09-AOS689 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org)
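The label-swapping ambiguity mentioned in these abstracts can be illustrated with a minimal sketch for a finite mixture of Bernoulli products. The function name `joint_distribution` and the numerical values of `weights` and `probs` are assumptions chosen for illustration; they do not come from the paper. The sketch only shows that permuting the hidden class labels leaves the joint distribution of the observed variables unchanged, which is why parameters can be recovered at best up to such a permutation. It does not implement the algebraic identifiability argument itself.

```python
from itertools import product


def joint_distribution(weights, probs):
    """Joint pmf of binary observed variables under a finite mixture of
    Bernoulli products (illustrative helper, not from the paper).

    weights[k]  : probability of hidden class k
    probs[k][j] : P(X_j = 1 | hidden class k)
    """
    n_vars = len(probs[0])
    dist = {}
    for x in product((0, 1), repeat=n_vars):
        total = 0.0
        for w, p in zip(weights, probs):
            term = w
            # Observed variables are independent given the hidden class.
            for xj, pj in zip(x, p):
                term *= pj if xj == 1 else 1.0 - pj
            total += term
        dist[x] = total
    return dist


# Hypothetical parameters: two hidden classes, three observed binary variables.
weights = [0.3, 0.7]
probs = [[0.9, 0.2, 0.6], [0.1, 0.8, 0.4]]

original = joint_distribution(weights, probs)
# Swap the two class labels: reorder the weights and the rows of probs together.
swapped = joint_distribution(weights[::-1], probs[::-1])

# The two parameterizations give the same observed distribution.
same = all(abs(original[x] - swapped[x]) < 1e-12 for x in original)
print(same)
```

The comparison uses a small tolerance because the two parameterizations sum the class terms in a different order, so the floating-point results may differ in the last digits.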