Decomposing feature-level variation with Covariate Gaussian Process Latent Variable Models
The interpretation of complex high-dimensional data typically requires the
use of dimensionality reduction techniques to extract explanatory
low-dimensional representations. However, in many real-world problems these
representations may not be sufficient to aid interpretation on their own, and
it would be desirable to interpret the model in terms of the original features
themselves. Our goal is to characterise how feature-level variation depends on
latent low-dimensional representations, external covariates, and non-linear
interactions between the two. In this paper, we propose to achieve this through
a structured kernel decomposition in a hybrid Gaussian Process model which we
call the Covariate Gaussian Process Latent Variable Model (c-GPLVM). We
demonstrate the utility of our model on simulated examples and applications in
disease progression modelling from high-dimensional gene expression data in the
presence of additional phenotypes. In each setting we show how the c-GPLVM can
extract low-dimensional structures from high-dimensional data sets whilst
allowing a breakdown of feature-level variability that is not present in other
commonly used dimensionality reduction approaches.
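As a rough illustration of what a structured kernel decomposition over latent variables, covariates, and their interaction could look like, here is a minimal sketch. It assumes squared-exponential kernels and a sum of a latent term, a covariate term, and a product (interaction) term; the function names and variance parameters are hypothetical and are not taken from the paper, whose exact kernel may differ.

```python
import math


def rbf(a, b, lengthscale=1.0):
    """Squared-exponential kernel between two points given as lists of floats."""
    sq_dist = sum((p - q) ** 2 for p, q in zip(a, b))
    return math.exp(-0.5 * sq_dist / lengthscale ** 2)


def structured_kernel(z1, x1, z2, x2, var_z=1.0, var_x=1.0, var_zx=1.0):
    """Return the total kernel value and its three additive parts.

    z1, z2: latent coordinates; x1, x2: covariate values.
    The additive-plus-product form is an assumption made for illustration.
    """
    k_z = rbf(z1, z2)
    k_x = rbf(x1, x2)
    parts = {
        "latent": var_z * k_z,                 # variation explained by z alone
        "covariate": var_x * k_x,              # variation explained by x alone
        "interaction": var_zx * k_z * k_x,     # non-linear z-by-x interaction
    }
    return sum(parts.values()), parts


total, parts = structured_kernel([0.2], [1.0], [0.5], [0.0])
print(total, parts)
```

Keeping the three parts separate is what would let each feature's variance be attributed to the latent space, the covariate, or their interaction.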
Conditional Sum-Product Networks: Imposing Structure on Deep Probabilistic Architectures
Probabilistic graphical models are a central tool in AI; however, they are
generally not as expressive as deep neural models, and inference is notoriously
hard and slow. In contrast, deep probabilistic models such as sum-product
networks (SPNs) capture joint distributions in a tractable fashion, but still
lack the expressive power of intractable models based on deep neural networks.
Therefore, we introduce conditional SPNs (CSPNs), conditional density
estimators for multivariate and potentially hybrid domains which allow
harnessing the expressive power of neural networks while still maintaining
tractability guarantees. One way to implement CSPNs is to use an existing SPN
structure and condition its parameters on the input, e.g., via a deep neural
network. This approach, however, might misrepresent the conditional
independence structure present in data. Consequently, we also develop a
structure-learning approach that derives both the structure and parameters of
CSPNs from data. Our experimental evidence demonstrates that CSPNs are
competitive with other probabilistic models and yield superior performance on
multilabel image classification compared to mean field and mixture density
networks. Furthermore, they can successfully be employed as building blocks for
structured probabilistic models, such as autoregressive image models. Comment: 13 pages, 6 figures
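The idea of keeping a fixed SPN structure while conditioning its parameters on the input can be sketched with a toy model. The sketch below assumes a single sum node over two product nodes with Bernoulli leaves, and it swaps the deep neural network for plain linear maps; all weights and names are hypothetical and chosen only for illustration.

```python
import math
from itertools import product


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


def dot(weights, x):
    return sum(w * v for w, v in zip(weights, x))


# Hypothetical parameters: linear maps stand in for a neural network.
GATE_WEIGHTS = [[1.0, -0.5], [-1.0, 0.5]]      # one row per child of the sum node
LEAF_WEIGHTS = [
    [[0.8, 0.1], [-0.3, 0.9]],                 # product node 0: one row per label
    [[-0.6, 0.4], [0.2, -0.7]],                # product node 1: one row per label
]


def cspn_prob(y, x):
    """P(y | x) for binary labels y under the toy conditional SPN."""
    mixture = softmax([dot(w, x) for w in GATE_WEIGHTS])  # sum-node weights depend on x
    total = 0.0
    for child, weight in enumerate(mixture):
        term = 1.0
        for label_index, label in enumerate(y):
            p_one = sigmoid(dot(LEAF_WEIGHTS[child][label_index], x))
            term *= p_one if label == 1 else 1.0 - p_one   # product node over leaves
        total += weight * term
    return total


x = [0.5, -1.2]
for y in product([0, 1], repeat=2):
    print(y, cspn_prob(y, x))
```

Because the sum-node weights are normalised and each leaf is a proper Bernoulli distribution, the conditional probabilities sum to one for every input without any explicit normalisation step.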
Transient LTRE analysis reveals the demographic and trait-mediated processes that buffer population growth.
Temporal variation in environmental conditions affects population growth directly via its impact on vital rates, and indirectly through induced variation in demographic structure and phenotypic trait distributions. We currently know very little about how these processes jointly mediate population responses to their environment. To address this gap, we develop a general transient life table response experiment (LTRE) which partitions the contributions to population growth arising from variation in (1) survival and reproduction, (2) demographic structure, (3) trait values and (4) climatic drivers. We apply the LTRE to a population of yellow-bellied marmots (Marmota flaviventer) to demonstrate the impact of demographic and trait-mediated processes. Our analysis provides a new perspective on demographic buffering, which may be a more subtle phenomenon than is currently assumed. The new LTRE framework presents opportunities to improve our understanding of how trait variation influences population dynamics and adaptation in stochastic environments.
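To make the notion of partitioning contributions to growth more concrete, the following sketch applies the standard first-order (linear) LTRE variance decomposition, in which the contribution of parameter i is the sum over j of Cov(i, j) times the two sensitivities. It is not the transient LTRE developed in the paper: the parameter names, time series, and sensitivities are hypothetical, and the demographic-structure and trait terms are omitted.

```python
def mean(values):
    return sum(values) / len(values)


def covariance(a, b):
    """Sample covariance (n - 1 denominator) of two equal-length series."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / (len(a) - 1)


def ltre_contributions(series, sensitivities):
    """First-order contribution of each parameter to the variance in growth."""
    names = list(series)
    return {
        i: sum(
            covariance(series[i], series[j]) * sensitivities[i] * sensitivities[j]
            for j in names
        )
        for i in names
    }


# Hypothetical yearly parameter values and sensitivities of growth to each one.
series = {
    "survival": [0.50, 0.60, 0.55, 0.70],
    "reproduction": [1.00, 1.40, 0.90, 1.20],
}
sensitivities = {"survival": 1.2, "reproduction": 0.4}

contributions = ltre_contributions(series, sensitivities)
print(contributions)
```

When growth is exactly linear in the parameters, the contributions add up to the variance of the growth rate; for a non-linear model the sum is only an approximation.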
Computational inference beyond Kingman's coalescent
Full likelihood inference under Kingman's coalescent is a computationally challenging problem to which importance sampling (IS) and the product of approximate conditionals (PAC) method have been applied successfully. Both methods can be expressed in terms of families of intractable conditional sampling distributions (CSDs), and rely on principled approximations for accurate inference. Recently, more general Λ- and Ξ-coalescents have been observed to provide better modelling fits to some genetic data sets. We derive families of approximate CSDs for finite sites Λ- and Ξ-coalescents, and use them to obtain "approximately optimal" IS and PAC algorithms for Λ-coalescents, yielding substantial gains in efficiency over existing methods.
Individual versus cluster recoveries within a spatially structured population
Stochastic modeling of disease dynamics has had a long tradition. Among the
first epidemic models including a spatial structure in the form of local
interactions is the contact process. In this article we investigate two
extensions of the contact process describing the course of a single disease
within a spatially structured human population distributed in social clusters.
That is, each site of the d-dimensional integer lattice is occupied by a
cluster of individuals; each individual can be healthy or infected. The
evolution of the disease depends on three parameters, namely the outside
infection rate which models the interactions between the clusters, the within
infection rate which takes into account the repeated contacts between
individuals in the same cluster, and the size of each social cluster. For the
first model, we assume cluster recoveries, while individual recoveries are
assumed for the second one. The aim is to investigate the existence of
nontrivial stationary distributions for both processes depending on the value
of each of the three parameters. Our results show that the probability of an
epidemic strongly depends on the recovery mechanism. Comment: Published at http://dx.doi.org/10.1214/105051605000000764 in the
Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute
of Mathematical Statistics (http://www.imstat.org
The basic reproduction number, R_0, in structured populations
In this paper, we provide a straightforward approach to defining and deriving
the key epidemiological quantity, the basic reproduction number, R_0, for
Markovian epidemics in structured populations. The methodology derived is
applicable to, and demonstrated on, both SIR and SIS epidemics and allows
for population as well as epidemic dynamics. The approach taken is to consider
the epidemic process as a multitype process by identifying and classifying the
different types of infectious units along with the infections from, and the
transitions between, infectious units. For the household model, we show that
our expression for R_0 agrees with earlier work despite the alternative
nature of the construction of the mean reproductive matrix, and hence, the
basic reproduction number. Comment: 26 pages
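In a multitype setting, the basic reproduction number is conventionally obtained as the dominant eigenvalue of the mean reproduction matrix. The sketch below shows that last step only, using power iteration in plain Python; the two-type matrix is hypothetical and is not derived from the household model in the paper.

```python
def dominant_eigenvalue(matrix, iterations=200):
    """Dominant eigenvalue of a non-negative square matrix by power iteration.

    matrix[i][j] is read as the mean number of type-j infectious units
    produced by one type-i unit (an assumed convention; the eigenvalue is
    the same for the transpose).
    """
    size = len(matrix)
    vector = [1.0] * size
    value = 0.0
    for _ in range(iterations):
        image = [
            sum(matrix[i][j] * vector[j] for j in range(size))
            for i in range(size)
        ]
        value = max(abs(entry) for entry in image)
        if value == 0.0:
            return 0.0                      # no reproduction at all
        vector = [entry / value for entry in image]
    return value


# Hypothetical mean reproduction matrix for two types of infectious unit.
MEAN_MATRIX = [
    [1.2, 0.4],
    [0.3, 0.8],
]

r0 = dominant_eigenvalue(MEAN_MATRIX)
print(r0)
```

For this example matrix the eigenvalues are 1.4 and 0.6, so the estimate is approximately 1.4, which is above the epidemic threshold of 1.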
Partitioning variance in population growth for models with environmental and demographic stochasticity
How demographic factors lead to variation or change in growth rates can be investigated using life table response experiments (LTREs) based on structured population models. Traditionally, LTREs focused on decomposing the asymptotic growth rate, but more recently decompositions of annual 'realized' growth rates using 'transient' LTREs have gained in popularity. Transient LTREs have been used particularly to understand how variation in vital rates translates into variation in growth for populations under long-term study. For these, complete population models may be constructed to investigate how temporal variation in environmental drivers affects vital rates. Such investigations have usually come down to estimating covariate coefficients for the effects of environmental variables on vital rates, but formal ways of assessing how they lead to variation in growth rates have been lacking. We extend transient LTREs to further partition the contributions from vital rates into contributions from temporally varying factors that affect them. The decomposition allows one to compare the resultant effect on the growth rate of different environmental factors, as well as density dependence, which may each act via multiple vital rates. We also show how realized growth rates can be decomposed into separate components from environmental and demographic stochasticity. The latter is typically omitted in LTRE analyses. We illustrate these extensions with an integrated population model (IPM) for data from a 26-year study on northern wheatears (Oenanthe oenanthe), a migratory passerine bird breeding in an agricultural landscape.
For this population, consisting of around 50-120 breeding pairs per year, we partition variation in realized growth rates into environmental contributions from temperature, rainfall, population density and unexplained random variation via multiple vital rates, and from demographic stochasticity. The case study suggests that variation in first year survival via the unexplained random component, and adult survival via temperature are two main factors behind environmental variation in growth rates. More than half of the variation i
- …