    Nonparametric inference in hidden Markov models using P-splines

    Hidden Markov models (HMMs) are flexible time series models in which the distributions of the observations depend on unobserved serially correlated states. The state-dependent distributions in HMMs are usually taken from some class of parametrically specified distributions. The choice of this class can be difficult, and an unfortunate choice can have serious consequences, for example for state estimates and forecasts, and more generally for the resulting model complexity and interpretation, in particular with respect to the number of states. We develop a novel approach for estimating the state-dependent distributions of an HMM in a nonparametric way, based on representing the corresponding densities as linear combinations of a large number of standardized B-spline basis functions, with a penalty term on non-smoothness imposed to maintain a good balance between goodness of fit and smoothness. We illustrate the nonparametric modeling approach in a real-data application concerned with the vertical speeds of a diving beaked whale, demonstrating that, compared to parametric counterparts, it can lead to models that are more parsimonious in terms of the number of states yet fit the data equally well.
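The construction described in this abstract is easy to prototype. Below is a minimal sketch, assuming cubic B-splines standardized to integrate to one, softmax-transformed weights per state, and a second-order difference penalty applied to the unconstrained coefficients; all names (standardized_basis, penalized_nll, lam) are illustrative, not the authors' code.

```python
import numpy as np
from scipy.interpolate import BSpline

def standardized_basis(x, lo, hi, n_basis, degree=3):
    """B-spline basis evaluated at x; each column rescaled to integrate to one."""
    inner = np.linspace(lo, hi, n_basis - degree + 1)
    t = np.r_[[lo] * degree, inner, [hi] * degree]        # clamped knot vector
    spl = BSpline(t, np.eye(n_basis), degree)
    grid = np.linspace(lo, hi, 2001)
    areas = spl(grid).sum(axis=0) * (grid[1] - grid[0])   # Riemann-sum integrals
    return spl(x) / areas

def penalized_nll(theta, B, n_states, lam):
    """Negative HMM log-likelihood plus a second-order difference penalty."""
    K = B.shape[1]
    a = theta[:n_states * K].reshape(n_states, K)          # unconstrained coefficients
    w = np.exp(a) / np.exp(a).sum(axis=1, keepdims=True)   # softmax -> density weights
    dens = B @ w.T                                         # state densities at the data
    g = theta[n_states * K:].reshape(n_states, n_states - 1)
    Gam = np.eye(n_states)                                 # diagonal as reference category
    Gam[~np.eye(n_states, dtype=bool)] = np.exp(g).ravel()
    Gam /= Gam.sum(axis=1, keepdims=True)                  # transition probability matrix
    alpha = np.full(n_states, 1.0 / n_states) * dens[0]    # scaled forward algorithm
    ll = np.log(alpha.sum()); alpha /= alpha.sum()
    for t in range(1, B.shape[0]):
        alpha = (alpha @ Gam) * dens[t]
        ll += np.log(alpha.sum()); alpha /= alpha.sum()
    pen = lam * (np.diff(a, n=2, axis=1) ** 2).sum()       # smoothness penalty
    return -ll + pen
```

Minimizing penalized_nll over theta (e.g. with scipy.optimize.minimize, B precomputed from the observations) then trades goodness of fit against smoothness through lam; how lam is chosen in the paper is not reproduced here.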

    A shared-parameter continuous-time hidden Markov and survival model for longitudinal data with informative dropout

    A shared-parameter approach for jointly modeling longitudinal and survival data is proposed. Compared with available approaches, it allows for time-varying random effects that affect both the longitudinal and the survival processes. The distribution of these random effects is modeled according to a continuous-time hidden Markov chain, so that transitions may occur at any time point. For maximum likelihood estimation, we propose an algorithm based on a discretization of time until censoring into an arbitrary number of time windows. The observed information matrix is used to obtain standard errors. We illustrate the approach by simulation, also with respect to the effect of the number of time windows on the precision of the estimates, and by an application to data on patients suffering from mildly dilated cardiomyopathy.
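The discretization step described here amounts to computing transition probability matrices over each time window from the generator of the continuous-time chain. A minimal sketch follows, with an invented 3-state generator Q (not taken from the paper):

```python
import numpy as np
from scipy.linalg import expm

# Illustrative 3-state generator: nonnegative off-diagonal rates, rows summing to zero.
Q = np.array([[-0.40,  0.30,  0.10],
              [ 0.20, -0.50,  0.30],
              [ 0.05,  0.15, -0.20]])

def window_tpms(Q, breaks):
    """Transition matrices over each window: P(t_{j-1}, t_j) = expm(Q * (t_j - t_{j-1}))."""
    return [expm(Q * dt) for dt in np.diff(breaks)]

breaks = np.linspace(0.0, 5.0, 11)   # ten equal windows up to a censoring time of 5
P_list = window_tpms(Q, breaks)      # feeds a discretized forward likelihood recursion
```

Refining the partition (more, shorter windows) makes the discretized likelihood approach the continuous-time one, which is why the abstract studies the effect of the number of windows on estimation precision.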

    A semiparametric extension of the stochastic block model for longitudinal networks

    To model recurrent interaction events in continuous time, an extension of the stochastic block model is proposed in which every individual belongs to a latent group and interactions between two individuals follow a conditional inhomogeneous Poisson process with intensity driven by the individuals' latent groups. The model is shown to be identifiable, and its estimation is based on a semiparametric variational expectation-maximization algorithm. Two versions of the method are developed, using either a nonparametric histogram approach (with an adaptive choice of the partition size) or kernel intensity estimators. The number of latent groups can be selected by an integrated classification likelihood criterion. Finally, we demonstrate the performance of our procedure on synthetic experiments, analyse two datasets to illustrate the utility of our approach, and comment on competing methods.
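The model's ingredients are straightforward to sketch. The following toy code, with an invented group-pair intensity alpha_ql, simulates one pair's events from an inhomogeneous Poisson process by thinning and recovers a piecewise-constant (histogram) intensity estimate; it illustrates the generative mechanism and the histogram variant only, not the authors' variational EM.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_pair(alpha, T, lam_max):
    """Thinning: propose at rate lam_max, keep each point with prob alpha(t)/lam_max."""
    t, events = 0.0, []
    while True:
        t += rng.exponential(1.0 / lam_max)
        if t > T:
            return np.array(events)
        if rng.uniform() < alpha(t) / lam_max:
            events.append(t)

def histogram_intensity(event_times, T, n_bins):
    """Piecewise-constant intensity estimate: events per bin divided by bin width."""
    counts, edges = np.histogram(event_times, bins=n_bins, range=(0.0, T))
    return counts / np.diff(edges), edges

alpha_ql = lambda t: 1.0 + 0.8 * np.sin(2.0 * np.pi * t / 5.0)  # toy intensity, bounded by 1.8
events = simulate_pair(alpha_ql, T=50.0, lam_max=1.8)
lam_hat, edges = histogram_intensity(events, T=50.0, n_bins=25)
```

In the full method the group memberships are latent, so the intensity estimates are weighted by the variational posterior over group pairs rather than computed from known labels as above.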

    Simple approximate MAP inference for Dirichlet process mixtures

    The Dirichlet process mixture model (DPMM) is a ubiquitous, flexible Bayesian nonparametric statistical model. However, full probabilistic inference in this model is analytically intractable, so that computationally intensive techniques such as Gibbs sampling are required. As a result, DPMM-based methods, which have considerable potential, are restricted to applications in which computational resources and time for inference are plentiful. For example, they would not be practical for digital signal processing on embedded hardware, where computational resources are at a serious premium. Here, we develop a simplified yet statistically rigorous approximate maximum a posteriori (MAP) inference algorithm for DPMMs. This algorithm is as simple as DP-means clustering and solves the MAP problem as well as Gibbs sampling does, while requiring only a fraction of the computational effort. (For freely available code that implements the MAP-DP algorithm for Gaussian mixtures see http://www.maxlittle.net/.) Unlike related small variance asymptotics (SVA), our method is non-degenerate and so inherits the "rich get richer" property of the Dirichlet process. It also retains a non-degenerate closed-form likelihood, which enables out-of-sample calculations and the use of standard tools such as cross-validation. We illustrate the benefits of our algorithm on a range of examples and contrast it with variational, SVA and sampling approaches, both from a computational complexity perspective and in terms of clustering performance. We demonstrate the wide applicability of our approach by presenting an approximate MAP inference method for the infinite hidden Markov model whose performance compares favorably with a recently proposed hybrid SVA approach. Similarly, we show how our algorithm can be applied to a semiparametric mixed-effects regression model in which the random effects distribution is modelled using an infinite mixture model, as used in longitudinal progression modelling in population health science. Finally, we propose directions for future research on approximate MAP inference in Bayesian nonparametrics.
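To convey the flavor of the hard-assignment step, here is a heavily simplified sketch assuming spherical Gaussian clusters with fixed variance sigma2 and concentration alpha; the paper's MAP-DP uses full conjugate prior-predictive distributions, so treat this as an illustration of the structure, not the published algorithm. The -log N_k term is what preserves the "rich get richer" property that degenerate SVA methods such as DP-means lose.

```python
import numpy as np

def map_dp_sweep(X, z, sigma2=1.0, alpha=1.0):
    """One sweep of MAP-style hard assignments over all points (simplified sketch)."""
    for i in range(len(X)):
        z[i] = -1                                    # hold point i out
        labels = list(np.unique(z[z >= 0]))
        costs = []
        for k in labels:
            members = X[z == k]
            mu = members.mean(axis=0)
            # existing cluster: -log N_k (rich-get-richer) + Gaussian misfit to cluster mean
            costs.append(-np.log(len(members))
                         + ((X[i] - mu) ** 2).sum() / (2.0 * sigma2))
        # opening a new cluster: -log(alpha) + misfit under a zero-mean prior (assumed here)
        costs.append(-np.log(alpha) + (X[i] ** 2).sum() / (2.0 * sigma2))
        best = int(np.argmin(costs))
        z[i] = labels[best] if best < len(labels) else max(labels, default=-1) + 1
    return z
```

A typical usage pattern would initialize z = np.zeros(len(X), dtype=int) and repeat map_dp_sweep(X, z) until the labels stop changing; because each sweep never increases the (approximate) MAP objective, the iteration terminates, mirroring the convergence behavior the paper claims for the full algorithm.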