1,131 research outputs found
Semiparametric Pseudo-Likelihood Estimation in Markov Random Fields
Probabilistic graphical models for continuous variables can be built out of either parametric or nonparametric conditional density estimators. While several research efforts have focused on parametric approaches (such as Gaussian models), kernel-based estimators are still the only viable and well-understood option for nonparametric density estimation. This paper develops a semiparametric estimator of probability density functions based on the nonparanormal transformation, which has recently been proposed for mapping arbitrarily distributed data samples onto normally distributed datasets. Pointwise and uniform consistency properties are established for the developed method. The resulting density model is then applied to pseudo-likelihood estimation in Markov random fields. An experimental evaluation on data distributed according to a variety of density functions indicates that such semiparametric Markov random field models significantly outperform both their Gaussian and kernel-based alternatives in terms of prediction accuracy.
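The idea of mapping an arbitrarily distributed sample onto normally distributed values can be illustrated for a single variable. The following is a minimal sketch, not the estimator developed in the paper: it assumes a simple rank-based empirical CDF rescaled by n + 1 (the published nonparanormal estimator uses a truncated empirical CDF instead), and the function name is hypothetical.

```python
from statistics import NormalDist


def nonparanormal_transform(sample):
    """Map a 1-D sample onto approximately standard-normal scores.

    Assumption: the marginal CDF is estimated by rank / (n + 1),
    a simplification chosen for illustration only.
    """
    n = len(sample)
    # Rank each observation from 1 to n; ties are broken by original order.
    order = sorted(range(n), key=lambda i: sample[i])
    ranks = [0] * n
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    std_normal = NormalDist()
    # Dividing by n + 1 keeps every probability strictly inside (0, 1),
    # so the inverse normal CDF is always defined.
    return [std_normal.inv_cdf(r / (n + 1)) for r in ranks]


scores = nonparanormal_transform([3.0, 1.0, 2.0])
```

The transform is monotone, so it preserves the ordering of the data while replacing the marginal distribution with an approximately Gaussian one.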
Two likelihood-based semiparametric estimation methods for panel count data with covariates
We consider estimation in a particular semiparametric regression model for
the mean of a counting process with "panel count" data. The basic model
assumption is that the conditional mean function of the counting process is of
the form $E\{\mathbb{N}(t)|Z\} = \exp(\beta_0^T Z)\Lambda_0(t)$, where $Z$ is a
vector of covariates and $\Lambda_0$ is the baseline mean function. The "panel
count" observation scheme involves observation of the counting process
$\mathbb{N}$ for an individual at a random number $K$ of random time points;
both the number and the locations of these time points may differ across
individuals. We study semiparametric maximum pseudo-likelihood and maximum
likelihood estimators of the unknown parameters $(\beta_0, \Lambda_0)$ derived
on the basis of a nonhomogeneous Poisson process assumption. The
pseudo-likelihood estimator is fairly easy to compute, while the maximum
likelihood estimator poses more challenges from the computational perspective.
We study asymptotic properties of both estimators assuming that the
proportional mean model holds, but dropping the Poisson process assumption used
to derive the estimators. In particular we establish asymptotic normality for
the estimators of the regression parameter $\beta_0$ under appropriate
hypotheses. The results show that our estimation procedures are robust in the
sense that the estimators converge to the truth regardless of the underlying
counting process.
Comment: Published at http://dx.doi.org/10.1214/009053607000000181 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
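To make the pseudo-likelihood idea concrete, the sketch below evaluates a Poisson-based pseudo log-likelihood for panel count data. It is an illustration under stated assumptions, not the authors' implementation: each observed cumulative count is treated as an independent Poisson variable whose mean is the baseline mean at the observation time multiplied by the exponential of the linear predictor, and the term that does not depend on the parameters is dropped. The function name, argument names, and data layout are hypothetical.

```python
import math


def poisson_pseudo_loglik(beta, baseline_mean, subjects):
    """Poisson pseudo log-likelihood, up to a parameter-free constant.

    beta: list of regression coefficients.
    baseline_mean: callable t -> candidate baseline mean, must be > 0.
    subjects: list of (covariates, [(time, cumulative_count), ...]).
    """
    total = 0.0
    for covariates, observations in subjects:
        # Linear predictor beta' Z for this individual.
        linear = sum(b * z for b, z in zip(beta, covariates))
        for time, count in observations:
            mean = baseline_mean(time) * math.exp(linear)
            # Poisson log-density without the log(count!) term; the
            # dependence between counts of one individual is ignored.
            total += count * math.log(mean) - mean
    return total


example = poisson_pseudo_loglik(
    [0.0], lambda t: t, [([1.0], [(1.0, 2), (2.0, 3)])]
)
```

Ignoring the within-individual dependence is what makes this a pseudo-likelihood rather than a full likelihood, and it is also why it is comparatively cheap to maximize.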
Nonparametric inference in hidden Markov models using P-splines
Hidden Markov models (HMMs) are flexible time series models in which the
distributions of the observations depend on unobserved serially correlated
states. The state-dependent distributions in HMMs are usually taken from some
class of parametrically specified distributions. The choice of this class can
be difficult, and an unfortunate choice can have serious consequences, for
example on state estimates, on forecasts, and generally on the resulting model
complexity and interpretation, in particular with respect to the number of
states. We develop a novel approach for estimating the state-dependent
distributions of an HMM in a nonparametric way, which is based on the idea of
representing the corresponding densities as linear combinations of a large
number of standardized B-spline basis functions, imposing a penalty term on
non-smoothness in order to maintain a good balance between goodness-of-fit and
smoothness. We illustrate the nonparametric modeling approach in a real data
application concerned with vertical speeds of a diving beaked whale,
demonstrating that compared to parametric counterparts it can lead to models
that are more parsimonious in terms of the number of states yet fit the data
equally well.
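The penalty on non-smoothness can be illustrated with a difference penalty on the B-spline coefficients, the usual P-spline device. The following is a minimal sketch, assuming a second-order difference penalty scaled by a smoothing parameter; the function name and the factor 1/2 are illustrative choices and are not taken from the abstract.

```python
def difference_penalty(coefficients, smoothing, order=2):
    """Penalty on rough coefficient sequences, P-spline style.

    Returns 0.5 * smoothing * (sum of squared order-th differences).
    """
    diffs = list(coefficients)
    for _ in range(order):
        # Replace the sequence by its first differences.
        diffs = [b - a for a, b in zip(diffs, diffs[1:])]
    return 0.5 * smoothing * sum(d * d for d in diffs)


smooth = difference_penalty([1.0, 2.0, 3.0, 4.0], smoothing=3.0)
rough = difference_penalty([0.0, 1.0, 0.0], smoothing=3.0)
```

Coefficients that vary linearly incur no second-order penalty, while a spike in the sequence is penalized; subtracting this term from the log-likelihood trades goodness of fit against smoothness through the smoothing parameter.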
Generalized structured additive regression based on Bayesian P-splines
Generalized additive models (GAMs) for modelling nonlinear effects of continuous covariates are now well-established tools for the applied statistician. In this paper we develop Bayesian GAMs and extensions to generalized structured additive regression based on one- or two-dimensional P-splines as the main building block. The approach extends previous work by Lang and Brezger (2003) for Gaussian responses. Inference relies on Markov chain Monte Carlo (MCMC) simulation techniques, and is based either on iteratively weighted least squares (IWLS) proposals or on latent utility representations of (multi)categorical regression models. Our approach covers the most common univariate response distributions, e.g. the Binomial, Poisson or Gamma distribution, as well as multicategorical responses. For the first time, we present Bayesian semiparametric inference for the widely used multinomial logit models. As we will demonstrate through two applications, one on the forest health status of trees and one a space-time analysis of health insurance data, the approach allows realistic modelling of complex problems. We consider the enormous flexibility and extensibility of our approach a main advantage of Bayesian inference based on MCMC techniques compared to more traditional approaches. Software for the methodology presented in the paper is provided within the public domain package BayesX.