hdpGLM: An R Package to Estimate Heterogeneous Effects in Generalized Linear Models Using Hierarchical Dirichlet Process
The existence of latent clusters with different responses to a treatment is a major concern in scientific research, as effect heterogeneity often emerges from unobserved features of the subjects - e.g., genetic characteristics, personality traits, or hidden motivations. Conventional random- and fixed-effects methods cannot capture such heterogeneity when the group markers associated with it are unobserved. Alternative methods that combine regression models and clustering procedures using the Dirichlet process are available, but they are complex to implement, especially for non-linear regression models with discrete or binary outcomes. This article discusses the R package hdpGLM as a means of implementing a novel hierarchical Dirichlet process approach, outlined in Ferrari (2020), to estimating mixtures of generalized linear models. The methods implemented make it easy for researchers to investigate heterogeneity in the effect of treatment or background variables and to identify clusters of subjects with differential effects. The package provides several features for out-of-the-box estimation and for generating numerical summaries and visualizations of the results. A comparison with other similar R packages is also provided.
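The kind of latent effect heterogeneity the package targets can be illustrated with a minimal simulation (plain Python/NumPy, not the hdpGLM API; all parameter values are invented for illustration): two unobserved clusters with opposite treatment effects cancel in a pooled regression but are recovered by cluster-specific fits.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two latent clusters with opposite treatment effects (+2 and -2).
# A pooled regression averages them away; per-cluster fits recover them.
n = 500
z = rng.integers(0, 2, size=n)        # latent cluster labels (unobserved in practice)
x = rng.normal(size=n)                # treatment / background variable
beta = np.where(z == 0, 2.0, -2.0)    # cluster-specific effects
y = beta * x + rng.normal(scale=0.3, size=n)

def ols_slope(x, y):
    """Slope of a simple least-squares fit of y on x."""
    return np.cov(x, y, bias=True)[0, 1] / np.var(x)

pooled = ols_slope(x, y)                                         # near 0: heterogeneity hidden
per_cluster = [ols_slope(x[z == k], y[z == k]) for k in (0, 1)]  # near +2 and -2
print(pooled, per_cluster)
```

A Dirichlet process mixture of GLMs, as in the paper, infers labels like `z` and the number of clusters from the data rather than assuming them known.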
On the posterior distribution of classes of random means
The study of properties of mean functionals of random probability measures is
an important area of research in the theory of Bayesian nonparametric
statistics. Many results are now known for random Dirichlet means, but little
is known, especially in terms of posterior distributions, for classes of priors
beyond the Dirichlet process. In this paper, we consider normalized random
measures with independent increments (NRMI's) and mixtures of NRMI. In both
cases, we are able to provide exact expressions for the posterior distribution
of their means. These general results are then specialized, leading to
distributional results for means of two important particular cases of NRMI's
and also of the two-parameter Poisson--Dirichlet process.

Published in the Bernoulli (http://isi.cbs.nl/bernoulli/), http://dx.doi.org/10.3150/09-BEJ200, by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm).
Nonlinear Models Using Dirichlet Process Mixtures
We introduce a new nonlinear model for classification, in which we model the
joint distribution of response variable, y, and covariates, x,
non-parametrically using Dirichlet process mixtures. We keep the relationship
between y and x linear within each component of the mixture. The overall
relationship becomes nonlinear if the mixture contains more than one component.
We use simulated data to compare the performance of this new approach to a
simple multinomial logit (MNL) model, an MNL model with quadratic terms, and a
decision tree model. We also evaluate our approach on a protein fold
classification problem, and find that our model provides substantial
improvement over previous methods, which were based on Neural Networks (NN) and
Support Vector Machines (SVM). Protein folds have a hierarchical class
structure. We extend our method to classification problems where a class
hierarchy is available and find that using prior information about the
hierarchical structure of protein folds can yield higher predictive
accuracy.
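The mechanism the abstract describes - linear within each component, nonlinear overall - can be sketched with a fixed two-component mixture (a stand-in for the fitted Dirichlet process mixture; the component parameters here are made up for illustration):

```python
import numpy as np

# Two components; within each, y | x is exactly linear, but component
# membership depends on x, so E[y | x] bends between the two lines.
comps = [
    dict(w=0.5, mu=-2.0, sd=1.0, slope=1.0, icpt=1.0),   # y ~ 1 + x
    dict(w=0.5, mu=2.0, sd=1.0, slope=-1.0, icpt=3.0),   # y ~ 3 - x
]

def npdf(x, mu, sd):
    """Normal density (written out to avoid a SciPy dependency)."""
    return np.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

def cond_mean(x):
    """E[y | x]: responsibility-weighted average of the component lines."""
    x = np.asarray(x, dtype=float)
    dens = np.array([c["w"] * npdf(x, c["mu"], c["sd"]) for c in comps])
    resp = dens / dens.sum(axis=0)                      # P(component | x)
    preds = np.array([c["icpt"] + c["slope"] * x for c in comps])
    return (resp * preds).sum(axis=0)

print(cond_mean([-4.0, 0.0, 4.0]))  # far left follows the first line, far right the second
```

With a single component the regression function is a straight line; nonlinearity appears exactly when the posterior puts mass on more than one component, which is the point made in the abstract.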
Bayesian Nonparametric Calibration and Combination of Predictive Distributions
We introduce a Bayesian approach to predictive density calibration and
combination that accounts for parameter uncertainty and model set
incompleteness through the use of random calibration functionals and random
combination weights. Building on the work of Ranjan, R. and Gneiting, T. (2010)
and Gneiting, T. and Ranjan, R. (2013), we use infinite beta mixtures for the
calibration. The proposed Bayesian nonparametric approach takes advantage of
the flexibility of Dirichlet process mixtures to achieve any continuous
deformation of linearly combined predictive distributions. The inference
procedure is based on Gibbs sampling and allows accounting for uncertainty in
the number of mixture components, mixture weights, and calibration parameters.
Weak posterior consistency of the Bayesian nonparametric calibration is
established under suitable conditions on the unknown true density. We study the
methodology in simulation examples with fat tails and multimodal densities and
apply it to density forecasts of daily S&P returns and daily maximum wind speed
at the Frankfurt airport.
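A one-component version of the beta calibration map can be sketched as follows (pure NumPy/stdlib; the beta CDF is computed by crude numerical integration, and the two normal forecast components and all parameter values are invented for illustration - the paper's infinite beta mixture generalizes this single beta deformation):

```python
import math
import numpy as np

def norm_cdf(y, mu, sd):
    """Normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf((y - mu) / (sd * math.sqrt(2.0))))

def beta_cdf(u, a, b, n=20001):
    """Regularized incomplete beta by the trapezoid rule (sketch only; a, b >= 1)."""
    t = np.linspace(0.0, u, n)
    dens = t ** (a - 1) * (1 - t) ** (b - 1)
    dt = t[1] - t[0] if n > 1 else 0.0
    integral = float(np.sum((dens[:-1] + dens[1:]) * 0.5) * dt)
    B = math.gamma(a) * math.gamma(b) / math.gamma(a + b)
    return integral / B

def calibrated_pool(y, weights, comps, a, b):
    """Beta-deformed linear pool: G(y) = BetaCDF(sum_k w_k F_k(y); a, b)."""
    u = sum(w * norm_cdf(y, mu, sd) for w, (mu, sd) in zip(weights, comps))
    return beta_cdf(u, a, b)

w, comps = [0.5, 0.5], [(-1.0, 1.0), (1.0, 1.0)]
print(calibrated_pool(0.0, w, comps, a=2.0, b=2.0))  # 0.5 by symmetry of Beta(2, 2)
```

Because the beta CDF is a continuous, monotone map of [0, 1] onto itself, the deformed pool remains a valid predictive CDF while reshaping the probabilities the linear pool assigns.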
Estimation in Dirichlet random effects models
We develop a new Gibbs sampler for a linear mixed model with a Dirichlet
process random effect term, which is easily extended to a generalized linear
mixed model with a probit link function. Our Gibbs sampler exploits the
properties of the multinomial and Dirichlet distributions, and is shown to be
an improvement, in terms of operator norm and efficiency, over other commonly
used MCMC algorithms. We also investigate methods for the estimation of the
precision parameter of the Dirichlet process, finding that maximum likelihood
may not be desirable, but a posterior mode is a reasonable approach. Examples
are given to show how these models perform on real data. Our results complement
both the theoretical basis of the Dirichlet process nonparametric prior and the
computational work that has been done to date.

Published in the Annals of Statistics (http://www.imstat.org/aos/), http://dx.doi.org/10.1214/09-AOS731, by the Institute of Mathematical Statistics (http://www.imstat.org).
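The role of the precision parameter discussed in this abstract can be illustrated with a short Chinese-restaurant-process simulation (generic Python, not the paper's Gibbs sampler): a larger precision alpha makes the Dirichlet process random effect induce more distinct clusters among the subjects.

```python
import numpy as np

def crp_num_clusters(n, alpha, rng):
    """Number of clusters among n subjects under a CRP with precision alpha."""
    counts = []  # sizes of the clusters opened so far
    for _ in range(n):
        # Join an existing cluster w.p. proportional to its size,
        # or open a new one w.p. proportional to alpha.
        probs = np.array(counts + [alpha], dtype=float)
        probs /= probs.sum()
        k = rng.choice(len(probs), p=probs)
        if k == len(counts):
            counts.append(1)   # new cluster
        else:
            counts[k] += 1
    return len(counts)

rng = np.random.default_rng(1)
small = np.mean([crp_num_clusters(200, 0.5, rng) for _ in range(50)])
large = np.mean([crp_num_clusters(200, 5.0, rng) for _ in range(50)])
print(small, large)  # larger alpha -> more clusters
```

This sensitivity is why the abstract's question - how to estimate the precision parameter, and whether maximum likelihood or a posterior mode is preferable - matters in practice.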