7,435 research outputs found

    The supervised hierarchical Dirichlet process

    Full text link
    We propose the supervised hierarchical Dirichlet process (sHDP), a nonparametric generative model for the joint distribution of a group of observations and a response variable directly associated with that whole group. We compare the sHDP with another leading method for regression on grouped data, the supervised latent Dirichlet allocation (sLDA) model. We evaluate our method on two real-world classification problems and two real-world regression problems. Bayesian nonparametric regression models based on the Dirichlet process, such as the Dirichlet process-generalised linear models (DP-GLM) have previously been explored; these models allow flexibility in modelling nonlinear relationships. However, until now, Hierarchical Dirichlet Process (HDP) mixtures have not seen significant use in supervised problems with grouped data since a straightforward application of the HDP on the grouped data results in learnt clusters that are not predictive of the responses. The sHDP solves this problem by allowing for clusters to be learnt jointly from the group structure and from the label assigned to each group.Comment: 14 page

    Generative Supervised Classification Using Dirichlet Process Priors.

    Get PDF
    Choosing the appropriate parameter prior distributions associated to a given Bayesian model is a challenging problem. Conjugate priors can be selected for simplicity motivations. However, conjugate priors can be too restrictive to accurately model the available prior information. This paper studies a new generative supervised classifier which assumes that the parameter prior distributions conditioned on each class are mixtures of Dirichlet processes. The motivations for using mixtures of Dirichlet processes is their known ability to model accurately a large class of probability distributions. A Monte Carlo method allowing one to sample according to the resulting class-conditional posterior distributions is then studied. The parameters appearing in the class-conditional densities can then be estimated using these generated samples (following Bayesian learning). The proposed supervised classifier is applied to the classification of altimetric waveforms backscattered from different surfaces (oceans, ices, forests, and deserts). This classification is a first step before developing tools allowing for the extraction of useful geophysical information from altimetric waveforms backscattered from nonoceanic surfaces

    A Tutorial on Bayesian Nonparametric Models

    Full text link
    A key problem in statistical modeling is model selection, how to choose a model at an appropriate level of complexity. This problem appears in many settings, most prominently in choosing the number ofclusters in mixture models or the number of factors in factor analysis. In this tutorial we describe Bayesian nonparametric methods, a class of methods that side-steps this issue by allowing the data to determine the complexity of the model. This tutorial is a high-level introduction to Bayesian nonparametric methods and contains several examples of their application.Comment: 28 pages, 8 figure
    • …
    corecore