
    A survey on Bayesian nonparametric learning

    © 2019 Copyright held by the owner/author(s). Publication rights licensed to ACM. Bayesian (machine) learning has been playing a significant role in machine learning for a long time due to its particular ability to embrace uncertainty, encode prior knowledge, and endow interpretability. On the back of Bayesian learning's great success, Bayesian nonparametric learning (BNL) has emerged as a force for further advances in this field due to its greater modelling flexibility and representation power. Instead of playing with the fixed-dimensional probabilistic distributions of Bayesian learning, BNL creates a new "game" with infinite-dimensional stochastic processes. BNL has long been recognised as a research subject in statistics, and, to date, several state-of-the-art pilot studies have demonstrated that BNL has a great deal of potential to solve real-world machine-learning tasks. However, despite these promising results, BNL has not created a huge wave in the machine-learning community. Esotericism may account for this. The books and surveys on BNL written by statisticians are overcomplicated and filled with tedious theories and proofs. Each is certainly meaningful but may scare away new researchers, especially those with computer science backgrounds. Hence, the aim of this article is to provide a plain-spoken, yet comprehensive, theoretical survey of BNL in terms that researchers in the machine-learning community can understand. It is hoped this survey will serve as a starting point for understanding and exploiting the benefits of BNL in our current scholarly endeavours. To achieve this goal, we have collated the extant studies in this field and aligned them with the steps of a standard BNL procedure: from selecting the appropriate stochastic processes, through manipulation, to executing the model inference algorithms. At each step, past efforts have been thoroughly summarised and discussed. In addition, we have reviewed the common methods for implementing BNL in various machine-learning tasks, along with its diverse applications in the real world, as examples to motivate future studies.
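    The infinite-dimensional stochastic processes this survey centres on are often made concrete via the stick-breaking construction of the Dirichlet process. As a minimal sketch (the function name and truncation level are illustrative, not from the survey), a truncated set of Dirichlet process mixture weights can be generated like this:

    ```python
    import numpy as np

    def stick_breaking_weights(alpha, num_sticks, rng):
        """Truncated stick-breaking construction of Dirichlet process weights.

        Each weight w_k = beta_k * prod_{j<k} (1 - beta_j), with
        beta_k ~ Beta(1, alpha); truncation at num_sticks approximates
        the infinite sequence.
        """
        betas = rng.beta(1.0, alpha, size=num_sticks)
        # Mass remaining before breaking stick k: prod of (1 - beta_j) for j < k.
        remaining = np.concatenate([[1.0], np.cumprod(1.0 - betas[:-1])])
        return betas * remaining

    rng = np.random.default_rng(0)
    w = stick_breaking_weights(alpha=2.0, num_sticks=1000, rng=rng)
    ```

    With a large truncation level the weights sum to essentially 1, and a smaller concentration parameter alpha concentrates mass on fewer components.
    
    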

    On Monte Carlo methods for the Dirichlet process mixture model, and the selection of its precision parameter prior

    Two issues commonly faced by users of Dirichlet process mixture models are: 1) how to appropriately select a hyperprior for the precision parameter alpha, and 2) the typically slow mixing of the MCMC chain produced by conditional Gibbs samplers based on its stick-breaking representation, as opposed to marginal collapsed Gibbs samplers based on the Pólya urn, which have smaller integrated autocorrelation times. In this thesis, we analyse the most common approaches to hyperprior selection for alpha, identify their limitations, and propose a new methodology to overcome them. To address slow mixing, we first revisit three label-switching Metropolis moves from the literature (Hastie et al., 2015; Papaspiliopoulos and Roberts, 2008), improve them, and introduce a fourth move. Second, we revisit two i.i.d. sequential importance samplers which operate in the collapsed space (Liu, 1996; S. N. MacEachern et al., 1999), and we develop a new sequential importance sampler for the stick-breaking parameters of Dirichlet process mixtures, which operates in the stick-breaking space and has minimal integrated autocorrelation time. Third, we introduce the i.i.d. transcoding algorithm which, conditional on a partition of the data, can infer which specific stick in the stick-breaking construction each observation originated from. We use it as a building block to develop the transcoding sampler, which removes the need for label-switching Metropolis moves in the conditional stick-breaking sampler: it uses the better-performing marginal sampler (or any other sampler) to drive the MCMC chain, and augments its exchangeable partition posterior with conditional i.i.d. stick-breaking parameter inferences after the fact, thereby inheriting the shorter autocorrelation times.
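    The marginal (collapsed) representation contrasted with the stick-breaking one above is the Pólya urn, or Chinese restaurant process, scheme. A minimal sketch of sampling a partition from it (the function name is illustrative; this is the standard scheme, not the thesis's new samplers):

    ```python
    import numpy as np

    def polya_urn_partition(n, alpha, rng):
        """Sample a random partition of n items via the Polya urn scheme.

        Item i joins an existing cluster with probability proportional to
        that cluster's size, or starts a new cluster with probability
        proportional to the precision parameter alpha.
        """
        labels = [0]   # first item always starts cluster 0
        counts = [1]   # current cluster sizes
        for _ in range(1, n):
            probs = np.array(counts + [alpha], dtype=float)
            probs /= probs.sum()
            k = rng.choice(len(probs), p=probs)
            if k == len(counts):
                counts.append(1)   # open a new cluster
            else:
                counts[k] += 1
            labels.append(int(k))
        return labels

    rng = np.random.default_rng(1)
    labels = polya_urn_partition(50, alpha=1.0, rng=rng)
    ```

    Because cluster identities are integrated out, samplers built on this representation work on exchangeable partitions directly, which is why they avoid the label-switching issues of conditional stick-breaking samplers.
    
    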

    Structured Mixture of Continuation-ratio Logits Models for Ordinal Regression

    We develop a nonparametric Bayesian modeling approach to ordinal regression based on priors placed directly on the discrete distribution of the ordinal responses. The prior probability models are built from a structured mixture of multinomial distributions. We leverage a continuation-ratio logits representation to formulate the mixture kernel, with mixture weights defined through the logit stick-breaking process that incorporates the covariates through a linear function. The implied regression functions for the response probabilities can be expressed as weighted sums of parametric regression functions, with covariate-dependent weights. Thus, the modeling approach achieves flexible ordinal regression relationships, avoiding linearity or additivity assumptions in the covariate effects. A key model feature is that the parameters for both the mixture kernel and the mixture weights can be associated with a continuation-ratio logits regression structure. Hence, an efficient and relatively easy-to-implement posterior simulation method can be designed, using Pólya-Gamma data augmentation. Moreover, the model is built from a conditional independence structure for category-specific parameters, which results in additional computational efficiency gains through partial parallel sampling. In addition to the general mixture structure, we study simplified model versions that incorporate covariate dependence only in the mixture kernel parameters or only in the mixture weights. For all proposed models, we discuss approaches to prior specification and develop Markov chain Monte Carlo methods for posterior simulation. The methodology is illustrated with several synthetic and real data examples.
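    The continuation-ratio logits representation used for the mixture kernel factorises an ordinal response into a sequence of binary "stop here vs. continue" decisions. As a minimal sketch (the function name is illustrative), C - 1 continuation-ratio logits map to C category probabilities as follows:

    ```python
    import numpy as np

    def continuation_ratio_probs(logits):
        """Map C-1 continuation-ratio logits to C ordinal category probabilities.

        h_j = sigmoid(logit_j) is P(Y = j | Y >= j); the probability of
        category j is h_j times the probability of having "survived"
        past all earlier categories.
        """
        h = 1.0 / (1.0 + np.exp(-np.asarray(logits, dtype=float)))
        # survival[j] = P(Y >= j) = prod_{l < j} (1 - h_l), with survival[0] = 1.
        survival = np.concatenate([[1.0], np.cumprod(1.0 - h)])
        # Last category absorbs all remaining mass.
        return np.append(survival[:-1] * h, survival[-1])

    p = continuation_ratio_probs([0.0, 0.0, 0.0])  # 4 categories
    ```

    Because each conditional probability is a Bernoulli logit, conjugate Pólya-Gamma data augmentation applies to each binary decision separately, which is what makes the posterior simulation in the abstract relatively easy to implement.
    
    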