73 research outputs found

    Learning latent features with infinite non-negative binary matrix tri-factorization

    Get PDF
    Non-negative Matrix Factorization (NMF) has been widely exploited to learn latent features from data. However, previous NMF models often assume a fixed number of features, saypfeatures, wherepis simply searched by experiments. Moreover, it is even difficult to learn binary features, since binary matrix involves more challenging optimization problems. In this paper, we propose a new Bayesian model called infinite non-negative binary matrix tri-factorizations model (iNBMT), capable of learning automatically the latent binary features as well as feature number based on Indian Buffet Process (IBP). Moreover, iNBMT engages a tri-factorization process that decomposes a nonnegative matrix into the product of three components including two binary matrices and a non-negative real matrix. Compared with traditional bi-factorization, the tri-factorization can better reveal the latent structures among items (samples) and attributes (features). Specifically, we impose an IBP prior on the two infinite binary matrices while a truncated Gaussian distribution is assumed on the weight matrix. To optimize the model, we develop an efficient modified maximization-expectation algorithm (ME-algorithm), with the iteration complexity one order lower than another recently-proposed Maximization-Expectation-IBP model[9]. We present the model definition, detail the optimization, and finally conduct a series of experiments. Experimental results demonstrate that our proposed iNBMT model significantly outperforms the other comparison algorithms in both synthetic and real data

    Simple approximate MAP Inference for Dirichlet processes

    Full text link
    The Dirichlet process mixture (DPM) is a ubiquitous, flexible Bayesian nonparametric statistical model. However, full probabilistic inference in this model is analytically intractable, so that computationally intensive techniques such as Gibb's sampling are required. As a result, DPM-based methods, which have considerable potential, are restricted to applications in which computational resources and time for inference is plentiful. For example, they would not be practical for digital signal processing on embedded hardware, where computational resources are at a serious premium. Here, we develop simplified yet statistically rigorous approximate maximum a-posteriori (MAP) inference algorithms for DPMs. This algorithm is as simple as K-means clustering, performs in experiments as well as Gibb's sampling, while requiring only a fraction of the computational effort. Unlike related small variance asymptotics, our algorithm is non-degenerate and so inherits the "rich get richer" property of the Dirichlet process. It also retains a non-degenerate closed-form likelihood which enables standard tools such as cross-validation to be used. This is a well-posed approximation to the MAP solution of the probabilistic DPM model.Comment: 11 pages, 4 Figures, 5 Table

    Simple approximate MAP inference for Dirichlet processes mixtures

    Get PDF
    The Dirichlet process mixture model (DPMM) is a ubiquitous, flexible Bayesian nonparametric statistical model. However, full probabilistic inference in this model is analytically intractable, so that computationally intensive techniques such as Gibbs sampling are required. As a result, DPMM-based methods, which have considerable potential, are restricted to applications in which computational resources and time for inference is plentiful. For example, they would not be practical for digital signal processing on embedded hardware, where computational resources are at a serious premium. Here, we develop a simplified yet statistically rigorous approximate maximum a-posteriori (MAP) inference algorithm for DPMMs. This algorithm is as simple as DP-means clustering, solves the MAP problem as well as Gibbs sampling, while requiring only a fraction of the computational effort. (For freely available code that implements the MAP-DP algorithm for Gaussian mixtures see http://www.maxlittle.net/.) Unlike related small variance asymptotics (SVA), our method is non-degenerate and so inherits the “rich get richer” property of the Dirichlet process. It also retains a non-degenerate closed-form likelihood which enables out-of-sample calculations and the use of standard tools such as cross-validation. We illustrate the benefits of our algorithm on a range of examples and contrast it to variational, SVA and sampling approaches from both a computational complexity perspective as well as in terms of clustering performance. We demonstrate the wide applicabiity of our approach by presenting an approximate MAP inference method for the infinite hidden Markov model whose performance contrasts favorably with a recently proposed hybrid SVA approach. Similarly, we show how our algorithm can applied to a semiparametric mixed-effects regression model where the random effects distribution is modelled using an infinite mixture model, as used in longitudinal progression modelling in population health science. Finally, we propose directions for future research on approximate MAP inference in Bayesian nonparametrics

    Scattering statistics of rock outcrops: Model-data comparisons and Bayesian inference using mixture distributions

    Get PDF
    The probability density function of the acoustic field amplitude scattered by the seafloor was measured in a rocky environment off the coast of Norway using a synthetic aperture sonar system, and is reported here in terms of the probability of false alarm. Interpretation of the measurements focused on finding appropriate class of statistical models (single versus two-component mixture models), and on appropriate models within these two classes. It was found that two-component mixture models performed better than single models. The two mixture models that performed the best (and had a basis in the physics of scattering) were a mixture between two K distributions, and a mixture between a Rayleigh and generalized Pareto distribution. Bayes' theorem was used to estimate the probability density function of the mixture model parameters. It was found that the K-K mixture exhibits significant correlation between its parameters. The mixture between the Rayleigh and generalized Pareto distributions also had significant parameter correlation, but also contained multiple modes. We conclude that the mixture between two K distributions is the most applicable to this dataset.Comment: 15 pages, 7 figures, Accepted to the Journal of the Acoustical Society of Americ
    corecore