360 research outputs found

    Dynamic Clustering via Asymptotics of the Dependent Dirichlet Process Mixture

    Get PDF
    This paper presents a novel algorithm, based upon the dependent Dirichlet process mixture model (DDPMM), for clustering batch-sequential data containing an unknown number of evolving clusters. The algorithm is derived via a low-variance asymptotic analysis of the Gibbs sampling algorithm for the DDPMM, and provides a hard clustering with convergence guarantees similar to those of the k-means algorithm. Empirical results from a synthetic test with moving Gaussian clusters and a test with real ADS-B aircraft trajectory data demonstrate that the algorithm requires orders of magnitude less computational time than contemporary probabilistic and hard clustering algorithms, while providing higher accuracy on the examined datasets.Comment: This paper is from NIPS 2013. Please use the following BibTeX citation: @inproceedings{Campbell13_NIPS, Author = {Trevor Campbell and Miao Liu and Brian Kulis and Jonathan P. How and Lawrence Carin}, Title = {Dynamic Clustering via Asymptotics of the Dependent Dirichlet Process}, Booktitle = {Advances in Neural Information Processing Systems (NIPS)}, Year = {2013}

    Scalable Bayesian nonparametric regression via a Plackett-Luce model for conditional ranks

    Full text link
    We present a novel Bayesian nonparametric regression model for covariates X and continuous, real response variable Y. The model is parametrized in terms of marginal distributions for Y and X and a regression function which tunes the stochastic ordering of the conditional distributions F(y|x). By adopting an approximate composite likelihood approach, we show that the resulting posterior inference can be decoupled for the separate components of the model. This procedure can scale to very large datasets and allows for the use of standard, existing, software from Bayesian nonparametric density estimation and Plackett-Luce ranking estimation to be applied. As an illustration, we show an application of our approach to a US Census dataset, with over 1,300,000 data points and more than 100 covariates
    • …
    corecore